[Solved]: Compute square root using (bit) additions and shifts as primitives

Problem Detail: 

Question: Given an $n$-bit natural number $N$, how to compute $lceil sqrt{N} rceil$ using only $O(n)$ (bit) additions and shifts?

The tip is to use binary search. However, I could not achieve the required complexity (I got $O(n^2)$). What does it mean by using only $O(n)$ (bit) additions and shifts: This is an exercise in an algorithm book.
In my opinion, it means that adding two, say $n$-bit, natural numbers costs $O(1)$ and shifting a, say $n$-bit, natural number also costs $O(1)$. Then we are only allowed to use such $O(1)$ operations $O(n)$ times.
It does not mention the cost of comparison. I guess we can ignore it or assume that comparing two, say $n$-bit, natural numbers costs $O(1)$ as well. My $O(n^2)$ algorithm:

  1. Determine the range of the number of bits $t$ of $lceil sqrt{N} rceil$: $$2^{frac{n-1}{2}} le sqrt{N} le 2^{frac{n}{2}} Rightarrow 2^{lfloor frac{n-1}{2} rfloor} le lceil sqrt{N} rceil le 2^{lceil frac{n}{2} rceil}$$ Therefore, $$t_1 triangleq lfloor frac{n-1}{2} rfloor + 1 le t le lceil frac{n}{2} rceil + 1 triangleq t_2.$$
  2. Binary search: Find $lceil sqrt{N} rceil$ between $2^{t_1}$ and $2^{t_2}$ using binary search. For each number $x$, to compute $x^2$ using additions and shifts as primitives and compare it with $N$.

The complexity is thus $O(n times n) = O(n^2)$ for $O(n)$ times of binary search and computing $x^2$, each of which in turn takes $O(n)$ additions and shifts.

Asked By : hengxin

Answered By : D.W.

An iterative algorithm seems like it should work. Let $M=lfloor N/4 rfloor$. Suppose we know that $x$ is the integer approximation to $sqrt{M}$, i.e., $x=lceil sqrt{M} rceil$, and suppose we know the value of $x^2$ (obtained previously). Now we want to find $y=lceil sqrt{N} rceil$. What are the possible values of $y$? I’m pretty sure the only possible values are $y=2x$ or $y=2x+1$. And, it’s easy to try both of them and see which is correct. In particular, for $y=2x$, we have $y^2=4x^2$, which can be obtained from $x^2$ by two left-shifts ($O(1)$ time); for $y=2x+1$, we have $y^2=4x^2+4x+1$, which can be obtained from $x^2$ and $x$ with four left-shifts and two additions ($O(1)$ time). Now just compare those two values to $N$ to see which one is correct. In this way, we get an iterative algorithm where we do $n/2$ iterations, and where each iteration takes $O(1)$ time. The total running time is $O(n)$, as required. I realize this didn’t use binary search. Oh well.
Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/30334