Find all intervals that overlap with a given interval

Question Detail: Note: I moved this question from stackoverflow.com I have an algorithmic problem where I would like to see if it can be solved in better than $O(n)$: I have given a table $T$ of $n$ elements where each element is a tuple $(s_i, e_i)$ with $s_i, e_i in mathbb{N}$ and $s_i < e_i$, i.e. each tuple is some kind of an interval. I have to find all intervals that overlap with a given interval $[t_0, t_1]$ with $t_0, t_1 in mathbb{N}$ and $t_0 < t_1$. Further, I have available two sorted lists $S$ and $E$, containing the $s$ values, or $e$ values respectively, together with the index $i$ pointing to the respective entry in $T$. The lists are sorted by $s$ values, or $e$ values respectively. (Let’s assume both, $s$ and $e$ values, are unique.) Problem: We have to find each interval/tuple $(s_i, e_i) in T$ where $s_i leqslant t_1$ and $e_i geqslant t_0$. My thoughts so far: We can exclude some elements by either applying one of the interval boundaries, i.e. searching $t_1$ in $S$ or $t_0$ in $E$. This gives us a list $L$ of remaining elements: $$ L leftarrow {e in E mid e geqslant t_0} text{ or } L leftarrow {s in S mid s leqslant t_1} $$ However, there is no lower bound on the number of elements in $L$, no matter which search we perform. Further, we have to check every element in $L$ if $s leqslant t_1$, or $e geqslant t_0$ respectively depending on which search we performed before. The complexity for this solution is $O(n)$. However, let’s say that $k$ is the maximum number of elements overlapping with interval $[t_0, t_1]$. If we assume $k ll n$, then the complexity is $O(n/2)$ since we can exclude at least $n/2$ elements by choosing the appropriate search for $L$. Still $O(n/2)$ is in $O(n)$. Can you think of a better approach to solve this problem? For record: The complexity for finding all intervals overlapping with a certain given interval using an interval tree is $O(log n + k)$ where $k$ is the number of overlapping intervals. However, in my practical case I am using a MySQL database which provides index trees for each value, i.e. $s$ and $e$, separately. This way I can not find overlapping intervals in less than $O(n)$. I would need to create an interval tree which is a search tree that stores both interval boundaries, i.e. $s$ and $e$, in a single data structure. The complexity for constructing the interval tree is $O(n log n)$. [http://www.dgp.utoronto.ca/people/JamesStewart/378notes/22intervals/]

Asked By : sema
Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/14311

Answered By : D.W.

I believe that interval trees offer a solution to your problem. Basically, you store your intervals in the interval tree data structure; then, to find all intervals that overlap with $[t_0,t_1]$, you do a query into the interval tree. This should solve your problem and run faster than $O(n)$ time.