Asked By: LuckyB
Answered By: Evil
Superlinear speedup can come from a vast number of places: cache usage (what fits into registers, cache, main memory, or mass storage; more processing units very often means more registers and cache per subtask), memory access patterns, a better (or slightly different) algorithm, or flaws in the serial code.
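The cache effect can be illustrated with a toy cost model (my own sketch, not from the answer): assume a per-processor cache holding 1000 elements, with each element costing 1 time unit when the working set fits in cache and 10 units otherwise. Splitting the data across processors can then make each chunk cache-resident and yield a speedup larger than $p$:

```python
# Toy cost model for cache-driven superlinear speedup.
# Assumptions (illustrative, not measured): a per-processor cache
# holds CACHE elements; per-element cost is FAST when the working
# set fits in cache and SLOW otherwise.
CACHE = 1000          # elements that fit in one processor's cache
FAST, SLOW = 1, 10    # per-element cost in vs. out of cache

def run_time(working_set):
    """Modeled time to process `working_set` elements on one processor."""
    cost = FAST if working_set <= CACHE else SLOW
    return working_set * cost

N, p = 8000, 16
t_serial = run_time(N)          # whole set misses cache: 8000 * 10
t_parallel = run_time(N // p)   # each chunk of 500 fits:   500 * 1
speedup = t_serial / t_parallel
print(speedup)                  # 160.0, far above p = 16
```

The point of the model is only that the per-element cost changes once each processor's share of the data fits in its cache; real machines show smaller but genuinely superlinear effects of the same shape.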
For example, a randomized search of a solution space can be divided among $1024$ searchers that cover more of the space at once, so finding the solution sooner becomes more probable.
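A minimal sketch of this idea (my own example, with a deterministic scan standing in for the random search): $p$ searchers scan disjoint chunks of the space in parallel, so the searcher that owns the target finds it after at most $\text{size}/p$ probes instead of up to $\text{size}$ probes.

```python
# Sketch: partitioned search over disjoint chunks (illustrative
# linear scan; a randomized search behaves similarly in expectation).
def serial_steps(target, size):
    """Probes a single linear scan needs to reach `target`."""
    return target + 1

def parallel_steps(target, size, p):
    """Each of p searchers scans its own chunk of size // p elements;
    the searcher owning `target` finds it after target % chunk + 1 probes."""
    chunk = size // p
    return target % chunk + 1

size, p, target = 10000, 16, 7777
print(serial_steps(target, size))       # 7778 probes
print(parallel_steps(target, size, p))  # 278 probes
```

Note this per-searcher probe count can drop by more than a factor of $p$ for a lucky target position, which is exactly the statistical flavor of superlinear speedup in search problems; the total work across all searchers is not reduced.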
There are also byproducts: if you swap elements as in bubble sort and switch to a GPU, it swaps all eligible pairs at once, while the serial version swaps only up to a point. On a distributed system, communication is even more costly, so programs are restructured to keep memory usage local (which also changes memory access and divides the problem differently than a sequential application does). Most importantly, the sequential program is not exactly the same as the parallel version: different technology, environment, algorithm, and so on, which makes the two hard to compare.

The following excerpt is from "Introduction to Parallel Computing", second edition, by Ananth Grama et al., 2003:
Theoretically speedup can never exceed the number of processing elements $p$.
If the best sequential algorithm takes $T_s$ units of time to solve a given problem on a single processing element, then a speedup of $p$ can be obtained on $p$ processing elements if none of them spends more than time $T_s/p$.
A speedup greater than $p$ is possible only if each processing element spends less than time $T_s/p$ solving the problem.
In this case, a single processing element could emulate the $p$ processing elements and solve the problem in fewer than $T_s$ units of time.
This is a contradiction because speedup, by definition, is computed with respect to the best sequential algorithm.
So the name "superlinear" in this context comes from the definition of speedup.
Question Source: http://cs.stackexchange.com/questions/55433