[Solved]: The "CPS" approach has done great harm to performance in SML/NJ; reasoning desired

Problem Detail: In a comment on “Learning F#: What books using other programming languages can be translated to F# to learn functional concepts?”, Makarius stated:

Note that the “CPS” approach has done great harm to performance in SML/NJ. Its physical evaluation model violates too many assumptions that are built into the hardware. If you take big symbolic applications of SML like Isabelle/HOL, SML/NJ with CPS comes out approx. 100 times slower than Poly/ML with its conventional stack.

Can someone explain the reasons for this? (Preferably with some examples) Is there an impedance mismatch here?

Asked By : Guy Coder

Answered By : Makarius

As a first approximation, there is a difference in “locality” of memory access when a program just runs forward on the heap in CPS style, instead of the traditional growing and shrinking of a stack. Also note that CPS will always need GC to recover your seemingly local data placed on the heap. These observations alone would have been adequate 10 or 20 years ago, when hardware was much simpler than today. I am myself neither a hardware nor a compiler guru, so as a second approximation, here are some concrete reasons for the approx. factor of 100 seen in Isabelle/HOL:

Basic performance loss according to the “first approximation” above.
SML/NJ heap management and GC have severe problems scaling beyond several tens of MB; Isabelle now routinely uses 100-1000 MB, sometimes several GB.
SML/NJ compilation is very slow; this might be totally unrelated (note that Isabelle/HOL alternates between runtime compilation and running code).
SML/NJ lacks native multithreading; this is not fully unrelated, since CPS was advertised as “roll your own threads in user space without separate stacks”.
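
To make the “first approximation” a bit more concrete, here is a rough sketch of the same computation in direct style and in CPS. It is written in Python rather than SML, and the names sum_direct, sum_cps, trampoline and run_sum_cps are made up for illustration. Python does not eliminate tail calls, so the CPS version is run through a small trampoline; this is only meant to show the shape of the evaluation model, not how SML/NJ actually compiles code.

```python
def sum_direct(n):
    # Direct style: the pending "n + ..." lives in a stack frame that is
    # popped as soon as the recursive call returns.
    if n == 0:
        return 0
    return n + sum_direct(n - 1)


def sum_cps(n, k):
    # CPS: the pending "n + ..." is captured in a freshly allocated closure
    # (the continuation k). Every step returns a thunk so the trampoline can
    # run it without growing the Python call stack.
    if n == 0:
        return lambda: k(0)
    return lambda: sum_cps(n - 1, lambda r: lambda: k(n + r))


def trampoline(step):
    # Keep running thunks until a non-callable final value comes back.
    while callable(step):
        step = step()
    return step


def run_sum_cps(n):
    # The initial continuation just returns the final result.
    return trampoline(sum_cps(n, lambda r: r))
```

In the direct version, the intermediate state sits in a contiguous region that grows and shrinks in place. In the CPS version, every pending step is a separate heap object that stays alive until a collector reclaims it, which is the “runs forward on the heap” behaviour described above.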

The correlation of heap and threads is also discussed in the paper by Morrisett/Tolmach, PPOPP 1993, “Procs and Locks: A Portable Multiprocessing Platform for Standard ML of New Jersey” (CiteSeerX). Note: the PDF at CiteSeerX is backward, with pages running from 10 to 1 instead of 1 to 10.
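
As for “roll your own threads in user space without separate stacks”, the idea can be sketched roughly as follows, again in Python with made-up names (make_worker, run_round_robin, demo). Each “thread” is just a closure holding the rest of its work, and the scheduler is a plain queue of such closures. This is a toy illustration under those assumptions, not the mechanism from the paper or from SML/NJ.

```python
from collections import deque


def make_worker(name, count, log):
    # A "thread" is represented by its continuation: a thunk that performs
    # one unit of work and returns the thunk for the rest, or None when done.
    def step(i):
        if i == count:
            return None  # finished: no continuation left
        log.append(f"{name}{i}")
        return lambda: step(i + 1)  # rest of this thread, kept on the heap

    return lambda: step(0)


def run_round_robin(tasks):
    # Scheduler: a queue of continuations, no per-thread stack.
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        rest = task()
        if rest is not None:
            ready.append(rest)


def demo():
    log = []
    run_round_robin([make_worker("a", 2, log), make_worker("b", 3, log)])
    return log
```

Calling demo() interleaves the two workers one step at a time. The point is that all the thread state lives in heap closures, so heap behaviour and threading end up closely tied together.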

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/10233
