Note that the “CPS” approach has done great harm to performance in SML/NJ. Its physical evaluation model violates too many assumptions that are built into the hardware. If you take big symbolic applications of SML like Isabelle/HOL, SML/NJ with CPS comes out approx. 100 times slower than Poly/ML with its conventional stack.
Can someone explain the reasons for this, preferably with some examples? Is there an impedance mismatch here?
Asked By : Guy Coder
Answered By : Makarius
- Basic performance loss according to the “first approximation” above.
- SML/NJ heap management and GC have severe problems scaling beyond several tens of MB; Isabelle now routinely uses 100-1000 MB, sometimes several GB.
- SML/NJ compilation is very slow — this might be totally unrelated (note that Isabelle/HOL alternates runtime compilation and running code).
- SML/NJ lacks native multithreading; this is not fully unrelated, since CPS was advertised as “roll your own threads in user space without separate stacks”.
The correlation of heap and threads is also discussed in the paper by Morrisett/Tolmach, PPoPP 1993, “Procs and Locks: A Portable Multiprocessing Platform for Standard ML of New Jersey” (CiteSeerX). Note: the PDF at CiteSeerX is in reverse order, with pages running from 10 to 1 instead of 1 to 10.
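Since the question asks for examples, here is a minimal sketch of the difference between a conventional stack and CPS. It is written in Python purely for illustration and is not SML/NJ code; the names `sum_direct`, `sum_cps`, `trampoline` and the `stats` counter are all hypothetical. It only shows the general shape of the trade-off: in direct style the pending work sits in native stack frames, while in CPS each pending step becomes a closure allocated on the heap, which puts the extra load on the allocator and GC. It says nothing about the actual size of the slowdown.

```python
# Illustrative sketch only: Python stands in for SML, and nothing here is
# taken from the SML/NJ or Poly/ML implementations.

stats = {"closures": 0}  # hypothetical counter to make heap allocation visible


def sum_direct(n):
    """Direct style: each pending addition lives in a native stack frame."""
    if n == 0:
        return 0
    return n + sum_direct(n - 1)


def sum_cps(n, k):
    """CPS: the pending addition is captured in a heap-allocated closure.

    Instead of calling anything directly, every step returns a thunk
    (a zero-argument function) for the trampoline to run next.
    """
    if n == 0:
        return lambda: k(0)

    def k_next(result):
        # The continuation: what remains to be done once the rest is summed.
        return lambda: k(n + result)

    stats["closures"] += 1  # one continuation closure per recursive step
    return lambda: sum_cps(n - 1, k_next)


def trampoline(thunk):
    """Run thunks in a loop, so the native stack never grows."""
    while callable(thunk):
        thunk = thunk()
    return thunk


if __name__ == "__main__":
    print(sum_direct(100))                        # 5050
    print(trampoline(sum_cps(100, lambda r: r)))  # 5050
    print(stats["closures"])                      # 100
```

The CPS version never deepens the native stack, which is what makes "roll your own threads" easy: a suspended computation is just a closure that a scheduler can store and resume later. The price is one heap object per call, and no use of the hardware call/return mechanism.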
Question Source : http://cs.stackexchange.com/questions/10233