25.6k
views14
comments

0 score	+
	-

What makes Haskell sooo much faster than F#?

introduction f#functional compiler haskell

Hi,

just for fun I implemented one of the famous Haskell qsort language introduction samples in F#. I was surprised how much slower F# performed compared to Haskell.

A naive qsort on a list in Haskell looks like this:

1
2
3

qsort [] = []
qsort (x:xs) = qsort ys ++ x : qsort zs
  where (ys, zs) = partition (< x) xs

A nice article about the performance of such a naive implementation can be found here. Basically the article says that sorting 100,000 ints takes less than 3 seconds (on a 1.5GHz Apple Powerbook with 1GB of RAM).

Here is an implementation fo a similar naive qsort in F#:

let rec qsort l =
    match l with
    | x::t -> let l,u = List.partition ((>=)x) t in (qsort l) @ x :: (qsort u) 
    | [] -> []

When I first tried to sort 100,000 elements this way in the F# interactive console I thought that it got stuck because it did not respond for a very long time. After trying a few times I realized that it simply runs extremely slow compared to the Haskell code, e.g. sorting 10,000 ints took 15 seconds, 20,000 took 67 seconds (4,6x), 30,000 took 165 seconds (11x) ... so sorting 100,000 elements this way was completely impractical. Using lists qsort no longer follows the O(n log n) performance expectation but seems to become nearly O(n^2).

I guess the list concatenation dominates the hole algorithm in F# since it requires the copying of the immutable list that is prepended to generate a new element that links to the appended list... But I wonder how the Haskell GHC compiler can eliminate these expensive operations? Has anybody an idea how lists are represented in Haskell in memory and why the code is so much faster with Haskell than F#?

Please understand that I do not want to criticize F#. I am pretty new to functional programming...

Andreas

PS: Of course I know that List.sort exists...

.Reply

Are you ensuring that you're benchmarking equivalent cases? For favorable partitions, the F# version is also O(n log n). Here's my reasoning, where T(n) is the time to calculate qsort over a list of size n where the two partitions are of equal size. Then the partition function itself is O(n), as is the append, so that T(n) = 2T(n/2) + O(n), which by the Master Theorem means that T(n) = O(n log n).

To test this, define the following function:

let rec lst = function
| 0 -> []
| n ->
  let l = lst (n-1)
  let k = pown 2 (n - 1)
  k :: l @ List.map (fun x -> x + k) l

Then lst n is a list of 2^n - 1 numbers where the first number is the median, and the first number of each partition is also the median of that partition, etc., so that partitions are always of equal size. On my computer, qsort (lst 18) takes about 1.8 seconds whereas qsort (lst 19) takes about 3.7 seconds, which is consistent with a complexity of O(n log n) and inconsistent with a complexity of O(n^2).

Unfortunately, for less fortunate arrangements of numbers, the complexity can be much worse. For instance, when sorting [1 .. 100000] or List.rev [1 .. 100000], the complexity is at least O(n^2) and you'll see very poor performance. I think that for random lists, the result should still be O(n log n), which can be tested by something like:

let r = System.Random() in qsort [for i in 1 .. 100000 -> r.Next()]

This runs in ~0.8 seconds on my computer, and I think that the Haskell benchmark you're citing is also run on a random list. I don't think that laziness will prevent Haskell's performance from degrading when given a pessimally ordered list, either.

By Keith Battocchi on 5/20/2010 5:57 PM (permalink)Reply

Hi kvb & brianmcm,

Thank you guys for the quick responses.

@kvb: You are right: I tested with [1..10000] which is definitely one of the input-cases that cause very bad performance (maybe only reversed sorted inputs show even worse performance?).

To do some direct comparison of Haskell and F# I will install GHC later and run some tests with the same inputs. When I saw the Haskell programm running on the machine of a friend of mine the output of the sorting-result in Haskell started quickly after the program was started but the speed of the output seemed to be dependent on the size of the list to sort. I am not sure whether this is linked to the overall laziness of Haskell which might be able to determine when the next element for output has been determined.

I have not programmed in Haskell yet, I just noticed that all IO happens in terms of monads. I am not sure whether it is correct to think of the execution process of IO operations in the bind operation of the IO monad...? (sorry to ask this on a a F# forum)

Andreas

By blueling on 5/21/2010 1:35 AM (permalink)Reply

Here is the result of the quick comparision that I did (sorting an already sorted list with the naive qsort that I posted above):

Haskell:
10000: 1.01 secs
20000: 4.48 secs
30000: 11.84 secs
40000: 27.38 secs
50000: 59.05 secs

F#:
10000: 14.086 secs
20000: 65.036 secs
30000: 157.997 secs
40000: 302.095 secs
50000: "Process is terminated due to StackOverflowException."

Execution time of F# relative to Haskell:
14.1 / 1.0 = ~14.1 times
65.0 / 4.5 = ~14.4 times
158.0 / 11.8 = ~13.4 times
302 / 27.9 = ~10.8 times

So in this case it looks like that the code generated by the GHC is approximately by a constant factor of 10-14 faster than the code generated by F#. Native code execution vs. jitted CLR code might of course be a small advantage for Haskell. What it clearly shows is that the overall runtime complexity of the naive list qsort in F# is the same as the Haskell version. So sorry for bothering you guys .. I should have done these comparision-test before posting my initial post.

Andreas

PS: I used the following haskell source:

import System.Environment (getArgs)
qsort :: Ord a => [a] -> [a]
qsort [] = []
qsort (x:xs) = qsort ys ++ x : qsort zs where (ys, zs) = partition (< x) xs
main :: IO ()        
main = do
    (x:xs) <- getArgs
    let count = read x :: Int
    print (last (qsort  [1..count]))

To compile I used "ghc -c -O qsort.hs"

By blueling on 5/21/2010 5:13 PM (permalink)Reply

Actually the real bottleneck appears to be the use of "generic comparison" in the partition. Simply put a type constraint on qsort and the performance nearly matches Haskell on 20000 elements (11.9 seconds vs 11.2 seconds on my machine). That is, write

let rec qsort (l:int list) = ...

By andrewkennedy on 5/25/2010 7:27 AM (permalink)Reply

Actually the real bottleneck appears to be the use of "generic comparison" in the partition. Simply put a type constraint on qsort and the performance nearly matches Haskell on 20000 elements (11.9 seconds vs 11.2 seconds on my machine). That is, write let rec qsort (l:int list) = ...

Interesting. What would happen if we give Haskell some type hint to use unboxed element (is it !Int or something like that) ?

By gary on 5/25/2010 6:32 PM (permalink)Reply

Just have to point out that it's one thing to say you know about List.sort, but another to not mention it in the supposed benchmarks. The code below runs in 17ms on my box, which is about 3500 times faster than your Haskell example.

 

let l = [1..50000]
let sw = System.Diagnostics.Stopwatch.StartNew()
let r = List.sort l
sw.Stop()
printfn "%A" sw.ElapsedMilliseconds

By brianmcn on 5/21/2010 11:07 PM (permalink)Reply

Yes, the built-in List.sort functions of both Haskell and F# are many times more efficient than the naive list qsort that I used for the benchmark...

By blueling on 5/22/2010 9:22 AM (permalink)Reply

This thread was somewhat disconcerting.

What options do you have if you have elegantly and succinctly represented the problem in F#, but you find that it does not perform and you have nothing to tweak. Do you have to abandon it?

Quicksort is a classic poster boy for functional languages. When you have to implement something like that, I would expect F# to be the go-to language.

By Tormod on 5/24/2010 3:27 PM (permalink)Reply

What options do you have if you have elegantly and succinctly represented the problem in F#, but you find that it does not perform and you have nothing to tweak. Do you have to abandon it?

Having a background that includes both embedded systems programming and optimization in high-level languages, I've had to deal with this problem in every language from C onwards.

My general approach is this:

1) Decide what's most important for this particular block of code. Is it performance, maintainability, implementing a specific algorithm?

2) If it's performance, and the algorithm isn't making the grade, is there an equally elegant algorithm that does?

3) If there's not an elegant algorithm, and performance is paramount, there's simply no alternative to the ugly code block. Think of it as the pumbling behind the walls that has to be there sometimes even in an elegant hotel, lol. (And sometimes I will leave a comment saying: "I would have done it this way had it been fast enough...")

So I wince when I have to pick option 3, but I remember what Robert Browning said:

"Ah, but a man's reach should exceed his grasp, or what's a heaven for?"

And it keeps me from feeling bad about the fact that I've had to write something ugly in an otherwise elegant language.

-Neil

By TechNeilogy on 5/25/2010 7:28 AM (permalink)Reply

While we are at it(about performance), this read is quite interesting

[link:labs.oracle.com]

By gary on 5/25/2010 6:25 PM (permalink)Reply

While we are at it(about performance), this read is quite interesting
[link:labs.oracle.com]

That's good food for thought.

By mau on 5/26/2010 2:27 AM (permalink)Reply

What is disconcerting is the fact that anyone gives any pragmatic credence to a 'functional quicksort'. If you want to sort, use an array, use a built-in sort method, and it will be much much much faster than anything pure/lazy in Haskell. When has anyone in the history of the world ever needed a lazy functional sort (other than to demo Haskell elegance)?

If you find something pragmatic that addresses a real world problem, that is solved more elegantly and performantly in Haskell, then post that. This whole thread (and title) is ridiculous. Haskell has to be very very good at immutable lazy programming, because that's pretty much all you can do in Haskell. In F# we have mutability, which when used in locally-scoped small doses, is likely to outperform Haskell without sacrificing elegance or referential transparency. This whole thread smells to me like philosphers debating angels dancing on pinheads, rather that real-world programmers debating run-time performance.

By brianmcn on 5/24/2010 3:52 PM (permalink)Reply

Yes, the built-in List.sort functions of both Haskell and F# are many times more efficient than the naive list qsort that I used for the benchmark...

List.sort I believe is using mutable data structure(or even calling some underlying .NET thing, similar to Python/Perl where it is implemented in C) whereas Haskell is still implemented in Haskell and not pulling in other support.

So from a 'language' perspective, Haskell is still much more efficient when programming in immutable style. F# allows immutable style with a performance penalty.

By gary on 5/22/2010 12:50 PM (permalink)Reply

Yes, this is an enormously poor algorithm in a strict language with immutable lists; I think the perf may be worse than O(N^2).

In Haskell, lists are lazy, and this affects the big-oh of the 'append' operation '++', I think it can be O(1) by just saving the 'lazy thunk to compute the rest of the list'. (You could do the same thing in F# using the LazyList library in the F# PowerPack.)

Briefly, this boils down to 'why is this algorithm faster than that one' and the answer is 'they have different Big-oh'. Haskell, being lazy, has completely difference performance characteristics for 'syntactically similar' algorithms coded in a strict language.

By brianmcn on 5/20/2010 5:22 PM (permalink)Reply

Topic tags

Built with WebSharper

Home

Answers

Events

Courses

Groups and Conferences

Blogs

Jobs

Developers

Topic tags