Tuesday, May 5, 2009

100 lines of SML code becomes over 600 lines of F# code

Jon H., F#'s biggest advocate and the author of "F# for Scientists" says:
100 lines of SML code becomes over 600 lines of F# code and runs 5x slower.
Now, I've seen everything! I'm going back to C with some ASM spliced in...

Monday, April 27, 2009

Bizarre Python behavior

A redditor pointed this out, but I'm having difficulty locating the link now:

Identity can be quite surprising in Python:

>>> x = 'a'
>>> y = 'a'
>>> x is y
True
>>> x = ' a'
>>> y = ' a'
>>> x is y
False

Saturday, April 18, 2009

[Haskell] still very difficult to write robust programs


Our informal poll shows Haskell to be the most "popular" FPL, but is Haskell suitable for serious work?

Adrian H., a prominent member of the FPL community and an experienced Haskell programmer, shared this in comp.lang.functional newsgroup:






IMO by far the biggest problem with Haskell is that, for all it's sophistication, it's still very difficult to use it to write *robust* programs (as opposed to programs that are merely "correct" in an academic sense). But often you can live with this, if all you're using it for is to knock up quick and dirty "run once" programs to get whatever answer you're looking for. As long as the program terminates eventually (as it usually does), who cares about all those space leaks or the sucky performance?

I suspect that most Haskell users use the language this way (as I kind of super calculator or for private "in house" stuff). I do anyway. As you have observed before, what might be regarded as "serious" publicly available applications written in Haskell seem to be pretty rare :-( (ghc,pugs,darcs,xmonad are about all I can think of off the top of my head).
The second biggest problem with Haskell is..something I dare not mention again (it's tabboo in the Haskell community).
The second biggest problem was explained later:
the embarrassing fact that the worlds finest imperative programming language could never be used to implement most of its own standard or non-standard IO infrastructure
[due the inability to safely use unsafeFoo required for this task (see link) and the flaws in its otherwise neat type system]

Thank you, Adrian, for your candor. Characterizing Haskell as a sophisticated academic calculator and an experiment, whose main value is in teaching us how not to design languages, certainly wouldn't sell many books.

Earlier, I proposed that the FPL community should explore other, more practical, approaches to language design, specifically, for the eagerness/laziness issue I advocate the kind of approach Clojure uses. The functional programming community needs to get out of its rut and move on.

Tuesday, April 14, 2009

What a practical modern FPL ought to be

As just about everyone using FPLs, except a few fanboys, realizes, the functional languages available today are often super-awesome in some aspects, and totally suck in others. I did a detailed write-up on Clean last time. What follows is my opinion on what a modern practical functional language ought to be, but, sadly, ain't. Creating a decent seed implementation like that would probably take someone who knows what he's doing a year or so, taking advantage of existing code.

Syntax: Haskell-like, with significant white space. This is where Haskell and Clean get it right.

Macros: the need for macros is diminished in languages that can be lazy. I don't believe a language should be designed around macros, like Lisp is. Programming in Haskell, I rarely feel the need for macros, if at all. As a classic example, || can be defined as a function in it.

Evaluation: Functions should be strict by default, optionally lazy. Default data structures: lazy by default. This way, you get the best of both worlds. This is where Clojure gets it right, and almost no one else.

Runtime: JVM. JVM is very fast these days, and it will only get faster. Some people will bring up the lack of full tail-call optimization in JVM, but it's coming soon. JVM can do "unthinkable" feats like inlining virtual function calls across separate compilation units and even code generated at runtime. Not having to develop your own runtime is big. Having easy access to huge libraries is important too. And the fundamental difference between calling C from Haskell and, say, Java from Clojure is that the latter is memory-safe, even if you screw up.

Type system: Statically typed (if you still think that cool hackers can write correct programs in dynamically typed languages and sleep at night, google for how Hacker News got hacked, it was a type error!). Uniqueness types a la Clean are the way to go, in my opinion.

How to build it:

Port/combine code from the relevant languages (functional code should be easy to compose, own dogfood and all that). Some bootstrapping may be necessary.

Relevant languages:
  1. CAL: A Haskell-like language for JVM. Unfortunately lazy by default with somewhat odd syntax. No uniqueness types
  2. Haskell: runtime not very performant, not JVM; lazy by default; no uniqueness types
  3. Clojure: dynamically typed; syntax less readable than Haskell's, in my opinion. Great call on JVM and strict/lazy evaluation/data structures.
  4. Clean: great type system; runtime with no future; bad license; lazy by default
  5. MLs: strict and statically typed; but everything else is not so good
  6. Strict Haskell: doesn't really exist

Thursday, April 9, 2009

Clean Programming Language: the insanely great, the weird and the short-sighted

If you are reading this blog, chances are you are one of those people who think that Haskell is the most beautiful language ever. Well, you are wrong! Haskell is only the second most beautiful language ever, after Clean (they actually want you to register before downloading, so don't click that link).

The insanely great:
  • Like Haskell, but simpler. Haskell is primarily a research language. Monads, monad transformers, lifts, arrows, templates, bang patterns, etc. etc. get added to GHC on a regular basis. If you want to use Haskell, you'll have to constantly deal with unproven technology in FPL design, like monads and monad transformers. Clean is simpler. If you know Haskell (even if you don't fully grok monad trasformers), you can learn all there is to Clean in a few hours. For better or for worse, the features are pretty stable and simple, which brings us to my second point:
  • No monads necessary. That's right, a lazy functional language that can perform mutation and IO safely without monads. Imagine that! Clean uses uniqueness types in its otherwise Haskell-like type system. This choice turns out to be both easier on the brains and on the optimizer...
  • As fast as C (the compiler is very fast too), and it ain't no bullshit either! You may have heard from Haskell advocates that Haskell is as fast as C. Well, it's really a half-truth. Sometimes, there is simply no substitute for a good old-fashioned array and mutation of it. It's the only data structure known to man that gives you O(1) read, write with a small constant factor to boot. Haskell can give you that, but your choices are: (1) state and other monads (that don't compose too well) with mutable arrays in GHC (that can cause insane GC slow-downs), defeating the purpose of using them or (2) malloc'ed array hacking. If you look at the Shootout entries, Haskell entries are ugly unsafePerformIO and malloc-using hacks with everything stuck into IO monad, where Clean is idiomatic and simple, and still faster!
The weird:
  • 1980s called, they want... If your Clean program encounters a runtime error, like off-by-1 access, it will print a message naming the error and quit. This may have been good enough in the 80s, but these days I expect more finesse. Haskell allows you to handle these situations in a customized way.
  • No FFI. WTF?
  • Have someone who speaks English natively proof-read your books, manuals, etc. I won't go into detail here. I too suck at writing.
The short-sighted:
  • License. For a long time, Clean was closed-source, offering only commercial licenses. A few years ago, Clean was released as open source under LGPL. Can you now write and distribute your closed-source commercial code using Clean? I'm not a lawyer, so I couldn't give you a definitive answer, but in doing so, you would be incorporating part of Clean in your generated binary (statically linked runtime). This might not sit well with LGPL and the Free Software Foundation that may demand that you make your code open-source for doing that.
  • Development model. Clean may be open-source, but it's basically developed in-house. I do believe that programmers should be paid for their time and expertise developing software, especially programmers and brilliant as the authors of Clean must be, but the reality is that programming languages are different from most other software: As a consumer of a programming language, your investment both learning and writing code in it is tremendous. You want to be assured that your language will continue to exist in the future and won't cost an arm and a leg later. You simply won't use a language for anything important, unless you get these assurances. That's why, unless your vendor is a predictable monster like MS, for a programming language to be viable it has to either have open standards and multiple vendors, or have a fully open and distributed development and place no restrictions on the runtime (generated binaries) whatsoever. I want to ask Clean authors: is it better to have 5 users of whom 0 or 1 are paying customers or to have thousands of non-paying users, who however might make you famous and buy your books?
Conclusion:

If you are developing performance-demanding code and love FP, Clean may be a very good choice for you, assuming that you don't shy away from LGPL and can live with some eccentricities like missing FFI and 80's style error messages.

Wednesday, April 8, 2009

Beware the monads!

It seems that every noob learning Haskell has put up a blog raving about how monads are great. This might make you think twice about whether monads are the right approach to computation:

Boojum writes on Programming Reddit:

I can get along with the IO monad just fine, for example. But then I also need to carry some state of my own around. For one of my projects, Parsec would be pretty handy, but then that has its own monad too. Fairly quickly I find myself overwhelmed by the proliferation of monads to juggle.

I'm aware that monad transformers are supposed mitigate this problem but I've yet to find a tutorial on them that doesn't leave me even more confused about how to put them all together. I think there's definitely room for a better tutorial on this. If anyone knows of a good one, I'd love to see it.


Oleg Kiselyov, a functional programming researcher and practitioner had this to say:

... it seems that the unending stream of monad tutorials explaining once again what is essentially a functional composition (or the A-normal form, to be more precise) points out that perhaps the concept of a Monad isn't at all a good match for a programming language practice. Monads may be deeply rooted in Category Theory -- but then Peano numerals are too have the clearest mathematical foundations and yet nobody seriously proposes to use them for practical numerical computations.


DDS Manual describes the difficulties of using monads in Haskell:

This puts us in a situation where we really need two copies of every higher-order function, one pure and one monadic. This is even more of a problem when a library (like Data.Map) exports a pure version but not a monadic one. In this case we either have to refactor our code to be less monadic, or spend time writing a function that should really be in the library.