OCaml Weekly News

Hello

Here is the latest OCaml Weekly News, for the week of November 25 to December 02, 2014.

teaching OCaml
Reminder: Next OCaml users in Paris meetup, 9th of december
Sundials/ML 2.5.0
cconv-0.2
Other OCaml News

teaching OCaml

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2014-11/msg00094.html

Robert Muller asked:

Greetings. Bob Muller here, in the CS dept. at Boston College. I've set out to
develop an intro CS course in ML that I hope will be well-suited for similar
universities in the US. My original plan was to teach the course in SML but
after talking with a few people at neighboring schools, I switched to OCaml. I
am now in the final weeks of the first run of the course.  I plan to document
my experience more fully at some point but I wanted to touch base with the
OCaml community because I'm teaching the course again in the spring and I am
leaning toward switching to F#.

While OCaml has in many respects been great and it's easy to see that my
students find the OCaml style of coding very compelling, there are significant
problems. Of course, OCaml wasn't designed for teaching but I'm hoping that
someone on this list might be able to advise me about solutions to some of
these that I just don't know about.

1. Error messages: It's difficult to give good type errors for ML but I was
hoping that the state-of-the-art of type error reporting had improved. When my
students receive a type error, they are utterly mystified,

2. GUIs: several of my problem sets work with simple graphics (e.g., rendering
tessellations) or animations (e.g., a maze walk or a simplified form of
tetris, or the game "Flow"). We have been hobbling along with the Graphics and
Labltk modules for this but it has been more pain than my students ought to
know. We also have some problem sets that work with audio so I would like
support for that.

Any thoughts, ideas and/or leads on either of these would be much appreciated.
I already plan to look at js_of_ocaml more closely.

John Whitington replied:

> 1. Error messages: It's difficult to give good type errors for ML but I was
> hoping that the state-of-the-art of type error reporting had improved. When my
> students receive a type error, they are utterly mystified,

Arthur Chargueraud has done some work on this, by fixing up the error
messages using an auxilliary typechecker. There is an OPAM compiler
switch for it, I recall.

http://gallium.inria.fr/blog/making-it-easier-for-beginners-to-learn-ocaml/

I think it's very important for students to understand how to do
a kind of mock type inference on paper or in their head. This helps
a lot with understanding errors, though they will always happen, and
the error messages are tough to begin with. For example, from the
standard library:

let rec fold_left f accu l =
  match l with
    [] -> accu
  | a::l -> fold_left f (f accu a) l

1. Looking at the form of the first line, the type must have the form

... -> ... -> ... -> ...

2. Looking at lines 2 and 3, we know that l is a list, and that the
final output has the same type as the 'accu' (assign it 'a):

... -> 'a -> ... list -> 'a

3. Look at the last line. Now we can see that f is a function, because
it is applied, and we can see its first argument is the same type as
'accu' too:

('a -> ... -> ...) -> 'a -> ... list -> 'a

Since 'accu' feeds back through the recursive call, 'f accu a' must
have that  type too:

('a -> ... -> 'a) -> 'a -> ... list -> 'a

We have no information to show that the type of the list which 'a'
comes from in 'f accu a' must be the same as 'a, so call it 'b:

('a -> 'b -> 'a) -> 'a -> 'b list -> a

If they can do that one, they can probably do most things that might
appear in a beginners course.

I'm also strongly opposed to beginners being encouraged to use type
annotations in code, which people sometimes do to get 'better' error
messages. I think it confuses more than helps. But that seems to be
a controversial opinion, from what I can tell.

> 2. GUIs: several of my problem sets work with simple graphics (e.g., rendering
> tessellations) or animations (e.g., a maze walk or a simplified form of
> tetris, or the game "Flow"). We have been hobbling along with the Graphics and
> Labltk modules for this but it has been more pain than my students ought to
> know. We also have some problem sets that work with audio so I would like
> support for that.

I wrote a version of Graphics which writes to PDF. If you have a PDF
reader which re-renders when the file on disk is updated, this can be
used for the non-interactive stuff, and it feels a bit more modern
than the built-in graphics window.

https://github.com/johnwhitington/graphicspdf

You can get it in OPAM.

Robert Muller then said and John Whitington replied:

> Thank you John. I will look at your graphics library, much appreciated.
> I didn't require it this first time through but many of my students
> bought your book:
>
> http://www.cs.bc.edu/~muller/teaching/cs110105/f14/lib/html/textbooks.html
>
> They tell me it is very helpful.
>
> To me, what is needed to get ML-in-101 off the ground is simplicity:
>
> 1. a simple implementation/IDE that is trivial to install and "just
> works" on macs, Windows (and linux, I suppose), I am thinking of
> something like the experience I had in the past with Dr. Java.
> 2. a simple, no fuss library for graphics, animation and audio something
> like Sedgewick & Wayne's stdlib (http://algs4.cs.princeton.edu/code/)
> 3. decent error messages for both syntax errors and type errors.
>
> Ideally, I would prefer that both the IDE and the graphics/audio were
> hosted in a brower
> much like in Elm (http://elm-lang.org/).

Have you seen this? I've not used it but I believe it's related:

https://github.com/andrewray/iocaml

> I am only dimly aware of other universities teaching ML. I know that
> some folks in Denmark are using F#. Do you know if any schools in Europe
> are using OCaml in intro courses?

Here's a list, to which you can add yourself:

http://ocaml.org/learn/teaching-ocaml.html

We teach Standard ML at Cambridge, so we have even fewer tools :-) But
then we don't do anything graphical. Our first years do ML & Java, so
they get plenty of graphical / GUI work with Java.

Here's what our ML course looks like (and, basically, has done since 1994):

http://www.cl.cam.ac.uk/teaching/1415/FoundsCS/fcs-notes.pdf

Drup then added:

There are custom toplevels for js_of_ocaml where you can experiment
with reactive and interactive features, a bit like Elm's "IDE".

See https://github.com/ocsigen/js_of_ocaml#toplevel and
http://ocsigen.github.io/js_of_ocaml/#version=4.02.0
The source code is [here][1] and should be quite easy to modify if you
want to add a library (or modify the UI to make the drawing area
bigger).

[1]: https://github.com/ocsigen/js_of_ocaml/tree/master/toplevel

Xavier Leroy also added:

>> Ideally, I would prefer that both the IDE and the graphics/audio
>> were hosted in a brower much like in Elm (http://elm-lang.org/).

A first step in this direction:  http://try.ocamlpro.com/

Not a full IDE, obviously, but already very useful for teaching the
basics.

Daniel Bünzli replied to the initial question:

> 2. GUIs: several of my problem sets work with simple graphics (e.g., rendering
> tessellations) or animations (e.g., a maze walk or a simplified form of
> tetris, or the game "Flow"). We have been hobbling along with the Graphics and
> Labltk modules for this but it has been more pain than my students ought to
> know.

You may want to have a look at Vg (http://erratique.ch/software/vg)
which can be used both for offline (PDF, SVG) and interactive
rendering (HTML canvas). Here are two small interactive examples (they
are a little bit sluggish in ff though, needs investigation):

http://erratique.ch/software/useri/demos/chain.html
http://erratique.ch/tmp/2048.html

The later example is from an OCaml Labs OCaml tutorial available here:
https://github.com/ocamllabs/2048-tutorial

Kenichi Asai also replied:

I have been using OCaml as a teaching language for quite a while.

> 1. Error messages: It's difficult to give good type errors for ML but I was
> hoping that the state-of-the-art of type error reporting had improved. When my
> students receive a type error, they are utterly mystified,

Not directly addressing the error messages, but this spring, I used
our type debugger for OCaml in the introductory course.  By answering
questions, it leads you to the source of the type error, with better
error messages.  With proper instruction, I found that students
actually learn how OCaml types work from the interaction with the type
debugger.  It also introduces a few language levels which are good for
novices.

http://pllab.is.ocha.ac.jp/~asai/TypeDebugger/

(English error messages are not polished compared to Japanese error
messages.  Your feedback welcome.)

> 2. GUIs: several of my problem sets work with simple graphics (e.g., rendering
> tessellations) or animations (e.g., a maze walk or a simplified form of
> tetris, or the game "Flow"). We have been hobbling along with the Graphics and
> Labltk modules for this but it has been more pain than my students ought to
> know. We also have some problem sets that work with audio so I would like
> support for that.

One of my students is porting the universe teachpack of Racket into
OCaml, which enables easy game programming similarly to Racket.
I am now encouraging her to make the library public...

Yaron Minsky also replied:

For those following this thread because they're teaching with OCaml,
I really do recommend subscribing to teaching@ocaml.org.  We're
already starting to discuss better support in OPAM for teaching and
what's needed there, and it would be great to have everyone who has a
stake in this looped into that conversation.

Here's the link:

   http://lists.ocaml.org/listinfo/teaching

Arthur Charguéraud also replied:

I have been teaching OCaml for a while, and like you I found error
messages to be a major issue. This motivated my recent work on
improving type error messages for OCaml. 

My patch is ready for action.
opam switch 4.03.0+pr102
ocamlc -new-type-errors -strict-sequence test.ml 
For more information: http://arthur.chargueraud.org/teach/ocaml/

If you have a chance to try it on beginners, please share your
feedback!

Best,
Arthur Charguéraud

PS: I know of on-going efforts to try and improve parsing errors too
---it be great to have those fixed too.

PPS: although my patch is usable, I have planned to fix a few things.
- Have a two-column display for type errors on applications,
with expected types and provided types.
- Remove bracket delimiters around types and use newlines instead.
- Have "-new-type-errors" imply "-strict-sequence",
- Document the fact it does not support GADTs, and does not 
typecheck with "-principal", in particular code that uses overloading 
of record fields.
- Isolate the code associated with "-new-type-errors" so that the
patch
may have a chance of being considered for integration in OCaml.

Alain Frisch then asked and Arthur Charguéraud replied:

>> - Document the fact it does not support GADTs, and does not
>>   typecheck with "-principal", in particular code that uses
>>   overloading of record fields.
>
> Is it really what you mean?  I thought that your alternative
> type-checking rules would be coherent with the existing ones
> *provided* that -principal is enabled, so I'd expect this option to
> be forced when using your mode.

Ah, sorry for the typo, I meant to say: "the patch is not guaranteed
to report appropriate error messages for code that would not
type-check with the -principal flag activated".

In other words, "the patch works as long as the the type-checking of
the code does not depend on the order in which unifications are
performed".

The reason is that the patch performs unification in a different order
than the standard algorithm (precisely to avoid left-right bias).

Then, the question of whether "-new-type-error" should systematically
compile all the code (including the one that does not contain type
errors) using "-principal" is another question, I'm not sure what is
the best choice.

Reminder: Next OCaml users in Paris meetup, 9th of december

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2014-11/msg00106.html

Thomas Braibant announced:

This is a reminder that the next OCaml users in Paris (OUPS) meetup
will take place on the 9th of december at *Irill*, in Paris.

The program is:
- 19H: Introduction
- 19H05: Mihhail Aizatulin, Jane Street. "OCaml: the ultimate
configuration language?"
- 19H35: Frédéric Bour. "Modular implicits: réconcilier OCaml avec les
types classes"
- 20H05: Pierrick Couderc. "Gestion des namespaces en OCaml (depuis
4.02 et expérimentations)".
- 20h35: Vincent Balat. Tutoriel " Appli Web réactive client-serveur
avec Ocsigen"

There will be drinks and pizzas afterwards, thanks to Jane Street.

For more information, go to
http://www.meetup.com/ocaml-paris/events/215017872/

As usual, registration via meetup is preferable. If you do not wish to
do so, please send me an email if you want to be counted for food.

Sundials/ML 2.5.0

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2014-11/msg00112.html

Timothy Bourke announced:

We are pleased to announce Sundials/ML, an OCaml interface to the
Sundials suite of numerical solvers (CVODE, CVODES, IDA, IDAS, KINSOL).

Information and documentation: http://inria-parkas.github.io/sundialsml/
Source code (BSD):             https://github.com/inria-parkas/sundialsml

opam install sundialsml # (requires Sundials 2.5.0)

We gratefully acknowledge the original authors of Sundials, and the
support of the ITEA 3 project 11004 MODRIO (Model driven physical
systems operation), Inria, and the Departement d'Informatique de l'ENS.

Timothy Bourke, Jun Inoue, and Marc Pouzet.

Gabriel Scherer then asked and Jun Inoue replied:

> Thanks for the significant effort put in documenting the bindings
> (and, of course, the cool software and research); your "information
> and documentation" page is impressive.
>
> The page has a very interesting performance comparison of numeric code
> partly or fully written in OCaml (using bigarrays of floats) -- and
> the not-so-surprising results is that the run times of the OCaml
> programs are between 100% and 200% of the run time of the reference C
> implementation.
>
> ( http://inria-parkas.github.io/sundialsml/perf.opt.png )
>
> I'm curious about this specific part of the explanation:
>
>> For instance, some OCaml versions spend a significant fraction of their time
>> in printf, and we were able to lower their ratios by instead using print_string and print_int.
>
> The new 4.02 implementation of formats, due to Benoît Vaugon, should
> be significantly faster (in my experience they match the performance of the
> less-readable print_* sequence in most situations). Did you try those
> OCaml versions with 4.02?

Thank you for sharing this interesting information, Gabriel!  We
benchmarked the code exclusively with 4.01.0 and didn't know about the
performance boost in printf.

I just measured the performance with
examples/kinsol/serial/kinFerTron_dns.ml, which was one of the
examples that had the most pronounced effect.  As you suggest, the
numbers are a lot closer with 4.02.1.  The median wall-clock times of
10 runs were:

printf,  OCaml 4.02.1: 1.76[s]
print_*, OCaml 4.02.1: 1.62[s]
printf,  OCaml 4.01.0: 2.04[s]
print_*, OCaml 4.01.0: 1.70[s]

The overhead of using printf is about 8% in 4.02.1, as opposed to
about 20% in 4.01.0.  So in 4.02 the effect is noticeably smaller,
though not unmeasurable.  We should note this in the doc in a future
release.

FYI, the experiment can be reproduced as follows (in bash syntax),
using the attached patch (referred to as
/tmp/kinFerTron_dns_printf.diff below):

# Start at the root of the source tree.
opam switch 4.02.1; eval `opam config env`
./configure && make clean all
cd examples/kinsol/serial
# Measure without modification.
make PERF_DATA_POINTS=10 kinFerTron_dns.opt.perf
../../utils/crunchperf -m kinFerTron_dns.opt.perf >
no-printf-4.02.1-kinFerTron_dns.opt.perf
# Apply patch and measure again.
patch -p4 < /tmp/kinFerTron_dns_printf.diff
make PERF_DATA_POINTS=10 kinFerTron_dns.opt.perf
../../utils/crunchperf -m kinFerTron_dns.opt.perf >
printf-4.02.1-kinFerTron_dns.opt.perf
# Undo changes.
patch -p4 -R < /tmp/kinFerTron_dns_printf.diff

cd ../../..
# Back at the root of the source tree.
opam switch 4.01.0; eval `opam config env`
# (Basically the same thing as above)
./configure && make clean all
cd examples/kinsol/serial
make PERF_DATA_POINTS=10 kinFerTron_dns.opt.perf
../../utils/crunchperf -m kinFerTron_dns.opt.perf >
no-printf-4.01.0-kinFerTron_dns.opt.perf
patch -p4 < /tmp/kinFerTron_dns_printf.diff
make PERF_DATA_POINTS=10 kinFerTron_dns.opt.perf
../../utils/crunchperf -m kinFerTron_dns.opt.perf >
printf-4.01.0-kinFerTron_dns.opt.perf

# Summarize results
for i in *kinFerTron_dns.opt.perf; do printf "\n[$i]\n";
../../utils/crunchperf -s $i; done

Jun Inoue then corrected:

> The overhead of using printf is about 8% in 4.02.1, as opposed to
> about 20% in 4.01.0.  So in 4.02 the effect is noticeably smaller,
> though not unmeasurable.  We should note this in the doc in a future
> release.

Correction: the numbers are the medians of the ratios between OCaml
and C (wall-clock time).  The C version takes right about 1[s], so the
conclusion is morally the same, though.

cconv-0.2

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2014-12/msg00001.html

Simon Cruanes announced:

I'm happy to announce the release of cconv 0.2 [1]. CConv is a
BSD-licensed combinators library for serializing and unserializing data
structures.  It particularity is that it doesn't target a specific
serialization format, but any format that is expressive enough (using
some type machinery involving a few GADTs and visitor patterns).

The rationale is that, for library writers, the current go-to practice
to provide some serialization support is to use type_conv and "with sexp".
This requires camlp4 and a lot of libraries, and forces the choice of
S-expressions on the user. Instead, by providing CConv encoders and
decoders, the user could choose which serialization format (even simple
printing) to use by herself.

To serialize values of type 'a, one can write a value of type ['a
CConv.Encode.encoder] that can be used with any serialization backend;
to unserialize, it's a value of type ['a CConv.Decode.decoder].
Then, cconv ships with three (optional) backends, to Yojson, Bencode and
Sexplib; an ['a encoder] can be used with any of those three (same for
decoders). See the readme [2] for more details.

There will be a ppx_deriving [3] instance [4] soon for automatically
generating decoders and encoders based on types.

If anyone used cconv-0.1, I apologize for the change of interface. The
ideas took time to evolve...

As usual, bug reports, comments, criticism and accusations of
reinventing the wheel are all welcome.

Cheers,

-- 
Simon

[1] https://github.com/c-cube/cconv/tree/0.2
[2] https://github.com/c-cube/cconv/blob/0.2/README.md#usage
[3] https://github.com/whitequark/ppx_deriving
[4] https://github.com/c-cube/ppx_deriving_cconv (work in progress, comments appreciated!)

Thomas Gazagnaire asked and Simon Cruanes replied:

> Do you have any benchmarks to compare CConv and similar camlp4 generators?

I hadn't, but I just wrote very basic ones to compare with
ppx_deriving_yojson (should be similar to camlp4). The code is at
https://github.com/c-cube/cconv/blob/e80ab0e6c458a01b419ea69c7f41d0a350aebbad/bench/run_bench.ml

It only compares times for encoding into Json right now, with the
following results (recursive records first, recursive terms then;
"manual" is a handwritten encoding function, "cconv" the combinators
version, and "deriving_yojson" uses @whitequark's nice deriver):

% ./run_bench.native

benchmark points
Throughputs for "manual", "cconv", "deriving_yojson" each running for at least 4 CPU seconds:
         manual:  4.20 WALL ( 4.20 usr +  0.00 sys =  4.20 CPU) @ 3057270.82/s (n=12846652)
          cconv:  4.21 WALL ( 4.21 usr +  0.00 sys =  4.21 CPU) @ 784724.92/s (n=3300553)
deriving_yojson:  4.21 WALL ( 4.21 usr +  0.00 sys =  4.21 CPU) @ 3065779.07/s (n=12891601)
                     Rate           cconv          manual deriving_yojson
          cconv  784725/s              --            -74%            -74%
         manual 3057271/s            290%              --             -0%
deriving_yojson 3065779/s            291%              0%              --

benchmark terms
Throughputs for "manual", "cconv", "deriving_yojson" each running for at least 4 CPU seconds:
         manual:  4.20 WALL ( 4.20 usr +  0.00 sys =  4.20 CPU) @ 1679609.71/s (n=7057720)
          cconv:  4.20 WALL ( 4.20 usr +  0.00 sys =  4.20 CPU) @ 726619.43/s (n=3051075)
deriving_yojson:  4.20 WALL ( 4.20 usr +  0.00 sys =  4.20 CPU) @ 1624740.65/s (n=6822286)
                     Rate           cconv deriving_yojson          manual
          cconv  726619/s              --            -55%            -57%
deriving_yojson 1624741/s            124%              --             -3%
         manual 1679610/s            131%              3%              --


So yeah, unsurprisingly, there is some overhead :(. There is some
dispatching through records-of-functions going on, because combinators
should work with any backend, whereas specialized encoders can build the
result directly.

Simon Cruanes later added:

So, it is possible (by changing slightly the interface of encoders to
avoid an intermediate structure), to obtain the following (I also added
a benchmark for decoding, comparing cconv to ppx_deriving_yojson):

====== BEGIN BENCH ======

% ./run_bench.native
encoding...

benchmark points
Throughputs for "manual", "cconv", "deriving_yojson" each running for at least 4 CPU seconds:
         manual:  4.18 WALL ( 4.18 usr +  0.00 sys =  4.18 CPU) @ 2826676.15/s (n=11818333)
          cconv:  4.20 WALL ( 4.20 usr +  0.00 sys =  4.20 CPU) @ 941618.68/s (n=3951032)
deriving_yojson:  4.20 WALL ( 4.20 usr +  0.00 sys =  4.20 CPU) @ 2824134.70/s (n=11867014)
                     Rate           cconv deriving_yojson          manual
          cconv  941619/s              --            -67%            -67%
deriving_yojson 2824135/s            200%              --             -0%
         manual 2826676/s            200%              0%              --

benchmark terms
Throughputs for "manual", "cconv", "deriving_yojson" each running for at least 4 CPU seconds:
         manual:  4.20 WALL ( 4.20 usr +  0.00 sys =  4.20 CPU) @ 1125111.48/s (n=4723218)
          cconv:  4.21 WALL ( 4.21 usr +  0.00 sys =  4.21 CPU) @ 789967.68/s (n=3324184)
deriving_yojson:  4.20 WALL ( 4.20 usr +  0.00 sys =  4.20 CPU) @ 1100920.75/s (n=4626069)
                     Rate           cconv deriving_yojson          manual
          cconv  789968/s              --            -28%            -30%
deriving_yojson 1100921/s             39%              --             -2%
         manual 1125111/s             42%              2%              --


decoding...

benchmark points
Throughputs for "cconv", "deriving_yojson" each running for at least 3 CPU seconds:
          cconv:  3.16 WALL ( 3.16 usr +  0.00 sys =  3.16 CPU) @ 493501.11/s (n=1558970)
deriving_yojson:  3.15 WALL ( 3.15 usr +  0.00 sys =  3.15 CPU) @ 1248812.96/s (n=3932512)
                     Rate           cconv deriving_yojson
          cconv  493501/s              --            -60%
deriving_yojson 1248813/s            153%              --

benchmark terms
Throughputs for "cconv", "deriving_yojson" each running for at least 3 CPU seconds:
          cconv:  3.12 WALL ( 3.12 usr +  0.00 sys =  3.12 CPU) @ 577372.88/s (n=1800826)
deriving_yojson:  3.14 WALL ( 3.14 usr +  0.00 sys =  3.14 CPU) @ 1492303.95/s (n=4688819)
                     Rate           cconv deriving_yojson
          cconv  577373/s              --            -61%
deriving_yojson 1492304/s            158%              --
./run_bench.native  45.01s user 1.16s system 99% cpu 46.181 total

====== END BENCH ======

It is still slower to encode records (intermediate list), and decoding also
has some overhead. However, this is the cost of translating
between the type and the JSON representation; I think it should be
negligible compared to the actual IO + printing/parsing cost.

The new, more efficient interface will probably appear in a future release.
With ppx_deriving_cconv that shouldn't be too big a problem...

Other OCaml News

From the ocamlcore planet blog:

Thanks to Alp Mestan, we now include in the OCaml Weekly News the links to the
recent posts from the ocamlcore planet blog at http://planet.ocaml.org/.

Height, Depth and Level of a Tree:
  http://typeocaml.com/2014/11/26/height-depth-and-level-of-a-tree/

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.

Alan Schmitt