Previous week Up Next week


Here is the latest Caml Weekly News, for the week of March 12 to 19, 2013.

  1. Case study in optimization: porting a compiler from OCaml to F#
  2. OPAM 1.0.0 released !
  3. Use of OCaml in universities and engineering schools
  4. Riakc 0.0.0
  5. Core Suite 109.14.00 released + custom_printf
  6. Other Caml News

Case study in optimization: porting a compiler from OCaml to F#


Jon Harrop said:
There has been some discussion here about the implications of single- vs
multi-threaded garbage collectors and, in particular, their performance in
the context of the kinds of metaprogramming that OCaml has traditionally
been used for.

I recently ported a compiler written in OCaml to the F# programming language
for a client and performance turned out to be an issue so I'd like to
present this as a case study to provide some real data. Unfortunately I
cannot disclose precise details.

The original compiler was 15kLOC of OCaml code. The amounts of DSL code that
it consumes and C code that it produces can be considerable and compilation
can take minutes. Consequently, performance is valued by my client's
customers and, therefore, the original code had been optimized for OCaml's
performance characteristics.

A direct translation of the OCaml code to F# proved to be over 10x slower.
This was so slow that it impeded testing my translation so I did some
optimization early. Specifically, profiling indicated that the biggest
problem was the high rate of exceptions being raised and caught. Exceptions
are around 600x slower on .NET than in OCaml so this can quickly degrade
performance. I changed all of the hot paths to use union types (usually
option types) instead of exceptions, according to F# idioms. Although this
incurs a lot of unnecessary boxing in F# the performance improvements were
substantial and the F# version became 5x slower than the OCaml.

On a related note, thorough testing showed that my almost-blind translation
of 15kLOC of code was completely error free. I think this is a real
testament to the power of ML's static type system. The only error I have
introduced so far occurred when I was replacing the use of an exception in a
function with a union type.

After demonstrating the correctness of the translation, my effort turned to
trying to improve performance in an attempt to compete with the original
OCaml code. I had believed that this could well prove to be prohibitively
difficult or even impossible because symbolic code is OCaml's main strength.
However, I have managed to make the F# around 8x faster than it was and, in
particular, substantially faster than the original OCaml.

So this non-trivial symbolic code base has not had its performance suffer
from the adoption of a multicore-friendly garbage collector.
Julien Verlaguet asked and Jon Harrop replied:
> Thanks for sharing this case-study with us!

No problem. I found it in my drafts folder. :-)

> Have you tried parallelizing the OCaml version?

No. I didn't touch the OCaml code at all.

> I am thinking pre-forked processes communicating with pipes?

That would work but it would be a lot of effort compared to in F#.

> We write a lot of large-scale static-analysis in OCaml here at Facebook.

Good to hear. :-)

> Parallelizing them with pre-forked processes gave us very good

I'm not sure how much message passing would be required in this case and
don't have time to investigate.

> I would be curious to see a case-study of pre-forked OCaml vs threaded F#.

Me too. Only problem is that fork-based parallel OCaml code takes a long
time to write in comparison. Incidentally, the F# does not make direct use
of threads.
Pierre-Alexandre Voye then suggested:
So you could maybe use ?
Parmap.parmap ~ncores:4 funct (Parmap.L elem_list)
Jon Harrop replied and Jean-Marc Alliot said:
We have been using Parmap and are quite happy with it (we use it as an
alternative to the Ocaml/MPI implementation when MPI is not strictly
required), with excellent scalability on our applications (even if we
had to change from time to time a little bit or our code, regarding
the fact that Parmap does not allow "in place" modifications).

> What happens if the inner function returns results via mutation? I
> assume you must rearrange the code to return all results explicitly
> and they will then be deep copied (which destroys scalability due to
> limited shared memory bandwidth on multicores).

As fas as I can tell there is no "copy" of any form. Parmap only
collects results. If you do "in place" modifications, they are lost.
Parmap is, in a way, "functional"...

> Does it do load balancing? I assume not given that ncores is
> hardcoded.

There is an optional argument (chunksize) to manually control load

> Does a parmap with ncores=4 inside a parmap with ncores=4 create 16
> processes? Does it deep copy inputs and/or outputs?

Regarding outputs, Parmap uses a shared memory area and
Marshal/Unmarshal to collect outputs. There is an optimization done
when array of floats are returned, as marshalling is not used, thus
reducing the overhead.

> I assume so, at least for outputs, because you cannot write results
> in-place without a shared mutable heap. Does parmap have a large
> constant overhead? I assume so if it is forking processes.

Well, it depends on what you call "large constant overhead". Forking
is not a so expensive primitive on modern Unix systems, because pages
are only copied when written-to (copy-on-write).

> Another solution is to prefork and explicitly communicate all inputs
> using message passing but this is equally problematic. You have to
> rearrange the code. Deep copying inputs also destroys scalability.
> Cheers, Jon.

There is an article describing the implementation (I think it is also
available online):

A “minimal disruption” skeleton experiment: seamless map & reduce
embedding in OCaml
M. Danelutto, R. Di Cosmo
Procedia Computer Science 9 ( 2012 ) 1837 – 1846
Francois Berenger also replied:
I have observed and measured perfect scalability with up to 4 cores
of an OCaml program using Parmap.
With more than 4 cores, the scalability was degrading.

I think the scalability of the program depends only on the
granularity of the tasks. The tasks were coarse in my case.
Alain Frisch replied to the original post and Jon Harrop then said:
> Too bad, because the really interesting part would to know (i) what kind of
> optimizations you had to do on the F# version (and in particular whether
> they make us of parallelism),

* Replaced the use of exceptions for control flow with variant types.
* Replaced use of fprintf with lower-level .NET functions.
* Parallelized some of the symbolic code.
* Algorithmic optimization to a search function.

> (ii) whether (some of) those optimizations could be applicable to the OCaml
> version as well, and

* No need to replace exceptions in OCaml.
* No need to replace printf and friends in OCaml.
* Could try to parallelize the OCaml but it would be hard to do efficiently.
* Algorithmic optimization could also be done to the OCaml (IIRC, this was a
fairly minor speedup).

> (iii) how much effort you initially put into optimizing the OCaml version (just
> tweaking the GC parameters can easily give very substantial speedups for
> symbolic processing).

None. I didn't change the OCaml code at all and didn't try different GC
parameters. However, it had already been quite heavily optimized.
Alain Frisch then said:
> * No need to replace exceptions in OCaml.

It depends. If you compile in debug mode and run with backtrace
enabled, exceptions are not so fast any more.

> * No need to replace printf and friends in OCaml.

Printf can be quite slow. Format direct-style as well. And
Format.printf is even worse.

( )
oliver asked and Jon Harrop replied:
> What is missing, is the information, how many cores / processors the machine
> has, on which the F# code runs, and if the OCaml code runs on the same 
> machine.

Benchmark measurements used for comparison between OCaml and F# were always 
done on the same machine running Windows. I used two machines:

* 2-core 1.67GHz Intel Atom N570 based netbook.
* 8-core 2.0GHz Intel Xeon E5405 based workstation.

The client verified the relative performance on their own machines.

> What about Ocaml Byteocde vs. Nativecode?

The performance comparison was done using native code. We did not benchmark 
OCaml bytecode.

> And what, if the re-designt program would be back-ported to OCaml?

Only minor changes were made, no redesign. Some of those changes could be 
back-ported to the OCaml but there is no motivation to do so.

OPAM 1.0.0 released !


Thomas Gazagnaire announced:
I am *very* happy to announce the first official release of OPAM!

Many of you already know and use OPAM so I won't be long. Please read for a longer

1.0.0 fixes many bugs and add few new features to the previously
announced beta-release.

The most visible new feature, which should be useful for beginners
with OCaml and OPAM, is an auto-configuration tool. This tool easily
enables all the features of OPAM (auto-completion, fix the loading of
scripts for the toplevel, opam-switch-eval alias, etc). This tool runs
interactively on each `opam init` invocation. If you don't like OPAM
to change your configuration files, use `opam init --no-setup`. If you
trust the tool blindly, use `opam init --auto-setup`. You can later
review the setup by doing `opam config setup --list` and call the tool
again using `opam config setup` (and you can of course manually edit
your ~/.profile (or ~/.zshrc for zsh users), ~/.ocamlinit and

Please report:
- Bug reports and feature requests for the OPAM tool:
- Packaging issues or requests for a new package:
- General queries to:
- More specific queries about the internals of OPAM to:

On behalf on the OPAM team,

=== Install ===

Packages for Debian and OSX (at least homebrew) should follow shortly
and I'm looking for volunteers to create and maintain rpm packages.
The binary installer is up-to-date for Linux and Darwin 64-bit
architectures, the 32-bit version for Linux should arrive shortly.

If you want to build from sources, the full archive (including
dependencies) is available here:

=== Upgrade ===

If you are upgrading from 0.9.* you won't have anything special to do
apart installing the new binary. You can then update your package
metadata by running `opam update`. If you want to use the auto-setup
feature, remove the "eval `opam config env` line you have previously
added in your ~/.profile and run `opam config setup --all`.

So everything should be fine. But you never know ... so if something
goes horribly wrong in the upgrade process (of if your are upgrading
from an old version of OPAM) you can still trash your ~/.opam,
manually remove what OPAM added in your ~/.profile (~/.zshrc for zsh
users) and ~/.ocamlinit, and start again from scratch.

=== Random stats ===

Great success on github. Thanks everybody for the great contributions! +2000 commits, 26 contributors +1700 commits, 75
contributors, 370+ packages

+400 unique visitor per week, 15k 'opam update' per week
+1300 unique visitor per month, 55k 'opam update' per month
3815 unique visitor since the alpha release

=== Changelog ===

The full change-log since the beta release in January:

1.0.0 [Mar 2013]
* Improve the lexer performance (thx to @oandrieu)
* Fix various typos (thx to @chaudhuri)
* Fix build issue (thx to @avsm)

0.9.6 [Mar 2013]
* Fix installation of pinned packages on BSD (thx to @smondet)
* Fix configuration for zsh users (thx to @AltGr)
* Fix loading of `~/.profile` when using dash (eg. in Debian/Ubuntu)
* Fix installation of packages with symbolic links (regression
introduced in 0.9.5)

0.9.5 [Mar 2013]
* If necessary, apply patches and substitute files before removing a
* Fix `opam remove <pkg> --keep-build-dir` keeps the folder if a
source archive is extracted
* Add build and install rules using ocamlbuild to help distro
* Support arbitrary level of nested subdirectories in packages
* Add `opam config exec "CMD ARG1 ... ARGn" --switch=SWITCH` to
execute a command in a subshell
* Improve the behaviour of `opam update` wrt. pinned packages
* Change the default external solver criteria (only useful if you have
aspcud installed on your machine)
* Add support for global and user configuration for OPAM (`opam config
* Stop yelling when OPAM is not up-to-date
* Update or generate `~/.ocamlinit` when running `opam init`
* Fix tests on *BSD (thx Arnaud Degroote)
* Fix compilation for the source archive

0.9.4 [Feb 2013]
* Disable auto-removal of unused dependencies. This can now be enabled
on-demand using `-a`
* Fix compilation and basic usage on Cygwin
* Fix BSD support (use `type` instead of `which` to detect existing
* Add a way to tag external dependencies in OPAM files
* Better error messages when trying to upgrade pinned packages
* Display `depends` and `depopts` fields in `opam info`
* `opam info pkg.version` shows the metadata for this given package
* Add missing `doc` fields in `.install` files
* `opam list` now only shows installable packages

0.9.3 [Feb 2013]
* Add system compiler constraints in OPAM files
* Better error messages in case of conflicts
* Cleaner API to install/uninstall packages
* On upgrade, OPAM now perform all the remove action first
* Use a cache for main storing OPAM metadata: this greatly speed-up
OPAM invocations
* after an upgrade, propose to reinstall a pinned package only if
there were some changes
* improvements to the solver heuristics
* better error messages on cyclic dependencies

0.9.2 [Jan 2013]
* Install all the API files
* Fix `opam repo remove repo-name`
* speed-up `opam config env`
* support for `opam-foo` scripts (which can be called using `opam
* 'opam update pinned-package' works
* Fix 'opam-mk-repo -a'
* Fix 'opam-mk-repo -i'
* clean-up pinned cache dir when a pinned package fails to install

0.9.1 [Jan 2013]
* Use ocaml-re 1.2.0
Thomas Gazagnaire later added:
As I've received few questions related to upgrading OPAM itself, I've
added Anil's response to [1] to the FAQ:


Use of OCaml in universities and engineering schools


Nicolas Barnier asked, triggering many answers:
We use OCaml at ENAC (French Civil Aviation University) to teach the
basics of programming and the design of algorithms in the first year
course of CS major since 1995. The cursus is now under deep
revisionand we're trying to advocate its convenience in the new cursus
to our hierarchy and colleagues.

We were thus wondering which engineering schools and universities are
actually currently using OCaml, and for which cursus. Short of finding
a long enough list on the OCaml websites or by googling, we've decided
to try the caml-list for feedback. So if you are involved in a CS
course using OCaml, we would greatly appreciate that you let us know,
so as to help us arguing to keep this great language in our cursus.
Roberto Di Cosmo asked, Ashish Agarwal suggested, and Philippe Wang said:
> Roberto Di Cosmo wrote:
>> We would really need a central place to keep this info well
>> structured. Even a simple wiki page would do.

Ashish Agarwal wrote:

> Why not the website. I've created an issue to keep track of the
> suggestions so far:

Yes, there should be such a page somewhere, seems to be the
right place for that.

I started getting more information about the courses:
(it's a draft)

Riakc 0.0.0


Malcolm Matalka announced:
I'd like to announce the first release of Riakc, a Riak client in Ocaml
written in Jane St's Core/Async. It is in early development, so bugs
are abound, but hopefully it is a good start for anyone interested in
using Riak from Ocaml.

Thomas has kindly merged riakc into opam today, so it can be installed

opam install riakc

The source can be found:

Core Suite 109.14.00 released + custom_printf


Jeremie Dimino announced:
I'm pleased to announce the 109.14.00 release of the Core suite.

The new package of the week is custom_printf: a syntax extension for
format strings.  Formats prefixed with [!] support the new conversion
specifier [%{<Module>}] and a few variants.  Arguments are wrapped
with the appropriate conversion function.

For example:

    printf !"%{Float}" 42.0;
    printf !"%{Float.pretty}" 42.0;
    printf !"%{sexp:int * string}" (42, "foo");

is the same as:

    printf "%s" (Float.to_string 42.0);
    printf "%s" (Float.Format.pretty 42.0);
    printf "%s" (Sexp.to_string_hum (<:sexp_of< int * string >> (42, "foo")));

Files and documentation for this release are available on our website
and all packages are in opam:

Here are the changelogs for versions 109.12.00 to 109.14.00:

# 109.12.00

## async_extra

- Made explicit the equivalence between type `Async.Command.t` and
type `Core.Command.t`.

## async_unix

- Fixed a bug in `Fd.syscall_in_thread`.

  The bug could cause:

  Fd.syscall_in_thread bug -- should be impossible

  The bug was that `syscall_in_thread` raised rather than returning `Error`.
- Changed `Tcp.connect` and `Tcp.with_connect` to also supply the
connected socket.

  Supplying the connected socket makes it easy to call `Socket`
  functions, e.g.  to find out information about the connection with
  `Socket.get{peer,sock}name`.  This also gives information about the IP
  address *after* DNS, which wouldn't otherwise be available.

  One could reconstruct the socket by extracting the fd from the
  writer, and then calling `Socket.of_fd` with the correct
  `Socket.Type`.  But that is both error prone and not discoverable.
- Added `Writer.schedule_bigsubstring`, which parallels

## core

- Add some functions to `Byte_units`.
  - Added functions: `to_string_hum`, `scale`, `Infix.//`.
  - Eliminated the notion of "preferred measure", so a `Byte_units.t`
    is just a `float`.
- Improved the performance of `Array.of_list_rev`.

  The new implementation puts the list elements directly in the right
  place in the resulting array, rather that putting them in order and
  then reversing the array in place.

  Benchmarking shows that the new implementation runs in 2/3 the time of
  the old one.
- Fixed `Fqueue.t_of_sexp`, which didn't work with `sexp_of_t`.

  There was a custom `sexp_of_t` to abstract away the internal record
  structure and make the sexp look like a list, but there wasn't a
  custom `t_of_sexp` defined, so it didn't work.
- Added `Stable.V1` types for `Host_and_port`.
- Removed `Identifiable.Of_sexpable` and `Identifiable.Of_stringable`,
  in favor of `Identifiable.Make`

  `Identifiable.Of_sexpable` encouraged a terrible implementation of
  `Identifiable.S`.  In particular, `hash`, `compare`, and bin_io were
  all built by converting the type to a sexp, and then to a string.

  `Identifiable.Of_stringable` wasn't as obviously bad as
  `Of_sexpable`.  But it still used the string as an intermediate,
  which is often the wrong choice -- especially for `compare` and
  `bin_io`, which can be generated by preprocessors.

  Added `Identifiable.Make` as the replacement.  It avoids using sexp
  conversion for any of the other operations.
- Added `List.intersperse` and `List.split_while`.

  These came from `Core_extended.List`.

  val intersperse : 'a list -> sep:'a -> 'a list
  val split_while : 'a list -> f:('a -> bool) -> 'a list ** 'a list
- Added a functor, `Pretty_printer.Register`, for registering pretty printers.
  The codifies the idiom that was duplicated in lots of places:

  let pp formatter t = Format.pp_print_string formatter (to_string t)
  let () = Pretty_printer.register "Some_module.pp")

## fieldslib

- Added back `Fields.fold` to `with fields` for `private` types.

  We had removed `Fields.fold` for `private` types, but this caused
  some pain.  So we're putting it back.  At some point, we'll patch
  `with fields` to prevent setting mutable fields on private types via
  the fields provided by `fold`.

## sexplib

- A tiny lexer improvement in `lexer.mll`.  Used
  `lexbuf.lex_{start|curr}_pos` instead of
  `lexbuf.lex_{start|curr}_p.pos_cnum` for computing the length of a
  lexeme since the difference is the same.

# 109.13.00

## async_core

- Fixed `Pipe.iter`'s handling of a closed pipe.

  Fixed the handling by `Pipe.iter` and related foldy functions that
  handle one element at a time, which behaved surprisingly with a pipe
  whose read end has been closed.  These functions had worked by
  reading a queue as a batch and then applying the user function to
  each queue element.  But if the pipe's read end is closed during the
  processing of one queue element, no subsequent element should be
  processed.  Prior to this fix, the `iter` didn't notice the pipe was
  closed for read until it went to read the next batch.
- Renamed `Pipe.read_one` as `Pipe.read_one`', and added
  `Pipe.read_one` that reads a single element.

## async_unix

- Added `Writer.write_line`, which is `Writer.write` plus a newline at
  the end.
- Added `?close_on_exec:bool` argument to `{Reader,Writer}.open_file`
  and `Async.Unix.open_file`.

  Made the default `close_on_exec:true` for `Reader` and `Writer`.
- Added a `compare` function to `Socket.Address.Inet`.

## core

- Added `Command.Spec.flags_of_args_exn`, for compatibility with
  OCaml's standard library.

  This function converts a `Core.Std.Arg.t` into a `Command.Spec.t`.
- Made various modules `Identifiable`: `Char`, `String`, and the
  various `Int` modules.

  In particular, `Int` being identifiable is useful, because one can
  now write:

  module My_numeric_identifier : Identifiable ` Int

  You might think that we could now delete `String_id`, and just

  module My_string_identifier : Identifiable ` String

  But this is not quite equivalent to using `String_id`, because
  `String_id.of_string` enforces that its argument is nonempty.

- Removed module `Space_safe_tuple`, which became unnecessary in OCaml

  OCaml 4.00.0 included Fabrice's patch to fix the space leak that
  `Space_safe_tuple` was circumventing (PR#5288, commit SVN 11085).
- Added `Exn.to_string_mach`, for single-line output.
- Added `Linux_ext.bind_to_interface`, to improve security of UDP

  val bind_to_interface : (File_descr.t -> string -> unit) Or_error.t

  This uses the linux-specifc socket option `BINDTODEVICE` to prevent
  packets being received from any interface other than one named.
- Fixed `Unix.mkdir_p` on Mac OS X.

# 109.14.00

## async

- Added function `Monitor.kill`, which kills a monitor and all its

  This prevents any jobs from ever running in the monitor again.

## async_unix

- Fixed major performance degradation (since 109.04) in `*`
- Added function `Rpc.Implementation.map_inv`.

  val map_inv : 'a t -> f:('b -> 'a) -> 'b t
- Add functions `Reader.file_lines` and `Writer.save_lines`.

  These deal with files as lists of their lines.

  val Reader.file_lines : string -> string list Deferred.t
  val Writer.save_lines : string -> string list -> unit Deferred.t
- Added a `?wakeup_scheduler:bool` optional argument to functions in
  the `Thread_safe` module.

  The default is `true`, which continues the behavior that has been in
place since 109.09.
  However, once can use `~wakeup_scheduler:false` to reduce CPU use,
in return for increased
  latency (because the scheduler won't run a cycle immediately).

## core

- Fixed major performance problem with hashing in `Int.Table`.

  Our `Int.Table.replace` was 3 times slower than polymorphic hash
  table and `find` was _8_ times slower.

  This was caused by using:

  external seeded_hash_param : int -> int -> int -> 'a -> int =
"caml_hash" "noalloc"

  in `Int.Table` but:

  external old_hash_param : int -> int -> 'a -> int =
"caml_hash_univ_param" "noalloc"

  everywhere else.

  The `seeded_hash_param` was introduced in Caml 4.

  We fixed this problem by changing `Int.hash` from:

  let hash (x : t) = Hashtbl.hash x


  let hash (x : t) = if x >= 0 then x else ~-x
- Added `Bigstring.{pread,pwrite}`, which allow reading and writing at
  a specific file offset.
- Added module `Nothing`, which is a type with no values.

  This is useful when an interface requires you to specify a type that
  you know will never be used in your implementation.
- Changed `Identifiable.Make` so that it registers a pretty printer.

  `Identifiable.Make` now uses `Pretty_printer.Register`.  This
  requires all calls to `Identifiable.Make` to supply a `val
  module_name : string`.
- Made `Core.Zone` match the `Identifiable` signature.
- Made polymorphic equality always fail on `Core.Map.t` and

  Before this change, polymorphic equality on a `Core.Map` or a
  `Core.Set` could either raise or return `false`.  It returnd `false`
  if the data structures were unequal, and raised if the data
  structures were equal.

  This is because their type definitions looked like:

  type ('k, 'v, 'comparator) t =
    { tree : ('k, 'v) Tree0.t;
      comparator : ('k, 'comparator) Comparator.t;

  and polymorphic equality visits a block's fields in order.  So, it
  will detect unequal trees and return false, but if the trees are
  equal, it will compare the comparators and raise because of the
  functional value.

  This change reversed the order of the fields so polymorphic equality
  always fails.

## custom_printf

- initial import
  Added support for `%{<Module>}` in `printf`-style format strings.

  If you put `!` before a format string, it allows the use of a spec
  like `%{<Module>}` in the format string.  For example, using
  `%{Time}` wraps `Time.to_string` around the appropriate argument.

  It also allows different formats for a given type:
  `%{<Module>.<identifier>}` wraps `<Module>.Format.<identifier>`
  around the appropriate argument.  For example, `%{Float.pretty}`
  would wrap `Float.Format.pretty` around the appropriate argument.

## fieldslib

- Made `with fields` expose first-class fields for private types while
  preserving privacy.

  There is now an additional phantom type in a first-class field that
  prevents building or modifying elements of a private type.

  One consequence of this change is that the `Field.t` type is now an
  abstract type -- it used to be exposed as a record type.  So, one
  must, e.g., change `` to ` field`.

Other Caml News

From the ocamlcore planet blog:
Thanks to Alp Mestan, we now include in the Caml Weekly News the links to the
recent posts from the ocamlcore planet blog at

An Indentation Engine for OCaml:

[ANN] Riakc 0.0.0:

[ANN] Protobuf 0.0.2:

OPAM 1.0.0 Released:

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.

Alan Schmitt