Hello
Here is the latest Caml Weekly News, for the week of March 12 to 19, 2013.
Archive: https://sympa.inria.fr/sympa/arc/caml-list/2013-03/msg00085.html
Jon Harrop said:There has been some discussion here about the implications of single- vs multi-threaded garbage collectors and, in particular, their performance in the context of the kinds of metaprogramming that OCaml has traditionally been used for. I recently ported a compiler written in OCaml to the F# programming language for a client and performance turned out to be an issue so I'd like to present this as a case study to provide some real data. Unfortunately I cannot disclose precise details. The original compiler was 15kLOC of OCaml code. The amounts of DSL code that it consumes and C code that it produces can be considerable and compilation can take minutes. Consequently, performance is valued by my client's customers and, therefore, the original code had been optimized for OCaml's performance characteristics. A direct translation of the OCaml code to F# proved to be over 10x slower. This was so slow that it impeded testing my translation so I did some optimization early. Specifically, profiling indicated that the biggest problem was the high rate of exceptions being raised and caught. Exceptions are around 600x slower on .NET than in OCaml so this can quickly degrade performance. I changed all of the hot paths to use union types (usually option types) instead of exceptions, according to F# idioms. Although this incurs a lot of unnecessary boxing in F# the performance improvements were substantial and the F# version became 5x slower than the OCaml. On a related note, thorough testing showed that my almost-blind translation of 15kLOC of code was completely error free. I think this is a real testament to the power of ML's static type system. The only error I have introduced so far occurred when I was replacing the use of an exception in a function with a union type. After demonstrating the correctness of the translation, my effort turned to trying to improve performance in an attempt to compete with the original OCaml code. I had believed that this could well prove to be prohibitively difficult or even impossible because symbolic code is OCaml's main strength. However, I have managed to make the F# around 8x faster than it was and, in particular, substantially faster than the original OCaml. So this non-trivial symbolic code base has not had its performance suffer from the adoption of a multicore-friendly garbage collector.Julien Verlaguet asked and Jon Harrop replied:
> Thanks for sharing this case-study with us! No problem. I found it in my drafts folder. :-) > Have you tried parallelizing the OCaml version? No. I didn't touch the OCaml code at all. > I am thinking pre-forked processes communicating with pipes? That would work but it would be a lot of effort compared to Array.Parallel.map in F#. > We write a lot of large-scale static-analysis in OCaml here at Facebook. Good to hear. :-) > Parallelizing them with pre-forked processes gave us very good performances. I'm not sure how much message passing would be required in this case and don't have time to investigate. > I would be curious to see a case-study of pre-forked OCaml vs threaded F#. Me too. Only problem is that fork-based parallel OCaml code takes a long time to write in comparison. Incidentally, the F# does not make direct use of threads.Pierre-Alexandre Voye then suggested:
So you could maybe use Parmap.map ? Parmap.parmap ~ncores:4 funct (Parmap.L elem_list)Jon Harrop replied and Jean-Marc Alliot said:
We have been using Parmap and are quite happy with it (we use it as an alternative to the Ocaml/MPI implementation when MPI is not strictly required), with excellent scalability on our applications (even if we had to change from time to time a little bit or our code, regarding the fact that Parmap does not allow "in place" modifications). > What happens if the inner function returns results via mutation? I > assume you must rearrange the code to return all results explicitly > and they will then be deep copied (which destroys scalability due to > limited shared memory bandwidth on multicores). As fas as I can tell there is no "copy" of any form. Parmap only collects results. If you do "in place" modifications, they are lost. Parmap is, in a way, "functional"... > Does it do load balancing? I assume not given that ncores is > hardcoded. There is an optional argument (chunksize) to manually control load balancing. > Does a parmap with ncores=4 inside a parmap with ncores=4 create 16 > processes? Does it deep copy inputs and/or outputs? Regarding outputs, Parmap uses a shared memory area and Marshal/Unmarshal to collect outputs. There is an optimization done when array of floats are returned, as marshalling is not used, thus reducing the overhead. > I assume so, at least for outputs, because you cannot write results > in-place without a shared mutable heap. Does parmap have a large > constant overhead? I assume so if it is forking processes. Well, it depends on what you call "large constant overhead". Forking is not a so expensive primitive on modern Unix systems, because pages are only copied when written-to (copy-on-write). > Another solution is to prefork and explicitly communicate all inputs > using message passing but this is equally problematic. You have to > rearrange the code. Deep copying inputs also destroys scalability. > Cheers, Jon. There is an article describing the implementation (I think it is also available online): A “minimal disruption” skeleton experiment: seamless map & reduce embedding in OCaml M. Danelutto, R. Di Cosmo Procedia Computer Science 9 ( 2012 ) 1837 – 1846Francois Berenger also replied:
I have observed and measured perfect scalability with up to 4 cores of an OCaml program using Parmap. With more than 4 cores, the scalability was degrading. I think the scalability of the program depends only on the granularity of the tasks. The tasks were coarse in my case.Alain Frisch replied to the original post and Jon Harrop then said:
> Too bad, because the really interesting part would to know (i) what kind of > optimizations you had to do on the F# version (and in particular whether > they make us of parallelism), * Replaced the use of exceptions for control flow with variant types. * Replaced use of fprintf with lower-level .NET functions. * Parallelized some of the symbolic code. * Algorithmic optimization to a search function. > (ii) whether (some of) those optimizations could be applicable to the OCaml > version as well, and * No need to replace exceptions in OCaml. * No need to replace printf and friends in OCaml. * Could try to parallelize the OCaml but it would be hard to do efficiently. * Algorithmic optimization could also be done to the OCaml (IIRC, this was a fairly minor speedup). > (iii) how much effort you initially put into optimizing the OCaml version (just > tweaking the GC parameters can easily give very substantial speedups for > symbolic processing). None. I didn't change the OCaml code at all and didn't try different GC parameters. However, it had already been quite heavily optimized.Alain Frisch then said:
> * No need to replace exceptions in OCaml. It depends. If you compile in debug mode and run with backtrace enabled, exceptions are not so fast any more. > * No need to replace printf and friends in OCaml. Printf can be quite slow. Format direct-style as well. And Format.printf is even worse. ( http://www.lexifi.com/blog/note-about-performance-printf-and-format )oliver asked and Jon Harrop replied:
> What is missing, is the information, how many cores / processors the machine > has, on which the F# code runs, and if the OCaml code runs on the same > machine. Benchmark measurements used for comparison between OCaml and F# were always done on the same machine running Windows. I used two machines: * 2-core 1.67GHz Intel Atom N570 based netbook. * 8-core 2.0GHz Intel Xeon E5405 based workstation. The client verified the relative performance on their own machines. > What about Ocaml Byteocde vs. Nativecode? The performance comparison was done using native code. We did not benchmark OCaml bytecode. > And what, if the re-designt program would be back-ported to OCaml? Only minor changes were made, no redesign. Some of those changes could be back-ported to the OCaml but there is no motivation to do so.
Archive: https://sympa.inria.fr/sympa/arc/caml-list/2013-03/msg00099.html
Thomas Gazagnaire announced:I am *very* happy to announce the first official release of OPAM! Many of you already know and use OPAM so I won't be long. Please read http://www.ocamlpro.com/blog/2013/01/17/opam-beta.html for a longer description. 1.0.0 fixes many bugs and add few new features to the previously announced beta-release. The most visible new feature, which should be useful for beginners with OCaml and OPAM, is an auto-configuration tool. This tool easily enables all the features of OPAM (auto-completion, fix the loading of scripts for the toplevel, opam-switch-eval alias, etc). This tool runs interactively on each `opam init` invocation. If you don't like OPAM to change your configuration files, use `opam init --no-setup`. If you trust the tool blindly, use `opam init --auto-setup`. You can later review the setup by doing `opam config setup --list` and call the tool again using `opam config setup` (and you can of course manually edit your ~/.profile (or ~/.zshrc for zsh users), ~/.ocamlinit and ~/.opam/opam-init/*). Please report: - Bug reports and feature requests for the OPAM tool: http://github.com/OCamlPro/opam/issues - Packaging issues or requests for a new package: http://github.com/OCamlPro/opam-repository/issues - General queries to: http://lists.ocaml.org/listinfo/platform - More specific queries about the internals of OPAM to: http://lists.ocaml.org/listinfo/opam-devel On behalf on the OPAM team, Thomas === Install === Packages for Debian and OSX (at least homebrew) should follow shortly and I'm looking for volunteers to create and maintain rpm packages. The binary installer is up-to-date for Linux and Darwin 64-bit architectures, the 32-bit version for Linux should arrive shortly. If you want to build from sources, the full archive (including dependencies) is available here: http://www.ocamlpro.com/pub/opam-full-latest.tar.gz === Upgrade === If you are upgrading from 0.9.* you won't have anything special to do apart installing the new binary. You can then update your package metadata by running `opam update`. If you want to use the auto-setup feature, remove the "eval `opam config env` line you have previously added in your ~/.profile and run `opam config setup --all`. So everything should be fine. But you never know ... so if something goes horribly wrong in the upgrade process (of if your are upgrading from an old version of OPAM) you can still trash your ~/.opam, manually remove what OPAM added in your ~/.profile (~/.zshrc for zsh users) and ~/.ocamlinit, and start again from scratch. === Random stats === Great success on github. Thanks everybody for the great contributions! https://github.com/OCamlPro/opam: +2000 commits, 26 contributors https://github.com/OCamlPro/opam-repository: +1700 commits, 75 contributors, 370+ packages on http://opam.ocamlpro.com/ +400 unique visitor per week, 15k 'opam update' per week +1300 unique visitor per month, 55k 'opam update' per month 3815 unique visitor since the alpha release === Changelog === The full change-log since the beta release in January: 1.0.0 [Mar 2013] * Improve the lexer performance (thx to @oandrieu) * Fix various typos (thx to @chaudhuri) * Fix build issue (thx to @avsm) 0.9.6 [Mar 2013] * Fix installation of pinned packages on BSD (thx to @smondet) * Fix configuration for zsh users (thx to @AltGr) * Fix loading of `~/.profile` when using dash (eg. in Debian/Ubuntu) * Fix installation of packages with symbolic links (regression introduced in 0.9.5) 0.9.5 [Mar 2013] * If necessary, apply patches and substitute files before removing a package * Fix `opam remove <pkg> --keep-build-dir` keeps the folder if a source archive is extracted * Add build and install rules using ocamlbuild to help distro packagers * Support arbitrary level of nested subdirectories in packages repositories * Add `opam config exec "CMD ARG1 ... ARGn" --switch=SWITCH` to execute a command in a subshell * Improve the behaviour of `opam update` wrt. pinned packages * Change the default external solver criteria (only useful if you have aspcud installed on your machine) * Add support for global and user configuration for OPAM (`opam config setup`) * Stop yelling when OPAM is not up-to-date * Update or generate `~/.ocamlinit` when running `opam init` * Fix tests on *BSD (thx Arnaud Degroote) * Fix compilation for the source archive 0.9.4 [Feb 2013] * Disable auto-removal of unused dependencies. This can now be enabled on-demand using `-a` * Fix compilation and basic usage on Cygwin * Fix BSD support (use `type` instead of `which` to detect existing commands) * Add a way to tag external dependencies in OPAM files * Better error messages when trying to upgrade pinned packages * Display `depends` and `depopts` fields in `opam info` * `opam info pkg.version` shows the metadata for this given package version * Add missing `doc` fields in `.install` files * `opam list` now only shows installable packages 0.9.3 [Feb 2013] * Add system compiler constraints in OPAM files * Better error messages in case of conflicts * Cleaner API to install/uninstall packages * On upgrade, OPAM now perform all the remove action first * Use a cache for main storing OPAM metadata: this greatly speed-up OPAM invocations * after an upgrade, propose to reinstall a pinned package only if there were some changes * improvements to the solver heuristics * better error messages on cyclic dependencies 0.9.2 [Jan 2013] * Install all the API files * Fix `opam repo remove repo-name` * speed-up `opam config env` * support for `opam-foo` scripts (which can be called using `opam foo`) * 'opam update pinned-package' works * Fix 'opam-mk-repo -a' * Fix 'opam-mk-repo -i' * clean-up pinned cache dir when a pinned package fails to install 0.9.1 [Jan 2013] * Use ocaml-re 1.2.0Thomas Gazagnaire later added:
As I've received few questions related to upgrading OPAM itself, I've added Anil's response to [1] to the FAQ: https://github.com/OCamlPro/opam/wiki/FAQ [1] https://github.com/OCamlPro/opam/issues/528
Archive: https://sympa.inria.fr/sympa/arc/caml-list/2013-03/msg00129.html
Nicolas Barnier asked, triggering many answers:We use OCaml at ENAC (French Civil Aviation University) to teach the basics of programming and the design of algorithms in the first year course of CS major since 1995. The cursus is now under deep revisionand we're trying to advocate its convenience in the new cursus to our hierarchy and colleagues. We were thus wondering which engineering schools and universities are actually currently using OCaml, and for which cursus. Short of finding a long enough list on the OCaml websites or by googling, we've decided to try the caml-list for feedback. So if you are involved in a CS course using OCaml, we would greatly appreciate that you let us know, so as to help us arguing to keep this great language in our cursus.Roberto Di Cosmo asked, Ashish Agarwal suggested, and Philippe Wang said:
> Roberto Di Cosmo wrote: > >> We would really need a central place to keep this info well >> structured. Even a simple wiki page would do. Ashish Agarwal wrote: > Why not the ocaml.org website. I've created an issue to keep track of the > suggestions so far: > https://github.com/ocaml/ocaml.org/issues/119 Yes, there should be such a page somewhere, ocaml.org seems to be the right place for that. I started getting more information about the courses: http://philippewang.info/CL/?id=where_ocaml_is_used_for_teaching_purposes (it's a draft)
Archive: https://sympa.inria.fr/sympa/arc/caml-list/2013-03/msg00148.html
Malcolm Matalka announced:I'd like to announce the first release of Riakc, a Riak client in Ocaml written in Jane St's Core/Async. It is in early development, so bugs are abound, but hopefully it is a good start for anyone interested in using Riak from Ocaml. Thomas has kindly merged riakc into opam today, so it can be installed with: opam install riakc The source can be found: https://github.com/orbitz/ocaml-riakc/tree/0.0.0
Archive: https://sympa.inria.fr/sympa/arc/caml-list/2013-03/msg00173.html
Jeremie Dimino announced:I'm pleased to announce the 109.14.00 release of the Core suite. The new package of the week is custom_printf: a syntax extension for format strings. Formats prefixed with [!] support the new conversion specifier [%{<Module>}] and a few variants. Arguments are wrapped with the appropriate conversion function. For example: printf !"%{Float}" 42.0; printf !"%{Float.pretty}" 42.0; printf !"%{sexp:int * string}" (42, "foo"); is the same as: printf "%s" (Float.to_string 42.0); printf "%s" (Float.Format.pretty 42.0); printf "%s" (Sexp.to_string_hum (<:sexp_of< int * string >> (42, "foo"))); Files and documentation for this release are available on our website and all packages are in opam: https://ocaml.janestreet.com/ocaml-core/109.14.00/individual/ https://ocaml.janestreet.com/ocaml-core/109.14.00/doc/ Here are the changelogs for versions 109.12.00 to 109.14.00: # 109.12.00 ## async_extra - Made explicit the equivalence between type `Async.Command.t` and type `Core.Command.t`. ## async_unix - Fixed a bug in `Fd.syscall_in_thread`. The bug could cause: ```ocaml Fd.syscall_in_thread bug -- should be impossible ``` The bug was that `syscall_in_thread` raised rather than returning `Error`. - Changed `Tcp.connect` and `Tcp.with_connect` to also supply the connected socket. Supplying the connected socket makes it easy to call `Socket` functions, e.g. to find out information about the connection with `Socket.get{peer,sock}name`. This also gives information about the IP address *after* DNS, which wouldn't otherwise be available. One could reconstruct the socket by extracting the fd from the writer, and then calling `Socket.of_fd` with the correct `Socket.Type`. But that is both error prone and not discoverable. - Added `Writer.schedule_bigsubstring`, which parallels `Writer.schedule_bigstring`. ## core - Add some functions to `Byte_units`. - Added functions: `to_string_hum`, `scale`, `Infix.//`. - Eliminated the notion of "preferred measure", so a `Byte_units.t` is just a `float`. - Improved the performance of `Array.of_list_rev`. The new implementation puts the list elements directly in the right place in the resulting array, rather that putting them in order and then reversing the array in place. Benchmarking shows that the new implementation runs in 2/3 the time of the old one. - Fixed `Fqueue.t_of_sexp`, which didn't work with `sexp_of_t`. There was a custom `sexp_of_t` to abstract away the internal record structure and make the sexp look like a list, but there wasn't a custom `t_of_sexp` defined, so it didn't work. - Added `Stable.V1` types for `Host_and_port`. - Removed `Identifiable.Of_sexpable` and `Identifiable.Of_stringable`, in favor of `Identifiable.Make` `Identifiable.Of_sexpable` encouraged a terrible implementation of `Identifiable.S`. In particular, `hash`, `compare`, and bin_io were all built by converting the type to a sexp, and then to a string. `Identifiable.Of_stringable` wasn't as obviously bad as `Of_sexpable`. But it still used the string as an intermediate, which is often the wrong choice -- especially for `compare` and `bin_io`, which can be generated by preprocessors. Added `Identifiable.Make` as the replacement. It avoids using sexp conversion for any of the other operations. - Added `List.intersperse` and `List.split_while`. These came from `Core_extended.List`. ```ocaml val intersperse : 'a list -> sep:'a -> 'a list val split_while : 'a list -> f:('a -> bool) -> 'a list ** 'a list ``` - Added a functor, `Pretty_printer.Register`, for registering pretty printers. The codifies the idiom that was duplicated in lots of places: ```ocaml let pp formatter t = Format.pp_print_string formatter (to_string t) let () = Pretty_printer.register "Some_module.pp") ``` ## fieldslib - Added back `Fields.fold` to `with fields` for `private` types. We had removed `Fields.fold` for `private` types, but this caused some pain. So we're putting it back. At some point, we'll patch `with fields` to prevent setting mutable fields on private types via the fields provided by `fold`. ## sexplib - A tiny lexer improvement in `lexer.mll`. Used `lexbuf.lex_{start|curr}_pos` instead of `lexbuf.lex_{start|curr}_p.pos_cnum` for computing the length of a lexeme since the difference is the same. # 109.13.00 ## async_core - Fixed `Pipe.iter`'s handling of a closed pipe. Fixed the handling by `Pipe.iter` and related foldy functions that handle one element at a time, which behaved surprisingly with a pipe whose read end has been closed. These functions had worked by reading a queue as a batch and then applying the user function to each queue element. But if the pipe's read end is closed during the processing of one queue element, no subsequent element should be processed. Prior to this fix, the `iter` didn't notice the pipe was closed for read until it went to read the next batch. - Renamed `Pipe.read_one` as `Pipe.read_one`', and added `Pipe.read_one` that reads a single element. ## async_unix - Added `Writer.write_line`, which is `Writer.write` plus a newline at the end. - Added `?close_on_exec:bool` argument to `{Reader,Writer}.open_file` and `Async.Unix.open_file`. Made the default `close_on_exec:true` for `Reader` and `Writer`. - Added a `compare` function to `Socket.Address.Inet`. ## core - Added `Command.Spec.flags_of_args_exn`, for compatibility with OCaml's standard library. This function converts a `Core.Std.Arg.t` into a `Command.Spec.t`. - Made various modules `Identifiable`: `Char`, `String`, and the various `Int` modules. In particular, `Int` being identifiable is useful, because one can now write: ```ocaml module My_numeric_identifier : Identifiable ` Int ``` You might think that we could now delete `String_id`, and just write: ```ocaml module My_string_identifier : Identifiable ` String ``` But this is not quite equivalent to using `String_id`, because `String_id.of_string` enforces that its argument is nonempty. - Removed module `Space_safe_tuple`, which became unnecessary in OCaml 4.00.0. OCaml 4.00.0 included Fabrice's patch to fix the space leak that `Space_safe_tuple` was circumventing (PR#5288, commit SVN 11085). - Added `Exn.to_string_mach`, for single-line output. - Added `Linux_ext.bind_to_interface`, to improve security of UDP applications. ```ocaml val bind_to_interface : (File_descr.t -> string -> unit) Or_error.t ``` This uses the linux-specifc socket option `BINDTODEVICE` to prevent packets being received from any interface other than one named. - Fixed `Unix.mkdir_p` on Mac OS X. # 109.14.00 ## async - Added function `Monitor.kill`, which kills a monitor and all its descendants. This prevents any jobs from ever running in the monitor again. ## async_unix - Fixed major performance degradation (since 109.04) in `Reader.read*` functions. - Added function `Rpc.Implementation.map_inv`. ```ocaml val map_inv : 'a t -> f:('b -> 'a) -> 'b t ``` - Add functions `Reader.file_lines` and `Writer.save_lines`. These deal with files as lists of their lines. ```ocaml val Reader.file_lines : string -> string list Deferred.t val Writer.save_lines : string -> string list -> unit Deferred.t ``` - Added a `?wakeup_scheduler:bool` optional argument to functions in the `Thread_safe` module. The default is `true`, which continues the behavior that has been in place since 109.09. However, once can use `~wakeup_scheduler:false` to reduce CPU use, in return for increased latency (because the scheduler won't run a cycle immediately). ## core - Fixed major performance problem with hashing in `Int.Table`. Our `Int.Table.replace` was 3 times slower than polymorphic hash table and `find` was _8_ times slower. This was caused by using: ```ocaml external seeded_hash_param : int -> int -> int -> 'a -> int = "caml_hash" "noalloc" ``` in `Int.Table` but: ```ocaml external old_hash_param : int -> int -> 'a -> int = "caml_hash_univ_param" "noalloc" ``` everywhere else. The `seeded_hash_param` was introduced in Caml 4. We fixed this problem by changing `Int.hash` from: ```ocaml let hash (x : t) = Hashtbl.hash x ``` to: ```ocaml let hash (x : t) = if x >= 0 then x else ~-x ``` - Added `Bigstring.{pread,pwrite}`, which allow reading and writing at a specific file offset. - Added module `Nothing`, which is a type with no values. This is useful when an interface requires you to specify a type that you know will never be used in your implementation. - Changed `Identifiable.Make` so that it registers a pretty printer. `Identifiable.Make` now uses `Pretty_printer.Register`. This requires all calls to `Identifiable.Make` to supply a `val module_name : string`. - Made `Core.Zone` match the `Identifiable` signature. - Made polymorphic equality always fail on `Core.Map.t` and `Core.Set.t`. Before this change, polymorphic equality on a `Core.Map` or a `Core.Set` could either raise or return `false`. It returnd `false` if the data structures were unequal, and raised if the data structures were equal. This is because their type definitions looked like: ```ocaml type ('k, 'v, 'comparator) t = { tree : ('k, 'v) Tree0.t; comparator : ('k, 'comparator) Comparator.t; } ``` and polymorphic equality visits a block's fields in order. So, it will detect unequal trees and return false, but if the trees are equal, it will compare the comparators and raise because of the functional value. This change reversed the order of the fields so polymorphic equality always fails. ## custom_printf - initial import Added support for `%{<Module>}` in `printf`-style format strings. If you put `!` before a format string, it allows the use of a spec like `%{<Module>}` in the format string. For example, using `%{Time}` wraps `Time.to_string` around the appropriate argument. It also allows different formats for a given type: `%{<Module>.<identifier>}` wraps `<Module>.Format.<identifier>` around the appropriate argument. For example, `%{Float.pretty}` would wrap `Float.Format.pretty` around the appropriate argument. ## fieldslib - Made `with fields` expose first-class fields for private types while preserving privacy. There is now an additional phantom type in a first-class field that prevents building or modifying elements of a private type. One consequence of this change is that the `Field.t` type is now an abstract type -- it used to be exposed as a record type. So, one must, e.g., change `field.Field.name` to `Field.name field`.
Thanks to Alp Mestan, we now include in the Caml Weekly News the links to the recent posts from the ocamlcore planet blog at http://planet.ocaml.org/. An Indentation Engine for OCaml: http://www.ocamlpro.com/blog/2013/03/18/monthly-03.html [ANN] Riakc 0.0.0: http://functional-orbitz.blogspot.com/2013/03/ann-riakc-000.html [ANN] Protobuf 0.0.2: http://functional-orbitz.blogspot.com/2013/03/ann-protobuf-002.html OPAM 1.0.0 Released: http://www.ocamlpro.com/blog/2013/03/14/opam-1.0.0.html
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.