Caml Weekly News

Hello

Here is the latest Caml Weekly News, for the week of January 29 to February 05, 2013.

OCaml-Java: new preview
Core Suite 109.07.00 released
ocurl forked and 0.5.4 released
multiplexing several threads
Brand-new BER MetaOCaml for OCaml 4.00.1
Update on docs.camlcity.org
Other Caml News

OCaml-Java: new preview

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2013-01/msg00210.html

Xavier Clerc announced:

In order to support some of the claims made during the OUPS meetup,
I made a new binary preview of the OCaml-Java project, available at
the following address:
	http://ocamljava.x9c.fr/preview/

The next versions will also be made available on the very same page,
without notification to the OCaml mailing list (at least until the distribution
is binary-only).

I am still eagerly looking for testers... and bug reports.

Rudi Grinberg asked and Pierre Chambart suggested:

> It would be really sweet to be able to install ocamljava through opam.
> Something along the lines of opam switch ocamljava.

You can try
opam remote add
https://github.com/chambart/opam-compilers-repository.git
opam switch ocamljava-preview

But do not expect any package to work.

Core Suite 109.07.00 released

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2013-01/msg00213.html

Jeremie Dimino announced:

I'm pleased to announce the 109.07.00 release of the Core suite of
libraries.  Core is an industrial strength alternative to OCaml's
standard library.

Tarballs can be found here:

  https://ocaml.janestreet.com/ocaml-core/109.07.00/individual/

And the documentation:

  https://ocaml.janestreet.com/ocaml-core/109.07.00/doc/

All packages are available through opam.

** The Core suite **

The Core suite includes a variety of useful libraries, including:

- Core: the heart of the standard library.
- Several useful syntax extensions
  - type-conv: a library for building type-driven syntax extensions
  - sexplib: a library for handling s-expressions, and a syntax
    extension for auto-generating conversions between OCaml types and
    s-expressions
  - bin-prot: a syntax-extensions for generating
  - pipebang
  - variantslib
  - comparelib
  - fieldslib
- Async: a monadic concurrency library.
- Core_extended: extra components that are not as closely vetted or as
  stable as Core.  This includes, Shell, an interface for interacting
  with the UNIX shell, and Command, a command-line parsing library.

** Repositories **

The official repositories for the Core libraries are now located on
the "janestreet" organisation on github and the "janestreet" team on
bitbucket.

github:            https://github.com/janestreet
bitbucket:         https://bitbucket.org/janestreet
github home page:  http://janestreet.github.com/

We changed the way we are exporting our source tree; there will now be
only one visible commit per release and project. Hopefully this
simpiflied process will allow more frequent updates.

** Contribution **

If you want to contribute to core the preferred way is to submit a
pull-request, which won't be merged in the end but will be used for
working on the changes before they are accepted and added to our
development process for integration in a future release.

** Changes **

He is a list of changes since the previous public release (108.08.00):

- Switched to OCaml 4.0.
- Async:
  - Add a [~perm] argument to [Writer.open_file] to set the file
    permissions in the same way [Unix.openfile] does.
  - Added [Async.Unix.unsetenv].
  - Fixed a bug in [Reader] that in some situations would make the
    reader unusable after an error in user code (e.g. a failed sexp
    conversion).
  - Fixed a bug in [Writer] that manifests when scheduling bigstrings
    with non-zero pos parameter.
  - Fixed a bug in [Scheduler.go], which previously behaved incorrectly
    if an exception had been raised to the main monitor prior to
    [Scheduler.go] being called.  The exception is now dealt with
    immediately, rather than running a cycle.
  - Exposed the type and value [Async.Config.t] as sexpable.
  - Improved error message when a user requests async to manage a file
    descriptor that it is already managing.
  - Improved error message when creation of the async scheduler fails.
  - Added [val _squelch_unused_module_warning_] to [Async.Std].
  - Made [Reader.load_sexp{,s}] handle exceptions other than
    [Of_sexp_error].
  - Fixed a bug in async's handing of file descriptors -- a missed check
    for a file descriptor having been closed.
  - Added [Writer.set_buffer_age_limit].
  - Improved the performance of [Deferred.Queue] by changing the
    implementation to use lists rather than arrays, which reduces gc
    promotion.
  - Added [?should_close_file_descriptor:bool] argument to [Async.Unix.close].
  - Moved the implementation of finalizers from Async_unix to
    Async_core.  This makes it possible to unit test finalizers (and
    things using them) using only the Async_core scheduler.
  - Changed the async scheduler so that if there are no upcoming events,
    it times out in 50ms rather than waiting forever.
  - Added =Monad_sequence.iteri=, which in turn adds it to:
    =Deferred.Array=, =Deferred.List=, and =Deferred.Queue=.
  - Added =Pipe.init=, which is analogous to =Deferred.create=.
  - Improved the performance of =Pipe.iter{,_without_pushback}=.
  - Improved =Reader.read_one_chunk_at_a_time_until_eof=:
    - the callback need not consume everything
    - add =`Eof_with_unconsumed_data= as a possible result
    - grow internal buffer of the reader when needed
  - Added =Shutdown.exit=, removed =Shutdown.shutdown_and_raise=.
  - Added =Scheduler.force_current_cycle_to_end=.
- Core:
  - Added [Char.of_string].
  - Fix [Backtrace.get], which was broken in 109.00, with the
    switch to OCaml 4.0.
  - Added [Heap.iter_el].
  - Updated [Core.Unix.stat] so that access, modify, and change times
    have nanosecond precision.
  - Fixed a bug in [Nano_mutex.invariant].
  - Simplified the implementation of [with_return] using a local
    explicit polymorphic type variable.
  - Added [Map.symmetric_diff], for returning a list of differences
    between two maps.  It has a fast-path implementation for maps that
    share a large amount of their internal structure.
  - Added a number of functions to =Bounded_int_table=: =equal=,
    =exists{,i}=, =for_all{,i}=, =filter_map{,i}=, =map{,i}=.  Also
    added a functor, =Bounded_int_table.With_key=, that makes a
    bounded-int table binable and sexpable, and adds =of_alist= and
    =of_alist_exn=.
  - Added =Doubly_linked.iter_elt= and =Bag.iter_elt=.
  - Added =module Invariant=, which defines signatures that are to
    be included in other signatures to ensure a consistent interface to
    invariant-style functions.
  - Added =module Ordering=, which defines:
      =type t = Less | Equal | Greater=

ocurl forked and 0.5.4 released

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2013-01/msg00214.html

ygrek announced:

Let me announce the existence of the new ocurl project, namely at
http://ocurl.forge.ocamlcore.org/ This project is the fork of
http://ocurl.sourceforge.net/ and accumulates many bugfixes and
enhancements which didn't make it to the original version, including
the multi API wrappers and pluggable asynchronous interface (which
makes it possible to inject curl transfers in your event loop of
choice). This is the first release for the fork, that incorporates all
the bugfixes and new features from 3 years of its history, except the
asynchronous api. Next release will merge branch event.

I would like to thank Lars Nilsson, the original author of ocurl, who
brought this project to existence. Hope the forked version will prove
useful.

multiplexing several threads

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2013-01/msg00215.html

Ivan Gotovchits asked and Mauricio Fernandez replied:

> I'm stuck with a simple problem when using lwt. I have two lwt
> threads. One on them reads commands from user (using Lwt_read_line),
> another reads events from system. A need to multiplex them. Speaking in
> ocaml, I have three functions:
> 
>   type state    (* program state to process *)
>   type command  (* command to be executed (issued either by user or by
>                    system event) *)
> 
>   val read_from_user: unit -> command Lwt.t
>   val read_event: unit -> command Lwt.t
>   val process: state -> command -> state Lwt.t
> 
>  
> And a need to write a function loop
> 
>   val loop: state -> state Lwt.t
> 
> accepting some state, taking a command from user or an event from system
> (whichever comes first) and passing it to `process' function, and doing
> it in a cycle, until explicitly ordered to stop (with Exit command). 
> 
> The following solution
> 
>    let rec loop sc =
>      let cmd = read_from_user () <?> read_event () in
>      cmd >>= (process state)
> 
>    and process sc = function
>      | Exit -> return sc
>      | cmd  -> (* modify sc*) loop sc
> 
>    let r = Lwt_main.run (loop state)
> 
> doesn't work when the event thread wakes up first. In that case the next
> step of the loop will call Lwt_read_line.read_line before the previous
> invocation finishes and it breaks user input.
>
> Please, help me to find a feasible solution to my problem!

let process s = function
    Exit -> return (`Exit s)
  | cmd -> let s = ... in return (`State s)

let rec loop state user_cmd event =
  match_lwt
    Lwt.choose
      [ (lwt x = user_cmd in return (`User x));
        (lwt x = event in return (`Event x));
    ]
  with
    `User cmd -> begin match_lwt process state cmd with
                   `Exit s -> Lwt.cancel event; return s
                | `State s -> loop s (read_from_user ()) event
                 end
  | `Event cmd -> begin match_lwt process state cmd with
                   `Exit s -> Lwt.cancel event; return s
                 | `State s -> loop s user_cmd (read_event ())
                 end

Brand-new BER MetaOCaml for OCaml 4.00.1

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2013-01/msg00216.html

oleg announced:

BER MetaOCaml N100 is now available. It is a strict superset of OCaml
4.00.1, extending it with staging annotations to construct and run
typed code values. BER MetaOCaml has been completely re-implemented
and thus caught up with OCaml. For those who don't know what staging
or MetaOCaml is, a short introduction follows the news summary.

BER MetaOCaml is in the spirit of the original MetaOCaml by Taha and
Calcagno, but has been completely re-written using different
algorithms and techniques. BER MetaOCaml has been re-structured to
minimize the number of the changes to the OCaml type-checker and to
separate the `kernel' (type-checking and constructing code values)
from the `user-level'. Various ways of running the code -- compiling
to byte-code or machine instructions and executing them, or
translating code values to C or LLVM, or printing them -- can be done
in `user-level' libraries, without any need to hack into (Meta)OCaml.
(Printing and byte-code execution are included in BER N100.)  This
release of BER MetaOCaml is meant to incite future research into
type-safe meta-programming. 

To the user, the two major differences of BER N100 from the old
MetaOCaml are:

 -- constructor restriction: all data constructors and record labels 
 used within brackets must come from the types that are declared in 
 separately compiled modules.

 -- scope extrusion check: attempting to build code values with
 unbound or mistakenly bound variables (which is possible with
 mutation or other effects) is caught early, raising an exception
 with good diagnostics.

Both are explained after the short introduction to staging.  Smaller
visible differences are better printing of cross-stage persistent
values and the full support for labeled arguments. A long-standing
problem with records with polymorphic fields has fixed itself. The BER
N100 code is now extensively commented, and has a regression test
suite. BER N100 is much less invasive into OCaml: compare the size of
the patch to the main OCaml type-checker typing/typecore.ml, which
contains the bulk of the changes for type checking the staging
constructs. In the previous version BER N004, the patch had 564 lines
of additions, deletions and context; now, only 328 lines.
(The core MetaOCaml kernel is trx.ml, with 1800 lines.)

BER MetaOCaml N100 is available:

-- as a set of patches to the OCaml 4.00.1 distribution. 
        http://okmij.org/ftp/ML/ber-metaocaml-100.tar.gz
See the INSTALL document in that archive. You need the source
distribution of OCaml 4.00.1, see the following URL for details.
        http://ocaml.org/install.html

-- as a GIT bundle containing the changes relative to OCaml 4.00.1
        http://okmij.org/ftp/ML/metaocaml.bundle
First, you have to obtain the base
       git clone https://github.com/ocaml/ocaml.git -b 4.00 ometa4
then switch to tag 4.00.1 and then apply the bundle.

Jacques Carette has extensively re-written the printing of code
values, and is currently maintaining this part. I'm grateful to him
for encouragement and discussions.

Introduction to staging and MetaOCaml

The standard example of meta-programming -- the running example of
A.P.Ershov's 1977 paper that begat meta-programming -- is the power
function, computing x^n. In OCaml:

    let square x = x * x
    let rec power n x =
      if n = 0 then 1
      else if n mod 2 = 0 then square (power (n/2) x)
      else x * (power (n-1) x)
    (* val power : int -> int -> int = <fun> *)

Suppose our program has to compute x^7 many times. We may define
    let power7 x = power 7 x

In MetaOCaml, we may also specialize the power function to a
particular value n, obtaining the code which will later receive x
and compute x^n. We re-write power n x annotating expressions as
computed `now' (when n becomes known) and `later' (when x is given).

    let rec spower n x =
      if n = 0 then .<1>.
      else if n mod 2 = 0 then .<square .~(spower (n/2) x)>.
      else .<.~x * .~(spower (n-1) x)>.
    (* val spower : int -> ('cl, int) code -> ('cl, int) code = <fun> *)

The two annotations, or staging constructs, are brackets .< e >. and
escape .~e . Brackets .< e >. `quasi-quote' the expression e,
annotating it as computed later. Escape .~e, which must be used within
brackets, tells that e is computed now but produces the result for
later. That result, the code, is spliced-in the containing bracket. 
The inferred type of spower is different. The result is no longer an
int, but ('cl, int) code -- the code of expressions that compute an int.
The first type argument to 'code', often named 'cl, is a so-called
environment classifier and can be skipped on first reading. The type
of spower spells out which argument is received now, and which
later. To specialize spower to 7, we define

    let spower7_code = .<fun x -> .~(spower 7 .<x>.)>.;;
(*
    val spower7_code : ('cl, int -> int) code = .<
      fun x_1 ->
       (x_1 *
         (((* cross-stage persistent value (id: square) *))
           (x_1 * (((* cross-stage persistent value (id: square) *)) (x_1 * 1)))))>.
*)
and obtain the code of a function that will receive x and return
x^7. Code, even of functions, can be printed, which is what MetaOCaml
toplevel did. The print-out contains so-called cross-stage persistent
values, or CSP, which `quote' present-stage values such as square to
be used later. One may think of CSP as references to `external
libraries' -- only in our case the program acts as a library for the
code it generates.

If we want to use thus specialized x^7 now, in our code, we should
compile spower7_code and link it back to our program. This is called
`running the code'

    let spower7 = .! spower7_code
    (*
    val spower7 : int -> int = <fun>
    *)

The specialized spower7 has the same type as the partially applied
power7 above. They behave identically. However, power7 x will do
recursion on n, checking n's parity, etc. In contrast, specialized
spower7 has no recursion (as can be seen from spower7_code). All
operations on n have been done when the spower7_code was computed,
producing the straight-lined code spower7 that operates only on x.

MetaOCaml supports arbitrary number of later stages, letting us write
not only code generators but also generators of code generators, etc.

Data constructor restriction

BER MetaOCaml N100 imposes the restriction that all data constructors
and record labels used within brackets must come from the types that 
are declared in separately compiled modules. For example, the
following all work:

    .<true>.;;
    .<raise Not_found>.;;
    .<Some [1]>.;;
    .<{Complex.re = 1.0; im = 2.0}>.;;
    let open Complex in .<{re = 1.0; im = 2.0}>.;;
    .<let open Complex in {re = 1.0; im = 2.0}>.;;

because data types bool, option, list, Complex are either Pervasive or
defined in the (separately compiled) standard library. However, the
following are not allowed and flagged as compile-time error:

       type foo = Bar;;
       .<Bar>.

       module Foo = struct exception E end;;
      .<raise Foo.E>.

The type declaration foo or the module declaration Foo must be moved
into a separate file. The corresponding .cmi file must also be
available at run-time: either placed into the same directory as the
executable, or somewhere within the OCaml library search path.

Scope extrusion test

Although MetaOCaml permits manipulation and splicing of open code, its
type system statically ensures that only closed code can be printed or
run: We can't run the code we haven't finished constructing. For example,
     .<fun x -> .~(let y = .! .<x>. in .<y>.)>.;;
gives a type error since .<x>. is obviously open code.

This static guarantee holds only for pure code. Effects such as
storing code values in mutable cells void the guarantee. Here is the
example using the _old_ MetaOCaml from 2006 (version 3.09.1 alpha
030).  (The problem can be illustrated simpler, but the following
example is more realistic and devious.)

    let c =
     let r = ref .<fun z->z>. in 
     let f = .<fun x -> .~(r := .<fun y -> x>.; .<0>.)>. in 
     .<fun x -> .~f (.~(!r) 1)>. ;;
    (*
    val c : ('a, '_b -> int) code =
      .<fun x_4 -> ((fun x_2 -> 0) ((fun y_3 -> x_2) 1))>.
    *)

One must look hard to see that x_2 is actually unbound in the resulting
code. The problem is revealed when we attempt to run that code:

    .! c;;
    (*
    Characters 77-78:
      Unbound value x_2

    Exception: Trx.TypeCheckingError.
    *)

Since we get the error anyway (without much diagnostics though), one
may discount the problem. Alas, sometimes scope extrusion results in
no error -- just in wrong results.

    let c1 =
     let r = ref .<fun z->z>. in 
     let _ = .<fun x -> .~(r := .<fun y -> x>.; .<0>.)>. in 
     !r;;
    (*
    val c1 : ('a, '_b -> '_b) code = .<fun y_3 -> x_2>.
    *)

we then use c1 to construct the code c2:

    let c2 = .<fun y -> fun x -> .~c1>.;;
    (*
    val c2 : ('a, 'b -> 'c -> '_d -> '_d) code =
      .<fun y_1 -> fun x_2 -> fun y_3 -> x_2>.
    *)

which contains no unbound variables and can be run without problems.

    (.! c2) 1 2 3;;
    (* - : int = 2 *)

It is most likely that the user did not intend for 'fun x ->' in c2 to
bind x in c1. This is the blatant violation of lexical scope. And yet we
get no error or other overt indication that something went wrong.

BER MetaOCaml N100 has none of this. Although the type system still
permits code values with escaped variables, attempting to use such
code in any way -- splice, print or run -- immediately raises an
exception. For example, entering the expression c in the top-level BER
MetaOCaml N100 gives

Exception: Failure
Scope extrusion at Characters 89-99:
       let f = .<fun x -> .~(r := .<fun y -> x>.; .<0>.)>. in 
                                    ^^^^^^^^^^
 for the identifier x_7 bound at Characters 74-75:
       let f = .<fun x -> .~(r := .<fun y -> x>.; .<0>.)>. in 
                     ^
The exception message tells which variable got away, where it was
bound and where it eloped.

The file NOTES.txt in the BER MetaOCaml distribution describes the
features of BER MetaOCaml in more detail and outlines directions for
further development. Hopefully the release of BER MetaOCaml N100 would
stimulate using and researching typed meta-programming.

rixed and Francois Berenger asked, and oleg replied:

> Could it be (ab)used to translate easily a whole program's OCaml source
> into C code?

In fact at one point MetaOCaml did so -- translated the generated OCaml
code to C or Fortran. This feature -- called offshoring -- was used to
generate FFT kernels. The generated C code was good enough to plug as
it was into the FFTW benchmarking and testing framework. During the
overhaul offshoring has been separated and left for clean-up and
re-writing. Now there is a place for it in the overall architecture.

> I understand that this is very interesting from a research
> perspective, but practically meta-programming is only useful as long
> as performances are the concern, for these problems when some critical
> information is not known until runtime; I can't think of another usage
> for runtime code specialisation anyway.
>
> So the question that immediately arises is then: why is metaocaml
> supporting byte code only? Is a metaocamlopt planned, envisaged,
> doable, in a galaxy not too far away?

Before discussing the plans for the native run, let me clear a
misconception. In two largish MetaOCaml projects I recall offhand,
MetaOCaml was used to generate the optimal code for FFT kernels and
for Gaussian Elimination. Gaussian Elimination is a large family of
the algorithms, parameterized by domain (integer, float, polynomial),
by the choice of pivoting, by the need to compute determinant, rank,
permutation matrix, by the layout of matrices and by a half a dozen
more aspects. Because Gaussian Elimination is often used in inner
loops, the performance matters a great deal. Therefore, all this
parameterization has to be done statically. Jacques Carette and I used
MetaOCaml to generate GE procedures for particular parameter choices.
The generated code was printed out, to be compiled in a
library -- by the native OCaml compiler. Let me emphasize: the code
generator was byte-code OCaml, but the generated code could be stored
and compiled by the native OCaml. We used the byte-code run, but only
for testing. The FFT project was similar: the generator was byte-code,
the byte-code run was used for testing, the generated code was C 
and compiled with the Intel C compiler.

The point is that the byte-code generator produces the code that can
be compiled by any suitable compiler. The mode (byte-code or native)
of the generator and of the generated code are not related. In my
experience, the existing MetaOCaml is already useful in practice --
but more could and should be done.

Regarding native MetaOCaml. As far as generating code is concerned, it
can be done already. (I should try to make optmetaocamlc -- it should
just work). The generated code can be stored on file, compiled any way
we wish -- and linked back using dynlink. What is needed currently is
to make this process automatic and convenient. The refactoring of 
MetaOCaml should help. The functions to run the code are now `user
level'. That is, to change them or to add new ones, one does not have
to re-compile MetaOCaml. Adding a new run function is as simple as
adding a new library function -- just making sure the compiler
can find your *.cmi and *.cmo/*.cmx files.

I too want the end result quicker. Yet basics have to be done
first. At the very least we need to get the codebase that one can work
with and understand, and to be able to add features without breaking
anything.  I think basics is mostly done.

Update on docs.camlcity.org

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2013-02/msg00028.html

Gerd Stolpmann announced:

I've updated the code search engine on docs.camlcity.org. As before, it  
mainly bases on GODI packages, but it now also includes many OASIS  
packages.

The search engine allows to search for definitions and uses of  
identifiers in OCaml source code. For example, "List.map" shows the  
files with the definitions at the top position of the result list,  
followed by the references of List.map.

You can also search for type names, e.g. "in_channel". Type expressions  
are not yet supported (but I'm working on this).

The search engines is especially useful when you want to read and  
understand code you've not written yourself, and that uses libraries  
you don't know yet. It's also good when you want to find sample code  
that uses a certain function.

Note that there is also a browsing mode. Also, the viewer for source  
code allows it to click on identifiers triggering an ad-hoc search for  
other occurrences.

Start page: http://docs.camlcity.org
Query syntax: http://docs.camlcity.org/docs/syntax.html
FAQ: http://docs.camlcity.org/docs/faq.html

Technically, this update uses a dedicated full-text search engine for  
the first time. It's my own development Amber (not yet publicly  
released). The main advantage is the speed - searches take only a few  
milliseconds now, but it also improves scoring (based on inverted term  
frequencies). More about Amber later.

Other Caml News

From the ocamlcore planet blog:

Thanks to Alp Mestan, we now include in the Caml Weekly News the links to the
recent posts from the ocamlcore planet blog at http://planet.ocaml.org/.

An alternative to camlp4 - Part 2:
  http://lpw25.net/2013/02/05/camlp4-alternative-part-2.html

Implementing HAMT for OCaml:
  http://gallium.inria.fr/blog/implementing-hamt-for-ocaml

User functions in Arakoon:
  http://blog.incubaid.com/2013/02/01/user-functions-in-arakoon/

BER MetaOCaml 100:
  http://caml.inria.fr/cgi-bin/hump.cgi?contrib=844

Ocurl 0.5.4:
  http://caml.inria.fr/cgi-bin/hump.cgi?contrib=416

Discussions on the Syntactic Meta Programming(wg-camlp4 list):
  http://hongboz.wordpress.com/2013/01/31/discussions-on-the-syntactic-meta-programmingwg-camlp4-list/

Introduction to Mezzo, continued:
  http://gallium.inria.fr/blog/introduction-to-mezzo-2

ocurl 0.5.4 released:
  https://forge.ocamlcore.org/forum/forum.php?forum_id=869

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.

Alan Schmitt