Previous week Up Next week


Here is the latest Caml Weekly News, for the week of May 22 to June 05, 2007.

Sorry for last week, I was away for a week with no internet access.

  1. OCaml release 3.10.0
  2. Teaching bottomline, part 1
  3. Teaching bottomline, part 2: what went right.
  4. Teaching bottomline, part 3: what should improve.
  5. Camlp4 Wiki
  6. Enhanced Ocaml Documentation 3.10
  7. EasyLanguage to C# translator
  8. Lisp/Scheme parser
  9. Smoke Unleashed!
  10. OCaml packages in Fedora
  11. My 3-month experience developing a compiler in both OCaml and Haskell
  12. Final CFP: The 2007 ACM SIGPLAN Workshop on ML
  13. Faking concurrency using Unix forks and pipes
  14. concurrency using forks and pipes (some working code)
  15. Faking concurrency using Unix forks and pipes (ray tracing results)
  16. Announcing OMake; pre-announcing OMake 0.9.9 with automated C library wrapping for OCaml
  17. ODT: OCaml Development Tools 1.1 released
  18. JoCaml Released.
  19. plc, a One-Day Prolog Compiler

OCaml release 3.10.0


Aleksey Nogin announced:
I have built the source and binary (i386) RPMs of OCaml 3.10.0 for Red 
Hat Enterprise Linux 4 and 5 and for Fedora 2-6. See for the listing of the RPMs. 

All packages are signed with my GPG key; the public key is available at 

You can also download the latest RPMs of OCaml and OMake via yum/up2date 
- just point your client to the appropriate URL from the following list: 

A sample /etc/yum.repos.d/Nogin.repo for RHEL5 (i386) is attached below.
(see the archive link for the file)

Teaching bottomline, part 1


David Teller said:
 About six months ago, I sent a mail to this list asking for ideas and 
suggestions regarding teaching OCaml. Well, the term is over, except for 
a few students who're going to get a second chance at their exams, so I 
guess now is the time for a bottom line. 

 All in all, I consider this a success. In the following mails, I'll 
detail the successes and problems me and my students have met during the 

= Program of the term = 

* Lectures  (available in French from my webpage). 
* Short exercices during lab courses (also available on my page). 
* Short homework                     (also available on my page). 
* Term-long project on a subject selected by the students. 
-- Pacman          (complete success) 
-- RSA Cryptography(success, despite ugly code) 
-- Connect 4       (mostly success) 
-- Sliding puzzle  (complete failure) 
-- Sudoku          (complete failure) 
-- Another Sudoku  (mostly success) 
-- File explorer   (half-success). 
* Day-long project on Othello/Reversi. I wrote the complete software, 
gave them only the .mli and a pseudo-Makefile and 10 hours to 
reimplement the modules of their choice. 
-- Rules             (complete success) 
-- Graphics-based UI (complete failure) 
-- LablTk-based UI   (mostly failure) 
-- AI                (mostly failure) 
-- Board management  (complete success) 
-- Utilities         (complete success) 
-- Main program      (complete success).

Teaching bottomline, part 2: what went right.


David Teller said:
A number of things went well, sometimes impressively. 

* A number of students seem to get the hang of functional programming 
(programming without side effects, returning closures, functions as 
first-class citizens, recursive loops...) 

* Modules seem generally rather well understood. 

* The students enjoyed Graphics immensely. 

* When asking students to write a specific function, it's much easier to 
show examples with OCaml than with, say, Java. Consequently, exercices 
are generally better understood. 

* Some of the students have started answering some mathematical 
questions with OCaml programs. 

* One of my students did manage to write a function with type 'a -> 'b 
without using Obj or Marshal. Others managed to explain me (almost) 
correctly why this shouldn't be possible. 

* The students seem to have understood exceptions, as well as file 
management. Two things they just couldn't do at all in Java. 

* Most students seem to have no problems using references when they need 

* I believe that students actually understand better Java now that they 
have seen something a bit more abstract. Plus they had much more fun. 

* #trace is good. Very good.

Teaching bottomline, part 3: what should improve.


David Teller said, spawning a long discussion:
Third and (probably) last part of my Teaching bottomline. I hope some of 
you find this useful. 
Let's start by problems I didn't cause. 

= Not my fault = 

== Environment == 

* OCamlWinPlus, in its current incarnation, is just awful. To improve 
it, one would need to fix the bugx, make it possible to remember the 
list of modules loaded, link it to the documentation, etc. 

* Camelia IDE is simple and good but difficult to install, as it 
requires Cygwin. It lacks project management, though -- and, now, 
ocamlbuild support. I believe that it could draw some inspiration from 

* Students just can't install LablGtk, LablGl, Camlimage... by 
themselves, nor would I expect them to. A nice, centralised, installer, 
would be nice. 

* Building projects is just too hard. I hope ocamlbuild will solve this. 

* It seems that Graphics doesn't work with Cygwin, hence with Camelia 
for Windows. Which is bad, as they seem to enjoy both Graphics and 
Camelia. What prevents Graphics in Cygwin+X ? 

== Error messages == 

* Error messages of the type system are somewhat obscure. The reflex of 
many students is "OCaml wants it to be of type XXX", rather than "there 
is a contradiction in what I wrote". It would be nice if there was a way 
to ask OCaml to display additional information on type errors. Say 
something like, whenever typing of an expression fails, restarting the 
type algorithm but printing out the various unifications as they take 

== Documentation == 

* Documentation of LablTk is non-existent. I'm thinking about taking a 
student to write a more OCaml-ish layer on top of LablTk but I don't 
know if/when this will happen. 

* Type 'option' doesn't appear in the list of types of the 
documentation. Nor do 'Some' and 'None' appear in the list of values. 

* A nice *beginner-oriented* tutorial is really missing for students who 
failed to pay attention to the beginning of the lecture. Something more 
applied than _Developing applications with OCaml_ and less technical 
than . Say, leading a beginner to define a 
Connect 4 game. I'm willing to participate into writing this, but not 
alone. I might launch a thread on this subject on the ML. 

= My fault = 

* That's not OCaml-specific but there must be some construction better 
suited than "for" or "while" to write loops without having to handcode a 
recursive loops. Right now, I can't think of anything better than a 
"hidden" Y combinator, but there must be something. 

* Arrays of arrays (of arrays...) are a bit obscure for students, 
although they're getting better at it. 

* Some students rely too much on references. 

* The usual note-taking/attention deficit/motivation deficit problems. 

* Anonymous functions are still beyond most of them.

Camlp4 Wiki


Nicolas Pouillard announced:
We've started a documentation update for the new Camlp4. It is 
available as a Wiki: 

Contributions are welcome (just send me an email for an account). 

Contributions are twofold: 
1/ Enriching or adding pages. 
2/ Arranging documentation priorities on:

Enhanced Ocaml Documentation 3.10


Hendrik Tews announced:
I would like to announce the 

                 The Enhanced Ocaml Documentation 
                          Version 3.10 
   available via 

This version of the Ocaml manual enhances the original html 
version in the following way: 

- Changes (wrt version 3.09) are tagged with icons and color 

- an additional appendix contains just the grammar rules 

[The hot meta symbols are now already in the original docs.] 

Changes not mentioned in the 3.10 announcement: 

- profiler recognizes OCAMLPROF_DUMP environment variable 
- new profiler options: -impl sourcefile, -intf sourcefile, -version 
- positional specifiers for printf [probably broken] 
- Printf.{kfprintf,ksprintf,kbprintf} 
- Scanf.bscanf_format 
- type Event.event is now covariant 
- instance variables of objects are not any longer directly 
  available from C 
- CAMLreturnT macro for C functions not returning a value

EasyLanguage to C# translator


Joel Reymont announced:
I finally released my EasyLanguage [1] to C# translator. You can see   
it live at I actually wrote two versions over the   
course of three months, one in OCaml and one in Haskell. After   
briefly deploying the latter, I ended up going with the former. 

The site is running on top of Cocan Wiki (kudos to Rich!) and I plan   
to populate it as time goes by. I plan to charge traders per   
translation, so I will need to add payment system integration, some   
reporting, etc. Overall, I'm very satisfied with OCaml and my first   
project using it. 

        Thanks, Joel 


Lisp/Scheme parser


Jacques Carette asked:
I tried to find a full Lisp (or Scheme) parser in Ocaml, and did not 
immediately find one.  Is my Google-fu not up to the task? 
Yes, I am aware that parsing s-expressions is trivial, but the full Lisp 
(or Scheme) grammar is somewhat more complicated, and I do not feel like 
reinventing the wheel. 

Any technology that integrates well with Ocaml is fine, since in the end 
in-memory ASTs is going to be what we're really interested in. 


PS: Other than CIL for C, are there other programming languages for 
which I can just get an off-the-shelf (non-commercial) parser?
Pierre-Evariste Dagand answered:
There is an implementation of a minimal lisp called Minilisp here :

It may fulfill your needs.
John Skaller also answered:
OCS Scheme.
Joel Reymont also answered:
This is the first result of the Google search for "scheme ocaml":
Eric Cooper answered the second question:
I wrote an OCaml parser for Java a while ago:
It handled the full language at the time I wrote it, but it probably
needs to be updated now (for generics at least).

Smoke Unleashed!


Jon Harrop announced:
We just released our vector graphics library Smoke, which was the core 
renderer in Presenta: 

This library lets you create stunning interactive vector graphics quickly and 
easily from OCaml programs. 

The site contains three downloadable demos for Linux (both 32- and 64-bit) and 
the library is freely available for non-commercial use in the form of 
compiled bytecode for OCaml 3.09.2. 

Happy hacking!

OCaml packages in Fedora


Richard Jones announced:
[This email is an update on recent efforts to get OCaml packages 
included in the base Fedora distribution]. 
(1) Draft packaging guidelines for Fedora OCaml packages are published 

It would be a great help if people could look at these and point out 
any problems.  Reports should go (in order of preference) to: that 
Bugzilla entry (but you will need to create an account), follow up 
here, or email me directly. 

(2) I've packaged up some things already: 

This list was chosen fairly randomly, in order that I could get some 
of my own code to run for testing purposes.  In the long run, I hope 
we can package most things, but if you think there are important 
packages that should be prioritised, please let me know. 

(3) Would your school[*] / company / organisation be more inclined to 
buy Red Hat Enterprise Linux licenses if it came with high quality 
OCaml packages?  Having packages is nice, but having a business case 
behind them is better.  Please mail me if so. 

(4) Lastly, if you would like to join or follow the effort, you can 
join the SIG here: 


[*] Educational discounts are available.

My 3-month experience developing a compiler in both OCaml and Haskell


Joel Reymont said: 

Final CFP: The 2007 ACM SIGPLAN Workshop on ML


Derek Dreyer announced:
Below is the Final Call for Papers for the 2007 Workshop on ML. 
It's only a few weeks until the submission deadline (June 15), so 
whether you are a designer, developer or user of ML, please consider 
contributing a paper! 

Note that this year, we are introducing a new paper category of 
"work-in-progress reports".  These are intended as a way of informing 
others in the ML community about the status of ML-related research or 
implementation projects, as well as communicating insights gained from 
such projects that do not quite constitute a full research paper. See 
below for more details. 

If you are interested in submitting a paper, but you are not sure 
about what category it belongs in or if it would be appropriate for 
the workshop, please contact me or Claudio Russo, the Workshop Organizer. 

Derek Dreyer (Program Chair) 

                  The 2007 ACM SIGPLAN Workshop on ML 

                        Friday, October 5, 2007 
                           Freiburg, Germany 

                To be held in conjunction with ICFP '07 


                   Submission web site (now open): 

                        FINAL CALL FOR PAPERS 


The ML family of programming languages, whose most popular dialects 
are Standard ML and Objective Caml, has inspired a tremendous amount 
of computer science research, both practical and theoretical. ML 
continues to be employed successfully in applications ranging from 
compilers and theorem provers to low-level systems software, web 
applications and video games. The Workshop on ML aims to bring 
together researchers, developers and users of ML to hear about and 
discuss the latest work on the design, semantics, implementation and 
application of ML and ML-like languages. Previous ML workshops have 
been held in Orlando, Florida (1994), Baltimore, Maryland (1998), 
Tallinn, Estonia (2005), and Portland, Oregon (2006). 

The 2007 Workshop on ML will be held in conjunction with the 12th ACM 
SIGPLAN International Conference on Functional Programming (ICFP 2007) 
in Freiburg, Germany on Friday, October 5, 2007. 


This year, we are seeking paper submissions of two varieties: 
*research papers* and *work-in-progress reports*. 

*Research papers* must present original research that has not been 
published elsewhere. We welcome research papers on any ML-related 
topic, including (but not limited to): 

 * applications 
 * concurrent programming 
 * formal semantics 
 * language design 
 * language formalization and mechanization 
 * language implementation 
 * programming environments 
 * type systems 

*Work-in-progress reports* need not present original research. Rather, 
they are intended as a way of informing others in the ML community 
about the status of ML-related research or implementation projects, as 
well as communicating insights gained from such projects that do not 
quite constitute a full research paper. As such, we expect that 
work-in- progress reports will be shorter than research papers, and we 
will not judge them to the same standard. If you have any questions 
regarding the appropriate paper category for a potential submission or 
its overall suitability for the workshop, please contact the program 

All paper submissions must be at most 12 pages total length in the 
standard ACM SIGPLAN two-column conference format: 
Authors of work-in-progress report submissions should designate their 
papers as such by including the words "work in progress" or "status 
report" in the title. Submissions authored by program committee 
members are permitted, with the usual stipulation that they will be 
judged to a higher standard. 

Accepted papers will be published by the ACM and will appear in the 
ACM Digital Library. 

Submissions are now being accepted electronically via START: 


Submission deadline:        Friday, June 15, 2007 
Notification of acceptance: Friday, July 13, 2007 
Final revision due:         Friday, August 3, 2007 
Workshop:                   Friday, October 5, 2007 


 * Claudio Russo (Microsoft Research, Cambridge) 


 * Derek Dreyer (Toyota Technological Institute at Chicago) 


 * Lars Birkedal (IT University of Copenhagen) 
 * Derek Dreyer (Toyota Technological Institute at Chicago) 
 * Jacques Garrigue (Nagoya University) 
 * Luc Maranget (INRIA Rocquencourt) 
 * Greg Morrisett (Harvard University) 
 * Atsushi Ohori (Tohoku University) 
 * Peter Sestoft (IT University of Copenhagen) 
 * Peter Sewell (University of Cambridge) 
 * Mark Shinwell (CodeSourcery UK Ltd) 
 * Don Syme (Microsoft Research, Cambridge)

Faking concurrency using Unix forks and pipes


Jon Harrop asked, spawning a huge thread. Here a a few posts.:
Has anyone implemented a parallel map function in OCaml using Unix forks, 
pipes and maybe marshalling? 

This seems like an easy way to get concurrency in OCaml...
Oliver Bandel said:
I have thought about such stuff, but ot implemented 
(because there was no real need for it). 
Since I found OCamlP3l, I'm not shure if implementing 
such stuff would make sense, because using OCamlP3l 
might be the right way. 


I didn't tried it so far. 

If people already have done (or will do now), 
I would be interested in feedback.
Luc Maranget said:
This is what we did for a a few examples of using JoCaml, 
the soon-to-be-released extension of OCaml for concurrent programming. 
Fork/Exec is an easy way to get simultaneous execution. 

JoCaml is not released yet, (I am writting the doc 
and web site at the moment). 

The much incomplete web site is at 

One example of fork under jocaml control 

The example may not meet all your concerns (speed I guess), 
but you can replace the shell in the example by somme 
C or Ocaml program that computes something and refine the control 
to collect results.

concurrency using forks and pipes (some working code)


Christopher Cramer said:
Jon Harrop asked about this recently and I thought it was an interesting 
idea and implemented it: 

(BTW I don't think you are "faking" concurrency by using fork...)

Faking concurrency using Unix forks and pipes (ray tracing results)


Jon Harrop said:
I just got a first working version of .NET style asynchronous invocation 
working in OCaml using process forking. 

The following OCaml function forks a new process and computes "f x" in that 
process, returning a function that blocks and returns the result using 

let invoke (f : 'a -> 'b) x : unit -> 'b = 
  let input, output = Unix.pipe() in 
  match Unix.fork() with 
  | 0 -> 
    Unix.close input; 
    let output = Unix.out_channel_of_descr output in 
    Marshal.to_channel output (try `Res(f x) with e -> `Exn e) []; 
    exit 0 
  | _ -> 
      Unix.close output; 
      let input = Unix.in_channel_of_descr input in 
      fun () -> 
        match Marshal.from_channel input with 
        | `Res x -> x 
        | `Exn e -> raise e 

This function tries to account for reraising exceptions on the parent process 
but that is untested. 

You can write a higher-order "map" function in terms of invoke like this: 

let ( |> ) x f = f x 

let map (f : 'a -> 'b) a : 'b array = (invoke f) a |> (fun f -> f()) 

When you apply this map to an array, a new process is forked for each element. 
As forking is time consuming, you should only apply this to short arrays. 

The performance characteristics of this approach are very interesting. 
Firstly, I can observe doubled performance on my dual core by invoking two 
simple but CPU-intensive operations concurrently: 

  map fib [|43; 43|] 

However, performance is easily degraded using this approach, partly because 
forking is expensive but also because of other effects that I do not yet 
understand. My original benchmark summed the elements of an array using 
fold_left. For some reason, this is extremely inefficient, as if the entire 
array is copied. 

Anyway, this function is so simple that it took no time to work it into my ray 
tracer benchmark. The benefits of concurrency on my dual-core system reduce 
the time taken by OCaml from 4s to 3s. 

I'll try a concurrent F# version and see how it compares...

Announcing OMake; pre-announcing OMake 0.9.9 with automated C library wrapping for OCaml


Aleksey Nogin announced:
We are proud to announce the latest release of the OMake Build System - 
OMake ("stable") and pre-announce that OMake 0.9.9 ("unstable") 
is expected to be released within a week. 
OMake is a build system designed for scalability and portability. It 
uses a syntax similar to make utilities you may have used, but it 
features many additional enhancements, including the following. 

   - Support for projects spanning several directories or directory 

   - Fast, reliable, automated, scriptable dependency analysis using MD5 
     digests, with full support for incremental builds. 

   - Fully scriptable, includes a library that providing support for 
     standard tasks in C, C++, OCaml, and LaTeX projects, or a mixture 

     Often, a configuration file is as simple as a single line 

     .DEFAULT: $(OCamlProgram prog, foo bar baz) 

     which states that the program "prog" is built from the files,, and This one line will also invoke the default 
     standard library scripts for discovering implicit dependencies in 
     OCaml files. 

   - Full native support for rules that build several files at once. 

   - Portability: omake provides a uniform interface on Linux/Unix 
     (including 64-bit architectures), Win32, Cygwin, Mac OS X, and other 
     platforms that are supported by OCaml. 

   - Built-in functions that provide the most common features of programs 
     like grep, sed, find, and awk. These are especially useful on Win32. 

   - Active filesystem monitoring, where the build automatically restarts 
     whenever you modify a source file. This can be very useful during 
     the edit/compile cycle. 

   - A built-in command-interpreter osh that can be used interactively. 

OMake preserves the style of syntax and rule definitions used in 
Makefiles, making it easy to port your project to OMake. There is no 
need to code in Perl (cons), or Python (scons). However, there are a few 
things to keep in mind: 

  1. Indentation is significant, but tabs are not required. 
  2. The OMake language is functional: functions are first-class and 
     there are no side-effects apart from I/O. 
  3. Scoping is dynamic. 

OMake is licensed under a mixture of the GNU GPL license (OMake engine 
itself) and the MIT-like license (default configuration files). 

Additional information and extensive documentation can be found on OMake 
Home Page at 

OMake version is a minor feature enhancements and bugfixes 
release. The changes in this version include: 

   - Made it easy to define default ("implicit") rules for phony targets. 
   - Detect case-insensitive filesystems on Unix-like operating systems 
     (especially important under Mac OS X). 
   - A number of performance improvements. 
   - Documentation improvements. 

For a more verbose change log, please visit . 

Source and binary packages of OMake may be downloaded from . In addition, OMake may be 
obtained via the GODI packaging system. 

To try it out, run the command "omake --install" in a project directory, 
and modify the generated OMakefile. 

Even though we call it "stable", OMake should still be 
considered an alpha release. While we have made an effort to ensure that 
it is bug-free, it is possible some functions may not behave as you 
would expect. Please report any comments and/or bugs to the mailing list and/or at 

OMake 0.9.9 will feature a large number of major changes that Jason 
Hickey have been working on for the last two years. These changes include: 

  * Completely redesigned variable naming semantics aimed at making sure 
    that similarly named unrelated variables from different source 
    files do not clash. 

  * Optional ("keyword") arguments to functions. 

  * An option to use an alternative "programming-language-style" syntax, 
    where all string constants have to be quoted, but variable and 
    function references do not have to use the $(...) syntax. 

  * Dynamic loading of C libraries, including: 

    - Tools for automated creation of OCaml wrappers to C libraries by 
      parsing the C header files. 

    - As an example, _automatically generated_ OCaml and OMake wrappers 
      for the GTK library. 

       - As a demo for the above, an OMake GUI capable of presenting a 
         browseable dependency tree, and much more. 

  * And many other features (much more complete and detailed list will 
    accompany the release).

ODT: OCaml Development Tools 1.1 released


Emmanuel Dieul announced:
The new version of ODT has been released this evening. It now supports 
execution via the Java 1.5 JVM and, most of all, has a new feature: automatic 
Automatic indentation has been developed in less than two weeks 
(2 or 3 hours a day) and is completely configurable via an XML file: 
incredible !!! 
This release also fixes (minor) bugs. 

Everything is available on the ODT website: 
The "overview" page explains its current main features, and the 
"install notes" page details some requirements and incompatibilities 
with older versions of OCaml. 
Some screenshots are also available to show the GUI. 

Please, don't hesitate to try ODT (for personal or professional use) 
and forward this mail to anyone which could be interested in.

JoCaml Released.


Luc Maranget announced:
We are happy to annouce the release of JoCaml.

  JoCaml is an extension of Objective Caml for concurrent
  and distributed programming based upon the join calculus.
  More details (including a tutorial) are available on
  the jocaml web site:

  The new JoCaml (born again jocaml) is a total re-implementation of
  the new defunct JoCaml by F. Le Fessant. With respect to this previous
  version, changes are important.

    * New syntax. Believe it or not, the new syntax is better.

    * More convenient command set (bytecode compiler jocamlc,
      toplevel jocaml, native code compiler jocamlopt).

    * Disparition of mobility features. More reasearch is needed
      for those, besides they break OCaml compatibility.

    * Full compatibility with OCaml. For that reason, we adopt
      OCaml releasing scheme: initial version of JoCaml is 3.10.0.

Louis Mandel & Luc Maranget (
Joel Reymont asked and Luc Maranget answered:
> How often do you pull patches from the OCaml tree into the JoCaml tree? 

At every ocaml release. 

> How is this done? 

I wonder why you neeed such information, nevertheless here it is: 
Basically, JoCaml is the 'jocamltrunk' branch in ocaml CVS, 
it also has a 'module' name: jocsl. 

Syncing jocaml with ocaml is a two step process: 

Checkout jocaml: 
cvs co -kk -r jocamltrunk jocsl 

Perform changes from (ocaml) release1 to release2 in jocaml 
cvs update -kk -j release1 -j release2 

(There are a few details left, such as bootstrap)
Yaron Minsky asked and Luc Maranget answered:
> A couple of questions: 
>   - Why is it that JoCaml is a full OCaml distribution as opposed to 
>   just a set of libraries plus a syntax extension.  Was there some 
>   particular 
>   feature that required hacking the compiler directly, or was it just more 
>   convenient to build it that way? 

  As far as I know, access to the guts of the compiler is required at 
  least for the following two features. 
    - Specific typing rules. 

    - Pattern matching compilation. 
      See the buffer example in the doc for instance 

  Besides, JoCaml is not a full OCaml distribution. JoCaml is a restricted 
  OCaml distribution. On the light side, JoCaml compilation is very fast; 
  on the dark side, some of OCaml tools are not available, (camlp4, ocamlbuild, 

>   - What do you think the future of JoCaml is?  Any thoughts on whether 
>   it will be supported in the future, and in particular whether it will get 
>   merged back into the OCaml mainline tree? 

    I can only wish a bright future to JoCaml :) 
    Our team will support JoCaml. 

    Merging JoCaml into the OCaml mainline tree is another story. We have 
    no plans for that at the moment. Let us wait a bit for JoCaml success 
    to deprecate OCaml thread libraries.
Jon Harrop asked and Luc Maranget answered:
> Is there a JoCaml mailing list? I don't want to spam the OCaml list with my 
> noob questions (writing a concurrent ray tracer). :-) 

There is a JoCaml mailing list 
(As for the OCaml list, only subscribers can post). 
By the way we also have a web site 
with such information and others.

plc, a One-Day Prolog Compiler


Bruno De Fraine announced:
I always thought that the presentation "One-Day Compilers", in which   
Graydon Hoare builds a Makefile compiler on top of ocamlc and gcc   
using camlp4, contains very good material to demonstrate the power   
and practicality of OCaml (and camlp4) to a certain audience. (For   
other audiences it might be better to show that you can program a   
real-time 2D simulation of bouncing balls in under 400 lines of   
code.) It's also very funny: 

Nevertheless, some elements about the presentation could be improved:   
(i) the detour via C and gcc makes things needlessly complex since   
ocamlopt can directly generate native and independent executables   
(although historically that was perhaps not yet the case in 2002),   
and (ii) the choice of source language (Makefiles) does not seem   
optimal in the sense that build files hardly contain any computation. 

Anyhow, after some frustration about the execution speed when solving   
a problem with existing Prolog implementations, I had the idea that   
maybe a compiler - even a one-day one - could improve, and that it   
would make a fabulous example for Graydon's approach. So I spent one   
figurative day trying to build a Prolog compiler (plc) based on a   
simplified set-up: 
- a camlp4 preprocessor converts Prolog code to an OCaml AST that   
embodies the (possible) computations 
- ocamlopt can compile the OCaml AST to native code 

At the moment, I have plc working for a simple subset of Prolog   
(consisting only of atoms, variables, predicates and rules; no   
structures/lists, no integers). You can check out the code (for OCaml/ 
camlp4 3.09, no 3.10 yet) at: 

(Look at and try "make demo". To see what is executed, look   
at "make demo.output" and 

Since there is not a lot of documentation, I will elaborate a bit below. 

The difficult part is in, and the approach I take to   
translate Prolog to OCaml, is to represent each predicate as a number   
of functions, one for each variation in state (open or closed) of the   
arguments. The functions take a parameter for each "closed" argument,   
as well as a function parameter. They invoke the latter for each   
solution, with the binding of the "open" arguments as parameters. To   
find solutions, we invoke the functions of the respective subgoals. 

For example, a predicate with two arguments, such as sibling/2, is   
translated to four versions: 

(* both arguments open *) 
val sibling_oo : (atom -> atom -> unit) -> unit 
(* first argument closed *) 
val sibling_co : (atom -> unit) -> atom -> unit 
(* second argument closed *) 
val sibling_oc : (atom -> unit) -> atom -> unit 
(* both arguments closed *) 
val sibling_cc : (unit -> unit) -> atom -> atom -> unit 

If this predicate is defined by the following rule: 

sibling(X,Y) :- parent(Z,X), parent(Z,Y). 

The bodies of the sibling_xx functions will invoke the translations   
of the parent/2 predicate in a manner appropriate to the binding   
state of the variables, for example: 

let sibling_oo _f = 
   (* subgoal parent(Z,X) where Z and X are still open *) 
   parent_oo (fun z x -> 
     (* subgoal parent(Z,Y) where Z is closed and Y open *) 
       parent_co (fun y -> 
         (* solution for sibling(X,Y) *) 
         _f x y) 


let sibling_co _f x = 
   parent_oc (fun z -> parent_co (fun y -> _f y) z) x 

And so on. 

As can be seen, the conjunction (,) is modeled as a nesting of the   
invocations. plc also handles disjunction (multiple rules for the   
same predicate) by translating them to a sequence of statements. Of   
course, it also handles atoms (as in "person(joe).") and reuse of a   
variable inside one goal (e.g. "same(X,X).") by appropriate bindings   
and/or tests. I believe all of this models the execution semantics of   
Prolog correctly. 

I plan to extend plc at some point to support a more complete   
language (so that it can handle the N-Queens problem, for example).   
The intent of this early announcement is to raise some feedback and   
(maybe) help. Do you see obvious mistakes? Could this approach be   
extended to support structures/lists? (Will it be fast?) Can the   
approach/implementation still be improved? Do you know just the right   
high-order functions to make more readable? Etc.
Bruno De Fraine then added (find the attached files at the archive):
An addendum to my previous message: 

Some preliminary benchmarks indicate that plc is a vastly faster   
execution platform than tried-and-tested (and optimized) Prolog   
engines such as SWI Prolog and Sicstus Prolog (although it currently   
only supports a restricted language). 

For my example program, it is almost 12x faster than SWI and 3.7x   
faster than Sicstus. 

Some details: solving the win-predicate in the attached Prolog-file   
causes a search in the space of all six-letter words (i.e. 26^6   
strings); it reports those words that are "prolog". I collected the   
following timings on my computer (Intel Core 2 Duo, MacOS X 10.4.9): 

SWI-Prolog (Multi-threaded, Version 5.6.10): 

$ time swipl -s -g true -t "win(A,B,C,D,E,F),write 
% compiled 0.00 sec, 4,280 bytes 
[p, r, o, l, o, g] 

real    0m59.065s 
user    0m58.845s 
sys     0m0.073s 

SICStus 4.0.1: 

$ time ~/sicstus/bin/sicstus -f -l --goal "win 
% compiling /Users/bruno/my_svn/plc/ 
% compiled /Users/bruno/my_svn/plc/ in module user, 0 msec   
2312 bytes 
SICStus 4.0.1 (i386-darwin-8.9.1): Tue May 15 14:53:23 CEST 2007 
Licensed to Bruno De Fraine 

real    0m18.474s 
user    0m18.417s 
sys     0m0.038s 

plc and ocamlopt 3.09.3: 

$ time { make prolog.cmx; make driver.cmx; ocamlopt -o driver.opt   
prolog.cmx driver.cmx; ./driver.opt; } 
ocamlopt.opt -c -dtypes -pp 'camlp4 ./plc.cma pr_dump.cmo -impl' - 
ocamlopt.opt -c -dtypes 

real    0m5.069s 
user    0m4.749s 
sys     0m0.191s

Using folding to read the cwn in vim 6+

Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).

:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'<1':1

If you know of a better way, please let me know.

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.

Alan Schmitt