Previous week   Up   Next week

Here is the latest Caml Weekly News, week 05 to 11 december, 2001.

1) Code-base size comparison vs Language
2) Unreliable Threading?
3) Ensemble release 1.33
4) embedding Ocaml in an existing application
5) SML -> OCaml

1) Code-base size comparison vs Language
David McClain elaborated:

I just finished my experiment to reduce the size of a fielded application by
recoding in either of Lisp or OCaml. I had early indications that, aside
from pure ease of programming in these HLL's, the overall code base would be
drastically reduced (5x to 6x). That is certainly true if you count all the
source code needed to produce the application, but an honest, impartial,
comparison of the lines I actually had to write, of non-reusable,
application specific code shows somewhat disappointing results on this basis

The application is a system network server that performs recursive prefix
mappings of file pathnames, including environment variable substitutions.
This is a variation on the system provided by the Sprite experimental OS
developed at UCB by John Ousterhout, et. al. in the late 1980's and early

The existing version was coded in M$ VC++ making heavy use of STL. It is a
COM/OLE process server based on M$ ATL. All three versions retain a machine
generated ATL wrapper code for this COM/OLE behavior -- I only needed to
write a few lines of IDL to produce the basic skeleton, and all three
versions use identical stuff here...

For the application specific coding, the scores are:

Existing App:
C/C++ = 1106 LOC

Lisp Version:
C/C++ = 284 LOC, Lisp = 798 LOC  --> Total = 1082 LOC

OCaml Version:
C/C++ = 284 LOC, Lisp = 58 LOC, OCaml = 453 LOC --> Total = 888

These LOC counts do *not* include blank lines and comment only lines.

On the basis of code-base size reduction, these results are nearly a tie.

But on the basis of ease of programming, I have to award Lisp first,
followed by OCaml, and distantly trailed by C/C++. The reasons for this are:

1. Lisp is a huge langauge with nearly everything you need already built in.
But it produces very bulky DLL's -- on the order of 15 MBytes.

2. OCaml is equally terse as Lisp, or even slightly better, but needs a fair
amount of additional support routines written, to cover the application
needs. Some of this is in C/C++ (very little) but most has to do with
providing things like unwind-protect, generalized string handling,
generalized list operations. It produces very fast runtime code (not needed
here) and quite reasonably sized DLL's -- about 300 KBytes (50x smaller than

3. C/C++, making heavy use of classes and STL is nearly unreadable, took a
long time to program, and is frightening to revisit after some time away
from it (1 year or more since original writing). C/C++ retains the
capability to utilize Unicode (FWIW -- I don't really need it), but it was
written with some embedded bugs that I found only when I was able to remain
at the abstract levels permitted by HOL's.

Both the Lisp and OCaml versions were written in the course of 2-3 hours.
Writing the C/C++ version took the better part of 1 week. Prior to that I
had written experimental versions in Lisp and had more than a year of
playing with the system to get an understanding of the needed algorithms.

I will say that both Lisp and OCaml allowed me to spot some errors in the
C/C++ implementation, fix those errors, and add some extra capability (about
20 LOC in both Lisp and OCaml for the extra stuff).  I estimate the time
needed to go back and refamiliarize myself with STL and the internal
architecture of the existing application -- in order to fix the bugs I
discovered and add the additional capabilities -- would be several days.

I find it remarkable that OCaml has a slight edge on Lisp for terseness of
expression. OCaml is a highly expressive syntax and you can say quite a lot
in a few keystrokes. Lisp tends to be more wordy, use longer identifiers,
and the code is quite a bit sparser for semantic content over a given number
of LOC.

This is as close as I can come to providing an honest, impartial, comparison
of these languages for the purpose of rewriting existing code to be more
maintainable, robust, and correct. I definitely think the effort is
worthwhile, but not entirely for the reasons I had originally anticipated.

2) Unreliable Threading?
David McClain remarked:

Earlier, when developing a mixed C/C++ and OCaml program, it looked as
thought the OCaml threading code was "unreliable" in that I would get
crashes under the MSVC++ debugger of "Invalid Handle" errors. But running
the code outside of the debugger works just fine.

I now believe the problem has nothing to do with OCaml or its threading
code, but rather the way the M$ debugger attempts to walk stack frames
looking for contexts to report. Since OCaml threading does its own thing, M$
appears to get lost in this process and ends up crashing on its own --
incorrectly attributing the error to the code under test.

All efforts to use OCaml multithreading and Channels for inter-thread
coordination HAVE been quite reliable outside of the M$ development

3) Ensemble release 1.33
Ohad Rodeh announced:

A final release of Ensemble is now ready. This release includes several
interesting libraries for Caml:
1. Cryptographic library interfacing with OpenSSL.
2. A socket library using winsock2 and scatter-gather on Linux, with
   support for raw C iovectors.                                                                    
3. A merge tool allowing packaging of Caml libraries.

All proprietry dependecy generation and build tools have been discarded.
This has two benefits: (1) The license can be BSD-type  (2) it is easy
to cut-and-paste complete files and libraries.

I now have to start working on some "serious" projects, and I no longer
have the time to develop the system. I'll keep maintaining it though.

As usual the system is available from:

  All the best,

     RELEASE_NOTES for Ensemble version 1.33
Author: Ohad Rodeh
Last updated: 6/12/2001


Removed proprietary build tools.

1. Rewritten all the makefiles, all proprietary build tools have been
   removed.  The makefiles have been simplified, and can be easily read.
   Only standard tools are now used (ocamlc, ocamlopt, ocamldep,
   GNU-make, gcc, cl, nmake).
2. A large number of makefile bugs have been removed, makefile code has
   been considerably reduced.
2. Cleanup up and updated INSTALL.htm.
3. Copyright violations have been fixed, the Ensemble code base contains
   no CAML code, nor any other code that is not BSD.
4. Work is underway to add IBM to the copyright holders.
5. Many small bug fixes.

  We are using version 3.01 for this version.

  This version was tested on Linux, WIN2000, and NT4. Light testing was
  carried out on Sparc/Solaris. CE still does not run on WIN32 (yet).

4) embedding Ocaml in an existing application
Basile STARYNKEVITCH wondered:


Some remarks (and wishes) about embedding Ocaml (bytecoded) in an
existing application.

I just embedded (a single day hack) Ocaml inside the Larbin web
crawler (coded in C++ by Sebastien Ailleret) on You should be able to
get the patch from my personal web page or else email me either at
work <> or at home
<>. This because I need (for the POESIA
project) to scan some part of the Web and get some crude measures of
features like: percentage of Web pages using Javascript (about 50%)
and mean length of such scripts (about 2Kbytes). So I hacked larbin to
add Ocaml inside it.

I used ocaml v3.02 or v3.01 (bytecode interpreter).

First remark: the ocaml linking mechanism use the undocumented -p
flag to dump all the Ocaml symbol table. So I had to hack Larbin's
main routine to test for the special -p single argument and then call
only ocaml_main_init(argv); and exit(0) in that case. I believe that
it would be quite trickly to embed Ocaml inside a significant
application which uses the -p flag (see the code in
ocaml/byterun/startup.c routine parse_command_line). At least I
suggest that the linking mechanism use a less natural convention (such
as "-print-ocaml-symbols" as argv[1] of main) and that it could be
documented. (and I did not found quickly where Ocaml invokes this
magic -p flag).

Second remark: for embedding Ocaml inside Larbin I need the
-use-runtime flag to the ocamlc compiler. But it seems that the next
version (v3.03) does not have it anymore. I hope that it will still be
possible to embed OCaml inside an application without having to
compile the whole application as a shared library.

3rd remark:
why don't caml_main take and changes the main argc,argv arguments
(like e.g. gtk_init routine in the Gtk toolkit) so that the
application can use the other arguments. Otherwise, I suggest to
publicize the parse_command_line routine so an embedding application
could parse its own main arguments and pass some others to ocaml
before starting the Ocaml interpreter. (imagine you want to embed
Ocaml inside an existing Gtk application?).

My previous post
regarding a tiny syntactic extension was related to this [Larbin]
patch. I think that novice Ocaml users might be interested in a
syntactic sugar to register bound values.

(Xavier Leroy posted a long reply to this post that can be found at

5) SML -> OCaml
Daniel de Rauglaudre said:

> Is there an SML -> OCaml translator?  I have some SML programs I'd like
> to convert... even something that only partly works would save me a fair
> amount of tedium.

You can try:
     camlp4 pa_sml.cmo pr_o.cmo 

But it is probably incomplete: you may get syntax errors.
And even if it is ok, you probably have to change things by hands. 
You can ask me if you want me to complete of fix some bugs.


Alan Schmitt