Caml Weekly News

Hello

Here is the latest Caml Weekly News, for the week of 27 December, 2005 to 03 January, 2006.

PIC
Ask-if-continue wrapper?
Announcing OMake 0.9.6.7
Persistent storage and stability of Marshal?
LablPCRE 0.9 - a PCRE binding for Objective Caml

PIC

Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/31833

John Skaller asked and Xavier Leroy answered:

> I notice Ocaml 3.09 has -fPIC option. Thanks! This is a step
> towards dynamic loading in native code.
>
> I wonder, is it actually supported -- or even possible
> to load native code (on suitable platforms) at startup
> from C? From an Ocaml mainline?
>
> What's left to be able to 'dlopen()' a function library?
> From C? From Ocaml? Will that ever be possible?

Encapsulation of compiled OCaml code as a shared library that can be
called from C is possible, on x86 at least.

The procedure is similar to that described in section 18.8 of the
reference manual, namely, use "ocamlc -output-obj" or
"ocamlopt -output-obj" to package the Caml code as a .o file, then use
"gcc -shared" (or equivalent commands to build shared objects) to group
together this .o file, the C stub code that calls back into OCaml, and
the OCaml run-time system.

This is a very good approach to use OCaml code from other languages
(e.g. Java) or as plug-ins (e.g. in Apache).

The reason it works particularly well for x86 is that dynamic loaders
for x86 (both under Unix/Linux and Windows) can relocate arbitrary
code, not necessarily PIC code.  This is unfortunately not the case
for all target architectures.  In particular, dynamic loaders for
x86_64 (AMD64) are much less forgiving.  The -fPIC option to ocamlopt
you mention was added to the AMD64 code generator in an attempt to
generate "more position-independent" code.  However, it does not quite
work yet.  A difficulty is the paucity of documentation on what,
exactly, the Linux AMD64 dynamic loader can relocate.

Dynamic loading of OCaml code from OCaml code raises many other
issues.  It is currently supported for bytecode only, and will not be
available for native code in the forseeable future.  I have already
discussed this on this list earlier.

> [Another question, whether one can wrap native code
> so it can be called from bytecode.. this would allow
> bytecode programs to generate and compile bytecode
> extensions .. without the bulk of the code running
> slowly]

Again, I believe this has been discussed at length before.  The short
answer is: no.  A longer answer is the asmdynlink library by Fabrice
Le Fessant, which (used to) enable this kind of things with some speed
penalty on the interpretation of the bytecode.

Ask-if-continue wrapper?

Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/31841

Stephen Brackin asked and Xavier Leroy answered:

> I'd like an OCaml function, which I'll call continueq, with the property
> that for any function f with argument(s) fargs,
>
> continueq f fargs tsecs defaultval
>
> starts evaluating f on fargs and lets this evaluation proceed for up to
> tsecs seconds. If the computation of (f fargs) completes in this time,
> then it returns the result of that computation. Otherwise, it asks the
> user how many seconds to let the computation of (f fargs) proceed. If
> the user inputs a value less than or equal to 0, then it returns
> defaultval. If the user inputs a value tsecs' greater than 0, then it
> evaluates
>
> continueq f' fargs' tsecs' defaultval
>
> where (f' fargs') denotes the computation state of (f fargs) at the time
> it was interrupted.

The latter ("the computation state of ...") is not something you can
manipulate programmatically in OCaml.  However, there is no need to:
Unix timer signals are sufficient to do what you want.  (If you're on
Windows, there's nothing I can do for you, in this particular instance
and in general.)  See below for the code.

A word of caution: if your function f performs I/O operations, be
prepared for them to fail with a EINTR Unix error.  That's the Unix
(SVR4 / POSIX) way of being unhelpful...

- Xavier Leroy

----------------------------------------------------------------------

let set_timer tsecs =
  ignore (Unix.setitimer Unix.ITIMER_REAL
                         { Unix.it_interval = 0.0; Unix.it_value = tsecs })

exception Timeout

let handle_sigalrm signo =
  print_string "Continue for how long? "; flush stdout;
  let f = read_float() in
  if f <= 0.0
  then raise Timeout
  else set_timer f

let continueq f arg tsecs defaultval =
  let oldsig = Sys.signal Sys.sigalrm (Sys.Signal_handle handle_sigalrm) in
  try
    set_timer tsecs;
    let res = f arg in
    set_timer 0.0;
    Sys.set_signal Sys.sigalrm oldsig;
    res
  with Timeout ->
    Sys.set_signal Sys.sigalrm oldsig;
    defaultval

Jonathan Roewen also answered:

Sounds like you'd need continuations to do it nicely/properly. I have
no idea if it's possible to support continuations in ocaml. But why
not use a couple of threads?

Something like:

let continueq f args timeout timeout_val =
  let result = ref timeout_val in
  let on_done = ref (fun () -> ()) in (* hack cause I dunno how to get
around the order of declarations *)
  let do_computation () = result := f args; !on_done () in
  let comp_t = Thread.create do_computation () in
  let rec do_timeout t =
    if t <= 0 then
      Thread.kill comp_t
    else begin
      Thread.delay (float_of_int t);
      print_string "set timeout: ";
      do_timeout (read_int ());
    end in
  let timeout_t = Thread.create do_timeout timeout in
  on_done := (fun () -> Thread.kill timeout_t);
  Thread.join timeout_t;
  !result;;

Very ugly, but should do what you want. I have to add that one of the
thread implementations for ocaml doesn't implement Thread.kill -- I
can't remember which -- which would break this horribly.

Announcing OMake 0.9.6.7

Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/31848

Aleksey Nogin announced:

We are proud to announce the latest release of the OMake Build System -
OMake version 0.9.6.7.

OMake is a build system, similar to GNU make, but with many additional
features. The home site for OMake is http://omake.metaprl.org/ . OMake
features include:

    o Support for projects spanning several directories or directory
      hierarchies.

    o Comes with a default configuration file providing support for
      OCaml, C and LaTeX projects, or a mixture thereof.
      Often, a configuration file is as simple as a single line

         OCamlProgram(prog, foo bar baz)

      which states that the program "prog" is built from the files
      foo.ml, bar.ml, and baz.ml.

    o Fast, reliable, automated, scriptable dependency analysis using
      MD5 digests.

    o Portability: omake provides a uniform interface on Win32, Cygwin,
      and Unix systems including Linux and OS X.

    o Builtin functions that provide the most common features of programs
      like grep, sed, and awk.  These are especially useful on Win32.

    o Full native support for rules that build several files at once.

    o Active filesystem monitoring, where the build automatically
      restarts whenever you modify a source file.  This can be very
      useful during the edit/compile cycle.

    o A companion command interpreter, osh, that can be used
      interactively.

OMake preserves the style of syntax and rule definitions used in
Makefiles, making it easy to port your project to omake.  There is no
need to code in perl (cons), or Python (scons).  However, there are a
few things to keep in mind:

    1. Indentation is significant, but tabs are not required.
    2. The omake language is functional: functions are first-class
       and there are no side-effects apart from I/O.
    3. Scoping is dynamic.

OMake is licensed under a mixture of the GNU GPL license (OMake engine
itself) and the MIT-like license (default configuration files).

OMake version 0.9.6.7 in minor bugfixes/feature enhancements release.
The changes in this version include:

   - Added basic support for C++ projects to OMake standard library.
   - Several improvements in OCaml.om module of the standard library.
   - Portability improvements.
   - Minor documentation fixes.
   - A few other bugfixes and improvements.

For a more verbose change log, please visit
http://omake.metaprl.org/changelog.html#0.9.6.7 .

Source and binary packages of OMake 0.9.6.7 may be downloaded from
http://omake.metaprl.org/download.html . In addition, OMake may be
obtained via the GODI packaging system (release lines "3.08", "3.09" and
"dev").

To try it out, run the command "omake --install" in a project directory,
and modify the generated OMakefile.

OMake 0.9.6.7 is still an alpha release.  While we have made an effort
to ensure that it is bug-free, it is possible some functions may not
behave as you would expect.  Please report any comments and/or bugs to
the mailing list omake@metaprl.org and/or at http://bugzilla.metaprl.org/

Persistent storage and stability of Marshal?

Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/31836

David Mentre asked:

I need to implement some kind of persistent storage in my application
(demexp web proxy part and demexp server part), so I'm looking at the
different possible alternatives.

Roughly, I've found:

 - Dbm standard module of OCaml: file storage, no external dependency,
   safe?

 - Berkeley DB 4.x interface[1]: file storage, seems safe;

 - SQL database[2]: real database, complicated to setup, safe.

Right now, I don't have big needs: few tables of moderate size (below
100,000 records) so I'm in favor of simple solutions: Dbm or Berkeley DB
(no need to setup an external SQL server). I need to lookup values
through known keys. The key and stored data are simple records of
strings and integers. In the future, I might need to lookup rows
according to certain field value.

Looking at code using Berkeley DB, I noticed that stored key and data is
made using the Marshal module. However, Marshal doc says that
marshalling is "compatible across all machines for a given version of
Objective Caml", so there might be issues when changing OCaml version. 

So my questions:

 1. Is the Marshal module that much unstable? I have the feeling that
    marhsaled values are compatible between OCaml releases. True or
    false?

 2. Any advice on implementing persistent storage? I know about Persil
    library[3] but I don't see much advantage of using it (does it
    implement its own marshaling, more stable than Marshal?).

Best wishes,
d.

Footnotes: 
[1]  http://www.eecs.harvard.edu/~stein/ocamlbdb-4.3.21.tar.gz

[2]  OCamlDBI: http://caml.inria.fr/cgi-bin/hump.fr.cgi?contrib=381

[3]  http://cristal.inria.fr/~starynke/persil/

Xavier Leroy answered:

>  1. Is the Marshal module that much unstable? I have the feeling that
>     marhsaled values are compatible between OCaml releases. True or
>     false?

True in practice: the last incompatible change was done in 1996...
Still, the Marshal module is not intended for long-term storage of
data that you cannot easily reconstruct from other sources.  In
particular, the data format is compressed enough that it is not
possible to salvage data using, say, a text editor.  At the very
least, it is prudent to build into your software functions to dump and
restore marshaled data to/from a simpler, textual format.  (That's
what I did for SpamOracle's database, which is a marshaled hash table.)

>  2. Any advice on implementing persistent storage? I know about Persil
>     library[3] but I don't see much advantage of using it (does it
>     implement its own marshaling, more stable than Marshal?).

There are two independent questions:

1- How to encode your data into character strings?
2- How to store these strings in persistent storage?

For 2-, you have many options, ranging from flat files (the
traditional Unix way) to full-fledged databases.

For 1-, in addition to the Marshal module, you can also use textual
formats: key-value pairs, XML, Lisp S-expressions, etc.  Other
options are XDR or ASN1 encodings.

Owen Gunden suggested:

> For 1-, in addition to the Marshal module, you can also use textual
> formats: key-value pairs, XML, Lisp S-expressions, etc.  Other
> options are XDR or ASN1 encodings.

You could consider using the fantastic S-expression library, developed
at Jane Street Capital, for this step.

Here's the thread where sexplib was introduced:

http://tinyurl.com/9m4s4

Florian Weimer suggested and David Mentre answered:

> Have a look at SQLite.  I like it a lot.  During the past couple of
> months, even a few Ocaml bindings have sprung into existence.

Yes, I looked at it through Dbi[1]. There is however an issue: it seems
that everything is stored as a string in an SQLite base. Returned data
after Dbi query are typed as string.

It might be an issue with Dbi itself (I haven't looked to the SQLite
binding themselves) but after looking at SQLite documentation, it does
not seem to be the case[2].

I could work around this small issue by easily converting strings to
integer, float, etc. but it seems a bit awkward to me. Beside that,
you are right the SQLite is quite interesting: full SQL database but
without the need to setup an SQL server.

Best wishes,
david

Footnotes: 
[1]  http://download.savannah.nongnu.org/releases/modcaml/

[2]  http://www.sqlite.org/datatypes.html
     ""
      SQLite is "typeless". This means that you can store any kind of
      data you want in any column of any table, regardless of the
      declared datatype of that column.
     ""

Florian Weimer then said:

It's got manifestant types.  There are strings, integers, reals and
BLOBs (and NULL, of course).  You need a binding which stores integers
as integers, and not as strings, though.  SQLite mostly ignores column
types and uses only the type of the value you insert into the
database.

LablPCRE 0.9 - a PCRE binding for Objective Caml

Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/31828

Following the announcement from last week, Owen Gunden asked and Robert Roessler answered:

> What is the advantage of your PCRE bindings over Markus Mottl's
> pcre-ocaml?  How do they differ?

At the time (mid-June 2005), Markus' package would not build properly 
on Windows... he invited me to contribute a fix, and in good open 
source style, I built my own. ;)

As I was already familiar with PCRE in its "POSIX" interface guise, I 
was looking for a relatively simple interface... I found the sheer 
comprehensiveness of Markus' binding (giving access to *all* of PCRE) 
daunting.  So, deciding that others with modest pattern-matching needs 
might also appreciate a simpler interface, I built LablPCRE (certainly 
not as a replacement, but as a small-footprint alternative).

There are really only three "major" functions in LablPCRE: regcomp to 
compile REs, regexec to test them against input, and regmatch when all 
that is needed is a simple match/nomatch query (this last is my own 
low-resource-consumption addition - it is not included in the POSIX API).

In addition, of course, there are a handful of functions for accessing 
the match state, errors, and/or any captured substrings from a regexec 
invocation.  Some effort is made to make a Pcre.t value as light as 
possible, e.g., a reference to the tested string will only be kept if 
the match succeeded *and* substring capture was requested.

The original release made the above (minus regmatch) available in an 
"object" form... subsequent experience with OCaml vernacular and idiom 
suggested that a "module" interface was superior, so the current 
release has been re-oriented to that style (with the original object 
interface still available).

> Is there API documentation for your library on the web somewhere?

Yes - in the LablPCRE-0.9.tar.gz file on the download site. :)

As seems to be somewhat common in the OCaml world, there is a 
commented .mli file, and a README.txt with more in-depth discussions 
and examples.

Christoph Bauer then said:

> At the time (mid-June 2005), Markus' package would not build properly
> on Windows... he invited me to contribute a fix, and in good open
> source style, I built my own. ;)

OCaml-MinGW-Maxi contains a working version of ocaml-pcre. You can
find my build notes under
http://lasagne.unix-ag.uni-kl.de/omm/protokol.txt ., Section 3. The Makefile.mingw
can be found at http://lasagne.unix-ag.uni-kl.de/omm/ .

Using folding to read the cwn in vim 6+

Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).

:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'&lt;1':1
zM

If you know of a better way, please let me know.

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.

Alan Schmitt