Previous week Up Next week

Hello

Here is the latest Caml Weekly News, for the week of February 16 to 23, 2010.

  1. The need to specify 'rec' in a recursive function defintion
  2. ocamlnet-3.0test2
  3. ORM-0.5 and Dyntype-0.7
  4. Wrapping var_args, or C ... in ocaml?
  5. range of hash function
  6. Other Caml News

The need to specify 'rec' in a recursive function defintion

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/0d579e64dd00daff#

Deep in this thread, Ashish Agarwal said:
It may be worth recalling the OCaml koans at http://eigenclass.org/hiki/fp-ocaml-koans. The first one is:

let rec

One day, a disciple of another sect came to Xavier Leroy and said mockingly:

"The OCaml compiler seems very limited: why do you have to indicate when a
function is recursive, cannot the compiler infer it?"

Xavier paused for a second, and replied patiently with the following story:

"One day, a disciple of another sect came to Xavier Leroy and said
mockingly..."
			

ocamlnet-3.0test2

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/84e7b58c0904d5a2#

Gerd Stolpmann announced:
it is my pleasure to announce the second test version of the upcoming
ocamlnet3 library. This version mainly includes a large number of bug
fixes compared to the first version, but also a few additions:

     * netcamlbox: a fast ipc mechanism for sending ocaml values to
       another process. Netcamlbox is shared-memory based, and works
       well on multi-cores (see
       http://projects.camlcity.org/projects/dl/ocamlnet-3.0test2/doc/html-main/Netcamlbox.html
       for doc)
     * netplex adds per-process sockets, so one can send messages to
       individual containers, and not only to the whole service
     * wrappers for POSIX semaphores
     * wrappers for syslog
     * performance optimizations (serialization, page-aligned I/O)
     * updated documentation

Already in the first test version:

     * Port to Win32 (as outlined in the blog article
       http://blog.camlcity.org/blog/ocamlnet3_win32.html)
     * The new Rpc_proxy layer (as described in
       http://blog.camlcity.org/blog/ocamlnet3_ha.html)
     * Extensions of Netplex (see especially
       http://blog.camlcity.org/blog/ocamlnet3_mp.html)
     * New implementation of the Shell library for starting
       subprocesses
     * Uniform debugging with Netlog.Debug
     * Exception printers (Netexn)
     * Introduction of pollsets (Netsys_pollset); removal of
       Unix.select (i.e. more than 1024 file descriptors)
     * The netcgi1 library has been dropped in favor of netcgi2

I've quickly checked that the library builds on linux, freebsd-7.2, open
solaris, and Win32 (MinGW). Nevertheless, testers are especially
encouraged to check whether Ocamlnet 3 still works on all platforms,
because a lot of new platform-specific code has been added.

Download etc:

     * Homepage: http://projects.camlcity.org/projects/ocamlnet.html
     * Download:
       http://download.camlcity.org/download/ocamlnet-3.0test2.tar.gz
     * Manual:
       http://projects.camlcity.org/projects/dl/ocamlnet-3.0test2/doc/html-main/index.html
     * My "scratch pad" describing changes, plans, etc:
       https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/TODO
     * Subversion:
       https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/

There is a GODI package, but you have to enable a special repository to
get it: Add

GODI_BUILD_SITES+=http://www.ocaml-programming.de/godi-build/ocamlnet3/

to godi.conf to see the new packages in godi_console. This works first
after the bootstrap is finished (godi_console cannot be built with
ocamlnet3 yet). Keep in mind that this is development code, and there is
no easy way to downgrade to ocamlnet2. Best is you do this only for new
GODI installations.

Special thanks to everybody who helped me to produce this new version -
by reporting bugs, or even sending fixes, or by maintaining subtrees
(Christophe Troestler).
			

ORM-0.5 and Dyntype-0.7

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/77024f5e73d85a91#

Thomas Gazagnaire and Anil Madhavapeddy announced:
We are pleased to announce a beta release of an Object-Relational Mapper (ORM)
library for OCaml. It is implemented as a type-conv syntax extension which
auto-generates database save and query functions based on type declarations.

You can obtain the ORM from Github at http://github.com/mirage/orm , and GODI
packages and MacPorts will be available shortly. Please report issues to
mirage@recoil.org or use the Github issue tracker.

Some example code is:

type t = { foo: string; bar: int } with orm
let db = t_init "my.db" in
t_save { foo="t1"; bar=1 } db;
t_get ~foo:(`Contains "t") db

The only backend supported currently in SQLite, but we are working on some
alternative non-SQL backends (such as Tokyo Cabinet and a git-based version
controlled database). This beta release is a preview to get feedback and
testing from a wider audience.

The Dyntype library provides a convenient way of manipulating types and values
at run-time without having to dive into camlp4. It is described more fully in
a WGT2010 paper at:
http://www.cl.cam.ac.uk/research/srg/netos/papers/2010-dyntype-wgt.pdf
			
Thomas Gazagnaire later added:
Normally, the packages are now available in GODI. Please report any
issues that you can have with these to mirage@recoil.org.
			

Wrapping var_args, or C ... in ocaml?

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/ef3835ff56a906cf#

Guillaume Yziquel asked:
I'm currently looking at:

       http://docs.python.org/c-api/arg.html

and I would like to know how to wrap up C functions with va_list of with an
ellipsis. Is this documented somewhere, or has someone already done something
like this?
			
Later on, Richard Jones said:
As an example of a Python binding of the sort we generate in
libguestfs (using OCaml code to generate it :-)

 http://www.annexia.org/tmp/guestfs-py.c.txt
 [Search for calls to PyArg_ParseTuple]

The Python binding technique is sort of interesting.  On the other
hand, Python is decoding that kind-of-C-format-string arg to
PyArg_ParseTuple entirely at runtime which makes it really slow (but
not the slowest thing in Python by any means -- that language takes
being inefficient to a new level).

Out of all the language bindings that we support, the one with the
most natural FFI to C [if you exclude C++] is C#.  In C# you can
express C structures, C calling conventions and so on directly in the
language, and it is very well documented how to do this.  This makes
C# calling C shared libraries / DLLs very natural.

 http://www.annexia.org/tmp/Libguestfs.cs.txt

The worst of all of them is Haskell.  Not because the Haskell FFI is
bad, but because it's (a) obscure and undocumented and (b) the only
one of the programming languages apart from C# where you aren't
basically writing C code.  If you don't already know Haskell, it's
very difficult to writing bindings for Haskell.

 http://www.annexia.org/tmp/Guestfs.hs.txt
			
Richard Jones also asked and Guillaume Yziquel replied:
> However I'm still confused what you are trying to do here.  If you're
> trying to bind the above, maybe look first at PyCaml?

Well, somehow, PyCaml seems to never have wanted to work with me.

I tried Art Yerkes' version, Thomas Fishbacher's version, Henrik Stuart's
version, and Yoann Padioleau's merge of these versions. Somehow things always
seem wrong. I get a segfault with Yoann Padioleau's version due to the layout
of the Python table function (WTF!? BTW).

So I'm rewriting a Python embedding.

http://yziquel.homelinux.org/gitweb/?p=ocaml-python.git;a=shortlog;h=refs/heads/yziquel
http://yziquel.homelinux.org/debian/pool/main/o/ocaml-python/
http://yziquel.homelinux.org/topos/api/ocaml-python/Python.html

Sample session:

yziquel@seldon:~$ ocaml
       Objective Caml version 3.11.1

# #use "topfind";;
- : unit = ()
Findlib has been successfully loaded. Additional directives:
 #require "package";;      to load a package
 #list;;                   to list the available packages
 #camlp4o;;                to load camlp4 (standard syntax)
 #camlp4r;;                to load camlp4 (revised syntax)
 #predicates "p,q,...";;   to set these predicates
 Topfind.reset();;         to force that packages will be reloaded
 #thread;;                 to enable threads

- : unit = ()
# #require "python.interpreter";;
/usr/lib/ocaml/python: added to search path
/usr/lib/ocaml/python/python.cma: loaded
/usr/lib/ocaml/python/oCamlPython.cmo: loaded
# open Python;;
# let dolfin = Module.import "dolfin";;
val dolfin : Python.pymodule Python.t = <abstr>
# let dictionary = Module.get_dict dolfin;;
val dictionary : Python.dict Python.t = <abstr>
# let keys = Dict.keys dictionary;;
val keys : string list Python.t = <abstr>
# let key_list = list_from keys;;
val key_list : string Python.t list =
 [<abstr>; <abstr>; <abstr>; <abstr>; <abstr>; <abstr>; <abstr>; <abstr>;
  <abstr>; <abstr>; <abstr>; <abstr>; <abstr>; <abstr>; <abstr>; <abstr>;
  <abstr>; <abstr>; <abstr>; <abstr>; <abstr>; ...]
# let string_list = List.map string_from key_list;;
val string_list : string list =
 ["restriction"; "gt"; "precedence"; "TrialFunctions"; "ale"; "as_tensor";
  "cross"; "__path__"; "shape"; "PETScFactory"; "has_mpi"; "down_cast";
  "differentiation"; "Mesh"; "has_scotch"; "rot"; "has_slepc"; "DOLFIN_PI";
  "begin"; "le"; "outer"; "VectorElement"; "parameters"; "ln";
  "uBLASVector"; "uBLASDenseMatrix"; "tr"; "Assembler"; "terminal";
  "UnitCube"; "lt"; "CRITICAL"; "hermite"; "derivative"; "logger";
  "uBLASDenseFactory"; "norm"; "MPI"; "info"; "triangle"; "R1"; "R2";
			
Florent Monnier suggested and Guillaume Yziquel concluded:
> > Is there a way to map an OCaml list to an ellipsis? Or is it a C
> >  limitation?
> 
> Yes, as long as I know, for this you should use these kind of tools:
> 
> http://sourceware.org/libffi/
> http://www.gnu.org/software/libffcall/
> http://www.nongnu.org/cinvoke/
> http://dyncall.org/

Got it.

The correct code is:

CAMLprim value ocamlpython_py_tuple_pack (value ml_len, value ml_pyobjs) {

 av_alist argList;
 PyObject * retVal;
 av_start_ptr(argList, &PyTuple_Pack, PyObject*, &retVal);

 #if defined(__s390__) || defined(__hppa__) || defined(__cris__)
 #define av_Py_ssize_t av_long
 #else
 #define av_Py_ssize_t av_int
 #endif

 av_Py_ssize_t(argList, Pyoffset_val(ml_len));
 while (ml_pyobjs != Val_emptylist) {
   av_ptr(argList, PyObject*, Pyobj_val(Field(ml_pyobjs, 0)));
   ml_pyobjs = Field(ml_pyobjs, 1);
 }

 av_call(argList);
 return(Val_owned_pyobj(retVal));
}

Hope that will be useful to people wishing to wrap up varargs (though it's
probably a bad idea, since the stack is limited...)

Python function calls are now possible by constructing the argument tuple.
			

range of hash function

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/5298b01dc0ccf77e#

Grégoire Seux asked and Jacques Garrigue replied:
> i would like to use the polymorhpic hash function on strings. But i would
> like to know what is the probability of a collision between two hashes.
>
> my first question is about the range of the Hashtbl.hash function: what is
> its range ? ( string -> [1..N] ?)

Just to get things straight: this is 0..2^30-1 (0..0x3fffffff).
The result of the hash function is the same on 32-bit and 64-bit
architectures.

> the second question is : can i assume that the result is a uniform
> distribution over [1..N] ? (for 10⁶ words which is an estimation of the
> english vocabulary size)

The algorithm for strings is as follows:

     i = caml_string_length(obj);
     for (p = &Byte_u(obj, 0); i > 0; i--, p++)
       hash_accu = hash_accu * 19 + *p;

So you can assume an uniform distribution for sufficiently long
strings.

> the third one is : is it possible to predict which will be the collision ? I
> mean collisions are between words which are very 'similar' (for ex: "boy"
> and "boys") or are completely unpredictable.

Since you have the algorithm, you can predict collisions. Basically
shifting character n by 1 is equivalent to shifting character n+1 by
19, so you have lots of collisions. But this hash function being
intended for hashtables, collisions are not a problem, uniform
distribution matters more.

By the way, for polymorphic variants collisions matter, and the hash
function is different. The range is 31-bits rather than 30-bits, and
the factor is 243, so that names of no more than 4 characters are
guaranteed to be different. You still have collisions, but they are
going to be less similar.

Both hash functions are defined in byterun/hash.c.
			
Richard Jones then said:
I would slightly dispute your assertion that collisions are not a
problem, because of algorithmic complexity attacks:

http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/index.html

The hash implementation used by Perl was changed to avoid this attack:

http://perl5.git.perl.org/perl.git/blob/HEAD:/perl.c#l1465
			

Other Caml News

From the ocamlcore planet blog:
Thanks to Alp Mestan, we now include in the Caml Weekly News the links to the
recent posts from the ocamlcore planet blog at http://planet.ocamlcore.org/.

OCaml Unix course, the sockets chapter is translated:
  http://forge.ocamlcore.org/forum/forum.php?forum_id=535

OCaml Unix course, the generalities chapter is translated:
  http://forge.ocamlcore.org/forum/forum.php?forum_id=534

A Heap of (Regular) Strings:
  http://alaska-kamtchatka.blogspot.com/2010/02/heap-of-regular-strings.html

Braun Trees:
  http://alaska-kamtchatka.blogspot.com/2010/02/braun-trees.html

OCamlnet 3.0test2:
  http://caml.inria.fr/cgi-bin/hump.cgi?contrib=402

OCamlspotter 1.1rc2:
  http://caml.inria.fr/cgi-bin/hump.cgi?contrib=660

Ocamlnet-3.0test2:
  http://blog.camlcity.org/blog/ocamlnet3_test2.html

OPA bugfix release:
  http://dutherenverseauborddelatable.wordpress.com/2010/02/16/opa-bugfix-release/
      

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.


Alan Schmitt