Caml Weekly News

Hello

Here is the latest Caml Weekly News, for the week of September 26 to October 03, 2006.

Listing toplevel bindings
out-of-heap data structures
APC
Appel aux Communications JFLA 2007
float rounding
Preview release ocamlnet-2.2test12
More problems with memoization

Listing toplevel bindings

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/0b3a6547deb18385/f8139606d7067fe4#f8139606d7067fe4

John Harrison asked and Oleg answered:

> When inside the OCaml toplevel, is there any way of getting a list of 
> all (top-level) bindings of names to objects of some specified type 'a? 

Yes, there is. No need to recompile anything. However, you need the 
OCaml installation directory (after your have compiled the top-level 
and before you did make clean). 
First, please retrieve the following file 
        http://pobox.com/~oleg/ftp/ML/gprint/gprint_toplevel.ml 
and adjust the paths in the "directory" directives to point to your 
OCaml installation directory. Please run your Ocaml top-level and 
execute all #directory and the #load directives 
in that file up to, but not including the loading of genprintval.cmo. 
Please do NOT change the order of the load directives! It took about 
half an hour to find the right order.... 

Next, please enter the code given in the appendix of this message. The 
code will print the signature of the values in the 
existing top-level environment. For example: 

binding: get_value_bindings/79   
val get_value_bindings : Env.t -> (Ident.t * Types.value_description) list 
binding: print_bindings/107   
val print_bindings : 
  Format.formatter -> (Ident.t * Types.value_description) list -> unit 
Done 
- : unit = () 

We then can enter 

# let x = 1;; 
val x : int = 1 
# let x = 2;; 
val x : int = 2 
# let y = 10;; 
val y : int = 10 
# print_int_toplevel Format.std_formatter 
    (get_value_bindings (!Toploop.toplevel_env));; 

binding: x/186  value: 2 
binding: x/187  value: 2 
binding: y/188  value: 10 
Done 
- : unit = () 

As we can see, the type environment keeps track of all the previous 
definitions of a name. Because "x" was defined twice, there are two 
entries in the type environment: "x/186" and "x/187". The counter is 
the timestamp. The top-level value environment keeps the last value, 
however. 

The function print_int_toplevel cannot, generally, be polymorphic over 
type -- unless you're willing to assume responsibility that your type 
representation string matches your desired type -- or you're willing 
to use MetaOCaml. 

Appendix. 

open Ident;; 
open Env;; 

let get_value_bindings env = 
   let rec get_val acc = function 
        | Env_empty -> acc 
        | Env_value (next, ident, val_descr) -> 
                get_val ((ident,val_descr)::acc) next 
        | Env_type (next,_,_) -> get_val acc next 
        | Env_exception (next,_,_) -> get_val acc next 
        | Env_module (next,_,_) -> get_val acc next 
        | Env_modtype (next,_,_) -> get_val acc next 
        | Env_class (next,_,_) -> get_val acc next 
        | Env_cltype (next,_,_) -> get_val acc next 
        | Env_open (next,_) -> get_val acc next 
  in get_val [] (summary env); 
;; 

let print_bindings ppf bindings = 
  List.iter (fun (ident,val_descr) -> 
        Format.fprintf ppf "@\nbinding: "; 
        Ident.print ppf ident; 
        Format.fprintf ppf "@ @ @ "; 
        Printtyp.value_description ident ppf val_descr) 
    bindings; 
  Format.fprintf ppf "@\nDone@."   
;; 

print_bindings Format.std_formatter 
    (get_value_bindings (!Toploop.toplevel_env));; 

let type_to_str (x : Types.type_expr) = 
  Printtyp.type_expr Format.str_formatter x; 
         Format.flush_str_formatter ();; 

(* Print all top-level int bindings *) 

let print_int_toplevel ppf bindings = 
  let print_int_binding (ident,val_descr) = 
   if type_to_str val_descr.Types.val_type = "int" then 
   begin 
    Format.fprintf ppf "@\nbinding: "; 
    Ident.print ppf ident; 
    Format.fprintf ppf "  value: %d" (Obj.obj (Toploop.getvalue (name ident))); 
   end else () 
 in 
   List.iter print_int_binding bindings; 
   Format.fprintf ppf "@\nDone@."   
;; 

print_int_toplevel Format.std_formatter 
    (get_value_bindings (!Toploop.toplevel_env));;

out-of-heap data structures

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/24056e86a29de1b7/e06f1b13fee052f4#e06f1b13fee052f4

Damien Doligez said, Markus Mottl answered, and Richard Jones said:

Markus Mottl wrote: 
> Damien Doligez wrote: 
> > The GC doesn't work by checking whether something has become 
> > unreachable.  It works by marking all reachable data, then deallocating 
> > everything else.  There is no way to do what you want in OCaml, and 
> > I don't think it can be done without a major rework of the runtime 
> > system. 
> But isn't it true that the GC doesn't follow pointers that point 
> outside the OCaml-heap?  In that case it might be conceivable to copy 
> OCaml-data that must not be reclaimed into the C-heap.  Of course, 
> this would mean that pointers must not point back into the OCaml-heap 
> from there, because there is no way the GC could know then that some 
> value in the OCaml-heap is still in use, or how to update the pointer 
> in the C-heap in case the OCaml-value gets moved around, e.g. during a 
> compaction. 

> If the above really works, I'd be glad to know whether there is 
> already functionality to copy OCaml-structures around.  We have some 
> applications that would greatly benefit from this feature, too: they 
> require an enormous amount of static background knowledge, and have to 
> use it for small jobs which can be easily distributed.  We have 
> multi-core, multi-processor machines, and would be able to greatly 
> speed up our compute jobs if we could put these large OCaml-values 
> into a shared-memory region.  It would save us both a hell lot of 
> memory (probably in the range of GBs per machine for some jobs), and 
> also make the GC work less hard marking data that is not going to be 
> reclaimed anyway. 

Really similar case to us! 
In theory you'd have a "MARK_ANCIENT (obj)" operation.  It would copy 
obj and everything pointed to by obj out of the OCaml heap into 
malloc'd memory.  It would need to find everything pointing to obj and 
update those pointers to the out-of-heap copy, or else have a proxy in 
place of obj.  All done in C so that the garbage collector wouldn't be 
running while this happened, and in Markus's case you'd want to move 
out to some sort of shared memory, perhaps using something like 
mmalloc[1]. 

The problem with all this is that such an "ancient" object had really 
better not be changed.  If it was changed such that part of it pointed 
to a new object allocated on the Caml heap, then that new object could 
never be reached by the GC, and so would quickly get collected, 
resulting in a dangling pointer.  You'd have to ensure that the 
ancient object was never changed by careful programming ... 

Rich. 

[1] http://www.math.utah.edu/docs/info/mmalloc_1.html

Richard Jones then added:

Here's a very lightly tested version of this idea for review: 

  http://homes.merjis.com/~rich/ancient-0.0.1.tar.gz 

If you just want to look at the interface or the code, see here: 

  http://homes.merjis.com/~rich/ancient.mli 
  http://homes.merjis.com/~rich/ancient_c.c.txt 

It works for a very simple case; I haven't tried it for anything 
significant yet.

Richard Jones later said:

Updated version which supports shared memory mappings using the
mmalloc library (included).  I've done a bit of testing and it seems
fairly robust so far, but I'm sure there are plenty of segfaults
waiting to happen.

In theory this version would solve Markus's and my huge infrequently-
accessed shared structures problem, but I'm going to do a lot more
testing at this end before making any grand claims.

  http://homes.merjis.com/~rich/ancient-0.0.3.tar.gz
  http://homes.merjis.com/~rich/ancient.mli
  http://homes.merjis.com/~rich/ancient_c.c.txt

Xavier Leroy answered Markus Mottl:

> But isn't it true that the GC doesn't follow pointers that point 
> outside the OCaml-heap?  In that case it might be conceivable to copy 
> OCaml-data that must not be reclaimed into the C-heap. 

Yes, this is possible.  For instance, ocamlopt places structured 
constants (e.g. constant lists) in the initialized data section 
of the executable, outside the Caml heap. 
> Of course, 
> this would mean that pointers must not point back into the OCaml-heap 
> from there, because there is no way the GC could know then that some 
> value in the OCaml-heap is still in use, or how to update the pointer 
> in the C-heap in case the OCaml-value gets moved around, e.g. during a 
> compaction. 

.. unless you register the locations of back-pointers as global roots. 
But be careful that global roots are scanned at every GC (minor as 
well as major), therefore a large number of such roots slow down the 
GC significantly. 
> If the above really works, 

There is one caveat: ad-hoc polymorphic primitives (structural 
equality and comparisons, marshaling, hashing) will not work on data 
structures that reside outside of the Caml heap.  The reason is that 
these primitives treat out-of-heap pointers as opaque data.  There is 
a special case (the "Is_atom" test) for pointers that correspond to 
ocamlopt-generated static data, but extending this special case is 
nonobvious. 
> I'd be glad to know whether there is 
> already functionality to copy OCaml-structures around. 

Apparently, Richard Jones is already working on it...  Basically, it's 
just like a copying collection with only one root.  You could draw 
inspiration from the OCaml minor GC (Cheney-style breadth-first copying) 
and from the marshaller (depth-first quasi-copying).

Oleg then replied:

> There is one caveat: ad-hoc polymorphic primitives (structural 
> equality and comparisons, marshaling, hashing) will not work on data 
> structures that reside outside of the Caml heap.  The reason is that 
> these primitives treat out-of-heap pointers as opaque data.  There is 
> a special case (the "Is_atom" test) for pointers that correspond to 
> ocamlopt-generated static data, but extending this special case is 
> nonobvious. 

Actually, MetaOCaml has done such an extension. The extension is 
general purpose and backward compatible. Furthermore, no base OCaml 
sources were harmed in the process: there is no need to recompile 
ocamlopt or the standard library. The linking of the final executable 
does require a few extra flags. 
The Is_atom test in the ocamlopt-produced executables checks if the 
address of the tested value falls within the data segment of the 
executable. Alas, if we permit dynamic loading of ocaml-opt compiled 
code, static data no longer reside in one segment (within one 
continuous range of virtual addresses). Therefore, modifications are 
necessary. 

Native MetaOCaml (or, to be precise, the dynamic linking part of it) 
has carried out such modifications. The Is_atom test now reads 

CAMLextern struct DYNlimits caml_static_data_limits; /* in new ndl.c */ 
#define Is_atom(v) \ 
  ((((char *)(v) >= caml_static_data_start \ 
     && (char *)(v) < caml_static_data_end) \ 
    || ((v) >= Atom(0) && (v) <= Atom(255))\ 
    || (caml_static_data_limits.used != 0 && \ 
        within_DYNlimitsP((void *)v,&caml_static_data_limits)))) 

The change is fully backward-compatible: if no dynamic linking is used 
(or if the data segment is indeed contiguous), the field 
caml_static_data_limits.used has the value 0 and Is_atom works exactly 
as before. The additional cost then is dereferencing of a static 
address -- which could be as few as two instructions (test and branch) 
on some architectures. 

It seems the extension can be used for maintaining `out-of-heap' OCaml 
structures. One can have as many such heaps as needed. One merely need 
to register the range of these extra-heap addresses. 

The code can be found in the directory natdyn of the current MetaOCaml 
distribution, specifically files mlvalues.h and ndl.c. The Makefile 
contains the necessary voodoo to get the changes take effect.

APC

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/620aae39ad318628/348bef409d1440ac#348bef409d1440ac

malc announced:

> At http://www.boblycat.org/~malc/apc/ you will find a small and not 
> entirely usual CPU load monitor written in OCaml. Perhaps it would be 
> useful to someone. 


There were few serious bugs in the first version (mostly showcasing 
on SMP, though not limited to it), those have been, hopefully, fixed. 
Updated version is available at aforementioned web page. Sorry for 
inconvenience.

Appel aux Communications JFLA 2007

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/05908e2557056ec5/221eb8634ced98ac#221eb8634ced98ac

Pierre-Etienne Moreau announced:

(This message is intentionally written in French) 
* MERCI DE FAIRE CIRCULER *            * MERCI DE FAIRE CIRCULER * 

DEUXIEME APPEL AUX COMMUNICATIONS       DEUXIEME APPEL AUX COMMUNICATIONS 

                         JFLA'2007 (http://jfla.inria.fr/) 

                Journées Francophones des Langages Applicatifs 
                         Organisées par l'INRIA 

                         27-30 janvier 2007 

        JFLA'2007 est la dix-huitième conférence francophone organisée autour des 
langages applicatifs et des techniques de certification basées sur la 
démonstration. 

  Ces nouvelles journées se tiendront les 

             27-30 janvier 2007. 

  Elles auront lieu à la montagne, à Aix-les-Bains. 

        Toujours centrée sur l'approche fonctionnelle de la programmation, la 
conférence a depuis l'an dernier élargi son spectre aux techniques et outils 
complémentaires qui élèvent niveau de qualité des logiciels (systèmes d'aide à 
la preuve, réécriture, tests, démonstration automatique, vérification). 

        Les JFLA réunissent concepteurs et utilisateurs dans un cadre agréable 
facilitant la communication; ces journées ont pour ambition de couvrir le 
domaine des langages applicatifs, en y incluant les apports d'outils d'autres 
domaines qui autorisent la construction de systèmes logiciels plus sûrs. 
L'enseignement de l'approche fonctionnelle du développement logiciel 
(spécification, sémantiques, programmation, compilation, certification) est 
également un sujet concernant fortement les JFLA.   

C'est pourquoi des contributions sur les thèmes suivants sont particulièrement 
recherchées (liste non exclusive) : 

- Langages fonctionnels : sémantique, compilation, optimisation, 
   mesures, tests, extensions par d'autres paradigmes de programmation. 

- Spécification, prototypage, développements formels d'algorithmes. 

- Utilisation industrielle des langages fonctionnels. 

- Assistants de preuve : implémentation, nouvelles tactiques, 
   développements présentant un intéret technique ou méthodologique. 

- Enseignement dans ses aspects liés à l'approche fonctionnelle 
   du développement. 

Orateurs invités 
---------------- 
  Hassan Aït Kaci (ILOG). 
  Andrew Tolmach (projet Gallium, INRIA Rocquencourt). 

Cours 
----- 
  Horatiu Cirstea (LORIA, Université Nancy 2). 
  Marc Pouzet (LRI, Université Paris-Sud 11). 

Comité de programme 
------------------- 
         Pierre-Etienne Moreau, Président (LORIA, INRIA Lorraine) 
         Sandrine Blazy, Vice-Président (CEDRIC, INRIA Rocquencourt) 
         Judicaël Courant (Verimag) 
         Alain Frisch (INRIA Rocquencourt) 
         Jean-Louis Giavitto (IBISC, Evry) 
         Delia Kesner (PPS, Université Paris 7) 
         Jean-François Monin (Verimag) 
         Virgile Prevosto (CEA) 
         Alan Schmitt (INRIA Rhones-Alpes) 
         Benjamin Werner (LIX, INRIA Futur) 

Soumission 
---------- 
Date limite de soumission : 10 octobre 2006 

Les soumissions doivent être soit rédigées en français, soit 
présentées en français. Elles sont limitées à 15 pages A4. Le style 
latex est imposé et se trouve sur le site WEB des journées à l'adresse 
suivante : 

          http://jfla.inria.fr/2007/actes.sty 

La soumission est uniquement électronique, selon la méthode détaillée   
dans : 

          http://jfla.inria.fr/2007/instructions-fra.html 

Les soumissions sont à envoyer aux présidents du comité de programme, 
avec pour titre de votre message ``SOUMISSION JFLA 2007'', à l'adresse 
suivante : 

              Pierre-Etienne.Moreau@loria.fr 

Les intentions de soumission envoyées le plus tôt possible à l'adresse 
ci-dessus seront les bienvenues. 

Dates importantes 
----------------- 
10 octobre 2006    : Date limite de soumission 
15 novembre 2006   : Notification aux auteurs 
10 décembre 2006   : Remise des articles définitifs 
15 janvier 2007    : Date limite d'inscription aux journées 
27-30 janvier 2007 : Journées 

Pour tout renseignement, contacter 
---------------------------------- 
Marie-Françoise Loubressac 
INRIA Rocquencourt 
Bureau des Cours et Colloques 
Domaine de Voluceau - BP 105 
78153 Le Chesnay Cedex 
Tél.: +33 (0) 1 39 63 56 00 - Fax : +33 (0) 1 39 63 56 38 
email : Marie-Francoise.Loubressac@inria.fr 

http://jfla.inria.fr/2007/

float rounding

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/8cced5418a89870a/b50ade009a60d66f#b50ade009a60d66f

Gerd Stolpmann asked and Gerd Stolpmann answered:

> I'm using Ocaml for an interval arithmetic application.  I'm   
> curious about 
> what the Ocaml parser/compiler does to float constants.  May I assume 
> that for any constant I enter, eg. 3.1415... (for N digits of pi), that 
> the compiler will give me a closest machine representable number? 
> i.e., if I bound a constant by the previous and next floating point   
> value to 
> that given me by the compiler, 
> will it always be the case that my original (mathematical) constant   
> lies in that interval? 

Don't think so. float_of_string is a wrapper around the strtod C 
function, and the standard for this function does not require that. I 
found this interesting note about how different OS implement string to 
float conversion: 
http://www.wrcad.com/linux_numerics.txt 

To be sure your constants are the best possible you probably have to use 
float_of_bits...

Xavier Leroy also answered:

> I'm using Ocaml for an interval arithmetic application.  I"m  curious 
> about what the Ocaml parser/compiler does to float constants. 

The lexer, parser and most of the compilers actually keep them as 
strings, just like they were given in the source code.  Conversion to 
IEEE binary representation occurs: 
- For the bytecode compiler ocamlc, when the bytecode is generated. 
The conversion is performed by the float_of_string function, which is 
a thin wrapper around the C library strtod() function, as Gerd 
Stolpmann said. 

- For the native-code compiler ocamlopt, some ports go the 
float_of_string route, but most output the literal as a string in the 
generated assembly code, letting the assembler do the conversion. 
The assembler is likely to use strtod() or atof(), though. 

> May I assume 
> that for any constant I enter, eg. 3.1415... (for N digits of pi), that 
> the compiler will give me a closest machine representable number? 
> i.e., if I bound a constant by the previous and next floating point 
> value to that given me by the compiler, will it always be the case 
> that my original (mathematical) constant lies in that interval? 

As Gerd said, the C standards do not guarantee this, so it really 
depends on how well-implemented your C library is.

Preview release ocamlnet-2.2test12

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/0ea12ab4f10b7e4c/ef4234c30f16f015#ef4234c30f16f015

Gerd Stolpmann announced:

Hi lists, 
there is a new preview release of the upcoming ocamlnet-2.2 library: 

http://www.ocaml-programming.de/packages/ocamlnet-2.2test12.tar.gz 

Developers interested in the upcoming 2.2 version can have look at it, 
and experienced developers are invited to test it, and to help finding 
the remaining bugs and problems. This version is already close to the 
final release. 

In the rest of this (long) email, I'll explain certain aspects of this 
version: 

1. What's new in ocamlnet-2.2 
2. Release notes 
3. How you can help testing 
4. Resources 
5. Credits 

Gerd 

------------------------------------------------------------ 
1. What's new in ocamlnet-2.2 
------------------------------------------------------------ 

Ocamlnet now includes equeue, netclient, and rpc 

These libraries were previously distributed as separate software 
packages. All four libraries form now together the new ocamlnet-2.2. 
This allows much deeper integration of their functionality. 

Building servers with Netplex 

The framework Netplex simplifies the development of server applications 
that require the parallel execution of requests. It focuses on 
multi-processing servers but also allows multi-threading servers. 
Netplex manages the start and stop of processes/threads, and dynamically 
adapts itself to the current workload. Netplex allows it to integrate 
several network protocols into one application, and as it also supports 
SunRPC as protocol, one can consider it even as component architecture. 
Furthermore, it has infrastructure to read configuration files and to 
log messages. 

Ocamlnet includes add-ons for Netplex to build SunRPC servers, web 
servers, and web application servers (the latter over the protocols AJP, 
FastCGI, or SCGI). 

The revised API for web applications 

The library netcgi2 is a revised version of the old cgi API (now also 
called netcgi1). The changes focus on restructuring the library in order 
to improve its modularity. It is hoped that beginners find more quickly 
the relevant functions and methods. The API is essentially the same, but 
the support for cookies has been enhanced. The connectors allowing a web 
server to talk with the application have been completely redeveloped - 
all four connectors (CGI, AJP, FastCGI, SCGI) support the same features. 
The connector for SCGI is new. The connector for AJP has been upgraded 
to protocol version 1.3. There are Netplex add-ons for the connectors. 

The old API is still available, but its features are frozen. It is 
recommended to port applications to netcgi2. 

Improvements for SunRPC applications 

It is now possible to use the SunRPC over SSL tunnels. All features are 
available, including asynchronous messages. As a side effect of this, 
the SunRPC implementation is now transport-independent, i.e. it is 
sufficient to implement a few class types to run RPC over any kind of 
transport. 

Furthermore, a few details have been improved. SunRPC servers can now 
implement several RPC programs or program versions at the same time. 
SunRPC clients can now connect to their servers in the background. A few 
bugs have been fixed, too. 

Shared memory 

As multi-processing has become quite important due to Netplex, Ocamlnet 
supports now the inter-process communication over shared memory. The 
implementation is brand-new and probably not very fast, but shared 
memory makes sometimes things a lot easier for multi-processing 
applications. 

Old things remain good 

Of course, this version of Ocamlnet supports the long list of features 
it inherited from its predecessors. This includes an enhanced HTTP 
client, a Telnet client, a (still incomplete) FTP client, a POP client, 
and an SMTP client. The shell library is an enhanced version of 
Sys.command. The netstring library is a large collection of string 
routines useful in the Internet context (supports URLs, HTML, mail 
messages, date strings, character set conversions, Base 64, and a few 
other things). 

------------------------------------------------------------ 
2. Release notes 
------------------------------------------------------------ 

Stability 

In general, the stability of this version is already excellent. About 90 
% of the code has been taken over from previous versions of ocamlnet, 
equeue, netclient, and rpc, and this means that this code is already 
mature. About 10 % of the code has been newly developed: 

- netcgi2 is a revised version of the cgi library. Large parts 
  are completely new. The stability is unclear yet. 

- netplex is the new server framework. Fortunately, it could already 
  be used in a production environment, and it has proven excellent 
  stability there. 

- netcgi2-plex combines netcgi2 and netplex. Stability is unclear. 

- nethttpd has now the option to use netcgi2 as cgi provider 
  (configure option -prefer-netcgi2). This is mostly untested. 

- netshm adds shared memory support. The stability is unclear 
  yet. 

- equeue-ssl and rpc-ssl add SSL support to the RPC libraries. 
  Stability is unclear. 

The project would profit most if these libraries were tested by 
independent developers. 

Known Problems 

There are known problems in this preview release: 

- There is no good concept to manage signals. This is currently done 
  ad-hoc. For now, this does not make any problems, or better, there 
  is always the workaround that the user sets the signal handlers 
  manually if any problems occur. 

- The new cookie implementation of netcgi2 should replace the old 
  one in netstring. Users should be prepared that Netcgi.Cookie 
  will eventually become Nethttp.Cookie in one of the next releases. 

- In netcgi2-plex, the "mount_dir" and "mount_at" options are not yet 
  implemented. 

- In netclient, aggressive caching of HTTP connections is still 
  buggy. Do not use this option (by default, it is not enabled). 

- The FTP client is still incomplete. 

------------------------------------------------------------ 
3. How you can help testing 
------------------------------------------------------------ 

It would be great if experienced developers tested the libraries, 
especially the new and revised ones. Discussions should take place in 
the Ocamlnet mailing list (see resources section below). 

It is important to know that this version of Ocamlnet also includes the 
libraries formerly distributed as equeue, netclient, and rpc. If you 
upgrade an O'Caml installation, you _must_ remove old versions of these 
libraries prio to the installation of the new Ocamlnet. 

For GODI users, there is a convenient way of installing ocamlnet2. First 
install GODI as usual (either for O'Caml 3.08 or 3.09). Then, change 
godi.conf, and add the line 

GODI_BUILD_SITES += http://www.ocaml-programming.de/godi-build/branch-ocamlnet2/ 

update packages and rebuild. You can install 

godi-ocamlnet 
godi-ocamlnet-gtk1 
godi-ocamlnet-gtk2 
godi-ocamlnet-tcl 
godi-ocamlnet-ssl 

where the latter four packages contain add-ons that need further 
libraries to be installed. The packages 

godi-equeue*, godi-rpc, godi-netclient 

are only fake packages that include godi-ocamlnet as predecessors. 

------------------------------------------------------------ 
4. Resources 
------------------------------------------------------------ 

On online version of the reference manual can be found here: 
http://ocamlnet.sourceforge.net/manual-2.2/ 

The current development version is available in Subversion: 

https://gps.dynxs.de/svn/lib-ocamlnet 

Note that the ocamlnet file tree in Sourceforge refers to 
ocamlnet-1 only. 

There is a mailing list for Ocamlnet development: 

http://sourceforge.net/mail/?group_id=19774 

In case of problems, you can also contact me directly: 
Gerd Stolpmann <gerd@gerd-stolpmann.de> 

------------------------------------------------------------ 
5. Credits 
------------------------------------------------------------ 

A number of people and institutions helped creating this new version: 

- Christophe Troestler wrote the revised CGI library 

- The California State University sponsored the development 
  of Netplex and the SSL support for SunRPC. Special thanks 
  to Eric Stokes who convinced the University, and David 
  Aragon for his business support. 

- All companies who hired me this year and made it possible 
  that I can make a living from O'Caml development. Without 
  that it would have been impossible to put so much energy 
  into this. Special thanks go to Yaron Minsky and Mika 
  Illouz. 

(The movie ended. Rewind and watch again!)

More problems with memoization

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/4540feb573f9a04c/61d7af24a7665633#61d7af24a7665633

Diego Olivier FERNANDEZ PONS asked:

I wrote the following (classical) memoized code for the fibonacci   
function and I have been unsuccessfully trying to generalize it with a   
higher order function. 

let rec fib = function 
   | 0 -> 0 
   | 1 -> 1 
   | n -> fib_mem (n - 1) + fib_mem (n - 2) 
and fib_mem = 
   let table = ref [] in 
     function n -> 
       try 
        List.assoc n !table 
       with Not_found -> 
        let f_n = fib n in 
          table := (n, f_n) :: !table; 
          f_n 

# val fib : int -> int = <fun> 
# val fib_mem : int -> int = <fun> 

It works: fib 35 answers instantaneously. 

Now I want to achieve the same result with a higher order function   
[make_memo] and apply it to fib 

let make_mem = function f -> 
   let table = ref [] in 
     function n -> 
       try 
        List.assoc n !table 
       with Not_found -> 
        let f_n = f n in 
          table := (n, f_n) :: !table; 
          f_n 

#val make_mem : ('a -> 'b) -> 'a -> 'b 

Very well. Notice that it has one less parameter than the code posted   
by Andrej Bauer which has type memo_rec : (('a -> 'b) -> 'a -> 'b) ->   
'a -> 'b. The only difference is the line 

let f_n = f n in ... 
with respect to 
let f_n = f g n in ... where g is the anonymous function itself 

in the same way Bauer's [fib_memo] uses an extra parameter [self] 

let fib_memo = 
   let rec fib self = function 
     | 0 -> 1 
     | 1 -> 1 
     | n -> self (n - 1) + self (n - 2) 
   in 
     memo_rec fib 

Now I try to apply make_mem to but it does not work 

let rec fib = function 
   | 0 -> 0 
   | 1 -> 1 
   | n -> fib_mem (n - 1) + fib_mem (n - 2) 
and fib_mem = make_mem fib 

# This kind of expression is not allowed as right-hand side of `let rec' 

Ok... usually one only need to expand to avoid the problem 

let rec fib = function 
   | 0 -> 0 
   | 1 -> 1 
   | n -> fib_mem (n - 1) + fib_mem (n - 2) 
and fib_mem = function n -> 
   let f = make_mem fib in 
     f n 

# val fib : int -> int = <fun> 
# val fib_mem : int -> int = <fun> 

But know fib 35 takes several minutes to be computed ! 

I believe I understand why: I am computing a different fib_mem for   
each value of n and applying it just after, while I wanted a single   
fib_mem to be used for all computations. In the process, the   
tabulation vanishes. 

The only work around I found is to lift the table argument in [make_mem] 

let make_mem = fun table f -> 
   function n -> 
     try 
       List.assoc n !table 
     with Not_found -> 
       let f_n = f n in 
        table := (n, f_n) :: !table; 
        f_n 

# val make_mem : ('a * 'b) list ref -> ('a -> 'b) -> 'a -> 'b = <fun> 

And build fib in the following way 

let fib_mem = function n -> 
   let table = ref [] and 
       fib = function 
        | 0 -> 0 
        | 1 -> 1 
        | n -> fib_mem (n - 1) + fib_mem (n - 2) 
   in make_mem table fib n 

# fib_mem 35: instantaneous 

The problem is that the memoization is much more intrusive, which is   
what I was trying to avoid.

Jon Harrop suggested:

I believe you want to "untie the knot" of recursion, creating an higher-order, 
auxiliary fibonacci function fib_aux that accepts the recursive call as an 
argument: 
# let rec fib_aux fib = function 
    | 0 | 1 as n -> n 
    | n -> fib(n - 1) + fib(n - 2);; 
val fib_aux : (int -> int) -> int -> int = <fun> 

You can recover the ordinary fibonacci function using the Y combinator: 

# let rec fib n = fib_aux fib n;; 
val fib : int -> int = <fun> 

You can write a higher-order memoization function that accepts an argument 
with the type of fib_aux: 

# let memoize f = 
    let m = Hashtbl.create 0 in 
    let rec f' n = 
      try Hashtbl.find m n with Not_found -> 
        let x = f f' n in 
        Hashtbl.replace m n x; 
        x in 
    f';; 
val memoize : (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b = <fun> 

Now you can memoize recursive functions easily: 

# memoize fib_aux 35;; 
- : int = 9227465

Martin Jambon corrected:

> I believe you want to "untie the knot" of recursion, creating an higher-order, 
> auxiliary fibonacci function fib_aux that accepts the recursive call as an 
> argument: 
> # let rec fib_aux fib = function 
>    | 0 | 1 as n -> n 
>    | n -> fib(n - 1) + fib(n - 2);; 
> val fib_aux : (int -> int) -> int -> int = <fun> 


Since the point is to make the function not recursive, I think you 
shouldn't use "let rec" :-)

Diego Olivier FERNANDEZ PONS then said:

Your solution is similar to Andrej Brauer's one which is exactly what   
I was trying to avoid because it is too much intrusive. When you break   
the recursion in two functions you change the type of [fib] from 
[fib : int -> int] to [fib : (int -> int) -> int -> int)]. 
In my first example you keep the type of [fib] and add a second   
function [fib_mem]. You can use anyone indifferently and hide the   
latter with the .mli 
val fib : int -> int = <fun> 
val fib_mem : int -> int = <fun> 

When you compare your solution with what I am trying to do you see   
there is a big difference in locality and transparency 

let rec fib = function 
   | 0 -> 0 
   | 1 -> 1 
   | n -> fib (n - 1) + fib (n - 2) 

transformed into 

let rec fib = function 
   | 0 -> 0 
   | 1 -> 1 
   | n -> fib_mem (n - 1) + fib_mem (n - 2) 
and fib_mem = make_mem fib 

The latter could even be done automatically by a source to source   
transformation (if it worked).

Oleg then said:

> The latter could even be done automatically by a source to source 
> transformation (if it worked). 


But it almost does: 
let make_mem = fun table f -> 
  function n -> 
    try 
      List.assoc n !table 
    with Not_found -> 
      let f_n = f n in 
      table := (n, f_n) :: !table; 
      f_n 
;; 

let rec fib = function 
  | 0 -> 0 
  | 1 -> 1 
  | n -> mem fib (n - 1) + mem fib (n - 2) 
and mem = make_mem (ref []) 
;; 

 fib 35;; 
 - : int = 9227465 
instantaneous. 

The biggest difference is replacing "fib_mem" in your code with 
"mem fib" in mine. The same number of characters in either case... 
And yes, this can be done via camlp4... OTH, with camlp4 it is quite 
trivial to convert a let rec expression to the one involving open 
recursion. So, we can write something like 

let fib n = funM MyModule n -> let rec fib function 0 -> 1 ... in fib n;; 

and, depending on what MyModule actually implements, obtain either the usual 
or the memoized Fibonacci (or even partially unrolled to any desired degree).

Andrej Bauer also answered:

> In my first example you keep the type of [fib] and add a second function 
> [fib_mem]. You can use anyone indifferently and hide the latter with the 
> .mli 
> val fib : int -> int = <fun> 
> val fib_mem : int -> int = <fun> 

If you want to keep the same type for fib, and have the memoized one, as 
well as to have locality you can do something like this: 
let make_memo f = ... 

let rec make_rec f x = f (make_rec f) x 

let fib, fib_mem = 
   let fib' self = function 
     | 0 -> 0 
     | 1 -> 1 
     | n -> self (n - 1) + self (n - 2) 
   in 
     make_rec fib', make_mem fib' 

(You will notice that make_rec is just the Y combinator.) 

> When you compare your solution with what I am trying to do you see there 
> is a big difference in locality and transparency 

I fail to see this big difference, frankly, since all you're doing is 
just a beta-reduction of what Jon and I suggested. 
A recursive function _is_ the fixed point of a non-recursive one with an 
"extra" argument. You may hide this fact if you wish, but I think it's 
more honest to admit it to yourself. The "untied" version of fib has the 
advantage that you can do many cool things to it: memoizing is just one 
possibility.

skaller then said:

> A recursive function _is_ the fixed point of a non-recursive one with an 
> "extra" argument. You may hide this fact if you wish, but I think it's 
> more honest to admit it to yourself. The "untied" version of fib has the 
> advantage that you can do many cool things to it: memoizing is just one 
> possibility. 

There is, however, a good reason this is not practical in general: 
for a recursion of N entities (either functions or polymorphic 
variants in Ocaml can be 'open-recursioned') you need an 
extra N arguments .. and the result is unreadable, as well 
as possibly incurring a performance hit. 
I wonder if one can add compiler support: for example given 

   let rec fib x = match x with 
     | 0 -> 0 
     | 1 -> 1 
     | n -> fib (n - 1) + fib (n - 2) 

The compiler silently generates: 

   let @fib fib' x = match x with 
     | 0 -> 0 
     | 1 -> 1 
     | n -> fib' (n - 1) + fib' (n - 2) 

   let fib = make_rec @fib 

and now you have fib as written .. but you ALSO can do: 

   let fib = make_mem @fib 

to create a memoised version. 

That's for one argument and can clearly be done easily 
by the compiler (in fact, camlp4). 

However the extension to multiple arguments is not clear. 
Maybe labelled arguments would help, perhaps using 
a record. 

Andrei said: 

"You may hide this fact if you wish, but I think it's 
more honest to admit it to yourself." 

but I think this is misleading: there's a good 
reason NOT to open the recursions. There's a fundamental 
principle of software design: the open/closed principle (OOSC) 
which is not obeyed by either the closed or open form. 

We need a form that is simultaneously closed and ready to 
use but which is also open and amenable to extension. 

FYI: the case that most interests me at the moment is neither 
type recursion nor functional recursion -- I'm interested 
in whether it is possible to design an open-recursive grammar, 
this seems to need both recursive data types *and* recursive 
functions in open/closed form. 

Interestingly in this case I have actually implemented one 
already, allowing Felix to extend it's own parser by combining 
an Ocamlyacc parser with an executable RD parser .. but 
this really isn't the same as systematic static extension 
where, for example you write a basic grammar, and then 
extensions to it.

Don Syme said:

You may be interested in the approach to this kind of problem discussed 
in http://dx.doi.org/10.1016/j.entcs.2005.11.038 (see also tech report 
at http://research.microsoft.com/users/dsyme/papers/valrec-tr.pdf). 
Under that approach you get to write the code in a natural way as shown 
below: fib_mem is defined recursively, but the "cache" function has the 
natural "(a -> b) -> (a -> b)" type and is abstract and reusable (no 
details as to the nature of the internal table are revealed).   

let cache f = 
  let table = ref [] in 
  fun n -> 
     try 
       List.assoc n !table 
     with Not_found -> 
       let f_n = f n in 
          table := (n, f_n) :: !table; 
          f_n 

let rec fib_mem = 
   cache (function 
            | 0 -> 0 
            | 1 -> 1 
            | n -> fib_mem (n - 1) + fib_mem (n - 2)) 

The use of a computation on the right of a "let rec" is allowed by 
systematically introducing initialization holes using lazy values and 
forces.  There are disadvantages to this approach, as it introduces a 
potential for initialization unsoundness somewhat similar to those in 
most designs and implementations of recursive modules.  However the 
paper argues that in the balance it is not unreasonable for a strict 
language to accept this in order to gain modularity and localize the 
potential for unsoundness.  It is even more compelling when often 
working with abstract APIs such as Java and .NET GUI libraries. 

While this isn't OCaml, and may not ever be the right design for OCaml, 
I've found it a useful technique to know even when doing C#, C++ and 
OCaml programming, as a broad range of recursion puzzles can be 
addressed by modelling the problem the "natural" way (e.g. more like 
Haskell) and then using a translation that introduces initialization 
holes systematically.  The translation of your sample into OCaml using 
"lazy" initialization holes is shown below (for single recursion you can 
also just use a "option ref").  Note "cache" does not change, 
maintaining the property that the caching function is abstract and 
reusable. 

let (!!) x = Lazy.force x 
let rec fib_mem' = lazy 
   cache (function 
            | 0 -> 0 
            | 1 -> 1 
            | n -> !!fib_mem' (n - 1) + !!fib_mem' (n - 2)) 

let fib_mem = !!fib_mem' 

FWIW it is well known that laziness can be used in essentially this way, 
e.g. see Michel Mauny's early papers on laziness in OCaml.  However I've 
not seen a paper that argues the case for making this the default 
interpretation of "let rec" in a strict language.

Using folding to read the cwn in vim 6+

Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).

:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'&lt;1':1
zM

If you know of a better way, please let me know.

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.

Alan Schmitt