Caml Weekly News

Previous week Up Next week
Hello,

Here is the latest Caml Weekly News, week 19 to 26 November, 2002.

01) OCaml HTTP Daemon library: ocaml-http 0.0.3
02) Hash consing
03) 8 Assistant/Associate Professorships at the IT University of
    Copenhagen
04) xmllexer
05) Unix.sendto Unix.recvfrom
06) mlgmp bugfix release
07) Jabbr, an XMPP library for OCaml
08) Why systhreads?
09) Web Demonstration of Type Error Slicing
10) caml2html v1.1

======================================================================
01) OCaml HTTP Daemon library: ocaml-http 0.0.3
----------------------------------------------------------------------
Stefano Zacchiroli announced:

ocaml-http is a library inspired from perl's HTTP::Daemon that permits
to write simple HTTP daemons in OCaml, main module is named Http_daemon.

Currently GET only HTTP daemons are supported.

You can define the behaviour of your daemon defining a callback that is
invoked each time an HTTP request is received.
A simple OO interface that abstracts over HTTP requests and responses is
provided.
A set of facility functions for writing callbacks is also provided (e.g.
functions to send headers, files, and so on).

Regarding daemons' "social life" is possible to specify address and port
on which the daemon will listen to, whether or not a child is forked for
each request and a timeout after which client connections will be
closed.

A preliminary release (0.0.3) of ocaml-http is available for download
at:

Home page:  http://www.students.cs.unibo.it/~zacchiro/ocaml-http.en.html
Tarball:    http://www.students.cs.unibo.it/~zacchiro/stuff/ocaml-http-0.0.3.tar.gz
CVS Web:    http://www.cs.unibo.it/cgi-bin/cvsweb/helm/DEVEL/ocaml-http/

Feedback is appreciated.

======================================================================
02) Hash consing
----------------------------------------------------------------------
Lukasz Lew asked and Jean-Christophe Filliatre answered:

> Let's suppose wy have module with such interface
> 
> module X =
>  type t (*big data structure*)
> 
>  add : string -> t
>  delete : t -> ()
> 
> And in the other module i keep X.t list, and i want to keep with each of 
> them additional data of unknown type [at time of X compilation]

You could  have a polymorphic  type "'a X.t"  where 'a stands  for the
type of the  associated data, and provide functions  "add_info : 'a ->
'a t -> unit" and "get_info : 'a t -> 'a" in module X.

> Other is to keep with every t int identifier as small as possible, to keep
> additional data in array, but this solution isn't very good.

Following this  kind of  idea, you could  do hash-consing of  your X.t
values, so that  they are allocated only once; doing  this, it is easy
to associate a  unique integer to each value in  X.t, which will allow
you to build an efficient map over X.t.

I already wrote an hash-consing module, available here:
http://www.lri.fr/~filliatr/software.en.html

(By  the way,  doing hash-consing  has many  other advantages  such as
saving space, providing O(1) equality, ...)

======================================================================
03) 8 Assistant/Associate Professorships at the IT University of
   Copenhagen
----------------------------------------------------------------------
Camilla Jensen announced:

The IT University of Copenhagen has 8 new Assistant/Associate Professorships
within the areas of programming language technology, software design and
implementation, networks and protocols, foundations of image analysis,
algorithms
and complexity theory, and verification.

Application deadline is January 6th, 2003.

Full announcement at http://www.it-c.dk/English/vacancies/TUBVmk/

======================================================================
04) xmllexer
----------------------------------------------------------------------
Manuel Maarek announced:

I am pleased to announce the first release of xmllexer available here:

http://www.macs.hw.ac.uk/~mm20/xmllexer/

xmllexer is an XML lexer for Camlp4. It is composed of one OCaml file   
(in revised syntax). xmllexer.ml is derived from the default lexer of 
Camlp4: plexer.ml (written by Daniel de Rauglaudre).

======================================================================
05) Unix.sendto Unix.recvfrom
----------------------------------------------------------------------
Salvatore Altavilla asked and Dimitri Timofeev answered:

> I would want to have an example of like sending and receiving a 
> package (type string) in ocaml via UDP protocol

Maybe an attached file (an exercise) could help you. By the way,
i'd like to hear all the criticism and suggestions one may have
concerning this code.

(see the file at: http://caml.inria.fr/archives/200211/msg00255.html)

======================================================================
06) mlgmp bugfix release
----------------------------------------------------------------------
David Monniaux announced:

Here is a bugfix release for mlgmp, the OCaml / GNU MP interface.

Among other things, an annoying bug triggered by garbage collection during
integer divisions was fixed.

http://www.di.ens.fr/~monniaux/download/mlgmp-20021123.tar.gz

======================================================================
07) Jabbr, an XMPP library for OCaml
----------------------------------------------------------------------
Mike Lin announced:

Hi. I finally got around to documenting a first version of Jabbr, an
OCaml library for the XMPP (better known as Jabber) instant messaging
system that I wrote last summer.

http://mikelin.mit.edu/xmpp/jabbr/

======================================================================
08) Why systhreads?
----------------------------------------------------------------------
Xavier Leroy lectured:

It seems that the annual discussion on threads started again.  Allow 
me to deliver again my standard lecture on this topic.

Threads have at least three different purposes:

1- Parallelism on shared-memory multiprocessors.
2- Overlapping I/O and computation (while a thread is blocked on a network
   read, other threads may proceed).
3- Supporting the "coroutine" programming style
   (e.g. if a program has a GUI but performs long computations,
    using threads is a nicer way to structure the program than
    trying to wrap the long computation around the GUI event loop).

The goals of OCaml threads are (2) and (3) but not (1) (for reasons
that I'll get into later), with historical emphasis on (2) due to the
MMM (Web browser) and V6 (HTTP proxy) applications.

Pure user-level scheduling, or equivalently control operators (call/cc),
provide (3) but not (2).

To achieve (2) with a user-level scheduler such as OCaml's bytecode
thread library requires all sorts of hacks, such as non-blocking I/O
and select() under Unix, plus wrapping of all I/O operations so that
they call the user-level scheduler in cases where they are about to
block.  (Otherwise, the whole process would block, and not just the
calling thread.)

Not only this is ugly (read the sources of the bytecode thread library
to get an idea) and inefficient, but it interacts very poorly with
external libraries written in C.  For instance, deep inside the C
implementation of gethostbyname(), there are network reads that can
block; there is no way to wrap these with scheduler calls, short of
rewriting gethostbyname() entirely.

To make things worse, non-blocking I/O is done completely differently
under Unix and under Win32.  I'm not even sure Win32 provides enough
support for async I/O to write a real user-level scheduler.

Another issue with user-level threads, at least in native code, is the
handling of the thread stacks, especially if we wish to have thread
stacks that start small and grow on demand.  It can be done, but is
highly processor- and OS-dependent.  (For instance, stack handling on
the IA64 is, ah, peculiar: there are actually two stacks that grow in
opposite directions within the same memory area...)

One aspect of wisdom is to know when not to do something oneself, but
leave it to others.  Scheduling I/O and computation concurrently, and
managing process stacks, is the job of the operating system.  Trying
to do it entirely in a user-mode program is just not reasonable.
(For another reference point, see Java's move away from "green
threads" and towards system threads.)

What about parallelism on SMP machines?  The main issue here is that
the runtime system, and in particular the garbage collector and memory
manager, must be MP-safe.  This means minimizing global state, and
introducing locking around accesses to shared resources.  If done
naively (e.g. locking at each heap allocation), this can be extremely
costly; it also complicates the runtime system a lot.  Finally,
garbage collection can become a limiting factor if it is done in the
"stop the world" fashion (all threads stop during GC); a concurrent GC
avoids this problem, but adds tremendous complexity.

(Of course, all this SMP support stuff slows down the runtime system
even if there is only one processor, which is the case for almost all
our users...)

All this has been done before in the context of Caml: that was
Damien Doligez's Concurrent Caml Light system, in the early 90s.
Indeed, the incremental major GC that we have in OCaml is a
simplification of Damien's concurrent GC.  If you're interested, have
a look at Damien's publications.

Why was Concurrent Caml Light abandoned?  Too complex; too hard to debug
(despite the existence of a machine-checked proof of correctness);
and dubious practical interest.  Shared-memory multiprocessors have
never really "taken off", at least in the general public.  For large
parallel computations, clusters (distributed-memory systems) are the
norm.  For desktop use, monoprocessors are plenty fast.  Even if you
have a 4-processor SMP machine, it isn't clear whether you should
write your program using shared memory or using message passing -- the
latter is slightly more expensive, but scales to clusters...

What about hyperthreading?  Well, I believe it's the last convulsive
movement of SMP's corpse :-)  We'll see how it goes market-wise.  At
any rate, the speedups announced for hyperthreading in the Pentium 4
are below a factor of 1.5; probably not enough to offset the overhead
of making the OCaml runtime system thread-safe.

In summary: there is no SMP support in OCaml, and it is very very
unlikely that there will ever be.  If you're into parallelism, better
investigate message-passing interfaces.

(this of course has spawned an active thread, starting here:
http://caml.inria.fr/archives/200211/msg00274.html)

======================================================================
09) Web Demonstration of Type Error Slicing
----------------------------------------------------------------------
Christian Haack announced:

Previous methods have generally identified the location of a type
error as a particular program point or the program subtree rooted at  
that point.  In the paper "Type Error Slicing in Implicitly Typed 
Higher-Order Languages" by two of us (Haack and Wells), we present a
new approach that treats the location of a type error as a minimal set
of program points which are necessary for the type error and presents
this "location" as a slice of the program.  Connected to this paper,   
we have implemented type error slicing for a subset of core Standard
ML.

We have prepared a web interface that allows you to experiment with
our implementation using just your web browser.  This interface is now
available at:

http://www.cee.hw.ac.uk/ultra/compositional-analysis/type-error-slicing/

======================================================================
10) caml2html v1.1
----------------------------------------------------------------------
Sebastien Ailleret announced:

I'm pleased to announce v1.1 of caml2html. Caml2html is a tool which
creates hilighted html pages from OCaml files (.ml, .mli, .mll and
.mly). You can see an example of its output here :
http://perso.wanadoo.fr/sebastien.ailleret/caml/html/overflow.mli.html

You can get caml2html from my OCaml page :
http://perso.wanadoo.fr/sebastien.ailleret/caml/

This new version introduces the following improvements :
- possible use of css to choose the colors you want
- can concat several ml files in a single html document
- better interaction between ml and html ('<', '>', ';')
- many performances improvements

======================================================================
Old cwn
----------------------------------------------------------------------

If you happen to miss a cwn, you can send me a message
(alan.schmitt@inria.fr) and I'll mail it to you, or go take a look at
the archive (http://pauillac.inria.fr/~aschmitt/cwn/). If you also wish
to receive it every week by mail, just tell me so.

======================================================================

Alan Schmitt