Previous week Up Next week

Hello

Here is the latest Caml Weekly News, for the week of 12 to 19 April, 2005.

  1. [French] Transparents de presentation d'OCaml
  2. Impact of GC on memoized algorithm
  3. Ocamlnet 1.0 released
  4. cfind 0.0.0
  5. ocaml-ast-analyze v 0.1.1
  6. ocaml-gettext v 0.2.0
  7. Ocaml <-> C
  8. CamlGI question

[French] Transparents de presentation d'OCaml

Archive: http://caml.inria.fr/pub/ml-archives/caml-list/2005/04/091b3df936ef868e7c8c3cf7000801e6.en.html

David Mentre said:
[ About slides I made to present OCaml. Sorry, French only as my slides
  are in french. BTW, caml-mode under Emacs has worked quite well for
  the live examples. Thanks. ]

Bonjour à tous,

J'ai fait une présentation d'OCaml la semaine dernière pour notre LUG
sur Rennes. Les transparents sont disponibles sur le web (sources OOo,
PDF et HTML) :
 http://www.linux-france.org/~dmentre/gulliver/presentations/expose-ocaml-2005-04-07/
(licence domaine public, repompez autant que vous voulez ;)

J'ai essayé de faire passer pourquoi je voyais OCaml comme un
*excellent* langage de programmation. Rien que du très classique pour
les habitués de cette liste. :)

Lors de l'exposé, j'ai illustré mon propos par quelques expressions
tapées en caml-mode sous Emacs. J'ai trouvé que c'était très pratique
pour montrer concrètement quelques typages et répondre aux
questions. Avec deux écrans virtuels et les bons raccourcis clavier, on
passe facilement des transparents à Emacs et vice versa.
    

Impact of GC on memoized algorithm

Archive: http://caml.inria.fr/pub/ml-archives/caml-list/2005/04/408d0d837d153cb041f37707b662be02.en.html

Continuing a thread from last week, Damien Doligez said:
Assuming you are going to use hash tables:

In my experience, it is important to pay attention to the cost of the
hashing function.  If you can avoid going through the whole structure to
compute the hash, you'll save a lot of time.

It may also be a good idea to limit the number of cache entries, instead
of just letting the cache grow arbitrarily large.  I've had good results
by deleting some entries at random when the cache gets too big.
    
Jan Kybic added:
Exactly. When the cache size exceeds your real RAM, it is often
cheaper to recalculate the value instead of retrieving it from the
disk (swap).  I have good experience with deleting the oldest (by
order of creation) and cheapest-to-recalculate entries when memory
becomes scarce by alternatively filling several (linked) hash tables.
The heuristic needs monitoring the current memory consumption, the
size of the items and timing the memoized operations.

On the other hand, an exact LRU (last recently used) cache based on
doubly-linked lists did not work well for me, the cost of maintaining
the links was too high.

The best trade-off obviously depends on the size of your items and
keys, cost of recalculating the items, and on the access
pattern... Experimenting is probably unavoidable.
    

Ocamlnet 1.0 released

Archive: http://caml.inria.fr/pub/ml-archives/caml-list/2005/04/737c32b3d21311fb9cb1371a50293760.en.html

Gerd Stolpmann announced:
The Ocamlnet project announces version 1.0 of this library collection.
This version includes a few bugfixes. It is also thought as the
ultimately mature version of the library ever, hence the jump to 1.0.

----------------------------------------------------------------------------
What is Ocamlnet?
----------------------------------------------------------------------------

A collection of modules for the Objective Caml language which focus on 
application-level Internet protocols and conventions.

The current distribution contains:

- a mature implementation of the CGI protocol

- an implementation of the JSERV protocol (AJP-1.2), can be used with
  mod_jserv (Apache JServ) and mod_jk (Jakarta connector) to connect
  application servers written in O'Caml with web servers

- an implementation of the FastCGI protocol, for the same purpose

- an experimental POP3 client

- a library of string processing functions related to Internet 
  protocols (formerly known as "netstring" and distributed separately):
  MIME encoding/decoding, Date/time parsing, Character encoding
  conversion, HTML parsing and printing, URL parsing and printing,
  OO-representation of channels, and a lot more.

Ocamlnet is developed as a SourceForge project:

  http://sourceforge.net/projects/ocamlnet

Developers and code contributions are welcome.

Ocamlnet is licensed under the zlib/libpng license.

----------------------------------------------------------------------------
Changes from 0.98 to 1.0
----------------------------------------------------------------------------

Only bug fixes and improvements to ease debugging of problems:

Mail infrastructure:
- One can now select a single LF as line terminator to write
  mail messages. This is the default when Netsendmail.sendmail
  pipes messages to the (external) MTA.

FastCGI:
- Improved exception handling. Low-level errors are signaled by the
  new FCGI_Error exception.
- Improved fastcgi compatibility, works now also with lighttpd

CGI:
- Many exceptions explain in more detail what is going wrong.

Other:
- Fix for quoting errors in Neturl.
    

cfind 0.0.0

Archive: http://caml.inria.fr/pub/ml-archives/caml-list/2005/04/0710de4450267645f2de68981c22f492.en.html

Radu Grigore announced:
Description: cfind is a UNIX tool that provides functionality similar
to that of Google Desktop from the command line. It is written
entirely in OCaml.

Homepage: http://cfind.sourceforge.net/

I'll appreciate any input from the OCaml community.
    
Jan Kybic answered:
It looks definitely very useful. Proposed extensions and changes:

- configurable choise of a lexer. For example there could be a table
  (read from a configuration file) with regular expressions matching
  path and file names, association them to parsers.

- If I understand your code correctly, in TeX files only
  command names are indexed, is it correct? Then I might prefer a
  different lexer, which ignores comments and command names and
  indexes the words in the text.

- It should be also possible to apply other configurable filters to
  the files before indexing. An example would be to decompress 
  all "*.gz" or "*.bz2" files before indexing

- More complicate logical expressions defining match, in the spirit of:
  "functional" AND "lazy" AND NOT "Haskell"

- It would be nice to be able to break files into smaller units and to
  find the units which match, not the whole file. A typical example
  would be email in mbox format, or perhaps functions in a program.
    

ocaml-ast-analyze v 0.1.1

Archive: http://caml.inria.fr/pub/ml-archives/caml-list/2005/04/64db5c2156f8678d51ac62b550accab2.en.html

Sylvain Le Gall announced:
General:
Ocaml-ast-analyze is a library that provides all the functions required
to build a camlp4 pr_*.cmo module ( ie a printer module ). Creating by 
hand such a module, require to match a lot of unuseful case, and always 
do the same branching... This module abstract all the matches and 
enables only to match what interest you in the AST of an Ocaml source 
file.

It comes also with pr_ast_dump.cmo which is a real pr_*.cmo module. This
module dumps the AST of an Ocaml source file. It allows the programmer 
to understand what should be match in the AST.

Changes:
v0.1.1 is released because of the dependency holds by ocaml-gettext (
the string extractor of ocaml-gettext is based on this module ). But 
the library should be useful, even with not a lot of documentation ( 
should come in the next version ).

Link:
http://www.carva.org/sylvain.le-gall/ocaml-ast-analyze.html
    

ocaml-gettext v 0.2.0

Archive: http://caml.inria.fr/pub/ml-archives/caml-list/2005/04/ff50941ea070c30d00f43af49c4d6b7d.en.html

Sylvain Le Gall announced:
General:
Ocaml-gettext is a library that enables string translation in Ocaml. The
API is based on GNU gettext. It comes with a tool to extract the string
which needs to be translated from Ocaml source file.

This enable Ocaml program to output string in the native language of the
user ( if you also provide a file containing the translation of the 
english string contains in the program to the one in the native language 
of the user ).

The translation is base on string ( this means that you need to provide
a string and not a unique identifier, like in some other catalog 
system ). This string is in english, and will be returned if the native 
language of the user doesn't have translation catalog.

Changes:
v 0.2.0 is the first public release. Former release was simple wrapper
around GNU gettext. This release have a wrapper to GNU gettext AND a 
pure Ocaml implementation of the library ( based on Camomile ).

Link :
http://www.carva.org/sylvain.le-gall/ocaml-gettext.html
    

Ocaml <-> C

Archive: http://caml.inria.fr/pub/ml-archives/caml-list/2005/04/8a7669ccef59d4fe0f8ac00734d20abf.en.html

Paul Argentoff asked and Jean-Christophe Filliatre answered:
> Let's say we have a function:
> 
> external f : t option -> unit = "c_f"
> 
> How can I make analysis depending on the parameter, which in Ocaml is:
> 
> match a with
>   | None -> chunk1
>   | Some x -> chunk2 x
> 
> What functions/macros from a standard ocaml includes set can I use?

None  is represented  by an  integer  (0) and  Some is  pointing to  a
block. You can test for the former case with the macro Is_long:

        if (Is_long(v)) {
           ... None case ...
        } else {
           ... Some case ...
           ... you access to x here with Field(v, 0) ...
        }
    

CamlGI question

Archive: http://caml.inria.fr/pub/ml-archives/caml-list/2005/04/cab5740cff6c34bef0d6d960b36cc196.en.html

Mike Hamburg asked:
Is CamlGI still actively maintained?  I'm writing a CGI/FastCGI program 
using it, and have been having some trouble with the library.

My CGI program is designed to be a more flexible way to index web 
pages.  It's not finished yet, but a not-very-polished toy example can 
be found at http://capricorn.dnsalias.org/mike/index/ .  (I may as well 
mention that "not very polished" means, among other things, "not known 
to work in browsers other than Firefox, and known to display wrong in 
Safari").

When used as a FastCGI, the indexing script hangs, either usually or 
always, after writing out exactly 8KB of data (that is, 8192 bytes, no 
more, no less).  After 30 seconds, mod_fastcgi times out the 
connection, but mysteriously writes out the rest of the script's 
output.  It is quite clear that the script has finished by the time the 
hang occurs and that all its output has been written with Unix.write, 
as it displays even if killed with signal 9 while hanging; therefore I 
think it may be a formatting error in the output routines of CamlGI.

The script also sometimes breaks its output pipe to the server, which 
stays broken it until it is restarted manually (eg sudo killall 
index.fcgi).  This may or may not be a separate bug.

The plain CGI version works just fine, if a bit slow, but some of the 
features of the script only work in the FastCGI version, such as 
thumbnailing.

Has anyone familiar with CamlGI seen either of these issues before, or 
have any idea how to resolve them?
    
Robert Roessler answered and Alex Baretta added:
> I am not able to shed any light on the CamlGI question... OTOH, the 
> announcement from Gerd Stolpmann a few days ago regarding Ocamlnet 1.0 
> may be of interest, given that it includes "a mature implementation of 
> the CGI protocol" and "an implementation of the FastCGI protocol".

It is worth noting that Baretta DE&IT has commissioned a full 
implementation of the HTTP/1.1 protocol from Gerd. The HTTP library will 
be based on Ocamlnet and will export more or less the same API as the 
Netcgi module. We chose this approach rather than FastCGI because the 
FastCGI project seems dead and did not look like a viable solution for 
our Xcaml application server.

Xcaml aims at being a Apache+Tomcat+JSP+Servlet replacement. The Xcaml 
virtual machine and API are already complete, but the performance which 
they achieve in conjunction with Apache is mediocre. Gerd's new HTTP 
connector Ocamlnet will give us top notch performance while without 
sacrificing the safety guarantees of the Ocaml language and VM.

The new library will be released to the community by Baretta DE&IT and 
Gerd Stolpmann jointly under the terms of the GPL. When the integration 
with the Xcaml server will be done, the full Application System/Xcaml 
will be released under the terms of the GPL.
    
Gerd Stolpmann added:
Let me also add a few words about this project. What we are going to
implement here is nothing else but a web server written in O'Caml, or
better a web server component that can be integrated into the
application it is serving. Of course, this web server will have
"industry quality", especially regarding stability and performance. The
HTTP kernel is already written, and implements event-driven message
exchange for HTTP/1.0 and 1.1 in only 1200 lines of code.

Another part of the web server is called the "reactor". It provides a
Netcgi-compatible interface into which existing applications using
Netcgi can be simply plugged in. That means it will be quite easy to add
the web server component to existing CGI applications. The reactor
processes one HTTP request after the other, and can call an arbitrary
content generator for every request. To achieve parallelism, it is
planned to integrate the reactor into a multi-threaded setup.

I am also figuring out a purely event-based implementation (using only
Unix.select) in the hope that the simplification of scheduling will give
us a performance boost. This setup will be a lot more complicated, and
when carefully combined with multi-threading or -processing it will also
be possible to plug in existing Netcgi-based application in addition to
purely event-based content generators, i.e. the best of all worlds.

As you can see, some aspects of the web server design follow
conservative ideas (like the reactor), and some are very experimental. I
hope this results in a top-performing server that can be configured in
very flexible ways.
    
Michael Alexander Hamburg asked and Alex Baretta answered:
> Given then that my application should be multithreaded, and will be
> running on a webserver using Rails (which traditionally uses FastCGI),
> which of these libraries do you suggest that I use?  Http, Netcgi_afp or
> Netcgi_fcgi?  Or are they interoperable enough that it doesn't matter?

So long as you use Netcgi, it does not matter. The API does not expose 
the difference between the various connectors.
    

Using folding to read the cwn in vim 6+

Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).

:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'&lt;1':1
zM

If you know of a better way, please let me know.


Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.


Alan Schmitt