Caml Weekly News

Hello

Here is the latest Caml Weekly News, for the week of December 11 to December 18, 2007.

The Caml Weekly News will be taking a break next week. Happy holidays!

CUFP Report Available
New contact address for the hump
First-class typed, direct-style sprintf
objsize-0.1
Jsure 1.0 Javascript checker
"Ref" and copy of functions
OCaml meeting in Paris
Camlp5 release 5.05

CUFP Report Available

Archive: http://groups.google.com/group/fa.caml/browse_frm/thread/58d732a29f4521ff#83bb57898b9c241c

Kathleen Fisher announced:

Jeremy Gibbons has written a very detailed report on the 2007   
Commercial Users of Functional Programming Workshop.  His report is   
available from the CUFP web page (http://cufp.functionalprogramming.com) 
as well as from the CUFP google group   
page (http://groups.google.com/group/cufp).  Thank you, Jeremy! 

Peter Thiemann and his students are just finishing up the post   
processing of the recordings of the talks, and we are in the process   
of getting those talks loaded onto the web.  I'll send a further   
announcement when the talks are available.

New contact address for the hump

Archive: http://groups.google.com/group/fa.caml/browse_frm/thread/04a9ab9657fe8341#74a9140a02e15640

Maxence Guesdon announced:

The new (and working) e-mail address to contact the hump administrators is: 
  caml-hump "AT" inria.fr 

The Caml Hump: 
  http://caml.inria.fr//cgi-bin/hump.en.cgi

First-class typed, direct-style sprintf

Archive: http://groups.google.com/group/fa.caml/browse_frm/thread/5fd7b552a8e8b067#53c242b901f557db

Oleg described:

We demonstrate direct-style typed sprintf: sprintf format_spec arg1 arg2 .... 
The type system ensures that the number and the types of format 
specifiers in format_expression agree with the number and the types of 
arg1, etc. Our sprintf and format specifiers are ordinary, 
first-class functions, and so can be passed as arguments or returned 
as results. The format specifiers can also be composed incrementally. 
Unlike Danvy/Filinski's sprintf, the types seem to be simpler. 

Recently there has been discussion contrasting the OCaml Printf module 
with Danvy/Filinski's functional unparsing. The latter can be 
implemented as a pure library function, requiring no special support 
from the compiler. It does require writing the format specifiers in 
the continuation-passing style, which makes types huge and 
errors very difficult to find. Perhaps less known that 
Danvy/Filinski's functional unparsing has a direct-style 
counter-part. It is particularly elegant and simple: the whole 
implementation takes only four lines of code. The various 
implementations of typed sprintf are lucidly and insightfully 
described in 
        http://pllab.is.ocha.ac.jp/~asai/papers/tr07-1.ps.gz 

Alas, the elegant sprintf requires control operators supporting both 
answer-type modification and polymorphism. The corresponding calculus 
has been presented in the recent APLAS 2007 paper by Asai and 
Kameyama.  They have proven greatly desirable properties of their 
calculus: strong type soundness, principal types, the existence of a 
type inference algorithm, preservation of types and equality through 
CPS translation, confluence, strong normalization for the subcalculus 
without fix. 
        http://logic.cs.tsukuba.ac.jp/~kam/paper/aplas07.pdf 
        http://pllab.is.ocha.ac.jp/~asai/papers/aplas07slide.pdf 

At that time, only prototype implementation was available, in a 
private version of mini-ML. Yesterday, a straightforward Haskell98 
implementation of the calculus emerged, based on parameterized monads: 
  http://www.haskell.org/pipermail/haskell/2007-December/020034.html 
It includes sprintf. 

OCaml already has delimited control operators, in the delimcc 
library. Those operators, too, support polymorphism. They are 
different however, supporting multiple typed prompts, but the fixed 
answer type (which is the type of the prompt). It has not been known 
until today if the quantity of the prompts can make up for their 
quality: can multi-prompt shift/reset (or, cupto) _emulate_ the 
modification of the answer-type. We now know that the answer is yes. 
Many elegant applications of shift/reset become possible; in 
particular, `shift (fun k -> k)' becomes typeable. The latter has many 
applications, e.g., in web form programming and linguistics. 

The full implementation and many tests (of sprintf) are available at 
        http://okmij.org/ftp/ML/sprintf-cupto.ml 

The following are a few notable excerpts. First are the definitions of 
answer-type modifying, polymorphic shift/reset.  Unlike the 
fixed-answer-type shift/reset, which have one prompt, shift2/reset2 
have two prompts, for the two answer types (before and after the 
modification). 

let reset2 f = 
  let p1 = new_prompt () in 
  let p2 = new_prompt () in 
  push_prompt p1 (fun () -> abort p2 (f (p1,p2)));; 
val reset2 : ('a Delimcc.prompt * 'b Delimcc.prompt -> 'b) -> 'a = <fun> 

let shift2 (p1,p2) f = 
  shift p1 (fun k -> fun x -> 
    push_prompt p2 (fun () -> f k x; failwith "omega"));; 
val shift2 : 
  ('a -> 'b) Delimcc.prompt * 'b Delimcc.prompt -> 
  (('c -> 'a -> 'b) -> 'a -> 'd) -> 'c = <fun> 

At first sight, these definitions seem trivial. At the second sight, 
they seem outright wrong, as 'abort p2' seem to be executed before the 
corresponding prompt p2 is pushed. Hopefully, the fifth sight will 
show that the definitions are indeed simple and do what they are 
supposed to. 

We can write the following append function (Sec 2.1 of Asai and 
Kameyama's APLAS paper) 

let rec append lst prompts = 
  match lst with 
  | [] -> shift2 prompts (fun k l -> k l) 
  | a::rest -> a :: append rest prompts;; 

whose inferred type 
   'a list -> ('a list -> 'b) Delimcc.prompt * 'b Delimcc.prompt -> 'a list 
clearly shows both the polymorphism (in 'b) and the answer-type 
modification from 'b to 'a list -> 'b. 

Here's the typed sprintf: 

let test3 = sprintf (lit "The value of " ^^ fmt str ^^ lit " is " ^^ fmt int);; 
   val test3 : string -> int -> string = <fun> 
let test3r = test3 "x" 1;; 
   val test3r : string = "The value of x is 1" 

We can write this test in the following way, to demonstrate that 
sprint and format specifiers are first-class and that the format 
specification can be composed incrementally. 

let test3' = 
  let format_spec1 = lit "The value of " ^^ fmt str ^^ lit " is " in 
  let format_spec = format_spec1 ^^ fmt int in 
  let applyit f x = f x in 
  let formatter = applyit sprintf format_spec in 
  formatter "x" 1;; 

Here are the (inferred) types of format specifiers and of their compositions: 

 # lit "xx";; 
 - : '_a Delimcc.prompt * '_a Delimcc.prompt -> string = <fun> 

 # fmt int;; 
 - : (int -> '_a) Delimcc.prompt * '_a Delimcc.prompt -> string = <fun> 
 # fmt str;; 
 - : (string -> '_a) Delimcc.prompt * '_a Delimcc.prompt -> string = <fun> 

 # fmt int ^^ lit "xx";; 
 - : (int -> '_a) Delimcc.prompt * '_a Delimcc.prompt -> string = <fun> 

 # fmt int ^^ fmt str;; 
 - : (int -> string -> '_a) Delimcc.prompt * '_a Delimcc.prompt -> string =<fun> 

The type of (fmt int ^^ fmt str) clearly shows the answer-type 
modification. One can read this as follows: given prompts, the 
expression computes a string upon the hypotheses 'int' and 'string' in 
that order.

objsize-0.1

Archive: http://groups.google.com/group/fa.caml/browse_frm/thread/069b8cee302e699e#5c699ef3f5c71389

Dmitry Grebeniuk announced:

  Some time ago there was a discussion about measuring 
sizes of ocaml values.  I've got some "round tuits" and 
released a library that I use in my programs for some 
years.  Maybe it will be useful for other people too. 
  It is better than pure ocaml solutions because it 
doesn't build hash table of visited values, and it uses 
two bits for each visited value in worst case (but in my 
practice it used no more than 120kb of additional memory 
when I measured sizes of 300Mb-values). 

Readme:  http://89.187.37.10/gds/objsize/README 
Tarball: http://89.187.37.10/gds/objsize/objsize-0.1.tar.gz

Jon Harrop said and Dmitry Grebeniuk replied:

> This doesn't seem to work on 64-bit. First I get: 

  Thanks for testing, now it works on your platform too. 

Readme:  http://89.187.37.10/gds/objsize/README 
Tarball: http://89.187.37.10/gds/objsize/objsize-0.11.tar.gz

Jsure 1.0 Javascript checker

Archive: http://groups.google.com/group/fa.caml/browse_frm/thread/b4a7613afd790c6f#5f66bbade9416f59

Berke Durak announced:

Jsure is a "lint" for Javascript, which is also known as Ecmascript. 

It checks syntax and a little bit of semantics (it's difficult to do 
anything more sophisticated with real-world Javascript short of 
interpreting it). 

We have been quite happy with it at Exalead for a few months now and so 
we decided to release it under the LGPL. 

It is used daily to check the code in the Baagz http://baagz.com/ project. 

It uses Aurochs as its parser (http://aurochs.fr/). 

You can get it at: 

   http://aurochs.fr/jsure.html

"Ref" and copy of functions

Archive: http://groups.google.com/group/fa.caml/browse_frm/thread/fbf1535f57fd5115#df8d80d702b7f1a5

Pierre-Evariste Dagand asked:

I'm looking for advices about a "clean way" of doing something which, 
by design, isn't. So, let states the problem. 

I'm doing a combinator library based on Arrows for a Yampa-like system : 

type ('a,'b) arrow = Arrow of ( 'a -> 'b ) 

Thus, I can instantiate an arrow with : 

let arr f = Arrow ( f ) 
val arr : ('a -> 'b) -> ('a, 'b) arrow 

Then, I have, for example, the composition of arrows : 

let (>>>) (Arrow f) (Arrow g) =Arrow ( fun  c -> g  ( f  c ) ) 
val ( >>> ) : ('a, 'b) arrow -> ('b, 'c) arrow -> ('a, 'c) arrow 

Right here, everything plays well (excepted some weak type variables 
issues but that's not a big deal). 

But (and that's here that the trouble is) I need a "loop" combinator, 
that I wrote this way : 

let loop init (Arrow f) = 
  let state = ref init in 
    Arrow ( 
      fun c -> 
        let new_state , output = f ( !state , c ) in 
          state := new_state; 
          output 
    ) 
val loop : 'a -> ('a * 'b, 'a * 'c) arrow -> ('b, 'c) arrow 

Finally, I can write a small arrow  : 

let arr_counter = loop 0 ( arr ( fun ( counter , x ) -> counter+1 , 
counter+x ) ) 
let arr_double = arr ( fun x -> 2*x ) 
let arr_my_small_arrow = arr_counter >>> arr_double 

And I execute it with : 

let run (Arrow f) input = 
  f  input 
val run : ('a, 'b) arrow -> 'a -> 'b 

And it works. Right. 

But now, I would like to be able to "copy" a built arrow and to be 
able to execute the copy without side-effecting on the first one. 
Obviously, I cannot do that in this implementation. 

My current solution is really *ugly* : 

let arrow_string = Marshal.to_string arrow [ Marshal.No_sharing ; 
Marshal.Closures ] 

Then I "Marshal.from_string arrow_string". I'm not even sure if it 
will not blow out of my hands with "strange" things in the Ref cell. 

I'm aware of implementations in CPS but I use to think that it has a 
performance cost that I'm not ready to pay (I'm fighting against a C++ 
code so...). 

So if someone knows an elegant solution to my problem (if someone has 
read this mail until here), please let me know :-) 

Best regards, 

P.S.: For those who are just curious, these combinators are used in a 
functional-reactive library for Peer-to-Peer overlays programming.

Oleg suggested:

If you like using the mutable state, perhaps you might find the 
following code helpful. The key idea is packaging the clone function 
along with the arrow. There is no longer any need in any unsafe 
features. The lesson from our tagless final APLAS paper is that many 
things are significantly easier if we do the work at the production 
site rather than at the consumption site. 

(* The first component is the arrow itself, the second one is the clone 
   function*) 
type ('a,'b) arrow = {arrow: 'a -> 'b; clone: unit -> ('a,'b) arrow};; 

let rec arr f = {arrow = f; clone = fun () -> arr f};; 

let rec (>>>) f g = {arrow = (fun c -> g.arrow (f.arrow c)); 
                     clone = (fun () -> f.clone () >>> g.clone ())};; 

(* Here, our clone function uses the initial state rather than the 
current state. It is trivial to clone from the current state, if 
desired.*) 

let rec loop init f = 
  let state = ref init in 
  {arrow = 
      (fun c -> 
        let new_state , output = f.arrow ( !state , c ) in 
          state := new_state; 
          output); 
   clone = (fun () -> loop init (f.clone ()))};; 

let arr_counter = loop 0 (arr (fun (counter, x) -> counter+1 ,counter+x));; 
let arr_double = arr (fun x -> 2*x);; 
let arr_my_small_arrow = arr_counter >>> arr_double;; 

let run f input = let v1 = f.arrow input in 
                  let v2 = f.arrow input in 
                  (v1,v2);; 

let test1 = run arr_my_small_arrow 10;; 
  val test1 : int * int = (20, 22) 

let test2 = run arr_my_small_arrow 10;; 
  val test2 : int * int = (24, 26)  (*obviously, counter keeps 
  counting *) 

let test3 = run (arr_my_small_arrow.clone ()) 10;; 
val test3 : int * int = (20, 22)  (* cloning `resets' the counter *)

Jacques Garrigue also suggested:

Here is yet another solution, using objects, which actually combines 
Zheng's and Oleg's ideas. That is, it separates state and function, 
but provides only access to state through a cloning method, so that it 
is completely type safe. This is just what objects are good at! 

class ['a,'b] arrow (f : 'a -> 'b) = 
  object (self) method call = f method copy = self end 

let (>>>) rf rg : ('a,'b) arrow = 
  object 
    val rf : ('a,'c) arrow = rf 
    val rg : ('c,'b) arrow = rg 
    method call x = rg#call (rf#call x) 
    method copy = {< rf = rf#copy; rg = rg#copy >} 
  end 

let loop init rf : ('b,'c) arrow = 
  object 
    val mutable state = init 
    val rf : ('a*'b,'a*'c) arrow = rf 
    method call x = 
      let state', y = rf#call (state, x) in 
      state <- state'; y 
    method copy = {< rf = rf#copy >} 
  end 

let arr = new arrow 
let arr_counter = loop 0 (arr (fun (counter,x) -> counter+1, counter+x)) 
let arr_double = arr (fun x -> 2*x) 
let arr_my_small_arrow = arr_counter >>> arr_double 

The key here is the {< ... >} construct, which creates a shallow copy 
of an object, eventually with some fields changed. As a result, in 
loop there is no need to handle the state explicitly: the state field 
in the copied object will be distinct from the state field in the 
original object. On the other hand, you must explicitly update fields 
holding arrows, since the copy is shallow. Note that the explicit 
"val" fields are needed to allow updating their contents when copying. 

One difference with Oleg's approach is that we take a copy of the 
original object, rather than creating a completely new record. In this 
case, this doesn't mean much, since there is no extra computation 
involved. Still, the state after copying is not the original but the 
current one. And this may matter more if the construction is more 
complicated.

Pierre-Evariste Dagand replied:

Yet another nice solution I would never have imagined :-) 

And the performance is quite good : 

1.428032 microseconds on my small benchmark 

(the unsafe Ref implantation takes 0.75 microseconds while the CPS 
implantation takes 2.7 microseconds). 

Nevertheless, I have carried out my benchmark on Oleg's implementation 
and find an impressive performance : 

0.74 microseconds ! 

Now, I will re-write my combinators and see which style better suits my needs.

Pierre-Evariste Dagand finally added:

Finally, impressed by the speed of the record solution, I have 
re-written the whole library with records. 

I have measured the performance of this implementation against the 
Ref-based one on "real" code. Thus, I have simulated my Chord [1] 
implementation with a simulator instrumented such that it drops the 
processing time for each arrow processing. 

The results can be found at [2]. 

In a nutshell : the record solution has a negligible overhead compared 
to the Ref solution. 

Thanks all, 

[1]: http://pdos.csail.mit.edu/chord/ 
[2]: http://perso.eleves.bretagne.ens-cachan.fr/~dagand/projets/opis/processing_speed_records.pdf

OCaml meeting in Paris

Archive: http://groups.google.com/group/fa.caml/browse_frm/thread/844f6a6880371cbd#8f748cc03a025c4f

It the middle of a thread, Berke Durak said:

I agree.  I think Ocamlfind and Godi are nice pieces of work, and it 
certainly would be better if we could integrate existing systems. 

We all love Ocaml, but there seems to be an important problem with it. 

Unlike most programming languages of similar magnitude, Ocaml doesn't 
have a developer conference where its users could meet face-to-face and 
discuss these kind of things...  (or did I miss something?)  If not, we 
should organize one. 

Ocaml users span half a dozen continents but we could start with a small 
gathering in Paris.  I'd suggest late January. 

We could then discuss distribution, building, packaging and other 
issues;  and publish guidelines for a "request for implementation" -- 
things we would like to see in an integrated compilation and package 
manager. 

People would then implement different solutions and we could gather two 
months later to evaluate the proposals. 

Does anyone have a proposal for a gathering place?

The editor said:

There have been many replied, suggesting dates and places. Please follow 
the archive link above if you are interested.

Camlp5 release 5.05

Archive: http://groups.google.com/group/fa.caml/browse_frm/thread/bde6e55ecd4ba192#7c2ef011bd661100

Daniel de Rauglaudre announced:

New release of Camlp5: 5.05 

Changes: 
  - added function Pcaml.quotation_location returning, in quotation 
    expanders, the location of the quotation in the source file. 
  - added generation of the file META for ocamlfind (built in 
    directory "etc"). 
  - some bug fixes and improvements. 

See the Camlp5 documentation and the file CHANGES. 

Download the sources and the documentation at: 
   http://pauillac.inria.fr/~ddr/camlp5/

Using folding to read the cwn in vim 6+

Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).

:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'&lt;1':1
zM

If you know of a better way, please let me know.

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.

Alan Schmitt