Previous week Up Next week


Here is the latest Caml Weekly News, for the week of March 23 to 30, 2010.

  1. ocamldebug: User-defined printers for abstract data types?
  2. no_scan_tag and int array
  3. testers wanted for experimental SSE2 back-end
  4. Backtracting exceptions in toplevel
  5. jsonpat
  6. Other Caml News

ocamldebug: User-defined printers for abstract data types?


Stefan Ratschan asked and Florent replied:
> The ocamldebug manual entry for load_printer says: "The loaded file does
> not have direct access to the modules of the program being debugged."
> So I have to put the whole printer definition into the loaded file. But
> then, how can one write a printer for abstract data types? External
> files do not have access to those.

The printing of Big_int is possible and AFAIK Big_int is an abstract data type.

The following code is part of in VSYML [1]:

(** custom debug printer for big_int values (why is it missing in vanilla ocaml?)
 @param value_big_int the value to be printed
 @return unit *)
let print_big_int (value_big_int:Big_int.big_int) =
 Format.printf "%s" (Big_int.string_of_big_int value_big_int);;

And it is loaded by the following commands in the debugger:
load_printer build/debug/DebugPrinter.cmo
install_printer DebugPrinter.print_big_int

[1] VSYML VHDL Symbolic Simulator in OCaml

no_scan_tag and int array


ygrek asked:
Consider this code:

open Printf

let measure f =
 let t = Unix.gettimeofday () in
 let () = f () in
 printf "%.4f sec" (Unix.gettimeofday () -. t)

let () =
 let gc () = for i = 1 to 10 do Gc.full_major () done in
 let a = Array.make 4_000_000 0 in
 measure gc;
 printf " normal %u (%u)\n%!" (Array.length a) (Gc.stat ()).Gc.live_words;

 Obj.set_tag (Obj.repr a) (Obj.no_scan_tag);
 measure gc;
 printf " no_scan_tag %u (%u)\n%!" (Array.length a) (Gc.stat ()).Gc.live_words;

 measure gc;
 printf " no array (%u)\n%!" (Gc.stat ()).Gc.live_words;

Output looks like :

0.2281 sec normal 4000000 (4000165)
0.0002 sec no_scan_tag 4000000 (4000165)
0.0002 sec no array (164)

So, as expected, setting No_scan_tag on the array of integers prevents GC from
uselessly scanning the huge chunk of memory. Looks like polymorphic array
functions still work fine and GC correctly reclaims array memory when it is
not referenced anymore. Apparantly this trick is not allowed for float array
as they have a special tag set. The question is - how safe is this? And even
more, could the compiler itself set this tag?
Damien Doligez replied:
> So, as expected, setting No_scan_tag on the array of integers prevents GC from
> uselessly scanning the huge chunk of memory. Looks like polymorphic array
> functions still work fine and GC correctly reclaims array memory when it is
> not referenced anymore. Apparantly this trick is not allowed for float array
> as they have a special tag set.

The trick is not needed for float arrays, the GC already doesn't scan

> The question is - how safe is this?

It's safe, and will be in the forseeable future.

BUT: you should use Abstract_tag and not No_scan_tag.  Abstract_tag means
"don't make assumptions about the contents of this block", while
No_scan_tag is just the min of all the tags that the GC is not supposed
to scan.  Right now they are equal, but a future version of OCaml might
have No_scan_tag = Double_array_tag, which would break your code.

> And even more, could the compiler itself set this tag?

This is a bit tricky because you have to make sure that the static
type of the array is "int array".  Unlike floats, a run-time test
at allocation will not work.

You should enter this as a feature wish in the BTS.

testers wanted for experimental SSE2 back-end


Continuing the thread from last week, Xavier Leroy said and Dmitry Bely replied:
>>> This is a call for testers concerning an experimental OCaml compiler
>>> back-end that uses SSE2 instructions for floating-point arithmetic.[...]
>> I cannot provide any benchmark yet
> Too bad :-( I got very little feedback to my call: just one data point
> (thanks Gaetan).  Perhaps most OCaml users interested in numerical
> computations have switched to x86-64bits already?  At any rate, given
> such a lack of interest, this x86-32/SSE2 port isn't going to make it
> into the OCaml distribution.

It's a pity. Probably even my (future) benchmarks won't help...

>> but even not taking into account
>> the better register organization there are at least two areas where
>> SSE2 can outperform x87 significantly.
>> 1. Float to integer conversion
>> Is quite inefficient on x87 because you have to explicitly set and
>> restore rounding mode.
> Right.  The mode change makes the conversion about 10x slower on x87
> than on SSE2.  Apparently, float->int conversion is uncommon is
> numerical code, otherwise we'd observe bigger speedups on real
> applications...
>> 2. Float compare
>> Does not set flags on x87 so
> The SSE2 code is prettier than the x87 code, but this doesn't seem to
> translate into a significant performance gain, in my limited testing.
>> As for SSE2 backend presented I have some thoughts regarding the code
>> (fast math functions via x87 are questionable,
> Most x86-32bits C libraries implement sin(), cos(), etc with the x87
> instructions, so I'm curious to know what you find objectionable here.

Microsoft's implementation for P4 and above is SSE2-based. And Intel
itself recommends to do so:

What Is AM Library?
Ever missed a sine or arctangent instruction among Intel Streaming
SIMD Extensions? Ever wished there were a way to calculate logarithm
or exponent in about a dozen cycles? Here is a new release of

Approximate Math Library (AM Library) -- a set of fast routines to
calculate math functions using Intel(R) Streaming SIMD Extensions
(SSE) and Streaming SIMD Extensions 2 (SSE2). The Library offers
trigonometric, reverse trigonometric, logarithmic, and exponential
functions for packed and scalar arguments. The processing speed is
many times faster than that of x87 instructions and even of table
lookups. The accuracy of AM Library routines can be adequate for many
applications. It is comparable with that of reciprocal SSE
instructions, and is hundreds times better than what is achievable
with lookup tables.

The AM Library is provided along with the full source code and a usage sample.
[end of quote]

Another interesting reading:

>> optimization of floating compare etc.) Where to discuss that - just
>> here or there is some entry in Mantis?
> Why not start on this list?  We'll move to private e-mail if the
> discussion becomes too heated :-)


1. My variant of emit_float_test (in many cases eliminates extra jump).

let emit_float_test cmp neg arg lbl =
 let opcode_jp cmp =
   match (cmp, neg) with
     (Ceq, false) -> ("je", true)
   | (Ceq, true)  -> ("jne", true)
   | (Cne, false) -> ("jne", true)
   | (Cne, true)  -> ("je", true)
   | (Clt, false) -> ("jb", true)
   | (Clt, true)  -> ("jae", true)
   | (Cle, false) -> ("jbe", true)
   | (Cle, true)  -> ("ja", true)
   | (Cgt, false) -> ("ja", false)
   | (Cgt, true)  -> ("jbe", false)
   | (Cge, false) -> ("jae", true)
   | (Cge, true)  -> ("jb", false) in
 let branch_opcode, need_jp = opcode_jp cmp in
 let branch_opcode, arg0, arg1, need_jp  =
   match arg.(1).loc with
     Reg _ when need_jp ->
     (* swap args if it excludes jmp *)
     let (branch_opcode_swap, need_jp_swap) =
         (match cmp with
           Ceq -> Ceq
         | Cne -> Cne
         | Clt -> Cgt
         | Cle -> Cge
         | Cgt -> Clt
         | Cge -> Cle) in
     if need_jp_swap
     then (branch_opcode, arg.(0), arg.(1), true)
     else (branch_opcode_swap, arg.(1), arg.(0), false)
   | _ ->
     (branch_opcode, arg.(0), arg.(1), need_jp)
 begin match cmp with
 | Ceq | Cne -> `      ucomisd `
 | _         -> `      comisd  `
 `{emit_reg arg0}, {emit_reg arg1}\n`;
 let branch_if_not_comparable =
   if cmp = Cne then not neg else neg in
 if need_jp then
   if branch_if_not_comparable then begin
     ` jp      {emit_label lbl}\n`;
     ` {emit_string branch_opcode}     {emit_label lbl}\n`
   end else begin
     let next = new_label() in
     ` jp      {emit_label next}\n`;
     ` {emit_string branch_opcode}     {emit_label lbl}\n`;
     `{emit_label next}:\n`
 else begin
   `   {emit_string branch_opcode}     {emit_label lbl}\n`

2. My variant of fast math functions (see explanation above)

let emit_floatspecial = function
   "sqrt"  -> `        sqrtsd  `
 | _ -> assert false

3. Loading st(0) can be two instructions shorter :)

             ` sub     esp, 8\n`;
             ` fstp    REAL8 PTR [esp]\n`;
             ` movsd   {emit_reg dst}, REAL8 PTR [esp]\n`;
             ` add     esp, 8\n`

can be written as

             ` fstp    REAL8 PTR [esp-8]\n`;
             ` movlpd  {emit_reg dst}, REAL8 PTR [esp-8]\n`;

4. Unnecessary instruction in Lop(Iload(Single, addr))

           `   movss   {emit_reg dest}, REAL4 PTR {emit_addressing addr
i.arg 0}\n`;
           `   cvtss2sd {emit_reg dest}, {emit_reg dest}\n`

can be written as

           `   cvtss2sd {emit_reg dest}, REAL4 PTR {emit_addressing
addr i.arg 0}\n`

Backtracting exceptions in toplevel


C. Fr asked and Jake Donham replied:
> It there a way to get a backtrace for exceptions occurring in the toplevel
> (for example, getting more information when the stack blows)? I’ve read that
> there was an equivalent of the -b option that ocamlrun uses, but I couldn’t
> find any subsequent information on it.

In 3.11.x there is Printexc.record_backtrace which controls the -b
flag, and Printexc.get_backtrace to get the trace of the last

If you want to see backtraces printed automatically in the toplevel,
try my patch:



Julien Verlaguet announced:
I am proud to announce the first release of Jsonpat. Jsonpat is a command line
tool that allows to write json to json tranformations, using pattern-matching,
directly in the shell.

For example given the file, myexample.json :
 "age":39 }

The following command extracts the values present in the field "name":

$ jsonpat -p '{name:x} -> x' myexample.json

The source code and documentation can be found here:

Comments and suggestions are more than welcome.

Other Caml News

From the ocamlcore planet blog:
Thanks to Alp Mestan, we now include in the Caml Weekly News the links to the
recent posts from the ocamlcore planet blog at

Proof of negation and proof by contradiction:

minisat2 ocaml bindings:

Status of OCaml packages on Ubuntu Lucid Lynx (10.04 LTS): transition to OCaml 3.11.2 finished:

“The Whitespace Thing” for OCaml:


Updated backtrace patch:

CCSS 1.2 released:

Inside OCaml objects:

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.

Alan Schmitt