Previous week Up Next week


Here is the latest Caml Weekly News, for the week of May 16 to 23, 2006.

  1. annot looks up types in *.annot files
  2. [Camlp4] Beta-release
  3. compiler bug?
  4. Array 4 MB size limit
  5. HOL Light version 2.20 now available
  6. ocamldebug and abstract record types

annot looks up types in *.annot files


Christian Lindig announced:
To lookup types of identifiers from within Vim, I have implemented a 
small tool in Ocaml called annot:

	$ annot -type 30 20 main.annot
   	string list

Given a cursor position (line 30, column 20) it looks up the type 
annotation for that position in main.annot. It is trivial to hook this 
to a key in Vim, and probably most other editors.

function! OCamlType()
     let col  = col('.')
     let line = line('.')
     let file = expand("%:p:r")
     echo system("annot -n -type " . line . " " . col . " " . file . 
map ,t :call OCamlType()<return>

I'm well aware that Vim and Emacs have their own solutions, using their 
own scripting languages, but I think it is better to have one general 

You can download the tool from my subversion repository; it is released 
under a BSD-style license.

	svn co svn://

The tool should build on Unix systems, comes with ./configure, and a 
manual page. Let me know if there are any problems. Of course, positive 
feedback is also welcome.

[Camlp4] Beta-release


Continuing the thread from last week, Alain Frisch said:
Nicolas Pouillard wrote:
> The last four months, I have been working on the Camlp4 renovation
> with Michel Mauny.
> We are glad to announce the beta-release of a new version of Camlp4,
> that contains a lot of major changes.
> *** This version is not backward compatible. Use it at your own risk.***
> Major changes are described at:

Wow, that's a major renovation indeed. Great news, good job.

I've been able to upgrade two of my syntax extensions [1][2] without any
problem and the code is now cleaner than before. Some people might want
to look at them for examples of simple syntax extensions working with
the new Camlp4.

Thanks Nicolas!

-- Alain

[1] openin:
[2] ulex:
Stefano Zacchiroli asked and Nicolas Pouillard answered:
> What are the plans for the future of camlp4, and especially for the
> changes you made? Are them going to be integrated in the "official"
> camlp4 release and thus integrated in future releases of ocaml? Will
> camlp4 be distributed separately in the future?

It's planned to be integrated in the future releases of OCaml.

compiler bug?


Dan Koppel asked:
I would like to report what I think might be a bug in the Ocaml compiler.  But
first I wanted to run this by this group in case there's something I'm missing.
I have some very simple code that consists of 2 nested loops.  Inside the inner
loop, is a simple statement.  Furthermore, the inner loop is not "tight".  Ie.
the number of iterations within the inner loop is very large and the number of
iterations of the outer loop is very small.  I then manually time this.  I then
change the code by inserting a simple function call between the inner and outer
loops.  This should have virtually no effect whatsoever.  However, when I time
this, I get exactly twice the time.  This is somewhat inexplicable.  I tried
tinkering with the "-inline" option for ocamlopt but this had no effect.  Below
is the actual code (
let main () =
  let dummy1 = ref 0 in
  let dummy2 = ref 0.0 in
  for i = 1 to 4 do
    for j = 1 to 1000000000 do
      dummy1 := !dummy1 + 1;
      dummy1 := !dummy1 - 1
    dummy2 := Unix.gettimeofday ()
let _ = main ()
I compile as follows: ocamlopt unix.cmxa
and run: ./a.out
Is this in fact a bug of the ocamlopt compiler?  Or is there some way currently
to make this effect disappear?
John Carr answered:
On SPARC the presence of the function call in the outer loop causes
the code generated for the inner loop to change so the dummy1 variable
is stored on the stack instead of in a register.  Each loop iteration
loads dummy1, modifies it, and stores it back onto the stack.
The store-load hazard, loading a value that is in the store buffer,
adds a large delay.  The loop runs in half the time if I comment out
either the store or the load in the assembly.

If the inner loop did more computation the effect would be much less.

This is surprising but not strictly a bug.  Xavier Leroy has posted
about similar minor changes causing the compiler to box or unbox a
floating point value with major changes in performance.
Xavier Leroy also answered:
As John Carr mentioned, the function call forces the variables that
are live across the call (i.e. dummy1) to be spilled, causing one additional
memory read and one additional write at each iteration of the inner
loop.  Since your inner loop is approximately 4 integer instructions,
the additional memory traffic can easily halve performance.

I grant you that spill code generation could be more clever here and
spill/reload around the whole inner loop.  But, no, it's not a bug:
Ocaml has a bug when the generated code crashes or produces incorrect

> I learned in basic compiler theory that when a function is called, you
> save all the registers before entering the function.

That's an over-simplification.  First, many compilers designate a
number of registers as "callee-save", meaning that all functions must
preserve their values.  These would be preserved across a call.
It happens that ocamlopt does not implement callee-save registers, as
they make garbage collection (more exactly, stack traversal) and
exception handling more expensive.  But even when a function call
destroys all registers, as with ocamlopt, there are many possible
placements for spills (saving a register in the stack) and reloads (of
a register from the stack), and the placement you suggest (only around
calls) is rarely optimal.  Well, in your example it is :)

Generation of spill code is a dark corner of compiler construction.
As with many other compilation problems, there are a bunch of
NP-completeness results.  Unlike many other compilation problems, I'm
not aware of any published heuristics that works well in most cases.
Well, there is the paper by George and Appel where they use an ILP
solver (integer linear programming) to produce optimal spills, but
that's kind of expensive...

Array 4 MB size limit


Continuing the thread from last week, Frederick Akalin asked and Brian Hurt answered:
> I'm running into cases where the 4 MB limit on arrays is starting to become a 
> problem.  Lists are much slower and cause seg faults for me on the same data 
> set, and Bigarrays are a partial solution because I'd like to be able to 
> store arbitrary types in arrays (not just numbers).
> I was greatly surprised when I found out there was such a low limit on 
> arrays.  Is there a reason for this?  Will this limit ever be increased?

It comes from how arrays are stored in memory.  Every boxed value has a 
"tag word", which, among other uses, is used by the GC to figure out how 
big the object is (i.e. where the next object starts).  Now, the encoding 
scheme of this tag word is fairly complicated, but it works out that for 
arrays, 10 bits of the 32 bits is used for other stuff, leaving 22 bits 
for the size.  This limits the size of an array to 4M-1 elements 
(actually, it limits the size of the array to 16M-4 bytes, as size is 
measured in words- which is why arrays of floats are limited to 2M-1 
entries- each float takes up 8 bytes aka 2 words).

When you move to 64 bits, the tag word doubles in size, but the amount of 
"other" information in the tag word doesn't- this means that suddenly you 
have 52 bits of size, or 4T arrays.  And now floats only take up one word, 
not two, so they too can have 4T arrays.

The array of array idea someone brought up is a good one.  There's a 
slight performance hit, but not much.  I'd keep the top level array short- 
1K entries is enough.  The advantage of this is that if you're heavily 
using the array, the top level array will stay in cache (possibly even L1 
cache), meaning the main cost of an array access only goes up by about 10% 
or so, maybe less (maybe 1% if it stays in L1 cache).  The reasoning goes 
like this: the largest cost by far of an array access on a large array is 
the cache miss.  Doing a memory reference that is satisified out of L1 
cache generally takes 1-4 clock cycles.  If you have to go to L2 cache, 
that's going to take 10-40 clock cycles (approximately).  Going to main 
memory?  100-350+ clock cycles.  With a large array, you're likely going 
to take the 100-350+ clock cycle hit anyways. If the top level array is in 
L1 cache, you're only adding 1-4 clockcycles to the 100-350+ clock cycle 
hit you're going to take anyways.

Personally, I think PCs should have gone 64 bit circa 1997 or so.  Once we 
started getting hard disks large enough that we couldn't mmap the entire 
hard disk, we started hitting problems.  Long long and the large file 
interfaces are examples of these problems.  The longer we go- and the 
larger memories and hard disks get, the bigger the problems get. 
Microsoft seems to be the main impediment, now that AMD has forced Intel 
to get off the stick.

> If I had a record type with 5 floats, 
> will the limit then by Sys.max_array_size / 10?

Within the record, the floats are unboxed (assuming you didn't have any 
non-float elements).  This means the floats are stored directly in the 
record, and that the record doesn't hold pointers to the floats.  But the 
record itself is boxed- which means that an array of these records is 
really an array of pointers to these records, meaning that you're back at 
the 4M-1 limit.

Note that a record of 5 floats costs 44 bytes (40 bytes for the 5 floats + 
4 bytes for the tag word).  Assuming no records are stored in more than 
one location, the total memory cost of an array of 4M-1 of them is 16M for 
the array, plus (4M-1)*44 for the records, for a total of 192M-44 bytes. 
This is almost 10% of your available memory space right there.

> Is there some sort of 
> existing ArrayList module that works around this problem?  Ideally, I'd like 
> to have something like C++'s std::vector<> type, which can be dynamically 
> resized.

Ocaml can't dynamically resize arrays.  This gets tricky to do when arrays 
get large.  At the extreme, if the array takes up 50% + 1 byte of the 
total address space, you can't resize it to be any larger- to resize 
requires you to copy it, which takes more memory than you have.  I've 
seend problems well before then.

> Also, the fact that using lists crashes for the same data set is surprising. 
> Is there a similar hard limit for lists, or would this be a bug?  Should I 
> post a test case?

I'm willing to bet dollars to donuts that the problem you hit with lists 
was stack overflow.  Without signifigantly impacting performance there 
isn't a whole lot Ocaml can do about detecting stack overflow before the 
segfault hits.  My general rule is, if it's going to contain more than a 
few dozen items, it probably shouldn't be a list.  Extlib's library is 
less prone to this error, but you can still run into serious problems with 
long lists.
Xavier Leroy also answered:
> I was greatly surprised when I found out there was such a low limit on
> arrays.  Is there a reason for this?  Will this limit ever be increased?

As Brian Hurt explained, this limit comes from the fact that heap
object sizes are stored in N-10 bits, where N is the bit width of the
processor (32 or 64).

Historical digression: this representation decision was initially
taken when designing Caml Light in 1989-1990.  At that time, even
professional workstations had 16 M of RAM at best, so limiting arrays to
4M elements was reasonable.  The decision was then reconsidered in
1995 during the redesign that led to OCaml.  At that time,
64-bit architectures were all the rage: OCaml was actually implemented
on a 64-bit Alpha, and only then backported to 32-bit machines.
So, the original header format was kept, since it makes complete sense
on a 64-bit architecture.  Little did I know that the 32-bitters would
survive so long.  Now, it's 2006, and 64-bit processors are becoming
universally available, in desktop machines at least.  (I've been
running an AMD64 PC at home since january 2005 with absolutely zero
problems.)  So, no the data representations of OCaml are not going to
change to lift the array size limit on 32-bit machines.

> Is the limit a limit on the number of elements or the total size?  The
> language in Sys.max_array_size implies the former, but the fact the
> limit is halved for floats implies the latter.  If I had a record type
> with 5 floats, will the limit then by Sys.max_array_size / 10?

No.  In general, Caml arrays are not unboxed, meaning that your array
of 5-float records is actually an array of pointers to individual
blocks holding 5 floats.  The only exception is for arrays of floats,
which are unboxed.

> Is there
> some sort of existing ArrayList module that works around this problem?
> Ideally, I'd like to have something like C++'s std::vector<> type, which
> can be dynamically resized.  Do I have to write my own? :(

A better idea would be to determine exactly what data structure you need:
which abstract operations, what performance requirements, etc.  C++
and Lisp programmers tend to encode everything as arrays or lists,
respectively, but quite often these are not the best data structure
for the application of interest.

> Also, the fact that using lists crashes for the same data set is
> surprising.  Is there a similar hard limit for lists, or would this be a
> bug?  Should I post a test case?

Depends on the platform you use.  In principle, Caml should report
stack overflows cleanly, by throwing a Stack_overflow exception.
However, this is hard to do in native code as it depends a lot on the
processor and OS used.  So, some combinations (e.g. x86/Linux) will
report stack overflows via an exception, and others will let the
kernel generate a segfault.

If you're getting the segfault under x86/Linux for instance, please
post a test case on the bug tracking system.  It's high time that
Damien shaves 
Nicolas Cannasse said and Xavier Leroy answered:
> A segfault will happen on Windows/MSVC port. I also found some cases
> where the commandline program (the haXe compiler in that case) just
> silently exited on Stack Overflow (exit code was not 0 but no error or
> "program error" infamous message box was displayed).
> I think that there is some MSVC specific C extension for catching such
> stack overflows (__try / __except*). It would be nice to have such a
> handling at the ocaml toplevel that would at least exit with a
> meaningful error message. I don't care so much in my case since I'm not
> using C code a lot, I know for sure that any crash/early abort is
> indeeed a stack overflow. That might not be the case for all Win32 users.
> Best,
> Nicolas
> *

I'm aware of __try/__finally, but a bit of experimentation suggested
that it doesn't play nice with OCaml's exception handling.  There is a
lower-level API whose name I cannot remember that might be usable
within Caml, but I didn't pursue this.  Windows users are welcome to
contribute suggestions or code.

HOL Light version 2.20 now available


John Harrison announced:
I'm pleased to announce the release of version 2.20 of the HOL
Light theorem prover, available now from the HOL Light homepage:

Since the last release, the system has been cleaned up, with many
"internal" functions deleted or hidden. There is now a reference
manual, available as crosslinked HTML or as a single PDF file, that
documents all the inference rules, conversions, tactics and utility
functions defined in the system:

Besides this consolidation and documentation, there are several new
features, improvements and bugfixes. Those I consider most important
are listed below.



 * Better infrastructure for the integers including automation
   of divisibility properties (more details below).

 * Much improved performance of ARITH_RULE on problems with many
   instances of cutoff subtraction.

 * Much improved performance of the ring and field functions on
   large problems.

 * Generalized beta conversion (e.g. (\(x,y). x + y) (1,2) = 1 + 2)
   is now built into the default "rewrites"

 * Online help is available (do ``help "ident";;'')

 * Type abbreviations are supported (see "new_type_abbrev")

 * Binary numerals are also accepted ("0b101010")

 * "define_finite_type", for defining indexing types, has been


Two constant name changes:

  dest_int -> real_of_int
  mk_int   -> int_of_real

the renaming of these utility functions:

  is_beq                -> is_iff
  [dest|is|mk]_intconst -> [dest|is|mk]_realintconst
  upto                  -> --
  gather                -> filter

and the following theorem renamings (some others about iterated
operations had a few redundant hypotheses removed)

 [N]SUM_CMUL -> [N]SUM_LMUL (and there's an _RMUL version too)


The type `:int` was formerly a poor relation of `:num` and `:real`, with
rather less in the way of syntax operations and automated support. This
has mostly changed. There are now syntax operations like "mk_intconst",
arithmetic conversions like "INT_ADD_CONV" and "INT_REDUCE_CONV" and an
instantiation of the ring procedure, "INT_RING".

The most basic divisibility notions "divides", "coprime" and "gcd" have
been defined on both integers and natural numbers in the core. Few
lemmas about them have been provided, but there are some very useful
(though hacky and brittle) automated proof procedures that can
generate many routine lemmas about divisibility notions automatically:
NUMBER_RULE and NUMBER_TAC (over the natural numbers) and INTEGER_RULE
and INTEGER_TAC (over the integers). For example, over the natural

  NUMBER_RULE `coprime(x * y,x EXP 2 + y EXP 2) <=> coprime(x,y)`;;

  NUMBER_RULE `!d a b:num. d divides (a * b) /\ coprime(d,a) ==> d divides b`;;

  NUMBER_RULE `~(gcd(a,b) = 0) /\ a = a' * gcd(a,b) /\ b = b' * gcd(a,b)
               ==> coprime(a',b')`

  NUMBER_RULE `!a x y:num.
         coprime(a,n) ==> ((x * a == y * a) (mod n) <=> (x == y) (mod n))`;;

Over the integers, even some existential goals can be solved, e.g. the
2-variable Chinese remainder theorem. This works less well over the
natural numbers because of the positivity constraints, though I plan to
fix this eventually:

   `!a b u v:int. coprime(a,b) ==> ?x. (x == u) (mod a) /\ (x == v) (mod b)`;;

   `gcd(a,n) divides b ==> ?x:int. (a * x == b) (mod n)`;;


  Examples/       --- binary expansion of numbers
  Examples/       --- combinatory logic (port of old HOL88 example)
  Examples/          --- basic Symbolic Trajectory Evaluation theory
  Multivariate/ --- multivectors with outer and geometric product
  100/*.ml                 --- Some of the "great 100 theorems"

The last bunch of examples are HOL Light formalizations of some of the
theorems from the following list:

I believe that all the ones done in HOL Light are either in this
directory or somewhere in the standard core or libraries.

ocamldebug and abstract record types


Frederick Akalin asked and Jacques Garrigue answered:
> At the risk of starting another flamewar, I have another question.  I  
> notice that if I have a record type that's defined in an .mli file, I  
> am able to print objects of that type and see its contents in the  
> debugger.  However, if I make that type abstract (only defining it in  
> the .ml file), I am unable to do so, instead seeing "<abstr>", unless  
> I am in the .ml file where the type is defined.
> Surely this information is available to the debugger from anywhere in  
> the program?  Currently I make most of my types non-abstract simply  
> because it is impossible to debug my programs without being able to  
> check record contents, a practice which I would like to avoid having  
> to do.  I am used to gdb, which prints out all a struct's contents  
> regardless of protected/private modifiers, or Perl, which provides a  
> similar facility with the Data::Dumper module.

To be more precise, the debugger uses information from the .cmi file
to print values. The information is not included in the .cmo file,
even with the -g option. Printing automatically abstract type would
require a new infrastructure, saving type definitions in the .cmo.

Note that an intermediate step would be to declare your definitions as
private: they become semi-abstract (you cannot create values from
outside the module), but you can still print them.

> I am aware that the debugger lets you load printing functions, a  
> facility which I am now exploring.  Although my first impression is  
> that it requires a non-trivial amount of work, both on the coding and  
> debugging side.

This requires some work, but not as difficult as it may seem.
The pretty-printer is very easy to use, particularly Format.fprintf.

Using folding to read the cwn in vim 6+

Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).

:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'&lt;1':1

If you know of a better way, please let me know.

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.

Alan Schmitt