Previous week Up Next week
Hello,
Here is the latest Caml Weekly News, week 25 March to 01 April, 2003.
1) New PXP mailing list
2) Recursive types and functors
3) OCaml performance
4) Our shrinking Humps
======================================================================
1) New PXP mailing list
----------------------------------------------------------------------
Blair Zajac announced:
I've set up a new Mailman mailing list for Ocaml's PXP, the Polymorphic
XML Parser. This list has Gerd's backing.
The intention of this list is to be a place to discuss using and
improving PXP, reporting bugs, and the other things you would expect
from a mailing list for PXP.
The mailing list is managed with Mailman 2.1.1. It's home page is
http://www.orcaware.com/mailman/listinfo/ocaml-pxp-users
and the email address of the list is
ocaml-pxp-users@orcaware.com
You can subscribe by going to
http://www.orcaware.com/mailman/listinfo/ocaml-pxp-users
or by sending an email to
ocaml-pxp-users-join@orcaware.com
======================================================================
2) Recursive types and functors
----------------------------------------------------------------------
David Brown asked and Jean-Christophe Filliatre discussed:
(the message contains some code, available at
http://caml.inria.fr/archives/200303/msg00352.html)
> I have a recursive type where I'd like one of the constructors of the
> type to contain a set of the type (or something like set). However, I
> can't figure out how to represent this.
>
> For example:
>
> type foo =
> | Integer of int
> | String of string
> | Set of FooSet
>
> module FooSet = Set.Make (struct type t = foo let compare = compare end)
>
> but this obviously doesn't work.
I'm pretty sure this has already been discussed on this list, but I
couldn't find the related thread in the archives...
A (too) naive solution could be to make a polymorphic instance of the
Set module (either by adding an argument 'a everywhere in signatures
OrderedType and S, or by copying the functor body and replacing
Ord.compare by compare); then you have polymorphic sets, say 'a Set.t,
balanced using compare, and you can define
type foo = Integer of int | ... | Set of foo Set.t
Unfortunately this doesn't work because sets themselves shouldn't be
compared with compare, but with Set.compare (see set.mli). And then
you point out the main difficulty: comparing values in type foo
requires to be able to compare sets of foo, and comparing sets
requires to *implement* sets and thus to compare values in foo.
Fortunately, there is another solution (though a bit more complex).
First we define a more generic type 'a foo where 'a will be
substituted later by sets of foo:
type 'a foo = Integer of int | ... | Set of 'a
Then we implement a variant of module Set which implements sets given
the following signature:
module type OrderedType =
sig
type 'a t
val compare: ('a -> 'a -> int) -> 'a t -> 'a t -> int
end
that is where elements are in the polymorphic type 'a t and where the
comparison function depends on a comparison function for arguments in
'a (which will represent the sets, in fine). The functor implements a
type t for sets using balanced trees, as usual, and defines the type
of elements elt to be t Ord.t:
module Make(Ord: OrderedType) =
struct
type elt = t Ord.t
and t = Empty | Node of t * elt * t * int
Right after, it implements comparison over elements and sets in a
mutually recursive way:
let rec compare_elt x y =
Ord.compare compare x y
and compare = ... (usual comparison of sets, using compare_elt)
The remaining of the functor is exactly the same as for Set, with
compare_elt used instead of Ord.compare. I attach the implementation
of this module.
There is (at least) another solution: to use a set implementation
where comparison does not require a comparison of elements. This is
possible if, for instance, you are performing hash-consing on type foo
(which result in tagging foo values with integers, then used in the
comparison). This solution is used in Claude Marché's regexp library
(http://www.lri.fr/~marche/regexp/) and uses a hash-consing technique
available here: http://www.lri.fr/~filliatr/software.en.html
======================================================================
3) OCaml performance
----------------------------------------------------------------------
David Monniaux discussed Ocaml performance:
Let me tell you about our experience here. We are developing a large
program consisting of
- a large part of Caml code handling complex data structures
- a smaller C library handling certain numerical matrix computations that
are triggered by the Caml code
- some C (+ assembler) libraries dealing with system-dependent issues.
I profiled the code using OProfile (http://oprofile.sourceforge.net), for
expenses in clock cycles and cache faults. Earlier attempts were made with
gprof.
It turned out that we spent a significant amount of time in:
- The Caml polymorphic compare function (15% time + some cache faults)
Part of the problem seems to lie with the fact that the same function is
called when comparing strings, int64's and other types, thus the
processor has to do lots of tests and jumps just to get at the correct
comparison function.
Wouldn't it be reasonable to define String.compare and Int64.compare to
call monomorphic functions?
- The garbage collector (15% time + lots of cache faults)
There's little we can do about it. Changing the size of the minor heap,
adjusting it to optimize the use of L2 cache seems to gain 2.30% of the
total running time.
Curiously, using the compactor seems to slow things slightly.
Would it be possible to optimize the GC cache-wise? For instance, have
it ask the processor to "prefetch" data.
- 17% in a particular matrix function written in C. There's little we can
do except trying to optimize it carefully and compiling it with the best
C compiler around.
- The rest of the time is spent within the Caml code.
Now this was a bit surprising to us, because we thought we spent far more
time in the numerical computations.
On a related subject, Issac Trotts asked and Xavier Leroy answered:
(start of thread: http://caml.inria.fr/archives/200303/msg00372.html)
> Here's a numerical mini-benchmark comparing C to OCaml
> on a simple implementation of the Discrete Fourier Transform:
>
> http://redwood.ucdavis.edu/~issac/dft_compare.tar.gz
>
> The results on my 1 GHZ Pentium III Linux box:
>
> C:
> real 0m21.273s
> user 0m21.200s
> sys 0m0.040s
>
> OCaml:
> real 1m51.602s
> user 1m47.020s
> sys 0m0.260s
>
> So the C version was about five times as fast. This is after looking
> for ideas
> in the "Writing Efficient Numerical code in Objective Caml" page [1]
> and the Great Language Shootout statistical moment page for OCaml [2].
>
> The OCaml code was easier to read and debug, and would be
> easier to modify.
>
> I'd be interested if anyone on this list knows of a way
> to make it perform as well as the C version (without using the FFT.)
>
> [1]
> http://216.239.53.100/search?q=cache:5YnsSStlWiAC:caml.inria.fr/ocaml/numerical.html+efficient+numerical+ocaml&hl=en&ie=UTF-8
>
> [2]
> http://www.bagley.org/~doug/shootout/bench/moments/moments.ocaml
It can be done, but not on a Pentium 3. Here are my timings:
Pentium 4 Pentium 4 SSE2 Alpha 21264
(2 GHz) (2 GHz) (500 MHz)
C 20 20 36
OCaml (your code) 113 40 52
OCaml (variant 1) 90 26 40
OCaml (variant 2) 72 38 100
Variants 1 and 2 differ on the complex multiply step:
Your code:
let a2=c*. !a -. s*. !b
and b2=c*. !b +. s*. !a in
a := a2;
b := b2;
Variant 1:
let x = s *. !a in
a := c*. !a -. s*. !b;
b := c*. !b +. x
Variant 2:
let olda = !a and oldb = !b in
a := c *. olda -. s *. oldb;
b := c *. oldb +. s *. olda
The "Pentium 4 SSE2" column is an experimental code generator for the
Pentium 4 that uses SSE2 instructions and registers for floating-point
computations. (Before you ask: no, it's not publically available,
but will be the basis for the x86_64 code generator as soon as the
hardware becomes available.)
As you can see above, variant 1 achieves almost the performance of C
on platforms that have a regular register-based FP arithmetic unit.
However, the x86 floating-point stack (what OCaml uses for
compatibility with Pentium 3 and earlier processors) is notoriously
cranky and hard to generate efficient code for. gcc manages to
exploit instruction-level parallelism between the "re" and "im"
computations via amazing feats (fxch instructions, etc), but the
ocamlopt x86 code generator just generates very sequential code...
So, unless you have an Alpha at hand, you'd better consider FFT.
There's an FFT implementation that I use as a benchmark here:
http://camlcvs.inria.fr/cgi-bin/cvsweb.cgi/ocaml/test/fft.ml
and it delivers about 2/3 of the performances of C, even on the Pentium.
======================================================================
4) Our shrinking Humps
----------------------------------------------------------------------
Sergey Goldgaber worried:
The Caml Humps are just a list of links. There seems to be no real
archive of contributed OCaml code. Because of this the community is
losing these contributions as linked web pages disappear (either
because of web-site reorganization, or because the people maintaining
those personal web sites have moved on to something else).
One example is Beno?t de Boursetty's PNM library, which doesn't seem to
be at http://www.stud.enst.fr/~debourse/projects.html any longer, as
the Humps maintain. This seems like a valuable library, of practical
use to me right now. I could try to track down Beno?t de Boursetty, or
ask about this particular library on this list, but that is not an
effective long-term solution for every missing package case.
There are also all sorts of other libraries and applications which are
far too advanced for me to make use of right now, but which I could see
myself using a year or two down the road. But with the web in flux the
way it is, it is not wise to rely on any particular web page still
being there after any length of time.
I could go through and manually download every individual piece of
software, but apart from being extraordinarily tedious, I would loose
all of the Hump's wonderful organization and descriptions.
I think what would be great if all of these packages were available in
a centralized, mirrored repository available for download. That way,
ideally, people would be able to get every available package and burn
it to CDROM, distributing the entire archive for posterity.
I know there has been talk of a CPAN-like service, and think that would
be great as well. However, nothing so complex is needed for a simple
centralized archive. And it is an archive that is needed more.
Otherwise the community loses code.
I wish I had a server and bandwidth to donate, or I would just do this
myself. As it is I'm making an appeal to the community for solutions.
If there are no individuals or corporate entities in the OCaml
community who are willing/able to provide the required resources,
perhaps we could look at something like ibiblio http://ibiblio.org/
Does anyone have any experience with this service?
======================================================================
Old cwn
----------------------------------------------------------------------
If you happen to miss a cwn, you can send me a message
(alan.schmitt@inria.fr) and I'll mail it to you, or go take a look at
the archive (http://pauillac.inria.fr/~aschmitt/cwn/). If you also wish
to receive it every week by mail, just tell me so.
======================================================================
Alan Schmitt