Hello, Here is the latest Caml Weekly News, week 25 March to 01 April, 2003. 1) New PXP mailing list 2) Recursive types and functors 3) OCaml performance 4) Our shrinking Humps ====================================================================== 1) New PXP mailing list ---------------------------------------------------------------------- Blair Zajac announced: I've set up a new Mailman mailing list for Ocaml's PXP, the Polymorphic XML Parser. This list has Gerd's backing. The intention of this list is to be a place to discuss using and improving PXP, reporting bugs, and the other things you would expect from a mailing list for PXP. The mailing list is managed with Mailman 2.1.1. It's home page is http://www.orcaware.com/mailman/listinfo/ocaml-pxp-users and the email address of the list is ocaml-pxp-users@orcaware.com You can subscribe by going to http://www.orcaware.com/mailman/listinfo/ocaml-pxp-users or by sending an email to ocaml-pxp-users-join@orcaware.com ====================================================================== 2) Recursive types and functors ---------------------------------------------------------------------- David Brown asked and Jean-Christophe Filliatre discussed: (the message contains some code, available at http://caml.inria.fr/archives/200303/msg00352.html) > I have a recursive type where I'd like one of the constructors of the > type to contain a set of the type (or something like set). However, I > can't figure out how to represent this. > > For example: > > type foo = > | Integer of int > | String of string > | Set of FooSet > > module FooSet = Set.Make (struct type t = foo let compare = compare end) > > but this obviously doesn't work. I'm pretty sure this has already been discussed on this list, but I couldn't find the related thread in the archives... A (too) naive solution could be to make a polymorphic instance of the Set module (either by adding an argument 'a everywhere in signatures OrderedType and S, or by copying the functor body and replacing Ord.compare by compare); then you have polymorphic sets, say 'a Set.t, balanced using compare, and you can define type foo = Integer of int | ... | Set of foo Set.t Unfortunately this doesn't work because sets themselves shouldn't be compared with compare, but with Set.compare (see set.mli). And then you point out the main difficulty: comparing values in type foo requires to be able to compare sets of foo, and comparing sets requires to *implement* sets and thus to compare values in foo. Fortunately, there is another solution (though a bit more complex). First we define a more generic type 'a foo where 'a will be substituted later by sets of foo: type 'a foo = Integer of int | ... | Set of 'a Then we implement a variant of module Set which implements sets given the following signature: module type OrderedType = sig type 'a t val compare: ('a -> 'a -> int) -> 'a t -> 'a t -> int end that is where elements are in the polymorphic type 'a t and where the comparison function depends on a comparison function for arguments in 'a (which will represent the sets, in fine). The functor implements a type t for sets using balanced trees, as usual, and defines the type of elements elt to be t Ord.t: module Make(Ord: OrderedType) = struct type elt = t Ord.t and t = Empty | Node of t * elt * t * int Right after, it implements comparison over elements and sets in a mutually recursive way: let rec compare_elt x y = Ord.compare compare x y and compare = ... (usual comparison of sets, using compare_elt) The remaining of the functor is exactly the same as for Set, with compare_elt used instead of Ord.compare. I attach the implementation of this module. There is (at least) another solution: to use a set implementation where comparison does not require a comparison of elements. This is possible if, for instance, you are performing hash-consing on type foo (which result in tagging foo values with integers, then used in the comparison). This solution is used in Claude Marché's regexp library (http://www.lri.fr/~marche/regexp/) and uses a hash-consing technique available here: http://www.lri.fr/~filliatr/software.en.html ====================================================================== 3) OCaml performance ---------------------------------------------------------------------- David Monniaux discussed Ocaml performance: Let me tell you about our experience here. We are developing a large program consisting of - a large part of Caml code handling complex data structures - a smaller C library handling certain numerical matrix computations that are triggered by the Caml code - some C (+ assembler) libraries dealing with system-dependent issues. I profiled the code using OProfile (http://oprofile.sourceforge.net), for expenses in clock cycles and cache faults. Earlier attempts were made with gprof. It turned out that we spent a significant amount of time in: - The Caml polymorphic compare function (15% time + some cache faults) Part of the problem seems to lie with the fact that the same function is called when comparing strings, int64's and other types, thus the processor has to do lots of tests and jumps just to get at the correct comparison function. Wouldn't it be reasonable to define String.compare and Int64.compare to call monomorphic functions? - The garbage collector (15% time + lots of cache faults) There's little we can do about it. Changing the size of the minor heap, adjusting it to optimize the use of L2 cache seems to gain 2.30% of the total running time. Curiously, using the compactor seems to slow things slightly. Would it be possible to optimize the GC cache-wise? For instance, have it ask the processor to "prefetch" data. - 17% in a particular matrix function written in C. There's little we can do except trying to optimize it carefully and compiling it with the best C compiler around. - The rest of the time is spent within the Caml code. Now this was a bit surprising to us, because we thought we spent far more time in the numerical computations. On a related subject, Issac Trotts asked and Xavier Leroy answered: (start of thread: http://caml.inria.fr/archives/200303/msg00372.html) > Here's a numerical mini-benchmark comparing C to OCaml > on a simple implementation of the Discrete Fourier Transform: > > http://redwood.ucdavis.edu/~issac/dft_compare.tar.gz > > The results on my 1 GHZ Pentium III Linux box: > > C: > real 0m21.273s > user 0m21.200s > sys 0m0.040s > > OCaml: > real 1m51.602s > user 1m47.020s > sys 0m0.260s > > So the C version was about five times as fast. This is after looking > for ideas > in the "Writing Efficient Numerical code in Objective Caml" page [1] > and the Great Language Shootout statistical moment page for OCaml [2]. > > The OCaml code was easier to read and debug, and would be > easier to modify. > > I'd be interested if anyone on this list knows of a way > to make it perform as well as the C version (without using the FFT.) > > [1] > http://216.239.53.100/search?q=cache:5YnsSStlWiAC:caml.inria.fr/ocaml/numerical.html+efficient+numerical+ocaml&hl=en&ie=UTF-8 > > [2] > http://www.bagley.org/~doug/shootout/bench/moments/moments.ocaml It can be done, but not on a Pentium 3. Here are my timings: Pentium 4 Pentium 4 SSE2 Alpha 21264 (2 GHz) (2 GHz) (500 MHz) C 20 20 36 OCaml (your code) 113 40 52 OCaml (variant 1) 90 26 40 OCaml (variant 2) 72 38 100 Variants 1 and 2 differ on the complex multiply step: Your code: let a2=c*. !a -. s*. !b and b2=c*. !b +. s*. !a in a := a2; b := b2; Variant 1: let x = s *. !a in a := c*. !a -. s*. !b; b := c*. !b +. x Variant 2: let olda = !a and oldb = !b in a := c *. olda -. s *. oldb; b := c *. oldb +. s *. olda The "Pentium 4 SSE2" column is an experimental code generator for the Pentium 4 that uses SSE2 instructions and registers for floating-point computations. (Before you ask: no, it's not publically available, but will be the basis for the x86_64 code generator as soon as the hardware becomes available.) As you can see above, variant 1 achieves almost the performance of C on platforms that have a regular register-based FP arithmetic unit. However, the x86 floating-point stack (what OCaml uses for compatibility with Pentium 3 and earlier processors) is notoriously cranky and hard to generate efficient code for. gcc manages to exploit instruction-level parallelism between the "re" and "im" computations via amazing feats (fxch instructions, etc), but the ocamlopt x86 code generator just generates very sequential code... So, unless you have an Alpha at hand, you'd better consider FFT. There's an FFT implementation that I use as a benchmark here: http://camlcvs.inria.fr/cgi-bin/cvsweb.cgi/ocaml/test/fft.ml and it delivers about 2/3 of the performances of C, even on the Pentium. ====================================================================== 4) Our shrinking Humps ---------------------------------------------------------------------- Sergey Goldgaber worried: The Caml Humps are just a list of links. There seems to be no real archive of contributed OCaml code. Because of this the community is losing these contributions as linked web pages disappear (either because of web-site reorganization, or because the people maintaining those personal web sites have moved on to something else). One example is Beno?t de Boursetty's PNM library, which doesn't seem to be at http://www.stud.enst.fr/~debourse/projects.html any longer, as the Humps maintain. This seems like a valuable library, of practical use to me right now. I could try to track down Beno?t de Boursetty, or ask about this particular library on this list, but that is not an effective long-term solution for every missing package case. There are also all sorts of other libraries and applications which are far too advanced for me to make use of right now, but which I could see myself using a year or two down the road. But with the web in flux the way it is, it is not wise to rely on any particular web page still being there after any length of time. I could go through and manually download every individual piece of software, but apart from being extraordinarily tedious, I would loose all of the Hump's wonderful organization and descriptions. I think what would be great if all of these packages were available in a centralized, mirrored repository available for download. That way, ideally, people would be able to get every available package and burn it to CDROM, distributing the entire archive for posterity. I know there has been talk of a CPAN-like service, and think that would be great as well. However, nothing so complex is needed for a simple centralized archive. And it is an archive that is needed more. Otherwise the community loses code. I wish I had a server and bandwidth to donate, or I would just do this myself. As it is I'm making an appeal to the community for solutions. If there are no individuals or corporate entities in the OCaml community who are willing/able to provide the required resources, perhaps we could look at something like ibiblio http://ibiblio.org/ Does anyone have any experience with this service? ====================================================================== Old cwn ---------------------------------------------------------------------- If you happen to miss a cwn, you can send me a message (alan.schmitt@inria.fr) and I'll mail it to you, or go take a look at the archive (http://pauillac.inria.fr/~aschmitt/cwn/). If you also wish to receive it every week by mail, just tell me so. ====================================================================== Alan Schmitt