Previous week   Up   Next week
Hello,

Here is the latest Caml Weekly News, week 07 to 14 January, 2003.

1) Graph data structures in Baire
2) Memory management dominates running time
3) GlSurf 1.2 available
4) LablGL 0.99
5) Lambda Calculus
6) otags 3.06.6
7) New Introductory book on Functional programming, using OCaml (in Italian)

======================================================================
1) Graph data structures in Baire
----------------------------------------------------------------------
Diego Olivier Fernandez Pons announced:

Baire now contains some data structures for graphs (both ephimeral and
persistent). You will find them in the download section.
http://www.edite-de-paris.com.fr/~fernandz/Caml/Baire/index.html

======================================================================
2) Memory management dominates running time
----------------------------------------------------------------------
Jörgen Gustavsson asked and Damien Doligez lectured:

>I profiled my program with gprof on a moderately sized input which gave
>the following top ten time consumers.
>
>   %  cumulative    self              self    total
> time   seconds   seconds    calls  ms/call  ms/call name
> 55.5      63.29    63.29                            fl_allocate [1]
> 11.0      75.79    12.50                            mark_slice [2]
>  5.7      82.32     6.53                            sweep_slice [3]
>  4.0      86.91     4.59                            _brk_unlocked [4]
>  2.5      89.80     2.89                            oldify_one [5]
>  2.3      92.42     2.62                            oldify_mopup [6]
>  1.6      94.23     1.81                            modify [7]
>  1.5      95.90     1.67                            alloc_shr [8]
>  0.9      96.91     1.01                            allocate_block [9]
>  0.9      97.90     0.99                            compare_val [10]

fl_allocate, alloc_shr, and allocate_block are O'Caml allocation
functions for the major heap.
mark_slice and sweep_slice are the major collector.
brk_unlocked is a C library function, I'm surprised to see it use so
much run time.
oldify_one and oldify_mopup are the minor collector.
modify is the write barrier that is called when an assignment is done.


>Unfortunately it gets worse when we increase the input size.

This behaviour is a bit unusual.


>I am not an expert on garbage collection but the only explanation that
>I can see for this behaviour is that the free list that fl_allocate
>searches grows as the program runs. If there are lots of small blocks
>in the begining of the free list then fl_allocate potentially has to
>search very far when it allocates somewhat larger blocks.
>
>This situation can occur, for example, if the program allocates many
>small blocks in the begining which are then reclaimed by the garbage
>collector and then later the program allocates mostly larger blocks.

This phenomenon is called "fragmentation" and O'Caml does a few things
to prevent fragmentation-related problems:

1. Whenever two free blocks are adjacent, they are merged into one
   bigger block.
2. The free list is not searched from the beginning every time, because
   that would lead to an accumulation of small blocks at the beginning
   of the list, even under normal conditions.  Instead, each search
   starts where the previous one stopped, so that no block is examined
   more often than any other.
3. The heap compaction algorithm is designed to remove all fragmentation
   from the free list by moving objects around to make sure that all
   free blocks are adjacent.
4. When fragmentation is too high, the typical behaviour is that the
   allocation requests cannot be satisfied from the free list, and
   the heap size has to be increased indefinitely.  The GC can detect
   when this is the case, and automatically trigger a heap compaction.

It seems that you have found a case where this strategy fails to detect
a situation of heavy fragmentation.


>One reason for why I doubt it, is that if it is correct then the ocaml
>memory management can asymptoticly dominate the running time. Is it 
>really
>so?

It looks surprising, but we don't have an analysis of the run-time
cost of memory management, and such an analysis would be quite hard
to do precisely.


>Finally, what can I do about this problem?

You can have good control of the GC parameters through the Gc module
of the standard library.  The things you should try are:
1. Increase the minor heap size.  This will let the minor GC deallocate
   more blocks, and thus reduce the load of the major GC.  For this
   parameter, the point of diminishing returns depends quite a lot
   on the allocation behaviour of your program.
2. Increase the space overhead.  The major GC has a trade-off between
   space and time.  If you increase the space overhead, your program
   will need more memory, but the major GC load will decrease.
   Reasonable values for this parameter are between 50 and 500.
3. Decrease max_overhead.  This will trigger more automatic heap
   compactions, thus reducing fragmentation.  Be aware, however,
   that heap compaction is costly and not incremental, so it will
   pause the execution of your program (for up to a few seconds).
   This can be a problem if your program is real-time or interactive.
4. Call Gc.compact from your program.  This is a good idea if you
   can easily identify in your program the point where it switches
   from allocating small blocks to allocating big blocks.

If none of the above helps, I would be interested in getting a copy
of your program, to see what is going on in more detail.  After all,
the behaviour that you are observing might simply be caused by a bug.

======================================================================
3) GlSurf 1.2 available
----------------------------------------------------------------------
Christophe Raffalli announced:

I am pleased to annouce that GlSurf 1.2 is available at
http://www.lama.univ-savoie.fr/~raffalli/glsurf.html

GlSurf is a program to draw surface (implicit surface)
using OpenGl.

There is now a "wishes list" on the web page ... so if
OCaml programmer wants to contribute ! This kind of programming
is fun :-)

--
Version 1.2:
        - bug when free variables are used in "surface".
        - new algorithm to extract triangles from cubes. Between two   
        and three times less triangles with the same cubes !
        on test9:

               |  version 1.1 | version 1.2 |
        -------------------------------------
        time   |  40.36       | 19.64       |
        nb tri | ~125000      | ~45000      |
        -------------------------------------

        - simplification of the algorithm (a lot of useless tests
        removed.
        - more parameters can be set from the script (see the
        documentation.
        - added "p" key to print the eye position.
        - fixed a display bug when changing the size parameter
        between two drawings.

Version 1.1:
        - Removed a last minute simplification which was in fact a  
        bug producing wrong topology (visible in examples/test7);
        - Simplification of the test to divide cubes and the root_in_cube
        function (a little less triangles in general).
        - Fixed a bug in root_in_cube (the diagonals were missing).
        - Fixed typo in examples/test10.

Version 1.0:
        Initial version

======================================================================
4) LablGL 0.99
----------------------------------------------------------------------
Jacques Garrigue announced:

Here is a new release of

             The LablGL interface to openGL, version 0.99

Changes with respect to 0.98 are only minor:

* add texture binding functions, contributed by Chris Hecker

* add support for Tcl/Tk8.4

* allow compiling and installing without Tk

This feature have been requested, so this seemed a good idea to
release them.
There is no new binary release for windows, as the need is rather
small.

You can find it at:
        http://wwwfun.kurims.kyoto-u.ac.jp/soft/olabl/lablgl.html

======================================================================
5) Lambda Calculus
----------------------------------------------------------------------
Gérard Huet announced:

Everything you wanted to know about lambda-calculus - but were afraid
to ask.
Nicely packaged as a set of executable Ocaml modules in limpid Pidgin
ML syntax.

Learn how to program in pure lambda calculus the slowest quicksort in
the world:

# let _L=list[3;2;5;1] in normal_list<<(^Quicksort ^L)>>;
- : list int = [1; 2; 3; 5]

Freak out with the Böhm-out technique and amaze your local theory guru
by challenging him to separate Quicksort and Factorial:

# bohm(_Quicksort,_Fact);
- : list Term.term = [[x0,x1,x2,x3,x4]x3; [x0,x1]x0; [x0,x1]x0; [x0]x0]

Learn Recursion theory in one easy lesson, and surprise your friends by
revealing that
Kleene is Church composed with Gödel :

value kleene t = church (godel t);

Win fortunes at TV Quiz shows by computing Gödel numbers faster than
lightning:

# godel _Fix = 6941718342796165477078794502929179108365127687513804648;
- : bool = True

All this and more in the Constructive Computation Theory course,
available in the Ocaml Hump:

http://caml.inria.fr/humps/

or directly at:
http://pauillac.inria.fr/~huet/CCT/

======================================================================
6) otags 3.06.6
----------------------------------------------------------------------
Cuihtlauac Alvarado announced:

Hi camlers,

Insignificant release: added GPL "COPYING" file, forgot for a while.

http://perso.wanadoo.fr/cuihtlauac.alvarado/otags-3.06.6.tar.gz

======================================================================
7) New Introductory book on Functional programming, using OCaml (in Italian)
----------------------------------------------------------------------
Marta Cialdea Mayer and Carla Limongelli announced:

About the book

As far as we konw, this is the first book on OCaml written in
Italian

The book gives an introduction both to basic concepts and techniques
of functional programming, and to its theoretical foundation.

In the first part of the book Objective Caml is used to present the
principles of functional programming.  Particular attention is devoted
to higher order functions: they are introduced in the first chapter.
Moreover recursion and basic data structures are presented, and the
relation among inductive data types, recursive definitions and
inductive proofs is highlighted.  The OCaml module system is also
introduced.

The second part shows the fundation of the functional paradigm.  It
contains an introduction to pure and applied lambda calculus, and a
brief account of its role in defining the concepts of computability
and decidability. The type system of languages in the ML class is also
discussed, and the concept of evaluation in an environment, together
with the difference between static and dynamic scope of declarations.

The book is the result of about ten years teaching experience in
basic computer science courses at the University of "Roma Tre" in Rome.

About the authors: Marta Cialdea Mayer is associate professor at the
Faculty of Engineering of "Roma Tre" University, where she holds
courses on Artificial Intelligence and Functional Programming.  Carla
Limongelli is Research Associate at the same Faculty, where she
presently hold the course of Algorithms and Data Structures.

======================================================================
Old cwn
----------------------------------------------------------------------

If you happen to miss a cwn, you can send me a message
(alan.schmitt@inria.fr) and I'll mail it to you, or go take a look at
the archive (http://pauillac.inria.fr/~aschmitt/cwn/). If you also wish
to receive it every week by mail, just tell me so.

======================================================================

Alan Schmitt