Caml Weekly News Previous week Up Next week

Hello

Here is the latest Caml Weekly News, for the week of March 14 to 21, 2006.

  1. Camomile-0.6.5
  2. Post-doc offer
  3. Stroustrup et al. propose to introduce "lambda closures" in C++
  4. PhD position
  5. Severe loss of performance due to new signal handling

Camomile-0.6.5

Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32571

Yoriyuki Yamagata announced:
I'm pleased to announce Camomile-0.6.5, a comprehensive Unicode
library for OCaml.  This is a bug fix release.  Up to previous
version, collation rules without a header was just ignored.
This problem affects, for example, Scandinavian languages.  This
release fixes this problem.

You can download the new release from
http://sourceforge.net/projects/camomile

In addition, I'm looking for a maintainer for Godi-package for
Camomile.  Or someone already adopted it?  I'm too busy, and besides,
I almost forget how to make Godi-package.
        

Post-doc offer

Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32586

Luc Maranget announced:
Our team (Moscova, Inria Rocquencourt, France) proposes a
post-doctoral fellow position.

Basically, the work we propose is first  to release the new version of
JoCaml, our extension of Objective Caml with Join-calculus primitives,
and then to study and implement en extension to the existing
prototype, such as type safe serialization.

Application deadline is March 30th.

Candidates must have defended their thesis before September 1, 2006.

More details on the application process are available at
http://www.inria.fr/travailler/opportunites/postdoc/postdoc.en.html
A short description of the position is given at the end of this mail.

In any case, candidates should contact me (Luc.Maranget@inria.fr)

--Luc Maranget

==================================================
Position description

* Environnement
Jocaml is an extension of Ocaml based on the Join-calculus,
see http://join.inria.fr. There are already two releases of the
Join-calculus and Jocaml. A third version, simplified and better
integrated to the Ocaml compiler, is almost completed.
It remains to seriously test the prototype, to write some examples
and documentation.

Additionally we plan to integrate some some important extensions, such
as type safe marshalling or matching by regular term expressions.

* Missions
With Moscova project-team, the post-doctorant will test and this new
release of Jocaml, and set its diffusion on the web. The
post-doctorant will probably have enough time to study and
implement an extension of JoCaml, such as type safe marshalling or
matching by regular term expressions.

* Ideal candidate
The post-doctorant should have a good knowledge of the programming
languages theory, and, if possible, of concurrency theory. The work
will also consist in a good part of programming within distributed
systems. This work is adequate for an expert in theory who wants to
have some practical experience, or for a very good programmer (in
operating systems or compilers) who wants to gain experience in
programming under control of theory.
        

Stroustrup et al. propose to introduce "lambda closures" in C++

Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32588

Eijiro Sumii said:
A friend of mine informed me of this report

    Lambda expressions and closures for C++
    Jeremiah Willcock, Jaakko Jarvi, Doug Gregor, Bjarne Stroustrup,
Andrew Lumsdaine
    2006-02-26
    http://public.research.att.com/~bs/N1968-lambda-expressions.pdf

and I thought you might be interested.  (I searched a little and
didn't find any discussion on this report in this list.)

A few highlights:

----------------------------------------------------------------------

    We propose to extend the C++ language with lambda expressions, and
define the semantics of these unnamed local functions via translation
to closures: function objects implemented using local classes.

...

    void f() {
      int sum = 0;
      for each(a.begin(), a.end(),
        <>(int x) -> int extern(sum) {return sum += x;});
    }

...

    2.1 Omitting the return type
    The return type of a lambda expression can be omitted if the body
of the lambda function contains at most one return statement.

----------------------------------------------------------------------
        

PhD position

Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32603

Julia Lawall announced:
PhD position: the Coccinelle framework for evolution of Linux device drivers

Description of the position:

A full 3-year PhD scholarship is available at the University of Copenhagen
(DIKU) to work on the design and development of the Coccinelle framework
for evolution of Linux device drivers.

The candidate should have a strong interest in most of the following:

    * Programming language design
    * Programming language semantics
    * Program analysis
    * Compilation
    * Operating system design
    * Device drivers

Tools related to Coccinelle are being developed using OCaml.

A Masters degree is required to start the position, but need not be
completed by the time of the application.

The deadline for the application is April 14, 2006 at 12pm (noon), Danish
time.

More information about the position and the application process is
available at: http://www.diku.dk/~julia/announcement.pdf

More information about the project is available at:
http://www.emn.fr/x-info/coccinelle/

All interested candidates irrespective of age, gender, race, religion or
ethnic background are invited to apply.  Knowledge of Danish is not
required.
        

Severe loss of performance due to new signal handling

Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32592

Markus Mottl asked:
This report has also been posted to the OCaml bug tracker, but since it is a
surprising observation, it may be good if people on the list know that it
exists without having to search the bug tracker archive.  Maybe some assembler
guru can repeat this result and explain to us what's going on...

----------

It seems that changes to signal handling between OCaml 3.08.4 and 3.09.1 can
lead to a very significant loss of performance (up to several orders of
magnitude!) in code that uses threads and performs I/O (tested on Linux).

The attached file (slow.ml) demonstrates this: it prints a character to stdout
in a for-loop. The uploaded version will take approximately 600ms in native
code to complete this test when redirecting output to /dev/null.  If you
comment out the line containing "module X = Thread" and compile without thread
support, then the test suddenly only takes around 1.5ms, i.e. it runs 400 times
faster.

Profiling using oprofile revealed that the function
"caml_process_pending_signals" seems to be responsible for that.  Annotated
assembler output showed that the code was sampled an astonishing number of
times in the instruction "test %eax,%eax" as obviously generated for "if
(async_action != NULL)" in this function. This is really weird, because
everything else seems to be sampled a sensible number of times, but it would
surely explain the timings.

OCaml-3.08.4 does not exhibit any problems of that kind.

slow.ml:

open Unix
open Printf

module X = Thread

let () =
  let t1 = gettimeofday () in
  for i = 1 to 100000 do
    print_char '.';
  done;
  let t2 = gettimeofday () in
  eprintf "%f\n" (t2 -. t1);
        
Christophe Troestler said:
> Profiling using oprofile revealed that the function
> "caml_process_pending_signals" seems to be responsible for that.

An earlier related thread:
http://caml.inria.fr/pub/ml-archives/caml-list/2006/02/2858f1e4532daae90d5b0762e3fff3cd.en.html

But your code is even more striking!
        
Xavier Leroy answered:
> It seems that changes to signal handling between OCaml 3.08.4 and 3.09.1
> can lead to a very significant loss of performance (up to several orders
> of magnitude!) in code that uses threads and performs I/O (tested on Linux).
> [...]
> Maybe some assembler guru can repeat this result and explain to us
> what's going on...

Short explanation: atomic instructions are dog slow.

Longer explanation:

OCaml 3.09 fixed a number of long-standing bugs in signal handling
that could cause signals to be "lost" (not acted upon).  The fixes,
located mostly in the code that polls for pending signals
(caml_process_pending_signals), rely on an atomic "read-and-clear"
operation, implemented using atomic processor instructions on x86,
x86-64 and PPC.  This makes signal handling correct (no signal can be
lost) but I didn't realize that it has such an impact on performance,
even on a uniprocessor machine.  Thanks for pointing this out.

(To prevent a number of well-meaning but irrelevant posts, keep in
mind that we're using atomic instructions in a single-threaded
program, to get atomicity w.r.t. signals, not w.r.t. concurrent threads.
We don't need the latter kind of atomicity given OCaml's threading model.)

Now, you may wonder why the problem appears mainly with threaded
programs.  The reason is that programs linked with the Thread library,
even if they do not create threads, check for signals much more
often, because they enter and leave blocking sections more often.  In
your example, each call to "print_char" needs to lock and unlock the
stdout channel, causing two signal polls each time.

So, it's time to go back to the drawing board.  Fortunately, it
appears that reliable polling of signals is possible without atomic
processor instructions.  Expect a fix in 3.09.2 at the latest, and
probably within a couple of weeks in the CVS.
        
Oliver Bandel said and Gerd Stolpmann answered:
> > OCaml 3.09 fixed a number of long-standing bugs in signal handling
> > that could cause signals to be "lost" (not acted upon).  The fixes,
> 
> I'm not clear about what your proble is with lost signals,
> but when using signals on Unix/Linux-systems, you can use
> UNIX-API, with sigaction/sigprocmask etc. you can do things well,
> and with the signal-function which C provides things are bad/worse.
> The C-API signal-function signal(3) clears out the signal handler
> after a call to it. In the sigaction/sigprocmask/... functions
> the handler remains installed.

The problem is the following: The O'Caml runtime cannot handle signals
immediately because this would break memory management (e.g. imagine a
signal happens when memory has just been allocated but not initialized).
To get around this the signal handler sets just a flag, and the compiler
emits instructions that regularly check this flag at safe points of
execution (i.e. memory is known to be initialised). These instructions
are now atomic in 3.09. In 3.08, you have basically

if "flag is set" then (
  (*)
  "clear flag";
  "call the signal handler function"
)

If another signal happens at (*) it will be lost.

As you mention sigprocmask: Of course, you can block signals before
checking the flag and allow them again after clearing it, but this would
be even _much_ slower than the solution in 3.09, because sigprocmask
needs a context switch to do its work (it is a kernel function).

I don't know what Xavier has in mind to solve the problem, but I would
think about reducing the frequency of the atomic check.
This could work as follows:

- Revert the check to the 3.08 solution
- Use the alarm clock timer to regularly call a signal_manager
  function at a certain frequency (i.e. the signal flag is set
  at a certain frequency)
- Only the alarm clock timer signal is left unblocked. The
  other signals are normally blocked.
- In signal_manager, it is checked whether there are other
  pending signals, and if so, their functions are called.

Of course, it is again possible that alarm clock signals are lost, but
this is harmless, because it is a repeatedly emitted signal. The other
signals cannot be lost, but their execution is deferred to the next
alarm clock event.

> But if this is what you think about (and how it will be done
> on windows or other systems) I don't know, but maybe this is
> a hint that matters.
> 
> BTW: I saw that in the Unix-module the unix-signalling functions are
>      now included... (the ywere not on older versions of Ocaml).

They have been included for a long time. New is Thread.sigmask.
        
Xavier Leroy then said:
> The problem is the following: [...] In 3.08, you have basically
>
> if "flag is set" then (
>   (*)
>   "clear flag";
>   "call the signal handler function"
> )
>
> If another signal happens at (*) it will be lost.

Actually, the problematic code in 3.08 is:

   tmp <- flag;
   (*)
   flag <- 0;
   if (tmp) { process the signal; }

and indeed a signal can be lost (never processed) if it occurs at (*).

The solution I have in mind is to implement exactly the pseudocode you
give above.  If a signal occurs at (*), it is not lost (the signal
handler function will be called just afterwards!), just conflated with
a previous occurrence of that signal, but this is fair game: POSIX
signals have the same behaviour.  (Yes, I'm ignoring the queueing
behaviour of realtime POSIX signals.)

Note however that in 3.09 and in my proposed fix, there is one flag
per signal, which still improves over 3.08 (which had only one shared
flag) and ensures that two occurrences of different signals are not
conflated, again as per POSIX.

 > I don't know what Xavier has in mind to solve the problem, but I would
 > think about reducing the frequency of the atomic check.

That would be plan C, plan B being making the check even more efficient.
I'd rather not introduce timer signals if at all possible, though,
since these mess up many function calls.
        
Will Farr said:
As an aside, if anyone is interested in techniques for making atomic
transactions fast with low latency, etc, the paper

Atomic heap transactions and fine-grain interrupts by Olin Shivers,
James W. Clark and Roland McGrath:
http://www-static.cc.gatech.edu/~shivers/papers/heap.ps

presents several *neat* hacks to do this efficiently.  I'm sure that
the implementators on the list are already aware of this work, but I
just wanted to point it out as interesting reading for people (like
myself) who think this stuff is neat but don't necessarily have broad
experience with it.
        

Using folding to read the cwn in vim 6+

Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).

:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'&lt;1':1
zM

If you know of a better way, please let me know.


Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.


Alan Schmitt