Hello
Here is the latest Caml Weekly News, for the week of March 14 to 21, 2006.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32571
Yoriyuki Yamagata announced:

I'm pleased to announce Camomile-0.6.5, a comprehensive Unicode library for OCaml. This is a bug fix release. Up to the previous version, collation rules without a header were simply ignored. This problem affects, for example, Scandinavian languages. This release fixes it.

You can download the new release from http://sourceforge.net/projects/camomile

In addition, I'm looking for a maintainer for the GODI package for Camomile. Or has someone already adopted it? I'm too busy, and besides, I have almost forgotten how to make a GODI package.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32586
Luc Maranget announced:

Our team (Moscova, Inria Rocquencourt, France) is offering a post-doctoral position. Basically, the work we propose is first to release the new version of JoCaml, our extension of Objective Caml with Join-calculus primitives, and then to study and implement an extension to the existing prototype, such as type safe serialization.

Application deadline is March 30th. Candidates must have defended their thesis before September 1, 2006.

More details on the application process are available at http://www.inria.fr/travailler/opportunites/postdoc/postdoc.en.html

A short description of the position is given at the end of this mail. In any case, candidates should contact me (Luc.Maranget@inria.fr)

--Luc Maranget

==================================================
Position description

* Environment

Jocaml is an extension of Ocaml based on the Join-calculus, see http://join.inria.fr. There are already two releases of the Join-calculus and Jocaml. A third version, simplified and better integrated with the Ocaml compiler, is almost completed. It remains to seriously test the prototype and to write some examples and documentation. Additionally, we plan to integrate some important extensions, such as type safe marshalling or matching by regular term expressions.

* Missions

With the Moscova project-team, the postdoctoral fellow will test this new release of Jocaml and set up its distribution on the web. The postdoctoral fellow will probably have enough time to study and implement an extension of JoCaml, such as type safe marshalling or matching by regular term expressions.

* Ideal candidate

The postdoctoral fellow should have a good knowledge of programming language theory and, if possible, of concurrency theory. A good part of the work will also consist of programming within distributed systems.

This work is suitable for an expert in theory who wants to have some practical experience, or for a very good programmer (in operating systems or compilers) who wants to gain experience in programming under control of theory.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32588
Eijiro Sumii said:

A friend of mine informed me of this report

  Lambda expressions and closures for C++
  Jeremiah Willcock, Jaakko Jarvi, Doug Gregor, Bjarne Stroustrup, Andrew Lumsdaine
  2006-02-26
  http://public.research.att.com/~bs/N1968-lambda-expressions.pdf

and I thought you might be interested. (I searched a little and didn't find any discussion on this report in this list.) A few highlights:

----------------------------------------------------------------------
We propose to extend the C++ language with lambda expressions, and define the semantics of these unnamed local functions via translation to closures: function objects implemented using local classes.

...

  void f() {
    int sum = 0;
    for_each(a.begin(), a.end(),
             <>(int x) -> int extern(sum) {return sum += x;});
  }

...

2.1 Omitting the return type

The return type of a lambda expression can be omitted if the body of the lambda function contains at most one return statement.
----------------------------------------------------------------------
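Editor's aside, for readers who think in OCaml first: the accumulation in the quoted C++ example corresponds roughly to an ordinary closure over a mutable reference. The sketch below is only an illustration; it assumes the container is an int list, and the name sum_all is invented here and appears nowhere in the report.

```ocaml
(* Rough OCaml counterpart of the C++ example: the anonymous function
   closes over the local reference [sum], much as extern(sum) captures
   the local variable in the proposal. [sum_all] is a hypothetical name. *)
let sum_all (a : int list) : int =
  let sum = ref 0 in
  List.iter (fun x -> sum := !sum + x) a;
  !sum

let () = Printf.printf "%d\n" (sum_all [1; 2; 3])
```

Here the closure needs no capture annotation, since OCaml closures capture their whole lexical environment.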
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32603
Julia Lawall announced:

PhD position: the Coccinelle framework for evolution of Linux device drivers

Description of the position:

A full 3-year PhD scholarship is available at the University of Copenhagen (DIKU) to work on the design and development of the Coccinelle framework for evolution of Linux device drivers. The candidate should have a strong interest in most of the following:

* Programming language design
* Programming language semantics
* Program analysis
* Compilation
* Operating system design
* Device drivers

Tools related to Coccinelle are being developed using OCaml.

A Masters degree is required to start the position, but need not be completed by the time of the application.

The deadline for the application is April 14, 2006 at 12pm (noon), Danish time.

More information about the position and the application process is available at: http://www.diku.dk/~julia/announcement.pdf

More information about the project is available at: http://www.emn.fr/x-info/coccinelle/

All interested candidates irrespective of age, gender, race, religion or ethnic background are invited to apply. Knowledge of Danish is not required.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32592
Markus Mottl asked:

This report has also been posted to the OCaml bug tracker, but since it is a surprising observation, it may be good if people on the list know that it exists without having to search the bug tracker archive. Maybe some assembler guru can repeat this result and explain to us what's going on...

----------

It seems that changes to signal handling between OCaml 3.08.4 and 3.09.1 can lead to a very significant loss of performance (up to several orders of magnitude!) in code that uses threads and performs I/O (tested on Linux).

The attached file (slow.ml) demonstrates this: it prints a character to stdout in a for-loop. The uploaded version will take approximately 600ms in native code to complete this test when redirecting output to /dev/null. If you comment out the line containing "module X = Thread" and compile without thread support, then the test suddenly only takes around 1.5ms, i.e. it runs 400 times faster.

Profiling using oprofile revealed that the function "caml_process_pending_signals" seems to be responsible for that. Annotated assembler output showed that the code was sampled an astonishing number of times in the instruction "test %eax,%eax" as obviously generated for "if (async_action != NULL)" in this function. This is really weird, because everything else seems to be sampled a sensible number of times, but it would surely explain the timings.

OCaml-3.08.4 does not exhibit any problems of that kind.

slow.ml:

  open Unix
  open Printf

  module X = Thread

  let () =
    let t1 = gettimeofday () in
    for i = 1 to 100000 do
      print_char '.';
    done;
    let t2 = gettimeofday () in
    eprintf "%f\n" (t2 -. t1);

Christophe Troestler said:
> Profiling using oprofile revealed that the function
> "caml_process_pending_signals" seems to be responsible for that.

An earlier related thread: http://caml.inria.fr/pub/ml-archives/caml-list/2006/02/2858f1e4532daae90d5b0762e3fff3cd.en.html

But your code is even more striking!

Xavier Leroy answered:
> It seems that changes to signal handling between OCaml 3.08.4 and 3.09.1
> can lead to a very significant loss of performance (up to several orders
> of magnitude!) in code that uses threads and performs I/O (tested on Linux).
> [...]
> Maybe some assembler guru can repeat this result and explain to us
> what's going on...

Short explanation: atomic instructions are dog slow.

Longer explanation: OCaml 3.09 fixed a number of long-standing bugs in signal handling that could cause signals to be "lost" (not acted upon). The fixes, located mostly in the code that polls for pending signals (caml_process_pending_signals), rely on an atomic "read-and-clear" operation, implemented using atomic processor instructions on x86, x86-64 and PPC.

This makes signal handling correct (no signal can be lost) but I didn't realize that it has such an impact on performance, even on a uniprocessor machine. Thanks for pointing this out.

(To prevent a number of well-meaning but irrelevant posts, keep in mind that we're using atomic instructions in a single-threaded program, to get atomicity w.r.t. signals, not w.r.t. concurrent threads. We don't need the latter kind of atomicity given OCaml's threading model.)

Now, you may wonder why the problem appears mainly with threaded programs. The reason is that programs linked with the Thread library, even if they do not create threads, check for signals much more often, because they enter and leave blocking sections more often. In your example, each call to "print_char" needs to lock and unlock the stdout channel, causing two signal polls each time.

So, it's time to go back to the drawing board. Fortunately, it appears that reliable polling of signals is possible without atomic processor instructions. Expect a fix in 3.09.2 at the latest, and probably within a couple of weeks in the CVS.

Oliver Bandel said and Gerd Stolpmann answered:
> > OCaml 3.09 fixed a number of long-standing bugs in signal handling
> > that could cause signals to be "lost" (not acted upon). The fixes,
>
> I'm not clear about what your problem is with lost signals,
> but when using signals on Unix/Linux-systems, you can use
> UNIX-API, with sigaction/sigprocmask etc. you can do things well,
> and with the signal-function which C provides things are bad/worse.
> The C-API signal-function signal(3) clears out the signal handler
> after a call to it. In the sigaction/sigprocmask/... functions
> the handler remains installed.

The problem is the following: The O'Caml runtime cannot handle signals immediately because this would break memory management (e.g. imagine a signal happens when memory has just been allocated but not initialized). To get around this the signal handler sets just a flag, and the compiler emits instructions that regularly check this flag at safe points of execution (i.e. memory is known to be initialised). These instructions are now atomic in 3.09. In 3.08, you have basically

  if "flag is set" then (
    (*)
    "clear flag";
    "call the signal handler function"
  )

If another signal happens at (*) it will be lost.

As you mention sigprocmask: Of course, you can block signals before checking the flag and allow them again after clearing it, but this would be even _much_ slower than the solution in 3.09, because sigprocmask needs a context switch to do its work (it is a kernel function).

I don't know what Xavier has in mind to solve the problem, but I would think about reducing the frequency of the atomic check. This could work as follows:

- Revert the check to the 3.08 solution
- Use the alarm clock timer to regularly call a signal_manager function at a certain frequency (i.e. the signal flag is set at a certain frequency)
- Only the alarm clock timer signal is left unblocked. The other signals are normally blocked.
- In signal_manager, it is checked whether there are other pending signals, and if so, their functions are called.

Of course, it is again possible that alarm clock signals are lost, but this is harmless, because it is a repeatedly emitted signal. The other signals cannot be lost, but their execution is deferred to the next alarm clock event.

> But if this is what you think about (and how it will be done
> on windows or other systems) I don't know, but maybe this is
> a hint that matters.
>
> BTW: I saw that in the Unix-module the unix-signalling functions are
> now included... (they were not on older versions of Ocaml).

They have been included for a long time. New is Thread.sigmask.

Xavier Leroy then said:
> The problem is the following: [...] In 3.08, you have basically
>
> if "flag is set" then (
>   (*)
>   "clear flag";
>   "call the signal handler function"
> )
>
> If another signal happens at (*) it will be lost.

Actually, the problematic code in 3.08 is:

  tmp <- flag;
  (*)
  flag <- 0;
  if (tmp) { process the signal; }

and indeed a signal can be lost (never processed) if it occurs at (*).

The solution I have in mind is to implement exactly the pseudocode you give above. If a signal occurs at (*), it is not lost (the signal handler function will be called just afterwards!), just conflated with a previous occurrence of that signal, but this is fair game: POSIX signals have the same behaviour. (Yes, I'm ignoring the queueing behaviour of realtime POSIX signals.)

Note however that in 3.09 and in my proposed fix, there is one flag per signal, which still improves over 3.08 (which had only one shared flag) and ensures that two occurrences of different signals are not conflated, again as per POSIX.

> I don't know what Xavier has in mind to solve the problem, but I would
> think about reducing the frequency of the atomic check.

That would be plan C, plan B being making the check even more efficient. I'd rather not introduce timer signals if at all possible, though, since these mess up many function calls.

Will Farr said:
As an aside, if anyone is interested in techniques for making atomic transactions fast with low latency, etc., the paper

  Atomic heap transactions and fine-grain interrupts
  Olin Shivers, James W. Clark and Roland McGrath
  http://www-static.cc.gatech.edu/~shivers/papers/heap.ps

presents several *neat* hacks to do this efficiently. I'm sure that the implementors on the list are already aware of this work, but I just wanted to point it out as interesting reading for people (like myself) who think this stuff is neat but don't necessarily have broad experience with it.
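Editor's aside on the signal-polling thread above: the difference between the 3.08 polling order and the order Xavier proposes can be seen in a small single-threaded simulation. This is only a toy model under stated assumptions, not the runtime's code. No real signals are used; a signal arriving at the critical point is simulated by a callback, and every name in it (flag, deliver, poll_308, poll_proposed, run) is invented for illustration.

```ocaml
(* Toy, single-threaded model of the two polling schemes from the thread.
   No real signals are involved: a signal arriving is simulated by calling
   [deliver] at the point marked ( * ) in the pseudocode. All names are
   hypothetical and do not come from the OCaml runtime. *)

let flag = ref false        (* set by the simulated signal handler *)
let processed = ref 0       (* how many times the handler body has run *)

let deliver () = flag := true
let process () = incr processed

(* 3.08-style poll: save the flag, clear it, then test the saved copy. *)
let poll_308 (at_star : unit -> unit) =
  let tmp = !flag in
  at_star ();               (* ( * ): a signal may arrive here *)
  flag := false;
  if tmp then process ()

(* Proposed poll: test the flag first and clear it only if it was set. *)
let poll_proposed (at_star : unit -> unit) =
  let was_set = !flag in
  at_star ();               (* a signal may arrive right after the test *)
  if was_set then begin
    flag := false;
    process ()
  end

(* One signal arrives during the first poll; a second poll follows. *)
let run (poll : (unit -> unit) -> unit) : int =
  flag := false;
  processed := 0;
  poll deliver;
  poll ignore;
  !processed

let () =
  Printf.printf "3.08-style: %d processed\n" (run poll_308);
  Printf.printf "proposed:   %d processed\n" (run poll_proposed)
```

In this model the 3.08-style poll reports 0 processed signals, because the unconditional clear wipes out the flag that was set at the marked point, while the proposed poll reports 1, because the flag survives until the next poll. If the flag is already set when a second signal arrives, both versions run the handler once, which is the conflation Xavier describes as acceptable.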
Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).
:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'<1':1
zM
If you know of a better way, please let me know.
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.