Hello
Here is the latest Caml Weekly News, for the week of May 9 to 16, 2006.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32904/focus=32904
Geoffrey Alan Washburn announced:OCamlTeX is a wrapper for LaTeX and its variants (pdfLaTeX, XeTeX, etc.) that provides the ability to define macros in terms of OCaml code. OCamlTeX is derived from Scott Pakin's PerlTeX, but with some enhancements that are useful for OCaml. Using OCamlTeX is straightfoward. Just add \usepackage{ocamltex} to your document's preamble and you can then start writing OCaml macros like \ocamlnewcommand{\mymacro}[x,y]{ "\\textbf{" ^ y ^ x ^ "}" } where unlike LaTeX and PerlTeX instead of writing the number of arguments the function takes within the square brackets, you can explicitly name them. What is actually going on is that OCamlTeX and TeX/LaTeX are communicating via temporary files, and the above bit of LaTeX corresponds to defining the OCaml function let rec latex_mymacro (x : string) (y : string) : string = "\\textbf{" ^ y ^ x ^ "}";; It is then possible to use your macro just like you would any other LaTeX macro. The macro invocation \mymacro{foo}{bar} expands to \textbf{barfoo} and is then expanded further. Consequently, macros defined using \ocamlnewcommand can either call themselves recursively by using "latex_mymacro" or outputing a call to "\mymacro" in their output. OCamlTeX differs from PerlTeX in two notable ways (other being written in OCaml and providing OCaml macros): * First, OCamlTeX currently doesn't support LaTeX style optional arguments. I can imagine in the putting this back in by making use of OCaml's optional argument functionality. * Second, OCamlTeX provides an additional macro \ocamlexec, that allows for executing arbitrary top-level code. For example, this is useful if you want to open a module, or import OCaml code in bulk. I expect this will be especially useful for those of you that write papers about software/languages you've written in OCaml because it makes it easy to actually call your code from within your document and maintain consistency. So, for example you can always be sure your code examples type-check. I think it would be interesting to consider a Haskell port, because monads would be very helpful for structuring a more extensive TeX API, rather than always just manipulating strings. However, I am not as familiar with how to implement this sort of software in Haskell. The current version of OCamlTeX is available from the world readable Subversion repository: https://svn.cis.upenn.edu/svnroot/ocamltex/. If you're interested in hacking on it, I can give you commit access. You will need the OCaml library/package "cash" to use it. Bug reports and feature suggestions welcome.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32907/focus=32907
Maxence Guesdon announced:I'm glad to announce the release of Caml-get 0.7. Caml-get is a tool to distribute and get Objective-Caml code in a way similar to the apt-get utility. It is useful to distribute and reuse small pieces of code rather than create (and depend on) a library for each of them. More details on the Caml-get page: http://pauillac.inria.fr/~guesdon/camlget.en.html Changes in 0.7: - new architecture: there is now a Camlget library, used by the command line tool. A Camlget_gui module allows to create graphical interfaces to handle camlget repositories (client side). - the command line tool now asks the user before upgrading a piece of code and can display the differences between the current code and the new version. - a new ocamldoc generator, odoc_cgtest, is included. Using @cgtest tags in ocamldoc comments, one can write ocaml expressions of type bool; the odoc_cgtest generator gather these assertions in a single file which can be run to perform tests (i.e. indicate which assertions fail). Odoc_htmlcg, the included html generator, can handle these new tags to display the assertions in the documentation. Here is an example of definition of assertions: (** [v1 << v2] returns true if version [v1] is strictly inferior to version [v2] . @cgtest The following assertions hold: - [[1;2] << [1;3]] - [not ([1;2] << [1;2])] - [let v = [ 1; 2 ; 3] and w = [ 1; 2] in w << v ] *) val (<<) : version -> version -> bool - The plugin for cameleon has been improved.
Archive: http://article.gmane.org/gmane.comp.lang.caml.general/32917
Aleksey Nogin announced:I've compiled the binary RPMs of OCaml 3.09.2 for Fedora Core 2,3,4 and 5, Red Hat Enterprise Linux 3 and 4 and Mandrake 10.1. You should be able to download them from http://rpm.nogin.org/ocaml.html as usual. P.S. Sorry for the huge delay (I've been on paternity leave :) ).
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32937/focus=32937
Hendrik Tews asked and Markus Mottl answered:> 1. About rule 1: CAMLparam / CAMLreturn can be ommitted in the > following cases: > > a) there is no value in the function that is a pointer to a > block inside the heap. If you do not have a pointer into the OCaml-heap (i.e. nothing of type "value"), then there is naturally nothing to protect. I sometimes prefer the Begin_roots/End_roots-macros defined in memory.h if I want to protect values allocated locally. I always use the CAMLparam/return macros to protect function arguments. > b) no allocation will take place between the start and the end > of the function. (Really? Even in the presence of threads?) It depends. If you use threads and call "caml_{enter,leave}_blocking_section" within the function and have a value whose lifetime extends into (e.g. I/O-operations on bigarrays) or crosses this block, then you will have to protect this value, because the GC may be called within the blocking section by another thread, which might reclaim the value. If any of the above requirements does not hold and if you don't allocate anything before using that value (standard case), then you don't have to protect the value. > 2. I believe rule 2 (register with CAMLlocal) must be extended to > intermediate values. Consider for instance > > value f(...); > value g(...); > > ... > h(f(...), g(...)); > > Assuming right to left evaluation, this will break if the > garbage collector is called inside f, because the value > returned by g is not registered as a root. True, this might break. > However, the following is save: > > h2(g(...)) > > because h2 will register the value before any allocation can > happen. Yes, this is safe.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32925/focus=32925
Francois Rouaix asked:I'm contemplating writing an OCaml interface for a C++ middleware library that my company develops and uses internally. Typically this middleware will start an event loop on a thread in the background, leaving the application responsible for its own threads (and potentially using none and having its own code running entirely on events within the eventloop thread). How's this likely to be compatible with OCaml use of native threads (this is on Linux by the way)? The manual section for interfacing with C isn't mentionning threads anywhere. Should Caml code be restricted to run on threads it has created? Or can it run on any threads? How can I synchronize between a thread running C++ code and a thread running OCaml code (i.e. both communicating on a message queue)? Thanks for any suggestions.Ted Kremenek answered:
From my understanding of OCaml threads, they are non-preemptive user- level threads that run within a single kernel thread. Further, OCaml does not employ a concurrent garbage collector; essentially the whole program stops (including all OCaml threads) when the garbage collector runs. OCaml threads do not provide any additional actual parallelism (they are not preemptive like kernel threads), but like all user-level threads they provide an abstraction of parallel threads as a programming idiom. I am not certain if in the implementation of OCaml threads that one thread automatically yields to another when it calls a blocking operation like an I/O operation (like the user-level thread implementation for C described in the Capriccio paper that appeared in SOSP a few years ago), but I doubt it. You might be able to run the OCaml run-time (I am talking about native code) within a program that employs multiple kernel threads as long as ALL the OCaml code runs in one designated thread, but the run- time was not designed to be executed within multiple kernel threads. Of course I could be completely wrong about this, but this was my interpretation from the documentation, previous emails on this list, and from my brief circumspection of the code from the OCaml run-time.Gerd Stolpmann also answered:
If you compile ocaml with pthread support, the O'Caml threads are real Linux threads. When using them, be aware of: - The memory management code is not thread-safe. That means only one thread is allowed to run O'Caml code at any time. To ensure this there is the so-called "master lock". You can acquire the master lock by calling leave_blocking_section() and you can release it by enter_blocking_section() When you call a C function that may block for indefinite time one usually releases the lock first, calls the function, and then requires it. The stub function looks like value foo_stub (...) { ... enter_blocking_section(); /* From here: NO ACCESS to any O'Caml memory, not even * CAMLparam/CAMLlocal variables! */ ... foo(); ... leave_blocking_section(); /* Normal rules for stub functions apply again */ ... } For callbacks from C code you will typically first call leave_blocking_section() and later enter_blocking_section() before you return. - O'Caml must never see threads not created by itself. This means a thread created by your middleware must not run O'Caml code. (You get immediately a segfault if you try.) - Although O'Caml mutexes and condition variables actually base on their pthread counterparts, there is no interface from the C side. So you better do any synchronization completely in C, i.e. the message queue is written in C and uses the normal pthread primitives. The O'Caml stub functions accessing this queue need to call enter/leave_blocking_section as explained. Hope this helps. Had to solve a similar problem for a customer.Vincenzo Ciancia then added and Markus Mottl said:
> However, let's make it clearer, you can create a thread in ocaml and then > call a C function, and this C function, which will be run concurrently with > other C functions called the same way, can call back ocaml. Yes, this is possible, and good libraries provide ways of letting user created threads enter the library to execute callbacks.Ted Kremenek asked and Gerd Stolpmann replied:
> This is very interesting. Thank you for pointing this out. I have > some questions that would clarify a few things for me. > > Because of the run-time lock, should I gather that the threads are > still essentially cooperative threads? Essentially they are pthreads, i.e. on Linux kernel threads. However, there is a level of cooperation on top of this. > For example, does it work > this way: if a thread holding the master lock is switched out and the > kernel schedules another thread, that other thread will start running > and try and acquire the lock. For all threads known to O'Caml we have: - Either the thread is running - Or it is waiting for the master lock (the kernel knows that, and won't schedule this thread) - Or it is executing non-O'Caml code because it passed enter_blocking_section. I hope it is now clear that getting the master lock is not the problem. The trick is how a thread releases it. The running thread (that has the lock) is notified when it must release the master lock. There is a special controller thread that periodically checks whether the running thread ran too long. If so, a special notification mechanism will cause that this thread gives the master lock up. Effectively, this is a second scheduler on top of the kernel scheduler. > It won't be able to, so it goes back > to sleep, and another thread will wake up, try and acquire the lock, > goes back to sleep, and so on, until the original thread holding the > lock is rescheduled. No, this won't happen, because the kernel does not wake up threads waiting for a lock that is not free. These are kernel-level locks! > Only when the thread releases the lock > (yields?) will another thread be able to run. Is this how it works? > If so, this would lend itself to extremely poor performance: if I had > 100 threads, 99 of them may have to wake up and go to sleep before ie.. > the original one is scheduled. That is 99 useless context switches. > Or rather is the lock acquired and released (within the generated > native code for the OCaml part of a program, not C code) on calls to > the memory management functions and other run-time code that are not > thread-safe? No this is not done. > This is also seems slow, since the heap is actively > manipulated all the time, so locks will constantly be acquired and > released, and you will have the same wake-up, make little or no > progress and go back to sleep problem I mentioned before. > > Your last email said that only one thread is allowed to run OCaml > code at any time, so it seems to me that this mutual exclusion must > come from somewhere. I'm very curious to know how this is > implemented. I gather that most people want to use threads in OCaml > to have multiple threads running OCaml code, and not necessarily have > a bunch of threads executing called C code (allowing the master lock > to be released). I'm just trying to understand how actual > performance would ever resemble anything desirable. Performance is quite good, but you should keep in mind: - O'Caml code can never run on more than one CPU - Switches between O'Caml threads are less fine-grained than between kernel threads. Imagine you stat a file, and it is necessary to load some blocks from disk. This can take 0.01 s of time. During that time a switch is impossible. (I.e. the problem is that there are system calls that are non-blocking in general but that can nevertheless consume lots of time in unlucky data cases.)Markus Mottl also replied:
> Because of the run-time lock, should I gather that the threads are > still essentially cooperative threads? For example, does it work > this way: if a thread holding the master lock is switched out and the > kernel schedules another thread, that other thread will start running > and try and acquire the lock. It won't be able to, so it goes back > to sleep, and another thread will wake up, try and acquire the lock, > goes back to sleep, and so on, until the original thread holding the > lock is rescheduled. No. The other threads will block in the master lock, which is a POSIX-mutex + condition variable. Releasing the master lock will signal exactly one of the blocking threads that it may run. The context switch, however, does not necessarily happen immediately. It depends on the OS when this will happen, and the scheduling policy determines which thread is going to run. > If so, this would lend itself to extremely poor performance: if I had > 100 threads, 99 of them may have to wake up and go to sleep before > the original one is scheduled. That is 99 useless context switches. That would be horrible, and that's not the way it works. > Or rather is the lock acquired and released (within the generated > native code for the OCaml part of a program, not C code) on calls to > the memory management functions and other run-time code that are not > thread-safe? This is also seems slow, since the heap is actively > manipulated all the time, so locks will constantly be acquired and > released, and you will have the same wake-up, make little or no > progress and go back to sleep problem I mentioned before. A timer signal makes sure that the current thread yields once in a while. Thus, it is not necessary to acquire/release locks at each allocation, which would make everything run dog slow. > Your last email said that only one thread is allowed to run OCaml > code at any time, so it seems to me that this mutual exclusion must > come from somewhere. I'm very curious to know how this is > implemented. I gather that most people want to use threads in OCaml > to have multiple threads running OCaml code, and not necessarily have > a bunch of threads executing called C code (allowing the master lock > to be released). I'm just trying to understand how actual > performance would ever resemble anything desirable. Unless INRIA implements a GC that can handle multiple threads running in parallel, which would have its own performance tradeoffs, you'll essentially always share one processor only. It depends on your application whether that's a problem. I/O-heavy applications will do fine, because the system calls can all be performed in parallel. You can also always call C or Fortran to run in parallel on bigarrays (matrices), because they don't mess with the OCaml-heap.Xavier Leroy answered the initial question:
> [Interfacing Caml code with multithreaded C/C++ code] > The manual section for interfacing with C isn't mentionning threads > anywhere. Should Caml code be restricted to run on threads it has > created? Or can it run on any threads? It depends whether your Caml code is single-threaded or multi-threaded. Case 1: The Caml code is single-threaded (more precisely: it is not linked with Caml's threading library). For instance, Caml provides a number of functions that you will call back from the main C/C++ multithreaded program. In this case, you can call back Caml code from any thread. All you need to make sure is that you never call back from two threads concurrently. Just put a lock around the callbacks and you're done. Case 2: The Caml code is linked with Caml's threading library. Then, you can only call back from threads created by Caml, and as Gerd Stolpmann mentioned, there are additional subtleties to ensure mutual exclusion. I'd recommend you avoid this scenario. > How can I synchronize between a thread running C++ code and a thread > running OCaml code (i.e. both communicating on a message queue)? If your message queue is already implemented in C++, all you need to do is wrap the "put" and "get" operations of the message queue using the Caml foreign-function interface, and call these functions from your Caml code.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32945/focus=32945
Nicolas Pouillard announced:The last four months, I have been working on the Camlp4 renovation with Michel Mauny. We are glad to announce the beta-release of a new version of Camlp4, that contains a lot of major changes. *** This version is not backward compatible. Use it at your own risk.*** Major changes are described at: http://gallium.inria.fr/~pouillar/camlp4-changes.html This release comes as a full OCaml package, that includes the new version of Camlp4. It is available at: http://gallium.inria.fr/~pouillar/ocaml-3.10+dev6-and-camlp4-beta-r23518.tar.bz2 The build and installation process is the OCaml one (./configure && make world && ...). Be careful with `make install': it could damage your current installation of OCaml and Camlp4, you can use --prefix at configure to avoid it. Since my internship ends on June, the 30th, we advise you to update your extensions soon and take advantage of this period asking questions and report bugs. This will ensure a smooth transition to a future official release of OCaml including this version of Camlp4.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32942/focus=32942
Charles Bouillaguet asked and Jacques Garrigue answered:> I would like to write a type which describe some kind of term in a > toy programming language. It has sets, but not sets of sets > > type 'a array_ = [`Array of 'a] > type base_sort = [`Int | `Float | `Object | `Array of base_sort] (* > arrays of arrays are OK *) > type sort = [base_sort | `Set of > base_sort] (* sets of sets are NOT OK*) > > The problem appear when I want to define my values : > > type 'sort array_state = 'sort * [`ArrayStateVar of string | > `ArrayWrite of 'me * 'sort base_value * [`Int] base_value * 'sort > base_value] as 'me > and 'sort base_value = 'sort * [`Inert of unit | `FieldRead of > [`Object] base_value * 'sort field | `ArrayRead of 'sort array_state > * ('sort array_) base_value * [`Int] base_value] > and 'sort field = 'sort * [`FieldVar of string | `FieldWrite of 'me * > [`Object] base_value * 'sort base_value] as 'me > > is refused with error : > ========================= > In the definition of base_value, type > [ `Int ] array_state > should be > 'a array_state > ====================== > > I then have two questions : > > a) is it possible to write that with polymorphic variants, and how ? There are two problems here. The first one is that mutually recursive type abbreviations are monomorphic. As a result, if you use several times array_state inside the same recursive type definition (that also defines array_state), all its occurences should have the same type. At first I thought it was the only problem. Then the solution would be to duplicate definitions (i.e. define int_base_value, object_base_value...) The trouble is that the array case takes 'sort as a parameter, meaning that we would need an infinity of such ducplicates. Here is the second problem: by nature, polymorphic variants only allow regular types (that can be represented by a regular graph), and your definitions do not represent a regular type (you can get ever deeper arrays.) > b) Is it possible to wrote that with Recursive modules, and how ? Recursive modules do not help, as this is a fundemental restriction of structural types (to keep unification decidable.) What you need here is GADTs, which are already in Haskell (GHC), and a work in progress for ocaml. In your case it should also be possible to simulate them by using encodings of GADTs using existential types (available through polymorphic record fields), but this is rather heavy weight to use. I have some sample code to do that, but it is on a broken laptop...Jacques Garrigue then added:
> The trouble is that the array case takes 'sort as a parameter, meaning > that we would need an infinity of such duplicates. > Here is the second problem: by nature, polymorphic variants only allow > regular types (that can be represented by a regular graph), and your > definitions do not represent a regular type (you can get ever deeper > arrays.) Note that this can be solved by introducing a nominal type (record or sum type) that breaks the irregular cycles. Here the following is sufficient. type 'sort array_state = 'sort * [ `ArrayStateVar of string | `ArrayWrite of 'me * 'sort base_value * [`Int] base_value * 'sort base_value] as 'me and 'sort base_value = {sort: 'sort; desc: [ `Inert of unit | `FieldRead of [`Object] base_value * 'sort field | `ArrayRead of 'sort array_state * ('sort array_) base_value * [`Int] base_value]} and 'sort field = 'sort * [ `FieldVar of string | `FieldWrite of 'me * [`Object] base_value * 'sort base_value] as 'me Probably this is not you wanted, but just to make things clear.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/32949/focus=32949
Frederick Akalin asked and Brian Hurt answered:> I'm running into cases where the 4 MB limit on arrays is starting to become a > problem. Lists are much slower and cause seg faults for me on the same data > set, and Bigarrays are a partial solution because I'd like to be able to > store arbitrary types in arrays (not just numbers). > > I was greatly surprised when I found out there was such a low limit on > arrays. Is there a reason for this? Will this limit ever be increased? It comes from how arrays are stored in memory. Every boxed value has a "tag word", which, among other uses, is used by the GC to figure out how big the object is (i.e. where the next object starts). Now, the encoding scheme of this tag word is fairly complicated, but it works out that for arrays, 10 bits of the 32 bits is used for other stuff, leaving 22 bits for the size. This limits the size of an array to 4M-1 elements (actually, it limits the size of the array to 16M-4 bytes, as size is measured in words- which is why arrays of floats are limited to 2M-1 entries- each float takes up 8 bytes aka 2 words). When you move to 64 bits, the tag word doubles in size, but the amount of "other" information in the tag word doesn't- this means that suddenly you have 52 bits of size, or 4T arrays. And now floats only take up one word, not two, so they too can have 4T arrays. The array of array idea someone brought up is a good one. There's a slight performance hit, but not much. I'd keep the top level array short- 1K entries is enough. The advantage of this is that if you're heavily using the array, the top level array will stay in cache (possibly even L1 cache), meaning the main cost of an array access only goes up by about 10% or so, maybe less (maybe 1% if it stays in L1 cache). The reasoning goes like this: the largest cost by far of an array access on a large array is the cache miss. Doing a memory reference that is satisified out of L1 cache generally takes 1-4 clock cycles. If you have to go to L2 cache, that's going to take 10-40 clock cycles (approximately). Going to main memory? 100-350+ clock cycles. With a large array, you're likely going to take the 100-350+ clock cycle hit anyways. If the top level array is in L1 cache, you're only adding 1-4 clockcycles to the 100-350+ clock cycle hit you're going to take anyways. Personally, I think PCs should have gone 64 bit circa 1997 or so. Once we started getting hard disks large enough that we couldn't mmap the entire hard disk, we started hitting problems. Long long and the large file interfaces are examples of these problems. The longer we go- and the larger memories and hard disks get, the bigger the problems get. Microsoft seems to be the main impediment, now that AMD has forced Intel to get off the stick. > If I had a record type with 5 floats, > will the limit then by Sys.max_array_size / 10? Within the record, the floats are unboxed (assuming you didn't have any non-float elements). This means the floats are stored directly in the record, and that the record doesn't hold pointers to the floats. But the record itself is boxed- which means that an array of these records is really an array of pointers to these records, meaning that you're back at the 4M-1 limit. Note that a record of 5 floats costs 44 bytes (40 bytes for the 5 floats + 4 bytes for the tag word). Assuming no records are stored in more than one location, the total memory cost of an array of 4M-1 of them is 16M for the array, plus (4M-1)*44 for the records, for a total of 192M-44 bytes. This is almost 10% of your available memory space right there. > Is there some sort of > existing ArrayList module that works around this problem? Ideally, I'd like > to have something like C++'s std::vector<> type, which can be dynamically > resized. Ocaml can't dynamically resize arrays. This gets tricky to do when arrays get large. At the extreme, if the array takes up 50% + 1 byte of the total address space, you can't resize it to be any larger- to resize requires you to copy it, which takes more memory than you have. I've seend problems well before then. > Also, the fact that using lists crashes for the same data set is surprising. > Is there a similar hard limit for lists, or would this be a bug? Should I > post a test case? I'm willing to bet dollars to donuts that the problem you hit with lists was stack overflow. Without signifigantly impacting performance there isn't a whole lot Ocaml can do about detecting stack overflow before the segfault hits. My general rule is, if it's going to contain more than a few dozen items, it probably shouldn't be a list. Extlib's library is less prone to this error, but you can still run into serious problems with long lists.
Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).
:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'<1':1
zM
If you know of a better way, please let me know.
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.