Hello
Here is the latest OCaml Weekly News, for the week of January 07 to 14, 2014.
Archive: https://sympa.inria.fr/sympa/arc/caml-list/2014-01/msg00020.html
Yotam Barnoy asked:
So far, I've been programming in ocaml using only sequential programs. In my last project, which was an implementation of a large machine learning algorithm, I tried to speed up computation using a little bit of parallelism with ParMap, and it was a complete failure. It's possible that more time would have yielded better results, but I just didn't have the time to invest in it given how bad the initial results were. My question is, what are the options right now as far as parallelism is concerned? I'm not talking about cooperative multitasking, but about really taking advantage of multiple cores. I'm well aware of the runtime lock and I'm ok with message passing between processes or a shared area in memory, but I'd rather have something more high level than starting up several processes, creating a named pipe or a socket, and trying to pass messages through that. Also, I assume that using a shared area in memory involves some C code? Am I wrong about that? I was expecting Core's Async to fill this role, but realworldocaml is fuzzy on this topic, apparently preferring to dwell on cooperative multitasking (which is fine but not what I'm looking for), and I couldn't find any other documentation that was clearer.
Yaron Minsky replied:
This is indeed something that is not well covered in RWO. That said, the Async_parallel library is aimed at this kind of target. No shared memory region, just some automation around spinning up processes and communicating jobs between them.
Gerd Stolpmann also replied:
There is Netmulticore, a part of Ocamlnet: http://projects.camlcity.org/projects/dl/ocamlnet-3.7.3/doc/html-main/Intro.html#ch_comp
It utilizes shared memory, and accesses it from several processes. There is no C code involved, but it is fairly unsafe nevertheless, because you need to stick to certain programming rules, or the program crashes. However, if you manage to encapsulate all the unsafe things, it is a good option. I'm using Netmulticore in the Plasma Map/Reduce implementation.
Yotam Barnoy then asked and Gerd Stolpmann replied:
> Gerd, would it be possible to put netmulticore up on opam? I realize this is a sensitive topic for you, but from what I can tell, this is the most comprehensive and efficient solution to parallel programming that exists for ocaml, so why not make it as easily available as possible?
Netmulticore is part of Ocamlnet, and I guess you can get it by simply installing ocamlnet with opam. I'm not the maintainer, so I don't know whether it is included or not.
> Also, regarding the ocaml_modify() and ocaml_initialize() regression in 4.01 -- yikes! Is there any way to get support back for writing outside of the ocaml heap? This seems like a pretty big deal! (Sorry if I'm arriving late for whatever discussion took place about this before).
There is a workaround in place - so far the OS supports it: thanks to Xavier the symbols caml_modify and caml_initialize are declared as weak in 4.01, allowing them to be redefined in executables. Netmulticore redefines these symbols with their pre-4.01 functions. This isn't optimal yet, because the old write barriers are a bit slower, and because this introduces a very low-level dependency on the current version of Ocaml. Nevertheless, it works for now. (Ideas for a better solution are highly welcome.)
Mark Shinwell then said and Gerd Stolpmann replied:
> Jeremie Dimino and myself have a somewhat embryonic proposal that should permit most of the write barrier to be circumvented for out-of-heap access, and also avoid the page table test. We'll send mail to the list in due course once we've had time to think about this further.
I'm very curious to hear about that. So far I have only the idea to override the := operator selectively in all modules that want to modify out-of-heap values:
    let ( := ) = Netmcore_heap.assign h
where Netmcore_heap.assign is a C function that tests whether the destination address is in the out-of-heap address space of h. If so, the assignment can be done directly. If not, it just falls back to a normal caml_modify. This solution isn't nice, though, because it can be very problematic to have the extra indirection of ref. So, what I'm searching for is a way to turn any record mutation into an external C function call (or at least to parametrize the write barrier somehow).
Roberto Di Cosmo answered the original question:
There are regularly discussions on how to perform computations involving parallelism on this list; a relatively recent and detailed one is https://sympa.inria.fr/sympa/arc/caml-list/2012-06/msg00043.html
And yes, one might be puzzled because there are so many different approaches, but this is unavoidable, because there are so many different needs which are not easily covered by a single approach. If one wants to distribute a computation that has very little data exchange compared to the local computation on each node, the best approach is quite different to the one needed when large data sets need to be shared or exchanged among the workers. And if you can do all on a single machine, you can obtain significant speedup on a multicore machine by exploiting OS level mechanisms that are not available if you need a cluster. Not to mention the fact that in many cases one is looking for concurrency, and not parallelism: even if at first sight they may look similar, deep down they really are not.
Since you mention machine learning, it's quite probable that you want to perform several iterations of relatively inexpensive computations on very large arrays of integers or floats: if this is the case, Parmap (and not ParMap, which leads to confusion with a different project in the Haskell world) was specially designed to provide a highly efficient solution, and avoid boxing/unboxing overhead of float arrays, *if* you use it properly (in particular, see the notes at the end of the README file, or look at http://www.dicosmo.org/code/parmap/ and the research article pointed from there, where a precise discussion of the steps involved in performing parallel computation on float arrays is given).
Actually, feedback from happy users and synthetic benchmarks indicate that Parmap should provide one of the best possible performances for this use case on a multicore machine, short of resorting to using libraries like ancient, which requires care to avoid core dumps, or external libraries that already have taken care of parallelism for you (like LAPACK, etc.). But if one performs a map over an array of floats *without* using the special functions for floats, then all sorts of boxing/unboxing and copying will take place, and the "parallel" version might very well be even slower than the sequential one, *if* the computation on each float is fast.
Finally, if the float computations are the bottleneck, then a very interesting project to keep an eye on is SPOC, which may significantly outperform everything else on earth: taking advantage of the GPUs in your machine, it can perform float computations on large arrays in a fraction of the time that your CPU, even multicore, requires... but of course you need to learn to program a GPU kernel for that. Learn more about this here: http://www.algo-prog.info/spoc
The bottom line is, parallelism is easier than concurrency, but when one looks for speed, every detail counts, and getting a real speedup does not come for free. We would really need a single place to share ideas, tips and serious analysis of the various available approaches: multicore machines and GPUs are a reality, and this issue is bound to come up again and again.
Yotam Barnoy then said and Anil Madhavapeddy replied:
> Regarding a place to share ideas, it seems like it would be very useful to have an official ocaml wiki. Haskell has this and it's a huge help. In fact, I would say haskell development would be greatly hampered without it. There's so much information that's relevant to more than one library ie. doesn't fit in any particular library's documentation. It wouldn't be too hard to set up a wikimedia instance on ocaml.org, would it? Alternatively it should be pretty easy to set up something on wikia. This wiki would also be a great place to describe the conceptual implementation of the compiler, which is again what haskell has.
We do have a fledgling service for "domain-specific" conversations, in the form of lists.ocaml.org. In fact, we set up a "wg-parallel" mailing list last year, but never announced it for various reasons. This seems like a good time to advertise its existence: http://lists.ocaml.org/pipermail/wg-parallel/
(note that if anyone else would like an archived list on lists.ocaml.org for a project or community group, then please do drop a line to infrastructure@lists.ocaml.org to request it)
Regarding other services on ocaml.org, we (the "infrastructure team") are happy to set them up, but please bear in mind that they all come with a maintenance burden. Dealing with security issues, backups, software updates, outages all take up time, and I confess a preference for sipping martinis and hacking on code instead of sysadmin work.
Jeremy and Leo got tired of waiting for me to set up the wiki too, and started: https://github.com/ocamllabs/compiler-hacking/wiki
If you follow the links through there, there is a 'compiler internals' page that would be good to contribute to, and you (or anyone else) are extremely welcome to add more information on topics such as parallel programming libraries there.
I think we could have a decent stab at a wiki.ocaml.org by backing it against a GitHub repository, and not have to do any special hosting for it at all (the OPAM web pages work in a similar fashion at the moment). But for now though, I'd recommend focussing on the problem at hand (parallel programming) and getting some information down somewhere, and less on the lack of a central wiki.
Ashish Agarwal then added:
Regarding the need for a wiki, why not create a new Parallel Programming page under tutorials [1]. A "tutorial" can be as simple as listing the libraries available and a brief description about the high level goal of each. Note ocaml.org is now almost entirely written in Markdown. A new page can be written quite easily, see for example The Basics tutorial [2].
[1] http://ocaml.org/learn/tutorials/
[2] https://github.com/ocaml/ocaml.org/blob/master/site/learn/tutorials/basics.md
Francois Berenger also replied to earlier posts from the thread:
Roberto Di Cosmo wrote:
> Being the main developer of Parmap, I am quite curious to know where the "complete failure" you are mentioning actually comes from, and I would like to put you in touch with Francois Berenger, that has been using Parmap in production for quite a while with excellent results.
Absolutely! I usually develop a sequential program working on a big list and using List.iter or List.map. When I am happy with the sequential program, I add an nprocs parameter to use the Parmap equivalents when nprocs > 1. The only problems I have encountered were out-of-memory errors when the nprocs forked processes did not fit on my machine because the data was too big. I usually handle independent pieces of data, for example many proteins or chemical molecules.
More answers follow below.
Yotam Barnoy wrote:
> My question is, what are the options right now as far as parallelism is concerned? I'm not talking about cooperative multitasking, but about really taking advantage of multiple cores. I'm well aware of the runtime lock and I'm ok with message passing between processes or a shared area in memory, but I'd rather have something more high level than starting up several processes, creating a named pipe or a socket, and trying to pass messages through that.
There are MPI bindings for OCaml.
> Also, I assume that using a shared area in memory involves some C code? Am I wrong about that?
Gerd Stolpmann's impressive ocamlnet library has nice pure OCaml wrappers for all this I believe. http://projects.camlcity.org/projects/ocamlnet.html
> I was expecting Core's Async to fill this role, but realworldocaml is fuzzy on this topic, apparently preferring to dwell on cooperative multitasking (which is fine but not what I'm looking for), and I couldn't find any other documentation that was clearer.
Cf. parallel vs. concurrent in there: http://chimera.labs.oreilly.com/books/1230000000929/ch01.html#sec_terminology
I believe Parmap and MPI are for parallelism (what you want very probably), while Async/Lwt and OCaml threads are for concurrency (used in webservers, GUIs, etc.).
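As an illustration of the nprocs pattern described in the reply above (write the sequential program first, then switch to a parallel map when nprocs > 1), here is a minimal OCaml sketch. The names map_with, par_map and sequential_stand_in are hypothetical and do not come from the thread; the parallel mapper is passed as a parameter so that the sketch runs with the standard library alone, and the Parmap call shown in the comment is an assumption about how one would plug Parmap in.

```ocaml
(* Sketch of the "nprocs" pattern: stay sequential unless nprocs > 1.
   [par_map] is a hypothetical parameter standing for any parallel map
   with the shape ~ncores -> f -> list -> list. *)
let map_with ~nprocs ~par_map f xs =
  if nprocs > 1 then par_map ~ncores:nprocs f xs
  else List.map f xs

(* Stand-in so that this file runs without Parmap installed: it ignores
   the number of cores and maps sequentially. It is NOT parallel. *)
let sequential_stand_in ~ncores:_ f xs = List.map f xs

(* With Parmap installed, one would presumably pass instead:
     fun ~ncores f xs -> Parmap.parmap ~ncores f (Parmap.L xs) *)

let square x = x * x

let () =
  let data = [1; 2; 3; 4] in
  let seq = map_with ~nprocs:1 ~par_map:sequential_stand_in square data in
  let par = map_with ~nprocs:4 ~par_map:sequential_stand_in square data in
  (* Both paths must agree, whatever mapper is plugged in. *)
  assert (seq = par);
  List.iter (Printf.printf "%d ") par;
  print_newline ()
```

Keeping the mapper as a parameter makes it easy to compare the sequential and parallel paths on the same input before trusting the speedup.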
Archive: https://sympa.inria.fr/sympa/arc/caml-list/2014-01/msg00036.html
Martin DeMello announced:
Thomas Leonard of 0install has written up a series of blog posts detailing his porting of 0install from Python to OCaml. The latest one, on the bugs he encountered along the way, is particularly interesting: http://roscidus.com/blog/blog/2014/01/07/ocaml-the-bugs-so-far/
Archive: https://sympa.inria.fr/sympa/arc/caml-list/2014-01/msg00046.html
Louis Mandel announced:
We are happy to announce the new release of ReactiveML: http://reactiveml.org.
ReactiveML is a language similar to OCaml extended with a built-in notion of parallel composition. It is based on the reactive synchronous model, a cooperative concurrency model. The language is well suited to program applications with a lot of parallel processes which communicate and synchronize a lot such as video games or simulation problems. ReactiveML is compiled into plain OCaml code and thus can link or be linked to any OCaml code.
This new version includes among others the following feature:
- a new programming construct "signal/memory" which helps in particular to program concurrent data structures.
The full list of changes is available at http://reactiveml.org/distrib/changes.
ReactiveML can be installed from the sources: http://reactiveml.org/distrib/rml-1.09.02-2014-01-08.tar.gz
or using OPAM: opam install rml
Archive: https://sympa.inria.fr/sympa/arc/caml-list/2014-01/msg00047.html
Roberto Di Cosmo requested:
The short story is that we need help in packaging for Mac OSX and Windows some specialised solvers that can be used with opam, and are already packaged for Debian: aspcud, mccs and packup, with aspcud being the highest priority.
The Debian packages, built by Ralf Treinen, are a good entry point for any volunteer, as they contain the pointer to the upstream source, list the C/C++ libraries and other dependencies involved, and contain useful patches not yet integrated upstream; even if you do not have a Debian (or Ubuntu) machine at hand, most of the information is available here:
aspcud : http://packages.debian.org/wheezy/aspcud
mccs : http://packages.debian.org/wheezy/mccs
packup : http://packages.debian.org/wheezy/packup
If you are willing to help with this work, please post a comment under the corresponding issues opened on github, and say to what part of the effort you wish to contribute, so we can have some basic coordination among the volunteers.
aspcud : https://github.com/ocaml/opam/issues/1074
mccs : https://github.com/ocaml/opam/issues/1075
packup : https://github.com/ocaml/opam/issues/1076
Now, here comes the long story, that might be of interest for you even if you are just a regular user of opam.
As you might know, the opam package manager incorporates state-of-the-art technology that has been developed for managing packages in GNU/Linux distributions in the Mancoosi project (www.mancoosi.org), and uses the libcudf and dose libraries from it.
A key point of the approach which we set forth in Mancoosi (for the interested reader, see [1,2] below), is that a modern package manager should clearly separate a platform specific front-end from a platform independent back-end in charge of solving dependencies in order to perform upgrades or downgrades according to the user's preferences.
Using the CUDF format as a pivot, we can share the effort of developing efficient solvers among multiple package managers, and can even use different solvers in the same package manager if the default one hits its limits (which *can* happen, as dependency solving is an NP-complete problem). This is actually possible today in Debian: you can call apt-get with the option --solver SOLVER where SOLVER can be aspcud, mccs or packup.
User preferences are an important, and often overlooked concept: when you request the installation of a package, there might be many ways of satisfying it, and a user should be able to state, for example, whether he is a "trendy" person that prefers to see all other packages upgraded to the latest possible version at the same time, or on the contrary, a "paranoid" person that wants to change them as little as possible.
This is why all CUDF compliant solvers allow user preferences to be specified by combining lexicographically keywords like "new", "changed", "removed", "notuptodate"; for example, a trendy person will use as preference
    -removed,-notuptodate,-new
meaning "please minimise the number of removed packages, and then the number of packages that are not at their latest version, and finally the number of packages that were not already present", while a paranoid person would use
    -removed,-changed
meaning "please minimise the number of removed packages, and then the number of packages that are modified (installed/removed/upgraded/downgraded)".
Actually, the criteria supported by the CUDF solvers evolved over time, and aspcud supports nowadays a more sophisticated preference language (see http://www.mancoosi.org/misc-2012/criteria/), but all of "new", "changed", "removed" and "notuptodate" can be encoded in the latest version, and aspcud as packaged for Debian accepts them right away.
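To make the lexicographic combination of criteria described above concrete, here is a small OCaml sketch. It is only an illustration of the ordering, not how aspcud or opam implement it: the solution record, its field names (newly_installed stands in for "new", which is a reserved word in OCaml), the helper functions and the two candidate solutions are all hypothetical, and only the "-keyword" (minimise) form quoted above is handled.

```ocaml
(* Counts describing one candidate solution (hypothetical record). *)
type solution = {
  removed : int;
  changed : int;
  newly_installed : int;   (* stands for the "new" keyword *)
  notuptodate : int;
}

(* Map a criterion keyword to the count it asks the solver to minimise. *)
let measure_of_keyword = function
  | "removed" -> (fun s -> s.removed)
  | "changed" -> (fun s -> s.changed)
  | "new" -> (fun s -> s.newly_installed)
  | "notuptodate" -> (fun s -> s.notuptodate)
  | k -> invalid_arg ("unknown criterion: " ^ k)

(* Parse e.g. "-removed,-notuptodate,-new" into an ordered list of measures. *)
let parse_criteria str =
  String.split_on_char ',' str
  |> List.map String.trim
  |> List.map (fun c ->
         let n = String.length c in
         if n < 2 || c.[0] <> '-' then
           invalid_arg ("unsupported criterion: " ^ c)
         else measure_of_keyword (String.sub c 1 (n - 1)))

(* Lexicographic comparison: the first measure that differs decides;
   a negative result means [a] is preferred over [b]. *)
let rec compare_solutions measures a b =
  match measures with
  | [] -> 0
  | m :: rest ->
    let c = compare (m a) (m b) in
    if c <> 0 then c else compare_solutions rest a b

let () =
  let trendy = parse_criteria "-removed,-notuptodate,-new" in
  let paranoid = parse_criteria "-removed,-changed" in
  (* Two made-up candidate solutions for the same request. *)
  let upgrade_all =
    { removed = 0; changed = 5; newly_installed = 1; notuptodate = 0 } in
  let touch_little =
    { removed = 0; changed = 1; newly_installed = 1; notuptodate = 4 } in
  assert (compare_solutions trendy upgrade_all touch_little < 0);
  assert (compare_solutions paranoid touch_little upgrade_all < 0);
  print_endline "trendy prefers upgrade_all; paranoid prefers touch_little"
```

Both candidates remove nothing, so the first criterion ties and the second one decides: the trendy ordering picks the solution with fewer out-of-date packages, the paranoid ordering the one with fewer changes.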
The good news is that opam has built-in support to use aspcud as an external CUDF-compliant solver too: if aspcud is installed on your machine, then dependency solving will be delegated to aspcud, and the default preferences are set to -removed,-notuptodate,-new. You can change these preferences if needed by just setting the OPAMCRITERIA environment variable, and you can ask opam not to use the external solver with the option --no-aspcud (that should really be --no-external-solver).
Now that the opam users are hitting the limits of the internal solver embedded in opam, it is important to have CUDF compliant solvers available on all platforms: aspcud is a priority because it can be used by opam right away (no need to change even a single line of code), but packup and mccs are important too, as with little changes in the opam code one will be able to choose which solver to use.
So, to sum up:
- if you are running Debian (or a Debian-based distribution, like Ubuntu), lucky you, just apt-get install aspcud and you are all set; you can even play with OPAMCRITERIA to adjust the solutions to your needs, and you can use --no-aspcud if you want to see the difference with the internal solver
- if you are on another platform, stay tuned for when packages for aspcud/packup/mccs will be available on your platform
-- Roberto
[1] A modular package manager architecture, http://dx.doi.org/10.1016/j.infsof.2012.09.002 also freely available from http://www.dicosmo.org/Articles/2013-AbateDiCosmoTreinenZacchiroli-Ist.pdf
[2] Dependency solving: A separate concern in component evolution management, http://dx.doi.org/10.1016/j.jss.2012.02.018 also freely available from http://www.dicosmo.org/Articles/2012-AbateDiCosmoTreinenZacchiroli-Jss.pdf
Archive: https://sympa.inria.fr/sympa/arc/caml-list/2014-01/msg00071.html
Goswin von Brederlow asked and David Sheets replied:
> last year someone here was working on bindings for zeromq.
>
> Is that still ongoing and how is it going? Does it support zeromq 4.0 with CURVE?
https://github.com/fmp88 was working on https://github.com/fmp88/ocaml-zeromq and https://github.com/fmp88/ocaml-czmq but I don't know much about their status.
Also, I have https://github.com/dsheets/ocaml-sodium which is sufficient to implement DNSCurve.
These all use ctypes which should be getting stub generation Real Soon Now.
Daniel Bünzli also replied:
There's at least this:
> opam list --all | grep zmq | cut -d " " -f 1 | xargs opam info
package: lwt-zmq
version: 1.0-beta4
repository: default
upstream-url: https://github.com/hcarty/lwt-zmq/archive/v1.0-beta4.tar.gz
upstream-kind: http
upstream-checksum: 29338d125a545daf45df9e3d7631d01d
homepage: https://github.com/hcarty/lwt-zmq
depends: ocamlfind & lwt & ocaml-zmq
installed-version:
available-versions: 1.0-beta3, 1.0-beta4
description: Lwt-friendly wrapping around ZeroMQ sockets
package: ocaml-zmq
version: 0
repository: default
upstream-url: https://github.com/pdhborges/ocaml-zmq/tarball/master
upstream-kind: http
upstream-checksum: 8e845370b99630c2a84cf4495480522e
homepage: https://github.com/pdhborges/ocaml-zmq
depends: ocamlfind & ounit & uint
installed-version:
available-version: 0
description: OCaml bindings for ZMQ 2.1
Thanks to Alp Mestan, we now include in the OCaml Weekly News the links to the recent posts from the ocamlcore planet blog at http://planet.ocaml.org/.
Univalent foundations subsume classical mathematics: http://math.andrej.com/2014/01/13/univalent-foundations-subsume-classical-mathematics/
Unikernels, and the Rise of the Virtual Library Operating System: http://anil.recoil.org/2014/01/13/unikernels-in-cacm.html
Async Parallel: https://ocaml.janestreet.com/?q=node/122
When QuickCheck Fails Me: http://www.mega-nerd.com/erikd/Blog/CodeHacking/Haskell/quickcheck_fail.html
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.