Hello
Here is the latest Caml Weekly News, for the week of March 26 to April 02, 2013.
Archive: https://sympa.inria.fr/sympa/arc/caml-list/2013-03/msg00267.html
Adrien Nader announced:

I'm pleased (and much relieved) to announce a new version of the yypkg mingw-builds. Together with yypkg, this provides a package manager both on Linux and on Windows with 70 packages of libraries and binaries built both for i686 and x86_64. Everything is easily and fully reproducible.

Release highlights include:
- New x64 toolchain (both as a cross- and native toolchain)
- Improved native toolchains
- OCaml cross-compiler, only targets i686-w64-mingw32 (see bug 5737)
- More packages
- Updated packages
- Installers for Windows

You can find details, documentation and downloads on the new website:
http://yypkg.org/mingw-builds/

PS: you can also read/write comments and vote on reddit if you feel like it's worth it:
http://www.reddit.com/r/programming/comments/1b8egi/yypkg_mingwbuilds_windowslinux_package_manager_70/

He later added:
I've put up a new version (1.2 RC1) which fixes the issues I had found in beta1. The documentation and the binaries are updated; there are small fixes here and there (mostly in the doc even though there is still room for improvement).

The address hasn't changed and you can find everything on yypkg.org:
http://yypkg.org/mingw-builds/

As before, I've submitted it to reddit. Unlike before, I haven't been blocked by its spam filter.
http://www.reddit.com/r/programming/comments/1bi47j/yypkg_mingwbuilds_12rc1_70_packages_for_windows/
Archive: https://sympa.inria.fr/sympa/arc/caml-list/2013-03/msg00246.html
Philippe Veber asked:

I'm developing an ocsigen website doing some scientific calculations. Up to now, the calculations were done in the same process that runs the server. In order to gain in scalability (and maybe stability too), I would like to run those calculations in a separate (pool of) process(es). As this is a pretty typical setup, I guess quite a few people have already done that. So I'd like to hear some suggestions on what library to use in this particular context. It seems to me that the release library [1] should do the job and is lwt-friendly, but there are maybe other good options?

Thanks for any hint, cheers!
Philippe.

[1] https://github.com/andrenth/release

Martin Jambon suggested:
I wrote and used a library called Nproc about a year ago. It lets you create (Nproc.create) a pool of N processes, to which you can submit (Nproc.submit) computations of any type quasi-magically - just make sure any big environment required for the computation is not copied with each closure that you send to the workers. The submodule Nproc.Full provides a more advanced interface that lets each worker process have its own local environment.

https://github.com/MyLifeLabs/nproc

I haven't used Nproc in a while but it was working fine and should still work.

Philippe Veber then said and Alain Frisch replied:
> nproc meets exactly my needs: a simple lwt-friendly interface to
> dispatch function calls on a pool of processes that run on the same
> machine. I have only one concern, that should probably be discussed on
> the ocsigen list, that is I wonder if it is okay to fork the process
> running the ocsigen server. I think I remember warnings on having parent
> and children processes sharing connections/channels but it's really not
> clear to me.

FWIW, LexiFi uses an architecture quite close to this for our application. The main process manages the GUI and dispatches computation tasks to external processes. Some points to be noted:

- Since this is a Windows application, we cannot rely on fork. Instead, we restart the application (Sys.argv.(0)), with a specific command-line flag, captured by the library in charge of managing computations. This is done by calling a special function in this library; the function does nothing in the main process and in the sub-processes, it starts the special mode and never returns. This gives a chance to the main application to do some global initialization common to the main and sub processes (for instance, we dynlink external plugins in this initialization phase).

- Computation functions are registered as global values. Registration returns an opaque handle which can be used to call such a function. We don't rely on marshaling closures.

- The GUI process actually spawns a single sub-process (the Scheduler), which itself manages more worker sub-sub-processes (with a maximal number of workers). Currently, we don't do very clever scheduling based on task priorities, but this could easily be added.

- An external computation can spawn sub-computations (by applying a parallel "map" to a list) either synchronously (direct style) or asynchronously (by providing a continuation function, which will be applied to the list of results, maybe in a different process). In both cases, this is done by sending those tasks to the Scheduler. The Scheduler dispatches computation tasks to available workers. In the synchronous parallel map, the caller runs an inner event loop to communicate with the Scheduler (and it only accepts sub-tasks created by itself or one of its descendants).

- Top-level external computations can be stopped by the main process (e.g. on user request). Concretely, this kills all workers currently working on that task or one of its sub-tasks.

- In addition to sending back the final results, computations can report progress to their caller and more intermediate results. This is useful to show a progress bar/status and partial results in the GUI before the end of the entire computation.

- Communication between processes is done by exchanging marshaled "variants" (a tagged representation of OCaml values, generated automatically using our runtime types). Since we can attach special variantizers/devariantizers to specific types, this gives a chance to customize how some values have to be exchanged between processes (e.g. values relying on internal hash-consing are treated specially to recreate the maximal sharing in the sub-process).

- Concretely, the communication between processes is done through queues of messages implemented with shared memory. (This component was developed by Fabrice Le Fessant and OCamlPro.) Large computation arguments or results (above a certain size) are stored on the file system, to avoid having to keep them in RAM for too long (if all workers are busy, the computation might wait for some time before being started).

- The API supports easily distributing computation tasks to several machines. We have done some experiments with using our application's database to dispatch computations, but we don't use it in production.

Anil Madhavapeddy then asked and Alain Frisch replied:
> Are all of the messages through these queues persistent, or just the larger
> ones that are too big to fit in the shared memory segment, and are they
> always point-to-point streams?
>
> We've got a similar need in Xen/Mirage for shared memory communication and
> queues, and have been breaking them out into standalone libs such as:
>
> https://github.com/djs55/shared-memory-ring
>
> ...which is ABI-compatible with the existing Xen shared memory interfaces,
> and also an OCaml version of the transport-agnostic API sketched out in:
> http://anil.recoil.org/papers/2012-resolve-fable.pdf
>
> The missing link currently is the persistent queuing service, but we're
> investigating the options here (ocamlmq looks rather nice).

The messages always go through shared memory queues (non persistent), but their payload is offloaded to the file system (in temporary files) when it is too large. There is no real persistence, though, because the temporary file is not self-describing (it is not sufficient to restart a computation after a process failure, for instance). (The distribution of computations through a database is closer to real persistence.)

Queues are unidirectional point-to-point streams between two processes (we use a pair of such queues between the main process and the scheduler, and between the scheduler and each worker process). The relevant part of the API is:

========================================================
type t
(** The type of point-to-point shared memory queues. *)

val create: ?max_size:int -> unit -> t
(** A queue is created in one process with [create], passed by name
    (using [id]) to a single other process which can call [from_id]. *)

val from_id: ?max_size:int -> string -> t
(** Access a queue created by another process given its unique name. *)

val id: t -> string
(** Globally (system-wide) unique name attached to the queue. *)

val close: t -> unit

exception CannotGrow
exception BrokenPipe

val send: t -> string -> unit
(** Non-blocking operation. Raises [BrokenPipe] if the reader has
    disconnected. Raises [CannotGrow] if the maximum size of the buffer
    has been reached. *)

val read: t -> string option
(** Non-blocking operation. Returns [None] if no message is available.
    Raises [BrokenPipe] if the writer has disconnected. *)
========================================================

If there is some interest for it, maybe Fabrice could release the code with an open source license?

One thing which is not supported by this library is a notification mechanism to inform the other side of the queue that messages are available. For now, we simulate that by pinging the processes' stdin/stdout descriptors. In the scheduler and the worker, we use select to monitor them (there is probably a big cost of doing so, especially considering how select is emulated under Windows). In the GUI process, we use standard .Net process monitoring facilities to inject callbacks in the main GUI thread. (The OCaml side of our application is mono-threaded, and we use a few external native or .Net threads to monitor system conditions.)

Gerd Stolpmann also replied:
Interesting that there are now other shared memory implementations for OCaml. Note that there are a number of them in Ocamlnet, with some specialities not yet mentioned.

There is the Netcamlbox library providing message boxes of limited size for exchanging OCaml values directly. That means the value is copied to the shared memory block by the sender, and the receiver can pick it up there without copying it again. Sender and receiver can map the memory at different addresses (the copy procedure invoked by the sender takes care of possible offsets, so that Netcamlbox also allows the communication between processes that don't have a fork relation). There is no need for marshalling the value.

http://projects.camlcity.org/projects/dl/ocamlnet-3.6.3/doc/html-main/Netcamlbox.html

Going even beyond that, Netmulticore implements an "ancient" heap in shared memory (like Richard's Ancient lib, but with more options). This heap is organized like OCaml's major heap, and there is even a GC implementation for it. There are a number of data structures (arrays, hash tables, queues, buffers) which are aware of residing in shared memory. For synchronization there are mutexes, semaphores and condition variables. As long as the values to manipulate are already in shared memory, programming with Netmulticore feels a lot like programming with multi-threading. In practice, however, you need to frequently copy values in and out, so it is not exactly as convenient. For Netmulticore, all processes must map the shared memory to the same address (easy with "fork").

http://projects.camlcity.org/projects/dl/ocamlnet-3.6.3/doc/html-main/Intro.html#netmulticore
http://projects.camlcity.org/projects/dl/ocamlnet-3.6.3/doc/html-main/Netmcore_tut.html

> The missing link currently is the persistent queuing service, but
> we're investigating the options here (ocamlmq looks rather nice).

There is also Netamqp, which can be used together with RabbitMQ.

http://projects.camlcity.org/projects/netamqp.html

Gerd Stolpmann also replied to the initial question:
Well, I don't know whether this is an option for Ocsigen users, but Ocamlnet includes fairly good multiprocessing support. You can run servers that dynamically start subprocesses on demand. Look for Netplex:

http://projects.camlcity.org/projects/dl/ocamlnet-3.6.3/doc/html-main/Intro.html#netplex

I've no good recipe, though, how to plug in service processors that are based on lwt (well, there is an adaptor in Ocamlnet for lwt - Uq_lwt - but I wouldn't know what to do on the Ocsigen side, but maybe worth exploring).

Ocamlnet also includes other mechanisms that are generally interesting for compute stuff, namely Netmulticore for exploiting several cores on the same machine with fast shared memory architecture, and RPC for distributing computations in a network. Both are extensions of Netplex, so it is easy to integrate them into a single program.

Philippe Veber then asked and Gerd Stolpmann replied:
> Thanks for your input Gerd! As I understand it, your suggestion is to have
> an RPC server (based on netplex) doing the actual calculations. That RPC
> server would be called by the ocsigen server when needed (the ocsigen
> server is the client in that scheme). So in that schema, only the RPC call
> should be lwt-friendly. Digging in ocamlnet documentation, it seems that
> could be achieved using [Rpc_simple_client.call] wrapped inside a
> [Lwt_preemptive.detach].

I guess RPC would be most convenient here - it supports a server mode where the child processes accept the new connections (btw, if you don't want to deal with the RPC encoding stuff (i.e. XDR), just marshal the OCaml value as string, and use RPC functions that are declared as string->string). A sample program would be the "finder" service here:

http://projects.camlcity.org/projects/dl/ocamlnet-3.6.3/examples/rpc/finder
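
Gerd's suggestion of declaring RPC functions as string->string and marshaling OCaml values through them can be sketched with the standard Marshal module alone. This is only an illustration under assumptions: the names wrap, call and mean_service are hypothetical, and the RPC transport is replaced by a direct function call, so nothing below is Ocamlnet's API. Marshal is not type-safe, so both sides must agree on the types by construction.

```ocaml
(* Server side: turn a typed function into a string -> string procedure,
   the shape an RPC function declared as string->string would have. *)
let wrap (f : 'a -> 'b) : string -> string =
  fun request ->
    let arg : 'a = Marshal.from_string request 0 in
    Marshal.to_string (f arg) []

(* Client side: marshal the argument, invoke the remote procedure, and
   unmarshal the reply. [remote] stands in for the real RPC call
   (assumption: the transport carries the bytes unchanged). The result
   type is unchecked, so the caller must annotate it correctly. *)
let call (remote : string -> string) (arg : 'a) : 'b =
  Marshal.from_string (remote (Marshal.to_string arg [])) 0

(* A sample computation exposed as a string -> string service. *)
let mean (xs : float list) : float =
  List.fold_left (+.) 0.0 xs /. float_of_int (List.length xs)

let mean_service : string -> string = wrap mean

let () =
  let result : float = call mean_service [1.0; 2.0; 3.0] in
  Printf.printf "mean = %g\n" result
```

In a real setup, mean_service would be the body of the server-side procedure and call would go through the RPC client, for instance inside the Lwt_preemptive.detach wrapper Philippe mentions.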
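
Alain Frisch's point earlier in this thread, that computation functions are registered as global values and called through an opaque handle instead of marshaling closures, can be illustrated as follows. This is a minimal sketch and not LexiFi's library: the Registry module and all its function names are hypothetical, plain Marshal stands in for their "variants", and the worker is simulated by calling serve in the same process.

```ocaml
module Registry : sig
  type ('a, 'b) handle
  (* Register a function under a unique name; run at module initialization
     so that the main process and the workers build the same table. *)
  val register : string -> ('a -> 'b) -> ('a, 'b) handle
  (* Main process: build the message to send to a worker. *)
  val request : ('a, 'b) handle -> 'a -> string
  (* Worker: look up the function by name, run it, marshal the result. *)
  val serve : string -> string
  (* Main process: decode a worker's reply. *)
  val result : ('a, 'b) handle -> string -> 'b
end = struct
  (* A handle is only the registered name; the type parameters are phantom
     and keep calls well-typed on the caller's side. *)
  type ('a, 'b) handle = string

  let table : (string, string -> string) Hashtbl.t = Hashtbl.create 16

  let register name f =
    if Hashtbl.mem table name then invalid_arg ("already registered: " ^ name);
    Hashtbl.replace table name
      (fun arg -> Marshal.to_string (f (Marshal.from_string arg 0)) []);
    name

  (* Only the name and the marshaled argument travel, never a closure. *)
  let request h x = Marshal.to_string (h, Marshal.to_string x []) []

  let serve msg =
    let (name, arg) : string * string = Marshal.from_string msg 0 in
    (Hashtbl.find table name) arg

  let result _ reply = Marshal.from_string reply 0
end

(* Registered at top level, hence available in every process that runs
   the same executable. *)
let square = Registry.register "square" (fun (x : int) -> x * x)

let () =
  let msg = Registry.request square 12 in
  let reply = Registry.serve msg in   (* would run in a worker process *)
  Printf.printf "square 12 = %d\n" (Registry.result square reply)
```

Because the worker is the same executable restarted with a special flag, its registration code has already run by the time a request arrives, so the name in the message resolves to the same function on both sides.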
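
To make the semantics of the queue interface quoted by Alain Frisch more concrete, here is a toy in-process model of it. It is an assumption-laden sketch and not the OCamlPro implementation: there is no shared memory, a global table stands in for the system-wide name space, max_size is interpreted as the total number of pending payload bytes with an arbitrary default of 1024, and read is assumed to drain pending messages before reporting a disconnected writer.

```ocaml
module Model_queue : sig
  type t
  val create : ?max_size:int -> unit -> t
  val from_id : ?max_size:int -> string -> t
  val id : t -> string
  val close : t -> unit
  exception CannotGrow
  exception BrokenPipe
  val send : t -> string -> unit
  val read : t -> string option
end = struct
  exception CannotGrow
  exception BrokenPipe

  (* State shared by the two endpoints of one queue. *)
  type shared = {
    name : string;
    pending : string Queue.t;
    mutable bytes : int;              (* total size of pending messages *)
    limit : int;
    mutable creator_closed : bool;
    mutable peer_closed : bool;
  }

  type t = { shared : shared; is_creator : bool }

  (* Stand-in for the system-wide name space of real shared memory. *)
  let registry : (string, shared) Hashtbl.t = Hashtbl.create 8
  let counter = ref 0

  let create ?(max_size = 1024) () =
    incr counter;
    let name = Printf.sprintf "queue-%d" !counter in
    let shared =
      { name; pending = Queue.create (); bytes = 0; limit = max_size;
        creator_closed = false; peer_closed = false } in
    Hashtbl.replace registry name shared;
    { shared; is_creator = true }

  (* The model ignores [max_size] here; raises [Not_found] on unknown names. *)
  let from_id ?max_size name =
    ignore (max_size : int option);
    { shared = Hashtbl.find registry name; is_creator = false }

  let id q = q.shared.name

  let close q =
    if q.is_creator then q.shared.creator_closed <- true
    else q.shared.peer_closed <- true

  let other_side_closed q =
    if q.is_creator then q.shared.peer_closed else q.shared.creator_closed

  let send q msg =
    if other_side_closed q then raise BrokenPipe;
    let s = q.shared in
    if s.bytes + String.length msg > s.limit then raise CannotGrow;
    Queue.push msg s.pending;
    s.bytes <- s.bytes + String.length msg

  let read q =
    let s = q.shared in
    match Queue.take_opt s.pending with
    | Some msg -> s.bytes <- s.bytes - String.length msg; Some msg
    | None -> if other_side_closed q then raise BrokenPipe else None
end

let () =
  let writer = Model_queue.create ~max_size:16 () in
  let reader = Model_queue.from_id (Model_queue.id writer) in
  Model_queue.send writer "task-1";
  assert (Model_queue.read reader = Some "task-1");
  assert (Model_queue.read reader = None);   (* non-blocking: nothing pending *)
  (match Model_queue.send writer (String.make 32 'x') with
   | () -> assert false
   | exception Model_queue.CannotGrow -> print_endline "buffer full");
  Model_queue.close writer;
  (match Model_queue.read reader with
   | _ -> assert false
   | exception Model_queue.BrokenPipe -> print_endline "writer gone")
```

Since both operations are non-blocking, a reader has to poll or be notified by some other channel, which is the missing notification mechanism Alain describes working around with stdin/stdout pings and select.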
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.