Hello
Here is the latest Caml Weekly News, for the week of October 25 to November 01, 2011.
Archive: https://sympa-roc.inria.fr/wws/arc/caml-list/2011-10/msg00119.html
Yaron Minsky announced:Jane Street is looking to hire functional programmers for our offices in New York, London and Hong Kong. We're looking for both interns for this upcoming summer as well as full-time hires. Jane Street has the largest team of OCaml developers in any industrial setting, and probably the world's largest OCaml codebase. We use OCaml for running our entire business, supporting everything from statistical research to systems administration to automated trading systems. If you're interested in using OCaml to solve real-world problems, there's no better place. Jane Street has (in my humble opinion) a great work environment. It has a very informal feel --- if you dress much nicer than a t-shirt in jeans, you'll look out of place --- but it's also an intellectually challenging place where you get to wrestle with hard problems and learn about the subject matter of trading, a fascinating field in its own right. There's also a strong focus on education, with both on the job training and a system of formal classes. We also have a strong commitment to OCaml and to open-source software. We released our Core suite of libraries a few years back, and we continue to extend the reach of our public releases. And we have and will continue to financially support projects to improve the OCaml ecosystem. Compensation is more than competitive, and no prior experience with finance is required. Here are some resources you can use to learn more about Jane Street and what we do. - A talk I gave at CMU about how and why we use OCaml http://ocaml.janestreet.com/?q=node/61 - Our technical blog: http://ocaml.janestreet.com If you want to get a flavor of our approach to software, you might be interested in looking at Async, a monadic concurrency library we just released: http://ocaml.janestreet.com/?q=node/100 Follow this link to apply: http://janestreet.com/apply
Archive: https://sympa-roc.inria.fr/wws/arc/caml-list/2011-10/msg00121.html?checked_cas=2
Yaron Minsky announced:While we're in the announcing mood, I wanted to announce the first public release of Async, Jane Street's monadic concurrency library. You can find out more about Async here: http://ocaml.janestreet.com/?q=node/100Cedric Cellier asked and Yaron Minsky replied:
> While comparing async and lwt you forget to mention > performances. Didn't you run any benchmark with some result to share ? > Or are Async performances irrelevant to your use cases so you never > benchmarked ? Perhaps surprisingly, while we've benchmarked and tuned the performance of Async, but we've never done the same for Lwt. This has to do with history as much as anything else. We wrote the first version of Async years ago, when Lwt was much less mature. We looked at Lwt at the time and decided that there were several design choices that made it unsuitable for our purposes, so we wrote Async. We've looked back at Lwt for inspiration over the years, but we've never again had the question in front of us as to whether to use Lwt or Async, so doing the benchmarking has never been high on our list. We do have some guesses about how performance differs. For example, Lwt's bind is faster than Async's bind because of different interleaving policies: Lwt can run the right-hand side of a bind immediately, whereas in Async, it's always scheduled to run later, so that's more work that Async has to do in. That said, we prefer the semantics of Async's bind, even though it has a performance cost. (You can easily implement Lwt-style bind.) Another thing to note for any intrepid benchmarkers is that the released version of Async is missing a useful feature that is already available in our development trunk, which is tail-recursive bind. In the released version, the following loop: let rec loop () = after (sec 30.) >>= fun () -> printf ".%!"; loop () will allocate an unbounded amount of space, creating one deferred value every 30 seconds. In our development trunk bind is tail-recursive, but that hasn't gotten to the public release yet. You can do the same loop efficiently in Async using the upon operator: let rec loop () = after (sec 30.) >>> fun () -> printf ".%!"; loop () This is actually more efficient than using the tail-recursive version of bind, since it allocates one less deferred value every time through the loop.Gerd Stolpmann commented:
Which is already the third one (Equeue and Lwt being the others). I'm very up to reinventing the wheel, but I guess there is some reason. Does Janestreet use any open source libraries? Or does the commitment not go that far?Yaron Minsky then replied and Gerd Stolpmann said:
> I admit to a certain amount of ambivalence to releasing Yet Another > Monadic Concurrency library. (I don't think Equeue is in quite the > same category, since it has such a different style of interface.) Equeue offers several styles. On the lower level it is quite different, but the version offered by the Uq_engines module is pretty much the same with its own naming style (e.g. seq instead of bind etc.). > But I think we had good reasons for creating Async. As I said in my > blog post, the differences in error-handling and interleaving policy > were enough that we really felt we needed a different library. In deed this is interesting. Equeue also follows Lwt's idea not to interleave when possible, simply for performance reasons. You can, however, demand at any time to interleave (with the eps_e operator), and a user wanting this always could build a little layer on top of Equeue to force it. Exceptions are specially represented in Equeue as a different kind of return value. Most operators catch exceptions, cancel the remaining part of the execution flow (which is also specially supported, simulating a "falling through" effect), and return the exception. Effectively, this means that exceptions do not need that much attention, and often the right thing is done anyway. > And now that we've created it, there are multiple reasons to release > it. For one thing, we want it for out own open-source projects > outside of the office! And it's a precondition for us for releasing > other software that we've developed internally that depends on Async. Honestly, I'm a bit concerned about the yet-another-async library, because a library writer can usually only support one library, and remains incompatible with the other ones. What would you do if you wanted to use NetAMQP, which is for Equeue? It would be complicated (and maybe even impossible) to integrate this library into a program using Async otherwise. For Equeue+Lwt in the same program there is now at least a partial solution, but it is not really pleasant. What I fear is that the community is split up into fractions. The community is not so large that we can afford this. > As an aside, we use lots of OCaml libraries developed outside our > walls: RES, PCRE, Lacaml, Postgres bindings and OUnit and xml-light, > to name some off the top of my head. Where some of the developers happen to be Jane Street employees... Seriously, in many companies there is a culture to block everything from outside, and that's the other half of my concerns. Not that all this means you shouldn't publish your code. It's great but certainly not substituting community interaction.Milan Stanojević asked and Gerd Stolpmann replied:
> > In deed this is interesting. Equeue also follows Lwt's idea not to > > interleave when possible, simply for performance reasons. You can, > > Did you mean to say "Lwt interleaves when possible"? > > For example > > let foo () = > let r = x >>= bar in > .... > r > > in Async [bar] will run only after [foo] has completed (therefore > there is no interleaving between [foo] and [bar]), while in Lwt [bar] > can run in the middle of [foo] so there is an interleaving You probably mean the case where x has already computed a result. When developing Uq_engines, I was a bit unsure how to treat this case. In the initial version, I just disallowed this case: There was simply no way to run into it. If you wanted to create a thread that just yields a constant value, the only way was the function eps_e, which "computes" the constant in one step (i.e. it is not immediately there, but only after rescheduling). Later I allowed this case, and also added const_e, which creates a thread with an immediate result. Nevertheless, almost all code I produced uses eps_e, not const_e, because the possible interactions with state changes are easier to overlook. However, there is also const_e when speed really matters. Besides this problem, I'm unsure where the other conceptual differences are between the threading implementation. For example, Equeue has actually two schedulers: one very simple one, which is only used when no I/O and no delays need to be considered (and which is local to Uq_engines), and a heavy one when these phenomenons can occur. The simple scheduler is really unfair - the next thread at hand is executed, often leading to a cache-friendly execution flow with high locality. But it's fast. The other scheduler is absolutely fair: all resources are treated equal, and events are queued up (fifo order). It is only invoked when the simple scheduler cannot continue. So yes, there are probably differences. I am a bit surprised that Equeue (which is part of Ocamlnet, btw) is not recognized as monadic threading library, although it was definitely the first public Ocaml library exploring this idea (beginnings in 1999), and has probably the largest user code base (remember the original wink.com search service was written with it). It was "only" never been announced as monadic, and uses a different terminology because it's a threading library in the first place, and I always hoped it's easier to understand for people without Haskell background.Milan Stanojević then said and Gerd Stolpmann replied:
> > When developing Uq_engines, I was a bit unsure how to treat this case. > > In the initial version, I just disallowed this case: There was simply no > > way to run into it. If you wanted to create a thread that just yields a > > In Async and Lwt I think it quite possible to have this case even if x > was not just giving you a constant value. For example, x could be > read_char (), and if you have buffered input (like Async does, and > probably Lwt) it is quite possible that read_char will be determined > immediately (since there is no need to issue a system call) Right, if functions like read_char are considered as primitives, you can have that. In Equeue e.g. Uq_io.really_input_e isn't a primitive, but just something built on top of input_e, and if it reads from the buffer, the value is returned via eps_e. So, it depends a bit on the design which functions show this phenomenon.rixed asked:
What if someday you want to use an external library that uses lwt ? Will it be possible to mix the two ? Since this kind of monadic library can easily impose its behavior on all other library around (if for nothing else than the exception mechanism to use), I have the feeling that for us mere users choosing between lwt and async is to choose between two large sets of incompatible libraries, am I wrong ?Yaron Minsky then said and Jérémie Dimino replied:
> It's an excellent question, and one I don't yet have a good feel for. It > would be great to find some kind of modus vivendi which would allow the > libraries to interoperate. I think it is not too hard to mix Lwt.t and Defered.t values, one can start with something like that: let lwt_of_async t = let waiter, wakener = Lwt.wait () in Defered.upon t (Lwt.wakeup wakener); waiter let async_of_lwt t = Defered.create (fun ivar -> Lwt.on_success t (Ivar.fill ivar)) But we need to check how this behaves with error handling, and also the local storage of Lwt. For the scheduler the easiest solution is probalby to write a Lwt engine based on Async.
Thanks to Alp Mestan, we now include in the Caml Weekly News the links to the recent posts from the ocamlcore planet blog at http://planet.ocamlcore.org/. Junior Scala Developer at Hugh Gilmore (Full-time): http://functionaljobs.com/jobs/84/junior-scala-developer-at-hugh-gilmore A First-Principles GIF Encoder: http://alaska-kamtchatka.blogspot.com/2011/10/first-principles-gif-encoder.html Announcing Async: http://ocaml.janestcapital.com/?q=node/100
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.