Hello
Here is the latest Caml Weekly News, for the week of May 10 to 17, 2011.
Archive: https://sympa-roc.inria.fr/wws/arc/caml-list/2011-05/msg00067.html
Ashish Agarwal announced:

I am excited to tell you about a new opportunity to develop OCaml software at The Center for Genomics and Systems Biology (CGSB) at New York University (NYU), located in the heart of Manhattan. The position's main function will be to develop software in the OCaml language to manage, analyze, and display the vast amounts of data generated by next-generation sequencing technologies. NYU's strong commitment to this field is represented by its $100M investment in the brand new CGSB building, which houses the latest sequencing platforms and excellent high performance computing facilities.

You will support the computational needs of several experimental labs by contributing to the following infrastructure:

  o A database for tracking samples, very large quantities of raw data, and analysis results
  o A website for users to submit new samples, monitor progress of their workflow, and visualize data
  o A system for distributing batch jobs to a cluster, accounting for dependencies between jobs and cached results

The ideal candidate will be an experienced functional programmer with knowledge of many OCaml libraries and tools, such as database bindings, ocsigen, ocamlnet, batteries, janestreet-core, etc. Experience in the following areas is a plus but not required: bioinformatics, statistics, type theory, distributed computing, and UNIX systems administration.

NYU researchers are using sequencing technologies to investigate basic questions about the nature of life and to address fundamental problems in human health. The very large datasets generated by these technologies pose significant computational challenges for which the robust principles of functional programming are ideally suited.

Please contact me to discuss this position further. Thank you.
Archive: https://sympa-roc.inria.fr/wws/arc/caml-list/2011-05/msg00069.html
Dario Teixeira asked and Jacques Garrigue replied:

> I've seen OCaml code "in the wild" where both of the following signatures
> are present: (the type parameter for 't' is a phantom type)
>
> val foo: [< `A ] t -> unit
> val bar: [ `A ] t -> unit
>
> But is there any practical difference between [ `A ] and [< `A ] given
> that there is only one element in the set?

In this particular case the two types are almost equivalent. The only counterexample I could find is unifying with the following private row type:

  type leA = private [< `A]

leA is unifiable with [< `A] but not with [`A].

The difference becomes more significant when there is an argument. For instance, [< `A of int] and [< `A of bool] are unifiable, giving [< `A of int & bool], but [`A of int] cannot be unified with [`A of bool].

Note that some old versions of OCaml did some "singleton promotion", i.e. [< `A of int] was automatically converted to [`A of int]. This was removed as an unnecessary complication, and also because you might actually want to distinguish the two for private row types.
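The distinction discussed in this thread can be tried out with a small sketch. The module P and every function name below are hypothetical and chosen only for illustration; they do not come from the thread. The sketch assumes a recent OCaml compiler and uses int and string as the two argument types.

```ocaml
(* Illustrative sketch: module P and all function names are hypothetical. *)

module P : sig
  type 'a t                              (* 'a is a phantom parameter *)
  val make : string -> [ `A ] t
  val foo : [< `A ] t -> unit
  val bar : [ `A ] t -> unit
end = struct
  type 'a t = string
  let make s = s
  let foo _ = ()
  let bar _ = ()
end

(* Without arguments, both signatures accept the same value. *)
let () =
  let v = P.make "sample" in
  P.foo v;
  P.bar v

(* With arguments, the upper-bounded types can still be combined. *)
let is_one = function `A x -> x = 1          (* [< `A of int ] -> bool *)
let is_a = function `A x -> x = "a"          (* [< `A of string ] -> bool *)

(* Accepted: the argument type becomes a conjunction of int and string,
   so no value can actually be passed to [both]. *)
let both x = is_one x && is_a x

(* Rejected by the type checker if uncommented, since the exact types
   [ `A of int ] and [ `A of string ] do not unify:

   let exact_int : [ `A of int ] -> bool = fun (`A x) -> x = 1
   let exact_str : [ `A of string ] -> bool = fun (`A x) -> x = "a"
   let both_exact x = exact_int x && exact_str x
*)
```

Asking the toplevel for the type of both should show a conjunctive argument type of the kind described in the reply.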
Archive: https://sympa-roc.inria.fr/wws/arc/caml-list/2011-05/msg00075.html
Prashanth Mundkur announced:

The Disco team is pleased to announce the possibility of doing large-scale data analysis (à la map-reduce) in OCaml.

Disco [1] is an open-source distributed computing framework inspired by the map-reduce paradigm. It includes a distributed replicating tag-based filesystem that allows you to store your datasets in a fault-tolerant manner. Disco comes with additional tools: DiscoDB [2] for implementing efficient mapping objects and Discodex [3] for distributed indices for querying large datasets. Disco has been in production use at Nokia for two years, and is used to process terabytes of data daily [4].

The core job scheduling, cluster monitoring and filesystem logic of Disco is written in Erlang, leveraging the strengths of Erlang in concurrency and distribution. The primary language for writing compute jobs is currently Python; however, the latest Disco 0.4 release [5] has opened up the Disco worker interface, allowing jobs to be written in any language.

ODisco is the first available non-Python implementation of this Disco worker interface, and allows distributed processing of large-scale datasets in OCaml. The computation is not restricted to a record-oriented key-value style interface; the OCaml task directly gets access to the input data source and writes the output data in whatever format it chooses. The overall computation, however, currently still follows the traditional map-reduce dataflow, with map/shuffle/reduce stages.

ODisco is available at https://github.com/pmundkur/odisco and also in the 3.12 section of Godi as the godi-odisco package.

Please let us know if you have any issues with either ODisco or Disco on the Disco mailing list.

Happy hacking!
[1] Disco Project, http://discoproject.org
[2] DiscoDB, http://discoproject.org/doc/contrib/discodb/discodb.html
[3] Discodex, http://discoproject.org/doc/contrib/discodex/discodex.html#discodex
[4] Disco at Nokia, http://www.erlang-factory.com/conference/SFBay2011/speakers/VilleTuulos
[5] Disco 0.4 release, http://disco.posterous.com/disco-04
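
For readers unfamiliar with the map/shuffle/reduce dataflow mentioned in the ODisco announcement, here is a minimal in-memory sketch of the three stages as a word count. It does not use the ODisco or Disco worker interface at all; every function name is hypothetical, and a real job would run these stages across a cluster rather than over OCaml lists.

```ocaml
(* In-memory illustration only; none of these names come from ODisco. *)

(* Map stage: turn one input line into (word, 1) pairs. *)
let map_line (line : string) : (string * int) list =
  String.split_on_char ' ' line
  |> List.filter (fun w -> w <> "")
  |> List.map (fun w -> (String.lowercase_ascii w, 1))

(* Shuffle stage: group the values emitted for each key. *)
let shuffle (pairs : (string * int) list) : (string * int list) list =
  let tbl = Hashtbl.create 16 in
  List.iter
    (fun (k, v) ->
      let vs = try Hashtbl.find tbl k with Not_found -> [] in
      Hashtbl.replace tbl k (v :: vs))
    pairs;
  Hashtbl.fold (fun k vs acc -> (k, vs) :: acc) tbl []

(* Reduce stage: sum the counts collected for one key. *)
let reduce_counts ((word, counts) : string * int list) : string * int =
  (word, List.fold_left ( + ) 0 counts)

(* Full pipeline: map every line, shuffle by key, reduce each group,
   then sort so the result order is deterministic. *)
let word_count (lines : string list) : (string * int) list =
  lines
  |> List.map map_line
  |> List.concat
  |> shuffle
  |> List.map reduce_counts
  |> List.sort compare
```

Calling word_count on a list of lines returns one (word, count) pair per distinct word, sorted by word.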
Thanks to Alp Mestan, we now include in the Caml Weekly News the links to the recent posts from the ocamlcore planet blog at http://planet.ocamlcore.org/.

Dose3 in debian experimental !:
  https://mancoosi.org/~abate/dose3-debian-experimental

OCaml as a SQL*Plus replacement?:
  http://gaiustech.wordpress.com/2011/05/14/ocaml-as-a-sqlplus-replacement/

Dynlink as dlopen..:
  http://blog.rastageeks.org/ocaml/article/dynlink-as-dlopen

ocaml-geoip:
  https://forge.ocamlcore.org/projects/geoip/

taskord:
  https://forge.ocamlcore.org/projects/taskord/

Announcing: OCI*ML:
  http://gaiustech.wordpress.com/2011/05/13/announcing-ociml/

Running a classical proof with choice in Agda:
  http://math.andrej.com/2011/05/10/running-a-classical-proof-with-choice-in-agda/

How newcomers can easily contribute to the OCaml Batteries:
  https://forge.ocamlcore.org/forum/forum.php?forum_id=793
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.