Caml Weekly News

Hello

Here is the latest Caml Weekly News, for the week of March 13 to 20, 2012.

CVE request: Hash DoS vulnerability (ocert-2011-003)
New version of standalone ocaml interval library + mpfr/mpfi preliminary bindings for ocaml
Arrays and private types
ML workshop 2012: call for presentations
Software Development Engineer at OCamlPro (Paris, France)
Spring release
Efficient scanning of large strings from files
Parsing cmi file
Other Caml News

CVE request: Hash DoS vulnerability (ocert-2011-003)

Archive: https://sympa-roc.inria.fr/wws/arc/caml-list/2012-03/msg00186.html

Deep in this thread, Xavier Leroy replied to many people:

Gerd Stolpmann writes:
> The Random module is definitely not good enough (e.g. if you
> know when the program was started like for a cgi, and the cgi reveals
> information it should better not like the pid, the Random seed is made
> from less than 10 unpredictable bits, and on some systems even 0 bits).

Dario Teixeira adds:
> I think the problem may be in finding a good source of randomness
> that is common across all OSes.  In Unixland this problem has
> largely been solved: pretty much everyone supports /dev/random and
> /dev/urandom.  Windows does things differently, however.

David Allsopp adds:
> Does the source of randomness have to be common? The decision to use
> a random seed doesn't need to be limited by a problem getting a good
> cryptographically secure generator on a given OS - you'd simply
> document that the implementation on that particular OS doesn't seed
> with a good PRNG and await a patch from someone who may care in the
> future, but at least the philosophy behind the decision is correct!

We are also thinking of strengthening Random.self_init, for instance
by using /dev/urandom when available.  This said, for randomizing
hashtables or other data structures, we do *not* need a
cryptographically-strong PRNG: we're not generating an RSA key pair or
some other situation where cryptographic quality is required; we're
just making a mild DOS attack impractical.

(Obligatory advertisement: if you're in need of
cryptographically-strong random data,
http://forge.ocamlcore.org/projects/cryptokit/
is what you need.)

New version of standalone ocaml interval library + mpfr/mpfi preliminary bindings for ocaml

Archive: https://sympa-roc.inria.fr/wws/arc/caml-list/2012-03/msg00190.html

Jean-Marc Alliot announced:

Everything is in the title.

Just go to :
http://www.alliot.fr/fbbdet.html.fr

Arrays and private types

Archive: https://sympa-roc.inria.fr/wws/arc/caml-list/2012-03/msg00203.html

Pietro Abate asked and Gabriel Scherer replied:

> In my application I'm using arrays all over, and lately I've discovered a
> couple of bugs related to the fact that I was using the index of one array 
> to
> get the element of another array. Since both indexes are int the compiler 
> could
> not help me at all. Using private types it seems I can solve this problem
> without loosing anything (??).

Here is a proposal:
  
https://gitorious.org/gasche-snippets/private-array-keys-type/blobs/master/private_array_key_types.ml

It works by using a functor to generate "fresh" private types for
keys. Note that the arrays themselves are still polymorphic (no
IntArray FloatArray etc.). The user still has to use the discipline to
produce a new application of ArrayMake each time she wants to use a
different kind of array: if she only does `module A = ArrayMake(struct
end)` and then use `A` for everything, there will be no additional
safety guarantee.

Later on, Pietro Abate asked and Gabriel Scherer replied:

> Thanks Gabriel, very nice solution. If I go this way, I guess there is
> no way to access array elements using the usual a.(i) syntax (where i
> = M.key i)... [...]
> Is this a problem I can solve using a camlp4 decorator ?

I don't think you need -- nor want to use -- a camlp4 extension. a.(i)
is desugared into (Array.get a i) at a purely syntactical level in
OCaml, so you could overload its behavior by changing the Array module
in the typing environment.

With my example you could write, for example:
module A1 = ArrayMake(struct end)
let () =
  let module Array = A1 in
  let k = A1.key in
  assert (A1.make 3 true).(k 2);;

You could even define the ArrayMake functor so that it returns a
structure with an Array submodule. You would then write, using 3.12
"local open" syntax:

module A1 = ArrayMake(struct end)
let () =
  let open A1 in
  assert (Array.make 3 true).(k 2)

That said, I don't think that the slight readability benefit of
writing a.(i) instead of (get a i) will outweigh the confusion among
your readers that don't understand what you're doing with this weird
Array stuff.

ML workshop 2012: call for presentations

Archive: https://sympa-roc.inria.fr/wws/arc/caml-list/2012-03/msg00212.html

Alain Frisch announced:

Another great event for the OCaml community, to be held in conjunction with
ICFP in September... Note that users, not only researchers, are particularly
welcome to propose a presentation, and of course to attend the workshop.

-- Alain

=======================================================================
CALL FOR PRESENTATIONS

ACM SIGPLAN Workshop on ML
Thursday, September 13th, 2012, Copenhagen, Denmark
(co-located with ICFP)

http://www.lexifi.com/ml2012
=======================================================================

The ML family of programming languages includes dialects known as
Standard ML, OCaml, and F#.  These languages have inspired a large
amount of computer-science research, both practical and theoretical.
This workshop aims to provide a forum where users, developers and
researchers of ML languages and related technology can interact and
discuss ongoing research, open problems and innovative applications.

The format of ML 2012 will continue the return in 2010 and 2011 to a
more informal model: a workshop with presentations selected from
submitted abstracts.  The workshop will not publish proceedings, so
any contributions may be submitted for publication elsewhere.  We hope
that this format will encourage the presentation of exciting (if
unpolished) research and deliver a lively workshop atmosphere.


SCOPE
-----

We seek research presentations on topics related to ML, including but
not limited to

  * applications: case studies, experience reports, pearls, etc.
  * extensions: higher forms of polymorphism, generic programming,
    objects, concurrency, distribution and mobility, semi-structured
    data handling, etc.
  * type systems: inference, effects, overloading, modules, contracts,
    specifications and assertions, dynamic typing, error reporting, etc.
  * implementation: compilers, interpreters, type checkers, partial
    evaluators, runtime systems, garbage collectors, etc.
  * environments: libraries, tools, editors, debuggers, cross-language
    interoperability, functional data structures, etc.
  * semantics: operational, denotational, program equivalence,
    parametricity, mechanization, etc.

Three kinds of submissions will be accepted: Research Presentations,
Experience Reports and Demos.

  * Research Presentations: Research presentations should describe new
    ideas, experimental results, significant advances in ML-related
    projects, or informed positions regarding proposals for
    next-generation ML-style languages.  We especially encourage
    presentations that describe work in progress, that outline a
    future research agenda, or that encourage lively discussion.
    These presentations should be structured in a way which can be, at
    least in part, of interest to (advanced) users.

  * Experience Reports: Users are invited to submit Experience Reports
    about their use of ML languages. These presentations do not need
    to contain original research but they should tell an interesting
    story to researchers or other advanced users, such as an
    innovative or unexpected use of advanced features or a description
    of the challenges they are facing or attempting to solve.

  * Demos: Live demonstrations or short tutorials should show new
    developments, interesting prototypes, or work in progress, in the
    form of tools, libraries, or applications built on or related to
    ML.  (Please note that you will need to provide all the hardware
    and software required for your demo; the workshop organizers are
    only able to provide a projector.)

Each presentation should take 20-25 minutes, except demos, which
should take 10-15 minutes.  The exact time will be decided based on
the number of accepted submissions.


SUBMISSION INSTRUCTIONS
-----------------------

Submissions should be at most two pages, in PDF format, and printable
on US Letter or A4 sized paper. Submissions longer than a half a page
should include a one-paragraph synopsis suitable for inclusion in the
workshop program.

Submissions must be uploaded to the following website before the
submission deadline (2012-06-04):

  https://www.easychair.org/conferences/?conf=ml2012

For any question concerning the scope of the workshop or the
submission process, please contact the program chair
(alain AT frisch.fr).


IMPORTANT DATES
---------------

  * 2012-06-04: Submission
  * 2012-07-13: Notification
  * 2012-09-13: Workshop


PROGRAM COMMITTEE
-----------------

    Alain Frisch (chair)    (LexiFi)
    Anders Schack-Nielsen   (SimCorp)
    Cedric Fournet          (Microsoft Research)
    Francois Pottier        (INRIA)
    Gian Ntzik              (Imperial College)
    Jeremy Yallop
    Keiko Nakata            (Institute of Cybernetics, Tallinn)
    Matthias Blume          (Google)
    Oleg Kiselyov
    Stephen Weeks           (Jane Street Capital)
    Tomas Petricek          (University of Cambridge)


STEERING COMMITTEE
------------------

    Andreas Rossberg        (Google)
    Chung-chieh Shan        (Cornell University)
    Eijiro Sumii (chair)    (Tohoku University)
    Jacques Garrigue        (Nagoya University)
    Matthew Fluet           (Rochester Institute of Technology)
    Robert Harper           (Carnegie Mellon University)
    Yaron Minsky            (Jane Street)

Software Development Engineer at OCamlPro (Paris, France)

Archive: https://sympa-roc.inria.fr/wws/arc/caml-list/2012-03/msg00216.html

Thomas Gazagnaire announced:

We (OCamlPro) are looking to recruit an excellent software engineer.

OCamlPro[1] is a small French company located near Paris. It is devoted to 
promote the use of OCaml to IT professionals, as a way to make industrial 
software more reliable. We participate to the development of OCaml and we 
create new tools and libraries to facilitate the use of OCaml in large 
industrial software projects (such as the TypeRex[2] development studio).

We especially seek for candidates with:

* strong problem solving skills, ie. the ability to find clean and elegant 
solutions to complex engineering problems;

* very strong experience of developing applications in OCaml (at least 3 
years);

* good understanding and knowledge of the OCaml compilers and runtimes' 
internals;

* experience in extending, contributing and maintaining open-source libraries 
and tools;

* the determination to work with our customers to understand and analyze 
their needs, and deliver great products to fulfill them.

If you are interested by this position, please email your C.V. with a 
description of some of your best accomplishments to: 
contact AT ocamlpro.com

Thanks,

[1] http://www.ocamlpro.com
[2] http://www.typerex.org

Spring release

Archive: https://sympa-roc.inria.fr/wws/arc/caml-list/2012-03/msg00225.html

Daniel Bünzli announced:

I'd like to announce the release of Cmdliner 0.9.1, React 0.9.3, 
Rtime 0.9.2, Uuidm 0.9.4, and Xmlm 1.1.0.

There's a mix of bug fixing and small feature additions. Consult the 
individual release notes given below. 

But foremost all the modules now support OASIS and use
setup.ml for the distribution. Thanks to Sylvain for his work and
taking time to respond to my questions.

The tarballs were tested with `odb.ml` and seem to install fine
that way. Before they eventually find their way into oasis-db
(it doesn't support oasis 0.3 yet) you'll find a few lines here [1] 
that you can add to your `packages` file to access them directly.

Best, 

Daniel

[1] http://erratique.ch/software/odb-packages.txt

* Cmdliner v0.9.1 http://erratique.ch/software/cmdliner

- OASIS support.
- Fixed broken `Arg.pos_right`.
- Variables `$(tname)` and `$(mname)` can be used in a term's man
  page to respectively refer to the term's name and the main term
  name.
- Support for custom variable substitution in `Manpage.print`.
- Adds `Term.man_format`, to facilitate the definition of help commands.
- Rewrote the examples with a better and consistent style. 

# Incompatible API changes

- The signature of `Term.eval` and `Term.eval_choice` changed to make
  it more regular: the given term and its info must be tupled together
  even for the main term and the tuple order was swapped to make it
  consistent with the one used for arguments.




* React v0.9.3 http://erratique.ch/software/react

- OASIS support


* Rtime v0.9.2 http://erratique.ch/software/rtime

- OASIS support.


* Uuidm v0.9.4 http://erratique.ch/software/uuidm

- OASIS support.
- New functions `Uuidm.v3` and `Uuidm.v5` that generate directly these 
  kinds of UUIDs.
- New function `Uuidm.v4_gen` returns a function that generates
  version 4 UUIDs with a client provided random state. Thanks to Lauri
  Alanko for suggesting that `Random.make_self_init` may be too weak
  for certain usages.




* Xmlm v1.1.0 http://erratique.ch/software/xmlm

- OASIS support.
- Fixes a bug in the UTF-16 decoder.
- Fixes a bug in `Xmlm.make_output` with a custom function. Thanks to
  Konstantinas Myalo for the report and the patch.
- New optional argument `decl` to `Xmlm.make_output` to control whether the
  XML declaration should be output.
- New function `Xmlm.output_depth`, returns the current element nesting level.

Efficient scanning of large strings from files

Archive: https://sympa-roc.inria.fr/wws/arc/caml-list/2012-03/msg00217.html

Philippe Veber asked and Jérémie Dimino replied:

> Say that you'd like to search a regexp on a file with lines so long
> that you'd rather not load them entirely at once. If you can bound
> the size of a match by k << length of a line, then you know that you
> can only keep a small portion of the line in memory to search the
> regexp. Typically you'd like to access substrings of size k from left
> to right. I guess such a thing should involve buffered inputs and
> avoid copying strings as much as possible. My question is as follows:
> has anybody written a library to access these substrings gracefully
> and with decent performance? Cheers,

You can use a non-backtracking regexp library to find offsets of the
substrings, then seek in the file to extract them. You can use for
example the libre library from Jérôme Vouillon [1]. It only accept
strings as input but it would be really easy to make it work on input
channels (just replace "s.[pos]" by "input_char ic").

  [1] http://sourceforge.net/projects/libre/
      https://github.com/avsm/ocaml-re.git

Parsing cmi file

Archive: https://sympa-roc.inria.fr/wws/arc/caml-list/2012-03/msg00177.html

Bob Zhang asked and Gerd Stolpmann replied:

>    I noticed that Godi can pretty print cmi files, is there already
> libraries parsing cmi files?

Yes, toplevellib.cma (i.e. the ocaml toploop). It's a silly trick. Run the
toploop and do

module M = <NameOfTheCMI>;;

and the toploop responds.

Gerd Stolpmann later suggested and David Brown added:

> I think you can also "use" ocamlc -i for it, maybe easier to wrap.

ocamlc -i will print out all of the types of the .ml file, not just
the interface.  It's useful, but not necessarily what is wanted.

The top trick doesn't work if the code isn't byte-compiled and has the
consequence of running the module.

It would be handy to be able to actually just dump out the .cmi file,
especially in cases, such as prebuilt packages that include the .cmi but
left out the .ml files.

rixed also replied to the original question:

Yes you can do this using the compiler libs (not installed by default
but debian have these in a separate package).
For an exemple of use see for instance the small tool displaying mli
signatures from this git repo:

git clone http://git.gitorious.org/ocalme/cmidump.git

Raphael Proust also suggested:

There is the cmigrep tool found on http://homepage.mac.com/letaris/
I have no idea about current status though;
http://jun.furuse.info/hacks seems to imply it works on 3.12.

Hongbo Zhang then said and Mehdi Dogguy replied:

> I tried, it does not compile, but it would be not hard to fix, I
> guess.

In Debian, we apply the following patches to compile it:

        http://patch-tracker.debian.org/package/cmigrep/1.5-9

FWIW, it compiles and runs perfectly well with any OCaml >= 3.10.

Other Caml News

From the ocamlcore planet blog:

Thanks to Alp Mestan, we now include in the Caml Weekly News the links to the
recent posts from the ocamlcore planet blog at http://planet.ocamlcore.org/.

Ocsigen Js_of_ocaml 1.1 released:
  http://ocsigen.org/

Ocsigen Server 2.0.4 released:
  http://ocsigen.org/

Interval computation library 1.1:
  http://caml.inria.fr/cgi-bin/hump.cgi?contrib=805

OCaml MySQL Protocol 0.4:
  http://caml.inria.fr/cgi-bin/hump.cgi?contrib=804

OCaml MySQL Protocol 0.4 available:
  https://forge.ocamlcore.org/forum/forum.php?forum_id=828

Spring release:
  http://erratique.ch/software

ML workshop 2012: call for presentations!:
  http://www.lexifi.com/blog/ml2012

Unix.open_process* and file descriptors:
  http://blog.rastageeks.org/ocaml/article/unix-open_process-and-file

barbra:
  https://forge.ocamlcore.org/projects/barbra/

whenjobs 0.7.0 released:
  http://rwmj.wordpress.com/2012/03/13/whenjobs-0-7-0-released/

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.

Alan Schmitt