Caml Weekly News

Hello

Here is the latest Caml Weekly News, for the week of October 19 to 26, 2010.

oasis v0.2.0: Architecture for building OCaml libraries and applications
ocaml-data-notation v0.0.3: Store data using OCaml notation
OPLP: Ocaml APache Log Parser
Asynchronous IO programming in OCaml
Generalized Algebraic Datatypes
Other Caml News

oasis v0.2.0: Architecture for building OCaml libraries and applications

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/74221546f1b5ec98#

Sylvain Le Gall announced:

OASIS generates a full configure, build and install system for your
application. It starts with a simple `_oasis` file at the toplevel of your
project and creates everything required.

It uses external tools like OCamlbuild and it can be considered as the glue
between various subsystems that do the job.

It also features a do-it-yourself command line invocation and an internal
configure/install scheme. Libraries are managed through findlib. It has been
tested on GNU Linux and Windows.

It also allows to have standard entry points and description. It helps to
integrates your libraries and software with third parties tools like GODI.

Changelog and full blog post here:
http://www.ocamlcore.com/wp/2010/10/oasis-v02-release/

Homepage:
http://oasis.forge.ocamlcore.org/

Get source code:
$ darcs get http://darcs.ocamlcore.org/repos/oasis

Browse source code:
http://darcs.ocamlcore.org/cgi-bin/darcsweb.cgi?r=oasis;a=summary

Dario Teixeira asked and Sylvain Le Gall replied:

> Do you have plans to make GODI packages for Oasis and its dependencies? 
> (I don't mean using Oasis to automate the generation of GODI packages; 
> I mean GODI packages for Oasis itself).  It's a little step that makes 
> trying out new software all the more convenient... 

I don't have plans for GODI, but I plan to build Debian packages. I don't 
know (yet) GODI enough to do it. But I would help anyone who has plan 
about that. 

If you wan to try OASIS, there is an installer that should work out of 
the box on Linux/Windows (just providing the application, not the 
library). It is on the download page: 
https://forge.ocamlcore.org/frs/?group_id=54

ocaml-data-notation v0.0.3: Store data using OCaml notation

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/f05d552c4de2fc20#

Sylvain Le Gall announced:

This library uses `type-conv` to dump OCaml data structures using OCaml data
notation.

This kind of data dumping helps to write OCaml code generator, like OASIS.

Changes:
* Partial support for polymorphic variant, as used in OASIS v0.2.0

Homepage:
http://forge.ocamlcore.org/projects/odn

Get source code:
$ darcs get http://darcs.ocamlcore.org/repos/ocaml-data-notation

Browse source code:
http://darcs.ocamlcore.org/cgi-bin/darcsweb.cgi?r=ocaml-data-notation;a=summary

OPLP: Ocaml APache Log Parser

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/89d4dffd2307ed01#

Continuing the thread from last week, Oliver said:

> Aha... long ago I had done something similar, and it had an sql-like
> language for using it.
[...]


Here is the URL for that tool:

  http://www.first.in-berlin.de/software/tools/apalogretrieve/

Asynchronous IO programming in OCaml

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/efbcae437e13cf8b#

Jon Harrop asked and Jake Donham replied:

> Is there a tutorial on using something like LWT for asynchronous
> programming
> in OCaml? I'm looking for an example like an echo server that handles
> clients concurrently without blocking threads, so it can handle thousands
> of
> clients without significant performance degradation.

Not a tutorial, but here is a minimal TCP server in LWT:

http://github.com/avsm/ocaml-cohttpserver/blob/master/server/http_tcp_server.ml

Phil Tomson also replied:

There's node.ocaml:  http://github.com/mathgladiator/node.ocaml

Oliver then asked and Dario Teixeira replied:

> Can you recommend papers on monadic programming? 
> Or how did you mastered it? 

"Mastered" it might be too strong a word... :-)  Anyway, my recommendation 
is to simply start using it and let practice do its thing.  (In my case 
practice came from developing Ocsigen/Eliom apps). 

As for books or tutorials, I would suggest taking a look at material for 
learning Haskell.  Recently, some well-publicised Haskell books targeted 
at beginners have come out [1,2].  No introduction to Haskell is really 
complete without also discussing monads. (Reading Haskell is fairly 
straightforward for those familiar with Ocaml, btw). 

[1] http://book.realworldhaskell.org/ 
[2] http://learnyouahaskell.com/

bluestorm added:

A very good introduction to monads for the programmer, in my opinion, is 
"Monads for functional programming", by Philip Wadler 

  http://homepages.inf.ed.ac.uk/wadler/papers/marktoberdorf/baastad.pdf 

If one wish to stay in OCaml country, there is a blog post by Brian Hurt : 

  http://enfranchisedmind.com/blog/posts/a-monad-tutorial-for-ocaml/

Commenting on the use of LWT, Anil Madhavapeddy said and Jérémie Dimino replied:

> This should work fine for a couple of thousand clients or so, but you'll
> begin to see degradation as the number of clients increase. This is
> because LWT internally uses select(2) to wait for file-descriptors, and
> not the newer kqueue(2) or epoll(2) interfaces. You can read more about
> the "C10K problem" here: [3] http://www.kegel.com/c10k.html
> I've got a stripped-down version of LWT that uses these newer event-driven
> kernel interfaces (and so should be able to cross the 10,000 client
> barrier fairly easily), but it won't be ready for release for another
> month or so.  Drop me an email off-list if you want to try it out earlier.

I made an implementation of lwt using libev [1]. I tested it with
ocsigen and ab but the result was always a bit better with select than
with epoll. That is why i did not replace select by libev in the main
branch. In fact i never found the source of any benchmark comparing
select to epoll on the web.

> Async disk I/O under Linux is annoyingly problematic if you aren't using
> direct I/O (and hence page/block-aligned structures). Avoid if possible :)

The next version of Lwt will support asynchronous disk I/O by using
mincore + mmap. It is already in the development version.

Jérémie

  [1] http://www.dimino.org/cgi-bin/darcsweb.cgi?r=lwt-libev;a=summary

Generalized Algebraic Datatypes

Archive: http://groups.google.com/group/fa.caml/browse_thread/thread/72bcffc35e4cd50d#

Jacques Le Normand announced:

I am pleased to announce an experimental branch of the O'Caml compiler: O'Caml
extended with Generalized Algebraic Datatypes. You can find more information
on this webpage:

https://sites.google.com/site/ocamlgadt/

And you can grab the latest release here:

svn checkout https://yquem.inria.fr/caml/svn/ocaml/branches/gadts

bluestorm asked and Jacques Le Normand replied:

> It's very interesting.
>
> First, I'm curious of the "historical" aspects of this work : where does it
> come from ? Apparently there is work from you and Jacques Garrigue, but it's
> hard to tell. Is it new, or a long-running experiment ?

The history: the algorithm was developed, in part, for my PhD research. I've
been working on it with Jacques Garrigue for the last two months.

> In your "intuition" section (btw. there is a typo here, it should be (type
> s) (x : s t)), you seem to present GADT as directly related to the new (type
> s) construct. It's a bit surprising because it's difficult to know the
> difference between this and classic type variables. I suppose it is because
> only (type s) guarantee that the variable remains polymorphic, and you use
> that to ensure that branch-local unifications don't "escape" to the outer
> level ? Could you be a bit more explicit on this ?

I don't know what you mean by "remains polymorphic". However, (type s) and
polymorphism are quite distinct concepts. Consider the following examples:

# let rec f (type s) (x : s) : s = f x;;
Error: This expression has type s but an expression was expected of type s
       The type constructor s would escape its scope

# fun (type s) ( f : s -> s) ( x : s) -> f x;;
- : ('_a -> '_a) -> '_a -> '_a = <fun>

The reason I chose to use newtypes, ie (type s), is that I needed a type
variable which did not change (I believe the Haskell people call it rigid),
so I decided to use type constructors. Another option, previously
implemented, was to use polymorphic variables, ie:

let rec foo : 'a. 'a t -> t =
    function
        | IntLit x -> x

However, this has several disadvantages, the biggest of which is that  the
variable 'a cannot be referenced inside the expression since its scope is
the type in which it was introduced.

> It's also a bit difficult to know what's the big deal about "exhaustiveness
> checks". As I understand it, you remark that with GADTs some case cannot
> happen due to typing reasons, but the exhaustive check doesn't know about
> it. Could you be a bit more explicit about what the exhaustiveness checker
> does :
> - is it exactly the same behavior as before, ignoring GADT specificities ?
> (ie. you haven't changed anything)
> - if not, what have you changed and how can we try to predict its reaction
> to a given code ?
> - what can we do when it doesn't detect an impossible case ? I suppose we
> can't a pattern clause for it, as the type checker would reject it.

This problem is not new in O'Caml. For example:

# type t = { x : 'a . 'a list } ;;
type t = { x : 'a. 'a list; }
# let _ = function { x = [] } -> 5;;
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
{x=_::_}

however, try creating a value of type ('a. 'a list) satisfying the pattern _
:: _

What I've done is written a second pass to the exhaustiveness checker. The
first pass does the same thing as before, but ignores GADTs completely. The
second pass exhaustively checks every possible generalized constructor
combination.

For example, in the code

type 'a t = Foo : int t | Bar : bool t | Baz : float t

let f : type s. s t * s t * s t -> s =
    function
         Foo, Foo, Foo
      | ....

My code will check all 9 possible patterns and will output any which were
missed. The pattern returned by my algorithm is a valid pattern.

> I'm not sure I understand the example of the "Variance" section.
> Why is the cast in that direction ? It seems to me that even if we could
> force t to be covariant, this cast (to a less general type) shouldn't work :
>
>   # type +'a t = T of 'a;;
>   # let a = T (object method a = 1 end);;
>   # (a :> < m : int; n : bool > t);;
>   Error: Type < a : int > t is not a subtype of < m : int; n : bool > t

I apologize, that should be:

type -'a t = C : < m : int > -> < m : int > t

or, as a constraint:

type -'a t = C of 'a constraint 'a = < m : int >

> Again, you "Objects and variants" and "Propagation" subsections are a bit
> vague. Could you give example of code exhibiting possible problems ?

Propagation:

Currently, this code doesn't compile:

    let rec baz : type s . s t -> s =
      fun (type z) ->
function
    IntLit x -> x : s
  | BoolLit y -> y : s

so you need to add the annotation:

    let rec baz : type s . s t -> s =
      fun (type z) ->
((function
    IntLit x -> x
  | BoolLit y -> y) : s t -> s)

objects (and polymorphic variants):

the following will not compile:

    let rec eval : type s . s t -> s =
      function
| IntLit x -> ignore (object method m : s = failwith "foo" end : < m : int;
..>) ; x

polymorphic variants in patterns:

the following will not compile:

    let rec eval : type s . [`A] * s t -> unit =
      function
| `A, IntLit _ -> ()
| `A, BoolLit _ -> ()

> Finally, a few syntax trolls, in decreasing order of constructivity :
>
> - is it a good idea to reproduce the "implicit quantification" of ordinary
> types ? It seems it could be particularly dangerous here.
>   for example, changing
>     type _ t = Id : 'a -> 'a t
>   to
>     type 'a t = Id : 'a -> 'a t | Foo of 'a
>   introduce, if I'm not mistaken, a semantic-changing variable captures.
>   (I thought other dark corners of the type declarations already had this
> behavior, but right now I can't remember which ones)

type 'a t = Id : 'a -> 'a t | Foo of 'a

is the same as

type 'b t = Id : 'a -> 'a t | Foo of 'b

In other words, the type variables in generalized constructor definitions
are distinct from the type parameters.

> - if I understand it correctly, (type a . a t -> a) is yet another syntax
> for type quantification. Why ? I thought (type a) was used to force
> generalization, but ('a . ...)-style annotation already force polymorphism
> (or don't they ?). Is it a semantic difference with ('a . 'a t -> 'a), other
> than its interaction with gadts ? Can we use (type a . a t -> a) in all
> places where we used ('a . 'a t -> 'a) before ?

(type s) does not force generalization (see above); this is why this new
syntax is needed. You can use (type a . a t -> a) anywhere you used ('a. 'a
t -> 'a) could before, assuming that you don't have any types a that you
don't want hidden. This syntax extension is purely syntactic sugar.

>
> - is there a rationale for choosing Coq-style variant syntax instead of
> just adding a blurb to the existing syntax, such as
>     | IntLit of int : int t
>   or
>     | IntList of int return int t
>   ?

The only rationale is that I want to make it clear that the type variables
found inside generalized constructor definitions are distinct from the type
parameters. In your second example, return is not a keyword in O'Caml. I
could very well have chosen your first example. If there is a consensus on
some alternate syntax, I have no qualms about changing it.

Thank you for the feedback. I will add some of these things to my webpage.

Other Caml News

From the ocamlcore planet blog:

Thanks to Alp Mestan, we now include in the Caml Weekly News the links to the
recent posts from the ocamlcore planet blog at http://planet.ocamlcore.org/.

Compiling pcre-ocaml with Visual Studio 2008:
  http://le-gall.net/sylvain+violaine/blog/index.php?post/2010/10/21/Compiling-pcre-ocaml-with-Visual-Studio-2008

ocaml-data-notation: release 0.0.3:
  http://forge.ocamlcore.org/forum/forum.php?forum_id=712

Stack Overflow:
  http://gaiustech.wordpress.com/2010/10/22/stack-overflow/

Bookaml:
  https://forge.ocamlcore.org/projects/bookaml/

OASIS v0.2 release:
  http://www.ocamlcore.com/wp/2010/10/oasis-v02-release/

Unison on windows tips:
  http://le-gall.net/sylvain+violaine/blog/index.php?post/2010/10/20/Unison-on-windows%3A-tips-and-tricks

OCaml-HTTP 0.1.4-3:
  http://caml.inria.fr/cgi-bin/hump.cgi?contrib=356

Random.self_init:
  http://misterpingouin.blogspot.com/2010/10/randomselfinit.html

ocaml-http has moved:
  http://upsilon.cc/~zack/blog/posts/2010/10/ocaml-http_has_moved/

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.

Alan Schmitt