OCaml Weekly News

Hello

Here is the latest OCaml Weekly News, for the week of March 31 to April 07, 2015.

GADTs and Menhir
OCamp - Reactive programming in the shell
Mathematical Expression Library
Other OCaml News

GADTs and Menhir

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2015-03/msg00169.html

Andre Nathan asked:

I'm trying to learn a bit about GADTs, but I can't figure out how to
make them work with Menhir.

I have defined the following GADT:

(* foo.ml *)

type 'a t =
  | Int : int -> int t
  | Bool : bool -> bool t

With this definition I can write an "eval" function as

let eval (type t) (foo : t Foo.t) : t =
  match foo with
  | Foo.Int i -> i
  | Foo.Bool b -> b

Now considering the simple parser and lexer below,

(* parser.mly *)

%{ open Foo %}

%token <int> INTEGER
%token <bool> BOOL
%token EOF

%start <'a Foo.t> start
%%

start:
  | value; EOF { $1 }
  ;
value:
  | i = INTEGER  { Int i }
  | b = BOOL     { Bool b }
  ;

(* lexer.mll *)

{ open Parser }

let digit = ['0'-'9']
let boolean = "true" | "false"

rule token = parse
  | [' ' '\t']             { token lexbuf }
  | '\n'                   { Lexing.new_line lexbuf; token lexbuf }
  | digit+ as i            { INTEGER (int_of_string i) }
  | boolean as b           { BOOL (bool_of_string b) }
  | eof                    { EOF }

when I try to compile this I get the error below:

File "parser.mly", line 15, characters 43-52:
Error: This expression has type int Foo.t
       but an expression was expected of type bool Foo.t
       Type int is not compatible with type bool

This is the

  | b = BOOL     { Bool b }

line. I believe the error comes from not having the locally-abstract
type annotations in the code generated by Menhir.

Is there any way around this?

Francois Pottier replied:

> value:
>   | i = INTEGER  { Int i }
>   | b = BOOL     { Bool b }

The problem lies here. What is the OCaml type of the symbol "value"? The OCaml
compiler complains that in one branch it seems to be int Foo.t, but in the
other branch it seems to be bool Foo.t. (The problem does not have anything to
do with Menhir, really.)

Usually, it is not advisable to try to perform parsing and type-checking in
one single phase. So, my advice would be to:

  1- perform parsing in the usual way,
     constructing ASTs in a normal algebraic data type "ast"
     (not a GADT);

  2- perform type-checking (or type inference).
     If you insist on using GADTs, you will probably need
     a GADT of terms (your type 'a Foo.t) and
     a GADT of type representations ('a ty)
     and the type-checking function will have type
     ast -> 'a ty -> 'a Foo.t option
     which means that, given an untyped term and
     a representation of an expected type, it
     either fails or succeeds and produces a typed term.

There are examples of this kind of thing online, e.g.
http://www.cs.ox.ac.uk/projects/gip/school/tc.hs
by Stephanie Weirich.

Andre Nathan then asked and later added:

> Can you give me an example of what this "GADT of type representations"
> would look like? I couldn't understand the Haskell example...

I found a reference to an email from Jeremy Yallop to the list from 2013
[1], and managed to get it working with the following solution:

(* foo.ml *)
open Printf

type 'a t =
  | Int : int -> int t
  | Bool : bool -> bool t

type ast =
  [ `Int of int
  | `Bool of bool
  ]

type any =
  | Any : 'a t -> any

let typed = function
  | `Int i -> Any (Int i)
  | `Bool b -> Any (Bool b)

let print : type a. a t -> unit = function
  | Int i -> printf "%d\n" i
  | Bool b -> printf "%b\n" b

The parser now returns a `Foo.ast`:

value:
  | i = INTEGER { `Int i }
  | b = BOOL    { `Bool b }
  ;

and with that I can print the parsed value with

  let ast = Parser.start Lexer.token lexbuf in
  let Foo.Any t = Foo.typed ast in
  Foo.print t

I'm happy that it works, but the `any` type is a bit of a mistery to me.
In Jeremy's email it's explained as

"An existential to hide the type index of a well-typed AST, making it
possible to write functions that return constructed ASTs whose type is
not statically known."

Does anyone have a reference to literature that explains this technique
(I'm guessing that would be Pierce's book)? The OCaml manual briefly
shows an example with a `dyn` type, but not much is said about it.

Jeremy Yallop replied:

There's a more detailed explanation of the technique in the lecture
notes on the following page:

   http://www.cl.cam.ac.uk/teaching/1415/L28/materials.html

The section you want is

    8.4.2 Pattern: building GADT values

on p83 of the notes from 9 February, but it might also be helpful to
look through some of the earlier notes for background.

Gerd Stolpmann also replied:

A couple of weeks ago I ran into the same problem. It is just a matter
of not being accustomed to how GADTs work, and not having the right
design patterns in your mind.

OCaml users are used to expose any polymorphism and that it is not
directly possible to hide it. So, normally if there is a variable 'a on
the right side of the type definition, it must also appear on the left
side (e.g. type 'a foo = Case1 of 'a | Case2 of ...). This is not
required for GADTs, because the variable is bound by the case it applies
to (i.e. for the Int case you have 'a=int and for the Bool case you have
'a=bool). So it is implicitly known. Because of this, many people prefer
to write

type _ t =                  (* look here, no 'a anymore *)
  | Int : int -> int t
  | Bool : bool -> bool t

and 

type any =
  | Any : _ t -> any

The type parameter still exists because t is still polymorphic, but you
cannot do anything with it unless you match against the cases. Now,
"any" goes a step further, and even discards this parameter. It's a
matter of perspective: if 'a is case-dependent but not dependent to
anything outside t, you can also consider t as monomorphic. In other
words: Knowing all cases binds 'a. 

You need Any if you want to put several t values into a container, e.g.

Does not work: [ Int 34; Bool true ]
Does work:     [ Any(Int 34); Any(Bool true) ]

This nice thing with GADTs is that you can "undo" this change of
perspective by matching against the cases:

let print = ... (* as you defined it *)
let print_any (Any x) = print x
let () =
  List.iter print_any [ Any(Int 34); Any(Bool true) ]

OCamp - Reactive programming in the shell

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2015-04/msg00005.html

Frédéric Bour announced:

OCamp extends unix shells with constructions to express memoization, sharing
of computations and reactive programming.

# Subcommands

## fire

Just wrap a unix command with "ocamp fire" to enable the extension:
  $ ocamp fire bash

This will spawn a new bash session where the following subcommands are
enabled.

## hipp

  $ ocamp hipp <command>

Will memoize the output and exit status of <command>.
Later calls to the same <command> won't lead to actual execution, but just to
a duplication of its previous output.
Concurrent calls to <command> will just share the same process, the beginning
of the output being replayed to later callers.

The identity of a command is defined by its arguments and working directory.

## stir

  $ ocamp stir <command>

Indicate potential changes in the output if <command> was rerun.
Later calls to `hipp` will recompute <command> as if it was not yet memoized.

## (un)follow

  $ ocamp follow <command>

First, <command> is memoized if it was not the case yet.
Then changes to dependencies of <command> will trigger a reevaluation.
Use `stir` to notify a change.

(to follow is an hipp/stir reactivity).

## pull

  $ ocamp pull <command>

Closely related to `hipp`, but instead of marking dependency on the output of
<command>, the dependency applies to the "effects" of <command>.

Thus, if `stir` is used:
- all pullers will be reevaluated.
- hippers will be reevaluated only if the output is different.

## Summary

  $ ocamp fire <command> - setup a new session alive until <command> exits
          pull <command> - mark dependency on effects of <command>
          hipp <command> - mark dependency on output of <command>
          stir <command> - notify that <command> might have been updated
          follow <command> - eval <command>, and reactively recompute it
                             whenever one of its dependencies change.
          unfollow <command> - stop recomputing <command> when dependencies
                               change

hipp and pull provide memoization.
stir and follow bring a flavor of reactive programming.

# Examples

## Fibonacci

  $ cat fib.sh
  #!/bin/sh
  ARG="$1"
  if [ "$ARG" -le 1 ]; then
    echo "$ARG"
  else
    A=`ocamp hipp ./fib.sh $((ARG-1))`
    B=`ocamp hipp ./fib.sh $((ARG-2))`
    echo $((A+B))
  fi

  $ time ocamp fire ./fib.sh 50
  12586269025
    real    0m0.391s
  user    0m0.153s
  sys     0m0.060s

## Build-system

`ocamp` provides simple primitives to construct and manage a dependency graph.

This might be a saner foundation to base a build-system on than make(1):
- the command focus on one specific problem
- no dsl is involved; rules can be plain unix commands, including a shell,
rather than a make-flavored simulation of shell
- nothing is provided for resolving goals; indeed this is better left to tools
specifically built for goal-search.

A quick'n'dirty script building ocamp itself is provided as an example.

# Future

The current release is a proof-of-concept and should be considered alpha
quality.
The two features planned next are a way to make the graph persistent (all data
is kept in memory atm) and an interface to debug and/or observe graph
construction.

Note: code is undergoing legal review and should be available soon \o/

He later added:

Code is finally available at:
    https://github.com/def-lkb/ocamp

Mathematical Expression Library

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2015-04/msg00013.html

Kenneth Adam Miller asked:

Is there a library somewhere where I can represent and simplify simple bit
operation expressions? Add, subtract, exclusive or, or, and, divide, multiply,
modulus, composed recursively, and operations on the expression type, such as
simplify?

Ashish Agarwal replied:

If you don't find anything else, maybe some of our old code from PADL 2010 can
help as a starting point [1]. Probably it'll be easier to put something
together from scratch, assuming you don't care about performance or need
variables.

[1] http://ashishagarwal.org/2010/01/18/automating-mp-transformations/

Ivan Gotovchits suggested:

If the expression, that you’re trying to simplify is actually a BAP’s BIL,
then there’re some optimizations in the `Bil` module. The most interesting is
constant folding and purging unused variables. You can also use fixpoint
function to drive the optimization passes. 

In any case, even if it is not a BIL, you can look at BAP’s constant folder
class.

Markus Weißmann also suggested:

Just for completeness sake: There is also the "boolean expression simplifier"
[1] library implementing the Quine-McCluskey algorithm (and friends) in pure
OCaml.
I'd guess that a decent SMT solver will solve the "raw" expression much faster
than running it through the simplifier and then solving it (with the same
solver).

regards
Markus

[1] http://bes.forge.ocamlcore.org/

Other OCaml News

From the ocamlcore planet blog:

Thanks to Alp Mestan, we now include in the OCaml Weekly News the links to the
recent posts from the ocamlcore planet blog at http://planet.ocaml.org/.

Ideal finger 001:
  http://erratique.ch/log/2015-04-05

Towards Heroku for Unikernels: Part 2 - Self Scaling Systems:
  http://amirchaudhry.com/heroku-for-unikernels-pt2

Another PhD position in Ljubljana:
  http://math.andrej.com/2015/04/03/another-phd-position-in-ljubljana/

Reviewing the second year of OCaml Labs in 2014:
  http://anil.recoil.org/2015/04/02/ocamllabs-2014-review.html

Pearl No.3 - Saddleback Search:
  http://typeocaml.com/2015/03/31/pearl-no-3-saddleback-search/

Towards Heroku for Unikernels: Part 1 - Automated deployment:
  http://amirchaudhry.com/heroku-for-unikernels-pt1

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.

Alan Schmitt