Caml Weekly News

Previous week Up Next week
Hello,

Here is the latest Caml Weekly News, week 24 June to 01 July, 2003.

1) Ocaml and 64 bit address spaces?
2) type inference visualization
3) OCaml not automatically specialising a polymorphic function
4) Release version 0.2 of camlserv: the first stable release
5) syntax of private constructors in CVS version
6) how to use a module
7) ocaml polymorphic variants and double coercion

==============================================================================
1) Ocaml and 64 bit address spaces?
------------------------------------------------------------------------------
** Niall Dalton asked and Xavier Leroy answered:

> I haven't had a chance to check, but does the Ocaml compiler
> allow one to use 64 bit address spaces on 64 bit machines? (modulo
> the requisite OS support etc., limits with address lines and so on)
> I'm interested in particular 64-bit ppc and x86-64 based machines.
> If it does, I'd also be interested in any gotchas that anyone
> has run across when working with very large heap sizes (several
> GB).

Yes, on Alpha and Itanium machines.

On Sparc, Mips and PPC, it should be possible to compile the bytecode
interpreter in 64-bit mode, but the native-code compiler still
generates 32-bit code.  That might change if there is enough demand,
e.g. for the new G5 machines.

The x86-64 port of OCaml will be fully 64-bit once it is completed,
that is, shortly after I get access to one of these machines.
However, given the lack of "testdrive" programs for the x86-64, I have
no idea when I'll get a shell account on one of these toys.

==============================================================================
2) type inference visualization
------------------------------------------------------------------------------
** Hugo Simoes announced:

I did a Web application for the visualization of the type inference process for
the Simple Type System and pure ML. You can "play" with it at:

http://www.ncc.up.pt/typetool

I intend to add other type systems in the future.

==============================================================================
3) OCaml not automatically specialising a polymorphic function
------------------------------------------------------------------------------
** Richard Jones asked and Xavier Leroy answered:

> - max3.ml ------------------------------------------------------------
> let max a b =
>   if a > b then a else b
> in
>
> print_int (max 2 3);;
> ----------------------------------------------------------------------
>
> Looking at the assembler, OCaml doesn't work out that "max" has type
> max : int -> int -> int

Yes, because it has type 'a -> 'a -> 'a according to the ML typing rules...
But if you help OCaml with a type constraint, you'll get the more
efficient code that you expect:

  let max (a:int) (b:int) = ...

> , and so it generates very inefficient
> code. This is a bit surprising because (I think) max can't be called
> from outside the module, and the one place where it is called
> specifies the type.
> So I'm guessing here that OCaml doesn't really optimize across
> functions?

You're correct that OCaml doesn't do type specialization for
user-defined functions (only for some predefined operations like ">").
It does perform some inter-function optimizations such as function
inlining and known call optimizations.

==============================================================================
4) Release version 0.2 of camlserv: the first stable release
------------------------------------------------------------------------------
** Jurjen Stellingwerff announced:

Camlserv is a full functioning server for highly dynamic websites
Many bugs are fixed:
. cookie support
. correct SSL support
. more informative error messages
Short hand alternative notation of language tags (<when test="...">
becomes #if ...)
Speedup due to optimization of often used algorithms
API-Change of 'conn_file' function to include days before file expires

Now I want to focus on providing example scripts and projects. This will
include a small ocaml-only project (using modules without the internal
scripting language).

==============================================================================
5) syntax of private constructors in CVS version
------------------------------------------------------------------------------
** Shaddin Doghmi asked:

From the CVS version  changes file, under language features:

- Support for "private types", or more exactly concrete data types
 with private constructors or labels.  These data types can be
 de-structured normally in pattern matchings, but values of these
 types cannot be constructed directly outside of their defining module.

What is the syntax for those? is there some kind of documentation on the
CVS server, like a development version of the manual or something?

** Brian Rogoff answered:

No documents are available yet. Check out the following:

bpr@granite[bpr]$ ledit ~/bin/ocaml
        Objective Caml version 3.06+36 (2003-06-19) 

# type t = private { x : int; y : int } ;;
type t = private { x : int; y : int; }
# type ('a, 'b) choice = private L of 'a | R of 'b;;
type ('a, 'b) choice = private L of 'a | R of 'b
# let p0 = { x = 0; y = 0 } ;;
One cannot create values of the private type t
# module Point2D :
    sig
      type t = private { x : int; y : int }
      val grid : int
      val mkPoint : int -> int -> t
    end =
  struct
    type t = { x : int; y : int }
    let grid = 10

    let snap i =
      let rem = i mod grid in
      if rem <= 0 then
        i - rem
      else
        i - rem + grid

    let mkPoint x y = { x = snap x; y = snap y }
  end;;
module Point2D :
  sig
    type t = private { x : int; y : int; }
    val grid : int
    val mkPoint : int -> int -> t
  end
# open Point2D;;
# match mkPoint 5 5 with { x = x; y = y } -> Printf.printf "(%d, %d)n" x y;;
(10, 10)
- : unit = ()

** Pierre Weis lectured:

[Sketching the semantics of private types, this message is long.]

You're absolutely right on your comments and explanation. However, I
think we also need to explain the ideas beneath private types and
their intended uses.

As you should already know, usual sum and product types in Caml
correspond to the mathematical notion of free algebraic data
structures. This is fairly useful and allows the modelization of a lot
of common data structures in practical programming
situations. However, besides the free algebra structures,
mathematicians use a lot of non-free algebras (so-called structures
defined via generators and relations). Private types aim at giving a
way to modelize those non-free structures. Equivalenetly, you can
think of private types as providing a way to implement types equipped
with invariants, quotient structures (i.e. sets of equivalence classes
for some equivalence relation on a free algebra), or data types with
canonical normal forms.

As an example, consider a simple modelization of natural integers in
unary notation:

type nat =
   | Zero
   | Succ of nat;;

The nat type is a regular sum type; a value of the nat type is either
the constant constructor Zero (modelizing the integer 0) or
constructed with any application of the constructor Succ to a nat
value (Succ n modelizing the successor of n or n + 1). Hence, the nat
type is adequate in the sense that it exactly modelizes the unary
representation of positive integers. Nothing new here, and we can
easily write a test to 0 as:

let is_zero = function
  | Zero -> true
  | _ -> false;;

Now, let's go one step further and imagine we modelize integers
(either positive or negative). Even if a complex (adequate)
modelization of those numbers using a regular Caml data type is not
very complex, let's take, for the sake of explanation, a simple
approach by generalizing the nat type; we simply add an extra Pred
constructor to modelize negative integers: Pred n is the predecessor
of n or n - 1:

type integer =
   | Zero
   | Succ of integer
   | Pred of integer;;

Now, we can modelize all integers in unary notation but the integer
type is no more adequate, in the sense that we have a problem of
unicity of representation: a given number can be modelized by
(infinitely) many distinct values of the integer data type. For
instance, the terms Succ (Pred Zero) and Zero all represent the number
0. You may think that this is not a big problem but in fact it is very
ennoying, because when writing a function over values of type integer,
you must now pattern match a lot of patterns instead of the only one
that canonically represents a given integer. For instance

let is_zero = function
  | Zero -> true
  | _ -> false;;

is no more the expected predicate. We must write something much more
complex, for instance

let rec is_zero = function
  | Zero -> true
  | Succ (Pred n) -> is_zero n
  | Pred (Succ n) -> is_zero n
  | n -> false;;

(BTW: is this code now correct ?)

A private type will solve this problem. We define a module Integer
whose interface defines the integer type as a private type equipped
with functions that construct values in the different summands of the
type:

type integer = private
   | Zero
   | Succ of integer
   | Pred of integer;;

value zero : unit -> integer
value succ : integer -> integer
value pred : integer -> integer

In the implementation of the module, we keep the definition of the
type as regular (say ``public'', if you want) and implement the
constructing functions such that they ensure the unicity of
representation of integers:

let zero () = Zero;;

let succ = function
  | Pred n -> n
  | n -> Succ n;;

let pred = function
  | Succ n -> n
  | n -> Succ n;;

It is now easy to prove that all the integers built with applications
of functions pred and succ have a unique representation (or
equivalently that no tree of type integer obtained by composing those
functions can have an occurrence of the redexes (Pred (Succ ...)) or
(Succ (Pred ...))).

Now, since all the clients of the Integer module cannot construct
values of type integer by directly applying the constructors and must
use the constructing functions instead, we know for sure that the
unicity property holds. Hence, the naive (and natural) version of
is_zero is correct again.

In conclusion: private types serve to modelize types with arbitrary
algebraic invariants (no miracle here: you must program those
invariants in the definition of the constructing functions).

In contrast with abstract data types, the private types still maintain
the elegant pattern matching facility of ML outside the module that
defines the type.

** Chris Hecker asked:

>In contrast with abstract data types, the private types still maintain
>the elegant pattern matching facility of ML outside the module that
>defines the type.

So, when do we get views to solve this problem for abstract types?  If
pattern matching is so great (which it is) it seems a crime to not allow it
on abstract types (another great feature).  To make an american candy joke,
peanut butter and chocolate taste great together!

** Brian Rogoff said:

I guess we all know what you're going for when you hit that vending machine!

I was going to reply to a message in last week's thread about polytypic
programming in which the use of open recursion/fixed points on functions was
demonstrated, that there is a "dual" trick at the level of types, the
so-called two level types. One place this trick finds some use is in the
encoding of views into ML like languages. You can find the trick demonstrated
for SML here

http://www.cs.princeton.edu/~danwang/drafts/recursion-schemes.pdf

but I'm not sure how much using this style costs in efficiency in OCaml.
If it costs too much, we'll have to bug Stephen Weeks to implement OCamlton.
BTW, it's nice to have views (and all kinds of things) supported directly in
the language but these tricks seem workable to me. It seems like OCaml
could be a bit nicer than SML here (what a surprise! :), since we could use
polymorphic variants for the external interface, as there is no type
abstraction there. For example, something like the following for the earliest
examples in that paper

module type NAT =
  sig
    type t
    type 'a s = [`Zero | `Succ of 'a]
    val inj : t s -> t
    val prj : t -> t s
  end

module SimpleNat : NAT =
  struct
    type 'a s = [`Zero | `Succ of 'a]

    (* Look ma, no intermediate constructor for recursive types! *)
    type t = t s

    let inj x = x
    let prj x = x
  end

module FastNat : NAT =
  struct
    type 'a s = [`Zero | `Succ of 'a]
    type t = int

    let inj = function
        `Zero -> 0
      | `Succ n -> n + 1

    let prj = function
        0 -> `Zero
      | n -> `Succ (n - 1)
  end

As a digression, the two level type trick is also used (in OCaml) here

http://www.eecs.harvard.edu/~nr/pubs/mania-abstract.html
(This is a very appealing paper to me, as the embedded language technique the
author describes looks applicable to a few things I'm doing)

and here

http://homepage.iis.sinica.edu.tw/~trc/padl00.ps

and of course in a ton of Haskell/Gofer/Squiggol oriented papers.

As another digression, the trick of using open recursive functions and "fixing"
them manually with special fps is explained nicely here:

 http://www.lfcs.informatics.ed.ac.uk/reports/97/ECS-LFCS-97-375/

though the post last week was the first time I saw it used in quite that way.
That was quite neat. I wonder how that technique dovetails with two level
types?

** Pierre Weis also answered Chris Hecker:

> So, when do we get views to solve this problem for abstract types?  If
> pattern matching is so great (which it is) it seems a crime to not allow it
> on abstract types (another great feature).  To make an american candy joke,
> peanut butter and chocolate taste great together!

Hem, the introduction of the private type feature was just my answer   
to this question. I wanted to get the security of type abstraction   
without the drawback of loosing pattern matching.

Put it another way:

-- regular data types are concrete, you can freely create their values
with constructors or labels, you can inspect their values via pattern
matching,

-- private types are ``semi concrete'', you cannot freely create their
values (you are obliged to use their construction functions) but you  
can still inspect their values via pattern matching,

-- abstract types are not concrete, you cannot freely create their
values (you are obliged to use the construction functions they
provide) and you cannot inspect their values via pattern matching (you
are also obliged to use the inspection functions they provide).

Yet another way (coarse and oversimplified view (:)):

-- concrete types are freely readable/writable data types,

-- private types are freely read only data types,

-- abstract types are nor freely readable nor freely writable data types.

And some specific advantages of each:

-- concrete types give you the maximum freedom (no invariants to   
respect, no burden with construction functions), but you get a maximum
dependancy on the implementation (modification of the concrete type is
visible all over the place). Addition of operators is easy.

-- abstract types give you the maximum security (strict enforcement of
invariants by the compiler), no dependancy on the implementation that
can be modified as required by the implementor of the type, no
possibility of confidentiality leaks: the contents of a value that can
have truely hidden parts. Addition of operators is difficult or impossible.

-- private types give you the same security as abstract types for
invariants and the same facility for the addition of new
operators. You loose the ability to hide parts of values.

In my mind, private types give you for free a lot of useful facilities
that you could have with views, while retaining invaluable, both
conceptual and implementationnal, simplicity.

You may be right that eventually Caml could still need an extra
feature as powerful and complex as views for abstract data types, but
for the time being let's start and experiment with this now available,
simple, and effective feature. It has already proved to be incredibly
powerful and efficient for some applications. So let's go and discover
where we still need more :)

I mean, try it :)

==============================================================================
6) how to use a module
------------------------------------------------------------------------------
** Jung Woon Ho asked and Jean-Baptiste Rouquier answered:

>I'm having trouble (using the modules or the files that I wrote) (in the
>interactive mode of  OCaml).

(*First, let Caml know where you put your files if they aren't in the
current directory :*)
#directory "foo/bar";;

(*This is the "include" directive (see the manual) :*)
#use "baz.ml";;

(*if you file is already compiled, you can use :*)
#load "qux.cmo";;  

(*but you have to "#load" all the files that egg uses, for instance if
egg.ml contains "Graphics.open_graph", then you must load graphics before
(as when you link the main programm) :*)
#load "graphics.cma";;
#load "egg.cmo";;

(*If you want to state "quz ()" instead of "Egg.quz ()", just open this
module (it requires egg.cmi in the list of searching directories) :*)
open Egg;;

** Matt Gushee added:

> (*if you file is already compiled, you can use :*)
> #load "qux.cmo";;

If you know in advance what you want to load, you can also load it by
passing it as an argument to the toplevel, e.g.:

  $ ocaml qux.cmo

I prefer this approach because it cuts down on the characters I have to
retype when I make errors (Note to OCaml team: can we *please* get some
kind of history-editing functionality in the toplevel? I know there are
licensing and portability issues with GNU readline, but what about the
getline library used by OCaml2?)

This approach is also handy if you want to quickly check your syntax:

  $ ocaml my_new_module.ml

This doesn't give you an interactive shell, so it's different from

  $ ocaml
  # #use "my_new_module.ml";;

But it will tell you if there are errors in your code.

** Remi Vanicat said:

> (Note to OCaml team: can we *please* get some kind of
> history-editing functionality in the toplevel? I know there are
> licensing and portability issues with GNU readline, but what about
> the getline library used by OCaml2?)

ledit is there for this. You can found it in the ocaml bazaar :
http://caml.inria.fr/distrib/bazar-ocaml/

==============================================================================
7) ocaml polymorphic variants and double coercion
------------------------------------------------------------------------------
** Jacques Garrigue said:

The following message is a courtesy copy of an article
that has been posted to comp.lang.functional as well.

Shaddin Doghmi writes:

> What is the difference between double corcion and single coersion? what
> can double coercion do that single coercion cant?

Lots of things :-)
In a single coercion, the expected type is directly built from the
target type in a predefined way. The algorithm attempts to be clever,
but if the target type is recursive, or already has a polymorphic
structure, there is no unique most general expected type. Moreover,
there are extra restrictions to avoid a blow-up in type inference
time, namely left hand sides of arrows are not available for subtyping
(but other contravariant types are...)

Here are typical examples involving polymorphic variants

fun x -> (x :> [`Nil | `Cons of 'a * 'b] as 'b)
- : [< `Cons of 'b * ([ `Cons of 'a | `Nil ] as 'c) as 'a | `Nil ] -> 'c
Note that the expected type has only been partially open.

fun x ->
  (x : [< `Nil | `Cons of 'a * 'b] as 'b :> [`Nil | `Cons of 'a * 'c] as 'c)
With a double coercion we can get recursion on the expected side.

fun x ->
  (x : [< `Nil | `Cons of 'a * 'b] as 'b :> [> `Nil | `Cons of 'a * 'c] as 'c)
This wouldn't work with a single coercion, as the target is polymorphic.

If there is no recursion, you can get this last effect through dispatch:
function `None | `Some _ as x -> x
- : [< `None | `Some of 'a ] -> [> `None | `Some of 'a ]

> Also, there doesnt seem to be any good detailed references for
> polymorphic variants. Both the ocaml manual and the oreilly book dont
> mention lots of details. Anybody know of any good references?

See my publications for examples and theory with polymorphic variants.
In particular the paper on code reuse is a good introduction, and the
syntax is (almost) the current one.

http://wwwfun.kurims.kyoto-u.ac.jp/~garrigue/papers/

There is a very small bit on coercions in "Programming with ..."
(which uses an older syntax, by the way).
Basically they work just like for object types, which get a reasonably
detailed account in the reference manual. The main differences are
that (1)there is no concept of class type with variants, so you always
need double coercions to introduce recursion, and (2)with polymorphic
variants a closed type vt has both a polymorphic subtype [< vt] and a
polymorphic supertype [> vt], while an object type has only a subtype
#vt.

==============================================================================
Using folding to read the cwn in vim 6+
------------------------------------------------------------------------------
Here is a quick trick to help you read this cwn if you are viewing it using
vim (at least version 6).

:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'<1':1
zM

If you know of a better way, please let me know.

==============================================================================
Old cwn
------------------------------------------------------------------------------

If you happen to miss a cwn, you can send me a message
(alan.schmitt@inria.fr) and I'll mail it to you, or go take a look at
the archive (http://pauillac.inria.fr/~aschmitt/cwn/). If you also wish
to receive it every week by mail, just tell me so.

==============================================================================

Alan Schmitt