Previous week   Up   Next week
Hello,

Here is the latest Caml Weekly News, week 25 February to 04 March, 2003.

1) Alternative proposal: COAN

======================================================================
1) Alternative proposal: COAN
----------------------------------------------------------------------
We continue our coverage of last week proposal for a COAN. The thread
started at http://caml.inria.fr/archives/200302/msg00291.html and
continues in March at http://caml.inria.fr/archives/200303/msg00000.html
Here are some excerpts with direct pointers.

Michal Moskal said:
http://caml.inria.fr/archives/200302/msg00324.html

CPAN modules are standardized. That's the main advantage of them. We need
this kind of stuff for ocaml.

What we should standardize? IMHO:

1. Naming conventions. Maybe some guidelines for package names. Anyway
   for package foo, version 1.2.3:
     - source tarball should be foo-1.2.3.tar.gz
     - it should unpack all it's files into foo-1.2.3 directory
   (these are good GNU practices anyway)

2. Build procedure. We can either use OCamlMakefile, ocamake, or
   some ocaml script. But make it standard for all COAN packages.
   Preferably makefiles should be generated or build process should be
   performed by a special tool, so it is possible for example to add
   DESTDIR support, or shared library support if it's added to ocaml,
   in one place instead of hundreds of COAN packages.

   In any event building COAN package should be matter of one command.
   And it should be the same command for all COAN packages.

3. Installation. IMHO packages should be installed to `ocamlc
   -where`/package-name. Installation tool, whatever it is,
   should support DESTDIR (i.e. specifying fake root directory).

4. Documentation. But this should be easy -- just use ocamldoc,
   and maybe some additional files (in pkg-version/doc/).

5. Enforce META files for findlib?

6. Maybe some metafile describing package and dependencies?
   From which .spec for rpm and makefile rules for debs could
   be generated. Preferably in XML.
 
Lauri Alanko said and Xavier Leroy answered:
http://caml.inria.fr/archives/200302/msg00351.html

> When Haskell got to the same situation (ie. people began to collect
> miscellaneous libraries into a coherent whole), one of the first things
> to be done was to extend the module system of the language to
> hierarchical namespaces: [...]
> This sort of thing is done in Java, it is done in Perl, and it probably
> ought to be done in just about any language that plans to support lots
> of libraries. Naming conflicts are icky. I would very much like O'Caml
> to have hierarchical namespaces as well.

Since day one, the OCaml module system has had "hierarchical"
namespaces in the form of nested modules.  Without sounding too cocky,
I'd say that ML doesn't have much to learn from Java or Perl in the
module department.

It is true that until recently OCaml didn't support separate
compilation of the submodules.  I.e. in order to present the user of
the library Lib with the hierarchical view Lib.A, Lib.B, etc,
the source had to be put in one file lib.ml.  Notice that this style
is perfectly workable with small to medium-sized libraries (see
my Cryptokit library for an example).

The '-pack' mechanism was introduced in 3.06 to support separate
compilation of the submodules.  Since it is a recent extension, it's
not yet stable enough to be used widely, but I expect this to change
with time.

Roberto Di Cosmo said and Jacques Garrigue answered:
http://caml.inria.fr/archives/200302/msg00363.html

> But it is probably necessary here to clearly separate the different issues...
> at first sight, I see:
> 
> - centralized repository:
>     Issue: we want some central place where to look for Ocaml code
>            without resorting to google
> - easy installation:
>     Issue: I want to run advi to give flashy LaTeX presentation, and I want t
>            just get a binary for my nice OS I love so much, without having to
>            recompile anything
> - dependency tracking:
>     Issue: we would really really like to avoid reading "README"s
>            to discover the zillion packages on which the next future
>            generation Ocaml killer application will depend. Just
>            type "install XYZ" and that's it. 

His conclusion was that apt-get (or something similar) is the way to
go. I agree mostly with this too, but my experience with BSD ports and
ocaml upgrading nightmares makes me differ on details.

* For me the central repository should not contain the source themselves.
  This was an error with the CDK: the distribution becomes huge, and  
  it is very difficult for the maintainers to track changes
  by developpers (who do not necessarily want to work in that
  repository, for evident practical reasons). Not speaking of
  licensing problems.
  With BSD ports, the central repository only contains metadata, that
  it a directory for each package, with its name, its dependencies,
  where to find it, how to configure and install it, and eventually 
  some patches to make it fit in the system.
  The central repository would be a small one containing only
  metadata, updated often, eventually by authors themselves. Users who
  want newest stuff update by cvs, others get snapshots. Ideally some
  snapshots go through testing to become releases.
  The sources themselves are distributed as tarballs. This may be a
  good idea to replicate them on a few ftp servers, but there is no
  reason to make it compulsory.

* Easy installation means that you should be able to download, compile
  and install the desired ocaml program or library in one single  
  command, including all the dependencies.
  This does not mean that a binary should be available. A binary will
  only work with a single version of ocaml and all dependencies, which 
  is way too restrictive. Binaries may be provided on a by OS basis,  
  but then it is much more comfortable for users to use the packaging
  system provided by the OS (tgz on FreeBSD, rpms on redhat, deb on
  debian, pkg on OSX, ??? on windows...) If the basic framework is
  right, that work should be easy enough.

* Dependency resolution and automatic recompilation/reinstallation is
  the core of the problem.
  When you modify an ocaml library, all its dependencies have to be
  recompiled. You certainly want to automate this, and have some
  foolproof system to be able to go back to a stable state. This is
  also an area where a bit of compiler "support" may become necessary.
  At least, have different library directories for different version
  of ocaml, ideally some scheme a la OSX to install several versions  
  of the same package in parallel.

I personally don't think that standardizing the tools to produce
individual package is a useful move. Providing good tools to ease
package construction matters, but enforcing them on developpers is
counter-productive.  What is needed is just the glue to make it
uniform from the user point of view. Ocamlfind can probably
help there for complex cases, but as simple cases work well enough
with ocamlc, I would prefer it to stay optional for end-users (this is
certainly OK to rely on it in the implementation of the system).

Benjamin Pierce summarized:
http://caml.inria.fr/archives/200302/msg00370.html

Seems like maybe the beginnings of a minimalist proposal are beginning to
emerge from the discussion...

   - Using a "hump" model -- centrally stored meta-data pointing to
     actual package contents stored on people's individual servers and
     updated at will -- avoids uploading/mirroring issues.

     In fact, the current hump seems almost right -- it just needs
        - to be minimally machine readable (perhaps just able to export
          its DB in some simple XML format)
        - to have some way of indicating dependencies
        - to include pointers directly to package contents (not to
          people's web pages where packages can be found, etc.)

     For a notation for signalling dependencies, a little work is needed.
     But I have the impression that there are a number of people in the
     community that understand the issues rather well, and that a pretty
     simple solution would be good enough for 99.9% of the cases...   

     For pointers to package contents, one should establish a common file
     naming convention -- e.g., record the URL of the package contents in
     the COAN as

         http://hostname.fr/path/mynicepackage-VERSION.tar.gz

     and store versions 1.2, 1.3, 1.5 on the server as

         http://hostname.fr/path/mynicepackage-1.2.tar.gz
         http://hostname.fr/path/mynicepackage-1.3.tar.gz
         http://hostname.fr/path/mynicepackage-1.5.tar.gz

   - Let people write their makefiles in whatever way they like, use
     findlib or not as they prefer, etc., but establish a minimal set
     of common requirements -- e.g.
         - saying just 'make install' should do configuration if
           necessary, build bytecode and (if possible) native versions,
           and put them where they belong
         - DESTDIR should be treated appropriately
         - etc.

     Again, there are several people in the community that have, among
     them, a pretty clear sense of what these minimal requirements should
     look like.  (I know there is some disagreement about details, but as
     a developer I don't really care -- I just want someone to decide on
     something reasonable and publish a template that I can follow if I
     want to contribute my code to the community.)  I'd love it if three
     or four of them could just get together, decide something
     reasonable, discuss it with the OCaml authors to make sure they
     agree, and tell the rest of us what to do.

Xavier Leroy said:
http://caml.inria.fr/archives/200303/msg00006.html

I'm catching up on this interesting discussion, and find myself in
violent agreement with Jacques: something like the BSD "ports" system,
concentrating on (re-)compilation from sources and dependency
management, sounds like a strong starting point.

I tried to read the BSD "ports" manual once, and my head exploded
midway :-)  I hope we can simplify things a bit, though.

> * For me the central repository should not contain the source themselves.

I agree it should be sufficient to give a URL to the sources in the
metadata describing the package.  In some cases, library authors
cannot provide a really stable URL, hence some kind of mirrorring of
the sources might be necessary.  (And is a real nightmare to do: INRIA
can easily provide lots of disk space and bandwith, but making sure
that no-one uses the INRIA mirror to trade warez is the hard part :-)
But, yes, let's desing the system around the idea that sources are to
be downloaded from arbitrary URLs, like BSD ports do.

> * Dependency resolution and automatic recompilation/reinstallation is
>   the core of the problem.

Agreed.  What do you envision for this?  Is it enough for each package
to list the names and version number ranges for all the packages it
needs?  Or shall we try to exploit additionally the dependency
information (MD5 checksums of imported modules) already present in
compiled OCaml files?

> I personally don't think that standardizing the tools to produce
> individual package is a useful move.

It's not necessary, but could help.  For the packaging tools to work,
each port will have to contain a makefile or shell script that
implements correctly a basic, shared protocol, e.g. "configure,
compile, and install yourself", or "uninstall yourself".  Providing
template makefiles that implement this protocol could help library
authors do the packaging.

Also, as Nikolaj pointed out, precious few OCaml libraries build on
Windows.  That's basically because most library authors don't know
anything beyond Unix.  Again, template makefiles could help overcome
this issue.

Speaking of this, I've been considering generating a Makefile.inc file
as part of the OCaml installation, containing useful information such
as "where is the OCaml library?", "what version of OCaml is this?",
and "is the native code compiler supported?".  By just putting
        include `ocamlc -where`/Makefile.inc
in your makefiles, most of the need for an autoconfiguration script
could be avoided.

Finally, some standardization on where packages install their files
would help the end user.  Some packages install one module in the
OCaml stdlib dir, others install in a subdir of the stdlib, others
install in the "contrib" subdir, etc.  OCamlfind can handle all this,
but I believe more stringent guidelines on where libraries should go
would help.

Sven Luther commented:

I don't really like the idea of the range of version numbers. There is
no way you can know when a package will change in the future, and then
there is the issue of the version of ocaml it was built for (err, you
are speaking sources, so this is not as important, but still, there may
be source changes as well).

Anyway, i would much prefer that we begin to look at a formal versioning
system, for example a 3 level version system, where the first will be
for api changes, the second for backward compatible api changes (adding
stuff and the like), and the last will be for bug fixes and other non
api changing changes. This is something that is used for C so numbers,
and could work out well here. Basically it guarantees that if the api
number of a dependency did not change, then you can use it.

Sure, it needs more cooperation between the different module authors,
but i think that this will be accepted by them.

Also, this is for ocaml only, problems may happen with stublibs and
other non ocaml stuff, but in these cases, mostly libraries, i believe
depending on the right level of so number would be enough.

As for the build system, i will say here what i already told the small
group apointed by benjamin, i think it would be nice if we could just
have some kind of wrapper for the most common
configuration/build/installation options, and then have each package
provide a small ocaml module which would comply with the following
module type :

module type Ocaml_Package = sig
  val configure : ... -> string
  val build : ... -> string
  val install : ... -> string
end

where the ... would be replaced with the common option strings.

I think it can be expanded with some kind of listing of additional
option by the packages, and can then be used to build a simple little
installation program, or even dynamically loaded or something such.

This would solve all the request for simple and standard installation
processes, let yet the authors free to use whatever build system they
choose, provided they provide the appropriate wrapper, which would
server as documentation as well.

======================================================================
Old cwn
----------------------------------------------------------------------

If you happen to miss a cwn, you can send me a message
(alan.schmitt@inria.fr) and I'll mail it to you, or go take a look at
the archive (http://pauillac.inria.fr/~aschmitt/cwn/). If you also wish
to receive it every week by mail, just tell me so.

======================================================================

Alan Schmitt