OCaml Weekly News

Hello

Here is the latest OCaml Weekly News, for the week of January 19 to 26, 2021.

How to get pleasant documentation for a library using Dune?

gasche announced

I'm working to publish a small library using Dune. The documentation automatically generated by dune build @doc looks fairly unpleasant to me, as I don't see an easy way to explain what the library is about. I'm creating this topic in case I am missing something simple, and to get other people to share their library-documentation practices or examples.

Problem description

For the sake of the example let's imagine that the library is called Foo and contains three modules A, B and C. I'm using the standard dune approach of wrapped modules, so I get three compilation units Foo.A, Foo.B, Foo.C. Each module has a .mli file with documentation comments.

When I run dune build @doc, dune generates an index.html file with basically no content, pointing to a foo/index.html file with basically no content, pointing to a foo/Foo/index.html looking like this:

Up – foo » Foo

Module Foo

module A : sig ... end

module B : sig ... end

module C : sig ... end

It's easy to skip the first two pages, and use the third page as a landing page for the documentation of my library. However, this landing page is not very pleasant:

  1. It should explain what the library is about.
  2. It should briefly describe what each module does, so that users know which module they want to look at first.

(Point 2 is especially important with "wrapped libraries", where it's not necessarily obvious which of the several modules is the main entry point with the important functions to look at first. In comparison, in a design where the "entry point" is in the Foo module, with Foo.A and Foo.B as more advanced submodules (or Foo_A and Foo_B in the old days) the user is guided to look at Foo first.)

My problem is: what should I change in my Dune setup to be able to do this?

I have read the dune documentation on documentation, but I could not find an answer to this question.

Rough ideas

Roughly I see two ways to get what I want, that I have not tried yet:

  1. I could write my own landing page for the library as a separate doc.mld file, use the (documentation) stanza to get it included in the built documentation, and use this as the entry point into my library.
  2. I could write my own foo.ml module instead of using Dune's default wrapped-module scaffolding, inserting my own module A = Foo__A aliases, with documentation comments in the standard style. Then I suppose that foo/Foo/index.html would get this content in the way I expect.
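Idea (2) can be sketched in a single file. This is only an illustration under assumptions: the two stand-in modules at the top play the role of the separate compilation units, and their contents (greet, double) are hypothetical. Only the last part corresponds to what a hand-written foo.ml might contain.

```ocaml
(* Stand-ins for the compilation units Foo__A and Foo__B. In a real
   project these would be separate files; their contents here are
   hypothetical. *)
module Foo__A = struct
  let greet name = "hello, " ^ name
end

module Foo__B = struct
  let double x = 2 * x
end

(* What a hand-written foo.ml could contain: one documented alias per
   module, so that the landing page can say what each module is for. *)

(** Entry point: greeting functions. Start here. *)
module A = Foo__A

(** Advanced: arithmetic helpers. *)
module B = Foo__B
```

The aliases add no code of their own: A.greet and Foo__A.greet are the same function, and the documentation comments are the only new information.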

They feel a bit complex to me, and (2) involves the tedious work of redoing the wrapping logic myself. I guess that (1) is not so bad, and I would be inclined to do this if it was documented somewhere as the recommended approach.

(Maybe there is some odoc option that would help solve this problem?)

Examples from other people?

Do you have a library built using dune with nice documentation? If so, can you show the documentation and the corresponding sources (in particular dune setup)?

Thibaut Mattio replied

I think the documentation of Streaming is a great example of option 1 as you describe it.

The corresponding Dune setup can be found here

That's also the approach we took for Opium's documentation, although the index page is certainly not as impressive as Streaming's.

gasche then said

Thanks! It looks like these systems rely on an undocumented feature of the (documentation) stanza (or odoc), which is that a user-provided index.mld file will implicitly replace the automatically-generated index.mld file, giving a reasonably natural result.

The opium documentation also uses the {!modules: modulename ...} markup directive, which is a way to include the module index within this manually-written landing page without having to duplicate the markup. Streaming¹ uses inline html instead to get a nicer-looking result, but it is too much effort. Maybe there is a better way, or the tools could be improved to make this easier.

¹: I'm ashamed to admit that I wasn't aware of this very nice library Streaming, am I consuming the wrong sources of information on the OCaml ecosystem?

Finally, the Opium documentation manifestly has a short synopsis for each module in its listing, which corresponds to my "It should briefly describe what each module does" requirement. I believe that this comes from the first line of the first documentation comment of the module. There are module-global documentation comments in the library I'm working on, but they do not include such first-line headers.
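A minimal sketch of such a first-line header, assuming (as suggested above) that the first line of a module's first documentation comment is what ends up as its synopsis. The module Parser and its function parse are hypothetical. In a real library the comment would sit at the top of the .mli file rather than inside an inline signature.

```ocaml
module Parser : sig
  (** Parsing of [key=value] lines.

      The first line above is meant to serve as the short synopsis; the
      rest of this comment can go into as much detail as needed. *)

  val parse : string -> (string * string) option
  (** [parse line] splits [line] at its first ['='], or returns [None]
      if there is none. *)
end = struct
  let parse line =
    match String.index_opt line '=' with
    | None -> None
    | Some i ->
      let key = String.sub line 0 i in
      let value = String.sub line (i + 1) (String.length line - i - 1) in
      Some (key, value)
end
```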

Once I feel that I understand a good way to do this, I may try to contribute better documentation in dune.

Gabriel Radanne replied

> It looks like these systems rely on an undocumented feature of the (documentation) stanza (or odoc), which is that a user-provided index.mld file will implicitly replace the automatically-generated index.mld file, giving a reasonably natural result.

I confirm this feature is here to stay, is the right one to customize your index page, and in the future will benefit from good support from odoc directly.

> The opium documentation also uses the {!modules: modulename ...} markup directive, which is a way to include the module index within this manually-written landing page without having to duplicate the markup. Streaming¹ uses inline html instead to get a nicer-looking result, but it is too much effort. Maybe there is a better way, or the tools could be improved to make this easier.

I would strongly advise using the modules markup directive, and suggesting output improvements on odoc's bug tracker instead of hacking HTML together. We could absolutely add the synopsis of the module here, for instance.

Daniel Bünzli then said

> which is that a user-provided index.mld file will implicitly replace the automatically-generated index.mld file, giving a reasonably natural result.

This is also the correct way to customize the landing page of your package for odig generated doc sets, see here for more information.

> I confirm this feature is here to stay, is the right one to customize your index page, and in the future will benefit from good support from odoc directly.

There's an open issue about that here.

Alt-Ergo 2.4.0 release

OCamlPro announced

We are pleased to announce a new release of Alt-Ergo!

Alt-Ergo 2.4.0 is now available from Alt-Ergo’s website. An associated opam package will be published in the next few days.

This release contains some major novelties:

  • Alt-Ergo supports incremental commands (push/pop) from the smt-lib standard.
  • We switched command line parsing to use cmdliner. You will need to use --<option name> instead of -<option name>. Some options have also been renamed, see the manpage or the documentation.
  • We improved the online documentation of our solver, available here.

This release also contains some minor novelties:

  • The .mlw and .why extensions are deprecated; the use of the .ae extension is advised.
  • Add --input (resp. --output) option to manually set the input (resp. output) file format
  • Add --pretty-output option to add better debug formatting and to add colors
  • Add exponentiation operation, ** in native Alt-Ergo syntax. The operator is fully interpreted when applied to constants
  • Fix --steps-count and improve the way steps are counted (AdaCore contribution)
  • Add --instantiation-heuristic option that can enable lighter or heavier instantiation
  • Reduce the instantiation context (considered foralls / exists) in CDCL-Tableaux to better mimic the Tableaux-like SAT solver
  • Multiple bugfixes

The full list of changes is available here. As usual, do not hesitate to report bugs, to ask questions, or to give your feedback!

First release of Art - Adaptive Radix Tree in OCaml

Calascibetta Romain announced

I'm glad to announce the first release of art, an implementation of the Adaptive Radix Tree in OCaml. The goal of this library is to provide a data structure that behaves like Map.S (and keeps the keys ordered) with the performance of Hashtbl.t.

Performances

art uses Bechamel as a tool for micro-benchmarking, and it compares the performance of insertion and lookup. According to the benchmark, art is definitely faster than Hashtbl.t for insertion.

For the lookup operation, art is slightly faster than Hashtbl.t. The main advantage compared to Hashtbl.t is the ability to use maximum/minimum or to iterate over the whole data structure in a certain order.

In detail, the benchmarks use strings whose lengths follow a normal distribution. A practical example where art will do better than Hashtbl.t is indexing several words (such as email addresses).
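The ordering advantage mentioned above can be illustrated with the standard library alone. The sketch below does not use art's API: it only contrasts Map (ordered keys, with minimum and maximum) and Hashtbl (fast lookup, no ordering guarantee), the two structures art is compared with. The email addresses are made up.

```ocaml
module SMap = Map.Make (String)

(* Hypothetical keys to index. *)
let emails = [ "carol@example.org"; "alice@example.org"; "bob@example.org" ]

(* Ordered map: keys are kept sorted. *)
let map =
  List.fold_left (fun m e -> SMap.add e (String.length e) m) SMap.empty emails

(* Hash table: direct lookup, but iteration order is unspecified. *)
let tbl : (string, int) Hashtbl.t = Hashtbl.create 16
let () = List.iter (fun e -> Hashtbl.replace tbl e (String.length e)) emails

(* Operations that depend on the key order are only available on the map. *)
let smallest = fst (SMap.min_binding map)
let largest = fst (SMap.max_binding map)
let ordered_keys = List.map fst (SMap.bindings map)

let () =
  Printf.printf "min=%s max=%s\n" smallest largest;
  List.iter print_endline ordered_keys
```

With Hashtbl alone, getting the smallest key or an ordered traversal requires collecting and sorting all keys first.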

Tests

Of course, the library provides a fuzzer, and the tests have a coverage of 91.93%.

Read Optimized Write Exclusion - ROWEX

Even if it's not a part of the package, the distribution comes with a lock-free implementation of art: rowex. This implementation comes from a research paper about data structures and atomic operations.

ROWEX provides a persistent implementation which manipulates a file to store the whole data-structure. The goal is to provide an indexer free to be manipulated by several processes in parallel.

Currently, the implementation of ROWEX in OCaml is not well-tested and it is not distributed. It does not take advantage of ocaml-multicore (but it should), but the outcomes are good and development will focus more on this part.

So feel free to play with it a bit :+1:.

perf demangling of OCaml symbols (and a short introduction to perf)

Fabian announced

As a project sponsored by the OCaml Software Foundation, I've worked on demangling OCaml symbols in perf. Some screenshots are below. The work is currently being upstreamed. In the meantime, it can be used as follows:

git clone --depth=1 https://github.com/copy/linux.git
# or:
# wget https://github.com/copy/linux/archive/master.tar.gz && tar xfv master.tar.gz
cd linux/tools/perf
make
alias perf=$PWD/perf
# or copy perf to somewhere in your PATH

Your distribution's version of perf will also work for the examples below, but will have less readable symbols :-)

Short introduction to perf

Perf is a Linux-only sampling profiler (and more), which can be used to analyse the performance profile of OCaml and other executables. When compiling with ocamlopt, add -g to include debug information in the executable. dune does this automatically, even in the release profile. To start a program and record its profile:

perf record --call-graph dwarf program.exe

Or record a running program:

perf record --call-graph dwarf -p `pidof program.exe`

Then, view a profile using:

perf report # top-down
perf report --no-children # bottom-up

Within the report view, the following keybindings are useful:

  • +: open/close one callchain level
  • e: open/close entire callchain
  • t: toggle between current thread and all threads (e.g., only dune, ocamlopt, etc.)

Or generate a flamegraph:

git clone https://github.com/brendangregg/FlameGraph
cd FlameGraph
perf script -i path/to/perf.data | ./stackcollapse-perf.pl | ./flamegraph.pl > perf-flamegraph.svg

You may need to run the following command to allow recording by non-root users (more info):

echo 0 | sudo tee /proc/sys/kernel/perf_event_paranoid
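To try the commands above, any small CPU-bound program will do. The one below is a hypothetical example, assumed to be saved as program.ml and compiled with something like ocamlopt -g program.ml -o program.exe; the function names are only there to give the profile a few distinct symbols.

```ocaml
(* Deliberately naive Fibonacci: almost all samples should land here. *)
let rec fib n = if n < 2 then n else fib (n - 1) + fib (n - 2)

(* A second function, so that the call graph has more than one level. *)
let sum_fibs upto =
  let total = ref 0 in
  for i = 0 to upto do
    total := !total + fib i
  done;
  !total

let () = Printf.printf "%d\n" (sum_fibs 30)
```

A larger argument to sum_fibs gives a longer run and more samples, which can help when the report looks too sparse.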

Decimal 0.2.1 - arbitrary-precision decimal floating point

Yawar Amin announced

Happy to announce that decimal 0.2.1 has been published on opam.

decimal is a port of Python's decimal module to OCaml and implements the General Decimal Arithmetic Specification. However, note that it is a port in progress: basic arithmetic and rounding functions have been ported, but I am still working on powers and logs. The ported functions pass the same unit test suite that the Python version does (with some minor modifications).

Another caveat: currently the library is only supported on 64-bit architectures due to (exponent) overflow issues on 32-bit. If anyone is willing to test and fix overflows on 32-bit, I am more than happy to accept PRs.

Here's an example of using the module:

(* Rosetta Code Currency Example *)

(* Demo purposes, normally you'd prefix module name or local open *)
open Decimal

let hamburger_qty = of_string "4_000_000_000_000_000"
let hamburger_amt = of_string "5.50"
let milkshake_qty = of_int 2
let milkshake_amt = of_string "2.86"

(* Shortcut to divide 7.65 by 100 *)
let tax_rate = of_string "7.65e-2"

let subtotal = hamburger_qty * hamburger_amt + milkshake_qty * milkshake_amt
let tax = round ~n:2 (subtotal * tax_rate)
let total = subtotal + tax

let () = Format.printf "Subtotal: %a
     Tax:  %a
   Total: %a\n" pp subtotal pp tax pp total

You can get the package with: opam install decimal. Minimum OCaml version 4.08.

Basic GitLab CI configuration

gasche announced

After a long ci-golfing adventure (83 tests), I got a .gitlab-ci.yml file that I think is reusable and useful for small projects / libraries:

Features:

  • It is project-agnostic, so it should work unchanged for your own (simple) projects.
  • It caches the opam dependencies.
  • It builds the project, runs the tests and builds the documentation.
  • Several compiler versions can be tested in parallel.
  • It provides an easy way to upload the documentation as "Gitlab project Pages".

CI times are satisfying: on very small libraries I observe an 11-minute job time on the first run (or when cleaning the opam cache), and a 2-minute job time on following runs.

The expected usage-mode of this CI configuration is that you copy it in your own project. If you find that you need/want additional features, ideally you would try to write them in a project-agnostic way and contribute them back to the example repository.

This configuration does not use @smondet's trick of generating a docker image on the fly. I think this would be an excellent idea to get more reliable caching, but it is too complex for me and I don't see how to do it in a maintainable and project-agnostic way.

Current status

I wrote this CI configuration over the week-end, and have not used it much. I expect it to keep evolving somewhat before it stabilizes. Feedback from other people trying to use the configuration would be warmly welcome.

Aside on _build caching

I also implemented caching of dune's _build data, inspired by the data-encoding example of @raphael-proust. I don't need it for my small projects (dune build is 3s, compared to 1m setting up the Docker image), but I thought it would make the CI configuration scale better to larger projects.

When I tested this CI configuration, I discovered that caching the dune _build data does not work as well as I had expected. (Tracking issue: dune#4150).

I can tell because I am asking dune to tell me about what it is rebuilding (dune build --display short). I suspect that projects that cache the _build data without logging what dune (re)builds are also not caching as much as they think they are.

(But then maybe the use of a fixed-compiler OPAM image, as data-encoding is using, solves the issue.)

Official CI template?

I considered submitting this CI configuration as an "OCaml Gitlab CI template" to go with the official list of "blessed" CI templates in the documentation. But reading the Development guide for Gitlab CI/CD templates convinced me that my CI configuration is nowhere near ready to serve this role.

Gitlab developers apparently expect that users will be able to "include" those CI templates by pointing to their URL, and then tune them for their own use-case (without modifying them) by performing some (unreasonable?) inheritance tricks using whatever those configurations offer as abstraction/inheritance/extension/overriding mechanism. Let's just say that this is next-level CI configuration writing, and that my script is not ready for this.

OCaml Office Hours?

Deep in this thread, UnixJunkie said

In addition to mailing lists and discuss, there is also an IRC channel where people can interact with some ocaml experts in a more "interactive" manner (irc://irc.freenode.net/#ocaml)

json-data-encoding 0.9

Raphaël Proust announced

On behalf of Nomadic Labs, I'm happy to announce the release of json-data-encoding version 0.9.

The code is hosted on Gitlab: https://gitlab.com/nomadic-labs/json-data-encoding
It is distributed under GNU LGPL with linking exception.
The documentation is available online: https://nomadic-labs.gitlab.io/json-data-encoding/
The package is available under opam: opam install json-data-encoding

json-data-encoding is a library to define encoder/decoder values to translate OCaml values to JSON and back. It also generates JSON schemas so you can document the value representation. It can use either Ezjsonm or Yojson as backends.

The version 0.9 has the following new features:

  • more tests
  • memoisation of fixpoint encoding to avoid repeated computations
  • support for format field for string schemas (see https://json-schema.org/understanding-json-schema/reference/string.html#format) (contributed by @levillain.maxime)
  • fixed integer bound printing in schemas (bug report by @pw374)
  • support for json-lexeme streaming (see details below)
  • support for inclusion/exclusion of default-value fields during serialisation (contributed by @levillain.maxime)
  • improved union-of-object schemas (contributed by @levillain.maxime)

One major difference with the previous release is the inclusion of a lexeme-streaming JSON constructor. Specifically, the function

val construct_seq : 't encoding -> 't -> jsonm_lexeme Stdlib.Seq.t

generates a sequence of Jsonm.lexeme values. This sequence is lazy (in the sense of Stdlib.Seq, not of Stdlib.Lazy) and it paves the way to a similar feature in data-encoding. An interesting feature of sequences is that they can be used in vanilla OCaml settings as well as Lwt/Async settings where they allow user-driven yielding in between elements.
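The laziness and the user-driven consumption can be sketched with Stdlib.Seq alone. Everything below is an illustration under assumptions: the lexeme type is a hypothetical stand-in for Jsonm.lexeme so that the example needs no dependency, the sequence is built by hand rather than by construct_seq, and take is a helper written for this example.

```ocaml
(* Hypothetical stand-in for Jsonm.lexeme. *)
type lexeme = [ `Os | `Oe | `Name of string | `Float of float ]

(* Counts how many elements have actually been produced so far. *)
let produced = ref 0

(* A lazy sequence of lexemes for the object {"x": 1, "y": 2}.
   The counter is only incremented when an element is forced. *)
let lexemes : lexeme Seq.t =
  List.to_seq [ `Os; `Name "x"; `Float 1.; `Name "y"; `Float 2.; `Oe ]
  |> Seq.map (fun l -> incr produced; l)

(* Consume at most [n] elements; return them together with the rest of
   the sequence, which has not been forced yet. *)
let rec take n seq acc =
  if n = 0 then (List.rev acc, seq)
  else
    match seq () with
    | Seq.Nil -> (List.rev acc, Seq.empty)
    | Seq.Cons (x, rest) -> take (n - 1) rest (x :: acc)

(* The consumer decides when to stop and when to resume. *)
let first_two, rest = take 2 lexemes []
let produced_after_two = !produced
let remaining, _ = take max_int rest []
```

Between the two calls to take, the consumer is free to do anything else, which is where a cooperative scheduler could be given a chance to run.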

VSCode OCaml Platform v1.6.0

Rudi Grinberg announced

On behalf of the vscode-ocaml-platform team, I'm pleased to announce 1.6.0. This release contains a new activity tab for managing opam switches developed by @tmattio. We hope you find it useful.

Change log:

- Highlight token aliases in Menhir associativity declarations (#473)

- Activate the extension when workspace contains OCaml, Reason sources or
  project marker files. (#482)

- Add `ocaml.useOcamlEnv` setting to determine whether to use `ocaml-env` for
  opam commands from OCaml for Windows (#481)

- Fix terminal creation when using default shell and arguments (#484)

- Add an OCaml activity tab.

  The activity tab provides three views: the available switches, the build
  commands and a Help and Feedback section with links to community channels.

- Support `eliom` and `eliomi` file extensions (#487)

- Fix ocaml/ocaml-lsp#358: automatic insertion of an inferred interface was
  inserting code incorrectly on the second switch to the newly created (unsaved)
  `mli` file. If the new `mli` file isn't empty, we don't insert inferred
  interface (#498)

release 0.3.0 of drom, the OCaml project creator

Fabrice Le Fessant announced

We are pleased to release version 0.3.0 of drom, the OCaml project creator.

drom is born from a simple observation: every time you create a new OCaml project, you spend time searching and copy-pasting files from other projects, adapting them to the new one. drom does that for you: it comes with a set of predefined skeleton projects, that you can easily configure and adapt to your goal.

It's as easy as:

$ drom new
  # check the list of skeletons
$ drom new PROJECT_NAME --skeleton SKELETON_NAME
$ cd PROJECT_NAME
$ emacs drom.toml
   # ... edit basic description, dependencies, etc. ...
$ drom project
$ drom build

Thanks to contributors (Maxime Levillain and David Declerck), the list of project skeletons for drom 0.3.0 has grown:

  • OCaml projects: library menhir mini_lib mini_prg ppx_deriver ppx_rewriter program
  • C Bindings: c_binding ctypes_foreign ctypes_stubs
  • Javascript projects: js_lib js_prg vue wasm_binding

and you can easily contribute your own: for example, gh:USER/SKELETON will trigger the download of the USER/SKELETON project from Github as a template for your new project.

drom is available from opam: opam update && opam install drom.0.3.0

https://github.com/ocamlpro/drom

Enjoy!

Old CWN

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.