OCaml Weekly News

Previous Week Up Next Week

Hello

Here is the latest OCaml Weekly News, for the week of April 16 to 23, 2024.

Table of Contents

A second beta for OCaml 5.2.0

octachron announced

Last week, we merged in the 5.2 branch of OCaml an update to the compiler-libs "shape" API for querying definition information from the compiler.

Unfortunately, this small change of API breaks compatibility with at least odoc. Generally, we try to avoid this kind of changes during the beta releases of the compiler. However, after discussions we concluded that it will be easier on the long term to fix the API right now in order to avoid multiplying the number of supported versions of the shape API in the various OCaml developer tools .

We have thus released a second beta version of OCaml 5.2.0 to give the time to developer tools to update their 5.2.0 version ahead of the release (see below for the installation instructions).

Beyond this changes of API, the new beta contains three minor bug fixes and three documentation updates, which is a good sign in term of stability.

As usual, you can follow the last remaining compatibility slags on the opam readiness for 5.2.0 meta-issue.

If you find any bugs, please report them on OCaml's issue tracker.

Currently, the release is planned for the beginning of May.

If you are interested in full list of features and bug fixes of the new OCaml version, the updated change log for OCaml 5.2.0 is available on GitHub.

Installation Instructions

The base compiler can be installed as an opam switch with the following commands on opam 2.1:

opam update
opam switch create 5.2.0~beta2

The source code for the beta is also available at these addresses:

Fine-Tuned Compiler Configuration

If you want to tweak the configuration of the compiler, you can switch to the option variant with:

opam update
opam switch create <switch_name> ocaml-variants.5.2.0~beta2+options <option_list>

where option_list is a space-separated list of ocaml-option-* packages. For instance, for a flambda and no-flat-float-array switch:

opam switch create 5.2.0~beta2+flambda+nffa ocaml-variants.5.2.0~beta2+options ocaml-option-flambda ocaml-option-no-flat-float-array

All available options can be listed with opam search ocaml-option.

Changes since the first beta

  • Compiler-libs API Changes
    • #13001: do not read_back entire shapes to get aliases' uids when building the usages index (Ulysse Gérard, review by Gabriel Scherer and Nathanaëlle Courant)
  • Bug Fixes
    • #13058: Add TSan instrumentation to caml_call_gc(), since it may raise exceptions. (Fabrice Buoro, Olivier Nicole, Gabriel Scherer and Miod Vallat)
    • #13079: Save and restore frame pointer across Iextcall on ARM64 (Tim McGilchrist, review by KC Sivaramakrishnan and Miod Vallat)
    • #13094: Fix undefined behavior of left-shifting a negative number. (Antonin Décimo, review by Miod Vallat and Nicolás Ojeda Bär)
  • Documentation Updates
    • #13078: update Format tutorial on structural boxes to mention alignment questions. (Edwin Török, review by Florian Angeletti)
    • #13092: document the existence of the [@@poll error] built-in attribute (Florian Angeletti, review by Gabriel Scherer)
    • #13066, update OCAMLRUNPARAM documentation for the stack size parameter l (Florian Angeletti, review by Nicolás Ojeda Bär, Tim McGilchrist, and Miod Vallat)

An implementation of purely functional double-ended queues

Humza Shahid announced

I have some code that might be useful to others here. I had the idea of a new purely functional implementation for double ended queues, and I implemented it (https://github.com/hummy123/bro-deque)[here].

The idea is pretty simple, and it proves to be quite fast in benchmarks.

The idea is to have a record containing:

  • A head array representing the start of the queue, with a limit on the number of elements it can have.
  • A tail array representing the end of the queue, also with a limit on the number of elements it can have.
  • A balanced binary tree based on the rope data structure. (The internal nodes pointing to other nodes contain integer metadata indicating the number of elements on the left and right subtrees, and leaf nodes contain an array of elements.)

When trying to insert into either the head or tail array when the array is at max capacity, the array is either appended or prepended to the tree and the array/element we wanted to insert is now either the head or tail.

I was looking for some way to test the performance and adapted (this code)[https://discuss.ocaml.org/t/ocaml-speed-recursive-function-optimization/13502/3] to use it, and it's pretty fast - only about 4x slower than the standard library's mutable queue. (Although this was really implemented in mind aiming for fast access time rather than fast insertion/removal time.)

It has some non-standard functions for double ended queues too, like O(log n) insert/removal/indexing at any arbitrary location (with a constant that makes this faster than on a typical binary tree - a typical binary tree contains on element per node, increasing height, but this contains arrays of elements at the leaves so more data is packed and the height is shorter).

Some other people might find it useful, so here it is for others to copy-and-paste. I don't know if it's worth putting on opam (I don't have a use for this myself in any of my projects but curiosity led me to implement it.)

Feedback / Help Wanted: Upcoming OCaml.org Cookbook Feature

Cuihtlauac Alvarado announced

We've just updated the cookbook: https://staging.ocaml.org/cookbook. We'd love to have your feedback on it. The corresponding PR is still the same: https://github.com/ocaml/ocaml.org/pull/1839

The visual design is not yet final, but it works. It is organized in recipes, tasks and categories.

A task is something that needs to be done inside a project. A recipe is a code sample and explanation of how to perform a task using a combination of packages. Some tasks can be performed using different combination of libraries, each is a different recipe. Categories are groups of tasks or categories

You'll see most tasks don't have any recipes. We hope to collect the best recipes. Categories are also open for discussion.

Picos — Interoperable effects based concurrency

polytypic announced

Picos is a framework for building interoperable elements such as

  • schedulers that multiplex large numbers of user level fibers to run on a small number of system level threads,
  • mechanisms for managing fibers and for structuring concurrency,
  • communication and synchronization primitives, such as mutexes and condition variables, message queues, STMs, and more, and
  • integrations with low level asynchronous IO systems,

of effects based cooperative concurrent programming models.

polytypic then announced

I'm happy to announce that version 0.2.0 of Picos is now available on opam.

A small core picos framework allows concurrent abstractions implemented in Picos to communicate with Picos compatible schedulers.

In addition to the core framework, the picos package comes with a couple of sample schedulers and some scheduler agnostic libraries as well as bunch of auxiliary libraries.

Sample schedulers:

  • picos.fifos — Basic single-threaded effects based Picos compatible scheduler for OCaml 5.
  • picos.threaded — Basic Thread based Picos compatible scheduler for OCaml 4.

Scheduler agnostic libraries:

  • picos.sync — Basic communication and synchronization primitives for Picos.
  • picos.stdio — Basic IO facilities based on OCaml standard libraries for Picos.
  • picos.select — Basic Unix.select based IO event loop for Picos.

Auxiliary libraries:

  • picos.domain — Minimalistic domain API available both on OCaml 5 and on OCaml 4.
  • picos.exn_bt — Wrapper for exceptions with backtraces.
  • picos.fd — Externally reference counted file descriptors.
  • picos.htbl — Lock-free hash table.
  • picos.mpsc_queue — Multi-producer, single-consumer queue.
  • picos.rc — External reference counting tables for disposable resources.
  • picos.thread — Minimalistic thread API available with or without threads.posix.

All of the above are entirely opt-in and you are free to mix-and-match with any other Picos compatible future schedulers and libraries implemented in Picos or develop your own.

See the reference manual for further information.

Ppxlib dev meetings

Nathan Rebours announced

You can find our last meeting's notes here.

We had three guests yesterday: @shonfeder @lubegasimon and @moazzammoriani.

You are always welcome to join whether you have a specific topic you want to bring up or you just want to tag along. We'll post the link here ahead of the meeting.

Ortac 0.2.0

Nicolas Osborne announced

We are very excited to announce this new Ortac release!

Ortac is a set of tools that translate Gospel specifications into OCaml code and use these translations to generate programs that check at runtime that the OCaml implementation respects the Gospel specifications.

You can find the project on this repo and install it via opam.

This new release contains four packages:

  • ortac-core
  • ortac-runtime
  • ortac-qcheck-stm
  • ortac-runtime-qcheck-stm

The main improvements that brings this release concern the ortac-qcheck-stm plugin (the other three packages are mainly released for compatibility reasons).

ortac-qcheck-stm provides a plugin for Ortac. It generates QCheck-STM tests for a module specified with Gospel. QCheck-STM is a model-based testing framework and Ortac/QCheck-STM relies on the Gospel models you gave in the specifications to build the QCheck-STM tests.

I'd like to highlight two of these improvements.

The first one is that type invariants for what we call the system under test are now checked. Let's say you want to generate QCheck-STM tests for a fixed-size container. You can give the following specification to your type:

type 'a t
(*@ mutable model contents : 'a list
    model size : int
    with t invariant List.length t.contents <= t.size *)

Now, the generated tests will check that the invariant is respected at initialisation of the system under test (the value of type 'a t) and that it is preserved by the functions being tested.

The second improvement concerns the test failure message. In order to make the failure more informative, a message stating which part of the Gospel specifications has been violated and a small OCaml program that demonstrates the unexpected behaviour will be displayed.

For example, with an artificial bug in the Array.length function, running the Ortac/QCheck-STM-generated test will print the following failure message:

random seed: 172339461
generated error fail pass / total     time test name
[✗]    1    0    1    0 / 1000     0.0s Array STM tests (generating)

--- Failure --------------------------------------------------------------------

Test Array STM tests failed (5 shrink steps):

   length sut

+++ Messages ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Messages for test Array STM tests:

Gospel specification violation in function length

  File "array.mli", line 7, characters 12-22:
    i = t.size

when executing the following sequence of operations:

  open Array
  let protect f = try Ok (f ()) with e -> Error e
  let sut = make 16 'a'
  let r = length sut
  assert (r = 16)
  (* returned 42 *)

================================================================================
failure (1 tests failed, 0 tests errored, ran 1 tests)

Although it has already helped find and fix some bugs, this project is still fairly new. So, feel free to try it and report any issue.

Happy testing!

OUPS meetup april 2024

zapashcanon announced

The next OUPS meetup will take place on Thursday, 25th of April 2024. It will start at 7pm at the 4 place Jussieu in Paris.

:warning: :trumpet: It will be in the in the Esclangon building (amphi Astier). :trumpet: :warning:

Please, register on meetup as soon as possible to let us know how many pizza we should order.

For more details, you may check the OUPS’ website .

This month will feature the following talks :

Symbolic execution for all with Owi and Wasm – Léo Andrès

WebAssembly (Wasm) is a new binary compilation target adopted by many high-level programming languages such as C/C++, Rust, Java, and Go. Building on this foundation, we present Owi, a toolkit to work with Wasm within the OCaml ecosystem. In particular, Owi features a reference interpreter for Wasm capable of performing both concrete and symbolic execution. In this presentation, we describe how we designed reusable components and a modular interpreter from a concrete one, enabling the sharing of code between the concrete and symbolic interpreters. Additionally, we demonstrate how it is possible to perform symbolic execution of other languages by compiling them to Wasm using the symbolic interpreter. We provide examples of symbolic execution applied to C and Rust code and describe our current work to extend this functionality to support OCaml and other garbage-collected languages by integrating WasmGC into Owi.

Smt.ml - A Multi Back-end Front-end for SMT Solvers in OCaml – Filipe Marques

SMT solvers are crucial tools in fields such as Software Verification, Program Synthesis, and Test-Case Generation. However, using their APIs, especially in typed functional languages like OCaml, can be challenging due to their complexity and lack of user-friendly interfaces. To address this, we propose Smt.ml, an open-source library that serves as a single interface for multiple SMT solvers in OCaml. Currently supporting solvers such as Z3, Colibri2, and Bitwuzla, Smt.ml enables users to seamlessly work with different solvers using one unified syntax. The library incorporates built-in optimizations to handle both concrete and symbolic expressions efficiently. Smt.ml has been successfully integrated with Owi, an interpreter and toolkit for WebAssembly. This integration allowed Owi to perform static symbolic execution and test-case generation for WebAssembly programs. Notably, Owi was able to identify known vulnerabilities in a widely-used C data structure libraries.

Mirage 4.5.0 released

Thomas Gazagnaire announced

On behalf of the Mirage team, I'm happy to announce the release of MirageOS 4.5.0. This was merged in opam-repository last week, so it should be available just in time for the upcoming 14th MirageOS hack retreat!

This release introduces a significant change in the Mirage tool by splitting the definition of command-line arguments used at configure-time and runtime. Command-line arguments used in the configure script (also called 'configuration keys' and defined in the Key module) are essential during the setup of module dependencies for the unikernel, allowing for specialized production of a unikernel for a given target runtime environment. On the other hand, command-line arguments that the unikernel can use a runtime (defined in the Runtime_arg module) are helpful for customizing deployments without altering the dependencies of the unikernels.

  • API changes:
    • There is no more ~stage parameter for Key.Arg.info.
    • Key now define command-line arguments for the configuration tool.
    • There is a new module Runtime_arg to define command-line arguments for the unikernel.
    • As there are no more keys type 'Both, users are now expected to create two separated keys in that case (one for configure-time, one for runtime) or decide if the key is useful at runtime of configure-time.
  • Intended use of configuration keys (values of type 'a key):
    • Used to set up module dependencies of the unikernel, such as the target (hvt, xen, etc.) and whether to use DHCP or a fixed IP address.
    • Enable the production of specialized unikernels suitable for specific target runtime environments and dedicated network and storage stacks.
    • Similar keys will produce reproducible binaries to be uploaded to artifact repositories like Docker Hub or https://builds.robur.coop/.
  • Intended use of command-line runtime arguments (values of type a runtime_arg):
    • Allow users to customize deployments by changing device configuration, like IP addresses, secrets, block device names, etc., post downloading of binaries.
    • These keys don’t alter the dependencies of the unikernels.
    • A runtime keys is just a reference to a normal Cmdliner term.
  • key_gen.ml is not generated anymore, so users cannot refer to Key_gen.<key_name> directly.
    • Any runtime argument has to be declared (using runtime_arg and registered on the device (using ~runtime_args). The value of that argument will then be passed as an extra parameter of the connect function of that device.
    • Configuration keys are not available at runtime anymore. For instance, Key_gen.target has been removed.
  • Code migration:

      (* in config.ml *)
     let key =
       let doc = Key.Arg.info ~doc:"A Key." ~stage:`Run [ "key" ] in
       Key.(create "key" Arg.(opt_all ~stage:`Run string doc))
    let main = main ~keys:[abstract hello] ..
    
    (* in unikernel.ml *)
    let start _ =
      let key = Key_gen.hello () in
      ...
    

    becomes:

    (* in config.ml *)
    let hello = runtime_arg ~pos:__POS__ "Unikernel.hello"
    let main = main ~runtime_args:[hello] ...
    
    (* in unikernel.ml *)
    open Cmdliner
    
    let hello =
      let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in
      Arg.(value & opt string "Hello World!" doc)
    
    let start _ hello =
      ...
    

The mirage-skeleton repository and a few tutorials on https://mirage.io have been updated and now compile with mdx to check for future API breakage. Documentation PRs are very welcome if you find some missing updates. We also welcome more general feedback regarding this API change.

I also would like to use this announcement as a reminder that we have restarted the mirage bi-weekly calls. Check the MirageOS mailing list or the MirageOS Matrix channel for more info. The next one is planned for the 29th of April. If you are using or planning to use MirageOS (or are just curious about the project), feel free to join, it's open to everyone!

Happy hacking!

patricia-tree 0.9.0 - library for patricia tree based maps and sets

Dorian Lesbre announced

I'm happy to announce the release of a new patricia-tree library, version 0.9.0 on opam.

This library that implements sets and maps as Patricia Trees, as described in Okasaki and Gill's 1998 paper Fast mergeable integer maps. It is a space-efficient prefix trie over the big-endian representation of the key's integer identifier.

For full details, visit see the documentation or the source on github.

Features

  • Similar to OCaml's Map and Set, using the same function names when possible and the same convention for order of arguments. This should allow switching to and from Patricia Tree with minimal effort.
  • The functor parameters (KEY module) requires an injective to_int : t -> int function instead of a compare function. to_int should be fast, injective, and only return positive integers. This works well with hash-consed types.
  • The Patricia Tree representation is stable, contrary to maps, inserting nodes in any order will return the same shape. This allows different versions of a map to share more subtrees in memory, and the operations over two maps to benefit from this sharing. The functions in this library attempt to maximally preserve sharing and benefit from sharing, allowing very important improvements in complexity and running time when combining maps or sets is a frequent operation.
  • Since our Patricia Tree use big-endian order on keys, the maps and sets are sorted in increasing order of keys. We only support positive integer keys. This also avoids a bug in Okasaki's paper discussed in QuickChecking Patricia Trees by Jan Mitgaard.
  • Supports generic maps and sets: a 'm map that maps 'k key to ('k, 'm) value. This is especially useful when using GADTs for the type of keys. This is also sometimes called a dependent map.
  • Allows easy and fast operations across different types of maps and set (e.g. an intersection between a map and a set), since all sets and maps, no matter their key type, are really positive integer sets or maps.
  • Multiple choices for internal representation (NODE), which allows for efficient storage (no need to store a value for sets), or using weak nodes only (values removed from the tree if no other pointer to it exists). This system can also be extended to store size information in nodes if needed.
  • Exposes a common interface (view) to allow users to write their own pattern matching on the tree structure without depending on the NODE being used.

Comparison to other OCaml libraries

  • ptmap and ptset

    There are other implementations of Patricia Tree in OCaml, namely ptmap and ptset. These are smaller and closer to OCaml's built-in Map and Set, however:

    • Our library allows using any type key that comes with an injective to_int function, instead of requiring key = int.
    • We support generic (heterogeneous) types for keys/elements.
    • We support operations between sets and maps of different types.
    • We use a big-endian representation, allowing easy access to min/max elements of maps and trees.
    • Our interface and implementation tries to maximize the sharing between different versions of the tree, and to benefit from this memory sharing. Theirs do not.
    • These libraries work with older version of OCaml (>= 4.05 I believe), whereas ours requires OCaml >= 4.14
    • Our keys are limited to positive integers.
  • dmap

    Additionally, there is a dependent map library: dmap. It allows creating type safe dependent maps similar to our heterogeneous maps. However, its maps aren't Patricia trees. They are binary trees build using a (polymorphic) comparison function, similarly to the maps of the standard library. Another difference is that the type of values in the map is independent of the type of the keys, allowing keys to be associated with different values in different maps. i.e. we map 'a key to any ('a, 'b) value type, whereas dmap only maps 'a key to 'a.

    dmap also works with OCaml >= 4.12, whereas we require OCaml >= 4.14.

OCANNL 0.3.1: a from-scratch deep learning (i.e. dense tensor optimization) framework

Lukasz Stafiniak announced

OCANNL 0.3.2 is out now. Thanks!

Other OCaml News

Old CWN

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe to the caml-list.