OCaml Weekly News

Hello

Here is the latest OCaml Weekly News, for the week of November 19 to 26, 2024.

OCaml 5.2.1 released

octachron announced

We have the pleasure of announcing the release of OCaml 5.2.1, dedicated to the memory of Niels Bohr and Paul Éluard on the anniversary of their deaths.

OCaml 5.2.1 is a collection of safe but important runtime bug fixes backported from the 5.3 branch of OCaml to improve the stability of the 5.2 runtime while waiting for the upcoming release of OCaml 5.3.0.

The full list of bug fixes is available below for more details.

Happy hacking, Florian Angeletti, for the OCaml team.

Installation Instructions

The base compiler can be installed as an opam switch with the following commands:

opam update
opam switch create 5.2.1

The source code for the release is also directly available on:

Bug Fixes In OCaml 5.2.1 (18 November 2024)

  • Runtime System:
    • #13207: Be sure to reload the register caching the exception handler in caml_c_call and caml_c_call_stack_args, as its value may have been changed if the OCaml stack is expanded during a callback. (Miod Vallat, report by Vesa Karvonen, review by Gabriel Scherer and Xavier Leroy)
    • #13252: Rework register assignment in the interpreter code on m68k on Linux, due to the %a5 register being used by GLIBC. (Miod Vallat, report by Stéphane Glondu, review by Gabriel Scherer and Xavier Leroy)
    • #13268: Fix a call to test in configure.ac that was causing errors when LDFLAGS contains several words. (Stéphane Glondu, review by Miod Vallat)
    • #13234, #13267: Open runtime events file in read-write mode on ARMel (ARMv5) systems due to atomic operations limitations on that platform. (Stéphane Glondu, review by Miod Vallat and Vincent Laviron)
    • #13188: Fix races in the FFI code coming from the use of Int_val(...) on rooted values inside blocking sections / without the runtime lock. (Calling Int_val(...) on non-rooted immediates is fine, but any access to rooted values must be done outside blocking sections / with the runtime lock.) (Etienne Millon, review by Gabriel Scherer, Jan Midtgaard, Olivier Nicole)
    • #13318: Fix regression in GC alarms, and fix them for Flambda. (Guillaume Munch-Maccagnoni, report by Benjamin Monate, review by Vincent Laviron and Gabriel Scherer)
    • #13140: POWER back-end: fix issue with call to caml_call_realloc_stack from a DLL (Xavier Leroy, review by Miod Vallat)
    • #13370: Fix a low-probability crash when calling Gc.counters. (Demi Marie Obenour, review by Gabriel Scherer)
    • #13402, #13512, #13549, #13553: Revise bytecode implementation of callbacks so that it no longer produces dangling registered bytecode fragments. (Xavier Leroy, report by Jan Midtgaard, analysis by Stephen Dolan, review by Miod Vallat)
    • #13502: Fix misindexing related to Gc.finalise_last that could prevent finalisers from being run. (Nick Roberts, review by Mark Shinwell)
    • #13520: Fix compilation of native-code version of systhreads. Bytecode fields were being included in the thread descriptors. (David Allsopp, review by Sébastien Hinderer and Miod Vallat)
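
For readers unfamiliar with the two Gc entry points named in #13370 and #13502, here is a minimal sketch of how they are typically called. It does not reproduce either bug; the helper names (minor_words_allocated, register_and_drop, finalised) are made up for illustration, and whether the finaliser has already run after a single Gc.full_major is not guaranteed, so the program only prints the count.

```ocaml
(* Illustrative helper (hypothetical name): run [f] and report how many
   words were allocated in the minor heap while it ran, using the first
   component of Gc.counters (minor_words, promoted_words, major_words). *)
let minor_words_allocated f =
  let before, _, _ = Gc.counters () in
  let result = f () in
  let after, _, _ = Gc.counters () in
  (result, after -. before)

(* Counts how many finalisers have run so far. *)
let finalised = ref 0

(* Allocate a heap block, attach a finaliser with Gc.finalise_last, and
   drop the only reference to the block. Gc.finalise_last calls its
   function (with no argument) once the value is definitely unreachable. *)
let register_and_drop () =
  let block = Bytes.create 1024 in
  Gc.finalise_last (fun () -> incr finalised) block

let () =
  let xs, words =
    minor_words_allocated (fun () -> List.init 1_000 (fun i -> i))
  in
  Printf.printf "list of %d elements, about %.0f minor words\n"
    (List.length xs) words;
  register_and_drop ();
  Gc.full_major ();
  (* The finaliser usually has run by now, but this is not guaranteed. *)
  Printf.printf "finalisers run so far: %d\n" !finalised
```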

smaws preview release, an AWS SDK for OCaml using eio

Chris Armstrong announced

I'm pleased to announce the first preview release for the smaws library (0.1~preview1).

smaws provides AWS bindings for OCaml using the modern eio library for effects-based concurrency handling.

What's in this release

This release includes SDKs for some AWS services and is intended to demonstrate its API.

It is not production ready, lacking important features such as full API documentation, EC2/ECS instance metadata authentication, retry and timeout handling, etc. It also needs support for the other internal AWS API types to extend coverage across most AWS services.

Motivation

I wanted to build an AWS SDK using modern effects-based concurrency. I've built similar bindings for ReScript and ReasonML in the past (some of the code is in fact ported across) but these are the first OCaml-native bindings I've created.

Unlike similar projects in the OCaml ecosystem, it uses the newer Smithy definitions to generate its bindings instead of the Python botocore definitions. These should be better supported by AWS in the future with richer API definitions.

What's next

My next task is to finish off API documentation generation, and then expand support for all the authentication methods and other API types that will allow this to be used with most AWS services.

ppx_deriving_ezjsonm

Patrick Ferris announced

I'm happy to announce the release of ppx_deriving_ezjsonm (based off of ppx_deriving_yaml). The two libraries share a common definition of a "value" which made the reuse of the existing deriver possible for a simple JSON deriver.

opam update
opam install ppx_deriving_ezjsonm

The documentation is online.

This library may come in handy when your dependency cone already includes ezjsonm. If that is not the case, you would probably have better luck in the yojson ecosystem of tools.

Happy JSON-ing :camel:

FUN OCaml now has a YouTube Channel

Sabine Schmaltz announced

I just created a YouTube channel for FUN OCaml. :sparkles: :camel:

The talk recordings from the conference in Berlin on September 16 + 17, 2024 are now available for viewing!

https://www.youtube.com/@FUNOCaml/featured

If you can, commenting, liking, or subscribing helps us to make these videos more visible and easier to find on YouTube, so big thanks to everyone who helps us with this! :orange_heart: :camel:

For people who avoid YouTube: The videos will also be made available on watch.ocaml.org.

Terrateam's open source Ocaml repository

Malcolm announced

A few years ago my friend and I started a company called Terrateam, which does infrastructure orchestration on GitHub. Being that I am an Ocamler and we are a lean company, we chose to use Ocaml as our primary language. We recently went open source and I'm posting the link here to contribute an example of an actual company using Ocaml. A real repository.

The code can be found here.

There are a few things to note about the repo:

  1. It's a mono repo, so while many of the libraries in there are generic, they are not really individually consumable as is.
  2. We have our own concurrency framework (more on that below).
  3. We use our own build library (pds, which is in opam).
  4. The code is in flux all the time so things change rapidly.

Why did we build our own concurrency framework?

Disclaimer: Yet another concurrency framework? Yep! Do I expect anyone to use it? Nope, and that's ok. It is designed for our needs. It's meant to be maintainable by one person. It's not meant to compete with Lwt or Async for mind share. If it grows, great, if it doesn't, I'm happy still.

Our concurrency framework is called "Asynchronous Building Blocks" (Abb). It started over a decade ago when I was frustrated with a few things:

  1. I wanted kqueue support in Async, but (at the time) Async required modifying a handful of repos to support it and it just wasn't obvious how.
  2. Lwt supported kqueue, but for no good reason, I just didn't like Lwt. Part of it was how failure worked in Lwt and the other part is just that it didn't fit my aesthetic. That isn't a ding against Lwt, just personal preference.
  3. I wanted as much of it to be implemented in Ocaml as possible. As it stands now, the only C code is libkqueue which is a little shim to allow kqueue code to run on Linux, otherwise everything is in Ocaml.
  4. I didn't like that neither Async nor Lwt really supported cancelling operations. I wanted that to be part of the framework, not an ad-hoc feature per library. Coming from Erlang, cancelling is really important to me and part of how I think about writing concurrent software. I was bummed that (last I looked) Eio explicitly rejected cancelling.
  5. I also wanted a little experiment of "what if the concurrency library exposed a syscall interface like an OS?" So a lot of the interface is meant to look low-level (I don't think this idea really panned out or made Abb meaningfully different).
  6. I also just like having my own frameworks.

Add a dash of naivete ("how hard can it be to build a concurrency framework?") and I started my own. First commit was Mar 9, 2013.

Much of the concurrency monad is based on an unreleased library called Fut by @dbuenzli.

Over time, Abb matured to where I could use it in my personal projects. And by the time we decided to make Terrateam, I felt it was good enough for production. And it's been running production traffic for a few years now.

One unexpected benefit of Lwt and Async existing in the community is that adding a third one isn't that hard. Almost all libraries that want to be used support both, and that usually means that they have a generic interface. Cohttp and Dns are examples. So I could use existing libraries for things I didn't want to or don't feel I could reasonably implement myself.

I've also used Abb as a foundation for my web framework called Brtl (pronounced Brutal) which is both a backend framework built on Cohttp and a frontend framework built on Brr. It really doesn't do anything fancy, like Dream, it's pretty low level and focused on being simple.

The good:

  1. It works! At least, for me.
  2. Given that I wrote every single line of code, debugging and bug fixing (which is less and less) is very easy. I also have a really great mental model of how it works.
  3. I like that I can just cancel a whole graph of async work if it's no longer needed.
  4. The futures library works on FreeBSD, Linux, and JavaScript.
  5. The test coverage is pretty darn high. This is because it's a pretty intricate thing to implement so I had to implement a lot of tests to stay sane.

The bad:

  1. Performance is not anything special. I don't think this is a fundamental flaw, it's just that it is as fast as I need it to be right now.
  2. Some of the API is a bit awkward if you don't know the system. Or names are long, like Abb_future_combinators.
  3. The multi-target build story kind of sucks. I think that might be a bigger issue with the pds build system but for now in the web framework you have to use Abb_js rather than Abb for everything.
  4. There are definitely some corners cut, especially around file IO, but that's OK, we don't do much file IO.
  5. It explicitly takes advantage of the fact that everything runs in a single thread. So implicitly going multi-threaded would probably break things.

Future work:

There really isn't a lot of future work. For the most part: Abb is done. Or should I say the interface is done. Yes, it will need updates to fight bit rot, but there isn't much more for it to do. It runs your code concurrently, the end.

However, as Ocaml5 becomes more of a thing, it will need to take advantage of that. I haven't really thought about how to do it. One item I have in my to-do list is to evaluate if Picos could be a base layer for Abb. Abb is a layered approach so really you only need to implement the Abb_intf.S interface and everything above that should Just Work (given single threaded semantics). I think any future work to support multi core will probably need an explicit "this crosses a thread boundary" API. Abb will get there, eventually, but right now it doesn't need to.

Effects will obviously have a big impact, I have no idea what that'll do for Abb. I hope I can transition it slowly to supporting effects but I don't want to look at effects until it's in the type system.

Some libraries in the repo that may be of interest:

  1. Abb_scheduler_kqueue - The most used scheduler. It implements the Abb_intf interface.
  2. Abb_scheduler_select - A simpler select-based scheduler. This is meant to be used any place kqueue is not supported and also as a demo.
  3. Pgsql_io - An implementation of the PostgreSQL protocol.
  4. Githubc2 - An automatically generated GitHub REST library generated from their JSON Schema. This actually has no Abb dependency, just implements the API serializing/deserializing.
  5. OpenAPI CLI - This generates a library (see Githubc2) from an OpenAPI spec. It is, absolutely, a bit of a rat's nest, but it works. We chose to do code-gen for this because I didn't want to be blocked when compiling based on different compiler versions as we're using Ast_helper. Ocaml, the code, is more stable than Ocaml, the compiler API.

There is a bunch of other stuff in there. If you decide to poke around and have any questions, feel free to ask. I can promise: not every decision in there is well thought out or coherent.

OUPS December 2024

zapashcanon announced

CAUTION: the time has been changed from 7pm to 6:30pm

The next OUPS meetup will take place on Thursday, 12th of December 2024. It will start at 6:30pm at the 4 place Jussieu in Paris. It will be in the Esclangon building (amphi Astier).

Please register on meetup as soon as possible to let us know how many pizzas we should order.

For more details, you may check the OUPS’ website.

This time we'll have the following talks:

Snapshottable Stores – Clément Allain & Gabriel Scherer (@gasche)

We say that an imperative data structure is snapshottable or supports snapshots if we can efficiently capture its current state, and restore a previously captured state to become the current state again. This is useful, for example, to implement backtracking search processes that update the data structure during search.

Inspired by a data structure proposed in 1978 by Baker, we present a snapshottable store, a bag of mutable references that supports snapshots. Instead of capturing and restoring an array, we can capture an arbitrary set of references (of any type) and restore all of them at once. This snapshottable store can be used as a building block to support snapshots for arbitrary data structures, by simply replacing all mutable references in the data structure by our store references. We present use-cases of a snapshottable store when implementing type-checkers and automated theorem provers.

Our implementation is designed to provide a very low overhead over normal references, in the common case where the capture/restore operations are infrequent. Read and write in store references are essentially as fast as in plain references in most situations, thanks to a key optimisation we call record elision. In comparison, the common approach of replacing references by integer indices into a persistent map incurs a logarithmic overhead on reads and writes, and sophisticated algorithms typically impose much larger constant factors.

The implementation, which is inspired by Baker's and the OCaml implementation of persistent arrays by Conchon and Filliâtre, is both fairly short and very hard to understand: it relies on shared mutable state in subtle ways. We provide a mechanized proof of correctness of its core using the Iris framework for the Coq proof assistant.
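
To make the capture/restore interface concrete, here is a minimal OCaml sketch of a store of references with snapshots. It is not the implementation described in the abstract: it uses a plain undo log rather than Baker's structure or record elision, it can only restore to a snapshot that lies on the current history (backtracking), and the module and function names (Store, make, get, set, capture, restore) are assumptions chosen for illustration.

```ocaml
(* A naive undo-log store. Every write records a closure that undoes it;
   a snapshot is just the length of the log at capture time. *)
module Store : sig
  type t
  type 'a rref
  type snapshot
  val create : unit -> t
  val make : t -> 'a -> 'a rref
  val get : t -> 'a rref -> 'a
  val set : t -> 'a rref -> 'a -> unit
  val capture : t -> snapshot
  val restore : t -> snapshot -> unit
end = struct
  type t = { mutable log : (unit -> unit) list; mutable depth : int }
  type 'a rref = 'a ref
  type snapshot = int

  let create () = { log = []; depth = 0 }
  let make _ v = ref v
  let get _ r = !r

  (* Record how to undo the write, then perform it. *)
  let set s r v =
    let old = !r in
    s.log <- (fun () -> r := old) :: s.log;
    s.depth <- s.depth + 1;
    r := v

  let capture s = s.depth

  (* Undo writes, most recent first, until the log is as short as it was
     when the snapshot was captured. Snapshots taken after that point
     are no longer meaningful in this simplified version. *)
  let restore s snap =
    while s.depth > snap do
      match s.log with
      | undo :: rest ->
        undo ();
        s.log <- rest;
        s.depth <- s.depth - 1
      | [] -> assert false
    done
end

let () =
  let s = Store.create () in
  (* References of different types live in the same store. *)
  let x = Store.make s 1 and y = Store.make s "a" in
  let snap = Store.capture s in
  Store.set s x 2;
  Store.set s y "b";
  Store.restore s snap;
  assert (Store.get s x = 1 && Store.get s y = "a")
```

In this sketch every write pays for a log entry, which is the kind of overhead on ordinary reads and writes that the abstract says the real implementation is designed to avoid.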

Safe, expressive and efficient programming of FPGA circuits – Loïc Sylvestre

FPGAs (Field-Programmable Gate Arrays) are reconfigurable digital circuits: their behavior can be customized by logic synthesis of specification at the so-called register transfer level (RT level), in hardware description languages such as VHDL or Verilog. FPGAs are well suited to implement reactive systems, directly as synchronous circuits interacting with the external environment via I/O pins – the logic synthesizer ensuring that timing constraints are met, given the FPGA clock frequency. FPGAs are also used to implement hardware accelerators; however, RT-level descriptions of transformational systems (or “computations”) – with latencies of several clock cycles – are difficult to debug, maintain and manually optimize. High-Level Synthesis (HLS) offers a simpler way of expressing computations, using a programming language compiled at the RT level. The advantage of this approach is to keep the implementation details hidden from the programmer, leaving the compiler responsible for scheduling computations over time. However, this leads to a loss of control over temporal behavior and therefore safety and efficiency for the circuits generated. As embedded systems, especially those based on FPGAs, need to perform more and more computations, while interacting with their environment, this thesis proposes a programming model to combine hardware description (data-flow oriented) and general-purpose parallel computation (control-flow oriented) using a synchronous approach. This programming model forms the basis for the design and implementation of Eclat, a functional-imperative, parallel and synchronous programming language, compiled to VHDL. Eclat is sufficiently precise to describe synchronous circuits at the RT level. It facilitates the programming of hardware accelerators, with a clear and predictable temporal semantics by which to exploit time-space trade-offs.
Any Eclat program is reactive, with a mechanism for embedding computations within programs, thereby mixing computation and interaction. Eclat also offers shared memory (in the form of RAM blocks), with deterministic concurrency. It is used to develop programming abstractions such as algorithmic skeletons and virtual machine implementations for high-level languages. This addresses, at various levels, the need to run general-purpose algorithms within FPGA-based reactive embedded applications.

After the talks there will be some pizzas offered by the OCaml Software Foundation and later on we’ll move to a pub nearby as usual.

Dune dev meeting

Etienne Marais announced

Hi camelers! :camel: We will hold our regular Dune dev meeting on Wednesday, November 27th, at 16:00 CET. As usual, the session will be one hour long.

Whether you are a maintainer, a regular contributor, a new joiner or just curious, you are welcome to join: these discussions are open! The goal of these meetings is to provide a place to discuss the ongoing work together and synchronise between the Dune developers :ok_hand:

:calendar: Agenda

The agenda is available on the dedicated meeting page. Feel free to ask if you want to add more items to it.

:computer: Links

Other OCaml News

Old CWN

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe to the caml-list.