OCaml Weekly News
Hello
Here is the latest OCaml Weekly News, for the week of November 23 to 30, 2021.
Table of Contents
- opam 2.1.1, opam 2.0.10, and opam-depext 1.2
- OTOML 0.9.0 — a compliant and flexible TOML parsing, manipulation, and pretty-printing library
- New release of Fix
- New release of Menhir (20211125)
- Lwt 5.5.0, Lwt_domain 0.1.0, Lwt_react.1.1.5
- OCaml's CI is gradually moving to GitHub Actions
- How to combine 3 monads: Async/Lwt, Error and State?
- Old CWN
opam 2.1.1, opam 2.0.10, and opam-depext 1.2
R. Boujbel announced
We are pleased to announce several minor releases: opam 2.0.10, opam 2.1.1, and opam-depext 1.2.
The opam releases consist of backported fixes, while opam-depext
has been adapted to be compatible with opam 2.1,
to allow for workflows which need to maintain compatibility with opam 2.0. With opam 2.1.1, if you export
OPAMCLI=2.0
into your environment then workflows expecting opam 2.0 should now behave even more equivalently.
You'll find more information in the blog post .
OTOML 0.9.0 — a compliant and flexible TOML parsing, manipulation, and pretty-printing library
Daniil Baturin announced
A new 0.9.3 relase is available. Still not 1.0.0 just in case. The change I'm most glad I managed to make is that the lexer is now re-entrant and doesn't use any mutable state. Where can I apply for the "Designed for multicore OCaml" certification sticker? ;)
Breaking change in the functor interface
I found an oversight that took a breaking change to fix. It didn't break any package that was already in the OPAM repository, so I'm glad I noticed it before it caused anyone trouble.
My idea to make the functor take separate integer and float modules turned out to be misguided: it wouldn't compose
with Otoml.get_float ~strict:false
and similar functions that apply type conversions.
Logically, Otoml.get_float ~strict:false (Otoml.integer 5)
should produce Otoml.TomlFloat 5.0
. However, it means
that get_float
needs to know how to convert integers to float. If integer and float types are in separate modules,
that isn't possible.
So I combined both integers and floats in a single TomlNumber
. That way people who want to bring their own bignum
libraries will have to write more code, but numbers will behave as they are expected to in a dynamically typed
format.
module BigNumber = struct type int = Z.t type float = Decimal.t let int_of_string = Z.of_string let int_to_string = Z.to_string let int_of_boolean b = if b then Z.one else Z.zero let int_to_boolean n = (n <> Z.zero) (* Can't just reuse Decimal.to/of_string because their optional arguments would cause a signature mismatch. *) let float_of_string s = Decimal.of_string s (* Decimal.to_string uses "NaN" spelling while TOML requires all special float values to be lowercase. *) let float_to_string x = Decimal.to_string x |> String.lowercase_ascii let float_of_boolean b = if b then Decimal.one else Decimal.zero let float_to_boolean x = (x <> Decimal.zero) let float_of_int = Decimal.of_bigint let int_of_float = Decimal.to_bigint end module Otoml = Otoml.Base.Make (BigNumber) (Otoml.Base.StringDate)
The next release will likely be 1.0.0 for real.
New release of Fix
François Pottier announced
I am pleased to announce a new release of Fix, with several new modules contribued by Frédéric Bour (thanks!).
In short, Fix is a toolkit that helps perform memoization and fixed point computations (including data flow analyses). More generally, it offers a number of basic algorithmic building blocks that can be useful in many circumstances.
opam update opam install fix.20211125
Documentation can be found here:
- https://gitlab.inria.fr/fpottier/fix/-/blob/master/README.md
- http://cambium.inria.fr/~fpottier/fix/doc/fix/Fix/index.html
Enjoy,
François Pottier
francois.pottier@inria.fr
http://cambium.inria.fr/~fpottier/
2021/11/25
- The new module
CompactQueue
offers a minimalist mutable FIFO queue. It is comparable with OCaml'sQueue
module. In comparison withQueue
, it uses a more compact internal representation: elements are stored contiguously in a circular array. This has a positive impact on performance: both time and memory consumption are reduced. This data structure is optimized for maximum throughput. (Contributed by Frédéric Bour, reviewed by François Pottier.) - The new functor
DataFlow.ForCustomMaps
offers a forward data flow analysis that is tuned for greater performance. (Contributed by Frédéric Bour, reviewed by François Pottier.) - The new module
Indexing
offers a safe API for manipulating indices into fixed-size arrays. This API involves some dynamic checks as well as static type checks, thereby (hopefully) greatly reducing the risk of confusion in code that uses many arrays and many indices into these arrays. (Contributed by Frédéric Bour, reviewed by François Pottier.) - In
DataFlow
, allow the functionforeach_root
(which is part of the signatureDATA_FLOW_GRAPH
) to callcontribute x _
several times at a single rootx
.
New release of Menhir (20211125)
François Pottier announced
I am pleased to announce a new release of Menhir, with an exciting
contribution by Frédéric Bour: a groundbreaking performance improvement in
menhir --list-errors
. This is made possible by an entirely new reachability
algorithm, which has been designed and implemented by Frédéric, and which is
described in our paper "Faster Reachability Analysis for LR(1) Parsers". This
is the link to the paper:
http://cambium.inria.fr/~fpottier/publis/bour-pottier-reachability.pdf
To install the new release, just type
opam update opam install menhir.20211125
Enjoy!
François Pottier
Francois.Pottier@inria.fr
http://cambium.inria.fr/~fpottier/
- The command
menhir --list-errors
has been sped up by a factor of up to x100, and requires up to x1000 less memory, thanks to a new LR(1) reachability algorithm, which has been designed and implemented by Frédéric Bour. - Better document the restricted way in which the
error
token must be used when using--strategy simplified
. Menhir now checks that this token is used only at the end of a production, and warns if this is not the case. (Better yet, our suggestion is to not use theerror
token at all!) - The
$syntaxerror
keyword is now forbidden when using--strategy simplified
. This keyword will be entirely removed in the next release. Incidentally, we have just found out that it behaves differently under the code back-end and under the table back-end. - Disable OCaml warning 39 (unused rec flag) in the OCaml code produced by Menhir's code back-end. This does not affect the table back-end. (Reported by Armaël Guéneau.)
- Fix a bug in
--random-*
which could cause Menhir to diverge if the grammar uses theerror
token. - Warn if a terminal symbol is named
Error
. This creates a name clash in the public interface of the generated parser. - Menhir now requires OCaml 4.03.0 (instead of 4.02.3) and Dune 2.8.0 (instead of 2.0.0).
Lwt 5.5.0, Lwt_domain 0.1.0, Lwt_react.1.1.5
Raphaël Proust announced
It is my pleasure to announce the release of Lwt version 5.5.0, Lwt_domain version 0.1.0, Lwt_react version 1.1.5, Lwt_ppx version 2.0.3 and Lwt_ppx_let version 5.5.0.
https://github.com/ocsigen/lwt/releases/tag/5.5.0
All those packages can be installed via opam as usual.
:rotating_light: Deprecation
One notable change is the deprecation of Lwt_main.yield
and Lwt_unix.yield
. It is recommended to use Lwt.pause
instead.
:rocket: Lwt_domain: an interface to multicore parallelism
Another notable change is the addition of the Lwt_domain package. This package includes a single module Lwt_domain
with functions to execute some computations in parallel, using the features of Multicore OCaml. The package requires
an OCaml compiler with domains support to install.
Code for this package is the work of @sudha with reviews and packaging from Lwt contributors.
Other changes
The full list of changes is available in the CHANGES file.
OCaml's CI is gradually moving to GitHub Actions
Sora Morimoto announced
The OCaml team started switching to GitHub Actions last year for some of the official OCaml repositories. Also, we have released some CI related stuff, such as setup-ocaml, to the community. Some OCaml hackers also know that CI in the OCaml community is gradually switching to GitHub Actions nowadays.
However, what gradually became a problem when we started switching was that the number of concurrent jobs that could run in a free account on GitHub was not enough for our activeness.
One of the major pain points for compiler contributors is that the wait time for CI to complete, which is unrelated to the actual build, is too long. However, this has been a pain point in all services, even before GitHub Actions.
The GitHub team did their best to help us make it better. As a result, they offered to upgrade the OCaml organization's plan to the team plan for free, which means that we can now benefit from a range of features, including access to 3x more concurrent runners than before.
- About team plan: https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration
- Concurrency/plan: https://docs.github.com/en/get-started/learning-about-github/githubs-products#github-team
We would like to thank GitHub for supporting our team and Ahmed Bilal, who supported this effort.
How to combine 3 monads: Async/Lwt, Error and State?
Deep in this thread, Ivan Gotovchits said
The monads library provides the transformers for some well-known monads. All these monads have a more or less standard implementation, offering the same performance as any other monadic library can offer. Like there is no better way of implementing the state monad other than a function. We have experimented a lot with different performance optimizations, such as boxing and unboxing it and inlining various operators, and keep experimenting to get the maximum from the current compiler. In BAP, we heavily use the monads library, first of all for our knowledge representation and reasoning engine, which is the foundation for all BAP analyses. We also use it for emulating binary programs. The rich interface is here to make our life easier and more comfortable when we use monads. It definitely comes for free¹ as the number of functions doesn't affect the performance of the underlying monad.
But… there is always a but :) Stacking monads using a transformer does have a price. Even with the flambda compiler. The latter is doing an excellent job of unstacking them and eliminating the overhead of having a chain of monads. But our latest experiments show that a custom-made monad (still with the monads library) performs better under either branch of the compiler. We have rewritten our main monads that were relying on transformers and got from 20% to 50% performance improvement. But that is not to say that the monads library itself is slow or that we're not using it, it is to say that there are other options to transformers that might work in some cases. See the linked PR if you want to learn the trick.
¹⁾ Provided that we ignore the size of the executable, e.g., linking the core_kernel library results in a quite large binary, which may increase the startup time. Insignificantly, but in some use cases, it might be a significant factor.
Ivan Gotovchits then said
As it was already suggested, you can use monad transformers, to compose several monads into a single monad. As a show-case, we will use the monads library (disclaimer, I am an author of this library), which you can install with
opam install monads
It offers most of the well-known monads in a form of a monad transformer, which in terms of OCaml, is a functor that
takes a monad and returns a new monad that enriches it with some new behavior. For example, to make a
non-deterministic error monad, we can do Monad.List.Make(Monad.Result.Error)
and get a monadic structure (i.e., a
module that implements the Monad.S interface) that is both a list monad and an error monad. The small caveat is
that the operations of the wrapped monad, the error monad in our case, are not available directly, so we have to
lift them, e.g.,
let fail p = lift @@ Monad.Result.Error.fail p
So that in the end, the full implementation of the transformed monad still requires some boilerplate code,
module ListE = struct type 'a t = 'a list Monad.Result.Error.t include Monad.List.Make(Monad.Result.Error) let fail p = lift@@Monad.Result.Error.fail p (* and so on for each operation that is specific to the wrapped monad *) end
Now, let's try wrapping the Lwt monad into the state. We don't want to add the Error monad because Lwt is already the
error monad and adding an extra layer of errors monad is not what we want. First of all, we need to adapt the Lwt
monad to the Monad.S
interface, e.g.,
module LwtM = struct type 'a t = 'a Lwt.t include Monad.Make(struct type 'a t = 'a Lwt.t let return = Lwt.return let bind = Lwt.bind let map x ~f = Lwt.map f x let map = `Custom map end) end
If we want to keep the state type monomorphic, then we will need a module for it. Suppose your state is represented as,
module State = struct type t = string Map.M(String).t end
Now, we can use it to build our State(Lwt)
Russian doll,
module IO = struct include Monad.State.T1(State)(LwtM) include Monad.State.Make(State)(LwtM) (* let's lift [read] as an example *) let read fd buf ofs len = lift (Lwt_unix.read fd buf ofs len) end
The Monad.State.T1
functor is used to create the types for the generated monad. You can write them manually, of
course, like as we did in the List(Error) example, but the type generating modules are here for the
convenience¹
Now, let's get back to the problem of the lifting. It looks tedious to impossible to lift every operation from Lwt. Commonly, we try to put the smaller monad inside, to minimize the work, but it doesn't work with Lwt as the latter is not a transformer. So what is the solution? For me, the solution is to not lift the operations at all, but instead, define your IO abstraction and hide that it is using Lwt underneath the hood. This will make the code that uses this new abstraction more generic and less error-prone so that it can focus on the business logic and the implementation details could be hidden inside the monad implementation. This is what the monads are for, anyway.
¹⁾ We omit the types from the output of the Make
functor since for a long time OCaml didn't allow the
repetition of types in a structure so having the types in it will prevent us from composing various flavors of monads
using include
. It is also a long-time convention widely used in many OCaml libraries, including Core and Async. A
convention that we probably don't need anymore.
Old CWN
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.