OCaml Weekly News

Previous Week Up Next Week

Hello

Here is the latest OCaml Weekly News, for the week of September 01 to 08, 2020.

Table of Contents

OCaml 4.11.1: early bugfix release

octachron announced

A serious bug has been discovered last week in OCaml 4.11.0: explicit polymorphic annotations are checked too permissively. Some incorrect programs (possibly segfaulting) are accepted by the compiler in 4.11.0.

Programs accepted by OCaml 4.10 are unchanged.

We are thus releasing OCaml 4.11.1 as an early bugfix version. You are advised to upgrade to this new version if you were using OCaml 4.11.0.

It is (or soon will be) available as a set of OPAM switches with

opam switch create 4.11.1

and as a source download here: https://caml.inria.fr/pub/distrib/ocaml-4.11/

This bug was introduced when making polymorphic recursion easier to use. We are working on making the typechecker more robust and more exhaustively tested to avoid such issues in the future.

Bug fixes:

  • #9856, #9857: Prevent polymorphic type annotations from generalizing weak polymorphic variables. (Leo White, report by Thierry Martinez, review by Jacques Garrigue)
  • #9859, #9862: Remove an erroneous assertion when inferred function types appear in the right hand side of an explicit :> coercion (Florian Angeletti, report by Jerry James, review by Thomas Refis)

Rwmjones then said

We've now got 4.11.1 in Fedora 33 & Fedora 34. No apparent problems so far.

textmate-language 0.1.0

dosaylazy announced

I am pleased to announce textmate-language 0.1.0. Textmate-language is a library for tokenizing code using TextMate grammars. Therefore, it may be useful for implementing syntax highlighters. Please report any bugs or API inconveniences you find.

Batteries v3.1.0

UnixJunkie announced

OCaml Batteries Included is a community-maintained extended standard library for OCaml.

The latest API can be found here: https://ocaml-batteries-team.github.io/batteries-included/hdoc2/

This minor release adds support for OCaml 4.11. It has been available in opam for some days.

Special thanks to all the contributors!

The changelog follows:

  • Compatibility fixes for OCaml-4.11 #962 (Jerome Vouillon)
  • BatEnum: added combination #518 (Chimrod, review by hcarty)
  • fix benchmarks #956 (Cedric Cellier)
  • BatFile: added count_lines #953 (Francois Berenger, review by Cedric Cellier)
  • BatArray: use unsafe_get and unsafe_set more often #947 (Francois Berenger, review by Cedric Cellier)
  • fix some tests for ocaml-4.10.0 #944 (kit-ty-kate)
  • BatResult: BatPervasives.result is now equal to Stdlib.result instead of sharing constructors without being the same type #939, #957 (Clément Busschaert, Cedric Cellier).

Job offer in Paris - Be Sport

Vincent Balat announced

Be Sport is looking to hire an OCaml developer with skills in

  • Mobile/web feature design
  • Team management
  • Use of Social networks

She/he will take part in the development of our Web and mobile apps, entirely written in OCaml with Ocsigen, and participate in reflections on features.

Please contact me for more information or send an email to jobs@besport.com.

Some SIMD in your OCaml

Anmol Sahoo announced

Fresh from a weekend of hacking, I would like to share some results of an experiment I conducted of creating a library for exposing Intel AVX2 intrinsics to OCaml code. AVX2 is an instruction set subset that adds data-parallel operations in hardware.

I chose to fork the amazing bigstringaf library and modified it. You can find the additions to the code here - bigstringaf_simd.

Overview

Given a type Bigstring.t (1 dimensional byte arrays) there now exist functions such as -

val cmpeq_i8 : (t * int) -> (t * int) -> (t * int) -> unit

So cmpeq_i8 (x,o1) (y,o2) (z,03) will compare 32 bytes starting at o1 and o2 from x and y respectively and store the result in z at o3.

Why?

This was mainly an exercise in curiosity. I just wanted to learn whether something like this is viable. I also want to see if adding some type-directed magic + ppx spells can let us write data parallel code much more naturally similar to what lwt / async did for async code.

At the same time, you might ask - why not use something like Owl (which already has good support for data-parallel operations)? Apart from the fact that such libraries are oriented towards numerical code, I would also like to explore if we can operate directly on OCaml types and cast them into data parallel algorithms. Like how simdjson pushed the boundaries of JSON parsing, it would be nice to port idiomatic code to data-parallel versions in OCaml. Can we, at some point, have generic traversals of data-types, which are actually carried out in a data-parallel fashion?

Does it work?

Given the limitation of the current implementation (due to foreign function calls into C), I still found some preliminary results to be interesting! Implementing the String.index function, which returns the first occurence of a char, the runtime for finding an element at the n-1 position in an array with 320000000 elements is -

serial: 1.12 seconds
simd: 0.72 seconds (1.5x)

I still have to do the analysis what the overhead of the function call into C is (even with [@@noalloc]!

Future directions

It would be interesting to see, if we can create a representation which encapsulates the various SIMD ISA's such as AVX2, AVX512, NEON, SVE etc. Further more, it would be really interesting to see if we can use ppx to automatically widen `map` functions to operate on blocks of code, or automatically cast data types in a data parallel representation.

Disclaimer

This was mostly a hobby project, so I cannot promise completing any milestones or taking feature requests etc. I definitely do not recommend using this in production, because of the lack of testing etc.

A PPX Rewriter approach to ocaml-migrate-parsetree

Chet Murthy announced

TL;DR

Based on camlp5 and the pa_ppx PPX rewriters, I've written a new one, pa_deriving_plugins.rewrite, that automates almost all the work of writing a migration from one version of OCaml's AST to another.

  1. It took a few days (b/c of laziness) to write the initial PPX rewriter
  2. A day to get 4.02->4.03 AST migration working
  3. a couple of hours to get 4.03->4.02 working
  4. and a few more hours to get 4.03<->4.04 and 4.04<->4.05 working

At this point, I fully expect that the other version-pairs will not be difficult.

You can find this code [warning: very much a work-in-progress] at https://github.com/chetmurthy/pa_ppx/tree/migrate-parsetree-hacking

The file pa_deriving.plugins/pa_deriving_rewrite.ml contains the source for the PPX rewriter.

The directory pa_omp contains the migrations, typically named rewrite_NNN_MMM.ml.

A slightly longer-winded explanation

If you think about it, ppx_deriving.map isn't so different from what we need for ocaml-migrate-parsetree. ppx_deriving.map, from a type definition for ~ 'a t~, will automatically generate a function

map_t : ('a -> 'b) -> 'a t -> 'b t

If you think about it, if we could just substitute our own type for the second occurrence of t (somehow …. yeah grin) then it would be almost like what we want for o-m-p, yes?

With 11 versions of the Ocaml AST so far, maybe it's worth thinking about how to automate more of the migration task. Also, since so much of it is type-structure-driven, one would think that it would be an excellent opportunity to apply PPX rewriting technology. Indeed, one might think that a good test of PPX rewriting, is the ability to automate precisely such tasks.

So what's hard about this migration task? Here are some issues (maybe there are more):

  1. the types are slightly differently-organized in different versions of the AST. Types might move from one module to another.
  2. sometimes new types are introduced and old ones disappear
  3. constructor data-types may get new branches, or lose them
  4. record-types may get new fields, or lose them
  5. sometimes the analogous types in two consecutive versions are just really, really different [but this is rare]: we need to supply the code directly
  6. when mapping from one version to another, sometimes features are simply not mappable, and an error needs to be raised; that error ought to contain an indication of where in the source that offending feature was found
  7. And finally, when all else fails, we might need to hack on the migration code directly

But morally, the task is really straightforward (with problems listed in-line):

  1. use ppx_import to copy over types from each of the AST times of each Ocaml version
    • ppx_import works on .cmi files, and those have different formats in different versions of Ocaml. Wouldn't it be nice if it worked on .mli files, whose syntax (b/c OCaml is well-managed) doesn't change much?
  2. build a single OCaml module that has all the AST types in it (from all the versions of OCaml)
    • but without the form

      type t = M.t = A of .. | B of ....
      

      that is, without the "type equation" that allows for a new type-definition to precisely repeat a previous one.

  3. Then use ppx_import against this single module to construct a recursive type-declaration list of all the AST types for a particular version of OCaml, and apply a "souped-up" version of ppx_deriving.map to it, to map the types to another version of the AST types.
    • but ppx_deriving.map doesn't do this today, and besides, it would have to provide a bunch of "escape hatches" for all the special-cases I mentioned above.

But this is in principle doable, and it has the nice feature that all the tedious boilerplate is mechanically-generated from type-definitions, hence likely to not contain errors (assuming the PPX rewriter isn't buggy).

So I decided to do it, and this little post is a result.

Discussion

I think this is a quite viable approach to writing ocaml-migrate-parsetree, and I would encourage the PPX community to consider it. One of the nice things about this approach, is that it relies heavily on PPX rewriting itself, to get the job done. I think one of the important things we've learned in programming languages research, is that our tools need to be largely sufficient to allow us to comfortably implement those same tools. It's a good test of the PPX infrastructure, to see if you can take tedious tasks and automate them away.

I'm not going to describe anymore of how this works, b/c I'd rather get the rest of the migrations working, start figuring out how to test, and get this code integrated with camlp5.

But for anybody who's interested, I'd be happy to interactively describe the code and walk them thru how it works.

Louis Roché then asked

For a person who hasn't digged into OMP, can you explain how it is different from what is done currently? Because the idea I had of OMP is basically what you describe, a set of functions transformation an AST from vX to vX-1 and vX+1. So I am obviously missing something.

Chet Murthy replied

Yes, you're right: imagine a series of modules M2…M11. Each declares the same set of types, but with different definitions, yes? Then you'd have migration modules, migrate_i_j (j=i+1 or j=i-1) that have functions that convert between the analogously-named types. The entire question is: how are these functions implemented? By hand? With significant mechanized support? They can't be implemented fully-mechanically, because there are decisions to be made about how to bridge differences in type-definitions. For instance, look at the 4.02 type label and the 4.03 type arg_label. Sometimes these are analogous (and sometimes they're not). When they're analogous, the code that converts between -cannot- be automatically inferred: a human has to write it. But -most- of the code of these migration functions can be inferred automatically from the type-definitions themselves.

And that's really all that my little experiment does: automatically infer the migration code (most of the time) with some hints for those cases where it's not possible to automatically infer.

Now, why would one do this? Well, two reasons:

  1. it should be more maintainable to automatically generate most of the code from types, and it should be quicker to bring online a migration for a new version of the Ocaml AST.
  2. this should be a good test of PPX rewriting. That is, if we're going to build a macro-preprocessing support system, shouldn't it be able to make solving such straightforward, but very tedious, problems much, much easier?

Chet Murthy then added

I forgot to add a third reason why this PPX-rewriter-based approach is better:

  1. If you look at ocaml-migrate-parsetree "migrations", you'll find that they're almost all boilerplate code. But sprinkled here-and-there, is actual logic, actual decisions about how to come up with values for new fields, about which fields, when non-trivial (e.g. not "[]") should lead to migration-failure, etc. It is this code, that is the actual meat of the migration, and it's not at all obvious, when sprinkled thru the mass of mechanically-produclble boilerplate.

A mechanized production of that boilerplate would mean that we retained explicitly only this nontrivial code, and hence for maintenance we could focus on it, and make sure it does the right thing.

Josh Berdine asked

Figuring out ways to make maintaining this stuff more efficient would be great! One aspect that isn't clear to me is how this approach compares to the process currently used to generate the omp code. I haven't done it myself, but at first glance the tools to generate the omp code (e.g. gencopy) seem to also accurately be describable as heavily using ppx infrastructure in order to implement the map code from one version to another. Is there an executive summary that compares and contrasts that and this proposal?

Chet Murthy replied

From the README, gencopy is used to generate a prototype file for each migration, and then a human goes in and fixes up the code. A way to put my point is: gencopy should be provided the fixups in some compact form, and apply them itself.

telltime - when is when exactly?

Darren announced

I'm happy to announce release of telltime 0.0.1, a small cli tool for interacting with Daypack-lib (a schedule, time, time slots handling library) components.

It primarily answers time related queries, with support for union (||), intersect (&&) and "ordered select" (>>, explanation of this is at the bottom).

The query language, time expression, aims to mimic natural language, but without ambiguity. The grammar is only documented in the online demo here at the moment.

Some examples copied from the README are as follows.

Search for time slots matching Daypack time expression

"Hm, I wonder what years have Febuary 29th?"

$ telltime search --time-slots 5 --years 100 "feb 29 00:00"
Searching in time zone offset (seconds)            : 36000
Search by default starts from (in above time zone) : 2020 Sep 03 19:24:15

Matching time slots (in above time zone):
[2024 Feb 29 00:00:00, 2024 Feb 29 00:00:01)
[2028 Feb 29 00:00:00, 2028 Feb 29 00:00:01)
[2032 Feb 29 00:00:00, 2032 Feb 29 00:00:01)
[2036 Feb 29 00:00:00, 2036 Feb 29 00:00:01)
[2040 Feb 29 00:00:00, 2040 Feb 29 00:00:01)

"Would be handy to know what this cron expression refers to"

$ telltime search --time-slots 5 "0 4 8-14 * *"
Searching in time zone offset (seconds)            : 36000
Search by default starts from (in above time zone) : 2020 Sep 06 17:39:56

Matching time slots (in above time zone):
[2020 Sep 08 04:00:00, 2020 Sep 08 04:01:00)
[2020 Sep 09 04:00:00, 2020 Sep 09 04:01:00)
[2020 Sep 10 04:00:00, 2020 Sep 10 04:01:00)
[2020 Sep 11 04:00:00, 2020 Sep 11 04:01:00)
[2020 Sep 12 04:00:00, 2020 Sep 12 04:01:00)

"I have a bunch of time ranges, but some of them overlap, and they are not in the right order. If only there is a way to combine and sort them easily."

$ telltime search --time-slots 1000 "2020 . jan . 1, 10, 20 . 13:00 to 14:00 \
  || 2019 dec 25 13:00 \
  || 2019 dec 25 10am to 17:00 \
  || 2020 jan 5 10am to 1:30pm \
  || 2020 . jan . 7 to 12 . 9:15am to 2:45pm"
Searching in time zone offset (seconds)            : 36000
Search by default starts from (in above time zone) : 2020 Sep 06 18:01:12

Matching time slots (in above time zone):
[2019 Dec 25 10:00:00, 2019 Dec 25 17:00:00)
[2020 Jan 01 13:00:00, 2020 Jan 01 14:00:00)
[2020 Jan 05 10:00:00, 2020 Jan 05 13:30:00)
[2020 Jan 07 09:15:00, 2020 Jan 07 14:45:00)
[2020 Jan 08 09:15:00, 2020 Jan 08 14:45:00)
[2020 Jan 09 09:15:00, 2020 Jan 09 14:45:00)
[2020 Jan 10 09:15:00, 2020 Jan 10 14:45:00)
[2020 Jan 11 09:15:00, 2020 Jan 11 14:45:00)
[2020 Jan 12 09:15:00, 2020 Jan 12 14:45:00)
[2020 Jan 20 13:00:00, 2020 Jan 20 14:00:00)

Get exact time after some duration from now

$ telltime from-now "1 hour"
Now                   : 2020-09-03 15:53:29
Duration (original)   : 1 hour
Duration (normalized) : 1 hours 0 mins 0 secs
Now + duration        : 2020-09-03 16:53:29
$ telltime from-now "1.5 days 2.7 hours 0.5 minutes"
Now                   : 2020-09-03 15:55:43
Duration (original)   : 1.5 days 2.7 hours 0.5 minutes
Duration (normalized) : 1 days 14 hours 42 mins 30 secs
Now + duration        : 2020-09-05 06:38:13

Difference between ordered select and union

s1 >> s2 is similar to s1 || s2, but >> picks between s1 and s2 in a round robin fashion, instead of just picking the smallest between two.

One specific differing case would be when the search starts at 4pm today, 3pm || 5pm would return 5pm today and 3pm tomorrow, and so on, while 3pm >> 5pm would return 3pm tomorrow and 5pm tomorrow (a 5pm is only picked after a 3pm has been picked already).

Ocamlunit emacs minor-mode

Manfred Bergmann announced

Here is a first version of this plugin that allows running dune test with an Emacs key stroke. It shows the test result in a separate buffer and a simple colorized status 'message'.

https://github.com/mdbergmann/emacs-ocamlunit

While it is possible to run dune in 'watch' mode I'd like to manually run tests.

I didn't find a way to specify individual test modules in dune. Is that possible?

Sihl 0.1.0

jerben announced

I am happy to announce this milestone release of Sihl, a web framework for OCaml.

Github: https://github.com/oxidizing/sihl
opam: http://opam.ocaml.org/packages/sihl/

Sihl is really just a collection of services that can be plugged into each other and a tiny core that knows how to start them. The goal is to take care of infrastructure concerns so you can focus on the domain.

After many iterations, the API is in a shape where we dare to show it to you :slight_smile: It is still under heavy development so expect breakage without a major version bump. However, we just finished migrating a project from Reason on NodeJS to OCaml on Sihl, so we use it in production.

We provide service implementations that were useful to us so far. In the future we want to provide many more to cover all kinds of needs. (PRs are always welcome!)

Any feedback is greatly appreciated, thanks! :)

jerben then added

Here is an example of a tiny Sihl app:

module Service = struct
  module Random = Sihl.Utils.Random.Service
  module Log = Sihl.Log.Service
  module Config = Sihl.Config.Service
  module Db = Sihl.Data.Db.Service
  module MigrationRepo = Sihl.Data.Migration.Service.Repo.MariaDb
  module Cmd = Sihl.Cmd.Service
  module Migration = Sihl.Data.Migration.Service.Make (Cmd) (Db) (MigrationRepo)
  module WebServer = Sihl.Web.Server.Service.Make (Cmd)
  module Schedule = Sihl.Schedule.Service.Make (Log)
end

let services : (module Sihl.Core.Container.SERVICE) list =
  [ (module Service.WebServer) ]

let hello_page =
  Sihl.Web.Route.get "/hello/" (fun _ ->
      Sihl.Web.Res.(html |> set_body "Hello!") |> Lwt.return)

let routes = [ ("/page", [ hello_page ], []) ]

module App = Sihl.App.Make (Service)

let _ = App.(empty |> with_services services |> with_routes routes |> run)

promise_jsoo 0.1.0

Max LANTAS announced

Hello! I am announcing the first release of promise_jsoo, a library for JS promises in Js_of_ocaml.

https://github.com/mnxn/promise_jsoo
https://opam.ocaml.org/packages/promise_jsoo/

The library has bindings to the core Promise methods as well as helper functions that make it easier to deal with a Promise of an option or result. It is also possible to use this library with gen_js_api to make for an easier JavaScript binding experience

Inspired by aantron/promise, this library also uses indirection internally when handling nested promises in order to ensure that the bindings are type safe.

This project is part of ongoing work to port vscode-ocaml-platform to Js_of_ocaml.

Generated documentation can be found here.

Other OCaml News

From the ocamlcore planet blog

Here are links from many OCaml blogs aggregated at OCaml Planet.

Old CWN

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.