OCaml Weekly News

Previous Week Up Next Week

Hello

Here is the latest OCaml Weekly News, for the week of March 24 to 31, 2020.

Table of Contents

An In-Depth Look at OCaml’s New “Best-Fit” Garbage Collector Strategy

OCamlPro announced

The Garbage Collector is probably OCaml’s greatest unsung hero. Its pragmatic approach allows us to allocate without much fear of efficiency loss. We looked into its new "Best-fit" strategy and here is what we learned! http://www.ocamlpro.com/2020/03/23/ocaml-new-best-fit-garbage-collector/

First release of Pp, a pretty-printing library

Jérémie Dimino announced

I'm happy to announce the first release of the pp library! This library provides a lean alternative to the Format module of the standard library. It uses the same comcepts of boxes and break hints, however it defines its own algebra which some might find easier to work with and reason about. I personally do :) The final rendering is still done via a formatter which makes it easy to integrate Pp in existing programs using Format.

We introduced this module in Dune to help improve the formatting of messages printed in the terminal and it has been a success. The new API is smaller, simpler and makes it easy for developers to do the right thing. Once the Pp module of Dune was mature enough, we decided to extract it into a separate library so that it could benefit others.

The library itself is composed of a single Pp module and has no dependencies. Its documentation is self-contained and no previous knowledge is required to start using it, however the various guides for the Format module such as this one should be applicable to Pp as well.

If you have used Format before and like me found its API complicated and difficult to use, I hope that you will find Pp nicer to work with!

Josh Berdine then said

Another great resource for understanding the core mental model of Format is Format Unraveled, although if I understand pp correctly the discussion about Format not being document-based won't apply to pp.

soupault: a static website generator based on HTML rewriting

Daniil Baturin announced

1.10.0 release is available.

Bug fixes:

  • Files without extensions are handled correctly.

New features:

  • Plugin discovery: if you save a plugin to plugins/my-plugin.lua, it's automatically loaded as a widget named

my-plugin. List of plugin directories is configurable.

  • New plugin API functions: HTMLget_tag_name, HTML.select_any_of, HTML.select_all_of.
  • The HTML module is now "monadic": giving a nil to a function that expects an element gives you a nil back, rather than cause a runtime error.

routes: path based routing for web applications

Anurag Soni announced

0.7.2 release is now available on opam. There have been quite a few changes since the previous versions.

  • Routes doesn't deal with HTTP methods anymore
  • The internal implementation is now based around a trie like data structure
  • Routes have pretty printers
  • sprintf style route printing is supported again
  • Minimum supported OCaml version is now 4.05 (it used to be 4.06)
  • There is a release available for bucklescript as well and it is available to install via npm.

Compiler Engineer at Mixtional Code in Darmstadt or anywhere else in Germany

Gerd Stolpmann announced

Type of position:

  • regular hire (no freelancers)
  • full time
  • work from home anywhere in Germany, or in the office in Darmstadt
  • work for a small and highly skilled international team, located in the US and Europe
  • the team language is English

We are developing a compiler for a no-code platform that translates our DSL to bytecode and/or WebAssembly. The language is largely of functional type but is also able to manage state with a spreadsheet model, allowing reactive programming without having to resort to libraries. The language is statically typed using a Hindley-Milner type checker. The compiler is primarily written in OCaml. Other languages of our platform are Go, Elm, and Javascript.

We are looking for a compiler engineer with strong skills in all relevant areas:

  • fluent in OCaml or a similar language such as Haskell
  • Understanding of the structure of the DSL, including syntax and semantics
  • Translation of FP languages to executable code
  • Code optimization
  • Graph algorithms
  • Type checking

We are open to both juniors and seniors, and payment will be accordingly. We are not so much interested in formal certifications but rather in real practice, either from previous jobs, research projects, or contributions to open source projects.

The no-code platform is being developed by engineers in Europe and the US at various places, and we usually do not meet physically but in video conferences. Working from home is very usual. We also get you a desk in your home town if you prefer this. The compiler development is lead by Gerd Stolpmann from Darmstadt.

Due to the strong connections to the US, video conferences will often have to take place in evening hours, until around 7pm or 8pm.

Applications: please follow the "Apply" link at the official web page describing the position: https://rmx.mixtional.de/static/54657cda/

Gerd Stolpmann
CEO of Mixtional Code GmbH (and OCaml hacker of the first hour)
Contact and company details: https://www.mixtional.de/contact.html

Sébastien Besnier asked

I'm living in France, can I apply to the position (we are neighbors!)?

Gerd Stolpmann replied

Well, I can (at the moment) only make contracts using German law and for the social security system here. So, if you need a doctor you'd have to travel… If my company was a bit bigger there would be the option of opening a second site in France (even a very minimal one), but the setup costs are so far too high (lawyers and accountants), and it is too distracting for me to keep up with the fine points of the system in France. Unfortunately, the EU is not that far that it is super simple for an employer to hire anywhere in Europe. - Thanks for asking.

tiny-httpd 0.5

Simon Cruanes announced

I just released tiny-httpd 0.5 and the new tiny-httpd-camlzip, which makes it possible to use deflate transparently for queries and responses. The server has evolved quietly and is getting somewhat more robust: I'm using it for an internal tool with big html pages (up to several MB) and it's reasonably fast and doesn't seem to memleak. There's also an improved http_of_dir to quickly and simply serve a directory on an arbitrary port.

Previous announcement here

Visual Studio Code plugin for OCaml

Rudi Grinberg announced

I'm proud to announce a preview release of an VSC extension for OCaml. You can fetch and install this plugin directly from the extension marketplace if you search for "OCaml Labs". The extension isn't yet mature, but I believe that it offers a user experience comparable to other VSC extensions for OCaml already. The plugin should be used in conjunction with ocaml-lsp

The extension is for the OCaml "platform", which means that its scope includes support for various tools used in OCaml development such as dune, opam.

Bug reports & contributions are welcome. Happy hacking.

Dismas: a tool for automatically making cross-versions of opam packages

Daniil Baturin announced

opam-cross-* are seriously lagging behind the official opam repository and fdopen's opam-windows, not least because importing packages by hand is a lot of work. I suppose at least a semi-automated process could help those repos grow and stay in sync with the upstream much faster.

I've made a prototype of a tool for "stealing" packages into cross-repos. For obvious reasons it's called Dismas. You can find it here: https://github.com/dmbaturin/scripts/blob/master/dismas.ml

Limitations:

  • the code is a real mess for now
  • only dune is supported by automatic build command adjustment
  • it cannot handle cases when both native and cross-version of a dependency are needed

However:

  • For simple packages that use dune exclusively, it's completely automated. I've ported bigstreamaf and angstrom to test it, and cross-versions built just fine from its output, no editing was needed.
  • It automatically converts dependencies from foo to too-$toolchain and removes dependencies and build steps only

needed for with-test and with-doc.

$ ./dismas.ml windows containers ~/devel/opam-repository/packages/containers/containers.2.8.1/opam
opam-version: "2.0"
maintainer: "simon.cruanes.2007@m4x.org"
synopsis:
  "A modular, clean and powerful extension of the OCaml standard library"
build: [
  ["dune" "build" "-p" "containers" "-j" jobs "-x" "windows"]
]
depends: [
  "ocaml-windows" {>= "4.03.0"}
  "dune" {>= "1.1"}
  "dune-configurator"
  "seq-windows"
]
depopts: ["base-unix" "base-threads"]
tags: ["stdlib" "containers" "iterators" "list" "heap" "queue"]
homepage: "https://github.com/c-cube/ocaml-containers/"
doc: "https://c-cube.github.io/ocaml-containers"
dev-repo: "git+https://github.com/c-cube/ocaml-containers.git"
bug-reports: "https://github.com/c-cube/ocaml-containers/issues/"
authors: "Simon Cruanes"
url {
  src: "https://github.com/c-cube/ocaml-containers/archive/v2.8.1.tar.gz"
  checksum: [
    "md5=d84e09c5d0abc501aa17cd502e31a038"
    "sha512=8b832f4ada6035e80d81be0cfb7bdffb695ec67d465ed6097a144019e2b8a8f909095e78019c3da2d8181cc3cd730cd48f7519e87d3162442562103b7f36aabb"
  ]
}

$ ./dismas.ml windows containers ~/devel/opam-repository/packages/containers/containers.2.8.1/opam | diff
~/devel/opam-repository/packages/containers/containers.2.8.1/opam -
3c3,4
< synopsis: "A modular, clean and powerful extension of the OCaml standard library"
---
> synopsis:
>   "A modular, clean and powerful extension of the OCaml standard library"
5,7c6
<   ["dune" "build" "-p" name "-j" jobs]
<   ["dune" "build" "@doc" "-p" name ] {with-doc}
<   ["dune" "runtest" "-p" name "-j" jobs] {with-test}
---
>   ["dune" "build" "-p" "containers" "-j" jobs "-x" "windows"]
10,11c9,10
<   "ocaml" { >= "4.03.0" }
<   "dune" { >= "1.1" }
---
>   "ocaml-windows" {>= "4.03.0"}
>   "dune" {>= "1.1"}
13,21c12
<   "seq"
<   "qtest" { with-test }
<   "qcheck" { with-test }
<   "ounit" { with-test }
<   "iter" { with-test }
<   "gen" { with-test }
<   "uutf" { with-test }
<   "mdx" { with-test & >= "1.5.0" & < "2.0.0" }
<   "odoc" { with-doc }
---
>   "seq-windows"
23,27c14,15
< depopts: [
<   "base-unix"
<   "base-threads"
< ]
< tags: [ "stdlib" "containers" "iterators" "list" "heap" "queue" ]
---
> depopts: ["base-unix" "base-threads"]
> tags: ["stdlib" "containers" "iterators" "list" "heap" "queue"]

Things to do:

  • identify all packages that don't need cross-versions. Is cppo one of them, for example?
  • add support for cases when both native and cross versions are needed. If menhir the only one?
  • add support for other build systems. Do all of them work well with `OCAMLFIND_TOOLCHAIN=windows` if the build setup is written correctly?

Input from @toots and @pirbo is welcome.

Romain Beauxis then said

That's a great initiative! Here are a couple of thoughts:

  • For dune-based packages, things are indeed pretty straight-forward. Finding out which dependencies need to be ported as cross-dependency is indeed the part that's hard to automatize
  • For other build systems, it's less clear to me how to automatize. Maybe others have some thoughts about it.
  • The CI system on opam-cross-windows is pretty good at building from scratch and failing if some deps are missing so trial and error there can be a great tool.
  • Once solved for one cross situation, the problem of cross-dependencies should be exactly the same for all other cross environment (android, iOS)

I haven't looked at the tool very closely yet but I'd say a first improvement would be to be able to track cross-dependencies resolution and generate new version of the package using them and/or generate other cross-compiled packages using them.

Anton Kochkov said

For automated pull requests, you might be interested in https://discuss.ocaml.org/t/dependabot-and-ocaml/4282

Daniil Baturin then asked

I'm not sure if I understand the premise of dependabot. Why would anyone hardcode specific dependency versions? Maybe it makes sense in certain ecosystems that suffer from never-ending ecological disasters… ;)

In any case, most opam packages don't have a constraint on the upper versions of their dependencies. Can dependabot use custom tracking rules to check for presense of a newer version in the repo? My thought was much simpler actually: track the commits in opam-repository, run recently changed files through Dismas and send pull requests to opam-cross-*

Yawar Amin replied

It's common practice nowadays to use semantic versioning and have lockfiles for reproducible builds. Dependabot updates semantic version ranges and lockfiles. See e.g.

Multicore OCaml: March 2020 update

Anil Madhavapeddy announced

Welcome to the March 2020 news update from the Multicore OCaml team! This update has been assembled with @shakthimaan and @kayceesrk, as with the February and January ones.

Our work this month was primarily focused on performance improvements to the Multicore OCaml compiler and runtime, as part of a comprehensive evaluation exercise. We continue to add additional benchmarks to the Sandmark test suite. The eventlog tracing system and the use of hash tables for marshaling in upstream OCaml are in progress, and more PRs are being queued up for OCaml 4.11.0-dev as well.

The biggest observable change for users trying the branch is that a new GC (the "parallel minor gc") has been merged in preference to the previous one ("the concurrent minor gc"). We will have the details in longer form at a later stage, but the essential gist is that the parallel minor GC no longer requires a read barrier or changes to the C API. It may have slightly worse scalability properties at a very high number of cores, but is roughly equivalent at up to 24 cores in our evaluations. Given the vast usability improvement from not having to port existing C FFI uses, we have decided to make the parallel minor GC the default one for our first upstream runtime patches. The concurrent minor GC follow at a later stage when we ramp up testing to 64-core+ machines. The multicore opam remote has been updated to reflect these changes, for those who wish to try it out at home.

We are now at a stage where we are porting larger applications to multicore. Thanks go to:

If you do have other suggestions for application that you think might provide useful benchmarks, then please do get in touch with myself or @kayceesrk.

Onto the details! The various ongoing and completed tasks for Multicore OCaml are listed first, which is followed by the changes to the Sandmark benchmarking infrastructure and ongoing PRs to upstream OCaml.

Multicore OCaml

  • Ongoing
    • ocaml-multicore/ocaml-multicore#240 Proposed implementation of threads in terms of Domain and Atomic

      A new implementation of the `Threads` library for use with the new `Domain` and `Atomic` modules in Multicore OCaml has been proposed. This builds Dune 2.4.0 which in turn makes it useful to build other packages. This PR is open for review.

    • ocaml-multicore/safepoints-cmm-mach Better safe points for OCaml

      A newer implementation to insert safe points at the Cmm level is being worked upon in this branch.

  • Completed

    The following PRs have been merged into Multicore OCaml:

    • ocaml-multicore/ocaml-multicore#303 Account correctly for incremental mark budget

      The patch correctly measures the incremental mark budget value, and improves the maximum latency for the `menhir.ocamly` benchmark.

    • ocaml-multicore/ocaml-multicore#307 Put the phase change event in the actual phase change code. The PR includes the `major_gc/phase_change` event in the appropriate context.
    • ocaml-multicore/ocaml-multicore#309 Don't take all the full pools in one go.

      The code change selects one of the `global_full_pools` to try sweeping it later, instead of adopting all of the full ones.

    • ocaml-multicore/ocaml-multicore#310 Statistics for the current domain are more recent than other domains

      The statistics (`minor_words`, `promoted_words`, `major_words`, `minor_collections`) for the current domain are more recent, and are used in the right context.

    • ocaml-multicore/ocaml-multicore#315 Writes in `caml_blit_fields` should always use `caml_modify_field` to record `young_to_young` pointers

      The PR enforces that `caml_modify_field()` is always used to store `young_to_young` pointers.

    • ocaml-multicore/ocaml-multicore#316 Fix bug with `Weak.blit`.

      The ephemerons are allocated as marked, but, the keys or data can be unmarked. The blit operations copy weak references from one ephemeron to another without marking them. The patch marks the keys that are blitted in order to keep the unreachable keys alive for another major cycle.

    • ocaml-multicore/ocaml-multicore#317 Return early for 0 length blit

      The PR forces a `CAMLreturn()` call if the blit length is zero in `byterun/weak.c`.

    • ocaml-multicore/ocaml-multicore#320 Move `num_domains_running` decrement

      The `caml_domain_alone()` invocation needs to be used in the shared heap teardown, and hence the `num_domains_running` decrement is moved as the last operation for at least the `shared_heap` lockfree fast paths.

Benchmarking

The Sandmark performance benchmarking test suite has had newer benchmarks added, and work is underway to enhance its functionality.

  • ocaml-bench/sandmark#88 Add PingPong Multicore benchmark

    The PingPong benchmark that uses producer and consumer queues has now been included into Sandmark.

  • ocaml-bench/sandmark#98 Add the read/write Irmin benchmark

    A basic read/write file performance benchmark for Irmin has been added to Sandmark. You can vary the following input parameters: number of branches, number of keys, percentage of reads and writes, number of iterations, and the number of write operations.

  • ocaml-bench/sandmark#100 Add Gram Matrix benchmark

    A request ocaml-bench/sandmark#99 to include the Gram Matrix initialization numerical benchmark was created. This is useful for machine learning applications and is now available in the Sandmark performance benchmark suite. The speedup (sequential_time/multi_threaded_time) versus number of cores for Multicore (Concurrent Minor Collector), Parmap and Parany is quite significant and illustrated in the graph: 20dc869a8dda1c815714a97e6a84f6f81c914cf4.png

  • ocaml-bench/sandmark#103 Add depend target in Makefile

    Sandmark now includes a `depend` target defined in the Makefile to check that both `libgmp-dev` and `libdw-dev` packages are installed and available on Ubuntu.

  • ocaml-bench/sandmark#90 More parallel benchmarks

    An issue has been created to add more parallel benchmarks. We will use this to keep track of the requests. Please feel free to add your wish list of benchmarks!

OCaml

  • Ongoing
    • ocaml/ocaml#9082 Eventlog tracing system

      The configure script has now been be updated so that it can build on Windows. Apart from this major change, a number of minor commits have been made for the build and sanity checks. This PR is currently under review.

    • ocaml/ocaml#9353 Reimplement output_value using a hash table to detect sharing.

      The ocaml/ocaml#9293 "Use addrmap hash table for marshaling" PR has been re-implemented using a hash table and bit vector, thanks to @xavierleroy. This is a pre-requisite for Multicore OCaml that uses a concurrent garbage collector.

    As always, we thank the OCaml developers and users in the community for their code reviews, support, and contribution to the project. From OCaml Labs, stay safe and healthy out there!

Old CWN

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.