OCaml Weekly News
Hello
Here is the latest OCaml Weekly News, for the week of March 24 to 31, 2020.
Table of Contents
- An In-Depth Look at OCaml’s New “Best-Fit” Garbage Collector Strategy
- First release of Pp, a pretty-printing library
- soupault: a static website generator based on HTML rewriting
- routes: path based routing for web applications
- Compiler Engineer at Mixtional Code in Darmstadt or anywhere else in Germany
- tiny-httpd 0.5
- Visual Studio Code plugin for OCaml
- Dismas: a tool for automatically making cross-versions of opam packages
- Multicore OCaml: March 2020 update
- Old CWN
An In-Depth Look at OCaml’s New “Best-Fit” Garbage Collector Strategy
OCamlPro announced
The Garbage Collector is probably OCaml’s greatest unsung hero. Its pragmatic approach allows us to allocate without much fear of efficiency loss. We looked into its new "Best-fit" strategy and here is what we learned! http://www.ocamlpro.com/2020/03/23/ocaml-new-best-fit-garbage-collector/
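For readers who want to try the strategy discussed in the post, the allocation policy can be selected at run time. This is a minimal sketch, assuming OCaml 4.10 or later, where policy 2 selects best-fit. The helper name `use_best_fit` is hypothetical, and newer runtimes may accept the setting without acting on it.

```ocaml
(* Select the best-fit allocation policy for the major heap.
   Assumption: OCaml >= 4.10, where policy 2 means best-fit
   (0 is next-fit, 1 is first-fit). *)
let use_best_fit () =
  Gc.set { (Gc.get ()) with Gc.allocation_policy = 2 }

let () =
  use_best_fit ();
  Printf.printf "allocation policy: %d\n" (Gc.get ()).Gc.allocation_policy
```

The same choice can be made without recompiling by setting `OCAMLRUNPARAM=a=2` in the environment.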
First release of Pp, a pretty-printing library
Jérémie Dimino announced
I'm happy to announce the first release of the pp library! This library provides a lean alternative to the Format module of the standard library. It uses the same concepts of boxes and break hints, however it defines its own algebra which some might find easier to work with and reason about. I personally do :) The final rendering is still done via a formatter which makes it easy to integrate Pp in existing programs using Format.
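To make "boxes and break hints" concrete, here is a small sketch using only the standard library's Format module, not Pp's own API. The function name `render` and the sample words are illustrative assumptions. It shows how the same break hints give different layouts depending on the margin.

```ocaml
(* Render a list of words inside a "hov" box: the break hint between two
   words becomes a space if the next word still fits on the line, and a
   newline plus indentation otherwise. *)
let render ~margin words =
  let buf = Buffer.create 64 in
  let fmt = Format.formatter_of_buffer buf in
  Format.pp_set_margin fmt margin;
  Format.pp_open_hovbox fmt 2;
  List.iteri
    (fun i w ->
      if i > 0 then Format.pp_print_space fmt ();
      Format.pp_print_string fmt w)
    words;
  Format.pp_close_box fmt ();
  Format.pp_print_flush fmt ();
  Buffer.contents buf

let () =
  let words = [ "alpha"; "beta"; "gamma"; "delta" ] in
  (* Wide margin: everything fits on one line. *)
  print_endline (render ~margin:80 words);
  (* Narrow margin: some break hints turn into newlines. *)
  print_endline (render ~margin:12 words)
```

Since Pp renders through a formatter in the end, the layout rules illustrated here should carry over, although Pp's combinators are not shown.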
We introduced this module in Dune to help improve the formatting of messages printed in the terminal and it has been a success. The new API is smaller, simpler and makes it easy for developers to do the right thing. Once the Pp module of Dune was mature enough, we decided to extract it into a separate library so that it could benefit others.
The library itself is composed of a single Pp module and has no dependencies. Its documentation is self-contained and no previous knowledge is required to start using it, however the various guides for the Format module such as this one should be applicable to Pp as well.
If you have used Format before and like me found its API complicated and difficult to use, I hope that you will find Pp nicer to work with!
Josh Berdine then said
Another great resource for understanding the core mental model of Format is Format Unraveled, although if I understand pp correctly the discussion about Format not being document-based won't apply to pp.
soupault: a static website generator based on HTML rewriting
Archive: https://discuss.ocaml.org/t/ann-soupault-a-static-website-generator-based-on-html-rewriting/4126/13
Daniil Baturin announced
1.10.0 release is available.
Bug fixes:
- Files without extensions are handled correctly.
New features:
- Plugin discovery: if you save a plugin to plugins/my-plugin.lua, it's automatically loaded as a widget named my-plugin. The list of plugin directories is configurable.
- New plugin API functions: HTML.get_tag_name, HTML.select_any_of, HTML.select_all_of.
- The HTML module is now "monadic": giving a nil to a function that expects an element gives you a nil back, rather than causing a runtime error.
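The "monadic" nil handling described in the last item can be sketched by analogy with OCaml's option type. soupault's plugin API is Lua, so everything below is a hypothetical illustration: the `element` type and `first_child` are invented for this example, and `get_tag_name` only borrows its name from the plugin function.

```ocaml
(* Hypothetical element type, for illustration only. *)
type element = { tag : string; children : element list }

(* "Monadic" accessors: an absent element in gives an absent result out,
   instead of raising an error. *)
let get_tag_name (e : element option) : string option =
  Option.map (fun el -> el.tag) e

let first_child (e : element option) : element option =
  Option.bind e (fun el ->
    match el.children with
    | [] -> None
    | c :: _ -> Some c)

let () =
  let page = Some { tag = "div"; children = [ { tag = "p"; children = [] } ] } in
  (* Chaining never raises, even when an intermediate step finds nothing. *)
  assert (get_tag_name (first_child page) = Some "p");
  assert (get_tag_name (first_child (first_child page)) = None)
```

The benefit is that a plugin can chain selections without checking for nil after every step.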
routes: path based routing for web applications
Anurag Soni announced
0.7.2 release is now available on opam. There have been quite a few changes since the previous versions.
- Routes doesn't deal with HTTP methods anymore
- The internal implementation is now based around a trie like data structure
- Routes have pretty printers
- sprintf style route printing is supported again
- Minimum supported OCaml version is now 4.05 (it used to be 4.06)
- There is a release for BuckleScript as well, installable via npm.
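To illustrate what a trie-like structure for path routing can look like, here is a minimal sketch. It is not the routes library's implementation or API. The names `trie`, `add`, `lookup` and `segments` are assumptions made for this example, and it handles literal path segments only.

```ocaml
module SMap = Map.Make (String)

(* Each node may carry a handler and has one child per path segment. *)
type 'a trie = { handler : 'a option; children : 'a trie SMap.t }

let empty = { handler = None; children = SMap.empty }

(* "/users/list" -> ["users"; "list"] *)
let segments path =
  String.split_on_char '/' path |> List.filter (fun s -> s <> "")

(* Insert a handler at the node reached by following [segs]. *)
let rec add segs h node =
  match segs with
  | [] -> { node with handler = Some h }
  | s :: rest ->
    let child =
      match SMap.find_opt s node.children with
      | Some c -> c
      | None -> empty
    in
    { node with children = SMap.add s (add rest h child) node.children }

(* Walk the trie one segment at a time; stop as soon as a segment is unknown. *)
let rec lookup segs node =
  match segs with
  | [] -> node.handler
  | s :: rest ->
    (match SMap.find_opt s node.children with
     | Some c -> lookup rest c
     | None -> None)

let () =
  let router =
    empty
    |> add (segments "/users/list") "list users"
    |> add (segments "/users/new") "new user"
  in
  match lookup (segments "/users/new") router with
  | Some h -> print_endline h
  | None -> print_endline "no route"
```

Routes sharing a prefix share nodes, so matching takes one map lookup per path segment instead of trying each route in turn.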
Compiler Engineer at Mixtional Code in Darmstadt or anywhere else in Germany
Gerd Stolpmann announced
Type of position:
- regular hire (no freelancers)
- full time
- work from home anywhere in Germany, or in the office in Darmstadt
- work for a small and highly skilled international team, located in the US and Europe
- the team language is English
We are developing a compiler for a no-code platform that translates our DSL to bytecode and/or WebAssembly. The language is largely of functional type but is also able to manage state with a spreadsheet model, allowing reactive programming without having to resort to libraries. The language is statically typed using a Hindley-Milner type checker. The compiler is primarily written in OCaml. Other languages of our platform are Go, Elm, and Javascript.
We are looking for a compiler engineer with strong skills in all relevant areas:
- fluent in OCaml or a similar language such as Haskell
- Understanding of the structure of the DSL, including syntax and semantics
- Translation of FP languages to executable code
- Code optimization
- Graph algorithms
- Type checking
We are open to both juniors and seniors, and payment will be accordingly. We are not so much interested in formal certifications but rather in real practice, either from previous jobs, research projects, or contributions to open source projects.
The no-code platform is being developed by engineers in Europe and the US at various places, and we usually do not meet physically but in video conferences. Working from home is very usual. We also get you a desk in your home town if you prefer this. The compiler development is led by Gerd Stolpmann from Darmstadt.
Due to the strong connections to the US, video conferences will often have to take place in evening hours, until around 7pm or 8pm.
Applications: please follow the "Apply" link at the official web page describing the position: https://rmx.mixtional.de/static/54657cda/
Gerd Stolpmann
CEO of Mixtional Code GmbH (and OCaml hacker of the first hour)
Contact and company details: https://www.mixtional.de/contact.html
Sébastien Besnier asked
I'm living in France, can I apply to the position (we are neighbors!)?
Gerd Stolpmann replied
Well, I can (at the moment) only make contracts using German law and for the social security system here. So, if you need a doctor you'd have to travel… If my company was a bit bigger there would be the option of opening a second site in France (even a very minimal one), but the setup costs are so far too high (lawyers and accountants), and it is too distracting for me to keep up with the fine points of the system in France. Unfortunately, the EU is not that far that it is super simple for an employer to hire anywhere in Europe. - Thanks for asking.
tiny-httpd 0.5
Simon Cruanes announced
I just released tiny-httpd 0.5 and the new tiny-httpd-camlzip, which makes it possible to use deflate transparently for queries and responses. The server has evolved quietly and is getting somewhat more robust: I'm using it for an internal tool with big HTML pages (up to several MB) and it's reasonably fast and doesn't seem to memleak. There's also an improved http_of_dir to quickly and simply serve a directory on an arbitrary port.
Previous announcement here
Visual Studio Code plugin for OCaml
Rudi Grinberg announced
I'm proud to announce a preview release of a VSC extension for OCaml. You can fetch and install this plugin directly from the extension marketplace if you search for "OCaml Labs". The extension isn't yet mature, but I believe that it offers a user experience comparable to other VSC extensions for OCaml already. The plugin should be used in conjunction with ocaml-lsp.
The extension is for the OCaml "platform", which means that its scope includes support for various tools used in OCaml development such as dune, opam.
Bug reports & contributions are welcome. Happy hacking.
Dismas: a tool for automatically making cross-versions of opam packages
Daniil Baturin announced
opam-cross-* are seriously lagging behind the official opam repository and fdopen's opam-windows, not least because importing packages by hand is a lot of work. I suppose at least a semi-automated process could help those repos grow and stay in sync with the upstream much faster.
I've made a prototype of a tool for "stealing" packages into cross-repos. For obvious reasons it's called Dismas. You can find it here: https://github.com/dmbaturin/scripts/blob/master/dismas.ml
Limitations:
- the code is a real mess for now
- only dune is supported by automatic build command adjustment
- it cannot handle cases when both native and cross-version of a dependency are needed
However:
- For simple packages that use dune exclusively, it's completely automated. I've ported bigstringaf and angstrom to test it, and cross-versions built just fine from its output, no editing was needed.
- It automatically converts dependencies from foo to foo-$toolchain and removes dependencies and build steps only needed for with-test and with-doc.
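The dependency rewriting described in the last item can be sketched as follows. This is not the code of dismas.ml. The `dep` record, the `host_tools` list and the function names are assumptions for illustration, with the exceptions for dune and dune-configurator inferred from the sample output for containers.

```ocaml
(* A simplified dependency: a package name plus its opam filter flags. *)
type dep = { name : string; filters : string list }

(* Assumption: build tools that run on the host keep their native names,
   as "dune" and "dune-configurator" do in the containers example. *)
let host_tools = [ "dune"; "dune-configurator" ]

(* Drop test-only and doc-only dependencies; rename the rest to
   name-$toolchain unless they are host tools. *)
let cross_dep ~toolchain dep =
  let test_or_doc f = f = "with-test" || f = "with-doc" in
  if List.exists test_or_doc dep.filters then None
  else if List.mem dep.name host_tools then Some dep
  else Some { dep with name = dep.name ^ "-" ^ toolchain }

let cross_deps ~toolchain deps = List.filter_map (cross_dep ~toolchain) deps

let () =
  [ { name = "ocaml"; filters = [] };
    { name = "dune"; filters = [] };
    { name = "dune-configurator"; filters = [] };
    { name = "seq"; filters = [] };
    { name = "qtest"; filters = [ "with-test" ] };
    { name = "odoc"; filters = [ "with-doc" ] } ]
  |> cross_deps ~toolchain:"windows"
  |> List.iter (fun d -> print_endline d.name)
```

A hard-coded list of host tools is the simplest choice, but it does not cover packages that are needed in both native and cross versions.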
$ ./dismas.ml windows containers ~/devel/opam-repository/packages/containers/containers.2.8.1/opam
opam-version: "2.0"
maintainer: "simon.cruanes.2007@m4x.org"
synopsis:
  "A modular, clean and powerful extension of the OCaml standard library"
build: [
  ["dune" "build" "-p" "containers" "-j" jobs "-x" "windows"]
]
depends: [
  "ocaml-windows" {>= "4.03.0"}
  "dune" {>= "1.1"}
  "dune-configurator"
  "seq-windows"
]
depopts: ["base-unix" "base-threads"]
tags: ["stdlib" "containers" "iterators" "list" "heap" "queue"]
homepage: "https://github.com/c-cube/ocaml-containers/"
doc: "https://c-cube.github.io/ocaml-containers"
dev-repo: "git+https://github.com/c-cube/ocaml-containers.git"
bug-reports: "https://github.com/c-cube/ocaml-containers/issues/"
authors: "Simon Cruanes"
url {
  src: "https://github.com/c-cube/ocaml-containers/archive/v2.8.1.tar.gz"
  checksum: [
    "md5=d84e09c5d0abc501aa17cd502e31a038"
    "sha512=8b832f4ada6035e80d81be0cfb7bdffb695ec67d465ed6097a144019e2b8a8f909095e78019c3da2d8181cc3cd730cd48f7519e87d3162442562103b7f36aabb"
  ]
}

$ ./dismas.ml windows containers ~/devel/opam-repository/packages/containers/containers.2.8.1/opam | diff ~/devel/opam-repository/packages/containers/containers.2.8.1/opam -
3c3,4
< synopsis: "A modular, clean and powerful extension of the OCaml standard library"
---
> synopsis:
>   "A modular, clean and powerful extension of the OCaml standard library"
5,7c6
<   ["dune" "build" "-p" name "-j" jobs]
<   ["dune" "build" "@doc" "-p" name ] {with-doc}
<   ["dune" "runtest" "-p" name "-j" jobs] {with-test}
---
>   ["dune" "build" "-p" "containers" "-j" jobs "-x" "windows"]
10,11c9,10
<   "ocaml" { >= "4.03.0" }
<   "dune" { >= "1.1" }
---
>   "ocaml-windows" {>= "4.03.0"}
>   "dune" {>= "1.1"}
13,21c12
<   "seq"
<   "qtest" { with-test }
<   "qcheck" { with-test }
<   "ounit" { with-test }
<   "iter" { with-test }
<   "gen" { with-test }
<   "uutf" { with-test }
<   "mdx" { with-test & >= "1.5.0" & < "2.0.0" }
<   "odoc" { with-doc }
---
>   "seq-windows"
23,27c14,15
< depopts: [
<   "base-unix"
<   "base-threads"
< ]
< tags: [ "stdlib" "containers" "iterators" "list" "heap" "queue" ]
---
> depopts: ["base-unix" "base-threads"]
> tags: ["stdlib" "containers" "iterators" "list" "heap" "queue"]
Things to do:
- identify all packages that don't need cross-versions. Is cppo one of them, for example?
- add support for cases when both native and cross versions are needed. Is menhir the only one?
- add support for other build systems. Do all of them work well with `OCAMLFIND_TOOLCHAIN=windows` if the build setup is written correctly?
Input from @toots and @pirbo is welcome.
Romain Beauxis then said
That's a great initiative! Here are a couple of thoughts:
- For dune-based packages, things are indeed pretty straightforward. Finding out which dependencies need to be ported as cross-dependencies is indeed the part that's hard to automate.
- For other build systems, it's less clear to me how to automate this. Maybe others have some thoughts about it.
- The CI system on opam-cross-windows is pretty good at building from scratch and failing if some deps are missing, so trial and error there can be a great tool.
- Once solved for one cross situation, the problem of cross-dependencies should be exactly the same for all other cross environments (android, iOS).
I haven't looked at the tool very closely yet but I'd say a first improvement would be to be able to track cross-dependencies resolution and generate new version of the package using them and/or generate other cross-compiled packages using them.
Anton Kochkov said
For automated pull requests, you might be interested in https://discuss.ocaml.org/t/dependabot-and-ocaml/4282
Daniil Baturin then asked
I'm not sure if I understand the premise of dependabot. Why would anyone hardcode specific dependency versions? Maybe it makes sense in certain ecosystems that suffer from never-ending ecological disasters… ;)
In any case, most opam packages don't have a constraint on the upper versions of their dependencies. Can dependabot use custom tracking rules to check for presence of a newer version in the repo? My thought was much simpler actually: track the commits in opam-repository, run recently changed files through Dismas and send pull requests to opam-cross-*
Yawar Amin replied
It's common practice nowadays to use semantic versioning and have lockfiles for reproducible builds. Dependabot updates semantic version ranges and lockfiles. See e.g.
Multicore OCaml: March 2020 update
Anil Madhavapeddy announced
Welcome to the March 2020 news update from the Multicore OCaml team! This update has been assembled with @shakthimaan and @kayceesrk, as with the February and January ones.
Our work this month was primarily focused on performance improvements to the Multicore OCaml compiler and runtime, as part of a comprehensive evaluation exercise. We continue to add additional benchmarks to the Sandmark test suite. The eventlog tracing system and the use of hash tables for marshaling in upstream OCaml are in progress, and more PRs are being queued up for OCaml 4.11.0-dev as well.
The biggest observable change for users trying the branch is that a new GC (the "parallel minor gc") has been merged in preference to the previous one ("the concurrent minor gc"). We will have the details in longer form at a later stage, but the essential gist is that the parallel minor GC no longer requires a read barrier or changes to the C API. It may have slightly worse scalability properties at a very high number of cores, but is roughly equivalent at up to 24 cores in our evaluations. Given the vast usability improvement from not having to port existing C FFI uses, we have decided to make the parallel minor GC the default one for our first upstream runtime patches. The concurrent minor GC will follow at a later stage when we ramp up testing to 64-core+ machines. The multicore opam remote has been updated to reflect these changes, for those who wish to try it out at home.
We are now at a stage where we are porting larger applications to multicore. Thanks go to:
- @UnixJunkie who helped us integrate the Gram Matrix benchmark in https://github.com/ocaml-bench/sandmark/issues/99
- @jhw has done extensive work towards supporting Systhreads in https://github.com/ocaml-multicore/ocaml-multicore/pull/240. Systhreads is currently disabled in multicore, leading to some popular packages not compiling.
- @antron has been advising us on how best to port `Lwt_preemptive` and the `Lwt_unix` modules to multicore, giving us a widely used IO stack to test more applications against.
If you do have other suggestions for applications that you think might provide useful benchmarks, then please do get in touch with myself or @kayceesrk.
Onto the details! The various ongoing and completed tasks for Multicore OCaml are listed first, which is followed by the changes to the Sandmark benchmarking infrastructure and ongoing PRs to upstream OCaml.
Multicore OCaml
- Ongoing
ocaml-multicore/ocaml-multicore#240 Proposed implementation of threads in terms of Domain and Atomic
A new implementation of the `Threads` library for use with the new `Domain` and `Atomic` modules in Multicore OCaml has been proposed. This builds Dune 2.4.0 which in turn makes it useful to build other packages. This PR is open for review.
ocaml-multicore/safepoints-cmm-mach Better safe points for OCaml
A newer implementation to insert safe points at the Cmm level is being worked upon in this branch.
- Completed
The following PRs have been merged into Multicore OCaml:
ocaml-multicore/ocaml-multicore#303 Account correctly for incremental mark budget
The patch correctly measures the incremental mark budget value, and improves the maximum latency for the `menhir.ocamly` benchmark.
ocaml-multicore/ocaml-multicore#307 Put the phase change event in the actual phase change code
The PR includes the `major_gc/phase_change` event in the appropriate context.
ocaml-multicore/ocaml-multicore#309 Don't take all the full pools in one go.
The code change selects one of the `global_full_pools` to try sweeping it later, instead of adopting all of the full ones.
ocaml-multicore/ocaml-multicore#310 Statistics for the current domain are more recent than other domains
The statistics (`minor_words`, `promoted_words`, `major_words`, `minor_collections`) for the current domain are more recent, and are used in the right context.
ocaml-multicore/ocaml-multicore#315 Writes in `caml_blit_fields` should always use `caml_modify_field` to record `young_to_young` pointers
The PR enforces that `caml_modify_field()` is always used to store `young_to_young` pointers.
ocaml-multicore/ocaml-multicore#316 Fix bug with `Weak.blit`.
The ephemerons are allocated as marked, but the keys or data can be unmarked. The blit operations copy weak references from one ephemeron to another without marking them. The patch marks the keys that are blitted in order to keep the unreachable keys alive for another major cycle.
ocaml-multicore/ocaml-multicore#317 Return early for 0 length blit
The PR forces a `CAMLreturn()` call if the blit length is zero in `byterun/weak.c`.
ocaml-multicore/ocaml-multicore#320 Move `num_domains_running` decrement
The `caml_domain_alone()` invocation needs to be used in the shared heap teardown, and hence the `num_domains_running` decrement is moved as the last operation for at least the `shared_heap` lockfree fast paths.
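For readers unfamiliar with the operation touched by #316 and #317, here is a small user-level sketch of `Weak.blit` from the standard library. It illustrates the API only, not the runtime fix. The function name `blit_one` is an assumption made for this example.

```ocaml
(* Copy one weak pointer from a source weak array to a destination one,
   and report what the destination holds after an empty blit and after
   a one-element blit. *)
let blit_one (key : int ref) =
  let src = Weak.create 1 and dst = Weak.create 1 in
  Weak.set src 0 (Some key);
  (* A zero-length blit copies nothing. *)
  Weak.blit src 0 dst 0 0;
  let filled_after_empty_blit = Weak.check dst 0 in
  (* A one-element blit copies the weak pointer itself, not the value. *)
  Weak.blit src 0 dst 0 1;
  let after_blit = Weak.get dst 0 in
  (filled_after_empty_blit, after_blit)

let () =
  let key = ref 42 in
  match blit_one key with
  | false, Some k when k == key -> print_endline "blit copied the weak pointer"
  | _ -> print_endline "unexpected result"
```

The key stays reachable through the strong reference held by the caller, so the garbage collector cannot clear either weak pointer during the example.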
Benchmarking
The Sandmark performance benchmarking test suite has had newer benchmarks added, and work is underway to enhance its functionality.
ocaml-bench/sandmark#88 Add PingPong Multicore benchmark
The PingPong benchmark that uses producer and consumer queues has now been included into Sandmark.
ocaml-bench/sandmark#98 Add the read/write Irmin benchmark
A basic read/write file performance benchmark for Irmin has been added to Sandmark. You can vary the following input parameters: number of branches, number of keys, percentage of reads and writes, number of iterations, and the number of write operations.
ocaml-bench/sandmark#100 Add Gram Matrix benchmark
A request ocaml-bench/sandmark#99 to include the Gram Matrix initialization numerical benchmark was created. This is useful for machine learning applications and is now available in the Sandmark performance benchmark suite. The speedup (sequential_time/multi_threaded_time) versus number of cores for Multicore (Concurrent Minor Collector), Parmap and Parany is quite significant and illustrated in the graph:
ocaml-bench/sandmark#103 Add depend target in Makefile
Sandmark now includes a `depend` target defined in the Makefile to check that both `libgmp-dev` and `libdw-dev` packages are installed and available on Ubuntu.
ocaml-bench/sandmark#90 More parallel benchmarks
An issue has been created to add more parallel benchmarks. We will use this to keep track of the requests. Please feel free to add your wish list of benchmarks!
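As a reminder of what the Gram Matrix benchmark computes and how the speedup figure is defined, here is a sequential sketch. It is not the Sandmark benchmark code. The function names and the sample data are assumptions for illustration.

```ocaml
(* Dot product of two vectors of equal length. *)
let dot a b =
  let acc = ref 0.0 in
  Array.iteri (fun i x -> acc := !acc +. (x *. b.(i))) a;
  !acc

(* Gram matrix: g.(i).(j) is the dot product of samples i and j.
   The matrix is symmetric, so only the upper triangle is computed. *)
let gram samples =
  let n = Array.length samples in
  let g = Array.make_matrix n n 0.0 in
  for i = 0 to n - 1 do
    for j = i to n - 1 do
      let v = dot samples.(i) samples.(j) in
      g.(i).(j) <- v;
      g.(j).(i) <- v
    done
  done;
  g

(* Speedup as defined in the text: sequential time over parallel time. *)
let speedup ~sequential_time ~parallel_time = sequential_time /. parallel_time

let () =
  let g = gram [| [| 1.0; 0.0 |]; [| 1.0; 1.0 |] |] in
  Array.iter
    (fun row ->
      Array.iter (fun v -> Printf.printf "%.1f " v) row;
      print_newline ())
    g
```

Each entry of the matrix can be computed independently of the others, which is why the workload lends itself to parallel execution.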
OCaml
- Ongoing
ocaml/ocaml#9082 Eventlog tracing system
The configure script has now been updated so that it can build on Windows. Apart from this major change, a number of minor commits have been made for the build and sanity checks. This PR is currently under review.
ocaml/ocaml#9353 Reimplement output_value using a hash table to detect sharing.
The ocaml/ocaml#9293 "Use addrmap hash table for marshaling" PR has been re-implemented using a hash table and bit vector, thanks to @xavierleroy. This is a pre-requisite for Multicore OCaml that uses a concurrent garbage collector.
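What "detecting sharing" means for `output_value` can be observed from user code with the standard Marshal module. This sketch uses only the public API and says nothing about how the PR implements detection internally. The sample payload is an arbitrary assumption.

```ocaml
(* A pair whose two components are the same heap block. *)
let payload = String.make 100 'x'
let pair = (payload, payload)

(* Default marshaling detects that both components are the same block
   and writes it once; No_sharing writes it twice. *)
let shared = Marshal.to_string pair []
let unshared = Marshal.to_string pair [ Marshal.No_sharing ]

let () =
  Printf.printf "with sharing: %d bytes, without: %d bytes\n"
    (String.length shared) (String.length unshared);
  (* Sharing survives the round trip: both components are one block again. *)
  let (a, b) : string * string = Marshal.from_string shared 0 in
  assert (a == b)
```

To detect sharing, the marshaler has to remember which blocks it has already visited, and that bookkeeping is what the PR moves to a hash table.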
As always, we thank the OCaml developers and users in the community for their code reviews, support, and contribution to the project. From OCaml Labs, stay safe and healthy out there!
Old CWN
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.