OCaml Weekly News
Hello
Here is the latest OCaml Weekly News, for the week of December 01 to 08, 2020.
Table of Contents
- OCaml 4.12.0, second alpha release
- ez_subst.0.1.0 and ez_cmdliner.0.2.0
- New release of Menhir (20201201)
- http-multipart-formdata 1.0.0
- Multicore OCaml: November 2020
- Seq vs List, optimization
- dap 1.0.0 – Debug Adapter Protocol for OCaml
- ✂️ form2xml - a tiny cli tool to slice http form-data dumps
- http-multipart-formdata 1.0.1
- Set up OCaml 1.1.4
- First Public Release (beta) of the Memthol memory profiling visualizer
- Exception vs Result
- Making web calls to OCaml
- wasmtime 0.0.1: lightweight WebAssembly runtime
- First release of Lwt-exit
- Old CWN
OCaml 4.12.0, second alpha release
octachron announced
The release of OCaml 4.12.0 is approaching. We have released a second alpha version to help fellow hackers join us early in our bug hunting and opam ecosystem fixing fun.
Beyond the usual bug fixes, this new alpha version removes the type system change that restricted the propagation of type information between branches of a "match". The newly introduced warning was more troublesome than expected, so the feature has been postponed to 4.13.
The base compiler can be installed as an opam switch with the following commands
opam update
opam switch create 4.12.0~alpha2 --repositories=default,beta=git+https://github.com/ocaml/ocaml-beta-repository.git
If you want to tweak the configuration of the compiler, you can pick configuration options with
opam update
opam switch create <switch_name> --packages=ocaml-variants.4.12.0~alpha2+options,<option_list> --repositories=default,beta=git+https://github.com/ocaml/ocaml-beta-repository.git
where <option_list> is a comma-separated list of ocaml-option-* packages. For instance, for a flambda and afl enabled switch:
opam switch create 4.12.0~alpha2+flambda+afl --packages=ocaml-variants.4.12.0~alpha2+options,ocaml-option-flambda,ocaml-option-afl --repositories=default,beta=git+https://github.com/ocaml/ocaml-beta-repository.git
All available options can be listed with "opam search ocaml-option".
The source code for the alpha is also available at these addresses:
- https://github.com/ocaml/ocaml/archive/4.12.0-alpha2.tar.gz
- https://caml.inria.fr/pub/distrib/ocaml-4.12/ocaml-4.12.0~alpha2.tar.gz
If you want to test this version, it is advised to install the alpha opam repository
https://github.com/kit-ty-kate/opam-alpha-repository
with
opam repo add alpha git://github.com/kit-ty-kate/opam-alpha-repository.git
This alpha repository contains various packages patched with fixes in the process of being upstreamed. Once the repository is installed, these patched packages will take precedence over the non-patched versions.
If you find any bugs, please report them here: https://github.com/ocaml/ocaml/issues
ez_subst.0.1.0 and ez_cmdliner.0.2.0
Fabrice Le Fessant announced
I am pleased to announce the new releases of two opam packages: ez_subst and ez_cmdliner. We use both of them as dependencies of drom (and use drom to manage them).
ez_subst is a simple library to perform string replacements in strings. It can be seen as a replacement for Printf when you are lost with too many %s in one format, or a replacement for Buffer.add_substitute when you want more control. Replacements are chosen by functions, and can be separately specified using optional arguments `brace (for ${var}), `paren (for $(var)), `bracket (for $[var]) and `var (for $alphanum). The separator $ can be changed, and notation can be symmetric (%{x}%).
https://ocamlpro.github.io/ez_subst
https://ocamlpro.github.io/ez_subst/doc/ez_subst/Ez_subst/V1/EZ_SUBST/index.html
For example:
open Ez_subst.V1 (* versioned interface *)

let s =
  EZ_SUBST.string
    ~brace:(fun ctxt n -> string_of_int (ctxt + int_of_string n))
    ~ctxt:3 "${4} ${5}"

let s =
  EZ_SUBST.string ~sep:'!'
    ~paren:(fun () s -> String.uppercase_ascii s)
    ~ctxt:() "!(abc) !(def)"

let s =
  EZ_SUBST.string ~sym:true ~sep:'%'
    ~brace:(fun ctxt s -> ctxt ^ " " ^ s)
    ~ctxt:"Hello" "%{John}% %{Sandy}%"

let s =
  EZ_SUBST.string_from_list ~default:"unknown"
    [ "name", "Doe"; "surname", "John" ]
    "${name} $(surname) is missing"
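For comparison, here is a minimal sketch of the same kind of substitution using only the standard library's Buffer.add_substitute, which the announcement names as one of the functions ez_subst can replace. The helper name substitute and its ?default argument are assumptions made for this illustration; they are not part of ez_subst.

```ocaml
(* Sketch using only the standard library; [substitute] is a hypothetical
   helper, not part of ez_subst. *)
let substitute ?(default = "unknown") bindings template =
  let buf = Buffer.create (String.length template) in
  (* Look each variable name up in the association list. *)
  let lookup name =
    match List.assoc_opt name bindings with
    | Some value -> value
    | None -> default
  in
  (* Buffer.add_substitute recognises $var, ${var} and $(var). *)
  Buffer.add_substitute buf lookup template;
  Buffer.contents buf

let message =
  substitute [ "name", "Doe"; "surname", "John" ] "${name} $(surname) is missing"

let () = print_endline message
```

With the standard library, a single lookup function handles every bracket style and the $ separator is fixed, which is the kind of limitation the per-style callbacks and the ~sep argument described above appear to address.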
ez_cmdliner is a simple layer over cmdliner to provide an interface à la Arg module. It provides support for one-command and sub-command modes. It also provides a ReST generator to document sub-commands and integrate the documentation in a Sphinx documentation (to use with drom for example).
https://ocamlpro.github.io/ez_cmdliner
For example:
open Ezcmd.V2

let cmd_new =
  EZCMD.sub "new" (* for `drom new` *)
    ~args:
      [ [ "dir" ],
        Arg.String (fun s -> dir := Some s),
        EZCMD.info ~docv:"DIRECTORY"
          "Dir where package sources are stored (src by default)";
        [ "library" ],
        Arg.Unit (fun () -> skeleton := Some "library"),
        EZCMD.info "Project contains only a library";
        [ "i"; "inplace" ],
        Arg.Set inplace, (* for `-i` or `--inplace` *)
        EZCMD.info "Create project in the current directory";
        [],
        Arg.Anon (0, fun name -> project_name := Some name),
        EZCMD.info ~docv:"PROJECT" "Name of the project"
      ]
    ~doc:"Create a new project"
    (fun () ->
       action ~name:!project_name ~skeleton:!skeleton ~dir:!dir
         ~inplace:!inplace ~args)
    ~man:
      [ `S "DESCRIPTION";
        `Blocks [ `P "This command performs the following actions:" ]
      ]

let () =
  EZCMD.main_with_subcommands ~name:"drom" ~version:"0.1.0"
    ~doc:"Create and manage an OCaml project" ~man:[] ~argv [ cmd_new ]
    ~common_args
Both packages are now available in opam repository.
New release of Menhir (20201201)
François Pottier announced
I would like to announce a new release of Menhir, the LR(1) parser generator for OCaml. The most prominent new features are intended to improve the comfort of the machinery that allows producing custom syntax error messages: a demo of this machinery has been added, new library functions have been added so as to make it easier to use, and the commands that deal with .messages files have been improved. An excerpt of the changelog appears below.
opam update
opam upgrade menhir
Happy parsing!
2020/12/01
- The module MenhirLib.ErrorReports is extended with new functions: wrap_supplier, extract, sanitize, compress, shorten, expand.
- The new module MenhirLib.LexerUtil offers a few functions that help reading a file, setting up a lexing buffer, printing source code positions, etc.
- The new demo calc-syntax-errors demonstrates how to produce customized syntax error messages.
- The new command --merge-errors merges two .messages files. It can be useful when two or more users have independently produced partial .messages files and wish to combine their work. (Suggested by Gabriel Scherer and François Bobot.)
- The commands that read .messages files have been hardened so as to tolerate situations where a sentence mentions a nonexistent symbol or does not lead to an error state. When such a sentence is encountered, an error message is produced on the standard error channel; then, this sentence is ignored and processing continues. (As an exception, the command --compile-errors refuses to proceed in the presence of such sentences.)
2020/11/22
- The new command line switch --dump-resolved writes a description of the automaton to the file .automaton.resolved after all conflicts have been resolved and after extra reductions have been introduced. This file also shows which states have a default reduction.
- The command line switch --dump writes a description of the automaton to the file .automaton after benign conflicts have been silently resolved, but before severe conflicts are resolved and before extra reductions are introduced. (This behavior is unchanged.) The manner in which end-of-stream conflicts are displayed in this file has been improved.
- In the files .automaton and .automaton.resolved, the reduction table in each state is now presented in a much more compact and readable way.
- In the files .automaton and .automaton.resolved, the known suffix of the stack in each state is now explicitly shown. (Although it can be deduced from the LR(1) items, showing it helps.)
- Document the problem caused by placing a module alias declaration in an .mly file. (See Questions and Answers in the manual.)
- Turn off a costly internal well-formedness assertion. This allows a 30% speedup in the construction of large automata and in the conflict explanation process. (Reported by Joe.)
http-multipart-formdata 1.0.0
Bikal Lem announced
It is my pleasure to announce the release of http-multipart-formdata v1.0.0. As the name suggests, the library implements functionality to allow HTTP file uploads and form processing. Tangentially, it implements the standard RFC 7578 - Returning Values from Forms: multipart/form-data which is the standard browsers use to send form data to a web server.
I developed this library as part of my endeavour to create OCaml web applications.
It is also an example of the parser construction library reparse which I also released a few days ago.
Multicore OCaml: November 2020
Anil Madhavapeddy announced
Welcome to the November 2020 Multicore OCaml report! This update along with the previous updates have been compiled by @shakthimaan, @kayceesrk, and @avsm.
Multicore OCaml: Since the support for systhreads has been merged last month, many more ecosystem packages compile. We have been doing bulk builds (using a specialised opam-health-check instance) against the opam repository in order to chase down the last of the lingering build bugs. Most of the breakage is around packages using C stubs related to the garbage collector, although we did find a few actual multicore bugs (related to the thread machinery when using dynlink). The details are under "ecosystem" below. We also spent a lot of time on optimising the stack discipline in the multicore compiler, as part of writing a draft paper on the effect system (more details on that later).
Upstream OCaml: The 4.12.0alpha2 release is now out, featuring the dynamic naked pointer checker to help make sure your code only uses external pointers that are boxed. Please do run your codebase on it to help prepare. For OCaml 4.13 (currently the trunk branch), we had a full OCaml developers meeting where we decided on the worklist for what we're going to submit upstream. The major effort is on GC safe points and not caching the minor heap pointer, after which the runtime domains support has all the necessary prerequisites upstream. Both of those PRs are highly performance sensitive, so there is a lot of poring over graphs going on (notwithstanding the irrepressible @stedolan offering a massive driveby optimisation).
Sandmark Benchmarking: The lockfree benchmarks have been added to Sandmark and the Graph500 benchmarks have been updated, and we continue to work on the tooling aspects. Benchmarking tests are also being done on AMD, ARM and PowerPC hardware to study the performance of the compiler. With reference to stock OCaml, the safepoints PR has now landed for review.
As with previous updates, the Multicore OCaml tasks are listed first, which are then followed by the progress on the Sandmark benchmarking test suite. Finally, the upstream OCaml related work is mentioned for your reference.
Multicore OCaml
- Ongoing
ocaml-multicore/ocaml-multicore#439 Systhread lifecycle work
An improvement to the initialization of systhreads for general resource handling, and freeing up of descriptors and stacks. There now exists a new hook on domain termination in the runtime.
ocaml-multicore/ocaml-multicore#440 ocamlfind ocamldep hangs in no-effect-syntax branch
The nocrypto package fails to build for the Multicore OCaml no-effect-syntax branch, and ocamlfind loops continuously. A minimal test example has been created to reproduce the issue.
ocaml-multicore/ocaml-multicore#443 Minor heap allocation startup cost
An issue to keep track of the ongoing investigations on the impact of large minor heap size for Multicore OCaml programs. The sequential and parallel execution run results for various minor heap sizes are provided in the issue.
ocaml-multicore/ocaml-multicore#446 Collect GC stats at the end of minor collection
The objective is to remove the use of double buffering in the GC statistics collection by using the barrier present during minor collection in the parallel_minor_gc schema. There is not much slowdown for the benchmark runs, normalized against stock OCaml.
- Completed
- Upstream
ocaml-multicore/ocaml-multicore#426 Replace global roots implementation
This PR replaces the existing global roots implementation with that of OCaml's globroots, wherein the implementation places locks around the skip lists. In future, the Caml_root usage will be removed along with its usage in globroots.
ocaml-multicore/ocaml-multicore#427 Garbage Collector colours change backport
The Garbage Collector colours change PR from trunk for the major collector has now been backported to Multicore OCaml. This includes the optimization for mark_stack_push, the mark_entry does not include end, and caml_shrink_mark_stack has been adapted from trunk.
ocaml-multicore/ocaml-multicore#432 Remove caml_context push/pop on stack switch
The motivation is to remove the use of caml_context push/pop on stack switches, to make the implementation easier to understand and to be closer to upstream OCaml.
- Stack Improvements
Fix stack overflow on scan stack#431 Fix issue 421: Stack overflow on scan stack
The caml_scan_stack now uses a while loop to avoid a stack overflow corner case where there is a deep nesting of fibers.
ocaml-multicore/ocaml-multicore#434 DWARF fixups for effect stack switching
The PR provides fixes for runtime/amd64.S on issues found using a DWARF validator. The patch also cleans up dead commented out code, and updates the DWARF information when we do caml_free_stack in caml_runstack.
ocaml-multicore/ocaml-multicore#435 Mark stack overflow backport
The mark-stack overflow implementation has been updated to be closer to trunk OCaml. The pools are added to a skiplist first to avoid any duplicates, and the pools in pools_to_rescan are marked later during a major cycle. The PR reports the finalise benchmark time difference with mark stack overflow.
ocaml-multicore/ocaml-multicore#437 Avoid an allocating C call when switching stacks with continue
The caml_continuation_use has been updated to use caml_continuation_use_noexc and it does not throw an exception. The allocating C caml_c_call is no longer required to call caml_continuation_use_noexc.
ocaml-multicore/ocaml-multicore#441 Tidy up and more commenting of caml_runstack in amd64.S
The PR adds comments on how stacks are switched, and removes unnecessary instructions in the x86 assembler.
ocaml-multicore/ocaml-multicore#442 Fiber stack cache (v2)
Addition of stack caching for fiber stacks, which also fixes up bugs in the test suite (DEBUG memset, order of initialization). We avoid indirection out of struct stack_info when managing the stack cache, and efficiently calculate the cache freelist bucket for a given stack size.
- Ecosystem
ocaml-multicore/lockfree#5 Remove Kcas dependency
The Kcas.Wl module is now replaced with the Atomic module available in Multicore stdlib. The exponential backoff is implemented with Domain.Sync.cpu_relax.
ocaml-multicore/domainslib#21 Point to the new repository URL
Thanks to Sora Morimoto (@smorimoto) for providing a patch that updates the URL to the correct ocaml-multicore repository.
ocaml-multicore/multicore-opam#40 Add multicore Merlin and dot-merlin-reader
A patch to merlin and dot-merlin-reader to work with Multicore OCaml 4.10.
ocaml-multicore/ocaml-multicore#403 Segmentation fault when trying to build Tezos on Multicore
The latest fixes in the no-effect-syntax branch, which replace the global roots implementation and fix the STW interrupt race, have resolved the issue.
- Compiler Fixes
ocaml-multicore/ocaml-multicore#438 Allow C++ to use caml/camlatomic.h
The inclusion of extern "C" headers to allow C++ to use caml/camlatomic.h for building ubpf.0.1.
ocaml-multicore/ocaml-multicore#447 domain_state.h: Remove a warning when using -pedantic
A fix that uses CAML_STATIC_ASSERT to check the size of caml_domain_state in domain_state.h, in order to remove the warning when using -pedantic.
ocaml-multicore/ocaml-multicore#449 Fix stdatomic.h when used inside C++ for good
Update to caml/camlatomic.h with extern C++ declaration to use it inside C++. This builds the ubpf.0.1 and libsvm.0.10.0 packages.
- Sundries
ocaml-multicore/ocaml-multicore#422 Simplify minor heaps configuration logic and masking
A Minor_heap_max size is introduced to reserve the minor heaps area, and Is_young for relying on a boundary check. The Minor_heap_max parameter can be overridden using the OCAMLRUNPARAM environment variable. This implementation approach is geared towards using Domain local allocation buffers.
ocaml-multicore/ocaml-multicore#429 Fix a STW interrupt race
A fix for the STW interrupt race in caml_try_run_on_all_domains_with_spin_work. The enter_spin_callback and enter_spin_data fields of stw_request are now initialized after we interrupt other domains.
ocaml-multicore/ocaml-multicore#430 Add a test to exercise stored continuations and the GC
The PR adds test coverage for interactions between the GC with stored, cloned and dropped continuations to exercise the minor and major collectors.
ocaml-multicore/ocaml-multicore#444 Merge branch 'parallel_minor_gc' into 'no-effect-syntax'
The parallel_minor_gc branch has been merged into the no-effect-syntax branch, and we will try to keep the no-effect-syntax branch up-to-date with the latest changes.
Benchmarking
- Ongoing
ocaml-bench/sandmark#196 Filter benchmarks based on tag
An enhancement to move towards a generic implementation to filter the benchmarks based on tags, instead of relying on custom targets such as _macro.json or _ci.json.
ocaml-bench/sandmark#191 Make parallel.ipynb notebook interactive
The parallel.ipynb notebook has been made interactive with drop-down menus to select the .bench files for analysis. The notebook README has been merged with the top-level README file. A sample 4.10.0.orunchrt.bench along with the *pausetimes_multicore.bench files have been moved to the test artifacts/ folder for user testing.
- We are continuing to test the use of opam-compiler switch environment to execute the Sandmark benchmark test suite. We have been able to build the dependencies, orun and rungen, the OCurrent pipeline and its dependencies, and ocaml-ci for the ocaml-multicore:no-effect-syntax branch. We hope to converge to a 2.0 implementation with the required OCaml tools and ecosystem.
- Completed
ocaml-bench/sandmark#179 [RFC] Classifying benchmarks based on running time
The Classification of benchmarks PR has been resolved, which now classifies the benchmarks based on their running time:
- lt_1s: Benchmarks that run for less than 1 second.
- lt_10s: Benchmarks that run for at least 1 second, but less than 10 seconds.
- 10s_100s: Benchmarks that run for at least 10 seconds, but less than 100 seconds.
- gt_100s: Benchmarks that run for at least 100 seconds.
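The four classes above amount to a simple threshold function on the running time. A minimal sketch, assuming the time is available as a float number of seconds; the function name classify_running_time is hypothetical and this is not Sandmark's own code:

```ocaml
(* Hypothetical helper mirroring the four running-time classes listed
   above; not taken from Sandmark. *)
let classify_running_time (seconds : float) : string =
  if seconds < 1.0 then "lt_1s"
  else if seconds < 10.0 then "lt_10s"
  else if seconds < 100.0 then "10s_100s"
  else "gt_100s"

let () =
  List.iter
    (fun t -> Printf.printf "%6.1fs -> %s\n" t (classify_running_time t))
    [ 0.4; 1.0; 42.0; 100.0 ]
```

Each lower bound is inclusive and each upper bound exclusive, matching the "at least ... but less than ..." wording of the classes.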
ocaml-bench/sandmark#189 Add environment support for wrapper in JSON configuration file
The OCAMLRUNPARAM arguments can now be passed as an environment variable when executing the benchmarks at runtime. The environment variables can be specified in the run_config.json file, as shown below:
{
  "name": "orun_2M",
  "environment": "OCAMLRUNPARAM='s=2M'",
  "command": "orun -o %{output} -- taskset --cpu-list 5 %{command}"
}
ocaml-bench/sandmark#183 Use crout_decomposition name for numerical analysis benchmark
The numerical-analysis/lu_decomposition.ml benchmark has now been renamed to crout_decomposition.ml to avoid naming confusion, as there are a couple of LU decomposition benchmarks in Sandmark.
ocaml-bench/sandmark#190 Bump trunk to 4.13.0
The trunk version in Sandmark ocaml-versions/ has now been updated to use 4.13.0+trunk.json.
ocaml-bench/sandmark#192 GraphSEQ corrected
The minor fix for the Kronecker generator has been provided for the Graph500 benchmark.
ocaml-bench/sandmark#194 Lockfree benchmarks
The lockfree benchmarks for both the serial and parallel implementation are now included in Sandmark, and use the lockfree_bench tag.
OCaml
- Ongoing
ocaml/ocaml#9876 Do not cache young_limit in a processor register
The removal of young_limit caching in a register is being evaluated using Sandmark benchmark runs to test the impact of the change on the ARM64, PowerPC and RISC-V ports.
ocaml/ocaml#9934 Prefetching optimisations for sweeping
The PR includes an optimization of sweep_slice for the use of prefetching, and to reduce cache misses during GC.
ocaml/ocaml#10039 Safepoints
A draft Safepoints implementation for AMD64 for the 4.11 branch, implemented by adding a new Ipoll operation to Mach. Benchmark results were collected on an AMD Zen2 machine.
Many thanks to all the OCaml users and developers for their continued support, and contribution to the project.
Acronyms
- ARM: Advanced RISC Machine
- DWARF: Debugging With Attributed Record Formats
- GC: Garbage Collector
- JSON: JavaScript Object Notation
- OPAM: OCaml Package Manager
- PR: Pull Request
- RFC: Request For Comments
- RISC-V: Reduced Instruction Set Computing - V
- STW: Stop-The-World
- URL: Uniform Resource Locator
Seq vs List, optimization
Deep in this thread, Sacha Ayoun asked and Raphaël Proust said
But then what’s the point of Seq ?
A bit of a spoiler for an upcoming release of a few of our libraries at Nomadic Labs…
We had a bug report: calls to some RPCs exposed by some of our binaries would occasionally cause some lag. One of the root causes of the issue was JSON serialisation. The original serialisation scheme was intended for a limited range of uses (especially, small sizes) but then it was used outside of this intended range and some relatively big values were serialised and pushed down the RPC stack.
To circumvent this, we are about to release
- a “json lexeme sequence” backend for our serialiser library: construct_seq : 'a encoding -> 'a -> json_lexeme Seq.t where json_lexeme = Jsonm.lexeme = [ `Null | `Bool of bool | … | `As | `Ae | `Os | `Oe ]
- a json lexeme sequence to string sequence converter.
For this second part, we actually have three different converters intended for slightly different uses. They have different granularity, they have different allocation profiles, and they make slightly different assumptions, most notably about concurrency:
- string_seq_of_json_lexeme_seq : chunk_size_hint:int -> json_lexeme Seq.t -> string Seq.t which uses one (1) internal buffer of size chunk_size_hint. Consuming one element of the resulting sequence causes several json lexemes to be consumed and printed onto the internal buffer until it is full. When this happens, a snapshot (copy) of the buffer is delivered in the Cons cell. So for a chunk_size_hint of, say, 1 kB, the sequence translator uses roughly 1 kB of memory and emits 1 kB chunks of memory that the consumer is responsible for.
- small_string_seq_of_json_lexeme_seq : json_lexeme Seq.t -> string Seq.t which translates each of the lexemes as a single string. It's a little bit more than a simple Seq.map because it needs to insert separators and escape strings. It mostly returns statically allocated strings so there are no big allocations at all.
- blit_instructions_seq_of_jsonm_lexeme_seq : buffer:bytes -> json_lexeme Seq.t -> (bytes * int * int) Seq.t which works somewhat similarly to the first one but uses buffer instead of allocating its own. It returns a seq of (source, offset, length) triples which are intended to be blitted onto whatever the consumer wants to propagate the data to. This barely allocates at all (it currently does allocate relatively big chunks when escaping strings, but we have planned to improve this in the future). (The sequence returns a source to blit; this source is physically equal to buffer most of the time but not always; specifically, for large strings that are present within the json data, the sequence just points to them as a source.)
Note that the description above is a simplification: there is a bit more to it than that. Also note that all this is still Work In Progress. Check out https://gitlab.com/nomadic-labs/json-data-encoding/-/merge_requests/5 (the value to json lexeme sequence code) and https://gitlab.com/nomadic-labs/data-encoding/-/merge_requests/19 (the json lexeme sequence to string sequence code).
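To make the buffering idea behind the first converter concrete, here is a minimal sketch that regroups a sequence of small strings into chunks of at least chunk_size_hint bytes, using only the standard library. It is not the Nomadic Labs code: the name chunked is hypothetical, it works on plain strings rather than json lexemes, and it does no escaping or separator insertion.

```ocaml
(* Hypothetical sketch, not the data-encoding implementation: regroup
   small strings into chunks of roughly [chunk_size_hint] bytes. *)
let chunked ~chunk_size_hint (pieces : string Seq.t) : string Seq.t =
  (* One shared internal buffer, reused for every chunk. *)
  let buf = Buffer.create chunk_size_hint in
  (* Copy the buffer contents out, then reset it for the next chunk. *)
  let snapshot () =
    let chunk = Buffer.contents buf in
    Buffer.clear buf;
    chunk
  in
  let rec next pieces () =
    match pieces () with
    | Seq.Nil ->
      (* Input exhausted: flush whatever is left, if anything. *)
      if Buffer.length buf = 0 then Seq.Nil
      else Seq.Cons (snapshot (), Seq.empty)
    | Seq.Cons (piece, rest) ->
      Buffer.add_string buf piece;
      if Buffer.length buf >= chunk_size_hint
      then Seq.Cons (snapshot (), next rest)
      else next rest ()
  in
  next pieces

let chunks =
  List.of_seq (chunked ~chunk_size_hint:4 (List.to_seq [ "ab"; "c"; "def"; "g" ]))
```

Because the buffer is shared, the resulting sequence is meant to be consumed once and from a single consumer, which illustrates why converters of this kind have to state their assumptions about concurrency.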
dap 1.0.0 – Debug Adapter Protocol for OCaml
文宇祥 announced
This is the debug adapter protocol library extracted from ocamlearlybird. It includes types generated from the specification and a DAP prioritized JSON RPC implementation. It is useful for implementing a debug adapter in OCaml.
Debug adapter protocol
- Project page: https://github.com/hackwaly/ocaml-dap
- Documentation: https://hackwaly.github.io/ocaml-dap/
CHANGES:
Initial release.
- Specification version is 1.43
✂️ form2xml - a tiny cli tool to slice http form-data dumps
Archive: https://discuss.ocaml.org/t/ann-form2xml-a-tiny-cli-tool-to-slice-http-form-data-dumps/6912/1
😷 Marcus Rohrmoser announced
When doing static web sites, feedback is an issue. form2xml helps you keep the server stupid, but still makes form-data feedback possible.
Just dump the form posts, rsync and merge them into your client-side, unix pipe, toolchain.
form2xml bridges the tooling-gap between http and xml/xslt with utmost primitivity in the making. I chose simplicity over compliance because form2xml isn't intended to run server-side or unattended. I'm aware of excellent prior art, but hesitated to add build dependencies for now and rather see if it proves useful as is.
http-multipart-formdata 1.0.1
Bikal Lem announced
I have just released http-multipart-formdata 1.0.1, a maintenance release that addresses a reported issue.
Set up OCaml 1.1.4
Sora Morimoto announced
We have a changelog since this release.
By the way, I'm preparing to publish v2 of setup-ocaml. It has a cache feature that is entirely independent of GitHub, so you don't have to worry about cache limit per repository, and you don't have to spend nearly 10 minutes on setup.
https://github.com/avsm/setup-ocaml/blob/master/CHANGELOG.md
First Public Release (beta) of the Memthol memory profiling visualizer
OCamlPro announced
We are happy to announce the first public release of Memthol, a visualizer and analyzer for memory profiling data generated from OCaml programs, thanks to the work of Adrien Champion and Vincent Laviron.
Memthol is a visualizer and analyzer for program profiling. It works on memory dumps containing information about the size and (de)allocation date of part of the allocations performed by some execution of a program. For information regarding building memthol, features, browser compatibility… refer to the memthol github repository. Please note that Memthol, as a side project, is a work in progress that remains in beta status for now.
Memthol's background
The Memthol work was started more than a year ago (we had published a short introductory paper at the JFLA2020). The whole idea was to use the previous work originally achieved on ocp-memprof, and look for some extra funding to achieve a usable and industrial version. Then came the excellent memtrace profiler by Jane Street's team.
The memtrace format is nicely designed and polished enough to be considered a future standard for other tools. This is why Memthol supports Jane Street's dumper format, instead of our own dumper library's.
Memthol is a self-funded side project, that we think is worth giving to the OCaml community. Its approach is valuable, and can be complementary. It is released under the free GPL licence v3.
We welcome any extra funding to achieve a usable and industrial version!
Memthol's features:
- multi-client: open several tabs in your browser for the same profiling session to visualize the data separately
- self-contained: the BUI packs all its static assets, once you have the binary you do not need anything else (except a browser)
- data-splitting: plot several families of data separately in the same chart by separating them based on size, allocation lifetime, source locations in the allocation callstack, etc.
Issues are welcome. As Memthol is mostly tested on the Chrome web browser, you might experience problems with other browsers. Do not hesitate to open issues.
We have written a mini-tutorial on Memthol, available on our github repository and in our blog post, which you can find by following this link: https://www.ocamlpro.com/2020/12/01/memthol-exploring-program-profiling/
Exception vs Result
Chas Emerick discussed
Last week, @BikalGurung blogged On Effectiveness of Exceptions in OCaml, in part as a follow-up to his announcement of his parser combinator library reparse, which eschews Result-based error handling in favor of exceptions. I've long preferred using Result (and its equivalents in other languages), and my experience so far with OCaml is that that preference is shared by many in the community and by authors of key libraries, but I was happy to consider a new counterpoint.
Doing so prompted me to consider my rationale(s) more than I had previously, and do some additional reading and research, all of which ended up further cementing my pro-Result bias. What follows are counterpoints to Bikal's two most consequential arguments (in my opinion), and some elaboration beyond.
Many thanks to Bikal for posting his experience report!
Stacktrace / Location Information
First, Bikal focuses in on how useful error handling should "allow us to efficiently, correctly and accurately identify the source of our errors". I agree, but he compares exceptions and result on this basis like so:
OCaml exception back traces - or call/stack traces - is one such tool which I have found very helpful. It gives the offending file and the line number in it. This make investigating and zeroing in on the error efficient and productive.
Using result type means you lose this valuable utility that is in-built to the ocaml language and the compiler.
It is true that Error does not implicitly carry backtraces as exceptions do, but there is nothing preventing one from choosing to include a backtrace with a returned error, since OCaml backtraces helpfully exist separate from its exception facility:
let b x =
  if x > 0
  then Ok 0
  else Error ("unimplemented", Printexc.get_callstack 10)

let a x = b x

let _ = match a (int_of_string @@ Sys.argv.(1)) with
  | Ok v -> Format.printf "%d@." v
  | Error (msg, stack) ->
    Format.fprintf Format.err_formatter "Error: %s@." msg;
    Printexc.print_raw_backtrace stderr stack
$ ocamlc -g -o demo.exe src/demo.ml
$ ./demo.exe -1
Error: unimplemented
Raised by primitive operation at file "src/demo.ml", line 5, characters 31-56
Called from file "src/demo.ml", line 9, characters 14-47
From a strictly ergonomic standpoint, it makes sense to wish that e.g. the Error constructor were treated specially such that values it produced always carried a stack trace (as exceptions do/are), so that programmers would not need to opt into it as above. However, that would not come without costs, including a maybe-significant runtime penalty that might render Result a less useful way to cheaply signal recoverable error conditions (something that other exception-dominant languages/runtimes struggle to do given that stacktrace generation is far from free).
Correctness
Bikal's final topic was re: correctness, and to what extent using one or another error-handling mechanism tangibly affects his work. What he says is short enough to reproduce in full:
I thought this would be the biggest advantage of using result type and a net benefit. However, my experience of NOT using it didn't result in any noticeable reduction of correct by construction OCaml software. Conversely, I didn't notice any noticeable improvement on this metric when using it. What I have noticed over time is that abstraction/encapsulation mechanisms and type system in particular play by far the most significant role in creating correct by construction OCaml software.
There's a lot left undefined here: what "correct by construction" might mean generally, what it means in the context of OCaml software development, how it could be measured (_is_ there a metric, or are we just reckoning here?), and so on.
While reminding myself of exactly what "correct by construction" meant, I came across a fantastic lecture by Martyn Thomas[1] that neatly defines it (and goes into some detail of how to go about achieving it); from the accompanying lecture notes[2]:
…you start by writing a requirements specification in a way that makes it possible to analyse whether it contains logical omissions or contradictions. Then you develop the software in a way that provides very strong evidence that the program implements the specification (and does nothing else) and that it will not fail at runtime. We call this making software “correct by construction”, because the way in which the software is constructed guarantees that it has these properties.
While we aren't working with anything as formal as a theorem prover when we are programming in OCaml, it does provide us with a tremendous degree of certainty about how our programs will behave. One of the greatest sources of that certainty is its default of requiring functions and pattern matches to be exhaustive with regard to the domain of values of the type(s) they accept; i.e. a function that accepts a result must provide cases for all of its known constructors:
let get = function Ok v -> v
$ ocamlc -g -o demo.exe src/demo.ml
File "src/demo.ml", line 15, characters 10-28:
15 | let get = function Ok v -> v
               ^^^^^^^^^^^^^^^^^^
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
Error _
This is one way we "provide evidence" to the OCaml compiler that our code does not contain "logical omissions", to use Prof. Thomas' nomenclature.
There are ways to relax this requirement, though. Aside from simply telling the compiler to not bother us with its concerns via an attribute:
let get = function Ok v -> v [@@warning "-8"]
…we could simply use exceptions instead. For example, an exception-based variant of the program I started with earlier:
exception Unimplemented
let a x = if x > 0 then 0 else raise Unimplemented
let _ = Format.printf "%d@." @@ a (int_of_string @@ Sys.argv.(1))
This approach is less correct by any measure: the Unimplemented exception is not indicated in the signature of a, making it easy to call a without handling the exception, or being aware of its possibility at all. Insofar as the exceptions in question are not intended to be fatal, program-terminating errors, this approach absolutely increases the potential for "logical omissions", increases the potential for programs to fail at runtime, and hobbles the exhaustivity guarantees that the OCaml compiler provides for us otherwise.
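To make the contrast concrete, here is a small sketch of putting the failure back into the signature by wrapping the exception-raising function at a boundary. The names a_result and describe are hypothetical and not from the original post; the polymorphic variant is just one possible choice of error type.

```ocaml
exception Unimplemented

(* Exception-based version from the text: its type, int -> int,
   says nothing about the possible failure. *)
let a x = if x > 0 then 0 else raise Unimplemented

(* Hypothetical wrapper: catch the one known exception and return a result.
   The inferred type now mentions the error case. *)
let a_result x =
  match a x with
  | v -> Ok v
  | exception Unimplemented -> Error `Unimplemented

(* Callers must now cover the Error case, or the compiler emits warning 8. *)
let describe x =
  match a_result x with
  | Ok v -> string_of_int v
  | Error `Unimplemented -> "unimplemented"
```

The wrapper only helps for exceptions one knows about; any other exception raised inside a would still escape unannounced.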
Later in the reparse announcement thread, @rixed said (presumably in response to this tension):
If only we had a way to know statically which exceptions can escape any functions that would be the best of both worlds!
And indeed, this approach of incorporating known thrown exception types into function signatures is a known technique, (in)famously included in Java from the beginning (called checked exceptions), but widely disdained. I suspect that disdain was more due to Java's other weaknesses in exception handling than the principal notion of propagating exception types in function/method signatures. It would be interesting to see something like checked exceptions experimented with in OCaml, though it may be that doing so would nullify one of the primary benefits that those preferring exceptions enjoy (perceived improved aesthetics/clarity), and/or the work needed to achieve this might approximate the typed effect handling approaches that @lpw25 et al. have been pursuing for some time.
[1]: Making Software 'Correct by Construction'
https://www.gresham.ac.uk/lectures-and-events/making-software-correct-by-construction
[2]: https://www.gresham.ac.uk/lecture/transcript/download/making-software-correct-by-construction/
bnguyenvanyen said
Just chiming in to note that there has been an interesting discussion on this topic two years ago: https://discuss.ocaml.org/t/specific-reason-for-not-embracing-the-use-of-exceptions-for-error-propagation/1666/40
It's also interesting to note that that discussion also ended up talking about typed effects. As I understand it, they would indeed subsume checked exceptions, and I'm quite excited about them.
Yawar Amin also said
Cristiano Calcagno has been doing some pretty interesting work on this: https://github.com/reason-association/reanalyze/blob/72712393459d7e132c78e0700abffc5fc4cd09b8/EXCEPTION.md
Let me quote the central concept from there:
The exception analysis is designed to keep track statically of the exceptions that might be raised at runtime. It works by issuing warnings and recognizing annotations. Warnings are issued whenever an exception is raised and not immediately caught. Annotations are used to push warnings from the local point where the exception is raised, to the outside context: callers of the current function. Nested functions need to be annotated separately.
Later in the thread, Chet Murthy said
I'm going to address the general issue of "programming with monads", and not specifically the result monad, b/c I think it's just an instance of the general phenomenon.
TL;DR In 1992, when someone told me about "programming with monads", I replied that I already programmed with monads: I used the "SML Monad". And this LtU post seems to me to be pithily succinct (http://lambda-the-ultimate.org/node/5504 )
(1) when we talk about program correctness, we mean two things: reasoning about programs, and type-safety. I'll address each in turn below.
(2) All monadic transformations of which I am aware (exceptions, state, control, I/O) are direct equivalents to the "standard semantics" for such language-features, e.g. as described in Michael J.C. Gordon's book The Denotational Description of Programming Languages. Programming with monads is programming with some combinators and macros, on the right-hand-side of the denotational interpreters in that book.
(3) "reasoning about programs" has historically meant "equational reasoning", and IIUC, Felleisen&Sabry's work (and follow-on works) proved pretty conclusively that anything you can prove about the right-hand-side of the denotational semantics interpreter definition, you can "pull back" to equational reasoning with extra rules, on the left-hand-side of the DS interpreter.
(4) "type safety":
If only we had a way to know statically which exceptions can escape any functions
There was a cottage industry of "effect type systems" to capture/reason-about exceptions, state, maybe other things, decades ago. They were judged too cumbersome for programmers to use, and hence died-out. >10yr ago there was a caml-light (OCaml?) variant that checked exceptions in function-types; it didn't catch on. Look at Java, where some exceptions are "checked" and others are not: some exceptions, it's just too cumbersome to track in the type system. And so either your "result" monad only captures some of the exceptions, or it's going to be wildly cumbersome.
(5) Monads are less-efficient than direct-style, memory-wise. For me, the moment in 1992 when I (an avowed SML/NJ bigot) became convinced of the superiority of caml-light (notwithstanding 2.5x slower on average) was when I realized that it was -so- much less memory-intensive. Because it didn't allocate stack-frames on the heap, and closures started out on the stack and only moved to the heap on-demand. Henry Baker made the observation >20yr ago that the stack is a form of nursery. Writing in monadic style is sacrificing this obvious performance advantage. In the era of multicore, arguments made back then about memory can be recast as arguments about the cache today, since (as hardware designers put it) "memory is at infinity" today.
P.S. And yet, I use monads sometimes, too. Rarely. But for instance, it's a good model for (e.g.) writing a type-checker that wants to type-check a list of expressions (no unification, hence no side-effects) and not stop at the first type-error, but rather gather together errors from all the expressions in the list, and produce an error with all of them combined. So the type-checker at the top of each member of the list catches any raised exception, stores it in an accumulator, and goes on to the rest of the list; at the end of the list, if the accumulator is empty, it returns the list of result-types; otherwise it raises an exception containing the list of errors stored in the accumulator.
It's rare, and if the Result monad didn't exist, I'd hack something together, but …. it's literally the only use I can think of, that wasn't driven by a library (e.g. bos) using the Result monad itself (and my needing to use that library).
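The error-accumulating type-checker described in the P.S. might look roughly like the sketch below. The toy expression language, the ty and expr types, and the names check and check_all are all made up for illustration; only the control flow (catch per element, accumulate, raise once at the end) follows the description above.

```ocaml
(* Hypothetical toy language, just large enough to produce type errors. *)
type ty = TInt | TBool

type expr =
  | Int of int
  | Bool of bool
  | Add of expr * expr
  | If of expr * expr * expr

exception Type_error of string        (* a single failure *)
exception Type_errors of string list  (* all failures, combined *)

(* Check one expression; raises Type_error at the first problem found. *)
let rec check = function
  | Int _ -> TInt
  | Bool _ -> TBool
  | Add (l, r) ->
    if check l = TInt && check r = TInt then TInt
    else raise (Type_error "Add expects two ints")
  | If (c, t, e) ->
    if check c <> TBool then raise (Type_error "If condition must be a bool");
    let tt = check t in
    if check e = tt then tt
    else raise (Type_error "If branches must have the same type")

(* Check every expression, catching each Type_error into an accumulator.
   Returns the list of types if there were no errors, otherwise raises
   Type_errors with the messages in input order. *)
let check_all exprs =
  let errors = ref [] in
  let types =
    List.filter_map
      (fun e ->
         match check e with
         | t -> Some t
         | exception Type_error msg ->
           errors := msg :: !errors;
           None)
      exprs
  in
  match List.rev !errors with
  | [] -> types
  | errs -> raise (Type_errors errs)
```

Note that check itself still stops at the first error inside a single expression; only the top-level list is checked exhaustively, as in the description.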
And this efficiency is the real reason that exception-based backtraces are better: IIUC, OCaml exceptions are really cheap because they don't materialize that backtrace until demanded. It means that you have to be careful what code you put between the "try-with" and the demand for the backtrace, but it's efficient. Materializing the backtrace for every exception raised would be ….. pretty horrendously inefficient, and yet that's what you have to do if you use the result monad.
Malcolm also said
I wrote two blog posts on my experience using result awhile ago, linked below. Much of it still holds. Many of the pain points others have mentioned do exist, but in my judgement, given the current state of OCaml, results are strictly better (at the very least at the API boundary, assuming you can convince yourself no exceptions escape it) than exceptions. I also believe that reasonable error values are necessary. For example, I know some APIs like some variation of ('a, string) result which, IMO, is not a great API as I end up comparing strings and hoping the string value is actually part of the API and not some rando value tossed in there. Double for when meaningful aspects of the error are encoded in the string and I have to decode it to decide what to do.
For my own things I do require that all errors are convertible to a string so I can just show them to the user; this is especially important for development and debugging, IME. This is one of the few places where I do wish we had something like type classes so I could do something like:
foo () >>= function
| Ok () -> yadda
| Error err -> show err
YMMV
http://functional-orbitz.blogspot.com/2013/01/experiences-using-resultt-vs-exceptions.html
http://functional-orbitz.blogspot.com/2013/01/introduction-to-resultt-vs-exceptions.html
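The difference between ('a, string) result and a structured error value that is also convertible to a string can be sketched as follows. This is only an illustration of the idea: the error type, its constructors, and the functions show, lookup and handle are hypothetical, and without type classes each error type has to bring its own show function.

```ocaml
(* Hypothetical structured error type: callers match on constructors
   instead of comparing strings. *)
type error =
  | Missing_key of string
  | Invalid_amount of int

(* Every error is convertible to a string for display and debugging. *)
let show = function
  | Missing_key key -> Printf.sprintf "missing key: %s" key
  | Invalid_amount n -> Printf.sprintf "invalid amount: %d" n

(* Hypothetical API returning the structured error. *)
let lookup key = if key = "alice" then Ok 42 else Error (Missing_key key)

(* The decision is made on the constructor; the string is only for people. *)
let handle key =
  match lookup key with
  | Ok v -> Printf.sprintf "value: %d" v
  | Error (Missing_key _) -> "fall back to a default"
  | Error err -> "fatal: " ^ show err
```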
Making web calls to OCaml
Peter Fishman asked
Hi, I am new to OCaml and in fact, I'm not even a programmer (although I did study CS at Cornell back in the 80's and learned functional programming in a language called Scheme.) I am thinking about developing financial wellness web applications with the underlying computations in OCaml, but with the user interface in something else - like JavaScript. How would a JavaScript website make a call to an OCaml program (or function)? Or put another way, can I publish a financial model built in OCaml so that a web (or mobile) application could call it - passing arguments to the function and receiving back the result of evaluating the functional expression? My apologies if this is not asked correctly or if it is a very basic question, but I am not sure I have the right terminology to ask the question properly! Help is appreciated!
Wojtek Czekalski replied
You can, there are multiple ways to achieve what you want. If I understand correctly you want to build a web server in OCaml and a web app front end. Here's a list of what different projects used: https://discuss.ocaml.org/t/your-production-web-stack-in-2020/6691/11
Edit: To elaborate because I realized I didn't answer your question, typically you'd have a frontend which uses something like REST or GraphQL to fetch data from your server. There's a lot to unpack here. I'm sure you'll be able to pull it off but if you're not comfortable with programming make sure that you approach the problem gradually and make sure to avoid analysis paralysis.
Yawar Amin also replied
Hi, a couple of thoughts here. As @wokalski said, you will need to set up a server application, and a web frontend. I don't know much about your background but, my guess is you would like to avoid complexity and keep things simple. Personally here's what I would recommend:
- Write a simple command-line application, in the style of a Unix filter, in OCaml that takes 'requests' in the form of plain text on standard input and prints its calculation result to standard output. E.g., to take an input of add 2 2 and output 4, it could work like this:
$ echo 'add 2 2' | my_calculator.exe
4
- Next, use websocketd to wrap your calculator tool and serve it over WebSocket, which is a standard Web technology that allows clients to continuously talk to servers (2-way communication). So, clients could send a plain text command like add 2 2 (note, exactly the same as you would have on the command line), and get back a response 4.
- Finally, write a web application (just some HTML and JavaScript) that connects to the WebSocket server from step (2) and sends and receives messages. Here is an example of that: https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API/Writing_WebSocket_client_applications
The reason I am recommending this strategy, is to let you start small and simple, and skip over much of the complexity of dealing with modern web application development. You can focus on writing your calculator as a simple command-line tool, and 'outsource' the server part to a specialized tool.
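The first step, the command-line filter, could be sketched along these lines. Only the add 2 2 request and its reply 4 come from the post; the function names, the extra sub operation and the error messages are assumptions made for illustration.

```ocaml
(* Evaluate one request line such as "add 2 2".
   "sub" is an illustrative extra operation, not part of the original post. *)
let eval line =
  match String.split_on_char ' ' (String.trim line) with
  | [ op; a; b ] -> (
      match (op, int_of_string_opt a, int_of_string_opt b) with
      | "add", Some x, Some y -> Ok (x + y)
      | "sub", Some x, Some y -> Ok (x - y)
      | _, Some _, Some _ -> Error ("unknown operation: " ^ op)
      | _ -> Error "arguments must be integers")
  | _ -> Error "expected: <operation> <int> <int>"

(* Turn a request into the single line of text sent back to the client. *)
let respond line =
  match eval line with
  | Ok n -> string_of_int n
  | Error msg -> "error: " ^ msg

(* Read requests from standard input until end of file, one reply per line.
   print_endline flushes after each reply, so a wrapper reading our output
   sees every answer immediately. *)
let rec loop () =
  match input_line stdin with
  | line ->
    print_endline (respond line);
    loop ()
  | exception End_of_file -> ()

let () = loop ()
```

Compiled to my_calculator.exe, this answers 4 to echo 'add 2 2' | my_calculator.exe. It uses plain integers to stay short, which is not suitable for money.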
Final thought: if you are working on a financial wellness tool, you almost certainly need decimal arithmetic (as opposed to binary arithmetic from OCaml's built-in float type). You will want to use a decimal package like https://github.com/janestreet/bigdecimal , or (disclaimer: mine) https://opam.ocaml.org/packages/decimal/ .
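The problem with binary floats is easy to reproduce. The sketch below does not use the API of either package mentioned above; as a stand-in it represents amounts as an integer number of cents, which is an assumption made purely to keep the example dependency-free, and the names cents, add and to_string are hypothetical.

```ocaml
(* 0.1, 0.2 and 0.3 have no exact binary representation, so this is false. *)
let float_sum_is_exact = 0.1 +. 0.2 = 0.3

(* Hypothetical stand-in for a decimal type: an amount in whole cents. *)
type cents = int

(* Integer addition is exact (ignoring overflow). *)
let add (a : cents) (b : cents) : cents = a + b

(* Format an amount with two decimal places, e.g. 30 -> "0.30". *)
let to_string (c : cents) =
  let sign = if c < 0 then "-" else "" in
  Printf.sprintf "%s%d.%02d" sign (abs c / 100) (abs c mod 100)
```

Here add 10 20 is exactly 30 cents, whereas the float comparison above fails. A real decimal library additionally handles rounding, division and arbitrary precision, which integer cents do not.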
Good luck!
😷 Marcus Rohrmoser also replied
@peter I think I'm doing something similar – a simple web tool for geographic calculations from character sequences called geohash to gps coordinate pairs and vice versa. Here it is: https://demo.mro.name/geohash.cgi/u154. You'll find the source there, too.
Key is, I scale towards n=1, need no state.
The backend is <200 LOC to handle all the http stuff (there isn't much, no auth, no state, no cookies) and another ~100 LOC for the actual computation.
1 dependency, no 'modern' web toolkit, no client libs/frameworks, no concurrency
wasmtime 0.0.1: lightweight WebAssembly runtime
Laurent Mazare announced
We just released a first version of a package providing OCaml bindings to the wasmtime WebAssembly runtime. The package is available on opam and can be found in this GitHub repo. It can be used to run .wasm modules in an OCaml process, including modules making system calls through WASI.
For now, the package only provides a low-level api closely matching the Rust implementation. We intend to provide a higher level api on top of this.
The GitHub repo contains various examples in the tests directory which reproduce some examples from the main wasmtime repo.
Feedback/issue reports are very welcome!
First release of Lwt-exit
Raphaël Proust announced
On behalf of Nomadic Labs, I'm happy to announce the first release of Lwt-exit, a small opinionated library to cleanly handle exits and signals in applications that use Lwt.
The library is available through opam: opam install lwt-exit,
hosted on gitlab: https://gitlab.com/nomadic-labs/lwt-exit,
distributed under the MIT license: https://gitlab.com/nomadic-labs/lwt-exit/-/blob/master/LICENSE
and the documentation is available online: https://nomadic-labs.gitlab.io/lwt-exit/
This library is used in the Tezos codebase to clean up system resources (flush buffered writes, cleanly close p2p connections, etc.) during exits. It is also used to attach signal handlers (both for interactive use via Ctrl+C and for daemonisation via systemctl).
Old CWN
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.