OCaml Weekly News


Hello

Here is the latest OCaml Weekly News, for the week of July 11 to 18, 2023.


Combinaml.0.1 released - a customizable parser combinator library

traviss announced

This is my first public ocaml package. Please let me know if you have any feedback or advice on how I can improve it.

https://github.com/travisstaloch/combinaml

It’s meant to be similar to angstrom, except more easily customizable and with a few API differences here and there which I found useful.

I was using angstrom to parse PEG grammars and ran into issues I could only solve by adding a tokenization step. This seemed messy, so I decided to write this library and found a solution which uses an `until` parser like this:

let definition =
  lift2 pair (ident_str <* leftarrow)
    (until (ident_str <* leftarrow) expression)

let grammar =
  spacing *> many1 definition <* is_end_of_input >>| fun defs -> Grammar defs

It was just merged into ocaml/opam-repository a couple of hours ago and doesn’t seem to show up in opam search yet. I’m sure it’s a caching issue and it will be there soon.

traviss later added

Help us Make the New Learn Area on OCaml.org Awesome

Sabine Schmaltz announced

We’ve been working on an overall outline for “an ideal state” of the Learn area. I took the current work in progress and updated the Miro board at https://miro.com/app/board/uXjVM4HlreI=/?share_link_id=39374874841. For many of the topics, we now propose tentative titles of documents relating to them. Some topics are still in need of titles.

Here’s an image of the state at the time I’m making this post:

[Image: snapshot of the Miro board outlining the Learn area]

Note: I have to re-read the previous answers on this thread to make sure this captures the additional topics that have been brought up.

Now… what would be very interesting to me:

  1. Among these proposed documents, which ones are the most important to you?
  2. Which ones do you see as least important?
  3. Is there something you feel is missing? What would be the title of the missing document and where does it belong?

Later on, SayoBams asked and Claude Jager-Rubinson replied

Hi everyone, I’m working on implementing the changes for the new Learn Area and I’m currently looking at the books area. It made me wonder… do you know of any excellent new books that are not yet on https://ocaml.org/books?

Wang and Zhao (the team behind Owl) have an advanced book: https://link.springer.com/book/10.1007/978-1-4842-8853-5

https://caml.inria.fr/pub/docs/oreilly-book/html/index.html is a bit dated but excellent.

The Little MLer is a fantastic introduction to functional programming and, in particular, thinking recursively. It’s in SML but includes a page on translating the examples to OCaml.

I also can’t recommend highly enough The Functional Approach to Programming by Cousineau and Mauny, which uses Caml (no object system yet!). But it clarified many concepts that I hadn’t previously grokked. One of the huge advantages of OCaml is its stability and I think all of the examples still worked.

Maybe not so relevant for an OCaml books page but I also read the classic ML books when I was learning OCaml and they were a HUGE help: Paulson’s ML for the Working Programmer is phenomenal and I also recall Elements of ML Programming and Introduction to Programming using SML being helpful. SML and OCaml are close enough that both concepts and syntax readily transfer from the former to the latter.

T-Digest library

Simon Grondin announced

GitHub link

This is just a minor release of a pandemic project that I never announced on discuss.ocaml.org. This library is “Complete”. There are no known bugs and no known missing features.

The T-Digest has become fairly well known in the last few years, but in short:

  • it’s a lossy data structure that allows the user to (very) accurately approximate percentiles and p_ranks without having to keep the entire sorted dataset in one place.
  • the user can combine multiple T-Digests just by concatenating them, and this can be done in the database itself!
  • both querying and insertion are blazing fast
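
To make the two queries concrete, here is a minimal exact (non-lossy) sketch of what a percentile and a p_rank are, computed from the full dataset. This is not the library’s API or algorithm: the names `percentile` and `p_rank` are hypothetical, and a T-Digest would approximate these answers from a compact summary instead of the whole sorted dataset.

```ocaml
(* Exact reference computations over the full dataset.
   Hypothetical helper names; a T-Digest approximates these results
   without keeping every data point. *)

(* Nearest-rank percentile: the smallest value such that at least a
   fraction [p] (in 0.0 .. 1.0) of the data is less than or equal to it. *)
let percentile (data : float array) (p : float) : float =
  let sorted = Array.copy data in
  Array.sort compare sorted;
  let n = Array.length sorted in
  if n = 0 then invalid_arg "percentile: empty data";
  let rank = int_of_float (ceil (p *. float_of_int n)) in
  let idx = max 0 (min (n - 1) (rank - 1)) in
  sorted.(idx)

(* Percentile rank: the fraction of the data less than or equal to [x]. *)
let p_rank (data : float array) (x : float) : float =
  let n = Array.length data in
  if n = 0 then invalid_arg "p_rank: empty data";
  let count_le acc v = if v <= x then acc + 1 else acc in
  let below = Array.fold_left count_le 0 data in
  float_of_int below /. float_of_int n
```

Both functions need the entire dataset in memory (and `percentile` sorts it), which is exactly the cost a lossy summary is meant to avoid.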

facebook/infer has been using it for a few years, and I know it’s also used in a few closed-source projects elsewhere.

All comments and feedback are welcome! I hope this library proves useful to the OCaml ecosystem as a whole.

The future of OCaml, 2023 edition?

Masanori Ogino announced

The Future of OCaml page on OCamlverse needs some love, considering that an issue on GitHub from 2021 is still relevant. Although I have just posted what I am aware of on the GitHub issue, some of you should know even more. The page is the top result when you search “Future of OCaml” on Google, so leaving the page outdated will affect the impression of the language negatively. Shall we improve it?

OCaml-RDF 0.15.0

Zoggy announced

A new release of OCaml-RDF is available: https://www.good-eris.net/ocaml-rdf/posts/ocaml-rdf-0.15.0.html

This release includes new modules:

  • Rdf.Activitystreams defining the activitystreams/activitypub vocabularies,
  • Rdf.Nq to read and write the N-quads format.

A new package, rdf_json_ld, implements part of the JSON-LD API: context processing, expansion, deserialization to RDF, and serialization to JSON-LD. Note that serialization produces a flat JSON-LD document rather than implementing the algorithm from the recommendation. It’s worth noting that the specification seems to have been written by JavaScript developers with little notion of typing. Moreover, the JSON-LD format is far more complicated (and undoubtedly more energy-consuming) than simpler formats such as XML, Turtle or N-quads. I therefore advise against its use (but several activitypub servers seem to communicate only with this format…).

Packages rdf, rdf_ppx, rdf_json_ld, rdf_mysql, rdf_postgresql, and rdf_lwt are available in opam.

OCaml.org Newsletter: June 2023

Thibaut Mattio announced

Welcome to the June 2023 edition of the OCaml.org newsletter! As with the previous update, this has been compiled by @sabine and @tmattio.

The OCaml.org newsletter provides an overview of changes on the OCaml.org website and gives you a glimpse into what has been going on behind the scenes. You can find a list of previous issues here.

Our goal is to make OCaml.org the best resource for anyone who wants to get started and be productive in OCaml. We couldn’t do it without all the amazing OCaml community members who help us review, revise, and create better OCaml documentation. Your feedback enables us to better prioritise our work and make progress towards our goal. Thank you!

We present the work we’ve been doing this month in three sections:

  • Learn Area: We’re working towards making OCaml.org a great resource to learn OCaml and discover its ecosystem. This month, we continued working on the wireframes and designs of the new Learn area. We also focused on writing the new documentation with a couple of tutorials on Dune and S-Expressions.
  • Governance Page: The OCaml Platform team is working towards making the decision-making processes and ongoing development more transparent and community-driven (including the work on the OCaml Platform roadmap). To support the initiative, we’re working on a governance page that lists the teams and maintainers of the OCaml organisation.
  • General Improvements: As usual, we also worked on general maintenance and improvements and we’ve highlighted some of them in this newsletter.

Learn Area

  • 1. Redesign of the Learn Area

    Last month, we started working on the wireframes and the designs for the new Learn area, based on user feedback.

    This month, we made amendments to the wireframes and designs for the landing page in the learning area and subsequently created the wireframes for other necessary pages, namely “Get Started,” “Language,” “Tutorials,” “Exercises,” “Books,” and “Search Results.” We also held an interactive session with the OCaml.org team to review and rework the wireframes.

    At the end of the month, we also shared the updated designs to get feedback from the community.

    The work-in-progress designs are accessible on Figma.

    Next month, we’ll continue to improve the designs based on the feedback we received, and we’ll start sending Pull Requests to implement the UI.

  • 2. OCaml Documentation

    In addition to a complete redesign of the Learn area, our work involves a full revision of the documentation content, as well as the creation of new documentation.

    Last month, we completed the Sequences and Error Handling tutorials.

    This month, we held a workshop on writing new documentation with the OCaml.org team in order to kickstart the creation of many more documentation pages. The collaboration to write outlines for the new tutorials proved to be helpful, so we plan to hold regular workshops. We’re also planning to open these workshops to the community. Stay tuned!

    We created an entirely new tutorial on “File Manipulation” that is going to enter the community review phase soon. In addition, we worked on a new “Dune” tutorial and a new “S-Expressions” tutorial, and we created outlines for “Basic Datatypes” and “Values & Functions” tutorials.

  • 3. “Is OCaml X Yet?” Pages

    As part of our work on the new Learn area, we started exploring the addition of “Is OCaml X yet?” pages, inspired by Rust’s excellent “Are we web yet?” page.

    As stated in the Pull Request, the goal of these pages is three-fold:

    • For newcomers, it offers an overview of the usability of OCaml for certain applications.
    • For OCaml users, it can help the discovery of libraries and frameworks to perform certain tasks.
    • For community members, it can serve as a roadmap to focus our efforts on addressing specific pain points to make OCaml competitive with other languages for specific use cases.

    We’ve engaged the community and authors of packages related to web development, and we received excellent feedback on the Pull Request.

    Next, we plan to focus the work on a single “Is OCaml Web Yet?” page and tackle other pages separately. We’ll continue to explore the ecosystem and merge an initial version of the page that we’ll aim to continuously improve to reflect the state of web development in OCaml.

  • 4. Preparing the Move of the opam Documentation to OCaml.org

    We worked on a patch that moves the opam documentation under the “Platform Tools” page in the Learn area.

    The intent behind this is to retire the public-facing website at opam.ocaml.org, now that we have a centralised directory for package documentation on ocaml.org.

    The long-term plan for the opam manual is to generate it via the package documentation pipeline. However, to realise this, the opam manual needs to be ported to odoc. As seen in the OCaml Platform newsletter, the odoc team is currently working on improving odoc’s capabilities to create rich and easily navigable manuals.

Towards a More Transparent Governance For OCaml

In May, we merged a PR that extends the OCaml.org governance policy to include the governance of the OCaml Platform, including its lifecycle and the requirements for each stage.

This month, we worked on a new governance page that lists the teams and maintainers of the ecosystem.

The main challenge is to list the maintainers of each project accurately, going forward. To that end, we’re discussing using GitHub teams to get an up-to-date list of maintainers for each project.

General Improvements

A lot of work went on general maintenance and improvements this month!

Have a look at the list of relevant PRs and activities below for our highlights.

Relevant PRs and Activities:

  • We designed a banner for the OCaml home page and announced the ACM SIGPLAN award that OCaml received. – #1327
  • We began investigating how to load packages into the OCaml Playground.
  • We now recognise and display a Long-Term-Support version of OCaml (currently 4.14.1) on the main landing page, and the releases section has been moved from the Learn area to the main landing page. – #1277 & #1313
  • We added 55 RSS feeds from v2.ocaml.org to the blog aggregator on ocaml.org and discovered some faulty URLs in two of them. – #1329
  • We made a bit of progress towards a dark mode for ocaml.org by tidying up the Tailwind configuration, giving colors more semantic names, and factoring out repeated HTML into components. – #1350
  • We began working on enabling filtering by tags for blogs on ocaml.org. We sought community input on preferred filters/tags.
  • We worked on refining the documentation pipeline, specifically the tool voodoo, by removing dead legacy code and optimising the process for detecting README, LICENSE, and CHANGELOG files, with the aim of reducing the number of HTTP requests that ocaml.org makes to docs-data.ocaml.org.
  • A new broken link checker tool, tarides/olinkcheck, has been created. Efforts to integrate the tool with the package documentation pipeline are in progress, and a workflow that runs tarides/olinkcheck has been added to the GitHub repository. The tool extracts hyperlinks from documents in the supported formats (plaintext, S-expressions, YAML, and HTML) and checks whether each URL responds with an HTTP status 200. – #1345
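
As a rough illustration of the extraction step for the plaintext case, here is a minimal sketch using only the standard library. It is not the implementation of tarides/olinkcheck: the function `extract_urls` is hypothetical, it only recognises `http://` and `https://` prefixes, and the HTTP status check itself is left out.

```ocaml
(* Hypothetical sketch: collect http(s) URLs from plain text by scanning
   for a scheme prefix and reading up to the next whitespace character.
   The real tool's extraction rules and HTTP checks are not reproduced. *)
let is_space c = c = ' ' || c = '\n' || c = '\t' || c = '\r'

(* True when [prefix] occurs in [s] starting at index [i]. *)
let starts_with_at s i prefix =
  let lp = String.length prefix in
  i + lp <= String.length s && String.sub s i lp = prefix

let extract_urls (text : string) : string list =
  let n = String.length text in
  let rec scan i acc =
    if i >= n then List.rev acc
    else if starts_with_at text i "http://"
         || starts_with_at text i "https://" then begin
      (* Advance j to the end of the whitespace-delimited token. *)
      let j = ref i in
      while !j < n && not (is_space text.[!j]) do incr j done;
      scan !j (String.sub text i (!j - i) :: acc)
    end
    else scan (i + 1) acc
  in
  scan 0 []
```

A scanner this simple keeps trailing punctuation as part of the URL, so a real checker would need format-aware extraction before issuing any requests.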

Thibaut Mattio then added

This is an excellent opportunity to thank Ahrefs for their support as they provide a free account for OCaml.org!

The team has used it numerous times to improve the site’s quality, including fixing broken links, error pages, bad HTML formatting, etc. It’s been invaluable for identifying how we can improve the site. If you’ve noticed improvements in the site’s search engine ranking (for instance, https://ocaml.org/p/base/v0.15.0 now ranks higher than https://opam.ocaml.org/packages/base/ for me, which wasn’t the case a few months ago), it’s in large part thanks to Ahrefs, who gave us the tools to improve SEO.

The goal of the broken link checker is to have something that can be integrated more easily into ocaml.org’s CI. It allows us to have the workflow on display in the last PR: https://github.com/ocaml/ocaml.org/pull/1354

Day of the Camel 2023: OCaml in Academia and Industry (online, 20 July 2023)

Roberto Blanco announced

We (I and my co-conspirator Ricardo Rodríguez) are organizing a new edition of our erstwhile one-day hybrid workshop dedicated to the OCaml programming language and its industrial users. We will have talks and discussions with members of the OCaml development team, as well as companies using the language to solve complex and interesting problems.

Once again, the objective is to present a broad picture of the OCaml ecosystem and, more widely, of functional programming (and related techniques) as a viable and powerful choice for building correct and reliable computer systems. This is done as part of the second edition of an OCaml summer school, hosted this year again by the University of Zaragoza, and (again) generously sponsored by the OCaml Software Foundation.

Participation is free and open to everyone. We will stream the workshop on Zoom (https://us06web.zoom.us/j/89373710207?pwd=ZGZuSnBFWEhSc2UzNnpSbWF4d0hzZz09, passcode: 247844) and Twitch. Here’s the preliminary schedule; additional information and updates can be found on our website.

20 July 2023, all times CEST (UTC+2)

Morning – Language session

  • 09:00-09:30: Carmen Lazo and José Merseguer (University of Zaragoza) – Welcome reception
  • 09:30-10:30: Florian Angeletti (Inria) – The OCaml project and ecosystem
  • 10:30-11:00: Coffee break
  • 11:00-12:00: OCaml developers – Round table and Q&A

Afternoon – Industry session

  • 14:30-14:55: Vincent Balat (Tarides / Be Sport) – Building functional systems / Social network for sports
  • 14:55-15:20: Javier Chávarri (Ahrefs) – Petabyte-scale web crawler
  • 15:20-15:45: Raphaël Proust (Nomadic Labs) – Tezos blockchain development
  • 15:45-16:10: Chris Casinghino (Jane Street) – Large-scale quantitative trading
  • 16:10-16:30: Coffee break
  • 16:30-17:30: Industrial users – Round table and Q&A

We look forward to seeing you there. Feel free to join, participate and distribute!

fuzzy_compare

Simon Grondin announced

A few days ago I posted about the T-Digest library. Today I’m back with another small algorithmic library:

-> Github link

You’re probably familiar with the Levenshtein distance: the number of single character edits (additions, deletions, replacements) between 2 strings.

Calculating the Levenshtein distance is famously more expensive than a simple equality check.

Instead of calculating the distance, this library returns whether 2 values are within D distance of each other (bool). There has been substantial development on the topic of Levenshtein automata in the last decade. See the “Fast String Correction with Levenshtein-Automata” paper by Klaus Schulz and Stoyan Mihov.

Using the graph construction technique from the paper, plus a few ideas from this article and several additional optimizations of my own, this library can answer the question (“are these 2 values within D edits of each other”) in the 1-10µs range, scaling linearly with the length of the values.

  • a Functor is provided to enable comparisons across any arbitrary types
  • string comparisons are provided (functorized) out of the box
  • reuse the same automaton across all comparisons with the same max_edits, regardless of the type of the values being compared
  • max_edits must be between 0 and 3 (inclusive) due to the astronomical scaling factor during graph building
  • most comparisons take under 5 µs, depending on the length of the values

It’s fast enough that it can be used instead of String.equal for some tasks and/or on large datasets.
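
To illustrate the question being answered (“are these 2 values within D edits of each other?”), here is a minimal sketch on strings using plain dynamic programming with an early exit. It is not the Levenshtein-automaton technique described above and not this library’s API; the function name `within_distance` is hypothetical.

```ocaml
(* [within_distance ~max_edits a b] is true when the Levenshtein distance
   between [a] and [b] is at most [max_edits].
   Row-by-row dynamic programming with an early exit; hypothetical helper,
   NOT the automaton-based approach used by the library. *)
let within_distance ~max_edits (a : string) (b : string) : bool =
  let la = String.length a and lb = String.length b in
  (* The length difference is a lower bound on the distance. *)
  if abs (la - lb) > max_edits then false
  else begin
    (* prev.(j): distance between the prefix of [a] handled so far
       and the first j characters of [b]. *)
    let prev = Array.init (lb + 1) (fun j -> j) in
    let cur = Array.make (lb + 1) 0 in
    let exceeded = ref false in
    let i = ref 1 in
    while (not !exceeded) && !i <= la do
      cur.(0) <- !i;
      let row_min = ref cur.(0) in
      for j = 1 to lb do
        let cost = if a.[!i - 1] = b.[j - 1] then 0 else 1 in
        let v =
          min (min (prev.(j) + 1) (cur.(j - 1) + 1)) (prev.(j - 1) + cost)
        in
        cur.(j) <- v;
        if v < !row_min then row_min := v
      done;
      (* Row minima never decrease, so once a whole row is over budget
         the final distance must be over budget too. *)
      if !row_min > max_edits then exceeded := true;
      Array.blit cur 0 prev 0 (lb + 1);
      incr i
    done;
    (not !exceeded) && prev.(lb) <= max_edits
  end
```

For example, "kitten" and "sitting" are 3 edits apart, so the check succeeds with `~max_edits:3` and fails with `~max_edits:2`. Each call to this sketch costs time proportional to the product of the two lengths, whereas the library reports scaling linearly with the length of the values.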

All comments and feedback are welcome! I hope this library proves useful to the OCaml ecosystem as a whole. I’ll be back in a few days with a special algorithmic library to complete this little trilogy.

Ppxlib dev meetings

Sonja Heinze announced

Hello :wave: The `ppxlib` July dev meeting is tomorrow Tue, July 18th, at 6pm CET. Here’s what’s on our agenda so far:

  • OMP:
    • Do we “stop maintaining it” or do we add OCaml 5.1 support?
  • Ppxlib - OCaml trunk compatibility:
    • Currently, there’s no compatibility due to an ocaml-compiler-libs build problem. Who’s affected?
  • Ppxlib’s general maintenance:
    • OCaml 5.1 support: The bug fix around generative functor applications is being worked on.
    • We’re not in a hurry to bump the AST this time.
    • A few pending reviews on Ppxlib. What’s the best strategy for reviews / reacting to non-urgent issues now that we’re in “minimal maintenance mode”?
    • Is there anything else that will come up before September?
  • OCaml workshop 2023:
    • Recap on why our talk proposal on Ppxlib has been rejected.
  • Outreachy internship on Ppxlib:
    • How is it going? :heart:

We’re always happy to add things, if anyone is interested in anything else.

Explorations on Package Management in Dune

Continuing this thread, Thibaut Mattio said

Thank you @gasche for your interest and input!

Expanding on what @rgrinberg and @rjbou mentioned earlier, there are no intentions to phase out the `opam` client. As a matter of fact, the Dune team has currently forked opam and is patching the opam libraries, with the ultimate goal of merging them back upstream once the libraries exposing the necessary APIs for Dune package management appear to be stable. You can see some of this work in progress in pull requests like #5568, #5508, #5498, #5496, #5452, and so on.

As for the opam repository, you understood correctly that there is absolutely no plan to deprecate it, or even to make large changes to it in the context of Dune package management. The goal is for Dune package management to be 100% compatible with the opam repository.

To expand on a slightly divergent path and talk about the role of the OCaml Platform: the Platform essentially mirrors the state of the world. For the opam client to become deprecated in the Platform, it would need to become the de-facto reality first. While it might be that the opam client (as for any other tool) enters a maintenance mode and eventually becomes deprecated, that seems unlikely for now, given the number of users who are relying on the opam client. That being said, if and when that happens, the Platform’s role will be to make sure that there is a smooth transition path for users, and that’s something that will require careful planning and discussions. All of which is entirely out of scope for the initial release of package management in Dune.

On a different note, following a discussion with @dra27, opam switches are architected around findlib/ocamlfind. Dune package management presents an alternative solution to achieve the same result. As you point out, it’s not meant to be reusable between workflows: the opam packages Dune compiles for your project are intended for Dune’s internal use during its build, not for external use with the shell. This could be viewed as a parallel to how opam builds switches for opam exec (with eval $(opam env) serving as a convenient shortcut). So, you can think of Dune package management as performing a similar function but specifically for dune exec.

Now, there is the question of how to make sure this doesn’t create confusion and hurt adoption. But that’s not something that’s specific to Dune and opam. In fact, if anything, it makes the Platform more cohesive: odoc, ocamlformat, merlin, utop and mdx are all tools that work well independently, but with which you don’t need to interact if you use Dune. Dune has grown as the frontend of the Platform and the integration with opam is another step in this direction, not something very new if we look at what’s being done with the other tools. And this is the best of both worlds: as a power user, you’re free to use each tool independently and you’re not locked in, but as a newcomer or even as a power user who’s happy with the default experience, you can just use Dune.

gasche replied

Reading the opam package management RFC gives me the impression that the situation with the `opam` client is in fact rather different from the situation with other tools, because Dune is in the process of reimplementing large parts of this logic internally, instead of delegating to opam, because you want a tighter integration into Dune internals than a pure-delegation model allows. As far as I know, dune is not reimplementing logic from merlin or odoc or ocamlformat.

I tried to make uninformed guesses at which part of the opam client responsibilities Dune would reuse and replace in my post above. My best guess as to what part you would reuse in the long term (of the opam client responsibilities) is “parsing and programmatic understanding of opam files”. Are there other important ones that I missed?

Another consequence of this design is that the new features which are planned, and are indeed quite nice, will be specific to projects that use dune for package management. The plan is for Dune to provide, for example, good support for incremental rebuilding (when dependencies change), caching (of package artifacts across independent projects), a nice local-switch-first command-line UI with lockfile integration by default, but also editor support (building package dependencies from the IDE directly). None of those features are planned for people using the opam client – if I understand correctly. Some of those features (in particular incremental rebuilding) are clearly in the ballpark of a build system and reimplementing within Dune makes a lot of sense. But for some others, for example the latter three in my list, adding them to the opam client would also have been a possible approach, but you chose to work within Dune instead.

This is also the root of my question on whether the long-term strategy is to keep two tools/codebases alive, or just one. For ocamlformat or odoc, it wouldn’t make sense to ask whether odoc will disappear once dune gets first-class documentation support. For the opam client and package management, it does.

Anil Madhavapeddy then said

This is also the root of my question on whether the long-term strategy is to keep two tools/codebases alive, or just one.

My longer term view has always been that we should focus on having well-specified file formats that our tools use, and let many domain specific tools that operate over that file metadata bloom. The reason for this is that files that are checked into a project have a habit of sticking around for the long-term (or forever, if you consider historical releases), whereas tools naturally evolve and perish.

The only thing necessary to publish something “into the OCaml community” (that is, something that shows up on a package search on the website) is a tarball with an opam file in it. This opam file specifies interdependencies and a build plan. We have, as of just now, 28296 of these checked into the central opam repository. The vast majority of those packages can be downloaded, extracted, and an installation plan generated simply by looking at the local opam file in the tarball and the central collection of them that represent potential dependencies (the opam repository).

Over the years, we’ve had many build tools spring up: OCamlMakefile, omake, ocamlbuild, oasis, b0, ninja, and jbuilder/dune. What makes dune so interesting from a long-term perspective is that the checked in dune file is also separately versioned, so that it should (with a sufficiently good specification) be possible to analyse the build logic of a repository just by examining it. With most of the other build systems, you needed to run an executable to get a build plan (notably with ocamlbuild, and even with oasis running over ocamlbuild), which tightly couples it to a particular tool. That’s why I’ve been so resistant to the idea of publishing opam packages which do not include a generated opam file (even if it’s autogenerated from dune), since you then lose the property of simply being able to examine a published artefact to determine how to build it.

What other file formats do we have in common platform tools? We used to have .merlin files, but they’re autogenerated now from a dune build plan in most cases. There are .ocamlformat files, mostly in a k/v format. Generally speaking, we’ve been pretty good at promoting and exposing metadata in an opam or dune file and not having too much of a proliferation of other files.

What tools operate over opam files?

Given that an opam file exists, what tools can actually run over them?

  • opam.exe - the main CLI client, which exposes an excellent command-line interface to avoid having to parse them directly.
  • opam-0install-solver - implements a much-simplified version of the solver to do ’one-shot’ solutions that do not need to take existing packages into account.
  • (upcoming) the dune integration, which will also use opam files (and repositories) to perform source fetching operations. Notably, this also allows dune to generate build plans for non-dune packages, which was not possible before.
  • And others, like lsp-server, can also use these checked in files to perform editor-driven operations.

Do we actually need a CLI?

One key architectural difference between build systems and package managers is how stateful they are: build systems usually maintain very little outside of their build tree, whereas package managers (especially opam) have a lot more.

So why do we actually still need an active CLI? The zero-configuration ocaml-ci is a step towards showing that we don’t need anything beyond files that are checked into a source code repo! Consider the following operations, and mappings to how to do them by modifying files and having a background worker process watching for file changes:

  • opam install: Edit the opam file to add a new dependency, and then the background watcher can transactionally install it into a local switch.
  • opam pin: Edit the opam file to add a pin-depends.
  • opam remove: Edit the opam file to remove a dependency.
  • opam remote add: We don’t currently offer an official way to check in which opam repositories a project depends on. We could use an x-opam-repos extension field and establish a standard.
  • opam switch: Edit the dune-workspace file to register a new local switch for a project with a compiler version.
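
As a thought experiment only, the first and third mappings above can be sketched as a pure function: given the dependency names before and after an edit to the opam file, derive what a background watcher would have to install or remove. No such watcher is described in detail here; the type `plan` and the function `plan_of_edit` are hypothetical, and version constraints, solving, and the actual switch operations are left out.

```ocaml
(* Hypothetical sketch: the action plan a file watcher might derive when
   the dependency list of an opam file is edited. Package names only;
   no version constraints, solver, or real installation is involved. *)
module S = Set.Make (String)

type plan = {
  to_install : string list; (* names added by the edit *)
  to_remove : string list;  (* names dropped by the edit *)
}

let plan_of_edit ~(before : string list) ~(after : string list) : plan =
  let b = S.of_list before and a = S.of_list after in
  { to_install = S.elements (S.diff a b);
    to_remove = S.elements (S.diff b a) }
```

Because the plan is computed from the two file states alone, the same function could run in an editor integration or in CI without any extra local state, which is the property the paragraph below argues for.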

Storing all project state in existing metadata files has huge workflow advantages: it means you can statelessly build a project without having to reconstruct local pins/switches for others, which in turns means that CIs like ocaml-ci “just work” when you push the code remotely! It also makes the act of releasing a package much easier, since you can just remove pins/overrides progressively with help from local editors and global CI tools. It also works really well in a monorepo workflow.

The purpose of this little segue is to demonstrate why I think well-specified and versioned file formats are more important than tools, since you can then build the right tool to solve your particular workflow problem for a given context. And to go back to @gasche’s original question, I don’t think we should be thinking about the dune and opam projects/codebases merging, but rather what elements of their respective codebases should be focussed on to allow more interoperability between tools for their respective file formats.

Some possible considerations:

  • solvers: the full solver libraries are quite heavyweight (and subjectively, overly complex C++ based solvers), but opam-0install is a lovely alternative if single-shot solutions are all that is required. Can these be made more accessible and embeddable in other CLIs (initially dune, but also LSP and whatever else wants to solve for version constraints)?
  • repositories: how can we manipulate opam repositories in a more unified way? Right now they are just a collection of files, but we do need to figure out how to move older packages out of the way, but still retain the ability to install them on demand. This is a top priority for the opam-repository maintainers, and presumably will become a problem for other downstream users such as the coq-opam-repository maintainers as they hit scale issues as well.
  • more formal specifications: if we view tools as interpreters over DSLs (the opam and dune files), then why aren’t we formally specifying these better? After all, we have close to 30000 of them published now by thousands of us across 12 years! And we need to interoperate with other distributions and their package managers. Wouldn’t it be lovely to be able to install opam packages from within Debian, or even other multi-version package managers like Pub.dev?

For dune, you can conduct a similar thought experiment, but the most obvious interop point is to take dune files and embed any OCaml project within a larger build system like Bazel or Buck2, without having to write any manual bridging rules.

I’m sketching out my thoughts on the opam repository management roadmap next, but I’d be delighted to hear more about others’ thoughts on what new tools you’d build over sufficiently well specified dune or opam file formats…

Moonpool 0.3

Simon Cruanes announced

:wave: dear OCaml aficionados,

Moonpool 0.3 was just released on opam. Moonpool is a new concurrency library for OCaml >= 4.08, with support for OCaml 5 from the get-go. It started out with a thread pool (possibly distributed on multiple domains to be able to use multiple cores) along with a future/promise module.

This release comes with a set of new features on top of pool+futures:

  • a small 'a Lock.t abstraction to protect a resource with a lock in RAII-style
  • a type of unbounded channels (which are fairly naive in implementation)
  • improvements to Pool such as Pool.run_wait_block: (unit -> 'a) -> 'a that runs a whole computation on the pool, and waits for its result (or re-raises)
  • add Fut.await (only on OCaml 5)
  • add support for domain-local-await if installed
  • a Fork_join module for, well, fork-join parallelism, including parallel for and parallel List.map/Array.map. These computations can be nested and “feel” like writing code in a direct style. This relies on effects and is only available on OCaml 5.
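
For readers unfamiliar with the RAII style mentioned above, or with the `let@` operator that appears in the examples further down, here is a minimal sketch using only the standard library. It does not use Moonpool: `with_resource` and `log` are hypothetical names introduced for illustration, and Moonpool's actual Lock API may look different.

```ocaml
(* [let@ x = f in body] is sugar for [f (fun x -> body)]. *)
let ( let@ ) = ( @@ )

(* Hypothetical trace of acquire/release events, for illustration only. *)
let log : string list ref = ref []

(* Hypothetical scoped resource: acquire, run [f], always release,
   even if [f] raises. *)
let with_resource name f =
  log := ("acquire " ^ name) :: !log;
  Fun.protect
    ~finally:(fun () -> log := ("release " ^ name) :: !log)
    (fun () -> f name)

(* The resource is only available inside the scope of [let@]. *)
let result =
  let@ r = with_resource "counter" in
  String.length r

let () =
  assert (result = 7);
  assert (List.rev !log = [ "acquire counter"; "release counter" ])
```

The release step runs when the scope ends, which is the property a "with"-style function such as a pool or lock wrapper needs.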

Examples for fork-join

The (too) classic parallel fibonacci function:

open Moonpool
let (let@) = (@@)

let rec fib_direct x =
  if x <= 1 then
    1
  else
    fib_direct (x - 1) + fib_direct (x - 2)

let rec fib x : int =
  (* some cutoff for sequential computation *)
  if x <= 18 then
    fib_direct x
  else (
    let n1, n2 =
      Fork_join.both
        (fun () -> fib (x - 1))
        (fun () -> fib (x - 2))
    in
    n1 + n2
  )

let fib_40 : int =
  let@ pool = Pool.with_ ~min:8 () in
  Pool.run_wait_block pool (fun () -> fib 40)

A parallel sum, from a test case:

let () =
  let total_sum = Atomic.make 0 in
  Pool.run_wait_block pool (fun () ->
      Fork_join.for_ ~chunk_size:5 100 (fun low high ->
          (* iterate on the range sequentially. The range should have 5 items or less. *)
          let local_sum = ref 0 in
          for i = low to high do
            local_sum := !local_sum + i
          done;
          ignore (Atomic.fetch_and_add total_sum !local_sum : int)));
  assert (Atomic.get total_sum = 4950)

Note that Fork_join.for_ gives its functional argument a range to process, the size of which is controllable with the optional chunk_size. This allows for large values to be passed to for_ without starting as many tasks, as demonstrated below:

Computing digits of π:

let my_pi : float =
  let@ pool = with_pool () in

  let num_steps = 100_000_000 in
  let num_tasks = Pool.size pool in

  let step = 1. /. float num_steps in
  let global_sum = Lock.create 0. in

  Pool.run_wait_block pool (fun () ->
      Fork_join.for_
        ~chunk_size:(3 + (num_steps / num_tasks))
        num_steps
        (fun low high ->
          let sum = ref 0. in
          for i = low to high do
            let x = (float i +. 0.5) *. step in
            sum := !sum +. (4. /. (1. +. (x *. x)))
          done;
          let sum = !sum in
          Lock.update global_sum (fun n -> n +. sum)));

  let pi = step *. Lock.get global_sum in
  pi

Here the Lock is not a performance issue because the for_ only processes num_tasks chunks (i.e., roughly your CPU’s number of cores), so there are only about 8 updates at the end, not 100_000_000 updates, which would create a lot of contention.
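
The chunk arithmetic can be checked with a sequential sketch that does not use Moonpool at all. It assumes that the range is split into consecutive inclusive (low, high) ranges of at most chunk_size indices, as the comment in the parallel sum example suggests; `chunks` and `approx_pi` are hypothetical helpers, not part of the library.

```ocaml
(* Hypothetical helper: split [0, n) into consecutive inclusive ranges
   (low, high), each holding at most [chunk_size] indices. *)
let chunks ~chunk_size n =
  let rec go low acc =
    if low >= n then List.rev acc
    else
      let high = min (n - 1) (low + chunk_size - 1) in
      go (high + 1) ((low, high) :: acc)
  in
  go 0 []

(* Midpoint-rule approximation of pi with one partial sum per chunk.
   Returns the number of chunks and the approximation. *)
let approx_pi ~num_steps ~num_tasks =
  let step = 1. /. float_of_int num_steps in
  let ranges = chunks ~chunk_size:(3 + (num_steps / num_tasks)) num_steps in
  let partial (low, high) =
    let sum = ref 0. in
    for i = low to high do
      let x = (float_of_int i +. 0.5) *. step in
      sum := !sum +. (4. /. (1. +. (x *. x)))
    done;
    !sum
  in
  let total = List.fold_left (fun acc r -> acc +. partial r) 0. ranges in
  (List.length ranges, step *. total)

let () =
  let n_chunks, pi = approx_pi ~num_steps:1_000_000 ~num_tasks:8 in
  Printf.printf "%d chunks, pi ~ %.6f\n" n_chunks pi
```

Because the chunk size is strictly larger than num_steps / num_tasks, the number of chunks never exceeds num_tasks, so the shared accumulator is touched at most once per task.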

binsec 0.8.0

Frédéric Recoules announced

On behalf of the BINSEC team, I am glad to announce that version 0.8.0 now lives in Opam.

As a short introduction, BINSEC is an open-source program analyzer developed at CEA List to help improve software security at the binary level. It has been successfully applied in a number of security-related contexts, such as vulnerability finding, (malware) deobfuscation, decompilation, formal verification of assembly code or even binary-level formal verification.

More information can be found on the website, including publications, tutorials or contacts, but also the description of this release as well as previous ones.

Help Review the new “File Manipulation” tutorial on OCaml.org

Sabine Schmaltz announced

There’s a new version of the “File Manipulation” tutorial on

https://staging.ocaml.org/docs/file-manipulation

For comparison: the old version of this tutorial is here https://ocaml.org/docs/file-manipulation.

https://github.com/ocaml/ocaml.org/pull/1400

Thanks for taking a look and giving feedback and suggestions for revising this! :)

Mutaml 0.1

Jan Midtgaard announced

I’m happy to announce the release of Mutaml 0.1, a mutation testing tool for OCaml: https://github.com/jmid/mutaml

Mutaml attempts to make small random changes to your code, e.g., turning e+1 into e to see if the off-by-one change is caught by your test suite. By finding examples of uncaught wrong behaviour, it can thereby reveal limitations of a test suite and indirectly suggest improvements.
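
The idea can be illustrated by hand with a small sketch. This is not Mutaml's output or API: the mutant below is written manually, and all function names are hypothetical. It shows one test that lets the off-by-one mutant survive and one that catches it.

```ocaml
(* Original code under test: index of the next slot in a ring buffer. *)
let next_index i len = (i + 1) mod len

(* Hand-written mutant: [i + 1] replaced by [i]. *)
let next_index_mutant i len = i mod len

(* A weak test: only checks that the result stays in bounds. *)
let weak_test f = f 3 5 >= 0 && f 3 5 < 5

(* A stronger test: checks the actual successor, including wrap-around. *)
let strong_test f = f 3 5 = 4 && f 4 5 = 0

let () =
  assert (weak_test next_index);
  (* The mutant passes the weak test, so it survives. *)
  assert (weak_test next_index_mutant);
  assert (strong_test next_index);
  (* The mutant fails the strong test, so it is caught. *)
  assert (not (strong_test next_index_mutant))
```

A surviving mutant like the first one points at a behaviour the test suite never actually checks.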

Overall 0.1 is considered an initial working prototype.

@raphael-proust previously blogged about how he used it while developing Seqes.

Acknowledgements

Mutaml was developed with support from the OCaml Software Foundation. While developing it, I also benefitted from studying the nice source code of @antron’s bisect_ppx.

OCaml Platform Newsletter: June 2023

Thibaut Mattio announced

Welcome to the third instalment of the OCaml Platform newsletter!

This edition brings the latest improvements made in June to enhance the OCaml developer experience with the OCaml Platform. As in the previous updates, the newsletter features the development workflow currently being explored or enhanced.

The month’s standout highlight is undoubtedly the first alpha release of opam 2.2! Years in the making (opam 2.1 was released almost two years ago), the significance of the hard work put in by the opam team can’t be overstated. Much appreciation goes out to the opam team (Raja Boujbel, David Allsopp, Kate Deplaix, Louis Gesbert, in a united OCamlPro/Tarides collaboration), and especially to Raja Boujbel for diligently pushing the work to completion in order to achieve this alpha. The announcement holds more details, and we encourage you to provide feedback on the Discuss post.

  • Releases
  • Building Packages
    • Dune Exploring Package Management in Dune
    • opam Native Support for Windows in opam 2.2
    • Dune Improving Dune’s Documentation
    • Dune New dune show command
  • Generating Documentation
    • odoc Add Search Capabilities to odoc
  • Editing and Refactoring Code
    • Merlin Support for Project-Wide References in Merlin
    • Merlin Improving Merlin’s Performance
    • OCaml LSP Upstreaming OCaml LSP’s Fork of Merlin
    • OCaml LSP Extract code actions
    • OCaml LSP Support for Inlay Hints
  • Formatting Code
    • OCamlFormat Closing the Gap Between OCamlFormat and ocp-indent

Releases

June was a bustling month with a total of nine releases! This included three patch releases and one minor release of Dune, the release of the first alpha of opam 2.2, two minor releases of OCaml LSP, a minor release of Ppxlib, and a major release of dune-release. To learn about the features and improvements included in all of these, visit the OCaml Changelog.

Building Packages

  • Dune Exploring Package Management in Dune

    Contributors: @rgrinberg (Tarides), @Leonidas-from-XIV (Tarides), @gridbugs (Tarides), @kit-ty-kate (Tarides)

    There was notable progress on Dune lockdirs this month, and the team is nearing the ability to lock and build simple opam packages.

    The improvements include:

    • The solver’s understanding of opam flags (with-test and with-doc)
    • Separate lockdirs per build context, allowing users to configure the policy for choosing package versions.
    • Configuration of lockdirs in the dune-workspace file per context.
    • Improved fetching to work with VCS repos and single files
    • System variable values now determined in line with opam’s approach

    Blockers to implement the end-to-end workflow are currently being discussed, and next month’s focus will be on increasing the coverage of opam features.

    Activities:

    • Lock file configuration in workspace – #7835
    • Use Dyn.variant constructor for Op – #7936
    • Lockdir package files have .pkg extension – #8014
    • Fix: Downloading local repo doesn’t work – #8060
    • Test errors for invalid opam repositories – #7830
    • Lock directory regeneration safety – #7832
    • Generate lockdir from current switch – #7863
    • Implementation of OpamSysPoll in Dune-terms – #7868
    • Lockdir encode/decode roundtrip tests – #7914
    • Document why local opam repo path is a Filename.t – #7971
    • Lockdir generation using opam switch prefers oldest – #7980
    • Arguments to specify contexts to dune pkg lock – #7970
    • Lockdirs are data-only – #7979
    • Prefer newest packages by default – #8030
    • Don’t take global lock in dune pkg lock – #8016
    • Conditional dependencies in lockdir – #8050
    • Removal of lock_dir field from Lock_dir.Pkg.t – #7965
    • Feature(pkg): extra sources – #8015
  • opam Native Support for Windows in opam 2.2

    Contributors: @rjbou (OCamlPro), @kit-ty-kate (Tarides), @dra27 (Tarides), @emillon (Tarides), @Leonidas-from-XIV (Tarides), @3Rafal (Tarides), @christinerose (Tarides), @sabine (Tarides)

    The first alpha of opam 2.2 was just released!

    The most anticipated feature is native Windows compatibility: opam can now be launched in any Windows terminal! It currently requires a preexisting Cygwin installation, a limitation that is set to be lifted for alpha2.

    As stated in the announcement, it should be noted that opam-repository isn’t compatible with Windows just yet. It requires the upstreaming of patches from ocaml-opam/opam-repository-mingw and dra27/opam-repository. This is set to occur before the final release of opam 2.2, so opam init can work with the upstream opam-repository on Windows.

    Windows support isn’t the only exciting feature in the release. To learn about other significant features included in opam 2.2, please read the announcement and don’t hesitate to share your feedback on the Discuss post.

    Activities:

    • Windows support
      • Improved local cygwin installation detection – #5544
      • Introduced some updates to Windows shell – #5541
      • Fixed detection issue when C++ compiler is prefixed – #5556
    • Other improvements
      • Fix performance regression in opam install/remove/upgrade/reinstall – #5503
      • Adjusted to open the release files for reading – #5568
      • Fixed OpenSSL missing message – #5557
      • Enhanced error reporting to print version when failing to parse it – #5566
    • Release management
      • Finalise release: Untie test from opam version – #5578
      • Prepared for the 2.2.0~alpha release with essential updates – #5580
      • Included 2.2.0-alpha binaries in install.sh – #5588
      • Readme updates – #5589
      • Documentation: update documentation to be embedded in ocaml.org – #5593 #5594
      • Add some tests – #5385
      • Improved output cleanliness when stdout is not a TTY – #5595
      • Update lint for conflicts field’s filter that does not support package variables – #5535
      • Applied autoupdate to silence autogen warnings – #5555
    • Security audit
      • Fixed opam installing packages without checking their checksum when the local cache is corrupted – #5538
      • Reftests: add tests to check url handling behaviours – #5560
      • lint: add some lint & fix for url checks – #5561
      • opamfile: parse error on escapable paths – #5562
      • source: add --no-checksums & --require-checksums flags – #5563
      • No more populate opam file with extra-files – #5564
  • Dune Improving Dune’s Documentation

    Contributors: @emillon (Tarides)

    The effort to enhance Dune documentation continues. Past efforts focused on the high-level organisation of the documentation, and the new structure was published as part of the Dune 3.8 release. This month, various improvements were made to the content of the documentation itself.

    Activities:

  • Dune New dune show command

    Contributors: @Alizter, @rgrinberg (Tarides), @snowleopard (Jane Street)

    A new dune show command group has been added as an alias to the existing dune describe command.

    The new command group comes with two new commands, dune show targets and dune show aliases, to enhance the introspection of Dune projects and discoverability of available Dune commands.

    • dune show targets [OPTION]… [DIR]… is inspired by ls and prints the targets available in a given directory.
    • dune show aliases [OPTION]… [DIR]… prints the aliases available in a given directory.

    Feedback on these new commands is welcome and can be shared on Dune’s issue tracker.

    Activities:

    • Create dune show command group – ocaml/dune#7946
    • dune show targets and dune show aliases commands – ocaml/dune#7946

Generating Documentation

  • odoc Add Search Capabilities to odoc

    Contributors: @panglesd (Tarides), @EmileTrotignon (Tarides), @trefis (Tarides)

    The profiling and optimising work that began last month on sherlodoc has shown results: the database size was reduced significantly, and the indexing time has also been greatly reduced.

    In addition, a complete overhaul of the search feature UI was conducted, with advice from the OCaml.org team.

    Attention then turned to testing (and debugging) the indexing/search more extensively.

    Additionally, progress was made on outputting usage statistics on the search index. Specifically, support for occurrences was untangled from source code rendering, and support was added for counting occurrences of values, modules, types, module types, class types, and constructors.

    The different pull requests are approaching merge-readiness. The next step will be to adapt the Dune and OCaml.org drivers to make the feature available to users of odoc.

    Activities:

Editing and Refactoring Code

  • Merlin Support for Project-Wide References in Merlin

    Contributors: @voodoos (Tarides), @let-def (Tarides)

    The entire stack of pull requests required for project-wide references, including the compiler patches, ocaml-uideps, Dune, Merlin, and ocaml-lsp, has been rebased to include the latest compiler changes.

    This allowed for the discovery of some issues with first-class modules and aliases. Alias tracking was also added to the shapes, which is required for occurrences to distinguish between different aliases of the same module.

    Activities:

  • Merlin Improving Merlin’s Performance

    Contributors: @pitag-ha (Tarides), @3Rafal (Tarides), @voodoos (Tarides), @let-def (Tarides)

    Efforts to improve Merlin’s performance included ongoing work on Merlin benchmarking and error regression CI pipelines. Several issues in merl-an were fixed to stabilise the benchmarking CI and the proof of concept (POC) of the error regression CI that was opened in Merlin.

    The benchmarking CI was merged at the beginning of July, so Merlin is now being continuously benchmarked for performance regressions.

    Next month, experiments will continue on the best approach for the error regression CI before refocusing on concrete performance improvements.

    Activities:

  • OCaml LSP Upstreaming OCaml LSP’s Fork of Merlin

    Contributors: @voodoos (Tarides), @3Rafal (Tarides)

    The PR that removes OCaml LSP’s fork of Merlin has been merged!

    Following the merge, patches were added for compatibility with OCaml 5.1, and OCaml LSP 1.16.1 was released.

    Activities:

  • OCaml LSP Extract code actions

    Contributors: @jfeser, @rgrinberg (Tarides)

    OCaml LSP 1.16.1 introduces two new code action kinds to LSP: Extract local and Extract function.

    • The Extract local refactoring action takes an expression and introduces it as a new local let-binding in the enclosing function.

      let f x =  $x+1$ + 2 (* $..$ is the selected code *)
      (* Becomes: *)
      let f x =
        let new_var = x + 1 in
        new_var + 2
      
    • The Extract function refactoring action takes an expression and introduces it as a new function in the enclosing module.

      let f x =  $x+1$ + 2 (* $..$ is the selected code *)
      (* Becomes: *)
      let new_fun x = x + 1
      let f x = new_fun x + 2
      

    Activities:

  • OCaml LSP Support for Inlay Hints

    Contributors: @jfeser, @rgrinberg (Tarides), @voodoos (Tarides)

    The LSP 3.17 Spec introduced the feature of Inlay Hints, an enhancement that allows editors to integrate annotations in line with the text, in order to display parameter names, type hints, and so on.

    Work on implementing Inlay Hints in the OCaml LSP server began this month. The pull request is currently under review, with plans to include it in the next minor release, OCaml LSP 1.17.0.

    Activities:

Formatting Code

  • OCamlFormat Closing the Gap Between OCamlFormat and ocp-indent

    Contributors: @gpetiot (Tarides), @EmileTrotignon (Tarides), @Julow (Tarides), @ceastlund (Jane Street)

    The pursuit of aligning OCamlFormat’s janestreet profile more closely with the output of ocp-indent, initiated a few months back, continued this month. A significant proportion of the changes this month revolved around the treatment of comments.

    The OCamlFormat team is also preparing the release of OCamlFormat 0.26.0, which will include all of the bug fixes and improvements implemented in the past months. If you’d like to get a glimpse of the formatting changes this entails, have a look at some of the preview PRs:

    Activities:

Old CWN

If you happen to miss a CWN, you can send me a message and I’ll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.