Hello

Here is the latest OCaml Weekly News, for the week of November 01 to 08, 2016.

  1. OCaml version 4.04.0 is released.
  2. RISC-V native backend, no longer cross-compiling
  3. The fastest stream library
  4. Other OCaml News

OCaml version 4.04.0 is released.

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2016-11/msg00010.html

Damien Doligez announced:
Dear OCaml users,

We have the pleasure of celebrating the discovery of Tutankhamun's
tomb by announcing the release of OCaml version 4.04.0.
This is a major release with several new features (most notably,
Spacetime). See the list of changes below.

It is (or soon will be) available as an OPAM switch, or as a source
download here: http://caml.inria.fr/distrib/ocaml-4.04/

Happy hacking,

-- Damien Doligez for the OCaml team.


OCaml 4.04.0:
-------------

(Changes that can break existing programs are marked with a "*")

### Language features:

- PR#7233: Support GADT equations on non-local abstract types
  (Jacques Garrigue)

- GPR#187, GPR#578: Local opening of modules in a pattern.
  Syntax: "M.(p)", "M.[p]","M.[| p |]", "M.{p}"
  (Florian Angeletti, Jacques Garrigue, review by Alain Frisch)

- GPR#301: local exception declarations "let exception ... in"
  (Alain Frisch)

- GPR#508: Allow shortcut for extension on semicolons: ;%foo
  (Jeremie Dimino)

- GPR#606: optimized representation for immutable records with a single
  field, and concrete types with a single constructor with a single argument.
  This is triggered with a [@@unboxed] attribute on the type definition.
  Currently mutually recursive datatypes are not well supported, this
  limitation should be lifted in the future (see MPR#7364).
  (Damien Doligez)
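
As an illustration of the language features listed above, here is a minimal sketch of local opens in patterns, a local exception, and the [@@unboxed] attribute. The module `Point` and the functions `describe`, `x_coord`, `find_first_negative` and `add_m` are hypothetical names invented for this example; they do not come from the release notes.

```ocaml
(* Hypothetical module, used only to have something to open locally. *)
module Point = struct
  type t = { x : int; y : int }
  type shape = Dot | Segment of t * t
end

(* GPR#187, GPR#578: local opening of a module in a pattern, "M.(p)". *)
let describe s =
  match s with
  | Point.(Dot) -> "dot"
  | Point.(Segment ({ x = 0; _ }, _)) -> "segment from the y axis"
  | Point.(Segment _) -> "segment"

(* Same feature with the record form "M.{p}". *)
let x_coord = function Point.{ x; _ } -> x

(* GPR#301: a local exception, here used for early exit from a loop. *)
let find_first_negative (a : int array) : int option =
  let exception Found of int in
  try
    Array.iter (fun v -> if v < 0 then raise (Found v)) a;
    None
  with Found v -> Some v

(* GPR#606: a single-field immutable record marked [@@unboxed] is
   represented like its field, without an extra block. *)
type meters = { value : float } [@@unboxed]

let add_m a b = { value = a.value +. b.value }
```

The local exception cannot escape under a name visible to callers, which makes the early-exit idiom self-contained.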

### Compiler user-interface and warnings:

* PR#6475, GPR#464: interpret all command-line options before compiling any
  files; this changes (improves) the semantics of repeated -o options or -o
  combined with -c. See the super-detailed commit message at
  https://github.com/ocaml/ocaml/commit/da56cf6dfdc13c09905c2e07f1d4849c8346eec8
  (whitequark)

- PR#7139: clarify the wording of Warning 38
  (Unused exception or extension constructor)
  (Gabriel Scherer)

* PR#7147, GPR#475: add colors when reporting errors generated by ppx rewriters.
  Remove the `Location.errorf_prefixed` function which is no longer relevant
  (Simon Cruanes, Jérémie Dimino)

- PR#7169, GPR#501: clarify the wording of Warning 8
  (Non-exhaustivity warning for pattern matching)
  (Florian Angeletti, review and report by Gabriel Scherer)

* GPR#591: Improve support for OCAMLPARAM: (i) do not use objects
  files with -a, -pack, -shared; (ii) use "before" objects in the toplevel
  (but not "after" objects); (iii) use -I dirs in the toplevel,
  (iv) fix bug where -I dirs were ignored when using threads
  (Marc Lasson, review by Damien Doligez and Alain Frisch)

- GPR#648: New -plugin option for ocamlc and ocamlopt, to dynamically extend
  the compilers at runtime.
  (Fabrice Le Fessant)

- GPR#684: Detect unused module declarations
  (Alain Frisch)

- GPR#706: Add a settable Env.Persistent_signature.load function so
  that cmi files can be loaded from other sources. This can be used to
  create self-contained toplevels.
  (Jérémie Dimino)

### Standard library:

- GPR#473: Provide `Sys.backend_type` so that users can write backend-specific
  code in some cases (for example, in a code generator).
  (Hongbo Zhang)

- PR#6279, GPR#553: implement Set.map
  (Gabriel Scherer)

- PR#6820, GPR#560: Add Obj.reachable_words to compute the
  "transitive" heap size of a value
  (Alain Frisch, review by Mark Shinwell and Damien Doligez)

- GPR#589: Add a non-allocating function to recover the number of
  allocated minor words.
  (Pierre Chambart, review by Damien Doligez and Gabriel Scherer)

- GPR#626: String.split_on_char
  (Alain Frisch)

- GPR#669: Filename.extension and Filename.remove_extension
  (Alain Frisch, request by Edgar Aroutiounian, review by Daniel Bunzli
  and Damien Doligez)
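
The following sketch exercises a few of the standard library additions listed above. The module name `IntSet` and the bound names (`halves`, `fields`, `ext`, `base`, `backend_name`) are assumptions made for the example.

```ocaml
(* A set of integers; IntSet is a name chosen for this example. *)
module IntSet = Set.Make (struct
  type t = int
  let compare = compare
end)

(* PR#6279, GPR#553: Set.map. The result is itself a set, so images
   that coincide are merged. *)
let halves = IntSet.map (fun n -> n / 2) (IntSet.of_list [ 2; 3; 4; 5 ])

(* GPR#626: String.split_on_char keeps empty fields. *)
let fields = String.split_on_char ',' "a,b,,c"

(* GPR#669: only the last extension is considered. *)
let ext = Filename.extension "archive.tar.gz"
let base = Filename.remove_extension "archive.tar.gz"

(* GPR#473: Sys.backend_type distinguishes the backend at run time. *)
let backend_name =
  match Sys.backend_type with
  | Sys.Native -> "native"
  | Sys.Bytecode -> "bytecode"
  | Sys.Other name -> name
```

Here `halves` contains 1 and 2, `fields` is `["a"; "b"; ""; "c"]`, `ext` is `".gz"` and `base` is `"archive.tar"`.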

### Code generation and optimizations:

- PR#4747, GPR#328: Optimize Hashtbl by using in-place updates of its
  internal bucket lists.  All operations run in constant stack size
  and are usually faster, except Hashtbl.copy which can be much
  slower
  (Alain Frisch)

* PR#6217, GPR#538: Optimize performance of record update:
  no more performance cliff when { foo with t1 = ..; t2 = ...; ... }
  hits 6 updated fields
  (Olivier Nicole, review by Thomas Braibant and Pierre Chambart)

- PR#7023, GPR#336: Better unboxing strategy
  (Alain Frisch, Pierre Chambart)

- PR#7244, GPR#840: Ocamlopt + flambda requires a lot of memory
  to compile large array literal expressions
  (Pierre Chambart, review by Mark Shinwell)

- PR#7291, GPR#780: Handle specialisation of recursive function that does
  not always preserve the arguments
  (Pierre Chambart, Mark Shinwell, report by Simon Cruanes)

- GPR#427: Obj.is_block is now an inlined OCaml function instead of a
  C external.  This should be faster.
  (Demi Obenour)

- GPR#580: Optimize immutable float records
  (Pierre Chambart, review by Mark Shinwell)

- GPR#602: Do not generate dummy code to force module linking
  (Pierre Chambart, reviewed by Jacques Garrigue)

- PR#7328, GPR#702: Do not eliminate boxed int divisions by zero and
  avoid checking twice if divisor is zero with flambda.
  (Pierre Chambart, report by Jeremy Yallop)

- GPR#703: Optimize some constant string operations when the "-safe-string"
  configure time option is enabled.
  (Pierre Chambart)

- GPR#707: Load cross module information during a meet
  (Pierre Chambart, report by Leo White, review by Mark Shinwell)

- GPR#709: Share a few more equal switch branches
  (Pierre Chambart, review by Gabriel Scherer)

- GPR#712: Small improvements to type-based optimizations for array
  and lazy
  (Alain Frisch, review by Pierre Chambart)

- GPR#714: Prevent warning 59 from triggering on Lazy of constants
  (Pierre Chambart, review by Leo White)

- GPR#723: Sort emitted functions according to source location
  (Pierre Chambart, review by Mark Shinwell)

- Lack of type normalization led to missing simple compilation for "lazy x"
  (Alain Frisch)

### Runtime system:

- PR#7210, GPR#562: Allow registering finalisation functions that are
  called only when a value will never be reachable anymore. The
  drawback compared to the existing ones is that the finalisation
  function is not called with the value as argument. These finalisers
  are registered with `Gc.finalise_last`
  (François Bobot, reviewed by Damien Doligez and Leo White)

- GPR#590: Do not perform compaction if the real overhead is less than expected
  (Thomas Braibant)
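
To make the difference between the two kinds of finaliser concrete, here is a small sketch. The names `log`, `track_with_value` and `track_last` are hypothetical; only `Gc.finalise` and `Gc.finalise_last` come from the standard library. When, or whether, the callbacks run depends on the garbage collector, so the example makes no claim about timing.

```ocaml
(* Records which finalisers have run; a name invented for this sketch. *)
let log : string list ref = ref []

(* Existing API: the callback receives the value, so it can still
   inspect it (and could even make it reachable again). *)
let track_with_value (b : Bytes.t) : unit =
  Gc.finalise
    (fun v -> log := Printf.sprintf "finalise: %d bytes" (Bytes.length v) :: !log)
    b

(* New in 4.04: the callback takes unit and never sees the value, and
   it is called only once the value can never become reachable again. *)
let track_last (b : Bytes.t) : unit =
  Gc.finalise_last (fun () -> log := "finalise_last" :: !log) b
```

Both functions require a heap-allocated value, which is why the sketch uses `Bytes.t` rather than an integer.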

### Tools:

- PR#7189: toplevel #show, follow chains of module aliases
  (Gabriel Scherer, report by Daniel Bünzli, review by Thomas Refis)

- PR#7248: have ocamldep interpret -open arguments in left-to-right order
  (Gabriel Scherer, report by Anton Bachin)

- PR#7272, GPR#798: ocamldoc, missing line breaks in type_*.html files
  (Florian Angeletti)

- PR#7290: ocamldoc, improved support for inline records
  (Florian Angeletti)

- PR#7323, GPR#750: ensure "ocamllex -ml" works with -safe-string
  (Hongbo Zhang)

- PR#7350, GPR#806: ocamldoc, add viewport metadata to generated html pages
  (Florian Angeletti, request by Daniel Bünzli)

- GPR#452: Make the output of ocamldep more stable
  (Alain Frisch)

- GPR#548: empty documentation comments
  (Florian Angeletti)

- GPR#575: Add the -no-version option to the toplevel
  (Sébastien Hinderer)

- GPR#598: Add a --strict option to ocamlyacc to treat conflicts as errors
  (this option is now used for the compiler's parser)
  (Jeremy Yallop)

- GPR#613: make ocamldoc use -open arguments
  (Florian Angeletti)

- GPR#718: ocamldoc, fix order of extensible variant constructors
  (Florian Angeletti)

### Debugging and profiling:

- GPR#585: Spacetime, a new memory profiler (Mark Shinwell, Leo White)

### Runtime system:

- PR#7203, GPR#534: Add a new primitive caml_alloc_float_array to allocate an
  array of floats
  (Thomas Braibant)

### Manual and documentation:

- PR#7007, PR#7311: document the existence of OCAMLPARAM and
  ocaml_compiler_internal_params
  (Damien Doligez, reports by Wim Lewis and Gabriel Scherer)

- PR#7243: warn users against using WinZip to unpack the source archive
  (Damien Doligez, report by Shayne Fletcher)

- PR#7245, GPR#565: clarification to the wording and documentation
  of Warning 52 (fragile constant pattern)
  (Gabriel Scherer, William, Adrien Nader, Jacques Garrigue)

- PR#7265, GPR#769: Restore 4.02.3 behaviour of Unix.fstat, if the
  file descriptor doesn't wrap a regular file (win32unix only)
  (Andreas Hauptmann, review by David Allsopp)

- PR#7288: flatten : Avoid confusion
  (Damien Doligez, report by user 'tormen')

- PR#7355: Gc.finalise and lazy values
  (Jeremy Yallop)

- GPR#841: Document that [Store_field] must not be used to populate
  arrays of values declared using [CAMLlocalN] (Mark Shinwell)

### Build system:

- GPR#324: Compiler developers: Adding new primitives to the
  standard runtime no longer requires running `make bootstrap`
  (François Bobot)

- GPR#384: Fix compilation using old Microsoft C Compilers not
  supporting secure CRT functions (SDK Visual Studio 2005 compiler and
  earlier) and standard 64-bit integer literals (Visual Studio .NET
  2002 and earlier)
  (David Allsopp)

- GPR#507: More sharing between Unix and Windows makefiles
  (whitequark, review by Alain Frisch)

* GPR#512, GPR#587: Installed `ocamlc`, `ocamlopt`, and `ocamllex` are
  now the native-code versions of the tools, if those versions were
  built.
  (Demi Obenour)

- GPR#687: "./configure -safe-string" to get a system where
  "-unsafe-string" is not allowed, thus giving stronger non-local
  guarantees about immutability of strings
  (Alain Frisch, review by Hezekiah M. Carty)

### Bug fixes:

* PR#6505: Missed Type-error leads to a segfault upon record access.
  (Jacques Garrigue, extra report by Stephen Dolan)
  Proper fix required a more restrictive approach to recursive types:
  mutually recursive types are seen as abstract types (i.e. non-contractive)
  when checking the well-foundedness of the recursion.

* PR#6752: Nominal types and scope escaping.
  Revert to strict scope for non-generalizable type variables, cf. Mantis.
  Note that this is actually stricter than the behavior before 4.03,
  cf. PR#7313, meaning that you may sometimes need to add type annotations
  to explicitly instantiate non-generalizable type variables.
  (Jacques Garrigue, following discussion with Jeremy Yallop,
   Nicolas Ojeda Bar and Alain Frisch)

- PR#7112: Aliased arguments ignored for equality of module types
  (Jacques Garrigue, report by Leo White)

- PR#7134: compiler forcing aliases it shouldn't while reporting type errors
  (Jacques Garrigue, report and suggestion by sliquister)

- PR#7153: document that Unix.SOCK_SEQPACKET is not really usable.

- PR#7165, GPR#494: uncaught exception on invalid lexer directive
  (Gabriel Scherer, report by KC Sivaramakrishnan using afl-fuzz)

- PR#7257, GPR#583: revert a 4.03 change of behavior on (Unix.sleep 0.),
  it now calls (nano)sleep for 0 seconds as in (< 4.03) versions.
  (Hannes Mehnert, review by Damien Doligez)

- PR#7260: GADT + subtyping compile time crash
  (Jacques Garrigue, report by Nicolas Ojeda Bar)

- PR#7269: Segfault from conjunctive constraints in GADT
  (Jacques Garrigue, report by Stephen Dolan)

- PR#7276: Support more than FD_SETSIZE sockets in Windows' emulation
  of select
  (David Scott, review by Alain Frisch)

* PR#7278: Prevent private inline records from being mutated
  (Alain Frisch, report by Pierre Chambart)

- PR#7284: Bug in mcomp_fields leads to segfault
  (Jacques Garrigue, report by Leo White)

- PR#7285: Relaxed value restriction broken with principal
  (Jacques Garrigue, report by Leo White)

- PR#7297: -strict-sequence turns off Warning 21
  (Jacques Garrigue, report by Valentin Gatien-Baron)

- PR#7299: remove access to OCaml heap inside blocking section in win32unix
  (David Allsopp, report by Andreas Hauptmann)

- PR#7300: remove access to OCaml heap inside blocking in Unix.sleep on Windows
  (David Allsopp)

- PR#7305: -principal causes loop in type checker when compiling
  (Jacques Garrigue, report by Anil Madhavapeddy, analysis by Leo White)

- PR#7330: Missing exhaustivity check for extensible variant
  (Jacques Garrigue, report by Elarnon *)

- PR#7374: Contractiveness check unsound with constraints
  (Jacques Garrigue, report by Leo White)

- PR#7378: GADT constructors can be re-exposed with an incompatible type
  (Jacques Garrigue, report by Alain Frisch)

- PR#7389: Unsoundness in GADT exhaustiveness with existential variables
  (Jacques Garrigue, report by Stephen Dolan)

* GPR#533: Thread library: fixed [Thread.wait_signal] so that it
  converts back the signal number returned by [sigwait] to an
  OS-independent number
  (Jérémie Dimino)

- GPR#600: (similar to GPR#555) ensure that register typing constraints are
  respected at N-way join points in the control flow graph
  (Mark Shinwell)

- GPR#672: Fix float_of_hex parser to correctly reject some invalid forms
  (Bogdan Tătăroiu, review by Thomas Braibant and Alain Frisch)

- GPR#700: Fix maximum weak bucket size
  (Nicolas Ojeda Bar, review by François Bobot)

- GPR#708: Allow more module aliases in strengthening (Leo White)

- GPR#713, PR#7301: Fix wrong code generation involving lazy values in Flambda
  mode
  (Mark Shinwell, review by Pierre Chambart and Alain Frisch)

- GPR#721: Fix infinite loop in flambda due to [@@specialise] annotations

- GPR#779: Building native runtime on Windows could fail when bootstrapping
  FlexDLL if there was also a system-installed flexlink
  (David Allsopp, report by Michael Soegtrop)

- GPR#805, GPR#815, GPR#833: check for integer overflow in String.concat
  (Jeremy Yallop,
   review by Damien Doligez, Alain Frisch, Daniel Bünzli, Fabrice Le Fessant)

- GPR#810: check for integer overflow in Array.concat
  (Jeremy Yallop)

- GPR#814: fix the Buffer.add_substring bounds check to handle overflow
  (Jeremy Yallop)

- GPR#880: Fix [@@inline] with default parameters in flambda (Leo White)

- GPR#525: fix build on OpenIndiana
  (Sergey Avseyev, review by Damien Doligez)

### Internal/compiler-libs changes:

- PR#7200, GPR#539: Improve, fix, and add test for parsing/pprintast.ml
  (Runhang Li, David Sheets, Alain Frisch)

- GPR#351: make driver/pparse.ml functions type-safe
  (Gabriel Scherer, Dmitrii Kosarev, review by Jérémie Dimino)

- GPR#516: Improve Texp_record constructor representation, and
  propagate updated record type information
  (Pierre Chambart, review by Alain Frisch)

- GPR#678: Graphics.close_graph crashes 64-bit Windows ports (re-implementation
  of PR#3963)
  (David Allsopp)

- GPR#679: delay registration of docstring after the mapper is applied
  (Hugo Heuzard, review by Leo White)

- GPR#872: don't attach (**/**) comments to any particular node
  (Thomas Refis, review by Leo White)
      

RISC-V native backend, no longer cross-compiling

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2016-11/msg00012.html

Nicolas Ojeda Bar announced:
Dear all,

A little over a year ago, I announced [1] a preliminary release of a
native-code backend for the OCaml compiler targeting the emerging
RISC-V architecture [2].  Due to the state of the RISC-V development
tools at the time, this backend existed only in the form of a
native-code cross-compiler and had many limitations.

Since then, the RISC-V community has made considerable progress to the
point that it is now easy to run a full Linux environment (including
gcc + friends) natively on RISC-V [3, 4].

Today I am happy to announce a preliminary, native release of the full
OCaml system on RISC-V. It is available at

    https://www.github.com/nojb/riscv-ocaml.

It targets the 64-bit variant of the RISC-V architecture, RV64G (the
32-bit variant should also work, but has not been tested).  All
libraries are supported (Dynlink and Num have a couple of issues left,
but I expect them to be resolved shortly).

I plan to maintain and keep developing this port for the foreseeable
future, tracking official OCaml releases.  It is currently based on
the recently released 4.04.0.

If you would like to play around with it, a Docker image is available
with an installed 4.04 ready to go:

    docker run -it nojb/riscv-ocaml:4.04.0 /bin/bash

As usual, any and all comments are warmly welcome.

Thanks!

Best wishes,
Nicolas

[1] https://sympa.inria.fr/sympa/arc/caml-list/2015-06/msg00046.html
[2] https://riscv.org/
[3] https://fedoraproject.org/wiki/Architectures/RISC-V
[4] https://hub.docker.com/r/sorear/fedora-riscv-wip/
      

The fastest stream library

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2016-11/msg00037.html

Oleg announced:
> On 21 April 2016 at 09:13, Gregory Malecha wrote:
> I'm wondering if there is any work (and interest) on supporting
> user-defined optimizations similar to GHC's rewrite rules in the Ocaml
> compiler. For example, a standard example would be specifying map fusion:

to which Gabriel Scherer commented on Thu, 21 Apr 2016 12:02:14 -0400

> Another approach that might be worth trying (sorry for not thinking
> about it earlier) is MetaOCaml. I tend to think of it as a tool to
> explicitly specify and control partial evaluation strategies.

Indeed. We'd like to point out an application of MetaOCaml, not just
to map fusion -- but also concat_map fusion and zip fusion, etc. We
present a streams library that supports a wide set of combinators --
from map and filter to concat_map (flat_map) and zip -- and produces
code of hand-written quality. It is up to more than two orders of
magnitude faster than Batteries.

        http://okmij.org/ftp/meta-programming/strymonas.pdf
        http://strymonas.github.io/

Unlike GHC Rules, we guarantee the performance.
      
Gabriel Scherer then said and Oleg replied:
> I regret not being pointed to this work earlier, because I think that
> measuring the performance of Enum as a representative OCaml stream library
> performance is not the best choice: Enum is designed to be flexible in a
> bit too many ways to be efficient on pure-streaming scenarios (it supports
> a generic "clone", and effectful generators, that makes the codebase too
> complex for its own good; I think that Batteries community is aware that
> Enum is not as good as it should be right now). There are other, more
> efficient streaming libraries out there, including BatSeq in Batteries that
> should already be sensibly faster; Core and Containers also have more
> efficient streaming libraries.

Please do note that we make the double claim: wide expressivity *and*
guaranteed highest performance. We support not just the standard map and
filter, but zipping of streams with flat_maps in them, with both
finite and infinite streams. So we do want generality. From this
point of view, Enum was quite an appropriate choice.

Speaking of benchmarks, the real point of comparison is the
hand-written code for each benchmark, written with imperative loops,
references, etc. (We admit that on several occasions the hand-written
baseline code was fine-tuned when it turned out that the generated
code outperformed it.) Please do feel free to suggest faster
versions.
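
To give an idea of what such a hand-written baseline looks like next to a combinator pipeline, here is a minimal sketch in plain OCaml. It uses neither MetaOCaml nor the strymonas API; the function names and the chosen computation (summing the squares of the even elements) are assumptions made purely for illustration.

```ocaml
(* Combinator style: every stage builds an intermediate list. *)
let sum_even_squares_pipeline (a : int array) : int =
  Array.to_list a
  |> List.filter (fun x -> x mod 2 = 0)
  |> List.map (fun x -> x * x)
  |> List.fold_left ( + ) 0

(* Hand-written baseline: one imperative loop and one reference,
   with no intermediate data structure. *)
let sum_even_squares_loop (a : int array) : int =
  let acc = ref 0 in
  for i = 0 to Array.length a - 1 do
    let x = a.(i) in
    if x mod 2 = 0 then acc := !acc + (x * x)
  done;
  !acc
```

Both functions compute the same result; the second shape is the kind of code that fusion is meant to produce from a pipeline written in the first style.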
      

Other OCaml News

From the ocamlcore planet blog:
Here are links from many OCaml blogs aggregated at OCaml Planet,
http://ocaml.org/community/planet/.

OCaml 4.04 Released!
 https://ocaml.io/w/Blog:News/OCaml_4.04_Released!

Exfiltrating log data using syslog
 https://hannes.nqsb.io/Posts/Syslog

Fifteenth OCaml compiler hacking evening at Pembroke
 http://ocamllabs.github.com/compiler-hacking/2016/11/01/fifteenth-compiler-hacking-evening
      

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.


Alan Schmitt