OCaml Weekly News Previous week Up Next week

Hello

Here is the latest OCaml Weekly News, for the week of January 05 to 12, 2016.

  1. BeSport is hiring (developers, trainees) in Paris
  2. Logs 0.5.0
  3. Including a C library statically in an Ocaml library
  4. The OCaml garbage collector, finalisers, and the right way of disposing native pointers in C bindings
  5. dead_code_analyzer 0.9
  6. Other OCaml News

BeSport is hiring (developers, trainees) in Paris

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2016-01/msg00021.html

Vincent Balat announced:
Jobs and internships
OCaml, Swift, Android developers
BeSport – Paris (75002)

BeSport SAS is looking for OCaml, Swift and Android developers (engineers and
trainees) to participate in the creation of its social network dedicated to
sport. Contact us at jobs@besport.com

THE COMPANY:
BeSport is a young company working on a Web and mobile application to connect
athletes and sport fans, around sport events. Designed as a social network, it
targets both amateurs and professionals, allowing everyone to create and
organize its events, to disseminate information and to receive personalized
news. A first version of the responsive Web application is already online and
mobile applications are being developed.
BeSport premises are located in the Sentier district, in the center of Paris
(metro station: Réaumur-Sébastopol).

WORK:
The Web application is entirely written in OCaml, using Ocsigen. We expect to
complement it soon with native mobile applications written in Swift and Java.
The developers will be integrated in the programming team: participation in
the writing of specifications, implementation (client / server), stylesheets,
tests… They will initially work on improving existing features, before
progressively taking the lead on some components.

SKILLS:
Candidates must have some expertise on some of the following technologies:
* Typed functional languages, especially OCaml (and Ocsigen Js_of_ocaml/Eliom)
* Databases
* iOS/Swift development
* Android development
* Web programming (CSS, Javascript…)
      

Logs 0.5.0

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2016-01/msg00026.html

Daniel Bünzli announced:
I'd like to announce Logs 0.5.0 an incompatible release of Logs. See the
release notes for details:

  https://github.com/dbuenzli/logs/blob/v0.5.0/CHANGES.md

Logs provides a logging infrastructure for OCaml
Homepage: http://erratique.ch/software/logs  
API docs: http://erratique.ch/software/logs/doc/

Many thanks to Edwin Török for his good suggestions.  
      
In another thread, Daniel Bünzli said:
> But it should be added that it may be problematic if you want the
> synchronous semantics and you are using lwt.

The recently released Logs 0.5.0 can now support this scenario. I also added
a note in the documentation to try to clarify these issues:

  http://erratique.ch/software/logs/doc/Logs.html#sync
      

Including a C library statically in an Ocaml library

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2016-01/msg00029.html

Malcolm Matalka asked:
The core problem I am having is a C library I want to bind has a number
of macros which I need the value of.  Here is how I am trying to solve
it, but perhaps there is a better way:

I have a small C library which gets compiled to libfoo.a which provides
functions that return the macro values, like:

int macro1() { return MACRO1; }

I then have an ocaml library, called ofoo, that uses Ctypes to bind to macro1:

let macro1 = foreign "macro" (void @-> returning int)

Where I am having issues is this small library, I'd prefer it to not
have to be installed on the system but just compiled into the Ocaml
library so that a user just has to link against that library.  Right
now, none of the symbols (macro1) are being included in the library, I'm
guessing because the linker sees no direct use of them.  And I'm not
even sure if I can get it included in the ocaml library.  I'm also not
able to get the libfoo symbols linked into a final executable, I'm
guessing for similar reasons.

What are my options here?

If I've missed any useful information, let me know.  I haven't interoped
much with C directory in Ocaml so I'm not sure what information is
important.
      
David Sheets replied:
I'd recommend using ctypes 0.4+ stub generation support which can bind
macro values and detect struct layout at an early compilation stage.
You can see it in use to do this in my ocaml-unix-errno library
https://github.com/dsheets/ocaml-unix-errno. One benefit with this
approach is greater static checking and c <-> ocaml type safety.

If you continue using dynamic bindings, two lonker flags may be of use to you:

--no-as-needed : on recent Ubuntu distributions, gcc automatically
includes the --as-needed flag which drops symbols that are not
referenced. Unfortunately, clang does not understand this flag so you
need to have a conditional in your build system to detect the compiler
in use if you want a cross-platform library.

-E : The Exports local symbols which are statically linked into a
binary into the dynamic symbol table so that they can be found with
dlsym.

You can use these with gcc like -Wl,-E for example.
      
Malcolm Matalka then asked and David Sheets replied:
> Will these approaches require that I have the C library installed to
> compile against any binary using my library or will the symbols be part
> of the ocaml library?  In my current version I have a libfoo.a that gets
> created in the project and then linked against the library.

In the stub generation case, you will not have symbols that represent
the macros. Instead, C code will be generated which generates OCaml
code which contains the macro constants.

In the dlsym static linking case, you would build regular cm(x)a files
and a .a file and install them all with ocamlfind. Your users don't
have to know about C libraries or have them installed globally on the
system.
      

The OCaml garbage collector, finalisers, and the right way of disposing native pointers in C bindings

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2016-01/msg00044.html

Martin Neuhaeusser asked:
during our work on some SMT solver bindings, a couple of question came up
regarding the behavior of the OCaml garbage collector (see also
https://github.com/Z3Prover/z3/issues/411). Assume we have defined a record
type containing a native pointer to some object from an external C-DLL:

type ocaml_record = {
  native_ptr np;
  [... some more fields ...]
}

When creating an ocaml_record, we register a finalizer that makes the
C library dispose the data belonging to the native pointer once the GC
collects the OCaml record value:

let f ocaml_record = NativeLib.dispose ocaml_record.np

let create [...] =
  let new_ocaml_record = { ... } in
  Gc.finalise f new_ocaml_record;
  new_ocaml_record

When calling one of the C-stubs, we pass the native pointer from the OCaml
record value:

let get_native_pointer ocaml_record = ocaml_record.np

NativeLib.native_function (get_native_pointer ocaml_record)

However, this has the problem that if ocaml_record has become otherwise
unreachable, the GC might collect ocaml_record directly after evaluating
(get_native_pointer ocaml_record), triggering the finalizer which disposes the
data pointed at by ocaml_record.native_ptr. The successive call to
NativeLib.native_function (i.e. the C-stub) results in a segfault, as it tries
to access data that has previously been freed by the finalizer.

I assume this is a common problem in writing interfaces to C libraries. If so,
is there a preferred way how to tackle it?
Two approaches that came to my mind are

1. One could design the C-stubs such that they accept values of type
ocaml_record and extract the native pointer within the C stub. In the C-stubs,
the GC must not collect an item that has been "pinned" by the CAMLparamX
macros, right?
2. One could invent some function that takes an ocaml_record, but does nothing
with it and whose calls do not get optimized away by the compiler...
Evaluating such a function after the call to NativeLib.native_function would
prevent the GC from collecting ocaml_record. However, this feels like a very
ugly hack.

Are there any better ideas? Any help and suggestions are highly appreciated.
      
Kim Nguyễn replied:
> 1. One could design the C-stubs such that they accept values of type
> ocaml_record and extract the native pointer within the C stub. In the C-stubs,
> the GC must not collect an item that has been "pinned" by the CAMLparamX
> macros, right?

That would work but in that case, the approach I prefer is to put the
native pointer in a custom block. This way you can attach your
finalizer (written in C) directly to the block and it will be called
when the (wraped) pointer itself is reclaimed. This also allows you to
fine tune the behaviour of the GC w.r.t. the use you make of such
pointers. Of course, one orthogonal problem is that if you create two
custom blocks with the same pointer inside, you must implement some
other mechanism to avoid double frees (like refcounting or something).
But this is a problem you already have with your OCaml record approach
(if two distinct records can hold the same native_ptr, then the
finalizer might get called twice). The advantage of custom blocks is
then that they are self sufficient and you can put such objects in
other data-structures (e.g. for debugging purposes). But indeed your C
bindings will need to extract the pointer from the custom block and
since they are given a heap allocated OCaml value (the custom block)
it will need to be protected bu CAMLparam macros. This also might make
your code more future-proof since it seems that at some point (just
from reading this mailing list) there will be (is ?) a version of the
OCaml runtime where naked pointer are forbidden on the heap (unless
they are wrapped in a custom block).


> 2. One could invent some function that takes an ocaml_record, but does
> nothing with it and whose calls do not get optimized away by the compiler...
> Evaluating such a function after the call to NativeLib.native_function would
> prevent the GC from collecting ocaml_record. However, this feels like a very
> ugly hack.

Very ugly indeed. And the OCaml compiler is getting better and better
at inlining stuff so it's quite hard to predict what is inlined and
what isn't (unless you write some "obviously" inefficient code that
has no chance what so ever to be inlined … but sill).
      
Gerd Stolpmann replied:
Dealing with naked pointers from OCaml is notoriously difficult. As you
found out, you have no good ways of controlling GC cycles, and to limit
bad effects of that. Also, there is a dangling pointer problem -
essentially the naked pointer can be mistaken as heap pointer between
the time the memory has been freed and the naked pointer is set to null.
Note that this aspect is practically impossible to do right from OCaml
code, and even in a C function it is easy to get wrong, resulting in
random crashes that occur infrequently.

For these reasons naked pointers are strongly discouraged. The way to go
is to wrap pointers into custom blocks
(http://caml.inria.fr/pub/docs/manual-ocaml/intfc.html#sec458), and to
do all pointer management in C.
      
Richard W.M. Jones also replied:
The other replies have covered some of the problems.  You may also be
interested in example code, and we've got lots :-)  Most of the bugs
have now been fixed, after several iterations.  Here are some things
to get you started.

(1) Simple example of a finalizer attached to a custom block:

https://github.com/libguestfs/libguestfs/blob/master/mllib/progress-c.c
https://github.com/libguestfs/libguestfs/blob/master/mllib/progress.ml
https://github.com/libguestfs/libguestfs/blob/master/mllib/progress.mli

(2) A more complex example using generational roots to deal with
callbacks from OCaml back to C:

https://github.com/libguestfs/libguestfs/blob/master/ocaml/guestfs-c.c

This one had a major bug, when we discovered that the root caused the
handle to be always reachable, so the finalizer was not called until
the program exited (the fact that we also have a #close method, which
we always called, made this less obvious than you might think at
first).  That was fixed in:

https://github.com/libguestfs/libguestfs/commit/8bbc5e73cb5b56b5cfbe979ac0e1c14d1701a0d8

(3) A tricky binding to libxml2.

Because libxml2 has objects containing pointers to other objects (at
the C level) we need to shadow these with OCaml structs, to ensure
that an OCaml object doesn't become unreachable when it is still
pointed to from another object.

https://github.com/libguestfs/libguestfs/blob/master/v2v/xml-c.c
https://github.com/libguestfs/libguestfs/blob/master/v2v/xml.ml
https://github.com/libguestfs/libguestfs/blob/master/v2v/xml.mli

If you look at the history of these files, you'll see we discovered
and fixed major bugs, eg this one concerned with the order in which we
freed objects:

https://github.com/libguestfs/libguestfs/commit/3888582da89c757d0740d11c3a62433d748c85aa

Note that (3) is a counter-example to the idea that you should use
custom blocks.  Custom block finalizers in OCaml have no ordering
guarantee - if you need an ordering guarantee you must use an OCaml
finalizer.

You can probably see this stuff gets complicated quickly.  Thank
goodness for valgrind!
      

dead_code_analyzer 0.9

Archive: https://sympa.inria.fr/sympa/arc/caml-list/2016-01/msg00046.html

Corentin De Souza announced:
I am happy to announce dead_code_analyzer version 0.9.

The dead_code_analyzer works as a complement to the compiler, warning you
about exported
values, methods, constructors and record fields that are never used.
It also looks for optional arguments always/never used.
In addition it reports some coding style issues, focusing on typing
information.

It needs OCaml 4.03 and can be installed through OPAM and github[1].
The tool assumes that .mli files are compiled with -keep-locs and .ml files
with -bin-annot.

Issues and pull requests are welcomed.

Happy new year,

Corentin

[1] https://github.com/LexiFi/dead_code_analyzer
      

Other OCaml News

From the ocamlcore planet blog:
Thanks to Alp Mestan, we now include in the OCaml Weekly News the links to the
recent posts from the ocamlcore planet blog at http://planet.ocaml.org/.

Richard Jones: Half-baked ideas: C strings with implicit length field
  https://rwmj.wordpress.com/2016/01/08/half-baked-ideas-c-strings-with-implicit-length-field/

Github OCaml jobs: Full Time: Software Developer (Functional Programming) at Jane Street in New York, NY; London, UK; Hong Kong
  http://jobs.github.com/positions/0a9333c4-71da-11e0-9ac7-692793c00b45

Functional Jobs: Software Engineer at Capital Match (Full-time)
  https://functionaljobs.com/jobs/8875-software-engineer-at-capital-match
      

Old cwn

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.


Alan Schmitt