Hello
Here is the latest Caml Weekly News, for the week of July 24 to 31, 2007.
The CWN will go on hiatus for two weeks as I'll be in vacations with very limited internet access.
As a byproduct of the last ICFP programming contest, I'm releasing an implementation of Knuth-Morris-Pratt string searching algorithm that I wrote years ago. You can find it here, in my ocaml bazar: http://www.lri.fr/~filliatr/software.en.html Just in case it may be useful, as it finally happened to be for myself.
I am pleased to announce OCamlPCSC, an OCaml binding to PC/SC library. OCamlPCSC is a wrapper to libraries that follow the PC/SC standard for Smartcards. In Linux it uses PCSC-Lite from M.U.S.C.L.E. (http://www.linuxnet.com/) and in windows it uses winscard.dll. OCamlPCSC is available at http://www.di.ubi.pt/~desousa/ocamlpcsc. As any other OCaml library it's free for use and open-source (LGPL license).
In the aftermath of the ICFP contest, during which I used Luca de Alfaro's Vec, I felt like implementing ropes, based on Boehm's paper and the well-known code included in his conservative garbage collector. I later realized that the basic implementation strategies ("dense" leaves, bounded tree height and amortized constant time concatenation of small strings) could be generalized to build general extensible vectors similar to Vec. Such vectors (tentatively named "Vect" until I find a better name) have some interesting properties: * lower space overhead than other structures based on balanced trees such as Vec. The overhead can be adjusted, allowing to make "get" faster at the expense of "set" and viceversa. * appending or prepending a small vector to an arbitrarily large one in amortized constant time * concat, subarray, insert, remove operations in amortized logarithmic time * access and modification (get, set) in logarithmic time The first one is probably the most compelling. Currently, Vec imposes a 6-word overhead per element. Even after the obvious modification consisting in adding a new constructor for leaves, the overhead would still be 350%... Vect uses compact leaves with a configurable number of elements (32-64 seem good choices, leading to worst-case overheads of 100% and 50% respectively), which also helps with "get" due to the improved spatial locality. You can find the code for both Rope and Vect at http://eigenclass.org/repos/oropes/head/ It is still young and experimental, but it's maybe worth a look. Any feedback is very welcome. The documentation can be found under http://eigenclass.org/repos/oropes/head/doc/ I've spent some time benchmarking it against Vec; you can also find the code I used and the resulting graphs at the above address. To summarise how it compares to Vec: * Vec can be used when persistence is required, but Vect would probably be a poor choice in this case (until that is fixed using lazy rebuilding, which doesn't seem too hard), unless rebalancing explicitly before "taking the snapshot" is an option * Vect can append/prepend single elements or small vectors very quickly, in amortized constant time. See http://eigenclass.org/repos/oropes/head/append.png * as expected, Vec.set is faster than Vect's in general http://eigenclass.org/repos/oropes/head/set.png However, if the vector is balanced explicitly shortly before an update burst, Vect is somewhat surprisingly faster http://eigenclass.org/repos/oropes/head/set-balanced.png This might be attributed to Vect's smaller memory profile and the fact that it allows better use of the L2 cache, but there seems to be another factor that I have yet to discover. * Vect.get is considerably faster than Vec.get http://eigenclass.org/repos/oropes/head/get.png The above URL is a darcs repository, so if anybody shoots me a patch I'll be happy to apply it :)Jon Harrop asked and Mauricio Fernandez answered:
> Looks awesome! > It seems to use mutation internally. Is it not thread safe? The structure is persistent and all operations are non-destructive. Mutation is used only in the rebalancing operation and in set, but they affect an ephemeral forest of ropes/vects and a new string/array respectively, so the original structure is never modified and in principle all operations should be thread-safe. Ropes/vects being functional doesn't mean that they will perform well in a persistent setting however, see the clarification at http://eigenclass.org/repos/oropes/head/doc/Vect.html > I have some suggestions: > > I'd like metadata in every node, so I can provide a constructor that > combines > the metadata of child nodes and a cull function to accelerate searches. If I understand it correctly, that scheme could in the limit turn some O(n) searches into O(log n)?), right? Unlike Vec, Vect uses "compact" leaves (Leaf of 'a array) of bounded size (leaf_size, typically 16-64), which might not fit very well. Vect would need to know how to combine the metadata, wouldn't it? I was thinking about something like Leaf of ('meta -> 'meta -> 'meta) * 'meta * 'a array but I've realized that this wouldn't suffice. So, given that Vect.t is currently type 'a t = Empty | Concat of 'a t * int * 'a t * int * int | Leaf of 'a array something like this maybe? type ('a, 'meta) t = Empty of ('meta -> 'meta -> 'meta) | Concat of ('meta -> 'meta -> 'meta) * 'meta * ('a, 'meta) t * int * ('a, 'meta) t * int * int | Leaf of ('meta -> 'meta -> meta) * 'meta array * 'a array (* maybe also * 'meta to cache the last computation? *) or even without the ('meta -> 'meta -> 'meta) part, forcing the user to pass the function on each modification? Just thinking out loud. At any rate, it'd be better to provide it as a separate structure, any suggestions for the name?. > The usual HOFs, like map. I just pushed a patch with filter and map. The former is trivially implemented with fold + append (thanks to the O(1) append). I was going to code map the same way but I ended up making one that returns an isomorphic vect and is faster (since there's no need to rebalance). So Vect currently has iter, iteri, rangeiter, fold, map and filter. I'm considering renaming fold to fold_left and providing fold_right too.
I have released XmlRpc-Light 0.4, an XML-RPC library for OCaml based on Xml-Light and Ocamlnet 2. Source, downloads, and documentation are available here: http://code.google.com/p/xmlrpc-light/ This version adds support for writing servers. Two methods are currently provided: CGI (based on Netcgi2) and Netplex. The obligatory hello-world example looks like this: let server = new XmlRpcServer.cgi () in server#register "demo.sayHello" (fun _ -> `String "Hello!"); server#run () To build a Netplex server, just change "cgi" to "netplex". An example of a Netplex server including the required configuration file is in the examples/adder directory. All servers support the "system.getCapabilities" and "system.listMethods" introspection functions, as well as the "system.multicall" protocol. These can be disabled if desired by calling the "unregister" method. Other changes and improvements: - The default date-time functions use the format "20070729T10:42:00-07:00". This seems to be the most common interpretation of ISO 8601 used in XML-RPC servers. You can override this behavior by calling the "set_datetime_encode" or "set_datetime_decode" methods on the client or server. - Date-time parsing errors are now wrapped in XmlRpc.Error so that they will be relayed to clients as faults. - Error handling adheres much closer to the XML-RPC specification and its list of suggested fault codes and strings. - The client now sends a "text/xml" Content-Type header in requests. Thanks to Gerd Stolpmann for the help with Ocamlnet!Richard Jones suggested and Dave Benjamin answered:
> You might want to take a look at Julien Signoles' Calendar library for > date/time types and handling: > > http://www.lri.fr/~signoles/prog/calendar/ I have this library installed, and indeed considered using it when I began writing the date-time support. I would likely have used it, if only it had the ability to parse strings. I really wish Winer had considered alternatives to ISO 8601--say, UTC epoch seconds--in the design of XML-RPC, because it's barely a standard at all! There are so many variations and options that writing a parser for it borders on natural language processing. Even the W3C suggestion, which restricts ISO 8601 to a very small subset, doesn't help here since it still conflicts with the common usage in XML-RPC, with hyphens omitted between the date values. I decided to err on the side of oversimplification, and support only the most common format, leaving in hooks for users to customize the behavior as required. There is still benefit, of course, in using a standard date-time type. I only wonder if it is worth adding another library dependency; I am trying hard to keep the list small (currently only Xml-Light and Ocamlnet, which in turn requires PCRE). I think it would be great if a date-time type were made part of the official OCaml distribution. My only qualm with the Calendar library is that I feel a bit uncomfortable with a top-level module called "Printer" that is for the specific purpose of date formatting. I would assume that a module by that name were for communicating with "lpt", if anything. But hey, what's in a name, anyway... =) Thanks for the advice. I will consider it.
Tim Griffin is advertising a 3-year research associate position at the Cambridge Computer Lab, working on a project that seeks to design and implement a meta-language for the specification and implementation of correct Internet routing protocols. He says "A PL person would be perfect". Details here: http://www.admin.cam.ac.uk/offices/personnel/jobs/vacancies.cgi?job=2114
Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).
:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'<1':1
zM
If you know of a better way, please let me know.
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.