Hello
Here is the latest Caml Weekly News, for the week of 19 to 26 July, 2005.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/29668
Nate Gaylinn asked and Sylvain Le Gall answered:> I'm trying to figure out a nice way to parse OCaml code in a C++ program. > All I need is a very basic parse tree for use in syntax highlighting and > indenting OCaml code. > > I would love to make use of some of the programs that already come with > OCaml. For instance, ocamlc and camlp4 can both output OCaml parse trees > already! The only problem is the formats that these programs output are > (to my knowledge) completely undocumented. > > If I don't use these programs, I could always try using Lex and Yacc to do > the job, but I'm completely unfamiliar with these utilities and it would > take me quite some time to write input for parsing OCaml. > > Does anyone know where I could find reference to the various OCaml parse > tree formats? Does anyone know where I could find Yacc grammar > descriptions of OCaml, or would I have to write them myself? Does anyone > have any other suggestions of how to tackle the parsing problem? You should have a look to ocaml-ast-analyze (http://www.carva.org/sylvain.le-gall/ocaml-ast-analyze.html). It is a module build around camlp4 which can overridde the output of camlp4 (ie it can replace pr_dump.cmo for example). I use it to analyze ocaml code in order to find functions (which are "s_", "f_"... ) followed by a string (s_ 'Coucou') and output only the string ("msgstrid 'Coucou'). I think it is very useful to do analysis on Ocaml AST. There is no documentation, but i am planning to write some.Eric Cooper also answered:
OCaml uses ocamlyacc and ocamllex for its front-end: see .../parsing/parser.mly and lexer.mll. ocamlyacc is very similar to yacc/bison; reuse of the grammar should be straightforward. ocamllex is similar to lex/flex, but it supports mutually recursive rules instead of lex's explicit "start conditions", so you might have a bit more work if you want to translate it.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/29595
In the middle of this huge thread, Doug Kirk said and James Woodyatt added:> First, since this thread was started by somebody asking for ideas on > learning FP, I can site a couple of printed resources that have helped > me, but none are written using Ocaml: > > "Haskell School of Expression", Hudak, ISBN 0-52164-4089 > "Algorithms: A Functional Programming Approach", Rabhi, Lapalme, ISBN > 0-20159-6040 > "Structure & Interpretation of Computer Programs", Sussman, Abelson, > Sussman, ISBN 0-26269-2201 I would add that a good tutorial on monads is a resource that every functional programmer should read for comprehension. I haven't found one that I can recommend without hesitation, but here's a candidate: http://www.nomaware.com/monads/html/
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/29583
Continuing the thread from last week, Norman Ramsey said:In addition to other papers mentioned on this list, there are two unpublished papers that may have some value: Baudinet, Marianne and David MacQueen. 1985 (December). Tree pattern matching for ML (extended abstract). Unpublished manuscript, AT&T Bell Laboratories. Scott, Kevin and Norman Ramsey. 2000 (May). When do match-compilation heuristics matter? Technical Report CS-2000-13, Department of Computer Science, University of Virginia. My paper with Kevin Scott has some tutorial stuff in it. If you like, I can try to dig out source code, but it's in SML and may not be so useful.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/29724
Oliver Bandel announced:I have to announce version 0.2.8 of pftdbns, a tool which is useful in sorting/listing/moving files. It's name "pftdbns" is a short hand for "put files to directories (sorted) by name structure". It takes filenames, maps each char of the filename into a char, representing the charclass of it (a..z and A..Z -> "l" (letter), 0...9 -> "i" (integer" and so on)). This yields to an easy way of sorting files by names, based upon file-naming with certain filenaming-conventions. So, for example "hello.txt" and "ballo.txt" are part of the same name structure, as well as "1001.txt" and "8251.txt" but also "8251.jpg" are of the same name structure. For example "foobar.tex" and "foobar.txt" are equally structured too. The default behaviour is to move files into directories. The names of the directories are choosen from the string, which represents the name structure by default. Since version 0.2.2 the default-behaviour of moving files can be changed to only list the lilenames/namestructrues (and not moving them). Changes in 0.2.8: ================= Bug-Fix: -------- -inv option corrected! It didn't worked properly in version 0.2.6, because the filter function was stupid! Source restructuring: --------------------- Some changes in datastructures to have them more clear and have a more efficient program (multiple/unnecessary calls of the function "scan_string" removed). This was done in the parts of the program, where the file/template lists are build. I hope you enjoy this program, and I think if you have to handle a lot of files, this will be very helpful. You can find the tool here: ftp://www.belug.org/new/user/ob/Programs/Tools/pftdbns/ There also is a README in this directory, so that you can read more details. A description can also be found here: http://www.belug.org/~ob/ftp-area.html pftdbns uses the GPL-license.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/29708
Thomas Fischbacher started a huge (sometimes flaming) thread asking:I repeatedly stumble across problems where I would like to use a programming style which seems quite out of line with how one is supposed to use OCaml (to say it otherwise, it looks outright wrong and horrendously ugly), but which nevertheless might be just appropriate for the situation in question. So, I would like to hear the opinion of the OCaml community about this. Imagine constructing a binary heap of tuples. It is somewhat neat to be able to just use an array to store the entries of the heap which is pre-allocated to some fixed size and replaced by a larger array whenever more space is needed. The only thing known about the heap's entries at compile time is that they are bound to be tuples, hence boxed objects, but nothing else is known about their size or even type. As one does not have a prototype of such a tuple at the time the array is created, it seems to me as if the best thing one could do in OCaml would be: (1) Make an array of <tuple> option and initially fill it with None. (2) Make an optional array of tuples which is None until the first entry is made. One drawback of approach (2) is that when we "remove" entries from the heap, the underlying array will keep unnecessary references to values which by chance might be quite big, and which we might want to be reclaimed by GC, so that's not too beautiful for a general-purpose application. The major drawback of (1), however, is that there is a further level of indirection for array entries, which means some unnecessary consing, as well as more work for the machine to get at a given entry. If I just define a function let whatever () = Obj.magic (1,2); then let a = Array.make 10 (whatever());; would give me more or less just what I want. An array of boxed things that I then would like to use as in: # let () = a.(2) <- (1,2,3,4) in a.(2);; - : int * int * int * int = (1, 2, 3, 4) # let () = a.(3) <- (5,8,2,1) in a.(2);; - : int * int * int * int = (1, 2, 3, 4) # a.(3);; - : int * int * int * int = (5, 8, 2, 1) Mind you - in this case, I will only assign int*int*int*int tuples to that array, or the result of the evaluation of whatever() when I want to kill an entry. Plus, I guarantee never to look at any entry that is set to whatever(). In some other situation, I might be inclined to use the same code, but with string*bool instead of int*int*int*int. But again, I will promise never to put anything else but string*bool into the underlying array, and never look at entries which are not set properly. (Which, of course, includes never printing such an array at the toplevel unless it is fully occupied.) Please don't tell me that "well, OCaml is not Lisp, after all". This I already know. But is there no proper way to get around that (technically speaking) unnecessary extra level of indirection that is forced upon me due to static type checking? The very same problem appears in a different guise when one tries to write a function that flattens 'a array array -> 'a array. The best I could come up with here is: let join_arrays aa = let len_total = Array.fold_left (fun sf a -> sf+Array.length a) 0 aa and nr_arrays = Array.length aa in if len_total=0 then if nr_arrays = 0 then [||] else aa.(0) (* ^ Take a close look! *) else (* Here, we face the problem that we have to get some value of the proper type to init the joined array with. *) let first_entry = (let rec seek pos = let array_here = aa.(pos) in if Array.length array_here = 0 then seek (pos+1) else array_here.(0) (* This is guaranteed to terminate, as len_total>0! *) in seek 0) in let result = Array.make len_total first_entry in let transfer_to_result arr offset = let nr = Array.length arr in for i = 0 to nr-1 do result.(i+offset) <- arr.(i) done in let rec populate nr_array_now offset = if nr_array_now = nr_arrays then result else let array_now = aa.(nr_array_now) in let () = transfer_to_result array_now offset in populate (1+nr_array_now) (offset+Array.length array_now) in populate 0 0 ;; Of course, it is pretty awkward having to scan for the first element in the first nonempty array in an array of arrays, especially as that element really does not play a that special role. Could anyone please tell me how to do all this in a more appropriate way?Deep into the thread (please read the read online), Michael Alexander Hamburg said and Xavier Leroy replied:
> I was constructing a binary heap of tuples the other day. After pondering > these options, I just used Obj.magic 0 as the null value in the array. > The heap was in a module, so nothing else could see the array, and I could > prove that the code never accessed the null elements, so the use of > Obj.magic seemed justified. In other terms: " I was walking in the city the other day. I saw a syringe lying on the sidewalk. I stuck the needle in my forearm. That was a classy neighborhood, so the use of the syringe seemed justified. " Sorry for being sarcastic, but I strongly feel that any suggestion to use Obj functions should be avoided on this list. The OCaml compiler performs some type-dependent optimizations that can result in incorrect code (w.r.t. GC invariants) if wrong types are given using Obj.magic. For instance, the following implementation of "magic" arrays will eventually cause the GC to crash: type 'a t = int array let get (a: 'a t) i = (Obj.magic a.(i) : 'a) let set (a: 'a t) i (x: 'a) = a.(i) <- (Obj.magic x : int) while the same code with "string" instead of "int" will not. You don't understand why? Then, don't use Obj.magic. A few years ago, I spent one full day debugging a mysterious crash in code provided by a user, then realized that the problem was exactly the use of Obj.magic outlined above. I then swore to throw away all bug reports whose repro case uses Obj. So, you can break the type system with Obj, but you get to keep the pieces afterwards. Coming back to the initial question, I would first warn against premature optimization: quite possibly the overhead of the "option" solution is negligible. If not, just ask the user to pass an initial value of the heap element type to the "create heap" function.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/29781
Eric Stokes announced:Announcing the release of ocamldap 2.1.4, which includes bugfixes, and performance enhancements. The BER decoder sees probably its last significant performance increase in this release, however thats ok, because the increase is a quite large 2.5x speedup. To put this in perspective the bytecode decoder is now 10% faster than the first release of the native code decoder was, and is 14 times faster than perl's Net::LDAP.
Archive: http://thread.gmane.org/gmane.comp.lang.caml.general/29782
Gerd Stolpmann announced:The Ocamlnet project announces version 1.1 of this library collection. This release focuses on important additions: - Nethttpd is a library with HTTP 1.1 daemon functionality. The core of this daemon is quite complete; almost all HTTP 1.1 core features are supported (more or less only multipart/byteranges are omitted). The daemon can serve static and dynamic pages. There are two operating modes: In reactive mode, the server works sequentially, and in event-based mode, the server can do multiplexing. What is still missing are ready-to-use MPM containers (but it is already very easy to create a multi-threaded server with very good performance). Nethttpd was sponsored by Alex Baretta's company Baretta s.r.l. and implemented by me. Thank you, Alex. Nethttpd is released under the GPL (unlike the rest of Ocamlnet). - Smtp is a simple SMTP client library, i.e. it can transfer mail messages to a mail server. Smtp is a contribution by Pierre Habouzit. Furthermore, there are a few bugfixes: - Netmime.write_mime_message: A bug in CR/LF handling has been resolved. - The FastCGI library can now handle POST requests larger than 4K - Newer versions of the PCRE library are now supported. The addition of Nethttpd also made some refactoring necessary: - There is a new Nethttp module with common functions for all components dealing with HTTP. Nethttp includes an almost complete set of parsers and printers for the individual HTTP headers. - The cgi_environment class type has been extended. Especially, one can now set an error logging function. ---------------------------------------------------------------------------- What is Ocamlnet? ---------------------------------------------------------------------------- A collection of modules for the Objective Caml language which focus on application-level Internet protocols and conventions. The current distribution contains: - a mature implementation of the CGI protocol - an implementation of the JSERV protocol (AJP-1.2), can be used with mod_jserv (Apache JServ) and mod_jk (Jakarta connector) to connect application servers written in O'Caml with web servers - an implementation of the FastCGI protocol, for the same purpose - an HTTP daemon, for substituting the web server completely - a POP3 client - an SMTP client - a library of string processing functions related to Internet protocols (formerly known as "netstring" and distributed separately): MIME encoding/decoding, Date/time parsing, Character encoding conversion, HTML parsing and printing, URL parsing and printing, OO-representation of channels, and a lot more. Ocamlnet is developed as a SourceForge project: http://sourceforge.net/projects/ocamlnet Developers and code contributions are welcome. Ocamlnet is licensed under the zlib/libpng license, except the HTTP daemon, which is released under the terms of the GPL.
Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).
:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'<1':1
zM
If you know of a better way, please let me know.
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.