Hello
Here is the latest Caml Weekly News, for the week of January 23 to 30, 2007.
The announce below might be of interest for people looking for jobs involving OCaml. -- Alain Frisch ------------------------------------------------------------- Position available at Paris 7 The laboratory PPS of University Paris 7 is looking for candidates for a one year position, available immediately, around the CDuce project. CDuce is a programming language for XML (see http://www.cduce.org), close in spirit to OCaml, and whose compiler is implemented in OCaml. According to the profile of the recruited person the position will focus more on the development environment (e.g.: libraries for web development or web services, Eclipse plugins, Windows port) or on the research aspects (e.g.: concurrency, typing, distribution, verification) around CDuce. Candidates should be fluent in OCaml or in another functional language. Experience in Web development, environments for software development and/or XML would be useful as well. The annual gross salary will be around 28.000 Euros. If interested please send a mail to staff@cduce.org as soon as possible. Giuseppe Castagna
> Sorry if this an RADBOTFM case. Rule 2 in Chapter 18 of the manual states > that all local variables of type value must be declared using CAMLlocal > macros. However, later on when demonstrating caml_callback we get the > statements: > value* format_result_closure = caml_named_value("format_result"); > return strdup(String_val(caml_callback(*format_result_closure, Val_int(n)))) > (I've "simplified" the opening lines for clarity here - naturally it should > be static and once only!). > Two questions arise: > 1. Presumably it's OK to cache values returned by caml_named_value without > declaring them in a CAMLlocal "call" or by using register_global_root? No, any C pointer to a caml value must be known to the caml GC, because the caml GC might move the caml value. But you might no declare such a pointer if you are sur that the GC won't be triger between your affectation of the value to the C variable, and the use of the C variable.* In the given exemple, this is the case: nothing is done between the affectation and the use. By the way, when in doubt, or when you are a begginer not nowing well how the GC work, you should always use the CAMLlocal call. > 2. The result of caml_callback is passed straight to String_val. Therefore, > if I expand the line to: > value result = caml_callback(*format_result_closure, Val_int(n))); > return strdup(String_val(result)); > then does that work ok without using CAMLlocal1(result); Yes, for the same reason: you do nothing between the affectation of the result function, and the use of the variable.Hendrik Tews also answered:
> Sorry if this an RADBOTFM case. Rule 2 in Chapter 18 of the manual states > that all local variables of type value must be declared using CAMLlocal > macros. However, later on when demonstrating caml_callback we get the > statements: > value* format_result_closure = caml_named_value("format_result"); Note the type! format_result_closure is not of type value, Rule 2 does not apply! > 1. Presumably it's OK to cache values returned by caml_named_value without > declaring them in a CAMLlocal "call" or by using register_global_root? Yes it is OK. And you cannot use CAMLlocal or register_global_root, because they only deal with values and not pointers to values. The manual guarantees that the value pointed to by the result of caml_named_value doesn't move. Probably caml_named_value allocates a value outside the heap, registers it as a global root and gives you back its address. > value result = caml_callback(*format_result_closure, Val_int(n))); > return strdup(String_val(result)); > > then does that work ok without using CAMLlocal1(result); Yes. You should think of values as pointers that point to data that is moved around by the garbage collector. If there is any chance that the garbage collector is called, then you must make sure that it updates your pointer when it moves the data. Hence you have to register the value. If the garbage collector is not called under any circumstances then the data will not move, the pointer doesn't need to get updated and you don't have to register the value. Furthermore, if your are sure Is_long(your_variable) is true under any circumstances, you don't have to register your_variable, because it is not a pointer. Of course all that is strongly discouraged.
I am pleased to announce the OCaml Summer Project. The project is aimed at encouraging growth in the OCaml community by funding students over the summer to work on open-source projects in OCaml. At the end of the summer, we will fly all of the students who have completed their projects succesfully out for a meeting in New York, where people will present their projects and get a chance to shmooze with other members of the OCaml community. The project is being funded and run by Jane Street Capital. As people on the list likely know at this point, we make extensive use of OCaml here at Jane Street, and are excited about the idea of encouraging and growing the OCaml community. If you'd like to learn more about the project, you can look at our website here: http://osp2007.janestcapital.com We'd love to have professors tell their students about the project, since we hope it will do some real good in terms of increasing interest in functional programming. Please direct any questions or suggestions you have to osp@janestcapital.com.Gabriel Kerneis asked and Yaron M. Minsky answered:
> And I'm an interested student looking for ideas (the binary trees > project on the OCSP page seems fine but I'd be glad to have more > ideas) ;-) > Other questions : > - is it possible to propose several projects (per student) and let > the OCSP team decide which one is the better ? Just one proposal per student, I'm afraid. We'd rather you chose a project you liked and believed in, and come up with the best proposal you can fo rit. > (if i were to be selected and finish the job) what kind of > visa/passport/etc. do one need to come to the USA (I'm living in > France) - but i guess i can find this on the Internet - and to what > extent will you pay the flight/food/housing/etc. fees ? We will pay for the travel and your stay. I don't expect we'll pay for all of your meals, but we'll have a few dinners for the group. As for the visa issue, my expectation is that students will figure this out on their own. I suspect that for the most part I believe a tourist visa should do, but I don't really know the details, and it will no doubt vary from country to country.Jon Harrop said:
Sounds like an excellent idea and the projects all look fascinating. However, I do have some comments on the "Binary tree library" project: OCaml currently has two separate implementations of AVL trees in Map and Set functors. Set already has fast union and split operations. Having two separate implementations is wasteful but more efficient. The underlying tree code could be factored out into another functor but this is costly in terms of performance. Also, the OCaml stdlib has used an odd choice of optimisations: inlining height calculation (which is quite a small benefit in the context of functors and polymorphism) but not amortising single-child trees into a separate constructor (which can remove up to 50% of the GC's effort). So the code can be made shorter and faster. I've already implemented my own AVL set using the node-specialisation trick. Performance is ~30% faster, IIRC. I've also wanted to write a functional array based on AVL trees (O(log n) lookup but fast sub, append, insert, delete etc.) and a camlp4 extension to support pattern matching over this type. Lists and arrays are rather priviledged containers in OCaml, having pattern matching and literals, but trees are better in many respects and would make an excellent general-purpose container. Finally, having to use functors does obfuscate OCaml code that deal with Sets and Maps in many cases, particularly because there are no built-in Int and Float modules so you must write your own. I often find that this superfluous code is as long as all of the code using the Sets/Maps. Although it would be "dangerous", Sets and Maps implemented without functors are much easier to use. After all, Hashtbl is typically used in that way. It is also worth noting that several people (Diego, Jean-Christophe) have written other tree libraries using various data structures (RB, trie, splay etc.). As far as I can tell, AVL trees are a good all-rounder. Best of luck with the projects!Xavier Leroy said:
I just wanted to say a big "thank you" to you and the Jane Street Capital people for donating the time and money to organize such an event. It will be very interesting to see what comes out of it. My attempt to put the visa discussion at rest. I believe Yaron and Markus are right: a tourist visa (or visa waiver program) is probably enough; whether you need an actual visa or not is a complicated function of your country of citizenship and of the date your passport was issued, but for citizens of "old Europe", this function returns "no visa needed" with high probability. For more details, see the Web sites of the ministry of foreign affairs or of the US consulate in your country (URLs for France included below for your convenience). Let me add that if you never visited Manhattan before, it's well worth the trip. One more reason to participate in this project! http://www.diplomatie.gouv.fr/fr/pays-zones-geo_833/etats-unis_471/conseils-aux-voyageurs_13261/entree-sejour_29143.html http://www.amb-usa.fr/consul/niv/needvisa/default.htm
I am happy to announce the immediate availability of cmigrep, a small utility to mine cmi files for interesting bits of data. cmigrep is available in godi, or at http://homepage.mac.com/letaris A short description of features, cmigrep: <args> <module> cmigrep has two modes, the first and most common is that of searching for various types of objects inside a module. Objects that you can search for include switch purpose -t (regexp) print types with matching names -r (regexp) print record field labels with matching names -c (regexp) print constructors with matching names -e (regexp) print exceptions with matching constructors -v (regexp) print values with matching names -o (regexp) print all classes with matching names -a (regexp) print all names which match the given expression These are all very useful for finding specific things inside a given module. Here are a few examples, find some constructors in the unix module itsg106:~ eric$ cmigrep -c SO_ Unix SO_DEBUG (* socket_bool_option *) SO_BROADCAST (* socket_bool_option *) SO_REUSEADDR (* socket_bool_option *) SO_KEEPALIVE (* socket_bool_option *) SO_DONTROUTE (* socket_bool_option *) SO_OOBINLINE (* socket_bool_option *) SO_ACCEPTCONN (* socket_bool_option *) SO_SNDBUF (* socket_int_option *) SO_RCVBUF (* socket_int_option *) SO_ERROR (* socket_int_option *) SO_TYPE (* socket_int_option *) SO_RCVLOWAT (* socket_int_option *) SO_SNDLOWAT (* socket_int_option *) SO_LINGER (* socket_optint_option *) SO_RCVTIMEO (* socket_float_option *) SO_SNDTIMEO (* socket_float_option *) full types get printed in the case that the constructors have arguments. Notice that adding to the include path is modeled after the compiler. Findlib is also supported. itsg106:~ eric$ cmigrep -c "^Tsig_.*" -I /opt/godi/lib/ocaml/compiler- lib Types Tsig_value of Ident.t * value_description (* signature_item *) Tsig_type of Ident.t * type_declaration * rec_status (* signature_item *) Tsig_exception of Ident.t * exception_declaration (* signature_item *) Tsig_module of Ident.t * module_type * rec_status (* signature_item *) Tsig_modtype of Ident.t * modtype_declaration (* signature_item *) Tsig_class of Ident.t * class_declaration * rec_status (* signature_item *) Tsig_cltype of Ident.t * cltype_declaration * rec_status (* signature_item *) record field labels itsg106:~ eric$ cmigrep -r "^st_" Unix st_dev: int (* stats *) st_ino: int (* stats *) st_kind: file_kind (* stats *) st_perm: file_perm (* stats *) st_nlink: int (* stats *) st_uid: int (* stats *) st_gid: int (* stats *) st_rdev: int (* stats *) st_size: int (* stats *) st_atime: float (* stats *) st_mtime: float (* stats *) st_ctime: float (* stats *) findlib support, matching value names itsg106:~ eric$ cmigrep -package pcre -v for Pcre val foreach_line : ?ic:in_channel -> (string -> unit) -> unit val foreach_file : string list -> (string -> in_channel -> unit) -> unit nested modules itsg106:~ eric$ cmigrep -v ".*" Unix.LargeFile val lseek : file_descr -> int64 -> seek_command -> int64 val truncate : string -> int64 -> unit val ftruncate : file_descr -> int64 -> unit val stat : string -> stats val lstat : string -> stats val fstat : file_descr -> stats types itsg106:~ eric$ cmigrep -t ".*" Unix.LargeFile type stats = { st_dev : int; st_ino : int; st_kind : file_kind; st_perm : file_perm; st_nlink : int; st_uid : int; st_gid : int; st_rdev : int; st_size : int64; st_atime : float; st_mtime : float; st_ctime : float; } everything! itsg106:~ eric$ cmigrep -a ".*" Unix.LargeFile val lseek : file_descr -> int64 -> seek_command -> int64 val truncate : string -> int64 -> unit val ftruncate : file_descr -> int64 -> unit type stats = { st_dev : int; st_ino : int; st_kind : file_kind; st_perm : file_perm; st_nlink : int; st_uid : int; st_gid : int; st_rdev : int; st_size : int64; st_atime : float; st_mtime : float; st_ctime : float; } val stat : string -> stats val lstat : string -> stats val fstat : file_descr -> stats exception declarations itsg106:~/cmigrep eric$ ./cmigrep -e ".*" Unix exception Unix_error of error * string * string The second mode of cmigrep is for searching for modules in it's path, this lets you do a regular expression match on the full module path, including sub modules. For example, itsg106:~/cmigrep eric$ cmigrep -m Net -package netstring Netaccel Netaccel_link Netaddress Netaux Netaux.ArrayAux Netaux.KMP Netbuffer Netchannels Netconversion Netdate Netdb Netencoding Netencoding.Html Netencoding.Url Netencoding.Q Netencoding.QuotedPrintable Netencoding.Base64 Nethtml Nethtml_scanner Nethttp Nethttp.Header Netmappings Netmappings_iso Netmappings_jp Netmappings_min Netmappings_other Netmime Netsendmail Netstream Netstring_mt Netstring_pcre Netstring_str Netstring_top Netulex Netulex.Ulexing Netulex.ULB Neturl here are all modules starting with Net that contain a sub module itsg106:~/cmigrep eric$ cmigrep -m "^Net.*\." -package netstring Netaux.ArrayAux Netaux.KMP Netencoding.Html Netencoding.Url Netencoding.Q Netencoding.QuotedPrintable Netencoding.Base64 Nethttp.Header Netulex.Ulexing Netulex.ULB modules exactly one level deep itsg106:~/cmigrep eric$ cmigrep -m "^[^.]*\.[^.]*$" Bigarray.Array3 Bigarray.Array2 Bigarray.Array1 Bigarray.Genarray Hashtbl.Make Map.Make MoreLabels.Set MoreLabels.Map MoreLabels.Hashtbl Pervasives.LargeFile Random.State Scanf.Scanning Set.Make StdLabels.String StdLabels.List StdLabels.Array Unix.LargeFile UnixLabels.LargeFile Weak.Make
I would like to announce the first release of our new unit testing framework for OCaml. It can be downloaded from our website here: http://www.iinteractive.com/ocaml/ It is based heavily on the Perl unit testing framework of the same name, and produces TAP output (http://en.wikipedia.org/wiki/Test_Anything_Protocol) which can be read and analyzed by a wide range of existing Perl tools. The goal of this framework is to make writing unit tests as simple and as easy as possible (hence the name). Here is a basic example taken from the TestSimple test suite itself. #use "topfind";; #require "testSimple";; open TestSimple;; plan 9;; diag "... testing O'Caml TestSimple v0.01 ";; ok true "... ok passed";; is 2 2 "... is <int> <int> passed";; is 2. 2. "... is <float> <float> passed";; is "foo" "foo" "... is <string> <string> passed";; is [] [] "... is <'a list> <'a list> passed";; is [1;2;3] [1;2;3] "... is <int list> <int list> passed";; is ["foo";"bar"] ["foo";"bar"] "... is <string list> <string list> passed";; is (1,"foo") (1,"foo") "... is <int * string> <int * string> passed";; is TAPDocument.Ok TAPDocument.Ok "... is <type> <type> passed";; As this is our first released OCaml library, any and all feedback is very much appreciated.
The "1.0" release of the LablPCRE OCaml binding for PCRE is now available, fully supporting Linux and Windows builds PCRE versions 6.1 - 7.0 (current). LablPCRE provides simple and easy to use access to regular expression matching, offering a rich module-based interface based on PCRE's POSIX functions wrapper. This release has been built and tested using OCaml 3.09.3 on Fedora Core 6 and Windows XP, supports findlib and "hands-off" building and installing (no "configure" script or manual file editing required), and has pre-built binaries for [native] Windows XP. The full package is licensed under the "new" BSD license, and may be downloaded here: http://www.rftp.com/Downloads.shtml
Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).
:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'<1':1
zM
If you know of a better way, please let me know.
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.