Hello
Here is the latest Caml Weekly News, for the week of July 18 to August 01, 2006.
Apropos of the recent discussion on the best way to handle command- line arguments, I'd like to announce the release of pa_arg version 0.2.0, a syntax extension and library for parsing command-line options. It offers: * a clean, simple syntax for declaring and using options * a pure functional interface * GNU-style or Caml-style handling of options (-ane 'arg' vs -a -n -e 'arg') * symbol-typed options via polymorphic variants * generated code to produce a string representation of the chosen options * sensible defaults, but powerful, extensible behavior * and more... The extension and supporting module, along with a detailed example and more documentation are available at http://www.cs.cornell.edu/~ebreck/pa_arg/ It's also available from GODI as godi-pa_arg (or will be as soon as the GODI svn archive is back up). I'd like to thank Martin Jambon for his excellent camlp4 tutorial as well as detailed feedback on previous versions of this extension. thanks! Eric Breck A short example (a longer example is available on the website): main.ml contains: open Parseopt type option opts = { alpha : help = "smoothing parameter"; float; beta = false : action = set; bool; gamma : key = ["-x"]; [`Ecks | `Why | `Zee]; delta : int; } if the user types: ./main -alpha 0.45 -b -x why anonymous args then then the call parse_argv opts returns {alpha=Some 0.45; beta=true; gamma=Some `Ecks; delta=None}, ["anonymous";"args"] PS: If anyone remembers a prior posting of this to ocaml_beginners last year, this version is considerably improved, but somewhat different (for one thing,it no longer interfaces to Arg, but provides its own library). Therefore, it's incompatible with that version.
> Hi, does anyone know how (if) the available ocaml sqlite bindings > work with recent versions of sqlite (namely 3.x.x)? i think there are more than one available SQLite bindings for OCaml, including some that are just for versions of SQLite before the API was changed in version 3 and others engineered directly for SQLite3. I am using Bárður Árantsson's bindings that are available at: http://www.imada.sdu.dk/~bardur/personal/45-programs/ These particular OCaml bindings in many ways directly mirror the SQLite3 C API; they are not a DBI-like interfaces, but work well if you are familiar with SQLite3. I have modified the bindings slightly for my own needs by adding a few methods and fixing a few bugs (any fixes of which I plan to eventually contribute back to Bárður), but it works very well with the latest version of SQLite3. I am using it just fine with version 3.3.6 of SQLite. I have successfully used it to create database files of several gigabytes (populated with marshaled OCaml objects using Sqlite3's blob type).Toby Allsopp also answered:
> Hi, does anyone know how (if) the available ocaml sqlite bindings work with > recent versions of sqlite (namely 3.x.x)? http://metamatix.org/~ocaml/ocaml_sqlite3.html I've used this and it works fine, although there is a minor bug that causes seg faults in certain situations that I can't currently recall. I have a patch that I will send next week from work. (See the patch at the archive link.)
> Are there any best practices/libraries regarding logging in CAML? My > main use for it will be debugging (I know there are debugging > facilities in CAML, but I am very used to using log as a debug tool - > think Log4J for instance)... I do not know Log4J. For the same purpose as yours, I did a small Log library, using another one, called Tics. Tics is a binding to QueryPerformanceCouter and al. on MS Windows. At this moment, it is not multiplatform, but this should be done very easily. I need Tics because each message sent to Log is timestamped and stored in memory. The user of Log then can flush and print or display the stored message when possible. I did this because I have plenty of memory for those tests, and I wanted to avoid IO perturbations of timing. I also needed precise timing, because I'm working with hardware. Except from those constraint, I really don't think my tools are forth using.James Woodyatt suggested:
You may want to have a look at the Cf_journal module in my recently released OCaml NAE Core Foundation library. See http://sf.net/projects/ocnae/ for downloads. It's not anywhere near as full- featured as Log4J, but it was intended as a foundation upon which to build something that full-featured. I find it works pretty well for debugging purposes. Your mileage may vary.Jean-Christophe Filliatre suggested:
I'm also using logging as a debug facility sometimes, and I use a printf-like function for this purpose, as follows: ========================= let log_ch = open_out "logfile" let log = Format.formatter_of_out_channel log_ch let () = at_exit (fun () -> Format.pp_print_flush log (); close_out log_ch) let lprintf s = Format.fprintf log s ========================= which provides a function lprintf of type ========================= val lprintf : ('a, Format.formatter, unit) format -> 'a = <fun> ========================= to be used like Format.printf. This is surely not the best way to do, and not very powerful, but at least it is convenient to use when you already have Format-like printers for your datatypes (using %a).
> I would like to tinker with Cocoa bindings and try to move that > project forward. > > Where should I start from? I would like to take the route of > parsing Obj-C header files. I agree that this is long overdue. A long time ago, Mike Hamburg, Jeff Henrikson, and I made noises about working on this, based on Mike's work on addressing the runtime side of it (he got so far as to have a somewhat under-performant but usable runtime library integrating O'Caml with the Objective C runtime based, IIRC, on Obj.magic) and Jeff pointed out that Frontc, the parser that he used in his Forklift FFI, had diverged from probably the best one for real- world use, which is embedded in CIL and intertwined in ways that make it a challenge to backport. Mike, Jeff, if you're reading this, is this a fair characterization of your efforts and thoughts? In any case, I still believe that: 1) It's worth addressing whatever issues need addressing in Mike's work. 2) It's worth resolving what parser to use and, IMHO, how to evolve Forklift to support generating calls to and from Objective C via Mike's runtime work. 3) It's worth combining the two to provide the Forklift annotations to allow calling into and out of Cocoa on Mac OS X. 4) It's worth writing an FRP system for O'Caml a la Yampa for Haskell. 5) It's worth using said FRP system in conjunction with perhaps http://ocaml-win32.sourceforge.net, http://wwwfun.kurims.kyoto-u.ac.jp/soft/lsl/lablgtk.html, and our Cocoa bindings to create a truly cross-platform GUI environment for O'Caml.Joel Reymont then asked and Paul Snively answered:
> I must be missing something but ... what does FRP have to do with > Cocoa bindings? In and of itself, nothing; I just like the FRP approach to GUI programming, so I see an opportunity to kill two birds with one stone: 1) Provide O'Caml a nice FRP framework. 2) Provide O'Caml a nice GUI framework that doesn't suffer the vagaries of the usual OO GUI frameworks.James Woodyatt said:
> In and of itself, nothing; I just like the FRP approach to GUI > programming, so I see an opportunity to kill two birds with one stone: > 1) Provide O'Caml a nice FRP framework. At the risk of engaging in more than my fair share of self-promotion, I should point out that the OCaml NAE I/O Reactor library I just released is an FRP framework. It's pretty spare at the moment and needs a lot of additions. Also, I didn't write it with graphical user interfaces in mind-- the goal was a good framework for single- threaded multiplexing network application servers. (The acronym "NAE" stands for 'Network Application Environment'.) http://sf.net/projects/ocnae/ The Yampa framework doesn't strike me as appropriate for building a GUI. I suspect such a GUI toolkit would offer highly underwhelming performance characteristics. I could be wrong about that, and would welcome such a surprise, but that's what my unscientific guess tells me. > 2) Provide O'Caml a nice GUI framework that doesn't suffer the > vagaries of the usual OO GUI frameworks. For my own part, I plan to do all my GUI work in Cocoa with native Objective-C. A more useful addition to the OCaml HUMP, I argue, would be bindings for CoreData. http://developer.apple.com/macosx/coredata.html At some point, if no one else has done it first, I will get around to doing it myself. Don't anybody hold their breath waiting for it, though... I have a lot of hobby projects these days, what with a day job and a 6-month old baby in the house.
> I'm wondering if anyone has a sparse matrix library for ocaml, or > perhaps a wrapper around some C library. lacaml might be suitable for you. You can find it here: http://www.ocaml.info/home/ocaml_sources.html
I just released an update of "The Whitespace Thing" for OCaml, my preprocessor that lets you use Python or Haskell-style indentation to avoid most multi-line parenthesization. The update adds support for some language features that I had previously overlooked. http://people.csail.mit.edu/mikelin/ocaml+twt/ While the implementation is slightly lame, I'm using this every day on moderately large projects and I recommend it if you like this style.
Would you be interested in working for three months at Microsoft Research, Cambridge, on a project related to GHC or Haskell? MSR Cambridge now takes interns *year-round*, not just in the summer months. Simon Marlow and I are keen to attract motivated and well-qualified folk to work with us on our research, and on improving or developing GHC. (I'm being a bit cheeky sending this to the Caml mailing list! But there are lots of other programming-language folk at MSR (F#, security, probabilistic languages...), and stuff beyond that (machine learning, systems and networks...); see http://research.microsoft.com/aboutmsr/labs/cambridge/.) Anyway, if you or one of your students is interested, you can find more details here http://hackage.haskell.org/trac/ghc/wiki/Internships Our next empty slot is the Oct-Dec period, but you are welcome to apply for some later period.
Are there any screen-scraping packages for OCaml? I'm looking for something that would let me analyze the contents of a web page and extract, for example, all the image tags. I'm using Ruby for this at work and something like hpricot [1] is very neat but also somewhat slow. Thanks, Joel [1] http://code.whytheluckystiff.net/hpricot/Karl Zilles answered:
I don't think of this as screen scraping. Spidering might be a better word. I've done a good bit of this in OCaml. I use the curl package for downloading web pages and the netstring package for parsing them. I'm going to attach a couple of files that I use for this sort of stuff. The file htmltreeutils.ml has a bunch of functions for working with the results of a nethtml parse tree. So your program would look something like this.. and this hasn't been tested: open Htmltreeutils let result = Buffer.create 2000 in let connection = Curl.init () in Curl.set_httpget connection true; Curl.set_url connection "http://www.yahoo.com/randompage.html"; Curl.set_writefunction connection (fun s -> Buffer.add_string result s); Curl.set_headerfunction connection (fun s -> ()); Curl.perform connection; Curl.cleanup connection; let dom = get_parsed_html_from_string result in let img_tags = list_tags "img" dom in .... do something with img tags here like pull out their src attributes (The helper files are at the archive link.)Richard Jones also answered:
We did some web scraping using WWW::Mechanize + perl4caml. As a result, perl4caml contains pretty complete bindings for the WWW::Mechanize library. http://merjis.com/developers/perl4caml http://resources.merjis.com/developers/perl4caml/Pl_WWW_Mechanize.www_mechanize.html
Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).
:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'<1':1
zM
If you know of a better way, please let me know.
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.