Hello
Here is the latest Caml Weekly News, for the week of 22 February to 01 March, 2005.
Archive: http://caml.inria.fr/archives/200502/msg00455.html
Hans-Wolfgang Loidl announced:APPSEM II Summer School Frauenchiemsee, September 8-12, 2005 1st CALL FOR PARTICIPATION The IST-FET Summer School on Applied Semantics (APPSEM-II) will take place September 8-12, 2005 in Frauenchiemsee near Munich. We are proud to announce the following confirmed speakers: Andrew Pitts, Cambridge: Nominal Syntax and Semantics Philippa Gardner, London: Local Reasoning about Data Update Francois Pottier, Paris: A modern eye on ML type inference: old techniques and recent developments Gilles Barthe, Nice: Dependent Types in Programming Chris Hankin, London: Principles of Program Analysis The school is open to all interested graduated and (senior) undergraduate students as well as researchers. A limited number of student grants covering all fees, accommdation, and travel, are available. Neither participation nor grants are restricted to APPSEM sites. For details including deadlines please see the WWW page at http://www.appsem.org/summer_school.html
Archive: http://caml.inria.fr/archives/200502/msg00537.html
Erik de Castro Lopo announced:I had a request to make to the round_to_int patch easily accessible for anyone who wants it, so here it is: http://www.mega-nerd.com/OCaml/ I will try and keep this patch up to date as Xavier and the team release new versions of the already excellent O'Caml compiler.
Archive: http://caml.inria.fr/archives/200502/msg00530.html
Paul Argentoff asked and Jerome Simeon answered:> I have recently found a features in PXP named "pull parser", "event > interface". I hope these things can help me with such a problems as xmpp > streams parsing or huuuuge files parsing using Ocaml lazy streams (to avoid > "Out of memory" errors). Can anybody suggest an url/other place to read > more on these? I'm now reading the pxp source comments and version infos > from it's site. Those are just a pull variant of a SAX parser. People at BEA have done some work on that (They call it token stream): Daniela Florescu, Chris Hillery, Donald Kossmann, Paul Lucas, Fabio Riccardi, Till Westmann, Michael J. Carey, Arvind Sundararajan, Geetika Agrawal: The BEA/XQRL Streaming XQuery Processor. VLDB 2003: 997-1008 http://www.informatik.uni-trier.de/~ley/db/conf/vldb/vldb2003.html#FlorescuHKLRWCSA03 The XTiSP system which was presented at PLAN-X in January seems to have something similar as well: # XTiSP presented by Keisuke Nakano (UTokyo) http://xtisp.psdlab.org/ XML pull token streams also used extensively inside the Galax's query engine. There are probably other projects using those.Paul Argentoff then asked and Gerd Stolpmann answered:
> One more question: where can I find any documentation (besides comments) on > pxp-pp library? How can I use it? See the file doc/PREPROCESSOR which is part of the distribution tarball.
Archive: http://caml.inria.fr/archives/200502/msg00519.html
Richard Jones announced, spawning a big thread:http://merjis.com/developers/xphelloworld This is something I've been meaning to do for over a year now, and I've finally got around to it. In 2003 I worked on a project where we wrote a complex graphical (Gtk-based) application for Windows. The program was primarily written on Linux, and we developed a cross-platform Makefile and installer allowing us to target both Windows and Unix platforms. The managers of this project have kindly allowed me to release the Makefile, NSIS installer script, and supporting code into the public domain. This is a "Hello, World"-type program which shows how it is possible to write a cross-platform graphical application which targets Windows and Unix. On Windows, it comes with an installer, an uninstaller, a desktop icon and menu entries. It has the native Windows look and feel on Windows. On Linux/Unix it has the ordinary Gtk look and feel. License is public domain. You can do whatever you like with the Makefile and installer script, including writing proprietary packages. I need help documenting how to install all the many extra development packages required under Windows. Let me know if you can help me document this. At the moment I have a Windows box here which works, but I'll need to reverse engineer exactly what I installed and where I got each component from.Sven Luther said and Richard Jones answered:
> Well, that would be obvious. But then again, the interest is to do > cross-plateform builds, which i previously failed to do with ocaml/gtk. Ah, you mean something like cross-compiling? No, it won't help you do that. > Well, it was not entirely sure what exactly is included in it, since there is > not a single mention of the ocaml word in the whole body of the email, and > still it was sent to the ocaml mailing list, so ... I should probably have said lablgtk2, instead of just gtk.Blair Zajac asked and Richard Jones answered:
> What would it take to have a native Mac OS X interface that doesn't use X11? The code uses lablgtk2 for the interface. If you look in the source you'll see a ~10 line program called "hello.ml" which uses the lablgtk2 API. On Windows, widgets are rendered using Gtk with a Windows theme applied, so they're not native widgets, although they look like it. To get true native widgets across all three major platforms you'd have to use a different API completely. WxWidgets is the obvious choice (http://www.wxwindows.org/). You'd probably want to start with SooHyoung Oh's wxcaml (http://pllab.kaist.ac.kr/~shoh/ocaml/wxcaml/doc/) which was built using CamlIDL from the WxWidgets headers. WxWindows isn't really suitable for what I want to do because it doesn't support a rich canvas widget, nor a good rich text editor.The following exchange occurred, and Jon Harrop said:
Richard Jones wrote: > Jon Harrop wrote: > > Richard Jones wrote: > > > WxWindows isn't really suitable for what I want to do because it > > > doesn't support a rich canvas widget, nor a good rich text editor. > > > > Does it support cross-platform OpenGL? If so then you could write your > > GUI in OpenGL... > > Joke, right? No, not at all. Just this afternoon, a friend of mine suggested that I commercialise the OCaml port of my vector graphics engine: http://www.chem.pwf.cam.ac.uk/~jdh30/programming/opengl/smoke/ The OCaml implementation is much more evolved and vastly easier to use, of course. In particular, it makes cross-platform GUIs relatively trivial. I didn't believe him though. I mean who would want to be able to write cross-platform GUIs easily? Especially smoothly animated ones with alpha blending, texture mapping and integrated 2D and 3D. Seriously though, if I did this, would anyone be interested in buying it to develop commercial applications with for, say, 1,000UKP? > Blender actually has a GUI written in OpenGL. One of > the remarkable consequences of this is that you can smoothly zoom and > sheer the controls ... Yes, if you're already using OpenGL then there are a lot of advantages to having an OpenGL-based GUI. Even if you're not already using OpenGL, it is the most cross-platform GUI-capable API and runs on virtually any modern computer, typically with performance orders of magnitude better than anything you'll get with Qt, GTK, WxWindows or any other software renderer. I develop for Linux and just had a go on another friend's Apple PowerBook. Once you've added <-cclib "-framework Foundation"> to the link line, the OCaml code compiles and runs beautifully. If you want some examples of trivial OpenGL programs written in OCaml, have a look at the freebies from my book: http://www.ffconsultancy.com/products/ocaml_for_scientists/visualisation There are Linux and Mac OS X executables you can just click on. Also, check out the examples which come with lablGL and lablglut. The main omission for me is then the lack of native-looking drop-down menus and a save dialog. I tried to port my lablglut-based code to lablgtk but failed miserably - I couldn't even get a window with a menu bar and a full-size OpenGL widget. Incidentally, would someone be so kind as to send me some Windows executables of my demos? Then we could have the full complement. :-)Later in the thread, Jon Harrop said:
To clarify, here's an example of what I'm talking about: http://www.ffconsultancy.com/temp/smoke.png This is a screenshot of a cross-platform vector graphics editor I've been writing. The GUI is done entirely in OpenGL, overlaid on the vector graphics being edited. As it is based upon Smoke, the entire editor is only 2,166 lines of straightforward OCaml code. It features: 1. Vector graphics editing (the "cursor" and "line" tools). 2. An optionally-visible, snapable grid (the "hash" tool). 3. Smooth panning and zooming (the "eye" tool). 4. Native-format import and export. 5. EPS and SVG export. 6. Dynamically-loadable OCaml byte-code tools for the tool-bar (so users can create their own plug-in tools and sell them). Of course, awesome performance, anti-aliasing, transparency, gradient and radial fills and many other features are inherited from Smoke. I'd have thought that seasoned OCaml hackers would be able to knock out ass- kickingly good commercial applications in no time with a library like this...Slowly straying off-topic, Bardur Arantsson said:
Maybe you should try Eclipse and the OCaml-plugins: - http://www.eclipse.org/ - http://eclipsefp.sourceforge.net/ocaml/ Now, I haven't tested the OCaml/C++ support, so I don't know how good that is, but the Java support in Eclipse is incredible.Vincenzo Ciancia asked and Richard Jones answered:
> Do you mean that gtk has the native look and feel on windows, including > e.g. font selection or file open dialog? Good question. Answer is unfortunately no. The native look and feel is provided by the Gtk-Wimp theme (http://gtk-wimp.sourceforge.net/) which was recently integrated into Gtk itself. However the theming only applies to widgets, and not to such things as the file->open dialog. Note that even Microsoft isn't consistent in this area. Office XP, for example, uses its own widgets and dialogs. Printing is another area where things are complicated. Under Unix it's relatively straightforward: generate a PS or PPM file and pipe it into 'lpr'. On Windows things are much more complex. For the app I wrote in 2003, I managed a primitive form of graphic-only printing by hacking out a standalone printing program from the guts of the GIMP Windows printer driver (written in C, not OCaml). I can let anyone have this if they're interested.
Archive: http://caml.inria.fr/archives/200502/msg00585.html
mulhern asked and Sylvain Le Gall answered:> I need to use OCaml source code as input to a tool that generates a > related sort of output. It's not a source-to-source translator; the > code generated will not do what the OCaml source code does. It's not a > pretty printer either, the relationship isn't so direct. I'ld like to > write the tool itself in OCaml and am hoping to use CamlTemplate. > > What is the most direct way to get a useful, easily traversable, > representation of OCaml source code in OCaml? Clearly, there is one > embedded in the various ocaml tools. > > I understand that Camlp4 will dump out a binary AST, which is then > input to the OCaml compiler. If that is the best way to go, could > somebody give me some pointers about how to traverse this AST? I have > been unable to extract the information from the Camlp4 documentation. > > I have also been looking at the OCaml src code distribution. I realize > it's possible to pass the compiler a flag (dparsetree) that will cause > a pretty-printed version of the parse tree to be dumped out. On > examination of compiler.ml I can see how that ast that gets pretty > printed is constructed. Is my best bet to write an ocaml application > that just uses a a large chunk of the ocamlc source code modeling the > application as best I can on the compiler.ml source? > Or would I be better off parsing the pretty-printed stuff that gets > dumped out? Or, could I write my own printer that is not so pretty and > dumps a textual representation that is very easily parsed so that the > AST can be reconstructed and insert that into my local version of > ocamlc? > > I'm sure people have encountered similar problems before. Any advice > based on your experience would be very much appreciated. > Well, i have written a kind of library called ocaml-ast-analyze. It use camlp4 to register a printer of source code. You can inject function to do code analysis using this module and you can be able to analyze certain part of the code. I don't have released this library. If you are intersted in it, i can send you a copy. Can you take a look at the file, and then told me if you want the source ? URL : http://www.gallu.homelinux.org/cgi-bin/viewcvs.cgi/ocaml-ast-analyze/trunk/?root=svn And look at : http://www.gallu.homelinux.org/cgi-bin/viewcvs.cgi/ocaml-gettext/trunk/libgettext-ocaml/pr_gettext.ml?rev=359&root=svn&view=auto for example. Kind regard Sylvain Le Gall ps : it use camlp4, so it is limited to what camlp4 do, for example i should use the same command line as camlp4 invocation...
Archive: http://caml.inria.fr/archives/200502/msg00598.html
David Monniaux asked and Pierre Letouzey answered:> I'm currently playing with Coq and extraction, and I'm having the > following problems: > * Since the list constructor of standard OCaml (infix ::) is not (yet) > usable as a prefix ( :: ), I cannot just tell Coq to extract lists to > OCaml lists. (Future OCaml versions will allow prefix ( :: ), I'm told.) Yes, Yves Bertot has come long ago with the same problem, and I remember having submitted a feature-wish to the Ocaml team. > * OCaml compiles match e with true -> x1 | false -> x2 less efficiently > than if e then x1 else x2 (bug report filed, but not fixed so far). > > Unless I'm greatly mistaken, this can be fixed by a mere syntactic > transformation using Camlp4. I suppose I'm not the first person to have > encountered these problems... So has anybody done the necessary Camlp4 > scripts? :-) > Indeed, I've got such a camlp4 translator. I do not advertize it much, because it's not very robust. In particular, if you have defined your own customized "list" datatype or reused any of the Coq standard library names, that will lead to problems. Nevertheless, in normal situation, it will do what you expect. The script (containing instructions) is: http://www.lri.fr/~letouzey/download/pp_extract.ml In particular, there is a portion to uncomment if you want the sumbool type (left|right) to be translated to boolean and hence take advantage of "if ... then ... else" syntax.Martin Jambon also answered:
> I'm currently playing with Coq and extraction, and I'm having the > following problems: > * Since the list constructor of standard OCaml (infix ::) is not (yet) > usable as a prefix ( :: ), I cannot just tell Coq to extract lists to > OCaml lists. (Future OCaml versions will allow prefix ( :: ), I'm told.) > * OCaml compiles match e with true -> x1 | false -> x2 less efficiently > than if e then x1 else x2 (bug report filed, but not fixed so far). > > Unless I'm greatly mistaken, this can be fixed by a mere syntactic > transformation using Camlp4. I suppose I'm not the first person to have > encountered these problems... So has anybody done the necessary Camlp4 > scripts? :-) The following should solve your problem #2 (I just wrote as an exercise for myself :-) It works by overriding (partially) the regular syntax of OCaml (which might not be the syntax of Coq, I don't know Coq). If anyone has a better solution, please tell me. The general solution however seems to work only on the AST. Any comments or suggestions? (* File pa_matchbool.ml Compilation: ocamlc -c -I +camlp4 -pp 'camlp4o pa_extend.cmo q_MLast.cmo -loc loc' \ pa_matchbool.ml Test: camlp4o -I . pr_o.cmo pa_matchbool.cmo test_matchbool.ml \ -o test_matchbool.ppo *) let expand_match loc e l = match l with [ <:patt< True >>, None, e1; (<:patt< False >> | <:patt< _ >>), None, e2 ] | [ <:patt< False >>, None, e2; (<:patt< True >> | <:patt< _ >>), None, e1 ] -> true, <:expr< if $e$ then $e1$ else $e2$ >> | _ -> false, <:expr< match $e$ with [ $list:l$ ] >> let expand_function loc l = match expand_match loc <:expr< __matchbool >> l with true, e -> <:expr< fun __matchbool -> $e$ >> | false, _ -> <:expr< fun [ $list:l$ ] >> let match_case = Grammar.Entry.create Pcaml.gram "match_case";; EXTEND GLOBAL: match_case; Pcaml.expr: LEVEL "expr1" [ [ "match"; e = Pcaml.expr; "with"; OPT "|"; cases = LIST1 match_case SEP "|" -> snd (expand_match loc e cases) | "function"; OPT "|"; cases = LIST1 match_case SEP "|" -> expand_function loc cases ] ]; match_case: [ [ x1 = Pcaml.patt; w = OPT [ "when"; e = Pcaml.expr -> e ]; "->"; x2 = Pcaml.expr -> (x1, w, x2) ] ]; END;;
Archive: http://caml.inria.fr/archives/200502/msg00611.html
Deep in this thread, Christophe Troestler said:The following may is really nice (in French only unfortunately): "Principes et Programmation des Systèmes d'exploitation" http://cristal.inria.fr/~remy/poly/system/ Maybe -- if the authors agree -- some people may be interested to translate it to english ?
Archive: http://caml.inria.fr/archives/200502/msg00616.html
Bengt Nordström announced:TYPES Summer School 2005 Proofs of Programs and Formalisation of Mathematics August 15-26 2005, Goteborg, Sweden http://www.cs.chalmers.se/Cs/Research/Logic/TypesSS05/ During the last ten years major achievements have been made in using computers for interactive proof developments to produce secure software and to show interesting mathematical results. Recent major results are, for instance, the complete formalisation of a proof of the four colour theorem, and a formalisation of the prime number theorem. This two weeks' course is for postgraduate students, researchers and industrials who want to learn about interactive proof development. The present school follows the format of previous TYPES summer school (in Baastad 1993, Giens 1999, Giens 2002). There will be introductory and advanced lectures on lambda calculus, type theory, logical frameworks, program extraction, and other topics with relevant theoretical background. Several talks will be devoted to applications. During these two weeks we will present three proof assistants: Coq, Isabelle and Agda, which are state-of-the-art interactive theorem provers. Participants will get extensive opportunities to use the systems for developing their own proofs. No previous knowledge of type theory and lambda calculus is required. The school is organised by the TYPES working group "Types for Proofs and Programs", which is a project in the IST (Information Society Technologies) program of the European Union. A limited number of grants covering part of travel, fees and ackommodation are available. Neither participation nor grants are restricted to TYPES participants. Lecturers: --------- Jeremy Avigad, Carnegie-Mellon Connor McBride, Nottingham Yves Bertot, INRIA Sophia Alexandre Miquel, Paris 7 Thierry Coquand, Chalmers Tobias Nipkow, TU Munich Catarina Coquand, Chalmers Bengt Nordstrom, Chalmers Gilles Dowek, INRIA Futurs Erik Palmgren, Uppsala Peter Dybjer, Chalmers Christine Paulin, Paris Sud Herman Geuvers, Nijmegen Laurent Thery, INRIA Sophia John Harrison, INTEL Freek Wiedijk, Nijmegen Per Martin-Lof, Stockholm TENTATIVE PROGRAM ----------------- Introduction to Type Theory: Lambda-calculus Propositions-as-types Inductive sets and families of sets Predicative and non-predicative theories Foundations: Introduction to Systems: Coq Isabelle Agda Advanced applications and tools: Proving properties of Java programs Reasoning about Programming Languages Coinduction Correctness of floating-point algorithms Dependently typed programming: Dependently typed datastructures Compiling dependent types Formalisation of mathematics: Introduction Fundamental theorem of algebra Bishop' set theory Other examples, e.g. prime number theorem The organising committee: Andreas Abel, Ana Bove, Catarina Coquand, Thierry Coquand, Peter Dybjer and Bengt Nordstrom.
Archive: http://caml.inria.fr/archives/200502/msg00619.html
Martin Sandin announced:Tywith Module version 0.3 - initial release Tywith is an OCaml camlp4 parser extension which derives functions from type definitions. It's currently capable of generating 'string_of_<type>', 'map_<type>', and 'fold_<type>' functions for alias and variant types containing tuples and other types with the appropriate functions defined. Tywith special-cases built-in types such as list, int, and string to provide or use the appropriate functions. This package extends OCaml with the following syntactic construction type <def> with <id> [,<id>..]) The id's currently supported are 'string_of', 'map', and 'fold', which will result in the generation of the appropriate functions. The fold functions will typically depend on the presence of a map function for any type mentioned inside the definition which depend on a type parameter. Download Tywith from http://www.seedwiki.com/wiki/shifting_focus/tywith or directly from http://www.guldheden.com/~sandin/files/tywith03.zip
Archive: http://caml.inria.fr/archives/200502/msg00625.html
Paul Snively said:> Apropos Apple.... when M$ creates F# out of Ocaml, > why not to trigger Apple to create sometging like > BackendObjects or ScientificObjects (as complementary to > WebObjects) out of OCaml on OS-X?! > > Maybe we should trigger Apple for this?! > > If not, they will use Objective-C and Java and nothing else. Boy oh boy, where to begin with this? I was at Apple from 1989-1991, when John Sculley was CEO and at the time that Jean-Louis Gasseé was pushed out. At this time, Apple Computer still had an Advanced Technology Group that did pure research, and also had Apple Fellows who sometimes drove initiatives that involved programming languages. Some folks in ATG really liked Coral Common Lisp, so Apple bought Coral Software and thus Coral Common Lisp became Macintosh Common Lisp. Alan Kay was an Apple Fellow, so he and his colleagues created Squeak Smalltalk. Allen Cypher, author of "Watch What I Do," was in ATG, so we got projects like KidSim, which became Cocoa (not to be confused with Mac OS X's Cocoa APIs, of course). The Newton became an active project, and that prompted the development of Dylan and NewtonScript. Fast forward a few years, and MCL gets sold off, the Newton flops, Dylan gets canceled, Kay and Squeak leave Apple, Cocoa gets canceled, Steve Jobs returns and "Steves" not only whatever research projects are left, but also some arguably core technologies such as Quickdraw 3D. What Apple got instead was more standards-compliant technology such as OpenGL, a better-known base in the form of FreeBSD-based Mac OS X but with the Carbon and Cocoa APIs on top of it, and more focus on how to actually innovate in the marketplace: iPod and iTunes, iMac, Mac Mini... Oh, and a return to profitability during some of the personal computing market's toughest years. It's also interesting to note that several things that Apple dropped have nevertheless lived: Dylan is kind of sputtering along thanks to heroic efforts by the Gwydion Dylan team and the work of the former Functional Objects, Inc.; Squeak is extremely healthy; MCL is still the best Common Lisp on any platform; Quesa is a slowly-evolving but quite nice reimplementation of Quickdraw 3D atop OpenGL... Culturally, my read is that Apple now has an extreme aversion to taking on big promotional jobs, especially about languages: the Dylan debacle scarred them. Market pressures forced them to backpedal from their initial stance toward developers that Objective-C and Cocoa were the "real" Mac OS X APIs and Carbon was merely a transitional API to the reality that Carbon must provide access to virtually all Mac OS X capabilities, e.g. HIViews and Sheets, and Objective-C and Cocoa are marginal, Mac-OS-X-only tools for true believers. Later still they had to go so far as to add "Objective-C++" to their version of GCC so that developers could write their cross-platform core code in C++, and write only the UI layer in Objective-C, which then needed to be able to call the C++ core. Finally, it's been a long time since Apple was commanding 500% profit margins and could afford to throw money away on research projects. Sad, but true. So for a variety of political, historical, economic, and technical reasons, it's not at all realistic to expect Apple to pursue anything involving languages other than the absolutely mainstream: arguably the only reason they support Objective-C is that it and OpenStep already existed and were relatively mature at the time of the NeXT acquisition. Java made it for a couple of reasons: it's plenty mainstream enough, and the relationship between the Java runtime model and Objective-C runtime model is sufficiently incestuous that there's some leverage to be had by having them both. To try to bring this back home, this is why folks like Mike Hamburg and myself are putting some thought into how to marry O'Caml and Cocoa (which, by definition, includes the Objective-C runtime model). We'd like to have something resembling the fluidity of Java <-> Objective-C, but lacking reflection, we can't exactly, so on one hand we have to have Mike's Obj.magic magic, and on the other we need something very much like, if not exactly, Jeff Henrikson's Forklift magic. So we're left with a bit of a transcontinental railroad project, but with a lot of elbow grease and faith, hopefully the result will be nice, typesafe O'Caml modules reflecting the entirety of Apple's Cocoa APIs. And then maybe we can come up with functorized modules providing a consistent API to Cocoa, Win32, or lablgtk... ;-)
Archive: http://caml.inria.fr/archives/200502/msg00591.html
Jon Harrop asked and Radu Grigore answered:> Following my last post about the speed of set union in OCaml compared to > C++/STL. What is the complexity of set union in OCaml in terms of the number > of elements and the number of comparisons? As no one gave an informed answer I'll risk a hand-waiving one. When all elements of a set are bigger than all elements of the other set the Set.union function seems to simply add the elements of the small set to the bigger set one at a time. So the complexity looks like O(m lg n) where m is the size of the smaller set. For other cases the process is a bit more complex: take the root of the short tree, split the large tree into smaller/larger elements based on that root, compute union of "small" trees, "compute union of "large" trees", merge them. If I'm not mistaken this is O(m lg n) too.Jon Harrop added:
> Following my last post about the speed of set union in OCaml compared to > C++/STL. What is the complexity of set union in OCaml in terms of the number > of elements and the number of comparisons? As no one gave an informed answer I'll risk a hand-waiving one. When all elements of a set are bigger than all elements of the other set the Set.union function seems to simply add the elements of the small set to the bigger set one at a time. So the complexity looks like O(m lg n) where m is the size of the smaller set. For other cases the process is a bit more complex: take the root of the short tree, split the large tree into smaller/larger elements based on that root, compute union of "small" trees, "compute union of "large" trees", merge them. If I'm not mistaken this is O(m lg n) too.Xavier Leroy then said:
[ Complexity of Set.union ] > > For other cases > > the process is a bit more complex: take the root of the short tree, > > split the large tree into smaller/larger elements based on that root, > > compute union of "small" trees, "compute union of "large" trees", > > merge them. If I'm not mistaken this is O(m lg n) too. My hope is that union takes time O(N log N) where N is the sum of the numbers of elements in the two sets. I'm thoroughly unable to prove that, though, in particular because the complexity of the "split" operation is unclear to me. This bound is "reasonable", however, in that the trivial union algorithm (repeatedly add every element of one of the sets to the other one) achieves this bound, and the trick with "joining" is, intuitively, just an optimization of this trivial algorithm. > Now, what about best case behaviour: In the case of the union of two equal > height, distinct sets, is OCaml's union T(1)? Did you mean "of two equal height sets such that all elements of the first set are smaller than all elements of the second set"? That could indeed run in constant time (just join the two sets with a "Node" constructor), but I doubt the current implementation achieves this because of the repeated splitting.Jon Harrop answered:
> My hope is that union takes time O(N log N) where N is the sum of the > numbers of elements in the two sets. I'm thoroughly unable to prove > that, though, in particular because the complexity of the "split" > operation is unclear to me. Am I correct in thinking that your derivation of this assumes roughly equal-sized sets and that your complexity could be tightened a bit by using the two different set cardinalities explicitly? I ask this because the STL set_union is probably O(n+N) (inserting an already sorted range into a set is apparently linear) which is worse than the O((n+N) log(n+N)) which you've suggested for OCaml. But my OCaml code is vastly faster, so OCaml's complexity seems to be significantly better than that. At least in the special case of one small and one large set, which my code is bound by. Specifically, the sets have O(1) and O(i^2) elements when looking for the "i"th nearest neighbour. In reality this corresponds to computing the unions of sets containing 4 elements with sets containing 10^4 elements. Hmm, now that I come to think of it, my performance measurements have all been specific to silicon (that's where the 4 comes from). I'll try retiming on some other atomic structures, where the small sets will contain about 12 elements. I predict the OCaml code will do better relative to the C++ code, because the smaller sets won't be so small... > This bound is "reasonable", however, in that the trivial union > algorithm (repeatedly add every element of one of the sets to the > other one) achieves this bound, and the trick with "joining" is, > intuitively, just an optimization of this trivial algorithm. I see. This could be improved in the unsymmetric case, by adding elements from the smaller set to the larger set. But the size of the set isn't stored so you'd have to make do with adding elements from the shallower set to the deeper set. I've no idea what the complexity of that would be... As I know which of the two sets will be the larger and which will be the smaller, I'll try a customized union function which folds Set.add over the smaller set. > > Now, what about best case behaviour: In the case of the union of two > > equal height, distinct sets, is OCaml's union T(1)? > > Did you mean "of two equal height sets such that all elements of the > first set are smaller than all elements of the second set"? Yes, that's what I meant. :-) > That > could indeed run in constant time (just join the two sets with a > "Node" constructor), but I doubt the current implementation achieves > this because of the repeated splitting. Having said that, wouldn't it take the Set.union function O(log n + log N) time to prove that the inputs are non-overlapping, because it would have to traverse to the min/max elements of both sets?Radu Grigore replied:
> I ask this because the STL set_union is probably O(n+N) (inserting an already > sorted range into a set is apparently linear) which is worse than the O((n+N) > log(n+N)) which you've suggested for OCaml. The complexity of set_union is indeed O(n+N), see [0]. It is basically a merge of sorted _sequences_ [1]. I assume n is the size of the small set and N is the size of the small set and the heights are h=O(lg n), H=O(lg N). With this the complexity of Set.union is more like O(n lg(n+N)), at least when all elements in one set are smaller than the elements of the other set. > I see. This could be improved in the unsymmetric case, by adding elements from > the smaller set to the larger set. But the size of the set isn't stored so > you'd have to make do with adding elements from the shallower set to the > deeper set. I've no idea what the complexity of that would be... That is how it works now. As Xavier said the trickiest part is split. > > Did you mean "of two equal height sets such that all elements of the > > first set are smaller than all elements of the second set"? > > Yes, that's what I meant. :-) In that case the current Set.union simply adds elements repeatedly from the set with small height to the set with big height. > > That > > could indeed run in constant time (just join the two sets with a > > "Node" constructor), but I doubt the current implementation achieves > > this because of the repeated splitting. What splitting? I see none in this case. > Having said that, wouldn't it take the Set.union function O(log n + log N) > time to prove that the inputs are non-overlapping, because it would have to > traverse to the min/max elements of both sets? I agree. Also, such a check looks ugly to me (for a standard library). -- regards, radu http://rgrig.blogspot.com/ [0] http://library.n0i.net/programming/c/cp-iso/lib-algorithms.html#lib.set.union [1] http://rgrig.blogspot.com/2004/11/merging-lists.htmlRadu Grigore finally added:
> The complexity of set_union is indeed O(n+N), see [0]. Ah, BTW, if you do something like set_union(..., inserter(s)); where s is a set then inserter has logarithmic complexity so overall you get O((n+N) lg(n+N)).
Archive: http://caml.inria.fr/archives/200502/msg00660.html
Alex Baretta asked and Luc Maranget answered:> We have an incomprehensible warning when compiling code that looks like > the following: > > type value = > | Int of int > | Float of float > | Int32 of int32 > | Int64 of int64 > | Bool of bool > | String of string > > let to_int value = match value with > | Int x -> x > | _ -> raise Some_exception > > The compiler signals a warning for a fragile pattern matching at the "_" > character. > > Why in the world should this code signal such a warning? > > Alex > > > -- Hello, The warning (which you do not supply) attempt to be informative. # ocamlc -w A alex.ml File "alex.ml", line 13, characters 4-5: Warning E: this pattern is fragile. It would hide the addition of new constructors to the data types it matches. This warning has been introduced in response to user demand. The idea is to enforce some coding rule that promotes robustness. However the coding rule is so strict that it was decided that standard users can ignore it. Of course if you specify -w A on the command line, then you get all warnings, including the 'fragile pattern' warning. As regards defaults for warnings, here is more or less what I get on my ocaml installation. % ocamlc -v The Objective Caml compiler, version 3.09+dev11 (2004-11-30) % ocamlc -help ... -w <flags> Enable or disable warnings according to <flags>: A/a enable/disable all warnings ... E/e enable/disable fragile match ... default setting is "Aelz"
Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).
:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'<1':1
zM
If you know of a better way, please let me know.
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.