Previous week Up Next week
Hello,
Here is the latest Caml Weekly News, week 18 to 25 March, 2003.
1) toplevel command history/editing?
2) DocCHM Release
3) Loop times
4) Unix.lseek versus Pervasives.pos
5) Wanted - General Purpose "Glue Logic" Data-Structures
6) Haskell-like syntax
7) Announce: Regexp/OCaml syntax extension
8) camomile-0.3.0 is relased
9) XML Streams
10) Ocurl - A binding for O'Caml to libcurl
11) WDialog-2.00 released
12) Wtimer-1.0 released
13) Cross-platform GUI
======================================================================
1) toplevel command history/editing?
----------------------------------------------------------------------
Graham Guttocks asked and SooHyoung Oh answered:
> I was surprised to find that the ocaml toplevel doesn't
> support command history, line editing, etc, with the
> arrow keys as the Python interpreter does for example.
> This makes it difficult to interactively test code.
> Am I missing something?
Please check "ledit" and "ocaml2" in Caml humps pages.
======================================================================
2) DocCHM Release
----------------------------------------------------------------------
Nicolas Cannasse announced:
I'm pleased to announce the first release of DocCHM :
DocCHM is a CHM generator for OCamlDoc. It enable you to generate a CHM file
( Windows CompressedHTML Help file ) instead of the standard HTML output. It
automaticaly generate the index and the hyperlinks to upper module, types,
etc. You can then quickly browse the documentation, and search index by
name.
The full source code & bytecode binaries are available at :
http://tech.motion-twin.com
A link to the CHM documentation generated from the OCaml Standard Library is
also available.
======================================================================
3) Loop times
----------------------------------------------------------------------
Daniel M. Albro discussed in this long message:
(thread start: http://caml.inria.fr/archives/200303/msg00176.html)
OK, as requested here's a summary email. I think someone already
answered you about the input loop (recap -- Input-Output routines
are so slow that the speed of the loop around them fades into
insignificance by comparison).
I was trying to illustrate a point that a "break" statement
would be a nice addition to the language, but I got distracted a bit
into figuring out what is the fastest way to break out of a loop as
things stand now (as opposed to the best way, which might be different
from the fastest...). I did 8 different timing tests, which I will
present from fastest to slowest:
(1) Tail-recursive function, top-level defined (slightly
faster than (2), possibly because the variable ary is
more local)
---------------------------------------------------
let rec loop ary j =
if j = 10 then
()
else if ary.(j) = 5 then
()
else
loop ary (j + 1)
let _ =
let ary = [|1;2;3;4;5;6;7;8;9;10;11;12|] in
for i = 1 to 1_000_000_000 do
loop ary 0
done
real 0m26.787s
user 0m26.770s
sys 0m0.010s
---------------------------------------------
(2) Tail-recursive auxilliary function
---------------------------------------------
let _ =
let ary = [|1;2;3;4;5;6;7;8;9;10;11;12|] in
let rec loop j =
if j = 10 then
()
else if ary.(j) = 5 then
()
else
loop (j + 1)
in
for i = 1 to 1_000_000_000 do
loop 0
done
real 0m26.823s
user 0m26.780s
sys 0m0.020s
-----------------------------------------------------
(3) For loop with exception pre-built:
-----------------------------------------------------
exception Break
let _ =
let ary = [|1;2;3;4;5;6;7;8;9;10;11;12|] in
let exn = Break in
for i = 1 to 1_000_000_000 do
try
for j = 0 to 9 do
if ary.(j) = 5 then
raise exn
done
with Break -> ()
done
real 0m28.095s
user 0m28.070s
sys 0m0.030s
------------------------------------------------------
(4) Continuation-passing style
------------------------------------------------------
let _ =
let ary = [| 1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12 |] in
let rec loop f j =
if j = 10 || ary.(j) = 5 then f ()
else loop f (j + 1)
in
let rec outer i =
if i <= 1_000_000_000 then
loop (fun _ -> outer (i + 1)) 0
in
outer 1
real 0m29.999s
user 0m29.890s
sys 0m0.010s
------------------------------------------------------------
(5) For loop with exception
------------------------------------------------------------
exception Break
let _ =
let ary = [|1;2;3;4;5;6;7;8;9;10;11;12|] in
for i = 1 to 1_000_000_000 do
try
for j = 0 to 9 do
if ary.(j) = 5 then
raise Break
done
with Break -> ()
done
real 0m34.245s
user 0m34.080s
sys 0m0.010s
-----------------------------------------------------
(6) Tail recursive auxilliary function defined inside
the outer loop (sometimes, in more complicated cases,
this might be an option, I suppose, for example, if
the inner loop needs to be a separate function)
-----------------------------------------------------
let _ =
let ary = [|1;2;3;4;5;6;7;8;9;10;11;12|] in
for i = 1 to 1_000_000_000 do
let rec loop j =
if j = 10 then
()
else if ary.(j) = 5 then
()
else
loop (j + 1)
in
loop 0
done
real 0m35.188s
user 0m35.140s
sys 0m0.040s
------------------------------------------------------
(7) while loop with references
------------------------------------------------------
let _ =
let ary = [|1;2;3;4;5;6;7;8;9;10;11;12|] in
let j = ref 0 in
for i = 1 to 1_000_000_000 do
j := 0;
while !j < 10 do
if ary.(!j) = 5 then
j := 10
else
incr j
done
done
real 0m39.458s
user 0m39.440s
sys 0m0.000s
------------------------------------------------------
(8) "Higher order functions". This style is nifty because
it's like a break statement that can return a value.
------------------------------------------------------
let escape body =
let module Fail = struct exception T end in
let datum = ref None in
let throw v =
begin
datum := Some v;
raise Fail.T
end
in
try
body throw
with
Fail.T -> (match !datum with Some v -> v | None -> assert false)
let _ =
let ary = [|1;2;3;4;5;6;7;8;9;10;11;12|] in
for i = 1 to 1_000_000_000 do
escape (fun exit ->
for j = 0 to 9 do
if ary.(j) = 5 then exit()
done)
done
real 1m49.178s
user 1m48.900s
sys 0m0.280s
------------------------------------------------------
The main surprise to me was that number (7) was so slow. Anyway,
as matters currently stand it looks like tail recursive loops are
the way to go if you have to have a fast inner loop that can break
out early. HOWEVER, I would like to argue that for cases where
you really want to optimize a loop like this (and where you can't
radically change the algorithm or data structures, of course), it
would be really great if the language added a genuine break
statement (plus some sort of step value for for loops). As usual,
my argument is in terms of a timing test. If you have an inner
loop that *doesn't* have to break out in the middle, the for loop
is much faster than a tail recursive loop. Here are the
tests:
(1) for loop
------------------------------------
let _ =
let k = ref 0 in
for i = 1 to 1_000_000_000 do
k := 0;
for j = 1 to 10 do
incr k
done
done
real 0m22.867s
user 0m22.850s
sys 0m0.010s
------------------------------------
(2) tail recursive loop
------------------------------------
let _ =
let k = ref 0 in
let rec loop j =
incr k;
if j = 10 then () else loop (j + 1)
in
for i = 1 to 1_000_000_000 do
k := 0;
loop 1
done
real 0m37.105s
user 0m36.870s
sys 0m0.200s
------------------------------------
======================================================================
4) Unix.lseek versus Pervasives.pos
----------------------------------------------------------------------
Shivkumar Chandrasekaran asked and Xavier Leroy answered:
> It would seem to me that it would be convenient to have 64 bit versions
> of seek_in, seek_out, pos_in, pos_out in the Pervasives module. This
> would help decouple the Pervasives I/O module a little more from the
> Unix module.
You wish was granted in OCaml 3.05 and 3.06: module Pervasives.LargeFile.
And, yes, not mixing Pervasives I/O with Unix I/O is recommended,
unless you enjoy puzzles :-)
======================================================================
5) Wanted - General Purpose "Glue Logic" Data-Structures
----------------------------------------------------------------------
John Gerard Malecki asked and Nicolas Cannasse answered:
> I want to broaden my arsenal of data-structures that interface data
> between the specific structures that some algorithms require. Here is
> an example of a real problem that I solved too specifically:
>
> I had an existing reader that produced a list of objects, 'a list.
>
> I wrote a filter which fractured those items producing 'a list list.
>
> I then wanted to feed that data to an existing program which builds
> a balanced tree top-down. It expects 'a list which it then passes
> to a median finder which prefers its input list to be in random
> order.
>
> In the great ocaml tradition this worked just fine and the first time.
> (What a great language.) To really test things I then ran it on
> MOADB, the mother of all data bases, a very large but real-world
> data-base.
>
> The program broke down in 2 places. First, List.concat was used to
> convert the output of the fracturer from 'a list list to 'a list. Not
> only is List.concat not tail-recursive but @ (Pervasives.append) is
> also not tail-recursive. I modified the fracturer to directly produce
> 'a list.
>
> Second, the median program has some limitations. It not only prefers
> the input to be randomized but while it runs it too constructs some 'a
> list list. (Why? This median program returns not just the median but
> the set of items lt and gt the median.) I re-wrote the median program.
>
> I would prefer not to keep re-writing programs and instead have a
> better collection intermediate data-structures that I can use to glue
> my programs together efficiently and safely. One can argue that the
> median program was already flawed and it was best to re-write it.
> Still, it would be nice to have a data-structure which the fracturer
> could produce that not only can deal with large data sets but also has
> the randomizing property. Of course I want all this at very little
> expense as none of these manipulations are germane to the real problem
> at hand.
>
> I considered having the fracturer build a priority queue over random
> numbers but all that balancing book-keeping seems expensive. I guess
> what I'm looking for are inexpensive data-structures that have
> properties like
>
> - fast computation of cardinality
>
> - fast addition of single elements
>
> - fast addition of lists of elements
>
> - fast removal of single elements in random order
>
> - not choking on a large data size
>
> Any suggestions? Does anyone have other useful glue-logic
> data-structures which they use?
We're currently running a project called ExtLib (
http://sourceforge.net/projects/ocaml-lib ) than aims at providing thoses
kind of needed tail-safe data structures, and some other useful modules
built together as an additionnal standard library.
Any people interested in helping can join the mailling list.
======================================================================
6) Haskell-like syntax
----------------------------------------------------------------------
Oleg asked and Andreas Rossberg answered, about Haskell-like syntax:
>As was pointed out to me in private email, Andreas Rossberg might have
>something quite close to a usable preprocessor.
>
>http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&selm=3D105596.8FF65D58%40ps.uni-sb.de
>
>I'm sending him a copy of this message as a request to share his work.
Back from holidays. ;-)
For people who are really interested I put the last version of the
sources at
http://www.ps.uni-sb.de/~rossberg/hocaml-0.12.tgz
But please be aware that the code is almost 5 years old, was kind of
work in progress, and must have experienced a fair amount of bit rot by
now. Last time I checked it still built, though.
I'm not entirely happy with some of the syntactic decisions I made.
Also, I'm not sure whether the preprocessor approach really is terribly
useful for this kind of thing. But if somebody feels like digging into
it, feel free to contact me for questions.
======================================================================
7) Announce: Regexp/OCaml syntax extension
----------------------------------------------------------------------
Yutaka Oiwa announced:
I release a camlp4-macro package called Regexp/OCaml.
I believe Objective Caml is a powerful tool not only for writing an
interpreter and compiler but also for writing casual "script"
applications which were in a territory of Perl, Python and
Ruby. However, string decomposition is one of OCaml's weak points when
compared to those scripting languages: the interfaces of both Str
module and Pcre module are very primitive and cumbersome if they are
heavily used. Regexp/OCaml solves this point by providing syntax
support for regular expression matching.
Regexp/OCaml provides convenient syntax sugar for regular expression
match against strings using PCRE/OCaml library. The features of this
macro package are the following:
* Convenient syntax: similar to standard match-with expressions
* Binding matching substrings to variables: no more $1, $2, ...
* Automagical easy-to-use type-coercion: no flood of int_of_string etc.
* Support for optional-patterns: gives string option type etc.
* Default values for optional-patterns
For example, parsing an entry for some log file becomes as easy as follows:
try
while true do
let line = input_line ic in
Regexp.match line with
"^((dd):(dd):(dd))[(.*?)] (.*)$"
as hour : int, min: int, sec : int, name, line ->
let time = hour * 3600 + min * 60 + sec in
...
| "^# (.*)$" as meta_info ->
...
| _ -> ()
done
with End_of_file -> ()
This short code parses both line in format like
"(00:34:32) [foobar] something" and "# some meta info"
and binds appropriate data into variables which can be used inside "...".
Compare the code above with an equivalent without using syntax extension.
Regexp/OCaml is downloadable from the web location
http://www.yl.is.s.u-tokyo.ac.jp/~oiwa/caml/ .
In addition to the main macro called pa_regexp_match, the package also
contains two tiny macros:
1) pa_pragma changes an option for loaded camlp4 macros inside source code.
2) pa_once provides "once" construct which evaluates any expression
only once per execution (pa_regexp_match uses this internally).
Any comments will be greatly appreciated.
======================================================================
8) camomile-0.3.0 is relased
----------------------------------------------------------------------
Yamagata Yoriyuki announced:
camomile-0.3.0 is released. camomile is a Unicode library for ocaml.
Currently camomile provides
* Unicode data types. (Base, UTF8) the Unicode character type and
several Unicode string implementations. (abstract type internally
implemented as int array, resizable character array, and UTF8
encoded normal string.) Most string API of camomile is a functor over
string implementation. Also it should be easy to add new
implementation as OS wch.
* Character info. (UCharInfo)
* Code converter. (CharEncodings)
about 200 encodings are supported. Conversion is either string to
string, Unicode string to/from string, OO-channel, and Stream.
* Unicode normalforms, NFD NFKD NFC NFKC. (Normalform)
Due to the presence of combined characters such as accents, Unicode
strings have several equivalent representations. Unicode standard
defines several "normal forms" for such cases. This module provides
conversion to such normal forms.
* Case mappings (CaseMap)
Internationalised case mapping. German eszet, Greek sigma, Turkish i
(for Turkish locale) are correctly handled. (Azeri and Lithuanian
supports are implemented but not tested) Behaviour of case mapping
can be adjusted by locale.
* String Comparison (UCol)
Behaviour of comparison also can be adjusted by locale. By default,
it uses Unicode collation algorithm. (I believe it is the same as
Java.)
Changes from 0.2.x are
* Functor design: OO-design is abandoned. API taking unicode strings
now becomes functors over unicode string implementation.
* The locale can be specified for case mapping and string comparison.
Still quality of the code is alpha, I think.
Download :
http://prdownloads.sourceforge.net/camomile/camomile-0.3.0.tar.bz2
Homepage : http://camomile.sourceforge.net
camomile comes with a test suite, which includes, among other tests,
conformance tests of Unicode standard for collator and normaliser.
test suite :
http://prdownloads.sourceforge.net/camomile/blender-0.3.0.tar.bz2
======================================================================
9) XML Streams
----------------------------------------------------------------------
Matthew Boyd announced:
As part of a previous project, I have a couple of files for turning
basic XML documents (channels) into an XML stream. It doesn't do
everything, but if you just need to pull XML tags out of a document,
then you might find it useful: http://www.alve.com/~mattwb/saXml.html
======================================================================
10) Ocurl - A binding for O'Caml to libcurl
----------------------------------------------------------------------
Lars Nilsson announced:
I would like to announce the available of Ocurl, an O'Caml[1] binding for
the libcurl[2] multi-protocol file transfer library.
http://sourceforge.net/projects/ocurl/
The library currently covers the "easy" interface in libcurl. The "multi"
interface may be incorporated in later versions.
Documentation is rather sparse at this point in time, but the options
available follow the C interface closely and should hopefully not present a
too big stumbling block.
Currently it has only been tested (and developed under) Linux, although
nothing particularly platform dependent is present in the code.
Feedback, questions, etc, is more than welcome.
Ocurl is available under the MIT license (similarly to libcurl).
======================================================================
11) WDialog-2.00 released
----------------------------------------------------------------------
Gerd Stolpmann announced:
I have just released WDialog-2.00, the toolkit for dialog-centric web
applications. This release focuses on the maturity of the APIs, and
there is also an updated reference manual. See below for a feature list,
and for a small introduction to get the idea.
WDialog is hosted at Sourceforge, and you find links to all relevant
documents on the homepage: http://wdialog.sourceforge.net
As an improvement of the test releases the reference manual can now be
downloaded as bunch of HTML files, or as a single PDF. (See the download
section of the homepage.) Of course, the online version is still
available. The manual has 242 pages, and describes the toolkit in
detail.
WDialog has already been used to create web applications for our
customers, and there is now even a free application that bases on it:
WTimer. See the link
http://www.ocaml-programming.de/packages/documentation/wtimer
WDialog is a complicated answer for a complex problem. It is not just
JSP for O'Caml, as it analyzes the problem differently, and comes to
alternate conclusions. It consists of an XML document type, a
programming interface, and a certain way of doing things at runtime;
from this point of view it is an environment for programming web
applications. This means that there is some kind of philosophy behind
it, and you will need some time to become productive with it.
Gerd
************
Feature list
************
* Separation of the user interface (UI) definition from the
backend program, improving the structure of the application and
enabling advanced software engineering processes
* The UI definition is contained in an XML file, and it describes
the whole UI from the overall dialog structure to the style of
the individual HTML element
* The dialogs have built-in persistency for session state. The
HTML form elements can be tied to dialog variables reflecting
their current state
* A powerful template system manages the combination of individual
HTML fragments to whole pages
* The dialogs behave like GUI widgets. They visualize the state of
the session, and user clicks are treated like events that are
handled by programmable callback methods
* The callbacks are written in a real programming language
(Objective Caml, or Perl), making it possible to formulate all
algorithms and to use all system resources
* The application can run as CGI as well as application server
* The WDialog toolkit itself does not require any database as
backend store
* WDialog preprocesses all web inputs, and ensures that minimum
security standards are fulfilled
*****************************
Introduction into the concept
*****************************
The application is considered as a number of dialogs, i.e. groups of
interactions that share the same state. Dialogs are defined by terms of
object-oriented programming. There is a class-like definition for the
user interface containing which dialog variables exist, and which output
methods ("pages") can produce visualizations of the dialog state. This
definition focuses on the user interface as such, and because of this it
is also known as the "user interface definition". This definition is
given as an XML file, e.g.
<ui:dialog name="foo" start-page="start">
<ui:variable name="x" type="string"/>
<ui:page name="start">
<html>
<body>
One can include all HTML tags in pages, but also special
elements generating special contents, and controlling the
generation process
<ui:text variable="x"/>
<ui:button name="q" label="Trigger event 'q'"/>
</body>
</html>
</ui:page>
</ui:dialog>
Within the output methods (inside ui:page), one can use normal HTML
elements intermixed with the special user interface tags that have the
"ui:" prefix.
Of course, it would not be realistic to write the whole application in a
language that is restricted by the XML syntax. This works well for the
I/O part of the application, but not for the algorithmic part, and not
for the database part. Because of this, there is also a backend program
that is written in a universal programming language. Usually, there is a
class for every dialog, and the class defines certain methods that are
called in certain phases of the overall request processing. In
particular, one can program "pre actions" that are executed just before
a page is generated, and "post actions" to handle events resulting from
mouse clicks. There is a registry saying which class is used for which
dialog.
Dialog state is made persistant between page invocations. This is done
by data marshalling; one can either include the marshalled dialog
literally into the HTML page, or a reference to a backend store. The
effect is that the dialog variables ("x" in the above example) keep
their value automatically, it is not necessary to pass them explicitly
from page to page.
There are special variants of the HTML form interactors like <input>.
These variants are bound to dialog variables, i.e. they are initialized
from the current variable state, and user modifications are propagated
back to the variables. For buttons and hyperlinks, there are variants of
the standard HTML elements that "send events". WDialog automatically
recognized when such buttons and links are clicked, and the "event" is
extracted from the request, and stored as a regular O'Caml value.
Usually, these events are evaluated by user-written in code in the post
action method.
This is just a small introduction into the basic concepts, so I do not
go further here. The reference manual contains some introductory
chapters, and there are smaller and bigger examples of code that
hopefully explain arising questions.
======================================================================
12) Wtimer-1.0 released
----------------------------------------------------------------------
Gerd Stolpmann announced:
I have just released WTimer-1.0, a web application to manage time
sheets. It bases on the WDialog toolkit (that happens to be released at
the same time), and is meant as the production quality sample
application for WDialog.
You can read more about WTimer here:
http://www.ocaml-programming.de/packages/documentation/wtimer
Download the tarball here:
http://www.ocaml-programming.de/packages/wtimer-1.0.tar.gz
======================================================================
13) Cross-platform GUI
----------------------------------------------------------------------
Matt Gushee asked and Sergey Goldgaber answered:
> I wonder if anyone has thought about developing an OCaml interface to
> FOX (www.fox-toolkit.org)? Though I haven't done any serious
> programming with it yet, I've tried out a couple of FOX applications,
> and was quite impressed with the quality of the interface. If you
> read the Goals & Approach statement on the Web site, it's also very
> impressive.
>
> I think OCaml + FOX could be a winning combination. Unfortunately, I
> don't know C++, or I'd write the wrapper myself. But if anyone else
> feels like taking it on, I'm happy to help with testing and
> documentation.
Looking back through the archives, I see that Andrei Errapart was
working on just such a project:
http://caml.inria.fr:80/archives/200210/msg00376.html
======================================================================
Old cwn
----------------------------------------------------------------------
If you happen to miss a cwn, you can send me a message
(alan.schmitt@inria.fr) and I'll mail it to you, or go take a look at
the archive (http://pauillac.inria.fr/~aschmitt/cwn/). If you also wish
to receive it every week by mail, just tell me so.
======================================================================
Alan Schmitt