Hello
Here is the latest Caml Weekly News, for the week of June 05 to 12, 2007.
Following on from our "why not fork" discussion, here is my most elegant concurrent ray tracer written in vanilla OCaml. This program simply runs four processes (forking off three) in parallel to improve performance when 2-4 CPUs are available (increase "d" if you have >4 CPUs). This isn't as sophisticated as the JoCaml version but it does double performance (taking the time down from 4s to 2s) on my dual-core machine, making it faster than any language that uses a concurrent GC. I don't know about you guys, but this is a complete revelation for me! I believe the performance relies upon the Linux kernel lazily copying the process. Does OSX also do that? Anyway, here is the code: let delta = sqrt epsilon_float type vec = {x:float; y:float; z:float} let ( *| ) s r = {x = s *. r.x; y = s *. r.y; z = s *. r.z} let ( +| ) a b = {x = a.x +. b.x; y = a.y +. b.y; z = a.z +. b.z} let ( -| ) a b = {x = a.x -. b.x; y = a.y -. b.y; z = a.z -. b.z} let dot a b = a.x *. b.x +. a.y *. b.y +. a.z *. b.z let length r = sqrt(dot r r) let unitise r = 1. /. length r *| r type scene = Sphere of vec * float | Group of vec * float * scene * scene * scene * scene * scene let ray_sphere {x=dx; y=dy; z=dz} {x=vx; y=vy; z=vz} r = let disc = vx *. vx +. vy *. vy +. vz *. vz -. r *. r in if disc < 0. then infinity else let b = vx *. dx +. vy *. dy +. vz *. dz in let b2 = b *. b in if b2 < disc then infinity else let disc = sqrt(b2 -. disc) in let t1 = b -. disc in if t1 > 0. then t1 else b +. disc let ray_sphere' {x=ox; y=oy; z=oz} {x=dx; y=dy; z=dz} {x=cx; y=cy; z=cz} r = let vx = cx -. ox and vy = cy -. oy and vz = cz -. oz in let vv = vx *. vx +. vy *. vy +. vz *. vz in let b = vx *. dx +. vy *. dy +. vz *. dz in let disc = b *. b -. vv +. r *. r in disc >= 0. && b +. sqrt disc >= 0. type hit = {l: float; nx: float; ny: float; nz: float} let rec intersect ({x=dx; y=dy; z=dz} as dir) hit = function Sphere ({x=cx; y=cy; z=cz} as center, radius) -> let l' = ray_sphere dir center radius in if l' >= hit.l then hit else let x = l' *. dx -. cx in let y = l' *. dy -. cy in let z = l' *. dz -. cz in let il = 1. /. sqrt(x *. x +. y *. y +. z *. z) in {l = l'; nx = il *. x; ny = il *. y; nz = il *. z} | Group (center, radius, a, b, c, d, e) -> let l' = ray_sphere dir center radius in if l' >= hit.l then hit else let f h s = intersect dir h s in f (f (f (f (f hit a) b) c) d) e let rec intersect' orig dir = function Sphere (center, radius) -> ray_sphere' orig dir center radius | Group (center, radius, a, b, c, d, e) -> let f s = intersect' orig dir s in ray_sphere' orig dir center radius && (f a || f b || f c || f d || f e) let neg_light = unitise { x = 1.; y = 3.; z = -2. } let rec ray_trace dir scene = let hit = intersect dir {l=infinity; nx=0.; ny=0.; nz=0.} scene in if hit.l = infinity then 0. else let n = {x = hit.nx; y = hit.ny; z = hit.nz} in let g = dot n neg_light in if g < 0. then 0. else if intersect' (hit.l *| dir +| delta *| n) neg_light scene then 0. else g let fold5 f x a b c d e = f (f (f (f (f x a) b) c) d) e let rec create level c r = let obj = Sphere (c, r) in if level = 1 then obj else let a = 3. *. r /. sqrt 12. in let rec bound (c, r) = function Sphere (c', r') -> c, max r (length (c -| c') +. r') | Group (_, _, v, w, x, y, z) -> fold5 bound (c, r) v w x y z in let aux x' z' = create (level - 1) (c +| {x=x'; y=a; z=z'}) (0.5 *. r) in let w = aux (-.a) (-.a) and x = aux a (-.a) in let y = aux (-.a) a and z = aux a a in let c, r = fold5 bound (c +| {x=0.; y=r; z=0.}, 0.) obj w x y z in Group (c, r, obj, w, x, y, z) let level, n = try int_of_string Sys.argv.(1), int_of_string Sys.argv.(2) with _ -> 9, 512 let scene = create level { x = 0.; y = -1.; z = 4. } 1. and ss = 4 module String = struct include String let init n f = if n=0 then "" else let s = String.make n (f 0) in for i=1 to n-1 do s.[i] <- f i done; s end let raster y = y, String.init n (fun x -> let g = ref 0. in for dx = 0 to ss - 1 do for dy = 0 to ss - 1 do let aux x d = float x -. float n /. 2. +. float d /. float ss in let dir = unitise {x = aux x dx; y = aux y dy; z = float n } in g := !g +. ray_trace dir scene done done; let g = 0.5 +. 255. *. !g /. float (ss*ss) in char_of_int (int_of_float g)) let rasters d i = Array.init ((n+d-i)/d) (fun j -> raster(d*j+i)) let invoke (f : 'a -> 'b) x : unit -> 'b = let input, output = Unix.pipe() in match Unix.fork() with | 0 -> Unix.close input; let output = Unix.out_channel_of_descr output in Marshal.to_channel output (try `Res(f x) with e -> `Exn e) []; exit 0 | _ -> Unix.close output; let input = Unix.in_channel_of_descr input in fun () -> match Marshal.from_channel input with | `Res x -> x | `Exn e -> raise e let ( |> ) x f = f x let map (f : 'a -> 'b) a : 'b array = let n = Array.length a in let aux i x = if i<n-1 then invoke f x else let f_x = f x in fun () -> f_x in Array.mapi aux a |> Array.map (fun f -> f()) let () = let d = 4 in let sort a = Array.sort compare a; a in let image = Array.init d (fun i -> i) |> map (rasters d) |> Array.to_list |> Array.concat |> sort |> Array.map snd in Printf.printf "P5\n%d %d\n255\n" n n; for y = n - 1 downto 0 do print_string image.(y) doneJoel Reymont asked and Jon Harrop answered:
> I must have missed the JoCaml version. Where is it? http://jocaml.inria.fr/#examples There is also a JoCaml mailing list for JoCaml-related ramblings, and it could do with more rambling. ;-)
It is my pleasure to anounce a new JoCaml example, http://jocaml.inria.fr/#examples The example is a distributed version of a ray tracer written as en entry at the ICFP Programming contest. This is another example of the fork/exec technique (here used twice, for forking computing clients and spies that open a graphical window to display the images as they are computed.) This is also an example of non-obvious synchronisations written in JoCaml. Two points are worth a few words: * When a remote agent dies, its share of work is given to someone else [easy] * At the end of an image, tasks that are not yet done, are given to other remote agents (if any). This avoids the situation where everyone is waiting for the slowest agent to complete. (try ^Z on one of the remote agent...) Of course, this is not production code, but it apparently works... ** JoCamls 'R Us: A not so simple ray tracer in JoCaml ** This program is an example of making a pre-existing OCaml program distributed thanks to JoCaml. It is intended to illustrate the 'distributed loop' idiom of JoCaml in a realistic context. Information on JoCaml is available at http://jocaml.inria.fr/ The pre-existing program is the ray tracer written by the Camls 'R Us team at the ICFP 2000 programming contest. The scenes are described using a rather sophisticated language described on the contest page http://www.cs.cornell.edu/icfp/ We all know that ray tracers are 'concurrent by nature', but some interesting issues remain: such as load balance, management of failures... Those points are handled in files renderServer.ml and renderClient.ml
The new version of ODT has been released this morning. This version replaces the broken ODT 1.1 release. The 1.1 UI plugin was built incomplete because I forgot to update the plugin build script: the UI plugin did not contain the indentation configuration files. Now, ODT works as expected. It supports execution via the Java 1.5 JVM and, most of all, automatic indentation. Everything is available on the ODT website: http://ocamldt.free.fr. The "overview" page explains its current main features, and the "install notes" page details some requirements and incompatibilities with older versions of OCaml. Some screenshots are also available to show the GUI. Please, don't hesitate to try ODT (for personal or professional use) and forward this mail to anyone which could be interested in.
At http://www.boblycat.org/~malc/apc/ you will find CPU load monitor written in OCaml (sans some C glue). New things since last announcement include: The site was updated with a bit more sane documentation along with few links to Linux kernel mailing list discussions on the subject that would explain why sometimes native accounting can not be trusted (since 2.6.21 you can find a summary in Documentation/cpu-load.txt) There's also a small test application that showcases the problem. Ports to: Windows NT Mac OS X (only UP/PPC tested) Solaris (only X86 tested). While Solaris' CPU accounting seems accurate the same can not be said about Mac OS X and NT. With a help of third party application one can even begin to trust the accounting on NT (consult README.windows). So Mac OS X is in the most dire state at the moment. I'd imagine that writing XNU/X86 kernel extension to do more or less what's done under NT/X86 would be rather easy (alas i don't have access to this kind of system), XNU/PPC it seems would require completely different approach. Any help with NT, XNU would be greatly appreciated.
Here is a quick trick to help you read this CWN if you are viewing it using vim (version 6 or greater).
:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'<1':1
zM
If you know of a better way, please let me know.
If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.
If you also wish to receive it every week by mail, you may subscribe online.