Last updated: 11 Feb 2004
I start with a very brief overview of the Erlang language, a functional language designed to support robust, reliable, distributed, near real-time applications, with the emphasis on the features the runtime must support. I then briefly list other efforts at developing an Erlang compiler. Next I discuss my work on developing the new runtime to support the EC compiler, and some of the problems and issues that arose in doing this. These include the representation and handling of Erlang terms and its consequences for garbage collection; the use of detached system pthreads to implement Erlang processes (and why we could not use the standard cancellation mechanisms); the use of the dlopen library to implement dynamic module loading; and some issues with external interaction. I conclude with some open issues to be resolved and areas for further research.
These are some of the key features needed in Erlang to support robust, reliable, distributed, near real-time applications.
Echo = spawn(echo,loop,[]),
Echo ! {self(),"Here is a message"},
....
loop() ->
    receive
        {From,Msg} ->
            From ! Msg,
            loop()
    end.
....
Echo = spawn(SomeNode,echo,loop,[]),
....
Word = string:sub_word(" Hello old boy !",3),
apply(io,write,[Word]),
These code fragments illustrate: process spawn (note that the returned pid is then used to send a message to the new process); the echo daemon code; remote spawn, which is as easy as local spawn; and runtime binding of code references.
Magnus is a proposal to develop a massively scalable computing platform based on parallel computation using message passing and a non-blocking interconnect, built on a WDM passive optical switch and DMA-speed data transfers.
Sketch history of EC and links to Magnus proposal.
[tag|rights, index, nodeid, private]
The capability type is the key Safe Erlang component. The id is a monotonically increasing integer index within a node, and provides an index into the table of resources on that node. Rights is a subtype-dependent bitmask whose bits mostly map to the BIFs for the type, plus the generic capability operations. A capability is protected from change by a hash or password which need only be checked on the originating node, hence there are no issues with key distribution.
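The check-on-originating-node idea can be sketched in C as follows. This is only an illustration under stated assumptions, not the EC representation: the struct layout, the field widths, the names capability_t, cap_make, cap_valid and cap_permits, and the node-local secret are all hypothetical, and FNV-1a stands in for whatever hash or password scheme is really used (it is not cryptographically secure).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout mirroring [tag|rights, index, nodeid, private]. */
typedef struct {
    uint8_t  tag;     /* capability subtype */
    uint32_t rights;  /* subtype-dependent bitmask */
    uint32_t index;   /* slot in the originating node's resource table */
    uint32_t node;    /* originating node id */
    uint64_t priv;    /* check value derived from a node-local secret */
} capability_t;

/* FNV-1a mixing step: an illustrative stand-in, NOT a secure hash. */
static uint64_t mix(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Check value over the secret and every public field. */
static uint64_t cap_check_value(const capability_t *c, uint64_t secret)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    h = mix(h, &secret, sizeof secret);
    h = mix(h, &c->tag, sizeof c->tag);
    h = mix(h, &c->rights, sizeof c->rights);
    h = mix(h, &c->index, sizeof c->index);
    h = mix(h, &c->node, sizeof c->node);
    return h;
}

/* Create a capability on the originating node. */
capability_t cap_make(uint8_t tag, uint32_t rights, uint32_t index,
                      uint32_t node, uint64_t secret)
{
    capability_t c = { tag, rights, index, node, 0 };
    c.priv = cap_check_value(&c, secret);
    return c;
}

/* Only the originating node knows the secret, so only it can validate. */
int cap_valid(const capability_t *c, uint64_t secret)
{
    return c->priv == cap_check_value(c, secret);
}

/* A right is granted only if the capability is intact and the bit is set. */
int cap_permits(const capability_t *c, uint64_t secret, uint32_t right)
{
    return cap_valid(c, secret) && (c->rights & right) == right;
}
```

Because the secret never leaves the originating node, other nodes simply pass the capability around as an opaque term; any attempt to widen the rights mask invalidates the check value when the capability returns home.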
Somefun = fun Name/Arity,
Res = Somefun(Arg1, Arg2),
[type|addr|mod|fun|arity|initvar]
[type|len|data...]
concat & split BIFs
[len|add] pairs
A good garbage collector is critical, but it must know how to find all potential data references, and be able to suspend executing code whilst sweeping for them, which is highly non-trivial in a threaded environment! The BDW (Boehm-Demers-Weiser) GC was chosen as it had good reviews, is being actively developed, and is the standard GC bundled with GCC (though GCC bundles an older release). EC currently uses a global heap for efficiency in message passing (below). Note that OSE and Gerl both use thread-local heaps and hence copy on message passing. With a global heap EC gets much better efficiency on the very common message passing operation, at the cost of much harder collection, since it must stall ALL threads.
A key feature of Erlang is support for cheap and efficient process spawning. Hence we clearly needed to support some form of light-weight processes. For greatest portability POSIX pthreads were chosen. However, we hit portability issues with variations between implementations: in Makefile flags and libraries, thread IDs, and concurrency (below). In the end we have only used a fairly minimal, and hence widely supported, set of pthread primitives.
Having chosen to use pthreads, it seemed an obvious choice to use the thread cancellation mechanisms provided, despite warnings that "cancellation was extremely problematic and fraught with danger". They were right (it took 2 months to admit defeat!). In the end we effectively implemented user-space deferred cancellation.
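One common way to get deferred cancellation in user space is a per-process exit flag that the process polls at safe points. The sketch below shows that general technique only; it is an assumption about the shape of the mechanism, not the EC code, and proc_t, proc_init, proc_request_exit and proc_main are hypothetical names. The max_steps bound exists purely so the example terminates.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-process control block. */
typedef struct {
    atomic_bool exit_requested;  /* set by any thread, read by the owner */
    long steps_done;             /* units of work completed */
    long max_steps;              /* bound so this demo terminates */
    bool cleaned_up;             /* set once the process has tidied up */
} proc_t;

void proc_init(proc_t *p, long max_steps)
{
    atomic_init(&p->exit_requested, false);
    p->steps_done = 0;
    p->max_steps = max_steps;
    p->cleaned_up = false;
}

/* Ask a process to exit; the request takes effect at its next safe point. */
void proc_request_exit(proc_t *p)
{
    atomic_store(&p->exit_requested, true);
}

/* Start routine with the signature pthread_create() expects. */
void *proc_main(void *arg)
{
    proc_t *p = arg;
    while (p->steps_done < p->max_steps) {
        /* Safe point: no locks held and no term half-built here. */
        if (atomic_load(&p->exit_requested))
            break;
        p->steps_done++;         /* one unit of work */
    }
    /* Cleanup always runs on the process's own thread. */
    p->cleaned_up = true;
    return NULL;
}
```

In a runtime, proc_main would be handed to pthread_create() on a detached thread and another thread would call proc_request_exit(). Since the target thread only ever stops at points it chooses, it never dies while holding a lock or part-way through updating shared data.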
eptr
)
M_F_A
apply(io,write,Args)
In EC the decision was made early to use standard C calling conventions, to maximise interoperability with code compiled in other languages. All function parameters and the return value had to be Erlang terms. The key issue was handling varying-length argument lists at runtime.
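One portable way to call a C function whose arity is only known at runtime is to dispatch on the arity and cast a generic function pointer back to its real type. The following is a minimal sketch of that technique, not necessarily how EC does it: term_t is a stand-in for an Erlang term, and apply_fun, anyfun_t and the example functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t term_t;          /* hypothetical stand-in for an Erlang term */
typedef void (*anyfun_t)(void);   /* generic pointer as stored in a table */

typedef term_t (*fun0_t)(void);
typedef term_t (*fun1_t)(term_t);
typedef term_t (*fun2_t)(term_t, term_t);
typedef term_t (*fun3_t)(term_t, term_t, term_t);

/* Call f with `arity` arguments taken from args[].
 * Returns 0 on success, -1 if the arity is not supported. */
int apply_fun(anyfun_t f, size_t arity, const term_t *args, term_t *result)
{
    switch (arity) {
    case 0: *result = ((fun0_t)f)();                          return 0;
    case 1: *result = ((fun1_t)f)(args[0]);                   return 0;
    case 2: *result = ((fun2_t)f)(args[0], args[1]);          return 0;
    case 3: *result = ((fun3_t)f)(args[0], args[1], args[2]); return 0;
    default: return -1;  /* a real table would go up to a fixed maximum */
    }
}

/* Example targets: every parameter and the return value is a term. */
term_t example_add(term_t a, term_t b) { return a + b; }
term_t example_answer(void)            { return 42; }
```

The cast is well defined as long as the pointer is converted back to the function's true type before the call, which is why the recorded arity must match the one the function was compiled with.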
dlopen() et al
For dynamic code loading, in the interest of maximum portability, we used the standard dlopen library. Although it is non-standard on MacOSX, the 3rd party dlcompat library was used there (it is now part of the OS). There is also the issue of different Makefile flags on every platform! The module table is initialised with a module_info record for the erlang module, a placeholder for all BIFs implemented as part of the runtime. This record is automatically generated from the master BIF list used by the compiler - sed is wonderful!
The last critical element of the runtime was providing support for interaction with other Erlang nodes, thus implementing the distribution features of Erlang which make the language so powerful. This needs suitable external representations (done as part of binary support, but in two variants: one to handle EC extensions and one to keep compatibility).
External interaction consists of locating remote nodes, which usually requires interaction with EPMD on the target system; and then implementing the distribution protocol, which negotiates flags and performs authentication (based on a known shared secret "cookie"), and then leaves a connection open over which messages and signals flow. We based our code on Tobbe's erl_interface C_node code (part of the OSE distro), suitably adapted for EC term representations and system interaction.
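Resolving a function in a dynamically loaded module with the dlopen interface can be sketched as below. This is an illustration only: lookup_function and anyfun_t are hypothetical names, and the way EC names module files and exported symbols is not shown here.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <string.h>

typedef void (*anyfun_t)(void);   /* generic function pointer */

/* Open the shared object at `path` and look up `symbol` in it.
 * A NULL path searches the running program and its loaded libraries.
 * Returns NULL if the object or the symbol cannot be found. */
anyfun_t lookup_function(const char *path, const char *symbol)
{
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL)
        return NULL;

    void *addr = dlsym(handle, symbol);
    if (addr == NULL) {
        dlclose(handle);
        return NULL;
    }

    /* Copy the object pointer into a function pointer without a cast
     * that ISO C does not define; POSIX guarantees the sizes match. */
    anyfun_t fn;
    memcpy(&fn, &addr, sizeof fn);

    /* The handle is deliberately left open: the code must stay mapped
     * for as long as the returned pointer may be called. */
    return fn;
}
```

For example, lookup_function(NULL, "atoi") yields a pointer that can be cast to int (*)(const char *) and called. On some platforms the program has to be linked with -ldl for this to build, which is one instance of the per-platform Makefile differences.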
Some additional slides on Erlang and my proposed Safe-Erlang extensions.
fac(N) when N>0 -> N * fac(N-1);
fac(0) -> 1.
Echo = spawn(echo,loop,[]),
Echo ! {self(),"Here is a message"},
....
loop() ->
    receive
        {From,Msg} ->
            From ! Msg,
            loop()
    end.
Echo = spawn(SomeNode,echo,loop,[]), ....
<Type,Rights,Value,Node,Private>