Last updated: 7 Aug 2002
EC is a compiler which generates native machine code for Erlang programs. It is intended to be both the core compiler technology in the Magnus massively scalable computing platform and a means to generate standalone executables of Erlang programs [Cast01].
The Safe Erlang extensions are a proposal to improve the safety of Erlang when used for mobile code applications, amongst others. The proposed extensions provide for a hierarchy of nodes within an Erlang system; the use of node, process, port, and user capabilities; and some support for remote module loading in context. Details are provided in [BrSa99], with details of the original prototype in [Bro97d].
Currently EC supports only single-process core Erlang syntax. It clearly needs support for multiple processes and distributed nodes in order to implement some of the key features of Erlang. The following discussion attempts to codify some of the design choices currently proposed to implement these.
Will use a heavyweight address space to implement a node (ie a Unix process or similar). This will provide the necessary degree of protection between nodes (acknowledging that not all nodes will necessarily be running Erlang code; some may have C or other compiled code with an Erlang shim).
A node has a core data structure that defines information about itself. It manages the capabilities used to refer to, and constrain access to, all potentially unsafe resources used by its processes. It also manages its registered names table (which is linked up to higher level tables, see discussion below).
When a node is created (its Unix process starts running), it has to initialise its key data structures, and then create a thread which starts executing the program defined for that node. It also needs to create a thread (or threads?) to handle messages sent to that node (to be forwarded to threads running in it, or for the node itself to handle). The main process thread then monitors all other threads created within it (relaying details as needed to threads in other nodes which are monitoring its threads). It also monitors its subnodes (both those it creates, and any inherited, see below).
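A minimal sketch of this start-up sequence, assuming a pthreads-based implementation and hypothetical ec_* names for the node structure and its helper routines:

#include <pthread.h>

/* Hypothetical node structure; the data it actually needs is discussed
 * under node data below (cf. the ninfo record from the prototype). */
struct ec_node {
    pthread_t program_thread;   /* runs the program defined for this node */
    pthread_t msg_thread;       /* handles messages sent to this node     */
    /* ... registered names table, capability tables, subnode list, ...   */
};

extern void *ec_node_program(void *arg);         /* assumed: node's entry point       */
extern void *ec_node_msg_loop(void *arg);        /* assumed: forward/handle messages  */
extern void  ec_node_monitor(struct ec_node *n); /* assumed: monitor threads/subnodes */

/* Node start-up: initialise key data structures (elided), spawn the message
 * handling and program threads, then let the main thread monitor the rest. */
int ec_node_start(struct ec_node *n)
{
    if (pthread_create(&n->msg_thread, NULL, ec_node_msg_loop, n) != 0)
        return -1;
    if (pthread_create(&n->program_thread, NULL, ec_node_program, n) != 0)
        return -1;
    ec_node_monitor(n);   /* relays exit details to remote monitors as needed */
    return 0;
}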
Currently propose implementing the "custom context" by changing which library modules are linked in (either statically or dynamically by changing the load library path) to alter the functionality implemented by various key functions (corresponding to unsafe BIFs).
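As an illustration only (the function name is invented), the same symbol could be provided by two different library objects, with the node's link/load path selecting which behaviour its processes actually get:

/* ec_file_default.c -- linked into nodes with an unrestricted context */
#include <fcntl.h>
int ec_file_open(const char *path, int flags)
{
    return open(path, flags);            /* normal BIF behaviour */
}

/* ec_file_restricted.c -- linked into nodes with a restricted context */
#include <errno.h>
int ec_file_open(const char *path, int flags)
{
    (void)path; (void)flags;
    errno = EACCES;                      /* deny all file access */
    return -1;
}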
Nodes have a name, which is an atom. Currently (in distributed Erlang) nodes are named as localname@hostname. Probably need some extension of this to identify multiple systems which comprise a single DNS name (eg as in Magnus).
A subnode is just a node, but is distinguished from a topnode by having a defined parent node, and possibly by having initialised (pre-loaded) values for key node tables (esp the registered names table).
A subnode must run on the same system as its parent. This is to allow for efficient sharing of information (such as the linked registered names tables) and message passing between nodes on a system.
A subnode's name is created as an extension/refinement of its parent's name, viz: child.parent@hostname.
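Composing such a name is trivial; a sketch (hypothetical helper name):

#include <stdio.h>

/* Build a subnode name of the form child.parent@hostname from the child's
 * local name and the parent's full name, e.g.
 * ec_subnode_name(buf, sizeof buf, "child", "parent@hostname"). */
int ec_subnode_name(char *buf, size_t len,
                    const char *child, const char *parent_full)
{
    return snprintf(buf, len, "%s.%s", child, parent_full);
}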
It is an open question whether a system can host more than one topnode. On a general purpose O/S this is probably reasonable and needed (and would require the equivalent of the OTP epmd to relay comms between them). On the Magnus Erlang engine, I'm not sure (and this will be related to the structuring of the O/S, such as it is, on these systems - whether there is a manager/init process which is distinct from being an Erlang node).
Also open is how to handle failure/termination of a subnode. This should certainly be signalled to its parent node so action may be taken. I'm not sure whether the death of a subnode should imply the death of all child subnodes of it, or whether child subnodes should be inherited by the parent. If the latter, this probably implies that information about all child subnodes should be given to the topnode to allow management of this.
Had some discussion about how to handle garbage collection (currently missing in EC). This impacts the detailed design for node & thread structures, esp as to how stacks & heap(s) are handled.
Agreed that using an existing GC library was by far the best idea. Must be thread-safe (work with pthreads), and suit the Erlang language semantics. TBD.
Erlang processes are intended to be lightweight, cheap execution threads. Pretty clear that this matches best onto a thread implementation. There may be an issue of whether sufficient threads are available in a node (Unix process / address space) - TBD.
Also matches the concept that a node defines a trust boundary, hence it makes sense for threads to share the same address space. Can rely on Erlang language semantics, as implemented by a trusted compiler, to enforce this. And if not running Erlang code, then it clearly needs to be in a separate address space (Unix process), probably with more restricted access to other nodes.
A process has some core data structures which define information about itself and its parent node and process (which may not be on the same node or even system, given Erlang's ability to spawn remote processes).
Need to think about how to handle monitoring of other processes (and nodes), both locally (almost certainly just relying on existing signals) and remotely (needs to be mediated by node management threads on the remote system and messages being passed back).
Had quite a bit of discussion about name handling, esp wrt handling of "global" names and deficiencies in current approach.
Decided to have several registered name tables (or alternatively a series of linked tables) which are concatenated so as to be seen as a single table by Erlang threads, with more local names overriding more global names.
Have a couple of possibilities for forming the hierarchy:
Whichever is chosen, it needs to be efficient, which suggests having all accessible tables visible in shared memory to all nodes/threads. This is one of the reasons for suggesting that subnodes must be on the same system as their parent nodes.
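A sketch of such a lookup (names hypothetical): each node's registered names table carries a link to its parent's table, and a search walks outwards from the local table, so local registrations shadow more global ones.

#include <stddef.h>
#include <string.h>

struct ec_name_entry { const char *name; void *capa; };

struct ec_name_table {
    struct ec_name_entry *entries;
    size_t                count;
    struct ec_name_table *up;     /* parent node's table, or NULL at the topnode */
};

/* Search the local table first, then each more global table in turn. */
void *ec_whereis(const struct ec_name_table *t, const char *name)
{
    for (; t != NULL; t = t->up)
        for (size_t i = 0; i < t->count; i++)
            if (strcmp(t->entries[i].name, name) == 0)
                return t->entries[i].capa;
    return NULL;
}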
Another long discussion. Agreed that a capability is an unforgeable reference to some resource, along with a list of rights granted to access/use that resource. For efficiency of checking & handling it needs to be a fixed size. It must comprise at least: details of the node to which it belongs, an index into the relevant table of resources, the rights mask, and the validating info.
In the Safe Erlang proposals I had defined a capability as:
{Type,NodeId,Value,Rights,Private}
Agreed that this looks good so far, and refined the components as follows:
Maurice wants to drop the concept of ports as they currently exist in OTP Erlang entirely. The argument is that since we compile, these are not needed; we can call library routines directly (though precise details need to be decided upon). This implies we replace the current concept of a port with a range of more specific capability types:
Currently looking at implementing a node as a heavy-weight Unix process in its own protected address space.
Need to define what data defines a node. From the discussion of capabilities above, clearly need tables of each type of resource indexed by them (file, module, network, node, process).
In the Safe Erlang prototype the following were defined:
%% ninfo is a record containing all the information a node needs
-record(ninfo, {name=noname@nohost,      % Name of this node
                self=bad_capa,           % Capability for this node
                parent=bad_capa,         % Capability for parent node
                names=[],                % Names Table [{Name,Capa}*]
                modules=[],              % Modules Table [{Name,RealName}*]
                subnodes=[],             % Subnodes Table [CNode*]
                processes=[],            % Processes Table [CPid*]
                processes_count=0,       % count of processes in node
                monitors=[],             % Monitoring Processes Table [Pid*]
                p_rights=0,              % Process Rights for this node
                flags=[],                % Node Flags [Flag*]
                capa_mod=?capa_mod,      % Capa mod (sserl_hcapa|sserl_pcapa)
                capa_state,              % Capability State (private to capa_mod)
                capabilities=[],         % Capability Table (private to capa_mod)
                status=dead,             % Status - alive | dead | halt
                ticker=noticker          % Ticker Pid | noticker
               }).
Also need to define what information is placed in shared memory (the registered names tables and communications buffers for the node at least).
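A rough sketch of mapping that shared portion, assuming POSIX shared memory and hypothetical names; the internal layout of the segment is still TBD:

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map a named shared-memory segment holding the tables that must be visible
 * to related nodes on this system (registered names, communication buffers). */
void *ec_node_shared_map(const char *seg_name, size_t size)
{
    int fd = shm_open(seg_name, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : p;
}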
Currently looking at implementing threads (Erlang processes) using the Posix threads library (pthreads).
Need to define what information a process needs about itself.
In the Safe Erlang prototype the following were defined:
%% pinfo contains SSErl info needed by processes
-record(pinfo, {self=noproc,      % Capability for self()
                node={},          % Capability for our parent node
                init={},          % Initial call for this process
                p_rights=0,       % Process Rights from node info
                flags=[],         % Flags [flags*] from node info
                modules=[],       % Modules [{Name,RealName}*] from node info
                rem_mod=[]        % Remote Modules table [{Name,Mid}]
               }).
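A minimal sketch of carrying this per-process information when an Erlang process is spawned as a pthread (hypothetical names; here the pinfo lives in thread-specific data):

#include <pthread.h>
#include <stdlib.h>

/* C counterpart of the pinfo record above (fields abbreviated). */
struct ec_pinfo {
    void    *self;      /* capability for self()                  */
    void    *node;      /* capability for our parent node         */
    unsigned p_rights;  /* process rights inherited from the node */
    /* ... flags, module tables, ...                              */
};

static pthread_key_t ec_pinfo_key;  /* created once with pthread_key_create() */

struct ec_spawn_args { struct ec_pinfo *pinfo; void *(*entry)(void *); void *arg; };

/* Thread trampoline: install the process info, then run the initial call. */
static void *ec_process_run(void *v)
{
    struct ec_spawn_args *a = v;
    pthread_setspecific(ec_pinfo_key, a->pinfo);
    void *res = a->entry(a->arg);
    free(a);
    return res;
}

/* spawn analogue: create a pthread carrying its own ec_pinfo. */
int ec_spawn(struct ec_pinfo *pinfo, void *(*entry)(void *), void *arg)
{
    struct ec_spawn_args *a = malloc(sizeof *a);
    if (a == NULL)
        return -1;
    a->pinfo = pinfo; a->entry = entry; a->arg = arg;
    pthread_t tid;
    return pthread_create(&tid, NULL, ec_process_run, a);
}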
Want a simple, fixed-size format for capabilities. Based on the fields listed above, suggest the following form:
+--------+---------+--------+-------------+---------+
| Type   | Rights  | Value  | Node        | Private |
| bits/5 | bits/27 | int/32 | atom ptr/64 | int/64  |
+--------+---------+--------+-------------+---------+
This allows type+rights to pack into a 32-bit word, value is a 32-bit word, node is a pointer to an atom on the heap (64-bit?) and private is an opaque 64-bit value.
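A possible C rendering of that layout (field names follow the tuple above; the exact widths are still open to change):

#include <stdint.h>

/* Fixed-size capability: type and rights pack into one 32-bit word, value is
 * a 32-bit index into the owning node's resource table, node points at the
 * node-name atom, and the private field holds the validating information
 * (opaque to everything except the issuing node). */
struct ec_capability {
    uint32_t     type   : 5;
    uint32_t     rights : 27;
    uint32_t     value;
    const void  *node;      /* atom pointer (64-bit) */
    uint64_t     priv;      /* "Private" validating value */
};

/* Check that a capability grants a particular right before using it. */
static inline int ec_capa_allows(const struct ec_capability *c, uint32_t right_bit)
{
    return (c->rights & right_bit) != 0;
}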
Had some discussion about representation of atoms (would perhaps like just a smaller atom table reference, but Maurice doesn't like the concept of an atom table - I think this needs to be resolved). TBD.
Need to think about how we will access non-Erlang functions inline. Should this be provided at the language level or only via run-time support libraries? If the former (which I think is probably wanted) there are likely serious safety issues, since we can't enforce control of the address space for non-Erlang code.
Likely mechanism is to just allow arbitrary mod:fn() calls inline, which can be resolved to any compiled code module, Erlang or other. Provide controls by limiting which compiled modules can be linked in (statically or dynamically) to the node.
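For the dynamic case, a sketch of limiting which modules a node may resolve calls into (dlopen-based; the module names and library layout are invented for illustration):

#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

/* Per-node list of compiled modules this node is allowed to load. */
static const char *allowed_modules[] = { "lists", "io", "net_safe", NULL };

/* Resolve a mod:fn call to a function pointer, but only for permitted modules. */
void *ec_resolve_call(const char *mod, const char *fn)
{
    int ok = 0;
    for (int i = 0; allowed_modules[i] != NULL; i++)
        if (strcmp(allowed_modules[i], mod) == 0) { ok = 1; break; }
    if (!ok)
        return NULL;                     /* module not in this node's context */

    char path[256];
    snprintf(path, sizeof path, "./lib/%s.so", mod);   /* assumed layout */
    void *handle = dlopen(path, RTLD_NOW);
    return handle ? dlsym(handle, fn) : NULL;
}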
Likely further down the track, but need some ideas about how to handle code distribution to systems without local file systems, and also replacement of code on running systems.
Note - EC currently only has static linking; need to add dynamic libraries and implement the apply BIF.
With dynamic linking, look at sourcing the library as a binary message rather than from the file system. Probably relates to how we do safe code distribution as well.
Probably need at least two forms: distribution of "trusted" compiled binaries (needing some form of signature?), as well as distribution of modules in an intermediate form which can be compiled locally using a trusted compiler and then loaded.
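One possible shape for the trusted-binary case (illustrative only; the signature check is left as an assumed routine): receive the compiled module as a binary message, verify it, write it to a private file, and load it dynamically.

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

extern int ec_verify_signature(const void *buf, size_t len);  /* assumed, form TBD */

/* Load a module whose compiled code arrived as a binary message rather than
 * from the local file system. */
void *ec_load_module_binary(const void *buf, size_t len)
{
    if (!ec_verify_signature(buf, len))      /* only accept trusted binaries */
        return NULL;

    char path[] = "/tmp/ec_mod_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return NULL;
    FILE *f = fdopen(fd, "wb");
    if (f == NULL)
        return NULL;
    fwrite(buf, 1, len, f);
    fclose(f);

    void *handle = dlopen(path, RTLD_NOW);   /* requires dynamic linking support */
    remove(path);
    return handle;
}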