Introducing SERCs Safer Erlang
SSErl - Prototype of a Safer Erlang
by
Dr Lawrie Brown
School of Computer Science,
Australian Defence Force Academy,
Canberra, Australia
Email: Lawrie.Brown@adfa.edu.au
Last updated: 24 April 1997.
Abstract
In order to support outsourced and third party telecommunications
applications, there is a desire to modify the Erlang language and
execution environment to provide safe and partitioned execution of
externally sourced or outsourced programs which are imported and
run on a local Erlang system.
This paper outlines a possible design approach, and describes the
initial prototype.
Introduction
Erlang is a declarative language for programming
concurrent and distributed systems which was developed at the Ericsson
and Ellemtel Computer Science Laboratories
[AVWW96], [Arms96],
[Wiks94]. It is a dynamically typed, single
assignment language which uses pattern matching for variable binding
and function selection, has explicit mechanisms to create concurrent
and distributed processes, and advanced facilities for error detection
and recovery.
Mobile Code is code sourced from remote, possibly
"untrusted" systems or suppliers, but imported and
executed on a local system.
Consequently such code needs to be executed within some form of
"constrained" or "sandbox" environment to
protect the local system from accidental or deliberate inappropriate
behaviour.
Given the anticipated rapid growth in telecommunications applications
software, there is expected to be a rapidly increasing need to
support third party outsourced code being executed on trusted systems.
It is believed this can be done with an acceptable level of safety by
the use of containment methods being developed to support the
concept of mobile code, as used in
Java [GM95],
SafeTCL [OLW96],
Omniware [LSW95],
and Telescript [Tar95],
amongst other systems
(see the overview in [Brow96b]).
The approach being considered here is to support the execution of
a number of mutually untrusted (and untrusting) programs within
a dedicated Erlang node. This involves partitioning the node into
a collection of separate "subnodes", which provide a restricted
execution environment or "sandbox", along with controlled access
to processes in other "sandbox" environments. Each of these
"sandboxes" forms a separate security domain, where the operations
available will be constrained by an appropriately chosen security
policy. Different policies can be enforced in different "sandboxes".
Making Erlang Safer
The Erlang language provides a number of inherent benefits. Its
dynamic typing and single assignment prevent many classes of errors.
The main additions involve controlling access to resources used to
create and communicate with other processes, and to external devices.
Currently this is through the use of Pids and Ports, with few
restrictions on their use. Also, there is a need to partition a
single "hardware" erlang node into a number of subnodes, each
with a custom view of the world (in terms of registered names
and which modules are available and used).
I propose protecting the former (Pids and Ports), as well as Nodes, by
making them password capabilities
[APW86]. In a password capability system, the
capability is a data item which indicates the entity owning it (a node
in this case), and a random value (selected sparsely from a large
address space). An appropriate capability must be supplied, explicitly
or implicitly, in order to perform most "unsafe" operations (which in
Erlang involve the use of BuiltIn Functions - BIFs). The user is free
to try and forge a capability, but it is statistically highly
improbable that they'll create a valid one. The capability has no
meaning on its own, but is only of use when supplied to its owner (a
node) along with a request for some operation. One advantage of
password capabilities is the ease of revocation, by removing it from
its owning entity's list of currently valid capabilities. Any process
subsequently trying to use it will fail with an invalid_capability error (as
also occurs if a forged capability is used).
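The creation, checking and revocation steps described above can be sketched in a few lines of Erlang. This is an illustration only, not the SSErl implementation: the module name capa_sketch, its function names, the map-based table and the not_permitted reason are assumptions made here. It is written for a current Erlang/OTP release rather than the system used by the prototype, and rand is used purely for brevity (it is not unpredictable enough for real use).

```erlang
-module(capa_sketch).
-export([new_table/0, create/4, check/3, revoke/2]).

%% Hypothetical owner-side table: maps a capability to {Class, Val, Rights}.
new_table() -> #{}.

%% Create a capability of the given Type, owned by Node, for Value.
%% The reference and random number make the capability hard to guess.
%% NOTE: rand is NOT a secure source; it is used here only for brevity.
create(Table, Type, Node, {_Class, _Val, _Rights} = Value) ->
    Capa = {Type, Node, make_ref(), rand:uniform(1 bsl 32)},
    {Capa, Table#{Capa => Value}}.

%% Check that Capa is currently valid and permits the operation Op.
check(Table, Capa, Op) ->
    case maps:find(Capa, Table) of
        {ok, {_Class, Val, Rights}} ->
            case lists:member(Op, Rights) of
                true  -> {ok, Val};
                false -> {error, not_permitted}
            end;
        error ->
            %% Forged and revoked capabilities look exactly the same here.
            {error, invalid_capability}
    end.

%% Revocation is simply removal from the owner's table.
revoke(Table, Capa) -> maps:remove(Capa, Table).
```

Because the capability carries no meaning of its own, revoking it needs no cooperation from whoever holds a copy: once the entry is gone from the owner's table, every later check fails in the same way as for a forged value.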
For the latter, I propose creating a concept of
subnodes to provide custom views. Each subnode should
provide a "context" for processes executing in it. It provides
distinct "registered names" and "module alias" tables. The registered
names can be modified by processes with an appropriate subnode
capability, and are used to send messages to named servers. The "module
alias" table is specified when the node is created, and is used to
alias module names at run-time when functions are invoked. By
appropriate customisation, modules executing within a subnode can be
provided with a custom view of modules used and servers available.
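As a rough sketch of how run-time module aliasing could work, the following looks a module name up in an alias table of {Name,Alias} pairs before making the call. The module name alias_sketch and its functions are hypothetical and are not part of SSErl; only exact-match aliasing is shown.

```erlang
-module(alias_sketch).
-export([resolve/2, call/4]).

%% Return the alias for Module if one is listed, otherwise Module itself.
%% Aliases is a list of {Name, Alias} pairs, matched exactly.
resolve(Module, Aliases) ->
    case lists:keyfind(Module, 1, Aliases) of
        {Module, Alias} -> Alias;
        false           -> Module
    end.

%% Invoke Module:Function(Args) after aliasing the module name.
call(Aliases, Module, Function, Args) ->
    apply(resolve(Module, Aliases), Function, Args).
```

For instance, with the alias table [{mylists, lists}], a call to mylists:reverse/1 made through call/4 is actually served by the standard lists module, while unlisted module names pass through unchanged.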
Prior Work
A system with similar goals, though with a stronger focus on code
mobility, was SafeErlang, developed by Gustaf Naeser et al.
at Uppsala [Nae97a], [JNS97].
SafeErlang incorporates three new concepts. Encrypted
capabilities are used for pids and nodes, but not (yet) for
open_port. This seriously limits its ability to protect key standard
library routines, and accesses to external resources. Also the use of
encryption introduces problems with appropriate key distribution in
distributed environments, along with the selection of appropriate
cryptographic algorithms. Subnodes are used to
provide a custom collection of modules (with names rewritten at compile
time), and to provide resource limits on processes executing in the
subnode. Lastly, a new module loading (mid) mechanism
is supplied to support code mobility. A new "code" server is used to
manage the code (mid) distribution as required. In their current
system, this necessitates a recompilation of source every time the
module is loaded (in part to cope with varying module names needed,
depending on which subnode it is being loaded in to).
The focus of their project was to support mobile agents, reflected in
the emphasis on providing new module management functions, and providing
resource limits for subnodes. For this, as reported in
[JNS97], it has been successful.
However I found the arrangement of servers used, and the division
of responsibility for implementing various operations, to be
unnecessarily complex. In SafeErlang, each "real" erlang node (system) has
a "code" server which manages the new module loading mechanism;
a "gate" server which manages the keys for all the subnodes on that system;
and a "name" server which implements the replacement registered name system.
Also for each (sub)node on the system, there is a "node" server process
which manages the information for that subnode.
Every time an "unsafe" BIF is invoked in a user process,
a request is sent to the "gate" server which decrypts and checks the
capability supplied. Usually a message is then sent from the "gate" server
to the relevant node manager which actually
implements the requested operation. Consequently the node managers
must explicitly manage all the requested links between processes.
Responses are returned via the same path.
I believe this involves an unnecessary amount of message passing and
state maintenance, and that a simpler and cleaner design is possible.
SERCs Safer Erlang Prototype
SERCs Safer Erlang (SSErl) is a prototypical implementation of what I
believe to be a simpler and more comprehensive design of a secure
erlang execution environment. It is less concerned with module
mobility (though that should be added later by providing a custom error
handler along with the use of the module alias mechanism), than with
providing a comprehensive and elegant implementation of capabilities
and subnodes with as few changes as possible to the existing erlang
language specification. It is currently implemented as a collection of
glue functions substituted by a modified erlang compiler for all calls
of "unsafe" BIFs. These interact with "node" server processes, one for
each distinct (sub)node on the erlang system. The prototype supports
capabilities for pids, ports, and nodes; and a hierarchy of subnodes on
each erlang system. It uses a modified compiler (adapted from that
developed by Naeser [Nae97a] for the SafeErlang
system).
Most of the glue functions have the form:
- check with the node manager to see if the operation is permitted by
the capability
- if so some key information is returned (generally a pid or a capability)
- perform the desired operation (in the user's process) if necessary
Some routines require two capability checks (generally to see if the
executing process is permitted to do, and the target capability
permits, the desired operation). Functions to spawn new processes, open
a port, or create a new subnode are also a little, but not much, more
complex, since they must initialise some data structures. I believe
this general structure mirrors fairly closely the logic that should be
followed if these features were to be implemented within the Erlang
RunTime System (ERTS) for production use.
Each SSErl process maintains a record of information which includes:
- self
- a capability for itself, which determines which (potentially unsafe)
operations the process is permitted to perform.
- parent
- a capability for its parent node, used to restrict rights for newly
created subnodes, and some other operations.
- group_leader
- a capability for the process group_leader (used for I/O).
- apply check function
- the name of the function which may (optionally) be called by the
apply glue routine before any external function calls to validate those calls,
described in [Bro97b].
Copied from the parent node state table.
- aliases
- the list of module aliases, used to redirect external function calls
from "well-known" library modules to safer variants.
Copied from the parent node state table.
This record is stored in the process dictionary,
and is protected by modified put and erase
functions from changes outside the sserl glue routines.
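One way such protection could look is sketched below: guarded versions of put/2 and erase/0 that refuse to touch a reserved key. The key name sserl_info, the module name and the not_permitted exit reason are assumptions for illustration; the source does not say how the prototype names or guards its record.

```erlang
-module(dict_guard_sketch).
-export([k_put/2, k_erase/0]).

%% Hypothetical key under which the per-process record is stored.
-define(PROTECTED, sserl_info).

%% put/2 replacement: user code may not overwrite the protected record.
k_put(?PROTECTED, _Value) -> exit(not_permitted);
k_put(Key, Value)         -> put(Key, Value).

%% erase/0 replacement: clears the dictionary but keeps the protected
%% record, and hides it from the list returned to the caller.
k_erase() ->
    Saved = get(?PROTECTED),
    Old = erase(),
    case Saved of
        undefined -> ok;
        _         -> put(?PROTECTED, Saved)
    end,
    lists:keydelete(?PROTECTED, 1, Old).
```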
Capabilities
In the prototype, a capability is a tuple {Type,Node,Ref,Rand},
where Type is one of capapid|capaport|capanode|caparef;
Node is the name of the (sub)node which created the capability;
Ref is an erlang reference value; and Rand is a random number.
The latter two provide the statistical protection for the capability.
The capability has no intrinsic meaning until it is supplied to the
owning node manager. The node manager maintains as part of its state
a list of [{Capa,Value}*] which maps a capability to its value.
Value is a tuple of the form {Class,Val,Rights}, where
Class identifies the object referenced as one of process|port|node|{user type};
Val is the process id, port no, node manager process id, or user
supplied value (atom or integer);
and Rights is the list of access rights.
Currently the access rights supported for the various classes are:
- process
- db, exit, garbage_collect, group_leader, kill, link, local, open_port,
priority, process_info, register, restrict, revoke, send, spawn, spawn_link,
trace, trap_exit, unlink, unregister, view
- port
- link, local, restrict, revoke, send, unlink, view
- node
- delete_module, info, link, load_module, local, net_kernel, newnode,
processes, restrict, revoke, shutdown, spawn, spawn_link, unlink, view
- ref
- local, restrict, revoke, view
Most of these correspond to permitting the BIF of the same name
(or the process_flag for trap_exit or priority).
Rights specific to capability manipulation include:
- info
- permits access to node state information;
- local
- indicates that name registrations may not be made globally available;
- restrict
- permits restriction of the capability
- revoke
- permits revocation of the capability (provided it is a restricted variant).
- view
- permits viewing of the capability value {Class,Val,Rights}.
An appropriate capability must be supplied either explicitly (as a
pid/port/node argument), or implicitly (from the process's knowledge of
its own capability, or its parent node's capability),
in order to perform most "unsafe" BIFs.
A capability is created whenever a process is spawned, a port is opened,
a subnode is created, a reference is made,
or an existing capability is restricted.
They may also be created with very limited rights for existing processes
outside the SSErl environment as part of its initialisation,
or to correspond to pids from a list_to_pid BIF.
Capabilities are destroyed (removed from the relevant node table)
when the associated object (process, port or node) dies.
User capabilities (references with a user supplied type and value) are intended
to assist in providing finer control for file accesses, I/O device accesses,
or other potentially sensitive operations.
Subnodes
SSErl subnodes provide a distinct context for processes executing
within them. Each subnode has a "node" server process which maintains
the state for that subnode. The servers are registered by their node
name in the "real" erlang system registered names table.
This allows glue functions executing in user processes
to communicate with a specified node server (as given by
the node name embedded in a capability). This also allows access to
non-local node servers by sending a message to '{node host}'.
The state managed by the server process for each subnode includes:
- name
- the name of the (sub)node as an atom, extended from the system name
- self
- a capability for itself (defined in its own capability table).
- parent
- a capability for its parent (defined in its parent's capability table)
- capability table [{Capa,Value}*]
- maps capabilities to their associated real data (pid) & rights
- registered name table [{Name,Capa,Pid}*]
- maps names to a process capability, thus permitting
different subnodes to have the same name referencing different processes,
allowing custom variants of standard services to be provided
- module alias table [{Name,Alias}*]
- remaps module names at runtime to an alias name,
allowing different subnodes to direct
the same module name to different actual modules. Currently this
uses an exact match, it may be extended to include a prefix-match,
allowing name extensions on all otherwise unmatched names, if desired.
- subnode table [{CNode,Pid}*]
- provides a list of all subnodes which are children of this subnode
- process table [{CPid,Pid}*]
- provides a list of all processes belonging to the subnode
- prototypical process rights
- used to restrict rights for newly created processes in the node
- apply check function {Mod,Func}
- the name of the function which may (optionally) be called by the
apply glue routine before any external function calls to validate those calls,
see [Bro97b].
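To illustrate the per-subnode registered name table, the sketch below keeps a separate [{Name,Capa,Pid}] list for each subnode, so the same name can refer to different processes in different subnodes. The module and function names are hypothetical, and the already_registered reason is an assumption.

```erlang
-module(names_sketch).
-export([register_name/4, whereis_name/2]).

%% Add Name to one subnode's table of {Name, Capa, Pid} entries.
register_name(Names, Name, Capa, Pid) ->
    case lists:keymember(Name, 1, Names) of
        true  -> {error, already_registered};
        false -> {ok, [{Name, Capa, Pid} | Names]}
    end.

%% Look a name up in one subnode's table, returning the capability
%% (never the real pid) or undefined if the name is not registered.
whereis_name(Names, Name) ->
    case lists:keyfind(Name, 1, Names) of
        {Name, Capa, _Pid} -> Capa;
        false              -> undefined
    end.
```

Two subnodes can each register file_server in their own table, with each lookup returning the capability for that subnode's own variant of the service.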
Glue Functions
Glue functions have been provided for a number of "unsafe" BIFs
in order to implement the capability and subnode functionality,
and to impose more stringent checks on the right to perform these functions.
These "unsafe" BIFs may be categorised in groups as:
Apply, Spawn, Module, Node, Process, Misc and Db.
There are also some new BIFs to manipulate capabilities and subnodes.
Calls made to these BIFs are replaced by the modified compiler
(more specifically by the sys_pre_expand module),
eg. erlang:spawn(M,F,A) becomes sserl_bifs:k_spawn(M,F,A).
As mentioned earlier, most of the glue functions have the form:
- check with the node manager to see if the operation is permitted by
the capability
- if so some key information is returned (generally a pid or a capability)
- perform the desired operation (in the user's process) if necessary
For example, the link
glue routine is:
k_link(CPid) ->
Pid = node_request(check,CPid,link),
link(Pid).
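The node_request call used above is not shown in this paper. The following is a guess at its general shape, assuming only what the text states: node servers are registered under their node name, and that name is embedded in the capability. The message format, the toy server, the timeout and the exit reasons are all invented for this sketch.

```erlang
-module(node_req_sketch).
-export([start/2, node_request/3]).

%% Start a toy node server registered under NodeName.
%% Table is a map from capability to {Class, Val, Rights}.
start(NodeName, Table) ->
    Pid = spawn(fun() -> loop(Table) end),
    true = register(NodeName, Pid),
    Pid.

%% Server loop: answer capability checks until told to stop.
loop(Table) ->
    receive
        {From, Tag, check, Capa, Op} ->
            From ! {Tag, lookup(Table, Capa, Op)},
            loop(Table);
        stop ->
            ok
    end.

lookup(Table, Capa, Op) ->
    case maps:find(Capa, Table) of
        {ok, {_Class, Val, Rights}} ->
            case lists:member(Op, Rights) of
                true  -> {ok, Val};
                false -> {error, not_permitted}
            end;
        error ->
            {error, invalid_capability}
    end.

%% Client side, run in the user's process: ask the owning node server
%% whether Capa permits Op; return the real value or exit with a reason.
node_request(check, {_Type, Node, _Ref, _Rand} = Capa, Op) ->
    Tag = make_ref(),
    Node ! {self(), Tag, check, Capa, Op},
    receive
        {Tag, {ok, Val}}       -> Val;
        {Tag, {error, Reason}} -> exit(Reason)
    after 5000 ->
        exit(timeout)
    end.
```

With this shape, the glue routine needs one request and one reply per capability check, and the operation itself (the link) still runs in the calling process.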
By category, the glue functions are:
Apply
- apply({Module,Function}, ArgList)
- apply(Module, Function, ArgList)
- both safe and unsafe calls could result from a call to apply.
Nested applies are checked, safe variants are allowed, unsafe variants
are rewritten, the apply check function is called (if specified),
the module names are aliased, and finally the desired function is called.
A process capability is returned.
Spawns and Open_Port
- spawn(Module, Function, ArgList)
- spawn_link(Module, Function, ArgList)
- these check that the requesting process has the right to, and the
parent node permits, the spawn to occur. The new process is created
and initialised. A process capability for the new process is returned.
- spawn(Node, Module, Function, ArgList)
- spawn_link(Node, Module, Function, ArgList)
- these spawn the process on a separate subnode (on either the same
or a different underlying erlang system), returning a process capability.
- open_port(PortName, PortSettings)
- creates a new port, and returns a capability for it, if the
process is permitted. Unfortunately this affects communications with
the port, since the real underlying Pid is used as part of the
communications. Currently there is no easy way to catch and rewrite
this code, so port interaction code must be rewritten slightly to work
with the SSErl system.
Module
- delete_module(Module)
- checks the process is permitted to, and aliases the module name
before deleting the module.
- load_module(Module)
- checks the process is permitted to, and aliases the module name
before loading the module.
- purge_module(Module)
- checks the process is permitted to, and aliases the module name
before purging the module.
- check_process_code(Pid, Module)
- retrieves the real pid, aliases the module name and is permitted.
- function_exported(A1,A2,A3)
- is permitted as is.
- module_loaded(Module)
- aliases the module name and is permitted
- pre_loaded()
- is permitted as is.
Node
- alive(Name, Port)
- blocked as redundant
- disconnect_node(Node)
- blocked as redundant
- get_cookie()
- permitted, but should become redundant
- set_cookie(Node, Cookie)
- permitted if self has net_kernel right, though should become redundant
- is_alive()
- permitted, but redundant
- monitor_node(Node, Flag)
- permitted if self has net_kernel right
- newnode(Name)
- newnode(Parent,Name)
- newnode(Parent,Name,{Node_Rights,Proto_Rights,Names,Aliases,Options,Apply_Chk})
- new BIFs to create a subnode of the parent node, if the process's parent node
capability permits. Any option values not supplied (given as nil) will be
inherited from the parent node.
- node()
- returns the name of the parent node of the process
(should really be a capability)
- node(Arg)
- returns the name of the node which created Arg
(should really be a capability)
- node_info(Node)
- new BIF to return the node state information if Node permits info.
- nodes()
- returns a list of node names (should really be capabilities)
- halt()
- shutdown(Node)
- new BIF to terminate a node (halt defaults to parent node) if
shutdown is permitted.
If the topnode on an erlang system is shut down, then the entire system
is halted.
Process Communications
- exit(Pid, Reason)
- used to cause a process to exit if permitted
- kill(Pid)
- used to kill a process if permitted
(equiv to exit(Pid,kill) but with a separate right)
- group_leader()
- returns the capability of the group leader process
- group_leader(Leader, Pid)
- used to set the group leader of a process if permitted
- link(Pid)
- links to a process if permitted
- list_to_pid(List)
- converts a list to a pid, returning a (very restricted) capability
- pid_to_list(Pid)
- converts the pid referenced by the capability to a list if view permitted
- processes()
- processes(Node)
- returns a list of all process capabilities executing on a node
(defaults to parent node).
- process_flag(Flag, Option)
- used to set the trap_exit or priority flags, if self has the
corresponding right. Altering the error_handler is not permitted.
- process_info(Pid)
- process_info(Pid, Key)
- retrieves process information if permitted.
- register(Name, Pid)
- used to register a name and corresponding capability
on the parent node if both the capability and self permit.
Any form of capability may be registered.
Currently names are strictly local. A global server may be supported later.
- registered()
- retrieves a list of registered name and capability pairs.
- self()
- returns the process's own capability.
- send(To,Msg)
- BIF which handles the message send operation "To!Msg".
"To" may be a locally registered name, a remotely registered name and
remote node capability, or a process capability;
which identifies the process to receive the message.
- unlink(Pid)
- unlinks from a process if permitted.
- unregister(Name)
- unregister(Pid)
- unregister the name or corresponding pid if it permits.
- whereis(Name)
- returns the capability referenced by the registered name on the
parent node.
Capability Functions
- check(Capa,Op)
- new BIF to check if the capability permits op.
- make_ref()
- make_ref(Type,Val)
- creates a reference capability. The latter call associates a user
Type (an atom) and Val (an atom or integer) with the reference capability
for use as a user capability.
- restrict(Rights)
- restricts the list of rights for a process's self capability -
does not create a new capability.
nb. the new list of rights will be the intersection of the existing
and supplied lists of rights.
- restrict(Capa,Rights)
- new BIF to create a restricted version of an existing capability.
nb. the new list of rights will be the intersection of the existing
and supplied lists of rights.
- revoke(Capa)
- revoke(Capa,Master)
- new BIF to revoke a capability restricted from self or Master.
- same(C1,C2)
- new BIF to test whether the two capabilities (which may be restricted
variants) refer to the same underlying object (eg pid or port).
- view(Capa)
- new BIF to access the information the capability refers to
(its real pid and access rights) if view is permitted.
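The intersection rule for restrict(Capa,Rights) can be sketched as follows. This is an assumption-laden illustration rather than the prototype's code: the module name, the map-based table, the return shapes and the use of rand are all choices made here. It does follow the text in requiring the restrict right and in giving the new capability the same underlying object with the intersected rights.

```erlang
-module(restrict_sketch).
-export([restrict/3]).

%% Create a restricted variant of Capa in Table (a map from capability
%% to {Class, Val, Rights}). The new rights are the intersection of the
%% existing rights and Wanted, so rights can never be gained this way.
%% NOTE: rand is NOT a secure source; it is used here only for brevity.
restrict(Table, {Type, Node, _Ref, _Rand} = Capa, Wanted) ->
    case maps:find(Capa, Table) of
        {ok, {Class, Val, Rights}} ->
            case lists:member(restrict, Rights) of
                true ->
                    New = ordsets:intersection(ordsets:from_list(Rights),
                                               ordsets:from_list(Wanted)),
                    NewCapa = {Type, Node, make_ref(), rand:uniform(1 bsl 32)},
                    {ok, NewCapa, Table#{NewCapa => {Class, Val, New}}};
                false ->
                    {error, not_permitted}
            end;
        error ->
            {error, invalid_capability}
    end.
```

Asking for a right the original does not hold (trace, say) has no effect, and a restricted variant created without the restrict right cannot itself be restricted further.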
Misc
- erase()
- put(Key,Value)
- permitted except that the sserl process information may not be erased
or modified.
- garbage_collect()
- garbage_collect(Pid)
- permitted if self or pid respectively permits garbage_collect.
- statistics(Type)
- permitted.
- trace(Pid,How,Flags)
- permitted if pid permits trace.
DB
All the various "db_*" BIFs are replaced by functions which check
self permits db before they are called.
This probably ought to be refined further, but has not been a
priority for this prototype.
Using the SSErl Prototype
The SSErl prototype is distributed as a tar file which includes source
and precompiled jam files for the sserl, modified compiler, and changed
standard library modules. A README file is included which describes
the (minimal) customisation required. Also required is a working
Erlang 4.4.1 system (available from [Erla96]).
Once installed, a Unix shell script sserl is used to start
the SSErl system. It is invoked as "sserl" (normally),
or "sserl -verbose" (for copious debug information).
It starts erlang in a distributed mode by default. It includes a
slightly modified Eshell which initialises the SSErl environment, and
executes any commands given using the modified apply in an sserl
environment. Thus all the new BIFs are available.
An alternate script nsserl is provided which uses a custom
boot file start_sserl.boot, which must be generated from
an appropriately customised copy of start_sserl.script
using mkboot:mkboot(start_sserl). This script is much
more dependent on the Erlang system structure.
A number of additional utility routines are provided in the
sserl module, and have been incorporated into shell_default,
and are available directly from the shell prompt. These are all
described in the shell help(). Some of the most useful include:
- info()
- info(Node)
- which display the node status information (rather long and verbose).
- ps()
- ps(Node)
- which lists all processes executing on a node
- names()
- names(Node)
- which lists all registered names on a node
- mknode(Name)
- safenode(Name)
- create a new unrestricted or limited subnode
Subnodes are created by the newnode(Name) BIF (or the safety
policynode(Name,Policy_Module) or safenode(Name)
library functions, see [Bro97b]).
A capability for the new node is returned. This may then be used with
spawn to run processes in the node.
Some functions are provided in the test module in the test
subdirectory, to exercise various aspects of the SSErl environment,
particularly focusing on the modified BIFs. See test:help()
for details of the various test functions.
An abbreviated sample sserl session is given in the listing below.
It assumes sserl was started in the test subdirectory of the distribution.
Some details have been omitted for brevity.
UNIX> sserl
Erlang (JAM) emulator version 4.4
Eshell V4.4 (abort with ^G)
SSErl Node 'lpb@galaxy.serc.rmit.edu.au' initialised.
(lpb@galaxy.serc.rmit.edu.au)1> help().
** shell internal commands **
... various standard output omitted
** commands in module sserl (SERCs Safer Erlang) **
init() -- Create top node (done by shell).
help() -- Displays this help.
info() -- Displays info about top node.
info(Cnode) -- Displays info about node.
... other help details omitted
(lpb@galaxy.serc.rmit.edu.au)2> info().
Node Info Details
Name 'lpb@galaxy.serc.rmit.edu.au'
Node Capa {capanode,'lpb@galaxy.serc.rmit.edu.au',#Ref,7330370}
Parent Capa topnode
Subnodes
Processes
{capapid,'lpb@galaxy.serc.rmit.edu.au',#Ref,2092775} -> <0.27.0>
{capapid,'lpb@galaxy.serc.rmit.edu.au',#Ref,2248609} -> <0.21.0>
Process Cnt 2
Process Rights [..rights..]
Capabilities
{capapid,'lpb@galaxy.serc.rmit.edu.au',#Ref,2092775} -> {process,<0.27.0>,[..rights..]}
{capapid,'lpb@galaxy.serc.rmit.edu.au',#Ref,5056377} -> {process,<0.15.0>,[..rights..]}
{capapid,'lpb@galaxy.serc.rmit.edu.au',#Ref,2583009} -> {process,<0.20.0>,[..rights..]}
{capapid,'lpb@galaxy.serc.rmit.edu.au',#Ref,2248609} -> {process,<0.21.0>,[..rights..]}
{capanode,'lpb@galaxy.serc.rmit.edu.au',#Ref,7330370} -> {node,<0.25.0>,[..rights..]}
Seed {667,25635,181}
Names
file_server -> {capapid,'lpb@galaxy.serc.rmit.edu.au',#Ref,5056377}, <0.15.0>
Aliases []
Options []
Status alive
Ticker <0.26.0>
ok
(lpb@galaxy.serc.rmit.edu.au)3> N1=safenode(saf1).
{capanode,'saf1.lpb@galaxy.serc.rmit.edu.au',#Ref,5591103}
(lpb@galaxy.serc.rmit.edu.au)4> P1=spawn(N1,test,test,[]).
{capapid,'saf1.lpb@galaxy.serc.rmit.edu.au',#Ref,5293704}
Test - simple test to see self - at time {15,55,9}
test: self {capapid,'saf1.lpb@galaxy.serc.rmit.edu.au',#Ref,5293704} -> {process,<0.32.0>,[db,exit,garbage_collect,group_leader,kill,link,local,open_port,priority,process_info,register,send,spawn,spawn_link,trace,trap_exit,unlink,unregister,view]}
Process List for node 'saf1.lpb@galaxy.serc.rmit.edu.au'
Capa# Pid Current Call
5293704 <0.32.0> {sserl_bifs,k_process_info,2}
(lpb@galaxy.serc.rmit.edu.au)5> ps(N1).
{capapid,'saf1.lpb@galaxy.serc.rmit.edu.au',#Ref,7545331}
Process List for node 'saf1.lpb@galaxy.serc.rmit.edu.au'
Capa# Pid Current Call
7545331 <0.33.0> {sserl_bifs,k_process_info,2}
5293704 <0.32.0> {test,snooze,1}
(lpb@galaxy.serc.rmit.edu.au)6> stop().
UNIX>
Programming in the Safe Environment
Programming in the safe environment should be very little different
from normal erlang coding, save that some operations may be restricted
when the code is executed.
Generally the new BIFs would only be used in creating a custom
environment, or in some utility modules (which handle display
of capabilities for example). As an example, the utility function
safenode(Name), which is supplied as part of a suite
of utility functions in the safety module, is listed
below (note it uses other utility functions from the safety
and ordsets modules).
%% safenode/1 - creates a "safer" subnode of the parent
%%   node rights exclude delete_module,load_module,net_kernel,newnode
%%   processes within it may not use group_leader,open_port,priority
safenode(Name) ->
    CParent = get_dict(node),       % get parent capability
    % restrict node rights list from parent for new node
    Ri = view_rights(CParent),      % get rights of parent node
    NR = subtract(Ri,
          list_to_set([delete_module,load_module,net_kernel,newnode])),
    % restrict proto process rights for new node
    St = node_info(CParent),
    PR = subtract(St#ninfo.p_rights,
          list_to_set([group_leader,open_port,priority])),
    % add safe module aliases to Aliases table
    NewAli = [{file,safe_file},{lists,safe_lists},{ordsets,safe_ordsets},
              {random,safe_random},{string,safe_string},{unix,safe_unix}],
    Ali = append(NewAli, St#ninfo.aliases),
    % start the safe versions of daemons used by safenode modules
    catch safe_file:start(),
    % create new node with safer rights and custom world-view
    newnode(CParent,Name,{NR,PR,nil,Ali,nil,nil}).
This also demonstrates the use of aliases. The safenode
library function uses the safe_file_server, which restricts file
access to the current directory only, but is accessed using the usual
file functions, with the module name being appropriately
aliased.
Limitations of the SSErl Prototype
The current prototype has a number of limitations.
- most of the current library modules are not (yet) compiled in
the safe environment. Whilst they should run, they will not see the
custom subnode environment, nor will they handle capabilities
- performance will be reduced due to the extra layer of glue functions,
and the necessity to exchange messages with the node manager process
for all unsafe BIFs
- open_port functionality involves a visible change in use, since
the real underlying Pid must be used as part of the communications dialog
- display of capabilities for processes and nodes is obviously
different to what is seen at present (though presumably the io_lib
functions could be changed to hide this)
- values returned for nodes are currently inconsistent, some
functions return a node capability, others return a node name.
In part this is due to not having underlying node capabilities
in the net_kernel.
All of these limitations could be addressed by incorporating the
changes directly in a new version of the Erlang Run-Time System.
Incorporating SSErl in the Erlang RunTime Environment
Once experience with this prototype has verified the validity of this
approach to providing a safe erlang execution environment, it would be
much better to incorporate the changes into a new version of the ERTS.
Also at this time, further safety checks could be made on the manner
in which the ERTS has been written.
Capabilities and Subnodes in the ERTS
Capabilities should be a fundamental erlang data type, similar
to a reference. It would be uniquely tagged of course, and
should include some identifier for the subnode
which created the capability (probably an index into the atom
table, as I believe is done now for processes and references),
along with a random value selected unpredictably and sparsely
from a large possible space. This should use around 128 bits.
Any pseudo-random generator function used must be
seeded with information that is hard to predict externally
(ie some combination of time and current system data structures
at least, a true random source would be ideal though).
These capabilities would be used for processes, ports, and nodes,
as well as extensibly for other data requiring protection.
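For the random component, a current Erlang/OTP release offers crypto:strong_rand_bytes/1, which draws from a cryptographically strong source. The sketch below shows how a 128-bit value of the size suggested above might be obtained; the module and function names are hypothetical, and this library did not exist in this form when the prototype was written.

```erlang
-module(capa_rand_sketch).
-export([new_secret/0]).

%% Return an unpredictable 128-bit non-negative integer, suitable as the
%% sparse random part of a capability.
new_secret() ->
    <<N:128>> = crypto:strong_rand_bytes(16),
    N.
```

With 128 bits, the chance of guessing any one valid value in a single attempt is about 1 in 3.4 x 10^38, which is what makes forging a capability statistically implausible.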
Subnodes should be added as a concept in the ERTS. They will primarily
involve a table of relevant status information for each distinct subnode,
along with some means of locating this table in the system both
internally and externally. The table will include much of the information
currently managed by the node manager processes in this prototype.
The capability table will map capabilities to their underlying
values and rights.
The process state information will need to be extended to include
capabilities for itself, its parent, and its group leader;
and probably a pointer directly to its parent node state table for
efficiency. Note the parent node capability need not be the same
as that recorded in the node state table, it may very well be a
restricted version of it.
All the BIFs implemented in the ERTS which involve potentially unsafe
operations will need to be rewritten to incorporate an appropriate
check of rights from the supplied (or inherited) capabilities
before proceeding.
Auditing BIFs and Standard Library Routines
All BIFs and standard library functions which are written in a general
programming language (eg C) will need to be audited for careless coding
practices which could be used to subvert the type safety of the
system. Such practices have been found to be a major source of security
flaws in existing systems (eg see discussion on Java weaknesses in
[DFW96]).
This audit will be time-consuming, but is necessary to ensure safety.
Examples of poor style include any use of the standard C functions
gets, sprintf, strcat, strcpy; ie any functions which could overrun
a buffer supplied to them due to the absence of bounds checks on their
parameters. The basic requirement is that all parameters be checked
to ensure that bounds are not exceeded, that their values are sane,
and that they cannot cause a run-time execution fault.
Other Changes for Improved Security
Some other changes which could be considered in the ERTS include
the placement and implementation of the message buffer, and of
the process scheduler.
Currently there is a single message buffer shared by all processes in
an Erlang Node. A single buffer was chosen for efficiency and capacity
management reasons, but it does leave all processes on the Node
susceptible to a denial of service attack. Such an attack could be
mounted by a rogue process flooding some server with malformed messages
that are not matched by any receive patterns in that server, and thus
are never flushed from the buffer. With the implementation of subnodes,
consideration should be given to making the message buffer a component
of the subnode rather than the node. Also, to reduce the impact of
flood attacks, some mechanism for garbage collecting "old" messages
should be considered, perhaps with a caveat that only messages which
have been checked against a pattern and rejected some number of times
are eligible. This is similar to solutions proposed to overcome the
current spate of TCP SYN attacks on the Internet.
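The proposed collection of repeatedly rejected messages might work along the
following lines. This is a minimal Python model; the class Mailbox, its
methods, and the threshold MAX_REJECTS are assumptions introduced for this
sketch and do not describe the existing ERTS message buffer.

```python
MAX_REJECTS = 3  # hypothetical threshold before a message may be collected


class Mailbox:
    """Hypothetical message queue that ages out unmatched messages."""

    def __init__(self, max_rejects=MAX_REJECTS):
        self.max_rejects = max_rejects
        self._queue = []  # entries are [message, reject_count]

    def deliver(self, message):
        self._queue.append([message, 0])

    def receive(self, matches):
        """Return the first message accepted by `matches`, or None.

        Each message examined and rejected has its counter incremented.
        """
        for i, entry in enumerate(self._queue):
            if matches(entry[0]):
                del self._queue[i]
                return entry[0]
            entry[1] += 1
        return None

    def collect(self):
        """Drop messages rejected at least max_rejects times; return the count."""
        before = len(self._queue)
        self._queue = [e for e in self._queue if e[1] < self.max_rejects]
        return before - len(self._queue)

    def __len__(self):
        return len(self._queue)
```

A message that has merely not yet been examined is never collected in this
model; only messages that a receive has actually tested and rejected the
required number of times become eligible.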
The process scheduler should probably also be modified with the
introduction of subnodes. Rather than share CPU cycles amongst
all ready processes, consideration could be given to allocating
shares to the various subnodes, and then dividing each share amongst
all ready processes in that subnode.
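The effect of such two-level sharing can be shown with a small calculation.
The Python sketch below is illustrative only; the function cpu_fractions, the
use of integer shares, and the even division within a subnode are assumptions
of this example rather than a proposed scheduler design.

```python
from fractions import Fraction


def cpu_fractions(subnodes):
    """Compute each ready process's fraction of the CPU.

    `subnodes` maps a subnode name to (integer share, list of ready process
    ids). Subnodes with no ready processes receive nothing, and their share
    is redistributed amongst the others.
    """
    active = {name: (share, procs)
              for name, (share, procs) in subnodes.items() if procs}
    total = sum(share for share, _ in active.values())
    result = {}
    for share, procs in active.values():
        # Subnode level first, then an even split inside the subnode.
        per_process = Fraction(share, total) / len(procs)
        for pid in procs:
            result[pid] = per_process
    return result
```

With two subnodes of equal share, one running a single process and the other
running four, the single process receives half of the CPU and each of the
four receives one eighth. A subnode thus cannot gain a larger share merely by
spawning more processes.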
Protecting Erlang from External Attack
If the ERTS is assumed to be safe from compromise (ie assume
that no-one will gain root-type privileges on its host and
interfere with its address space(s) directly), then the only
mechanism for external subversion is via "spoof" messages being
sent to the port(s) associated with the net_kernel in distributed
Erlang implementations. At present, the only security mechanism
used is to require a suitable "cookie" be sent with each message
[AVWW96]. However, the cookie is sent in the
clear, and is thus subject to eavesdropping and subsequent masquerade
by an attacker.
In order to secure the messages being exchanged between distributed
Erlang nodes, it is necessary to either physically protect all
communications links used, or to employ cryptographic techniques to
secure the communications. Possible approaches to the latter involve
the use of a "digital signature" instead of a "cookie" (eg a
signed hash using the shared secret), or alternatively, full encryption
of all links. The use of SSL (Secure Sockets Layer) code would most
likely be the best choice here [HY96]. In any
case, it would mean that the new "safe distributed Erlang" would be
incompatible with the existing system. This may, or may not, be a
problem.
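The "signed hash using the shared secret" option could be realised as a keyed
hash (HMAC) appended to each message, so that the secret itself never crosses
the link. The following Python sketch is one possible form under stated
assumptions: the functions sign_message and verify_message, the packet layout,
and the choice of SHA-256 are all illustrative and are not taken from the
Erlang distribution protocol.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign_message(secret: bytes, payload: bytes) -> bytes:
    """Return the payload followed by a keyed hash computed with the secret."""
    tag = hmac.new(secret, payload, hashlib.sha256).digest()
    return payload + tag


def verify_message(secret: bytes, packet: bytes):
    """Return the payload if its tag is valid, otherwise None."""
    if len(packet) < TAG_LEN:
        return None
    payload, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking information through comparison timing.
    return payload if hmac.compare_digest(tag, expected) else None
```

A scheme of this kind authenticates each message but neither hides its
contents nor prevents an eavesdropper replaying a captured packet; a sequence
number or nonce covered by the hash, or full link encryption, would be needed
for that.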
Conclusions
This paper describes the rationale, design approach, and details of
the SSErl prototype of a more secure Erlang execution environment.
The prototype will be used to evaluate whether an appropriate level
of abstraction has been chosen, and whether the interfaces provided
are appropriate for the development of "safe" imported code systems.
It is anticipated that once the design approach is validated, it
will then be incorporated in a new version of the Erlang RunTime System.
Acknowledgements
The SSErl prototype and this paper were written during my special studies
program in 1997, whilst visiting SERC in Melbourne
and NTNU in Trondheim, Norway. I'd like to thank my colleagues at these
institutions, and at the Ericsson Computer Science Laboratory in Stockholm
for their discussions and support.
References
- APW86
M. Anderson, R.D. Pose, C.S. Wallace,
"A Password Capability System",
The Computer Journal, Vol 29, No 1, pp 1-8, 1986.
- AVWW96
J. Armstrong, R. Virding, C. Wikstrom, M. Williams,
"Concurrent Programming in Erlang",
2nd edn, Prentice Hall, 1996.
http://www.ericsson.se/erlang/sure/main/news/book.shtml.
- Arms96
J. Armstrong,
"Erlang - A Survey of the Language and its Industrial Applications",
in INAP'96 - The 9th Exhibitions and Symposium on Industrial Applications of Prolog,
Hino, Tokyo, Japan, Oct 1996.
http://www.ericsson.se/cslab/erlang/publications/inap96.ps.
- Bro96b
L. Brown,
"Mobile Code Security",
in AUUG 96 and Asia Pacific World Wide Web 2nd Joint Conference,
AUUG, Sept 1996.
http://lpb.canb.auug.org.au/adfa/papers/mcode96.html.
- Bro97b
L. Brown,
"Custom Security Policies in SSErl",
Australian Defence Force Academy, Canberra, Australia, Technical Note, Apr 1997.
http://lpb.canb.auug.org.au/adfa/papers/ssp97/sserl97b.html.
- DFW96
D. Dean, E.W. Felten, D.S. Wallach,
"Java Security: From HotJava to Netscape and Beyond",
in Proceedings IEEE Symposium on Security and Privacy,
IEEE, May 1996.
http://www.cs.princeton.edu/sip/pub/secure96.html.
- Erla96
Erlang Systems,
"Erlang Distribution",
Ericsson Software Technology AB, Erlang Systems, 1996.
http://www.ericsson.se/erlang/.
- GM95
J. Gosling, H. McGilton,
"The Java Language Environment: A White Paper",
Sun Microsystems, May 1995.
ftp://ftp.javasoft.com/docs/.
- HY96
T.J. Hudson, E.A. Young,
"SSLeay and SSLapps FAQ",
Uni. Queensland, 1996.
http://www.psy.uq.edu.au:8080/~ftp/Crypto/.
- JNS97
I. Jonsson, G. Naeser, D. Sahlin, et al.,
"Adapting Erlang for Secure Mobile Agents",
in Practical Applications of Intelligent Agents and Multi-Agents: PAAM'97,
London, UK, Apr 1997.
http://www.ericsson.se/cslab/~dan/reports/paam97/final/paam97.ps.
- LSW95
S. Lucco, O. Sharp, R. Wahbe,
"Omniware: A Universal Substrate for Mobile Code",
in Fourth International World Wide Web Conference,
MIT, Dec 1995.
http://www.w3.org/pub/Conferences/WWW4/Papers/165/.
- Nae97a
G. Naeser,
"Your First Introduction to SafeErlang",
CS, Uppsala University, Jan 1997.
http://www.csd.uu.se/~gaffe/general/safe/nae97a.ps.gz.
- OLW96
J.K. Ousterhout, J.Y. Levy, B.B. Welch,
"The Safe-Tcl Security Model",
Sun Microsystems Laboratories, Mountain View, CA 94043-1100, USA, Nov 1996.
http://www.sunlabs.com/research/tcl/safeTcl.ps.
- Tar95
J. Tardo,
"An Introduction to Safety and Security in Telescript",
General Magic Inc., 1995.
http://cnn.genmagic.com/Telescript/TDE/security.html.
- Wiks94
C. Wikstrom,
"Distributed Programming in Erlang",
in PASCO'94 - First International Symposium on Parallel Symbolic Computation,
Sep 1994.
http://www.ericsson.se/cslab/erlang/publications/dist-erlang.ps.