Erlang — An Open Source Language for Robust Distributed Applications

Dr Lawrie Brown

School of Computer Science, Australian Defence Force Academy, Canberra, Australia

Abstract

This paper introduces Erlang, a functional language designed to support robust, reliable, distributed, near real-time applications. It was developed by the Ericsson Computer Science Laboratories, and was released as open source in late 1998. Erlang includes not just the base language, but the Mnesia DBMS, a web server, an ORB, and an SNMP agent. Collectively these provide some very good tools for applications development.

Introduction

Erlang is a declarative language for programming concurrent and distributed systems. It was developed at the Ericsson Computer Science Laboratories to satisfy a requirement for a language suitable for building large soft real-time control systems, particularly telecommunications systems [AVWW96] [Arms97] [Arms96] [Wiks94]. It is a dynamically typed, single assignment language which uses pattern matching for variable binding and function selection, has inherent support for lightweight concurrent and distributed processes, and has error detection and recovery mechanisms. Most of the Erlang system is written in Erlang (including compilers, debuggers, standard libraries), with just the core run-time system and a number of low-level system calls (known as BIFs for BuiltIn Functions) written in C. Distributed use is almost transparent, with processes being spawned on other nodes, instead of locally. An Erlang node is one instance of the run-time environment, often implemented as a single process (with many threads) on Unix systems. Erlang is currently being used in the development of a number of very large telecommunications applications, and this usage is expected to increase.

A binary distribution of the Erlang system has been available for several years. In late 1998, Ericsson announced the release of Erlang as open source, under the Erlang Public Licence (a derivative of the Mozilla licence). The release comprises full source for the run-time system, compilers (both the original JAM and the new BEAM), the Mnesia DBMS, libraries and utilities. It is available from the erlang.org site [Erlang]. The features of open-source Erlang are described in a White Paper [ADLM98], and in the references listed above.

Why a Functional Language?

Erlang is a functional language. It inherits benefits such as pattern matching for selecting between alternate function clauses, minimal side-effects which greatly simplify analysis and testing of code, and dynamically typed symbolic data representation. Applications are composed of many functions, and the lack of side-effects in most of them means it is easy to test and validate components of the application. Generally the only side-effects occur when interacting with other processes or external resources. This interaction can be easily isolated to assist in the testing process.
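As a small illustration of clause selection by pattern matching, consider the following sketch. The module name shapes and the tuple formats are hypothetical, chosen only for this example. Because the functions have no side-effects, each can be tested in isolation simply by comparing its result with an expected value.

```erlang
-module(shapes).
-export([area/1, total_area/1]).

%% One clause per shape: the first clause whose pattern matches
%% the argument is the one that gets evaluated.
area({square, Side})             -> Side * Side;
area({rectangle, Width, Height}) -> Width * Height;
area({circle, Radius})           -> math:pi() * Radius * Radius.

%% Recursion over a list, again driven by pattern matching:
%% the empty list ends the recursion, [Shape|Rest] splits off the head.
total_area([])             -> 0;
total_area([Shape | Rest]) -> area(Shape) + total_area(Rest).
```

For example, shapes:total_area([{square,4},{rectangle,2,3}]) evaluates to 22. Calling area/1 with a tuple that matches no clause raises a function_clause error rather than returning a wrong answer.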

In addition, because the primary design focus was on distributed soft real-time systems, the language includes extremely effective distribution and concurrency mechanisms. Spawning a new process (thread) is extremely cheap and efficient, and spawning a process on another node is a trivial extension. Given a process identifier, all other communication and interaction with it is independent of whether it is local or remote. This greatly eases the implementation of robust distributed applications.
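The following minimal sketch shows how little differs between the local and remote cases. The module name echo and its message format are assumptions made for this example only. The only difference between start/0 and start/1 is the extra node argument to spawn. The call/2 function is the same whichever node the pid lives on.

```erlang
-module(echo).
-export([start/0, start/1, call/2, loop/0]).

%% Spawn the echo process on the local node.
start() -> spawn(?MODULE, loop, []).

%% Spawn the same process on another node; nothing else changes.
start(Node) -> spawn(Node, ?MODULE, loop, []).

%% Send a request and wait for the matching reply. This works
%% identically for a local or a remote Pid.
call(Pid, Msg) ->
    Ref = make_ref(),                 % unique tag for this request
    Pid ! {self(), Ref, Msg},
    receive
        {Ref, Reply} -> Reply
    after 5000 ->                     % give up after 5 seconds
        timeout
    end.

%% Server loop: echo each request back to its sender.
loop() ->
    receive
        {From, Ref, Msg} ->
            From ! {Ref, {echo, Msg}},
            loop();
        stop ->
            ok
    end.
```

The remote form assumes the compiled echo module is also available on the target node.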

Experience with a number of large application developments seems to show that using Erlang leads to a faster implementation cycle and time to market, compared to using other more traditional development languages (see [AVWW96]).

There are, of course, some disadvantages. Many numerical algorithms don't translate well into Erlang since they usually assume mutable data elements. The use of dynamic typing and garbage collection does involve some runtime overhead, though given the speed of current systems, in many applications this has not proved to be a problem. Also, because the original design focus was not on user interface issues, graphical interfaces may not be as efficient as in some other languages, though they certainly are available.

There is, however, always the possibility of either calling out to an external program or using a linked-in driver, which can be derived from code in any language. It comes down to a case of selecting the best language for any particular task. And in real, large-scale applications, this may often involve using several languages for different components of the system (cf. the AXD301 ATM switch later).
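Calling out to an external program can be sketched with a port, as below. The module name port_demo is hypothetical, and the example assumes a Unix-like system where the given command is available. The external program appears to Erlang as something that sends messages, so its output is collected with an ordinary receive.

```erlang
-module(port_demo).
-export([run/1]).

%% Run an external command through a port and collect its output.
%% Returns {ExitStatus, Output} where Output is a string.
run(Command) ->
    Port = open_port({spawn, Command}, [stream, use_stdio, exit_status]),
    collect(Port, []).

%% Accumulate data messages until the port reports the exit status.
collect(Port, Acc) ->
    receive
        {Port, {data, Data}} ->
            collect(Port, [Data | Acc]);
        {Port, {exit_status, Status}} ->
            {Status, lists:flatten(lists:reverse(Acc))}
    end.
```

For instance, port_demo:run("echo hello") would be expected to return the exit status 0 together with the text printed by the command. A linked-in driver follows the same message-passing pattern, but runs inside the Erlang run-time system, so it is not shown here.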

Features

Because of its design goal to support soft real-time control systems, Erlang includes a number of features more often considered as part of the operating system, including:

Concurrency: using extremely light-weight processes communicating by message passing
Distribution: with easy creation of and interaction with processes on remote nodes
Robustness: using error and exception handling mechanisms to support fault-tolerant systems, including monitoring of processes on remote nodes
Soft real-time response: capable of responding to events within a few milliseconds, with a garbage collector designed to support this
Hot code upgrade: of code modules on a running system with handover from previous to current code versions
Incremental code loading: where code can be loaded either at boot time (embedded systems) or as needed (general purpose systems)
External interfaces to devices, files, network, other processes: using the same message passing mechanism as used with other Erlang processes (Erlang has the view everything is a process, much like the Unix view that everything is a file)

More details are provided in the White Paper [ADLM98].
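To give a feel for the hot code upgrade feature listed above, here is a minimal sketch of the usual idiom. The module name counter and its messages are hypothetical. A local call to loop/1 stays in the version of the code that is currently running, while a fully qualified call (Module:Function) switches to the newest loaded version of the module.

```erlang
-module(counter).
-export([start/0, loop/1]).

start() -> spawn(?MODULE, loop, [0]).

loop(Count) ->
    receive
        {From, next} ->
            From ! {self(), Count + 1},
            loop(Count + 1);        % local call: stays in the running version
        upgrade ->
            ?MODULE:loop(Count);    % qualified call: enters the newest loaded version
        stop ->
            ok
    end.
```

After recompiling and loading a new version of the module, sending upgrade to the process makes it continue in the new code with its state (Count) intact. The export of loop/1 is needed for the qualified call to work.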

Components

As well as the basic language development and run-time system, the open-source Erlang distribution includes a number of standalone components which can be used as building blocks when developing applications including:

Mnesia: a fast, distributed, real-time database for Erlang, running in the same address space as the applications that use it, which supports arbitrarily complex data structures and dynamic schema changes. It was designed specifically to meet the robust, fault-tolerant requirements of telecommunications applications (e.g. billing), but is suitable for a number of other applications. It includes automatic consistency maintenance of replicated data distributed over a number of systems. A SQL interface will be available shortly. See [MNW99] for additional details.
Inets: a package of Internet tools including a fully featured HTTP server with Apache style configuration. The server can be used as a standalone web server, or the HTTP protocol modules can be built into applications to provide a web user interface to them.
Orber: a CORBA v2.0 object request broker (ORB). Erlang programs can be clients or servers. It includes support for common object services such as persistence, naming & events.
SNMP: an extensible SNMP v1/v2 agent and MIB (ASN.1) compiler.

As well, the standard library includes a number of modules for applications monitoring and debugging, GUI interface support (using Tk widgets), parse tools, etc.
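As a taste of how one of these components is used from Erlang code, the following sketch stores and reads back a record using Mnesia. The module name mnesia_demo and the account record are hypothetical. It assumes Mnesia is started on a single node without a disc schema, so the table lives in RAM only, and that run/0 is called once per node lifetime (a second call would find the table already created).

```erlang
-module(mnesia_demo).
-export([run/0]).

%% Records are ordinary Erlang terms; no mapping layer is needed.
-record(account, {id, balance}).

run() ->
    ok = mnesia:start(),
    %% Create a RAM-resident table whose columns are the record fields.
    {atomic, ok} = mnesia:create_table(account,
                       [{attributes, record_info(fields, account)}]),
    %% All access is wrapped in a transaction, passed as a fun.
    {atomic, ok} = mnesia:transaction(
                       fun() -> mnesia:write(#account{id = 1, balance = 1000}) end),
    {atomic, [#account{balance = Balance}]} =
        mnesia:transaction(fun() -> mnesia:read({account, 1}) end),
    Balance.
```

Here run/0 returns the stored balance, 1000. Replicating the table to other nodes is a matter of table options rather than changes to the transaction code.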

A Trivial Example

The following is a very simple illustration of concurrent distributed Erlang use, programming a very simple bank account server (mickey mouse student example :-)

    -module(bankserver).
    -export([start/1]).
    start(Sum) ->                 % start account server
        register(bank,self()),    % register as 'bank'
        account(Sum).             % process transactions
    account(Sum) ->               % transaction processing loop
        receive                   % await transaction
          {Pid, Ref, Amount} ->   % account update msg received
            NewSum = Sum+Amount,  % update sum
            Pid ! {Ref,NewSum},   % send response back
            account(NewSum);      % loop (recurse)
          stop -> nil             % end server msg received
        end.

This could be started on some remote node specified by BankNode, with a balance of 1000, and updated, as follows:

    ...
    % spawn a bank account process with initial Sum 1000
    Bank_pid = spawn(BankNode,bankserver,start,[1000]),
    ...
    Ref = make_ref(),             % make a unique ref value
    Bank_pid ! {self(),Ref,17},   % send msg to server
    receive                       % await reply from server
        {Ref,New_balance} ->      % reply says updated ok
            ...
    end,
    ...

Alternatively, any other process in that node could communicate with it using the registered name (which maps to the appropriate pid):

    bank ! {self(),Ref,-12},      % send msg to server
    ...

Note that this example doesn't use any of the standard mechanisms for programming and monitoring client-server applications, just the core language features. It could be made much smaller and more reliable using them (but then you wouldn't see what was going on behind the scenes!)
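For comparison, here is a sketch of the same account server written with the standard gen_server behaviour. The module name bank_gs and its API functions are hypothetical, and the code assumes a reasonably recent Erlang/OTP release (gen_server:stop/1 is not in older releases). The receive loop, reference handling and reply matching all disappear into the library.

```erlang
-module(bank_gs).
-behaviour(gen_server).

-export([start_link/1, update/1, stop/0]).        % client API
-export([init/1, handle_call/3, handle_cast/2]).  % gen_server callbacks

%% Start the server, registered locally under the module name.
start_link(Sum) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, Sum, []).

%% Apply an update and return the new balance (synchronous call).
update(Amount) ->
    gen_server:call(?MODULE, {update, Amount}).

stop() ->
    gen_server:stop(?MODULE).

%% The server state is just the current sum.
init(Sum) ->
    {ok, Sum}.

handle_call({update, Amount}, _From, Sum) ->
    NewSum = Sum + Amount,
    {reply, NewSum, NewSum}.

%% No asynchronous messages are used; ignore any that arrive.
handle_cast(_Msg, Sum) ->
    {noreply, Sum}.
```

A session would look like bank_gs:start_link(1000), then bank_gs:update(17) returning 1017 and bank_gs:update(-12) returning 1005. Timeouts on calls and clean shutdown come with the behaviour, and such a server can be placed under a supervisor for automatic restart.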

A Slightly More Serious Example

In the Appendices at the end of this paper is code for a very simple HTTPD/0.9 web server. It has just the minimum functionality, extracting the name from the request, appending it to the supplied document root, and returning either that file or an error response. However it does illustrate how Erlang can be used to implement a TCP/IP server, and shows some more extensive error handling.

Whilst probably too lean for use as is, this code could form the outline of code permitting configuration of some application using a web interface. Of course if a serious web server is required, then a fully featured HTTPD/1.0 server is supplied as part of the Open Source Erlang system.

In the second Appendix is the code for a simple application monitor, which starts the requested application, catches and logs any errors that killed it, and then restarts the application. Again this is minimal code, and a fully featured monitor is provided as part of the system, but it illustrates how the process works. It could be used to monitor the above httpd server by being run as:

    monitor:start(httpd,start,[]).

It could easily be extended to monitor an application on a remote node, simply by providing an additional argument Node, and then changing the spawning of the application to be remote using:

    spawn_link(Node,Module,Function,Args)
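Spelled out in full, the extension might look like the sketch below. The module name remote_monitor and the helper name watch are hypothetical. The code otherwise follows monitor.erl from the second Appendix, with Node threaded through as an extra argument.

```erlang
-module(remote_monitor).
-export([start/4, run/4]).
-define(max, 5).

%% Spawn the monitor process itself on the local node.
start(Node, Module, Function, Args) ->
    spawn(?MODULE, run, [Node, Module, Function, Args]).

%% Trap exits, then start the application on the given node.
run(Node, Module, Function, Args) ->
    process_flag(trap_exit, true),
    Child = spawn_link(Node, Module, Function, Args),
    watch(Child, ?max, Node, Module, Function, Args).

%% Wait for the child to die, log the reason, and restart it on the
%% same node, at most ?max times.
watch(Child, N, Node, Module, Function, Args) ->
    receive
        {'EXIT', Child, Why} when N > 0 ->
            io:format("~p:~p~p on ~p died ~p~n",
                      [Module, Function, Args, Node, Why]),
            NewChild = spawn_link(Node, Module, Function, Args),
            watch(NewChild, N - 1, Node, Module, Function, Args);
        {'EXIT', Child, _} ->
            io:format("too many restarts on ~p:~p~p!~n",
                      [Module, Function, Args])
    end.
```

Because links work across nodes, the monitor receives the same 'EXIT' message whether the application crashed or the connection to its node was lost. Passing node() as the first argument gives the original local behaviour.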

Production Applications

Erlang has been used as the major language in the development of a number of Ericsson products in the last few years. The two best known examples are the Mobility Server, and the AXD301 ATM switch.

The Mobility Server is an intelligent call control system incorporated within the Consomo PBX to provide a personal number service for mobile phone users. It comprises 486 Erlang modules with 230k loc (lines of code) and 20k loc in C for the device interfaces, and was written by a team of 35 people on time and under budget. It has now been sold to a number of customers.

The AXD301 ATM switch is a new asynchronous transfer mode (ATM) switching system which combines features associated with data communication, such as compactness and high functionality, with features from telecommunications, such as robustness and scaleability. The AXD301 system is designed for nonstop operation, with duplicated hardware, and modular software, which can be upgraded to facilitate the introduction of new functionality without disturbing traffic.

There are a number of components in its application architecture, and different languages were used for each. All of the device interfaces and interaction with the switching fabric were written in C. The web-based management interface was written in Java. The remainder of the application, including the overall control and management, was written in Erlang. Each of these components comprises about 150k loc.

There are consistent reports that programming in a functional language such as Erlang is faster, results in smaller source code size, and is more maintainable than when traditional procedural languages are used. This is discussed in [Arms96].

And Where do I come In?

For those of you who know me, and are wondering how I came to be playing with Erlang when my research areas are cryptography and security: well during my sabbatical in 1997 I started a project looking at what extensions would be needed to a functional language like Erlang to support safe mobile code execution. The answer is surprisingly little, and you can read about it in [BrSa97], [Bro97d]. It primarily involves replacing forgeable identifiers for pids and ports etc with capabilities which explicitly state what operations are permitted on them, the provision of a hierarchy of sub-nodes within each physical node, which can support a range of security restrictions, and support for remote code loading in context (so references to modules in code loaded from a remote site are also resolved from that remote site). Currently these modifications are being evaluated using a prototype which rewrites the Erlang source in a "safe" manner. For production use though, these changes would need to be incorporated into the actual Erlang run-time system.

Conclusions

In this paper I have introduced Erlang, an open-source language from Ericsson, which I believe provides some very useful tools for building large, scaleable, robust, distributed applications. Take a look for yourselves.

Online Resources

Information and sources are available for:

Erlang: at http://www.erlang.org
or http://www.serc.rmit.edu.au/mirrors/ose_mirror/ (oz shadow)
Eddie: at http://www.eddieware.org

References

ADLM98: Joe Armstrong, Bjarne Däcker, Thomas Lindgren, Håkan Millroth, "Open-source Erlang - White Paper", Ericsson Computer Science Laboratory, Stockholm, Sweden, white paper, 1998. http://www.erlang.org/white_paper.html.
AVWW96: J. Armstrong, R. Virding, C. Wikstrom, M. Williams, "Concurrent Programming in Erlang", 2nd edn, Prentice Hall, 1996. http://www.ericsson.se/erlang/sure/main/news/book.shtml.
Arms96: J. Armstrong, "Erlang - A Survey of the Language and its Industrial Applications", in INAP'96 - The 9th Exhibitions and Symposium on Industrial Applications of Prolog, Hino, Tokyo, Japan, Oct 1996. http://www.ericsson.se/cslab/erlang/publications/inap96.ps.
Arms97: Joe Armstrong, "The Development of Erlang", in Proceedings of the ACM SIGPLAN International Conference on Functional Programming, ACM, pp 196-203, 1997.
BrSa97: L. Brown, D. Sahlin, "Extending Erlang for Safe Mobile Code Execution", School of Computer Science, Australian Defence Force Academy, Canberra, Australia, Technical Report, No CS03/97, Nov 1997. http://lpb.canb.auug.org.au/adfa/papers/tr9703.ps.gz.
Bro97d: L. Brown, "SSErl - Prototype of a Safer Erlang", School of Computer Science, Australian Defence Force Academy, Canberra, Australia, Technical Report, No CS04/97, Nov 1997. http://lpb.canb.auug.org.au/adfa/papers/tr9704.html.
Erlang: Erlang Systems, "Open Source Erlang Distribution", Ericsson Software Technology AB, Erlang Systems, 1999. http://www.erlang.org/.
MNW99: H. Mattsson, H. Nilsson, C. Wikstrom, "Mnesia: A Distributed Robust DBMS for Telecommunications Applications", in First Intl. Workshop on Practical Aspects of Declarative Languages (PADL'99), 1999. http://www.ericsson.se/cslab/~klacke/padl99.ps.
Wiks94: C. Wikstrom, "Distributed Programming in Erlang", in PASCO'94 - First International Symposium on Parallel Symbolic Computation, Sep 1994. http://www.ericsson.se/cslab/erlang/publications/dist-erlang.ps.

Appendix - httpd.erl

A very simple HTTPD/0.9 web server, capable of parsing a request and returning the named file below the configured document root.

%% httpd.erl - a simple HTTPD/0.9 web server in Erlang
-module(httpd).

-export([start/0,start/1,start/2,process/2]).
-import(regexp,[split/2]).

-define(defPort,8888).				%% port to use if not given
-define(docRoot,"./htdocs").			%% HTML document root

%% start mini HTTPD/0.9 server, can specify port/docroot if wanted
start() -> start(?defPort,?docRoot).
start(Port) -> start(Port,?docRoot).  
start(Port,DocRoot) -> 
    case gen_tcp:listen(Port, [binary,{packet, 0},{active, false}]) of
	{ok, LSock}	-> server_loop(LSock,DocRoot);
	{error, Reason}	-> exit({Port,Reason})
    end.

%% main server loop - wait for next connection, spawn child to process it
server_loop(LSock,DocRoot) ->
    case gen_tcp:accept(LSock) of
	{ok, Sock}	->
	    spawn(?MODULE,process,[Sock,DocRoot]),
	    server_loop(LSock,DocRoot);
	{error, Reason}	->
	    exit({accept,Reason})
    end.

%% process current connection
process(Sock,DocRoot) ->
    Req = do_recv(Sock),
    {ok,[Cmd|[Name|[Vers|_]]]} = split(Req,"[ \r\n]"),
    FileName = DocRoot ++ Name,
    LogReq = Cmd ++ " " ++ Name ++ " " ++ Vers,
    Resp = case file:read_file(FileName) of
        {ok, Data}	->
	    io:format("~p ~p ok~n",[LogReq,FileName]),
	    Data;
	{error, Reason}	->
	    io:format("~p ~p failed ~p~n",[LogReq,FileName,Reason]),
	    error_response(LogReq,file:format_error(Reason))
    end, 
    do_send(Sock,Resp),
    gen_tcp:close(Sock).

%% construct HTML for failure message
error_response(LogReq,Reason) ->
    "<html><head><title>Request Failed</title></head><body>\n" ++
    "<h1>Request Failed</h1>\n" ++ "Your request to " ++ LogReq ++
    " failed due to: " ++ Reason ++ "\n</body></html>\n".

%% send a line of text to the socket
do_send(Sock,Msg) ->
    case gen_tcp:send(Sock, Msg) of
        ok		-> ok;
	{error, Reason}	-> exit(Reason)
    end.

%% receive data from the socket
do_recv(Sock) ->
    case gen_tcp:recv(Sock, 0) of
	{ok, Bin}	-> binary_to_list(Bin);
	{error, closed}	-> exit(closed);
	{error, Reason}	-> exit(Reason)
    end.

Appendix - monitor.erl

A very simple application monitor which illustrates how errors can be caught and an application restarted.

%% monitor - simple application monitor

-module(monitor). 
-export([start/3,run/3]). 
-define(max,5).

%% spawn of monitor process
start(Module, Function, Args) -> 
    spawn(?MODULE,run,[Module, Function, Args]).

%% start application being monitored with traps being caught
run(Module, Function, Args) -> 
    process_flag(trap_exit, true), 
    Child = spawn_link(Module,Function,Args),
    monitor(Child, ?max, Module, Function, Args).

%% wait for something to go wrong, log, restart application
monitor(Child, N, Module, Function, Args) -> 
    receive 
        {'EXIT', Child, Why} when N > 0 -> 
            io:format("~p:~p~p died ~p~n", [Module,Function,Args,Why]), 
            NewChild = spawn_link(Module,Function,Args),
            monitor(NewChild,N-1,Module,Function,Args); 
        {'EXIT', Child, _} -> 
            io:format("too many restarts on ~p:~p~p!~n", [Module,Function,Args])
    end.

The latest version of this paper may be found at: http://lpb.canb.auug.org.au/adfa/papers/auug99-erl.html.

This paper was last revised: 30 July 1999.