Mobile Code Security
Dr Lawrie Brown
School of Computer Science,
Australian Defence Force Academy,
Canberra, Australia
Email: Lawrie.Brown@adfa.edu.au
Abstract
With the growth of distributed computer and telecommunications systems,
there have been increasing demands to support the concept of "mobile
code", sourced from remote, possibly untrusted, systems, but executed
locally. The best known examples of this are WWW applets, but it is also
manifest in dynamic email and, more recently, in support for third-party
suppliers in the emerging Telecommunications Information Networking
Architecture (TINA). Supporting mobile code introduces a number of
serious security and safety issues that must be addressed. This paper
will introduce some of these issues, and outline some of the proposed
solution approaches, as utilised in languages such as Safe-TCL, Java,
and Omniware.
Introduction
"Mobile Code" is code sourced from remote, possibly "untrusted" systems,
but executed on your local system. Examples include:
web applets, dynamic email, and TINA building blocks.
The concept of "mobile code" has been called by many names:
mobile agents, mobile code, downloadable code, executable content,
active capsules, remote code, and others.
All these deal with the local execution of remotely sourced code.
Mobile Code Examples
Examples of mobile code include:
- Web Applets
- Mini-programs written in Java, which are
automatically loaded & run on being named in an HTML document.
A document can include a number of applets, and these may be sourced
from a number of different servers, and run virtually without the user
being aware of them.
- Dynamic Email
- One proposal for the provision of dynamic email suggested
incorporating Safe-TCL scripts as components of MIME email.
These scripts could be run either on mail delivery, or when the mail
is read by the recipient.
- TINA Building Blocks
- The evolving "Telecommunications Information Networking
Architecture" (see NDC95)
includes support for 3rd party service providers
who can supply TINA Building Blocks (objects), which can manipulate
network resources in order to provide value added services to clients.
An outline of some of the security and safety issues is given in
Shah96.
All of these examples illustrate that the use of mobile code will
raise a number of serious security and safety issues.
This paper will outline some general approaches to,
and specific examples of, "safe" systems.
I finish by mentioning some flaws which have been
found in existing systems, in order to derive some lessons for future designs.
Low-level Security Issues
The use of "mobile code" raises a number of obvious security issues:
- access control -- is the use of this code permitted
- user authentication -- to identify valid users
- data integrity -- to ensure the code is delivered intact
- non-repudiation -- of use of the code, for both the sender and
the receiver, especially if its use is being charged
- data confidentiality -- to protect sensitive code
- auditing -- to trace uses of mobile code
Techniques for providing these security services are well known.
Their provision is not a technical problem, but rather a political
and economic one. It involves the use of cryptographic extensions
to communications protocols. These are well described in the
OSI Security Framework, the ISO 10181 and CCITT X.810-X.816 standards,
in the IETF IP-SEC proposals, and the Secure Web protocols.
Clearly a system which supports "mobile code" will need to
provide these services. Before too long, I believe we will see them.
A more interesting question, though, is how to safely execute the code
once it has been validly and correctly delivered to the end-user's system.
Mobile Code Safety
The prime focus of this paper is on the techniques which can be used
to provide for the safe execution of imported code on the local system.
This has to address threats due to rogue code being loaded and run.
Of course in many ways, these problems are not new: they have been a
key component of operating systems design on multi-user systems for
many years. The traditional approach to addressing these problems
has been to use heavy address space protection mechanisms, along
with user access rights to the file system and other resources.
The difference between the traditional problems, and those posed by
mobile code, is one of volume and responsiveness. Mobile code
is intended for quick, lightweight execution, which conflicts
with the cost of heavy address space mechanisms in most current operating
systems. Also, each mobile code unit can, in one sense, be thought of as
running as its own unique user, to provide protection between
the various mobile code units and the system. Traditional methods
of adding new users cannot cope with this demand.
The types of attacks which need to be guarded against include:
- denial of service
- disclosure of confidential information
- damage or modification of data
- annoyance attacks
Some example scenarios which can be imagined include:
a video-on-demand service which discreetly scans local files for information;
an online game which opens a covert connection to run programs locally;
and an invisible program that captures system activity information.
Resource Access & Safety
Fundamentally, the issue of safe execution of code comes down to a
concern with access to system resources.
Any running program has to access system resources in order to perform
its task. Traditionally, that access has been to all normal user resources.
"Mobile Code" must have restricted access to resources
for safety. However, it must be allowed some access in order to perform
its required functions. Just which types of access, and how these are
to be controlled, is a key research issue.
The types of resources to which access is required include:
- file system
- network
- random memory
- output devices (entire display, various windows, speaker ...)
- input devices (keyboard, mic ...)
- process control (access to CPU cycles)
- user environment
- system calls
Language Support for Safety
When considering means of providing safe execution, if heavy address
space protection mechanisms are not being used, then considerable
reliance is going to be placed on the verified use of
type-safe programming languages. These ensure that
arrays stay in bounds, that pointers are always valid, and that code cannot
violate variable typing (such as placing code in a string and then executing
it). These features are needed to ensure that various code units do not
interfere with each other, and with the system.
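As a minimal illustration of the kind of guarantee a type-safe language
provides (a hypothetical Java fragment, not drawn from any of the systems
discussed here), an out-of-bounds array access is trapped at run time rather
than silently overwriting adjacent memory:

    // Hypothetical illustration: in a type-safe language such as Java, an
    // out-of-bounds access raises an exception instead of corrupting
    // whatever happens to lie beyond the array in memory.
    public class BoundsDemo {
        public static void main(String[] args) {
            int[] buffer = new int[4];
            try {
                buffer[10] = 42;   // bounds are checked at run time
            } catch (ArrayIndexOutOfBoundsException e) {
                System.out.println("access trapped: " + e);
            }
        }
    }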
If type-safe languages are being used, we want assurance of the
type system's soundness and safety, and validation of the
type-checking implementations, all of course without compromising
efficiency.
In addition, the usual sound programming procedures need to
be followed. The system should be designed in a modular fashion,
separating interfaces from implementations in programs, and with
appropriate layering of libraries and module groups, with
particular care being taken at the interfaces between security boundaries.
One general approach to designing "safe" execution environments is to
remove general library routines which could compromise security,
and replace them with more specific, safer ones,
e.g. replace a general file access routine with one that
can write files only in a temporary directory.
Great care is needed with this approach to ensure that unforeseen
interactions or implementation flaws do not negate the desired
security. This has been an area where failures have occurred on a
number of occasions.
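As a sketch of this idea (in Java, with hypothetical names; the principle
itself is language-neutral), a general file-open routine might be replaced
by one that will only ever create files beneath a designated temporary
directory:

    import java.io.File;
    import java.io.FileWriter;
    import java.io.IOException;

    // Hypothetical sketch: a restricted replacement for a general
    // file-access routine, limited to a single temporary directory.
    public class RestrictedFileAccess {
        private static final File SANDBOX = new File("/tmp/mobile-code");

        public static FileWriter openForWrite(String name) throws IOException {
            File target = new File(SANDBOX, name);
            // Reject names such as "../../etc/passwd" that escape the sandbox.
            if (!target.getCanonicalPath()
                       .startsWith(SANDBOX.getCanonicalPath() + File.separator)) {
                throw new IOException("access outside temporary directory refused: " + name);
            }
            SANDBOX.mkdirs();
            return new FileWriter(target);
        }
    }

Even a small wrapper like this shows where the unforeseen interactions
mentioned above arise: the path canonicalisation, the choice of directory,
and every other routine that can still reach the file system all have to be
considered together.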
Granting Access to Resources
One of the key issues in providing for safe execution of "mobile code"
is determining exactly which resources a particular code unit is to be
granted access to. That is, there is a need for a security policy which
determines what type of access any "mobile code" unit has.
This policy may be:
- fixed for all "mobile code" units
- very restrictive but easy, and the approach currently used to
handle applet security in web browsers such as Netscape.
- the user verifies each security-related access request
- relatively easy, but rapidly gets annoying, and eventually is
self-defeating when users stop taking notice of the details of the requests.
Whilst there is a place for querying the user, it should be used exceedingly
sparingly.
- negotiate for each "mobile code" unit
- much harder, as some basis is needed for negotiation,
perhaps based on various profiles, but ultimately this is likely
to be the best approach.
In the longer term, some mechanisms are needed to permit negotiation of
appropriate accesses. How this is expressed is, I believe, one of the
key research issues. Initially this is likely to be based on
a simple tabular approach, based on the various categories mentioned
above. While adequate for the simplistic applets seen to date,
this is unlikely to be sufficient for more complex "mobile code"
applications. For these, some fairly powerful language is going to be needed
to express the required types of accesses, along with a means of reasoning
about those requests. For example, consider a simple "mobile code"
text-editor -- it should be able to change any textual
file specified by the user, have access perhaps to a preferences file,
but otherwise be denied access to all other files. How can this be expressed and
reasoned with? This is an area that needs considerable additional work,
but will be a key to the successful use of "mobile code".
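As a first approximation, a simple tabular policy of the kind suggested
above might be sketched as follows (Java, with hypothetical resource
categories and permission levels):

    import java.util.EnumMap;
    import java.util.Map;

    // Hypothetical sketch of a simple tabular access policy for a mobile
    // code unit, using the resource categories listed earlier.
    public class MobileCodePolicy {
        enum Resource { FILE_SYSTEM, NETWORK, MEMORY, DISPLAY,
                        INPUT, CPU, ENVIRONMENT, SYSTEM_CALLS }
        enum Permission { NONE, RESTRICTED, FULL }

        private final Map<Resource, Permission> table =
            new EnumMap<Resource, Permission>(Resource.class);

        public MobileCodePolicy() {
            // Restrictive default: deny everything until explicitly granted.
            for (Resource r : Resource.values()) {
                table.put(r, Permission.NONE);
            }
        }

        public void grant(Resource r, Permission p) { table.put(r, p); }

        public boolean allows(Resource r) {
            return table.get(r) != Permission.NONE;
        }
    }

A flat table like this cannot express the text-editor policy above ("any
file the user selects, plus one preferences file"), which is exactly why a
more expressive policy language, and a means of reasoning about it, are
needed.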
Mobile Code Technologies
Having considered some of the issues raised by the need for "safe"
execution of "mobile code", I will now summarise some approaches
that have been tried.
One method of categorising "mobile code" technologies, given in
TW96, is based on the type of code distributed:
Source Code
The first approach is based on distributing the
source for the "mobile code" unit used.
This source will be parsed and executed by an
interpreter on the user's system. The interpreter is
responsible for vetting source to ensure it obeys the required
language syntactic and semantic restrictions; and then for
providing a safe execution "sand-box" environment.
The safety of this approach relies on the correct specification
and implementation of the interpreter.
The main advantages of the source code approach are the distribution of
relatively small amounts of code; the fact that since the user has the
full source, it is easier to vet the code; and that it is
easier for the interpreter to contain the execution environment.
Disadvantages include the fact that it is slow, since the source must
first be parsed; and that it is hard to expand the core functionality,
since the interpreter's design limits this.
Programmable MUDs
Early examples which included aspects of "mobile code"
were some of the programmable MUDs (see Bro93),
e.g. MUCK, MOO, and UberMUD. These systems could execute
source authored by arbitrary users anywhere in the world, manually
transferred to the MUD system and run in the MUD interpreter environment.
Safeguards were provided by the fact that the MUD interpreter had
no access to the host system apart from the single MUD database file.
However, any MUD program had full access (as the running user) to MUD data.
One limitation was that users needed explicit permission to author code:
once granted however, they were trusted not to abuse the privilege.
These systems are early illustrations of some of the concepts: the use
of a "sand-box" interpreter, and restrictions on the source of code.
Safe-TCL
The most widespread and common example of the source code approach is
Safe-TCL, a subset of the TCL language with restricted features for safety.
TCL was designed by John Ousterhout as a
simple, clean, interpreted, embeddable command language,
with a graphical toolkit (Tk) (see Ous94).
Safe-TCL is restricted by having
limited file system access, and is prevented from executing arbitrary
system commands. Safe-TCL code is usually
executed by the "untrusted interpreter" (that is, the interpreter
which executes code from an untrusted source). A key component of the
Safe-TCL system is the provision of another "trusted interpreter"
(which executes code from a trusted source). Trusted code
can be used to extend the capabilities of the Safe-TCL system.
Such extension code can be invoked by any code running on the
"untrusted interpreter", but the extension code uses the
"trusted interpreter".
This provides a clean mechanism for extending the system.
Safe-TCL was designed by Nathaniel Borenstein and Marshall Rose as
a means of augmenting email to include active messages, termed
"Dynamic Email" (see Bor94). With the addition
of new MIME types: application/safe-tcl
and multipart/enabled-mail,
Safe-TCL programs could be incorporated into email messages,
and executed either on delivery or access by the recipient.
The concepts in Safe-TCL were subsequently adopted by the Tcl group,
and are now incorporated as standard in the latest Tcl/Tk releases.
More recently, Safe-TCL has been adapted for use on the web to execute
"Tclets" -- Safe-TCL code downloaded by a web browser and executed by
an interpreter on the user's system (see Tclet96).
It is currently handled by a
plug-in on common browsers such as Netscape (see Lev96).
Safe-TCL has also been extensively used in the First Virtual Internet
payment system.
JavaScript
JavaScript is a source-level scripting language, which is
embedded in an HTML document. It is NOT Java! It is
interpreted by the user's web browser, and
allows control over most of the features of the web browser.
It has access to most of the content of its HTML document, and has
full interaction with the displayed content.
It can access Java methods (and vice versa), providing access
to features not present as standard in JavaScript.
Currently there is only a very coarse level of security management:
it is either enabled or disabled. Its security features are not
yet well documented.
Intermediate Code
A second approach to providing "mobile code" is to have the
programs compiled to a platform-independent intermediate code,
which is then distributed to the user's system.
This intermediate code is executed by an interpreter on the user's system.
Advantages are that it is faster to interpret than source,
since no textual parsing is required, and the intermediate code
is semantically much closer to machine code.
The interpreter provides a safe execution "sand-box", and again, the
safety of the system depends on the interpreter.
The code in general is quite small, and the user's system can vet the
code to ensure it obeys the safety restrictions.
Disadvantages of this approach are its moderate speed, since an interpreter
is still being used, and the fact that less semantic information is
available to assist in vetting the code than if source were available.
JAVA
Probably the best known intermediate code technology today is Java.
It is Sun Microsystems' "executable content" technology, using an
interpreted, dynamic, type-safe object-oriented language
(see GM95). Its safety features include the use of
runtime bytecode verification, late dynamic binding of modules,
automatic memory management, and exception processing. Considerable effort
has gone into ensuring its safety in design and implementation.
This safety is, however, dependent on the correct specification
and implementation of both the verifier/interpreter AND the
standard library implementation (esp. SecurityManager). Failures
in these areas have led to some security flaws,
as described later.
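To give the flavour of where those library-level checks sit, the following
fragment sketches a custom SecurityManager (a hypothetical, much-simplified
example; the real applet SecurityManager is considerably more involved):

    // Hypothetical, much-simplified sketch of a SecurityManager that refuses
    // local file reads and only allows network connections back to the host
    // the code was loaded from, in the spirit of the applet sandbox.
    public class AppletStyleSecurityManager extends SecurityManager {
        private final String codeSourceHost;

        public AppletStyleSecurityManager(String codeSourceHost) {
            this.codeSourceHost = codeSourceHost;
        }

        public void checkRead(String file) {
            throw new SecurityException("applet may not read local file: " + file);
        }

        public void checkConnect(String host, int port) {
            if (!host.equals(codeSourceHost)) {
                throw new SecurityException("applet may only connect back to "
                                            + codeSourceHost);
            }
        }
    }

A browser would install such a manager (via System.setSecurityManager)
before running any applet code; several of the flaws described later amount
to finding paths around checks of this kind.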
Telescript
Telescript is a technology for creating distributed applications
using "mobile agents" (see Tar95). A key
difference between Telescript and Java is that a Telescript
"mobile agent" is a migrating process that is able to autonomously
transfer its execution to a different system by asking to "go" elsewhere.
Like Java, it is an interpreted, dynamic, type-safe object-oriented language,
compiled to an intermediate code, with runtime type checking and
late dynamic binding, automatic memory management, and exception processing.
Additional features include object persistence and remote access,
enabling objects to access each other over the network.
Because of its migratory and remote access features,
authentication and protection features are integral.
Currently, Telescript is also supported via Netscape plugins for
web applications, as well as using dedicated interpreters for
other distributed applications.
Native Binary Code
The final category of code distribution uses native binary code,
which is then executed on the user's system. This gives the
maximum speed, but means that the code is platform dependent.
Safe execution of binary code requires:
- restricted use of instruction set
- restricted address space access
Approaches to ensuring this can rely upon:
- traditional heavy address space protection, which is costly in
terms of system performance and support;
- the verified use of a trusted compiler, which guarantees to
generate safe code that will not violate the security restrictions;
- the use of "software fault isolation" technologies
(see WLAG93) which augment the instruction stream,
inserting additional checks to ensure safe execution
(Ste92).
A combination of verified use of a trusted compiler, and the
software fault isolation approach has created considerable interest,
especially when used with a Just-in-time Compiler.
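The essential transformation that software fault isolation applies can be
sketched in Java-flavoured pseudocode (a conceptual illustration only; real
SFI rewrites the native instruction stream, not source code): every store or
jump through a computed address is preceded by masking code that forces the
address into the module's own segment.

    // Conceptual sketch of software fault isolation. Real SFI inserts
    // equivalent masking instructions into the binary, so that a module can
    // only ever write inside its own protection domain.
    public class FaultIsolationSketch {
        private static final int SEGMENT_BASE = 0x40000;  // hypothetical segment for this module
        private static final int OFFSET_MASK  = 0x0FFFF;  // 64K-word segment size

        private final int[] memory = new int[1 << 20];    // stands in for the address space

        // The inserted check: mask the target so it always falls within
        // this module's segment, whatever address the code computed.
        void sandboxedStore(int address, int value) {
            int safeAddress = SEGMENT_BASE | (address & OFFSET_MASK);
            memory[safeAddress] = value;
        }
    }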
Just-in-time Compilation
Just-in-time Compilation (JIT) is an approach that combines the portability
of intermediate or source code with the speed of binary code.
The source or intermediate code is distributed, but is then
compiled to binary on the user's system before being executed.
If source is used, it is slower, but easier to check.
If intermediate code is used, then it is faster.
Another advantage is that the user can utilise their
own trusted compiler to verify code, and insert the desired
software fault isolation run-time checks.
This approach is being used with Java JIT compilers, and
also in the Omniware system.
Omniware
Omniware is yet another technology for "mobile code"
(see LSW95). Omniware code is
written in C++, which is then compiled to an intermediate
code for the OmniVM. This is distributed, and at run-time
is translated to native code for execution.
It relies on "software fault isolation" techniques
to enforce safe execution of binaries. This
adds special checking code which emulates an MMU in software, placing
each module in its own protection domain. The run-time environment
vets access to resources. The major advantages claimed for Omniware
are that it uses a standard, well known language, C++, that it
is fast, since binary code is actually being executed, and yet it
is safe, due to the use of the "software fault isolation" techniques.
Theory vs Practice
There are a number of good proposals for providing safe execution of
"mobile code". However, some flaws have been found in practice.
Most of the recent effort has focused on Java (see below), although the
researchers believe that other systems would be likely to
have similar flaws if they were as closely scrutinised.
By examining the flaws found, some lessons may be drawn to assist
with future designs.
Java Implementation Flaws
A number of implementation flaws have been found in the Java system
(see DFW96, Ban95,
Yel95).
These include:
- Problems with network security, mostly created using DNS spoofing
to subvert the interpreter's view of the domain namespace, and subsequently to
violate the restriction on opening connections only back to the source
of the Java code.
- There were some early problems with buffer overflows in sprintf
in the original JDKs. These have mostly been fixed, except in javap
(where care is still needed).
- Some of the standard routines provide information about the
layout of storage for objects. This is probably not a serious flaw,
but more information is revealed than is perhaps necessary.
- In HotJava, the proxy variables were public, which meant that any
Java program could change them, and thus redirect all requests from the
user's browser.
- There are some problems with inter-applet security. Applets are
supposed to be quarantined from each other. However, using the
thread manager, an applet can discover which other applets have
running threads, control attributes of these threads, and even
discover the applets' names (since these are encoded in the thread names),
as illustrated in the sketch below.
All of these flaws can be corrected by changes to the standard Java
run-time environment: many have already been made.
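The inter-applet thread problem mentioned in the list above can be
illustrated with a short fragment (hypothetical code, using only standard
thread-group calls): an applet can climb to the root thread group and
enumerate every thread in the browser, including those belonging to other
applets.

    // Hypothetical illustration of the inter-applet thread flaw: nothing in
    // the early runtime stopped an applet walking up to the root thread
    // group and enumerating every other applet's threads.
    public class ThreadSnoop {
        public static void listAllThreads() {
            ThreadGroup group = Thread.currentThread().getThreadGroup();
            while (group.getParent() != null) {
                group = group.getParent();               // climb to the root group
            }
            Thread[] threads = new Thread[group.activeCount() * 2];
            int count = group.enumerate(threads, true);  // recurse into subgroups
            for (int i = 0; i < count; i++) {
                // Applet names were encoded in the thread names, so even the
                // names of other applets leak out here.
                System.out.println(threads[i].getName());
            }
        }
    }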
Java Language vs Bytecodes
More serious are some deficiencies in the design of the Java language
itself, or more correctly, in differing semantics between the
Java language, and the bytecodes of the Java Virtual Machine (JVM)
to which the language is compiled.
DFW96 have identified two significant flaws.
The first, and most serious, relates to superclass constructors.
Whenever an object is created, a constructor is called for it.
These constructors are required to call the constructor for the
super (parent) class first. Unfortunately the Java language prohibits,
but the bytecode verifier allows, the creation of a partially initialised
class loader, which can then be used to thwart some of the security
checks on object creation, and to violate the strong typing of objects.
DFW96 found that, by using this attack, they could
get and set the value of any non-static variable, and call
any method (including native methods, which have fewer security
restrictions).
The second flaw identified relates to the Java package names.
Again the bytecode verifier allows a leading "/" on package names,
which is interpreted by the run-time system as an absolute pathname
to some package. Since the package is on the local system, it
is regarded as trusted code. If a user is running Java on a system
that allows any other type of network file access (e.g. an FTP server
with an incoming directory), then that can be used to place code
on the system which can then be executed by the user's Java interpreter.
Also identified were some problems with object initialisation, where
object constructors are working with partially initialised objects.
All of these suggest that some further work is needed on the design of the
Java language, and particularly on its relation to the JVM bytecodes.
Security Failure Lessons
Experience with systems (esp. Java) has highlighted some dangers,
showing that failures can occur in both the implementation and the
specification of the system. Correct specification does not prevent
poor implementation, weakening its security. Great care is needed.
Ideally it should be possible to formally verify the language design,
and then validate its implementation. In practice, this is unlikely
to be possible for some time. Some of the methods and procedures used in the
IT Security Evaluation community may, however, assist in the creation
of more reliable systems.
Conclusions
"Mobile code" is here with increasing demands for its use.
Safe execution of "mobile code" implies a need for
controlled access to resources, access which ideally should be negotiated
for each "mobile code" unit. The means for achieving this is
a subject for considerable additional research.
Approaches taken so far to providing "mobile code" include
the distribution of: source, intermediate code, or binary code,
and the use of Just-In-Time compilers.
Experience with these systems has shown that safe and
secure systems need both correct specification and implementation.
There is still considerable research and development needed
in these systems. However, I believe the goal of safe and secure
"mobile code" execution is reasonable and achievable.
Acknowledgements
The material in this paper has been revised from survey work done by
Bahram Shahimi, as part of his Master's thesis:
"A Preliminary Framework for the Security of
Building Blocks in TINA" for NTNU. Bahram and I are also indebted
to Fergus O'Brien for his invaluable comments and
critiques of this work, and his assistance in liaising with TINA-C.
I would also like to thank several members of the
School of Computer Science, ADFA for their comments and suggestions.
References
- Ban95
-
Banks, J.A.,
"Java Security",
MIT, December, 1995.
http://www-swiss.ai.mit.edu/~jbank/javapaper/javapaper.html.
- Bor94
-
Borenstein, N.S.,
"Email With a Mind of Its Own: The Safe-Tcl Language for Enabled Mail",
in ULPAA'94,
Barcelona, 1994.
http://minsky.med.Virginia.edu:80/sdm7g/Projects/Python/safe-tcl/.
- Bro93
-
Brown, L.,
"MUDs - Serious Research Tool or Just Another Game",
School of Computer Science, Australian Defence Force Academy, Canberra, Australia, No TR CS14/93, September, 1993.
gopher://gopher.adfa.edu.au:70/00/About%20ADFA/Computer%20Science/Technical%20Reports/.
- DFW96
-
Dean, D., Felten, E.W., Wallach, D.S.,
"Java Security: From HotJava to Netscape and Beyond",
in Proceedings IEEE Symposium on Security and Privacy,
IEEE, May, 1996.
http://www.cs.princeton.edu/sip/pub/secure96.html.
- GM95
-
Gosling, J., McGilton, H.,
"The Java Language Environment: A White Paper",
Sun Microsystems, May, 1995.
ftp://ftp.javasoft.com/docs/.
- LSW95
-
Lucco, S., Sharp, O., Wahbe, R.,
"Omniware: A Universal Substrate for Mobile Code",
in Fourth International World Wide Web Conference,
MIT, December, 1995.
http://www.w3.org/pub/Conferences/WWW4/Papers/165/.
- Lev96
-
Levy, J.,
"A Tcl/Tk Netscape Plugin",
in Proceedings Tcl/Tk Workshop 1996,
Usenix, 1996.
http://www.sunlabs.com/tcl/plugin/paper.html.
- NDC95
-
Nilsson, G., Dupuy, F., Chapman, M.,
"An Overview of the Telecommunications Information Network Architecture",
TINA-C, 1995.
http://www.tinac.com/.
- Ous94
-
Ousterhout, J.,
"Tcl and the Tk Toolkit",
Addison-Wesley, Reading, Mass., 1994.
- Shah96
-
Shahimi, B.,
"A Preliminary Framework for the Security of Building Blocks in TINA",
Department of Computer Science and Telematics, NTNU, Trondheim, Norway, Masters Thesis, April, 1996.
- Ste92
-
Steffen, J.L.,
"Adding Run-Time Checking to the Portable C Compiler",
Software - Practice and Experience, Vol 22, No 4, pp 305-316, April, 1992.
- TW96
-
Tennenhouse, D.L., Wetherall, D.J.,
"Towards an Active Network Architecture",
Computer Communication Review, 1996.
http://www.tns.lcs.mit.edu/publications/ccr96.html.
- Tar95
-
Tardo, J.,
"An Introduction to Safety and Security in Telescript",
General Magic Inc., 1995.
http://cnn.genmagic.com/Telescript/TDE/security.html.
- Tclet96
-
Sun Microsystems,
"Tcl/Tk: Create Web Apps In Internet Time",
Sun Microsystems Online, August, 1996.
http://www.sun.com/960710/cover/.
- WLAG93
-
Wahbe, R., Lucco, S., Anderson, T., Graham, S.L.,
"Efficient Software-Based Fault Isolation",
in 14th ACM Symposium on Operating Systems Principles,
ACM, Asheville, NC, December, 1993.
- Yel95
-
Yellin, F.,
"Low Level Security in Java",
in Fourth International World Wide Web Conference,
MIT, December, 1995.
http://www.w3.org/pub/Conferences/WWW4/Papers/197/40.html.