Introduction

SMT
Version 2.81

SMT 2.0 is an evolution of version 1.x of the iMatix SMT kernel. SMT 1.x was specifically designed for TCP/IP server development. SMT 2.x is a more generalised approach that makes multithreaded FSMs suitable for a variety of purposes. SMT 2.0 is not backwards compatible with SMT 1.x, but it is quite straightforward to convert applications.

The main objectives of SMT 1.0 were:

Multithreading - a cleaner and more efficient approach for certain types of problems than alternatives such as multiprocessing (forking) or iteration.
Simplicity - using the Libero method to simplify an otherwise complex multithreaded model.
Portability - so that an SMT application runs on UNIX, Windows 95, Windows NT, and Digital VMS with similar functionality.

SMT 2.0 is meant to answer a wider set of issues. Specifically, our goal was to provide multithreading capability in these domains:

Internet servers
Other communication servers
Asynchronous FSM architectures
Internally multithreaded (batch) programs
GUI-based programs (e.g. Windows)

In SMT 1.x, external events were received and processed by a kernel built into each SMT program. The SMT 1.x kernel was specifically designed to be driven by TCP/IP events. An SMT 1.x program is invoked, does its work, and finally terminates. We call this type of program a 'batch' program. This is suitable for servers ('daemons'), but not for real-time programs that must be integrated into a large-scale event-driven architecture.

SMT 2.x is designed as an event-passing kernel. The application consists of a number of agent programs, each running one or more threads. Each thread has an event queue. Threads send each other events, which are queued and then delivered by the kernel to the state machine that controls each thread. The SMT kernel API lets you send and receive events, create threads, etc. There are various ways to construct an agent program (single-threaded, multithreaded), and different ways to handle event queues (one thread per queue, or several threads per queue). Agents and events can have priorities, which change the order of execution and delivery.
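The queueing and priority behaviour described above can be illustrated with a minimal sketch. This is not the SMT kernel API: the names event_t, event_queue_t, queue_send and queue_receive are hypothetical, and the fixed-size array and the 'higher number is delivered first' rule are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_MAX 16

/* Hypothetical event record; these are not SMT's own types. */
typedef struct {
    int priority;               /* assumed: higher value is delivered first */
    int code;                   /* application-defined event code */
} event_t;

/* Hypothetical per-thread event queue, kept sorted by priority. */
typedef struct {
    event_t items[QUEUE_MAX];
    size_t  count;
} event_queue_t;

void queue_init(event_queue_t *q)
{
    assert(q != NULL);
    q->count = 0;
}

/* Queue an event. Events of equal priority keep their arrival order.
   Returns 0 on success, -1 if the queue is full. */
int queue_send(event_queue_t *q, event_t ev)
{
    size_t pos;

    assert(q != NULL);
    if (q->count == QUEUE_MAX)
        return -1;
    pos = q->count;
    /* Shift strictly lower-priority events back by one slot. */
    while (pos > 0 && q->items[pos - 1].priority < ev.priority) {
        q->items[pos] = q->items[pos - 1];
        pos--;
    }
    q->items[pos] = ev;
    q->count++;
    return 0;
}

/* Deliver the next event. Returns 0 on success, -1 if the queue is empty. */
int queue_receive(event_queue_t *q, event_t *out)
{
    size_t i;

    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return -1;
    *out = q->items[0];
    for (i = 1; i < q->count; i++)
        q->items[i - 1] = q->items[i];
    q->count--;
    return 0;
}
```

In this sketch, sending events with priorities 1, 5 and 1 delivers the priority-5 event first and then the two priority-1 events in the order they were sent.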

SMT provides a number of standard agents that are easily reused in applications. For example, in an Internet application, the socket i/o agent collects events from the Internet sockets used by the application. When a thread wants to read data from a socket, it sends an event to the socket i/o agent, telling it which port, and how much data. The socket i/o agent reads the data and returns it as an event. Other standard agents are: a logging agent to write log file data; an operator console agent to handle error and warning events; a timer agent to generate alarm events.

SMT also includes a number of protocol agents for use in SMT applications: echo, HTTP, FTP.

In SMT 1.x, external events such as TCP/IP events were collected by the kernel; in SMT 2.x such events are collected by an agent program. Thus it is possible to add support for any external event source.

Classic Multithreading Environments

Multithreaded programming is often perceived as complicated. When we look at multithreading facilities provided by existing operating systems, we tend to agree. The most common type of multithreading is pre-emptive multithreading. This is typically seen on UNIX and Windows NT systems. The characteristics of this approach are:

The multithreading system (or operating system) switches between threads arbitrarily. This is 'pre-emptive'.
The thread program looks like standard procedural code.
Threads communicate using semaphores: when one thread must wait for another, it sets a semaphore and then waits until the second thread resets the semaphore.
Threads can access shared resources by using semaphores, or by defining critical sections. A critical section is a block of code that executes entirely, without a thread switch.
I/O is handled in critical sections.

A less common type of multithreading is cooperative multithreading. One example is the chained multithreading method used on Digital VMS systems. The characteristics of this approach are:

Each thread decides when to return control to the multithreading system or operating system.
The thread program consists of a chain of small blocks of code.
Threads can communicate using semaphores.
Threads can access shared resources at any time - threads always execute as if in a critical section.
I/O is handled asynchronously: i/o operations are usually the basic unit of logic in a thread. Each block of code in a thread requests an i/o operation, and specifies the successor block.
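The 'chain of small blocks' idea can be sketched in portable C. This is an illustration only, not the Digital VMS mechanism: the names chain_ctx_t, block_read, block_process, block_write and chain_run are hypothetical, and a simple loop stands in for the multithreading system that would normally complete the i/o request before calling the successor.

```c
#include <assert.h>
#include <stddef.h>

typedef struct chain_ctx chain_ctx_t;
typedef void (*block_fn)(chain_ctx_t *ctx);

/* Hypothetical thread context carried from one block to the next. */
struct chain_ctx {
    block_fn next;              /* successor block, NULL ends the chain */
    int      steps;             /* number of blocks executed so far */
    char     trace[8];          /* record of which blocks ran, as a string */
};

void block_read(chain_ctx_t *ctx);
void block_process(chain_ctx_t *ctx);
void block_write(chain_ctx_t *ctx);

static void record(chain_ctx_t *ctx, char tag)
{
    assert(ctx->steps < (int) sizeof ctx->trace - 1);
    ctx->trace[ctx->steps++] = tag;
    ctx->trace[ctx->steps] = '\0';
}

/* Each block does a small piece of work, names its successor, and returns. */
void block_read(chain_ctx_t *ctx)
{
    record(ctx, 'R');           /* stands in for 'request a read' */
    ctx->next = block_process;
}

void block_process(chain_ctx_t *ctx)
{
    record(ctx, 'P');
    ctx->next = block_write;
}

void block_write(chain_ctx_t *ctx)
{
    record(ctx, 'W');           /* stands in for 'request a write' */
    ctx->next = NULL;           /* no successor: the chain is finished */
}

/* Stand-in for the multithreading system: it regains control after every
   block and then calls whichever successor the block specified.
   Returns the number of blocks that ran. */
int chain_run(chain_ctx_t *ctx, block_fn first)
{
    assert(ctx != NULL && first != NULL);
    ctx->steps = 0;
    ctx->trace[0] = '\0';
    ctx->next = first;
    while (ctx->next != NULL) {
        block_fn current = ctx->next;
        ctx->next = NULL;
        current(ctx);
    }
    return ctx->steps;
}
```

Even in this tiny example the program logic is scattered across three functions, which hints at why the chained method is harder to program by hand.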

If we compare these methods, we can see advantages and disadvantages in each:

The chained method is much harder to program.
The chained method can produce much more efficient programs, especially if they are i/o bound.
The pre-emptive method requires extra work to define critical sections; it is not fail-safe. By forgetting to define a critical section, a program may work one day, then fail another time. The chained method does not require critical sections.

Both these methods are expensive to program, and can produce code that is hard to maintain, error-prone, and therefore very expensive to make robust enough for real applications.

The History Of Libero Multithreading

In 1990, Leif Svalgaard wrote a tiny multitasking monitor for MS-DOS, to demonstrate that multitasking did not require megabytes of memory. This monitor was based on an event-passing kernel. It worked well and could multitask several DOS sessions simply and efficiently. This project was remarkable because it took a very short time to write (one long weekend) and because it required so little memory to run (several kilobytes). This monitor was based on Leif Svalgaard's earlier work in operating systems design, and defined the core principles of the SMT 2.x kernel.

In 1993, Pieter Hintjens developed a complex multithreaded application using the Digital VMS chained multithreading method. Under severe time constraints, he was obliged to take a radical alternative to the normal approach. He used Libero to abstract the 'chain' of multithreaded logic. This reduced the development cost by an estimated 80%, and resulted in a very stable and efficient application. This experience showed that the Libero state-machine abstraction - already useful for writing normal procedural code - was also good in multithreaded applications.

In 1993, Christian Rozet and Stephen Bidoul of ACSE built a version of Libero that generated a C++ 'asynchronous finite-state machine' to handle events coming from a GUI (MS-Windows). The resulting applications were in effect multithreaded applications, with the multithreading handled by Libero (actually the code that Libero generated).

In 1995, Pieter Hintjens and Pascal Antonnaux built SMT version 1.0, and a set of demonstration programs. The smthttpd web server ran on UNIX and Windows 95, showing that portable multitasking was a realistic objective.

SMT 2.0 is a fresh approach that combines the experience of these projects:

It is built on an event-based kernel that integrates smoothly into the event-driven state machine inside each program.
It uses Libero to abstract the multithreading logic, so that the application program is easy to write and maintain.
It can be oriented towards socket i/o (important for Internet server programs), towards Windows event handling, or towards any other event source.

Differences with Classic Multithreading

The main differences between SMT and 'classic' multithreading are:

Multithreading works at the user level, not the kernel level. This is sometimes called 'internal multithreading' or 'pseudo-multithreading'. User-level multithreading is transparent to the operating system, and can be 100% portable (as it is in SMT).
SMT cannot make direct use of multiple CPUs, since threads are not visible to the operating system.
Threads communicate with events as well as with semaphores. This is a clean abstraction that lets you design an object-oriented application.
SMT is simpler to use.
SMT is portable to (almost) any operating system and programming language, although the primary implementation is in ANSI C.

We note some other points of interest:

Thread switching occurs only between dialog action modules. A single dialog module will always run to completion. Thus, threads can share resources (data, files,...) without locking, critical sections, or other special safeguards.
SMT provides a high-level framework for constructing real applications. This is useful even without the multithreading aspects.
SMT uses asynchronous or non-blocking i/o as far as possible - for Internet sockets and file access. This results in efficient applications that can handle large numbers of connections with a low overhead per connection.
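The point about thread switching can be illustrated with a small sketch. It is not SMT code: thread_t, action_module and run_all are hypothetical names, and a round-robin loop is assumed in place of the real kernel. Because the scheduler only switches threads between modules, the unprotected read-modify-write on the shared counter is never interrupted.

```c
#include <assert.h>

#define THREADS 3

/* Hypothetical thread record: just the number of modules still to run. */
typedef struct {
    int remaining;
} thread_t;

static int shared_total = 0;    /* resource shared by all threads */

/* One action module. It runs to completion, so the read-modify-write
   below needs no lock or critical section. */
static void action_module(thread_t *t)
{
    int value = shared_total;
    value = value + 1;
    shared_total = value;
    t->remaining--;
}

/* Round-robin scheduler: one module per thread per pass.
   Returns the final value of the shared counter. */
int run_all(int modules_per_thread)
{
    thread_t threads[THREADS];
    int i;
    int active = THREADS;

    assert(modules_per_thread >= 0);
    shared_total = 0;
    for (i = 0; i < THREADS; i++)
        threads[i].remaining = modules_per_thread;

    while (active > 0) {
        active = 0;
        for (i = 0; i < THREADS; i++) {
            if (threads[i].remaining > 0) {
                action_module(&threads[i]);
                if (threads[i].remaining > 0)
                    active++;
            }
        }
    }
    return shared_total;
}
```

With three threads each running four modules, every increment is preserved and the counter always ends at twelve. Under pre-emptive multithreading the same unprotected update could lose increments.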

Why Use Multithreading?

We will consider two specific problems. Firstly, construction of an industrial-scale Internet server. Secondly, construction of a multi-level finite-state machine application.

There are many different ways to design an Internet server. The main problem is to handle multiple connections at once ('concurrency'). The classic way to get concurrency is to use the operating system multitasking functions. This is straightforward enough. For instance, under UNIX, the server process uses the fork() system call to create a 'clone' of itself. At any moment there are multiple copies of the server process, each handling one connection. The operating system switches rapidly between these processes, which gives concurrency.
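A minimal sketch of the fork() design follows, assuming a POSIX system. The socket handling is left out so that the process structure stays visible: handle_connection and serve_with_fork are hypothetical names, and the integer 'connection id' is a placeholder for an accepted socket.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-connection work; returns 0 on success. */
static int handle_connection(int id)
{
    (void) id;                  /* a real server would talk to a socket here */
    return 0;
}

/* Create one child process per connection, then collect the children.
   Returns the number of children that finished successfully. */
int serve_with_fork(int connections)
{
    int i;
    int started = 0;
    int succeeded = 0;

    assert(connections >= 0);
    for (i = 0; i < connections; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(handle_connection(i));    /* child: serve, then exit */
        if (pid > 0)
            started++;                      /* parent: carry on accepting */
    }
    while (started > 0) {
        int status = 0;
        if (wait(&status) > 0 && WIFEXITED(status)
            && WEXITSTATUS(status) == 0)
            succeeded++;
        started--;
    }
    return succeeded;
}
```

Each connection costs one process creation and one process removal, which is the overhead that matters for protocols with many short-lived connections.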

The problems with this design become apparent when you try to use it for large-scale work. Firstly, it is not portable - the fork() system call does not work on all operating systems. Secondly, each fork() call duplicates the server program in memory. This duplication takes a certain time, as does the eventual removal of the server process. A protocol like HTTP creates a large number of short-term connections. Lastly, each additional instance of the server process consumes system resources so that a typical system cannot handle more than a few hundred connections.

There are variations on this design that eliminate some of the problems. For instance, you can create a fixed number of instances of the server beforehand, then allow that number of connections. This eliminates the cost of creating and removing server processes, but does not raise the ceiling on the maximum number of connections.

A more sophisticated approach is to handle multiple connections within a single process. This is relatively simple to arrange, using the BSD socket select() function. This lets a program wait for activity on a set of open sockets. The logic of such a program is: wait for activity on a socket; handle the activity; repeat. This approach works when the logic of 'handle the activity' is simple. In realistic applications, however, this logic becomes complex, and involves activity such as reading or writing to files, or manipulating several sockets at once.
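The 'wait for activity; handle the activity; repeat' loop can be sketched as follows, assuming a POSIX system. The function name select_loop is hypothetical, 'handle the activity' is reduced to counting the bytes read, and interrupted calls are treated as errors to keep the sketch short. Any readable descriptors will do, so pipes can stand in for sockets when trying it out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Serve a set of open descriptors within a single process.
   Entries in fds[] are set to -1 as each one reaches end-of-file.
   Returns the total number of bytes read, or -1 if select() fails. */
long select_loop(int *fds, int nfds)
{
    long total = 0;
    int  open_count = 0;
    int  i;

    assert(fds != NULL && nfds > 0);
    for (i = 0; i < nfds; i++)
        if (fds[i] >= 0)
            open_count++;

    while (open_count > 0) {
        fd_set readable;
        int    maxfd = -1;

        /* Build the set of descriptors we are still interested in. */
        FD_ZERO(&readable);
        for (i = 0; i < nfds; i++) {
            if (fds[i] >= 0) {
                FD_SET(fds[i], &readable);
                if (fds[i] > maxfd)
                    maxfd = fds[i];
            }
        }
        /* Wait for activity on any of them. */
        if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
            return -1;

        /* Handle the activity, then go round again. */
        for (i = 0; i < nfds; i++) {
            if (fds[i] >= 0 && FD_ISSET(fds[i], &readable)) {
                char    buffer[256];
                ssize_t n = read(fds[i], buffer, sizeof buffer);
                if (n > 0)
                    total += n;
                else {
                    close(fds[i]);          /* end-of-file or error */
                    fds[i] = -1;
                    open_count--;
                }
            }
        }
    }
    return total;
}
```

The handling code here is trivial. Once it has to remember, per connection, how far a request has progressed, the loop needs exactly the kind of per-connection state that the following paragraphs call a thread.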

The SMT kernel uses the last approach, but provides a level of abstraction that makes the approach practical for large-scale problems. You can create one or several 'threads'. Each thread executes a copy of the finite state machine. The basic unit of logic in a thread is the code module.

The number of threads is limited only by the memory available to the process. Creating or removing a thread is fast (so a new connection can be established faster than using a fork() call), and as far as the operating system is concerned, there is just one process (so the cost to the operating system is lower).

Let's consider the design of an application that consists of several interworking state machines. This is the kind of design one finds in telecommunications and other specialised domains. The approach can be used in many areas. Typically, such a state machine processes an event queue; one state machine can send events to another.

In this type of design we need to save the 'state' of each state machine in some way so that it can process events in a meaningful manner. The 'state' consists of the actual state, the last event, and context information that the state machine program needs to remember between events. We can define this 'state' as a thread: the requirements are very close to those of the Internet server described above. If we only want a single thread in any state machine, we can consider this a special case of the general case, which is a full multithreaded approach.
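The saved 'state' described above maps naturally onto a small record. The sketch below is an assumption-laden illustration, not code generated by Libero: the states, events, fsm_thread_t, fsm_init and fsm_deliver are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

enum { ST_IDLE, ST_CONNECTED, ST_CLOSED, ST_COUNT };
enum { EV_NONE = -1, EV_OPEN, EV_DATA, EV_CLOSE, EV_COUNT };

/* Hypothetical thread record: actual state, last event, and context. */
typedef struct {
    int  state;
    int  last_event;
    long bytes_seen;            /* context remembered between events */
} fsm_thread_t;

/* One transition table, shared by every thread of this state machine. */
static const int next_state[ST_COUNT][EV_COUNT] = {
    /* ST_IDLE      */ { ST_CONNECTED, ST_IDLE,      ST_CLOSED },
    /* ST_CONNECTED */ { ST_CONNECTED, ST_CONNECTED, ST_CLOSED },
    /* ST_CLOSED    */ { ST_CLOSED,    ST_CLOSED,    ST_CLOSED }
};

void fsm_init(fsm_thread_t *t)
{
    assert(t != NULL);
    t->state = ST_IDLE;
    t->last_event = EV_NONE;
    t->bytes_seen = 0;
}

/* Deliver one event to one thread; only that thread's record changes. */
void fsm_deliver(fsm_thread_t *t, int event, long payload)
{
    assert(t != NULL);
    assert(event >= 0 && event < EV_COUNT);
    if (t->state == ST_CONNECTED && event == EV_DATA)
        t->bytes_seen += payload;
    t->last_event = event;
    t->state = next_state[t->state][event];
}
```

Two fsm_thread_t records driven through the same table advance independently, which is all that 'several threads executing a copy of the same state machine' requires. A single record is simply the one-thread special case.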

Our conclusion from these two chains of argument is that a state-machine approach to multithreading is useful and valuable in real applications. Since Libero already provides a state-machine abstraction that converts a state-machine diagram into generated code, it is reasonable to use this mechanism to implement a generic and portable type of multithreading.

The SMT 2.x kernel works with a specific Libero code generation schema, smtschm.c, to provide this generic multithreading.

What You Should Know

If you intend to write Internet servers, you should have a basic understanding of the concepts behind IP, TCP/IP, and UDP/IP. While the SMT kernel does a good job of packaging and abstracting the Internet programming model, it is no substitute for a solid understanding of the issues involved. We recommend that you be familiar with these aspects at least:

The differences between TCP/IP and UDP/IP.
How TCP and UDP socket connections work.
The BSD socket programming interface.

The SFL socket functions provide the main abstraction layer; for instance you will not need to consider system-specific issues such as the WINSOCK interface.

Before designing or writing an SMT application you should understand the Libero method of program design. The main components of an SMT application - the agents - are designed and written using Libero.

Before writing SMT applications you should be familiar with the standard function library (SFL), since many SFL functions are used in a typical SMT application.