Stupid Message-oriented Multiplexing Protocol

Please don't use this, I wrote it when I was younger and foolisher. This page is here for historical purposes only. You might want to read the articles on protocol design linked from the front page of the site.

Introduction

What we want to do is have multiple channels running over the same TCP connection. Why would we want to do that? Well, it saves time opening new connections (e.g. when using SSL), and we can have multiple requests to the server waiting for replies at once. Unlike pipelining, even if one of our requests returns a lot of data, we don't have to wait until all of it is sent - we can get replies to other requests while that data is still arriving.

The SMOMP protocol (I need a better name...) solves this problem by allowing multiple channels of communication to be run over the same TCP connection, for client/server protocols. A channel allows two entities to send messages back and forth. The receiver gets the whole message the sender sent. Therefore, unlike a typical line-based protocol, we don't need to use \r\n to signify the end of a request, since the object that gets a message knows it got the complete contents of the message. For example, if the sender sent the message HELLO, the receiver will be notified it got a message of length 5 with the content HELLO.

Protocol Definition

A message is made up of two parts - the channel ID, and the actual data of the message. The channel ID is two bytes chosen by the client. Each channel has a unique ID, so we can have 65536 channels (one of which, \000\000, is reserved as the control channel described below). The message data is some number of bytes, the contents of which depend on the protocol.

To send a message we prepend the channel ID (two bytes) to the message data, and the resulting string is encoded as a netstring and added to a sending queue. Meanwhile, the messages in the queue are taken one by one and sent over the TCP connection.

What exactly is a netstring? To quote Dan Bernstein's article:

A netstring is a self-delimiting encoding of a string. Netstrings are very easy to generate and to parse. Any string may be encoded as a netstring; there are no restrictions on length or on allowed bytes. Another virtue of a netstring is that it declares the string size up front. Thus an application can check in advance whether it has enough space to store the entire string.

For example, the string hello world! is encoded as <31 32 3a 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 2c>, i.e., 12:hello world!,. The empty string is encoded as 0:,.
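
The sending step described above (prepend the channel ID, then netstring-encode) might look roughly like this in Python. The function names encode_netstring and encode_message are made up for illustration and are not taken from the sample implementation; the sending queue is left out.

```python
def encode_netstring(payload: bytes) -> bytes:
    """Wrap payload as <decimal length>:<payload>, (the netstring format)."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def encode_message(channel_id: bytes, data: bytes) -> bytes:
    """Prepend the two-byte channel ID to the data, then netstring-encode.

    Hypothetical helper; the name and signature are assumptions.
    """
    if len(channel_id) != 2:
        raise ValueError("channel ID must be exactly two bytes")
    return encode_netstring(channel_id + data)


# The length counts the channel ID as well as the data: 2 + 13 = 15.
wire = encode_message(b"ab", b"INCREMENT 100")
print(wire)  # b'15:abINCREMENT 100,'
```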

When the other side receives the message, it knows how much to read since the message is a netstring. Once the receiver has read in the whole message, it extracts the message data and channel ID from the netstring and hands the message to the object that is using that channel.
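
A minimal sketch of the receiving side, assuming the bytes read from the socket are collected in a buffer that may end in the middle of a message. The name decode_messages is hypothetical, and handing each message to the object that owns the channel is left to the caller.

```python
def decode_messages(buffer: bytes):
    """Return (messages, remaining).

    messages is a list of (channel_id, data) pairs, one per complete
    netstring found in buffer; remaining is the unparsed tail, to be kept
    until more bytes arrive.
    """
    messages = []
    while True:
        colon = buffer.find(b":")
        if colon == -1:
            break  # length prefix not complete yet
        length_field = buffer[:colon]
        if not length_field.isdigit():
            raise ValueError("malformed netstring length")
        end = colon + 1 + int(length_field)  # index of the trailing comma
        if len(buffer) <= end:
            break  # payload or trailing comma not fully received yet
        if buffer[end:end + 1] != b",":
            raise ValueError("netstring missing trailing comma")
        payload = buffer[colon + 1:end]
        if len(payload) < 2:
            raise ValueError("message too short to hold a channel ID")
        messages.append((payload[:2], payload[2:]))
        buffer = buffer[end + 1:]
    return messages, buffer


# One complete message followed by the start of a second one.
messages, rest = decode_messages(b"15:abINCREMENT 100,8:zxVAL")
print(messages)  # [(b'ab', b'INCREMENT 100')]
print(rest)      # b'8:zxVAL'
```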

Closing and Opening Channels

Channel \000\000 is a control channel. Messages sent on this channel are used to open and close channels, and whatever other meta-information we need. Its messages are in the form <channel id><command>.

The command "o" means open a new channel, with the given channel id. It is sent before any messages can be sent on a given channel.

The command "c" means the channel on the sending side has closed down. The sender will send no more messages on that channel; the receiver should take note of this and, once it has finished processing whatever messages it already has, send back a close message of its own.
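
As an illustration, control messages could be built like this. The helper names are hypothetical; the layout follows the description above: the control channel ID \000\000, then the ID of the channel being opened or closed, then the one-byte command.

```python
CONTROL_CHANNEL = b"\x00\x00"


def encode_message(channel_id: bytes, data: bytes) -> bytes:
    """Netstring-encode the channel ID followed by the data."""
    payload = channel_id + data
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def open_channel(channel_id: bytes) -> bytes:
    """Control message announcing a new channel (command "o")."""
    return encode_message(CONTROL_CHANNEL, channel_id + b"o")


def close_channel(channel_id: bytes) -> bytes:
    """Control message saying this side is done with the channel (command "c")."""
    return encode_message(CONTROL_CHANNEL, channel_id + b"c")


# 2 bytes of control channel ID + 2 bytes of target ID + 1 command byte = 5.
print(open_channel(b"ab"))   # b'5:\x00\x00abo,'
print(close_channel(b"ab"))  # b'5:\x00\x00abc,'
```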

Example

Let's look at an example - a counter application. Whenever a new channel is opened, the server creates a new counter with value 0. The client can then increment the counter, and the server will send back its new value. Here is an example session:

Client on channel ab sends: INCREMENT 100
Client on channel zx sends: INCREMENT 50
Client on channel ab sends: INCREMENT 20
Client on channel ab sends: VALUE?
Client on channel zx sends: VALUE?

Server on channel ab responds: 120
Server on channel zx responds: 50

So, leaving aside the control messages that open the two channels, what the client actually sent was:

15:abINCREMENT 100,14:zxINCREMENT 50,14:abINCREMENT 20,8:abVALUE?,8:zxVALUE?,

Notice that we do not know what order the server will respond in - perhaps ab's response will be returned first, perhaps zx's. So, the response from the server can be either:

5:ab120,4:zx50,

or, alternatively:

4:zx50,5:ab120,
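
To make the session concrete, here is a small sketch of the server's side of the counter application. CounterServer and its handle method are invented for this example. As a simplification, the counter is created on the first message seen on a channel rather than on the "o" control message, and, as in the session above, only VALUE? produces a reply. Because this sketch handles messages strictly in order, it always produces the first of the two possible responses.

```python
def encode_message(channel_id: bytes, data: bytes) -> bytes:
    """Netstring-encode the channel ID followed by the data."""
    payload = channel_id + data
    return str(len(payload)).encode("ascii") + b":" + payload + b","


class CounterServer:
    """Keeps one counter per channel (hypothetical example class)."""

    def __init__(self):
        self.counters = {}

    def handle(self, channel_id: bytes, data: bytes):
        """Process one message; return an encoded reply or None."""
        value = self.counters.setdefault(channel_id, 0)
        if data.startswith(b"INCREMENT "):
            self.counters[channel_id] = value + int(data[len(b"INCREMENT "):])
            return None
        if data == b"VALUE?":
            return encode_message(channel_id, str(value).encode("ascii"))
        raise ValueError("unknown command")


session = [
    (b"ab", b"INCREMENT 100"),
    (b"zx", b"INCREMENT 50"),
    (b"ab", b"INCREMENT 20"),
    (b"ab", b"VALUE?"),
    (b"zx", b"VALUE?"),
]
server = CounterServer()
replies = [server.handle(channel, data) for channel, data in session]
response = b"".join(reply for reply in replies if reply is not None)
print(response)  # b'5:ab120,4:zx50,'
```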

Techniques

Sending large messages is not a good idea, since other channels cannot send anything while the large message is being transmitted. The solution is to break up the data into multiple pieces of a small size, e.g. 10 KB. If there are more pieces of the data to be sent after this one, we prepend "M" to the piece, and send that as the message. If this is the last piece, we prepend "S" to the piece before sending it as the message.

The receiver knows the data is being sent in pieces (because it just issued a download command for a file, for example), so it examines the data of each message as it arrives. If the data starts with "M", it knows there are more pieces coming, but if it starts with "S" it knows this is the last piece.

Another benefit of this technique is that it can be used to send large amounts of data when the sender does not know in advance how much data will be generated (e.g. when sending the output of a long-running program).
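
The splitting and reassembly could be sketched as follows. The names split_into_pieces and reassemble are hypothetical, and the 10 KB default is just the example size mentioned above. Each yielded value is the data of one message, still to be sent on a channel in the usual way.

```python
def split_into_pieces(data: bytes, piece_size: int = 10 * 1024):
    """Yield message bodies: b"M" + piece while more pieces follow,
    b"S" + piece for the last one."""
    if piece_size <= 0:
        raise ValueError("piece_size must be positive")
    offset = 0
    while True:
        piece = data[offset:offset + piece_size]
        offset += piece_size
        if offset >= len(data):
            yield b"S" + piece
            return
        yield b"M" + piece


def reassemble(bodies) -> bytes:
    """Join pieces in arrival order, stopping at the b"S"-prefixed one."""
    parts = []
    for body in bodies:
        marker, piece = body[:1], body[1:]
        if marker not in (b"M", b"S"):
            raise ValueError("piece does not start with M or S")
        parts.append(piece)
        if marker == b"S":
            return b"".join(parts)
    raise ValueError("stream ended before the final piece")


pieces = list(split_into_pieces(b"abcdefg", piece_size=3))
print(pieces)              # [b'Mabc', b'Mdef', b'Sg']
print(reassemble(pieces))  # b'abcdefg'
```

A sender that does not know the total size in advance would not slice a finished byte string like this; it would send each chunk with "M" as it is produced and finish with an "S" piece, which may be empty.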

Sample Implementation

You can download a sample implementation written in Python - multiplex-0.4.2.tar.gz.

Related Protocols
