This appendix describes each of the protocols currently available in Version 3.3 of the x-kernel. (Additional protocols are available in Version 3.2.) The description for each protocol provides the following information.
Name of the protocol. This name, given in all lower-case letters, can be given as an argument to xGetProtlByName to get a capability for (pointer to) the protocol. Note that there are multiple implementations of various protocols; i.e., a given name might map to multiple implementations. The implementation bound to a name in a given kernel is set in graph.comp.
Reference to a document that gives the specification for the protocol. In cases where no formal specification exists, this section gives a high-level description of the protocol.
A brief description of what the protocol does. Outlines any unusual features and bugs, including any features of the protocol specification not implemented.
Indicates whether the protocol is in the ASYNC realm (supporting push, demux and pop), the RPC realm (supporting call, calldemux and callpop), the CONTROL realm (existing only to allow control operations), or the ANCHOR realm (interfacing with the host system).
A discussion of the number of participants the protocol expects to see and what it expects to see on the participants' stacks.
Non-standard control operations supported by the protocol. For each control operation, the type of the input and output argument is given (i.e., the type used to interpret the buffer argument). In the case of control operations that take multiple arguments, a set of types is given. Non-primitive types are generally defined in the protocol's .h file.
A description of interfaces not encapsulated within x-kernel operations.
A description of configuration options for the protocol, including descriptions of what this protocol expects of the protocols below it. If the protocol can only be configured above a certain protocol, the appropriate graph.comp line is given explicitly.
Who to complain to if the protocol fails to work as advertised.
ARP (Address Resolution Protocol)
D. Plummer. An Ethernet Address Resolution Protocol. Request for Comments 826, USC Information Sciences Institute, Marina del Rey, Calif., Nov. 1982.
ARP translates IP addresses into ethernet addresses, and vice versa (i.e., it also implements RARP). This implementation of ARP supports a single interface, but may be multiply instantiated to support several network interfaces.
ARP is in the CONTROL realm. There are no ARP sessions -- control operations may be performed on the protocol object only.
typedef int ( ArpForEachFunc) ( ArpBinding *, void * );
name=arp protocols=eth;
Larry Peterson and Norm Hutchinson
BID is the filtering module of the BootId protocol. The BootId protocol is designed to advise workstations that a peer has rebooted, to protect protocols from receiving messages generated during previous boot incarnations, and to inform higher protocols of a peer's reboot in a timely fashion.
If an upper protocol registers with BIDCTL protocol and messages from its session pass through BID sessions, the BootId protocol guarantees that a message from a rebooted peer will not be sent to an upper protocol until the upper protocol has been informed of the reboot.
BID sessions stamp all outgoing messages with a local and remote BootId and filter out all incoming messages which do not have the correct BootIds. Determination of the ``correct'' BootId is made by the BIDCTL protocol. BID requires BIDCTL.
BID is not reliable. If there is confusion between two peers as to what their mutual BootIds are, messages between them will be silently dropped until the confusion is resolved.
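The stamping and filtering rule described above can be sketched in a few lines of C. This is only an illustration, not the BID source: the `BidHdr` layout, the field widths, and the `bid_stamp` and `bid_accept` names are assumptions made for the example, and the ``correct'' BootIds are simply passed in as parameters, standing in for the values BIDCTL would supply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header carried on every message (layout is an assumption). */
typedef struct {
    uint32_t srcBootId;   /* sender's own BootId */
    uint32_t dstBootId;   /* sender's view of the receiver's BootId */
} BidHdr;

/* Stamp an outgoing message with the local and remote BootIds. */
static BidHdr bid_stamp(uint32_t myBootId, uint32_t peerBootId)
{
    BidHdr h = { myBootId, peerBootId };
    return h;
}

/* Return true if an incoming message carries the BootIds this host
 * currently believes to be correct; false means "silently drop". */
static bool bid_accept(const BidHdr *h, uint32_t myBootId, uint32_t peerBootId)
{
    assert(h != NULL);
    return h->srcBootId == peerBootId && h->dstBootId == myBootId;
}
```

With this rule, a message stamped before either side rebooted fails one of the two comparisons and is dropped, which matches the "not reliable" behaviour noted above.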
BID is in the ASYNC realm.
BID expects an IPhost on the top of each participant. It examines this value but does not remove it from the participant stack before opening its transport protocol.
BID expects to be configured above two protocols. The first is the transport protocol and the second is the BIDCTL protocol.
Ed Menze
BIDCTL (Bootid Control Protocol)
BIDCTL is the control module of the BootId protocol. The BootId protocol is designed to advise workstations that a peer has rebooted, to protect protocols from receiving messages generated during previous boot incarnations, and to inform higher protocols of a peer's reboot in a timely fashion.
If an upper protocol registers with BIDCTL protocol and messages from its session pass through BID sessions, the BootId protocol guarantees that a message from a rebooted peer will not be sent to an upper protocol until the upper protocol has been informed of the reboot.
Upper protocols register their desire to be informed of a peer's reboot by openEnabling BIDCTL with that remote peer's IPhost. When BIDCTL determines that the remote peer has rebooted, it informs all interested upper protocols via a control operation (see below.) If an upper protocol is no longer interested in learning about a peer's reboot, it may openDisable BIDCTL.
BIDCTL is in the CONTROL realm. There are no BIDCTL sessions.
BIDCTL openEnable and openDisable expect a single participant containing the IPhost of the remote peer.
The remaining control operations are not necessary for most users of BIDCTL. They are provided mostly for the use of filtering protocols (e.g., BID) which work in conjunction with BIDCTL.
The BootId of the output structure will be 0 (an invalid BootId) if BIDCTL doesn't yet know the peer's BootId.
BIDCTL expects only its transport protocol below it. It will open the transport protocol with a single participant consisting of the remote IP host.
BIDCTL uses an internal checksum and works correctly in the presence of dropped messages, so a reliable transport protocol is not necessary.
As an optimization, BIDCTL can perform an IP local-net broadcast to inform interested peers that it has rebooted. A rom file entry of the form:
bidctl bcast
will cause the broadcast and an entry of the form:
bidctl nobcast
will suppress it. Without a rom file entry, BIDCTL will perform the broadcast unless BIDCTL_NO_BOOT_BCAST is defined during compilation.
Ed Menze
BLAST (RPC Blast Micro-Protocol)
S. O'Malley and L. Peterson. A Dynamic Network Architecture. ACM Transactions on Computer Systems 10, 2 (May 1992), 110--143.
B. Welch. The Sprite remote procedure call. University of California at Berkeley, Tech Report UCB/CSD 86/302, June 1986.
BLAST is a micro-protocol version of Sprite RPC's fragmentation algorithm. The algorithm was extracted from Sprite and made into a stand-alone protocol. BLAST takes a large message, fragments it into smaller packets, and sends them. The maximum packet size accepted by BLAST (as returned by the GETMAXPACKET control op) is the product of the maximum number of fragments handled by BLAST (16 by default) and the optimal packet size of BLAST's lower protocol. BLAST is tuned for local area networks and should not be used across the Internet.
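The size limit just described is simple arithmetic, sketched below. The function names and the -1 error convention are assumptions for illustration, and the sketch ignores any per-fragment header overhead, which the text does not quantify.

```c
#include <assert.h>
#include <stddef.h>

#define BLAST_MAX_FRAGS 16   /* default fragment limit stated in the text */

/* Largest message accepted for a given lower-protocol optimal packet size. */
static size_t blast_max_msg(size_t optPacket)
{
    return (size_t)BLAST_MAX_FRAGS * optPacket;
}

/* Number of fragments needed for msgLen bytes, or -1 if the message
 * exceeds the limit (the -1 convention is an assumption of this sketch). */
static int blast_num_frags(size_t msgLen, size_t optPacket)
{
    assert(optPacket > 0);
    if (msgLen > blast_max_msg(optPacket))
        return -1;
    return (int)((msgLen + optPacket - 1) / optPacket);  /* ceiling division */
}
```

For example, with a lower protocol whose optimal packet size is 1500 bytes, the limit works out to 24000 bytes, and a 1501-byte message needs two fragments.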
The receiver gathers all of the packets and sends a NACK if it has reason to believe (through time-outs or other considerations) that a packet has been dropped. BLAST can handle any number of outstanding messages between two hosts (buffer space permitting, of course). The protocol is bidirectional; i.e., it supports blasts in both directions over the same session. Small messages take a short cut through the protocol and do not require the allocation of any resources.
The sender keeps a copy of the message around until a time-out occurs or the higher-level protocol that sent the message notifies BLAST that it can free the message (through a FREERESOURCES control op.) Users of BLAST are strongly encouraged to free messages as soon as possible. The sender knows which BLAST instance (BLAST can be instantiated more than once) and which message to free because, when a push is performed, BLAST writes into a message attribute attached by CHAN (or some other high-level protocol) a pointer to itself and a 32-bit integer ticket which uniquely identifies the message.
Because the sending BLAST may time-out and release a message before all fragments have been received, BLAST is not reliable. It is, however, very persistent.
BLAST performance is critically dependent upon the time-out strategy used and the initial values of the timers. As mentioned earlier, the sender uses a timer to free resources after a set interval has elapsed. Tuning this timer for use with higher-level protocols which do not explicitly free resources is very difficult. For applications which do free resources, this time-out interval has no effect on performance unless it is set too small. The receiver sets a timer whenever a fragment from a new packet arrives. The only purpose of this timer is to detect the loss of the last fragment. This timer is set to some constant times the number of fragments in the message. If this timer expires too early, the code detects this and increases the constant by a factor of two. After a NACK is sent, the timer is set to the round trip time plus some constant times the number of fragments. The purpose of this timer is to generate a new NACK if the original NACK or the retransmitted fragments are lost.
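The receiver-side timer arithmetic can be modelled as below. This is a sketch under stated assumptions rather than the BLAST code: the structure, the field names, the microsecond units, and the use of two separate per-fragment constants are all hypothetical, and it reads the NACK timer as being armed after a NACK has been sent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical receiver timer state; units and names are assumptions. */
typedef struct {
    unsigned perFragUsec;      /* constant used while gathering fragments */
    unsigned nackPerFragUsec;  /* constant used after a NACK is sent */
    unsigned rttUsec;          /* estimated round trip time */
} BlastTimers;

/* Timer armed when the first fragment of a new message arrives. */
static unsigned blast_gather_timeout(const BlastTimers *t, unsigned nfrags)
{
    assert(t != NULL);
    return t->perFragUsec * nfrags;
}

/* Called when the gather timer is found to have fired too early. */
static void blast_timeout_too_early(BlastTimers *t)
{
    assert(t != NULL);
    t->perFragUsec *= 2;
}

/* Timer armed after a NACK, so a lost NACK or lost retransmission
 * eventually triggers another NACK. */
static unsigned blast_nack_timeout(const BlastTimers *t, unsigned nfrags)
{
    assert(t != NULL);
    return t->rttUsec + t->nackPerFragUsec * nfrags;
}
```

Doubling on a premature expiry lets the gather constant adapt upward on slow links without manual tuning; the sketch has no corresponding decrease because the text describes none.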
BLAST is in the ASYNC realm.
BLAST neither removes nor adds anything to the participant stacks.
BLAST requires only its lower transport protocol. Since BLAST doesn't use host addresses, it can sit on top of protocols using different address types without modification.
Sean O'Malley and Ed Menze
CHAN (RPC Channel Micro-Protocol)
S. O'Malley and L. Peterson. A Dynamic Network Architecture. ACM Transactions on Computer Systems 10, 2 (May 1992), 110--143.
B. Welch. The Sprite remote procedure call. University of California at Berkeley, Tech Report UCB/CSD 86/302, June 1986.
CHAN is a single protocol version of Sprite RPC's reliable request-reply channel. The algorithm was extracted from Sprite and made into a stand-alone protocol. Each CHAN session supports the Birrell-Nelson implicit acking RPC algorithm between two hosts.
CHAN provides ``at most once'' RPC semantics. When a CHAN call returns successfully, the protocol guarantees that the request has been processed exactly once by the server. If CHAN returns unsuccessfully (XK_FAILURE), the server may have processed the request once, or it may not have seen the request at all.
Channel numbers are entirely internal to the CHAN protocol. When a new client channel session is created, a new host-host channel number is selected internally by CHAN. When protocols openEnable CHAN, they will receive connections from any channel number on any remote host. Each open of CHAN by a client session will result in the passive creation of a corresponding session on the server.
Each channel session will accept only a single outstanding request. Sending additional requests on a channel before the first request has returned is not allowed.
CHAN relies on the BIDCTL and BID protocols to determine when a peer has rebooted. When notified of a peer's reboot, CHAN will disable all active channels to that host. Outstanding calls will return XK_FAILURE, as will all subsequent calls on that channel session. Replies sent through disabled server channels will be discarded.
CHAN must know several things about the transport protocol used to actually send the message. This information is represented in the following structure:
typedef struct {
    XObj transport;
    int ticket;
    int reliable;
    int expensive;
    unsigned int timeout;
} chan_info_t;
This structure is defined in the CHAN session state, and a pointer to it is attached as an attribute to each outgoing message. Before the message is sent, CHAN zeroes out all fields of the structure. When xPush returns, CHAN assumes that some lower-level protocol may have filled in the fields.
If transport has been defined CHAN will perform a FREERESOURCES control operation on transport when the current message has been acked. If the lower level protocol is reliable CHAN will never retransmit the entire message and will not start a timer. If the lower level protocol is expensive CHAN will not retransmit the entire message when it times out. It simply requests an ACK. The timeout field is ignored for the moment. If transport has been defined CHAN will invoke a CHAN_RETRANSMIT control operation on it before retransmitting. If this control operation returns 0 CHAN will not retransmit the body of the message. This allows a lower level protocol like BLAST to discourage CHAN from retransmitting while the message is still being sent.
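The time-out rules in the paragraph above amount to a small decision table, sketched here. The `ChanAction` enumeration, the `chan_on_timeout` name, and the `XObj` stand-in are assumptions for illustration; `retransmitReply` stands for the value returned by the CHAN_RETRANSMIT control operation, which the sketch does not actually invoke.

```c
#include <assert.h>
#include <stddef.h>

typedef void *XObj;   /* stand-in for the x-kernel object handle (assumption) */

typedef struct {
    XObj transport;
    int ticket;
    int reliable;
    int expensive;
    unsigned int timeout;
} chan_info_t;

/* Hypothetical summary of what CHAN does for an unacknowledged message. */
typedef enum {
    CHAN_NO_TIMER,     /* reliable lower protocol: no timer, no retransmit */
    CHAN_REQUEST_ACK,  /* expensive lower protocol: only ask for an ACK */
    CHAN_SKIP_BODY,    /* transport replied 0 to CHAN_RETRANSMIT */
    CHAN_RESEND_ALL    /* retransmit the entire message */
} ChanAction;

static ChanAction chan_on_timeout(const chan_info_t *info, int retransmitReply)
{
    assert(info != NULL);
    if (info->reliable)
        return CHAN_NO_TIMER;
    if (info->expensive)
        return CHAN_REQUEST_ACK;
    if (info->transport != NULL && retransmitReply == 0)
        return CHAN_SKIP_BODY;
    return CHAN_RESEND_ALL;
}
```

Because CHAN zeroes the structure before each push, a lower protocol that fills in nothing leaves every field at 0, and the sketch falls through to a full retransmission.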
CHAN lies on the boundary between the ASYNC realm and the RPC realm. That is, it looks like an ASYNC protocol to protocols below it, and an RPC protocol to protocols above it.
CHAN neither removes from nor adds to the participant stacks, passing the participants untouched to the transport protocol on an open and ignoring the participant structure on an openenable.
CHAN requires its lower transport protocol configured as the first lower protocol and BIDCTL configured as the second lower protocol. CHAN requires that its transport protocol will deliver incoming messages from different hosts through different lower sessions and that all CHAN messages from the same host come from the same lower session.
CHAN is a realm boundary protocol which assumes its transport protocol is symmetric (in the ASYNC realm.)
Because CHAN affixes a pointer to the outgoing message it must be in the same address space as any transport protocol which will attempt to set the structure passed in the attribute.
Sean O'Malley
This hardware-independent protocol provides the interface between the rest of the x-kernel protocols and the actual ethernet drivers. It has a UPI interface to protocols above it and interacts with the drivers through a specialized UPI interface. There should be a separate instantiation of the ETH protocol for each driver protocol.
ETH is in the ASYNC realm.
ETH expects a single remote participant with an ETHhost pointer on the top of the stack. If the local participant is present it is ignored.
Ethernet driver protocols should include the file protocols/eth/eth_i.h which defines the interface between ETH and the drivers.
ETH will openenable its driver protocol once at initialization time, without a participant list. This gives the driver protocol the XObj it should use in xDemux when it delivers messages.
ETH calls xPush with the driver protocol object (not a session) to send a message. ETH never opens the lower protocol.
ETH will attach a pointer to an ETHhdr as a message attribute for each outgoing message:
typedef struct {
    ETHhost dst;
    ETHhost src;
    u_short type;
} ETHhdr;
ETH requires that the driver attach a message attribute pointing to an appropriate ETHhdr structure for every incoming message. For both incoming and outgoing messages, the ETHhdr type field will be in network byte order.
ETH requires the driver protocol to implement the control op GETMYHOST.
ETH provides support for IEEE 802.3 packet formats. An upper protocol registering with Ethernet type 0 is assumed to be the recipient for all IEEE 802.3 packets. Conversely, a protocol using an Ethernet type smaller than the maximum IEEE 802.3 data size will have its packets sent using IEEE 802.3 format (i.e., with the packet length overwriting the type field.)
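The IEEE 802.3 rule above can be illustrated with two small helpers. The names are hypothetical, the values are kept in host byte order for simplicity (the real ETHhdr type field is in network byte order), and the 1500-byte maximum data size is the standard 802.3 figure rather than a constant taken from the ETH source.

```c
#include <assert.h>
#include <stdint.h>

#define IEEE8023_MAX_DATA 1500u   /* maximum IEEE 802.3 data size */

/* Value placed in the type/length field of an outgoing frame. */
static uint16_t eth_out_type_field(uint16_t upperType, uint16_t dataLen)
{
    assert(dataLen <= IEEE8023_MAX_DATA);
    if (upperType < IEEE8023_MAX_DATA)
        return dataLen;      /* 802.3 format: length overwrites the type */
    return upperType;        /* ordinary Ethernet type */
}

/* Ethernet type under which an incoming frame is demultiplexed.
 * A field that can only be a length is mapped to type 0, the type an
 * upper protocol registers with to receive IEEE 802.3 packets. */
static uint16_t eth_in_demux_type(uint16_t field)
{
    return field <= IEEE8023_MAX_DATA ? 0 : field;
}
```

So a protocol registered with type 0x0800 keeps its type on the wire, while one registered with type 0 sends frames whose field holds the data length and receives every frame whose field is a length.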
Each instantiation of ETH should be configured above its corresponding driver protocol.
ETH recognizes the following ROM options:
eth/xxx mtu N: Instantiation xxx of ETH should use an MTU of N (decimal). Default is 1500.
Ed Menze
This hardware-independent protocol provides the interface between the rest of the x-kernel protocols and the actual FDDI drivers. It has a UPI interface to protocols above it and interacts with the drivers through a specialized UPI interface. There should be a separate instantiation of the FDDI protocol for each driver protocol.
FDDI is in the ASYNC realm.
FDDI expects a single remote participant with an FDDIhost pointer on the top of the stack. If the local participant is present it is ignored.
FDDI driver protocols should include the file protocols/fddi/fddi_i.h which defines the interface between FDDI and the drivers.
FDDI will openenable its driver protocol once at initialization time, without a participant list. This gives the driver protocol the XObj it should use in xDemux when it delivers messages.
FDDI calls xPush with the driver protocol object (not a session) to send a message. FDDI never opens the lower protocol.
FDDI will attach a pointer to an FDDIhdr as a message attribute for each outgoing message:
typedef struct {
    FDDIhost dst;
    FDDIhost src;
    u_short type;
} FDDIhdr;
FDDI requires that the driver attach a message attribute pointing to an appropriate FDDIhdr structure for every incoming message. For both incoming and outgoing messages, the FDDIhdr type field will be in network byte order.
FDDI requires the driver protocol to implement the control op GETMYHOST.
Each instantiation of FDDI should be configured above its corresponding driver protocol.
FDDI recognizes the following ROM options:
fddi/xxx mtu N: Instantiation xxx of FDDI should use an MTU of N.
David Yates
ICMP (Internet Control Message Protocol)
J. Postel. Internet Control Message Protocol. Request for Comments 792, USC Information Sciences Institute, Marina del Rey, Calif., Sept. 1981.
ICMP handles control messages for IP. This implementation is complete in that it handles all possible incoming ICMP requests.
ICMP is in the CONTROL realm. ICMP sessions may be opened to allow control operations.
ICMP neither removes nor adds anything to the participant stacks. It passes the participants directly to IP.
name=icmp protocols=ip;
Clinton Jeffery
J. Postel. Internet Protocol. Request for Comments 791, USC Information Sciences Institute, Marina del Rey, Calif., Sept. 1981.
IP handles fragmentation and routing required in transmitting messages across heterogeneous interconnected networks. This implementation is complete, with the exception of some of the optional header fields.
IP is in the ASYNC realm.
IP removes a pointer to an IPhost from the top of the stack of each participant. If the local participant is missing or if the local IPhost pointer is ANY_HOST, IP will select an appropriate local IPhost.
IP must be configured above VNET:
name=ip protocols=vnet;
If an explicit route for a remote network is not specified, IP will forward packets for that network to a default gateway, if one has been configured. The default gateway can be set with a rom file entry of the form:
ip gateway 127.1.22.11
If no default gateway has been configured, or the specified default gateway cannot be reached directly, IP will operate without a default gateway and ERR_XOBJ will be returned in cases where a default gateway would otherwise have been used.
Clinton Jeffery, David Kays and Ed Menze
SELECT (RPC Select Micro-Protocol)
S. O'Malley and L. Peterson. A Dynamic Network Architecture. ACM Transactions on Computer Systems 10, 2 (May 1992), 110--143.
B. Welch. The Sprite remote procedure call. University of California at Berkeley, Tech Report UCB/CSD 86/302, June 1986.
SELECT is a micro-protocol that performs the addressing function of Sprite RPC; i.e., it demultiplexes request messages to the appropriate procedure.
SELECT is in the RPC realm.
SELECT removes a pointer to a long (the remote procedure number) from the top of the stack of the first participant.
SELECT expects one RPC realm protocol below it.
Sean O'Malley
SIMSIMETH (Simulated Simulated Ethernet Driver Protocol)
SIMSIMETH simulates an x-kernel ethernet driver by sending and receiving messages using any x-kernel protocol that accepts UDP addresses. SIMSIMETH is platform independent. SIMSIMETH interoperates with SIMETH (and itself). SIMSIMETH's primary purpose is to support the testing of protocols between x-kernel simulators such as the SunOS simulator and native mode x-kernel implementations such as the Mach 3.0 implementation.
Each instantiation of SIMSIMETH associates a simulated Ethernet address with a specific UDP address and simulates an Ethernet driver for a single interface. SIMSIMETH reads the UDP port from the ROM file and gets its IP address by performing a GETMYHOST control operation on the protocol configured below it.
The mapping between Unix UDP ports and SIMSIMETH Ethernet addresses is very simple. The six bytes of a SIMSIMETH Ethernet address are formed by concatenating the four-byte IP host number of the Unix host on which the simulator is running and the two-byte UDP port used by the SIMSIMETH instantiation. Note that this IP host must be valid and defined by the protocol graph below SIMSIMETH. Since one generally runs two copies of ARP when SIMSIMETH is configured, keeping your IP addresses straight is sometimes difficult. See the CONFIGURATION section below.
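The address mapping just described can be written down directly. The `SimEthAddr` type and the `simsimeth_addr` function are hypothetical names used only for this sketch; the port is placed high byte first, which is an assumption consistent with the example addresses shown in the ROM files later in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical six-byte simulated Ethernet address. */
typedef struct {
    uint8_t b[6];
} SimEthAddr;

/* Concatenate a four-byte IP host number and a two-byte UDP port. */
static SimEthAddr simsimeth_addr(const uint8_t ip[4], uint16_t udpPort)
{
    SimEthAddr a;
    assert(ip != NULL);
    for (int i = 0; i < 4; i++)
        a.b[i] = ip[i];
    a.b[4] = (uint8_t)(udpPort >> 8);     /* high byte of the port */
    a.b[5] = (uint8_t)(udpPort & 0xff);   /* low byte of the port */
    return a;
}
```

Applied to host 192.12.69.67 and port 9875, this gives the bytes c0:0c:45:43:26:93, since 9875 is 0x2693.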
When a message is sent using SIMSIMETH, a map is checked to see if a lower level session exists for that destination address. If no lower level session exists, one is created by performing an open on the protocol configured below SIMSIMETH (most probably UDP). Once SIMSIMETH has opened a lower level session it is never closed. SIMSIMETH then pushes an Ethernet header on the message and pushes the message using the lower level session. When a packet arrives at SIMSIMETH the Ethernet header is removed and SIMSIMETH presents the packet received as an incoming Ethernet packet.
Note that an x-kernel may be configured with multiple instantiations of SIMSIMETH, each with its own UDP port, to simulate a multihomed host. SIMSIMETH can awkwardly simulate Ethernet broadcast messages. When an outgoing broadcast message is sent to SIMSIMETH, SIMSIMETH asks its corresponding ARP protocol for a dump of all hosts in its table. SIMSIMETH then sends the message to each of these hosts in a point-to-point fashion. Note that for a reasonable simulation of Ethernet broadcast, all x-kernels in communication should have the same ARP table (see the ARP appendix.)
The primary purpose of SIMSIMETH is to allow simulated x-kernels and native x-kernels to interoperate. This is possible because SIMSIMETH interoperates with the SIMETH protocol. Interoperability is achieved as follows. The graph.comp file for the simulated x-kernel is unchanged and is rooted at SIMETH, the simulated Ethernet driver. The graph.comp for the machine running a native x-kernel must contain the same graph as the simulated version, except that SIMSIMETH is configured instead of SIMETH and SIMSIMETH is configured on top of the standard Internet protocol graph. If the following is the graph.comp on the SunOS simulator:
#SunOS Graph.comp
@;
name=simeth;
name=eth protocols=simeth;
name=arp protocols=eth;
name=vnet protocols=eth,arp;
name=ip protocols=vnet;
name=udp protocols=ip;
name=udptest protocols=udp;
@;
prottbl = ../prottbl.simsimeth;
The corresponding graph for the native mode x-kernel would be:
#Mach 3.0 Graph.comp
@;
name=ethdrv/SE0;
name=eth/lower protocols=ethdrv/SE0;
name=arp/lower protocols=eth/lower;
name=vnet/lower protocols=eth/lower,arp/lower;
name=ip/lower protocols=vnet/lower;
name=icmp/lower protocols=ip/lower;
name=udp/lower protocols=ip/lower;
name=simsimeth protocols=udp/lower;
name=eth/upper protocols=simsimeth;
name=arp/upper protocols=eth/upper;
name=vnet/upper protocols=eth/upper,arp/upper;
name=ip/upper protocols=vnet/upper;
name=udp/upper protocols=ip/upper;
name=udptest protocols=udp/upper;
@;
prottbl = ../prottbl.simsimeth;
The /lower protocols in the above graph.comp correspond to the Internet protocol suite implemented in the SunOS kernel. The /upper protocols correspond to the protocols running on the simulated x-kernel. Note that a user-defined protocol table is needed. This is done because the protocol table must define the correct (800) relative ETH protocol id for IP for this technique to work. The default protocol table defined by the x-kernel defines the IP protocol number as 3900 to avoid interference when testing new protocols. Note also that if you use SIMSIMETH, please make sure that your versions of UDP, IP, ARP, VNET, and ETH are at least as new as SIMSIMETH.
There must be a similar correspondence between the ROM files on the SunOS side and the native Mach 3.0 side. For example the following is an example ROM file for the SunOS platform:
simeth 9875
eth mtu 1400
arp 192.12.69.1 192.12.69.67 9875
arp 192.12.69.2 192.12.69.99 1234
While the following is an example ROM file for the Mach 3 platform:
arp/lower 192.12.69.99 08:00:2b:23:6d:ec # mozart
simsimeth 1234
eth/upper mtu 1400
arp/upper 192.12.69.1 c0:0c:45:43:26:93 # translation: 192.12.69.67 9875
arp/upper 192.12.69.2 c0:0c:45:63:04:d2 # translation: 192.12.69.99 1234
The SunOS ROM file is identical to the standard ROM file except for the addition of a line to set the Ethernet MTU to 1400 bytes. If the default MTU of 1500 bytes were used, SunOS IP would fragment the outgoing simulated Ethernet packets into two real Ethernet packets of 1500 bytes and 64 bytes respectively. This pattern of packets can result in a serious increase in the number of dropped packets. The MTU of the simulated Ethernet driver on the Mach platform (eth/upper) is also set to 1400 bytes to avoid the same problem. Note that you should always manually set the MTUs of the simulated Ethernet drivers to the same number!
The ROM file for Mach 3.0 must be changed to set the appropriate IP to Ethernet address correspondence for the real ARP (ARP/lower), while the simulated ARP (ARP/upper) must be configured with the same information as given in the SunOS ROM file. This is made more complex because ARP on Mach 3.0 platforms expects real Ethernet addresses, while ARP on the SunOS platform expects to find simulated Ethernet addresses. Therefore the user must manually convert an IP address and port pair into an Ethernet address using the algorithm given above. For example, the "Ethernet address" c0:0c:45:43:26:93 is simply the hex translation of 192.12.69.67 9875.
Note that when the client is started it should be passed the simulated IP address of the server (192.12.69.1 or 192.12.69.2), not the real IP address of the server.
Working graph.comp, ROM, and protocol table files can be found in the Template directory.
SIMSIMETH is in the ASYNC realm, supporting the Ethernet driver interface described in the ETH appendix.
SIMSIMETH supports the Ethernet driver interface rather than a standard xkernel UPI interface and thus makes no use of participant stacks.
SIMSIMETH supports the Ethernet driver interface described in the ETH appendix.
SIMSIMETH should be configured on top of any protocol that takes UDP addresses and below any protocol which supports the ETH lower level driver interface. It can be configured in either the driver section or the protocol section of graph.comp.
SIMSIMETH recognizes the following ROM options:
simsimeth nnnn: This instantiation of simsimeth should use UDP port nnnn. There must be such a line for each instantiation of SIMSIMETH in the x-kernel.
Sean O'Malley
SUNRPC (Sun Remote Procedure Call Protocol)
Remote Procedure Calls: Protocol Specification. Sun Microsystems, Inc., Mountain View, Calif., May 1988.
Sun RPC is a fairly complete implementation of Sun's remote procedure call protocol. This implementation is compatible with Sun's native implementation. The interface to our implementation of Sun RPC in no way resembles Sun's. The Sun Portmapper is treated as a separate protocol that sits on top of Sun RPC. This implementation only supports UDP as a lower level protocol. The protocol supports unreliable remote procedure calls to possibly heterogeneous hosts. Our implementation of the Sun RPC protocol uses most but not all of the original Sun RPC include files and XDR files. Thus, you must have access to Sun include and XDR files to compile this protocol.
SUNRPC is in the RPC realm.
SUNRPC removes three pointers to longs from the remote participant stack, representing the procedure, version and program number (starting from the top of the stack.)
The local participant may be present, in which case it is passed on untouched to the next protocol, or it may be absent.
SUNRPC is a realm boundary protocol which assumes its lower protocol is symmetric (in the ASYNC realm.)
name=sunrpc protocols=udp;
Sean O'Malley and Richard Schroeppel
? Sun Microsystems, Inc., Mountain View, Calif., May 1988.
The portmapper supports four of the five procedures defined by Sun. Indirect Call, Function 5, is not implemented. This implementation is compatible with Sun's native implementation. The x-kernel portmapper is treated as a separate protocol that sits on top of the x-kernel implementation of Sun RPC. This implementation only supports UDP as a lower level protocol. Client and server protocols running above the sunrpc protocol make remote procedure calls to the portmapper, using its well-known UDP port number 111.
Our implementation of the Sun Portmapper protocol uses some of the original SUN_RPC include files. You must have access to Sun include files to compile this protocol.
The portmapper is in the ASYNC realm.
The portmapper is a realm boundary protocol which assumes its single lower protocol is in the RPC realm.
name=pmap protocols=sunrpc;
Sean O'Malley and Richard Schroeppel
TCP (Transmission Control Protocol)
Transmission Control Protocol. Request for Comments 793, USC Information Sciences Institute, Marina del Rey, Calif., Sept. 1981.
TCP is a reliable stream transport protocol. It maintains a connection between the server and the client, and provides reliable stream delivery to the process. This implementation is an encapsulation of the Unix 4.3 BSD implementation.
This implementation of TCP supports input and output buffering. Output buffers are contained within TCP. If the amount of data sent and unacknowledged by the peer reaches the output buffer size, TCP will block subsequent xPush calls (or will return XMSG_ERR_WOULDBLOCK in the case of non-blocking I/O.)
TCP provides support for users to work with finite input buffers. TCP will limit the amount of input data sent to its upper protocol via xDemux to the size of the input buffer. When data have been consumed from the user's input buffer, free buffer space must be signalled to TCP via a TCP_SETRCVBUFSPACE call (see below.) If a user does not wish to use input buffering, a control message signalling an empty buffer should be sent in response to each xDemux.
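The input-buffer handshake can be pictured with a small accounting model. This is a sketch only: the `RcvWindow` type and the three functions are hypothetical, the model assumes the value passed with TCP_SETRCVBUFSPACE is the number of free bytes in the user's buffer, and it does not call any x-kernel operation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the free space the user has advertised to TCP. */
typedef struct {
    unsigned space;   /* free bytes in the user's input buffer */
} RcvWindow;

/* How many of the pending bytes TCP may hand up via xDemux right now. */
static unsigned rcv_deliverable(const RcvWindow *w, unsigned pending)
{
    assert(w != NULL);
    return pending < w->space ? pending : w->space;
}

/* Account for n bytes having been delivered to the upper protocol. */
static void rcv_delivered(RcvWindow *w, unsigned n)
{
    assert(w != NULL && n <= w->space);
    w->space -= n;
}

/* Models the effect of the user signalling free space to TCP. */
static void rcv_set_space(RcvWindow *w, unsigned freeBytes)
{
    assert(w != NULL);
    w->space = freeBytes;
}
```

In this model a user who does not want input buffering would call `rcv_set_space` with the full buffer size after every delivery, which corresponds to signalling an empty buffer in response to each xDemux.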
TCP is in the ASYNC realm.
TCP removes a pointer to a long (the TCP port number) from the participant stack. TCP ports must be less than 0x10000. If the local participant is missing, or if the local port number is ANY_PROT, TCP will select an unused local port.
Note: all protocols using TCP without having OOB data delivered in-band must be prepared to accept this upcall.
name=tcp protocols=ip;
Norm Hutchinson, Herman Rao, and David Mosberger-Tang
TEST (instantiated as 'chantest', 'udptest', etc.)
The test protocol runs a ``ping-pong'' test of the protocol below it for various message lengths and numbers of iterations.
Transport test protocols run in one of two roles, either as ``client'' or as ``server.'' The client will send a message to the server and wait for a reply before sending the next message. There are no provisions for retransmission -- if the protocol below the test protocol drops a message, the test will fail.
When the test protocol instantiates, it can determine which role it should assume in several ways. Command line parameters can be used to cause the same kernel to run as the server on one machine and as the client on another. The server should be started up with a ``-s'' flag:
xkernel -s
The client side must be told the host address of the server peer (note that on the sunos platform, this should be the address of the simulated IP host.) This can be done with the ``-c'' command line option, e.g.:
xkernel -c192.12.69.54
If you will be running several tests between the same hosts you may find it convenient to copy the test protocol to your build directory and edit the declaration of ServerAddr and ClientAddr in your local copy to name these hosts directly. This will eliminate the need for client and server flags. Note that most test protocols include other files from the protocols/test directory, so you will either have to copy those files as well or add $(XRT)/protocols/test to the TMP_INCLUDES variable in your Makefile.
If a test protocol is configured with an instance name of 'client' or 'server', it will come up in the appropriate role. This can be used to run both a client and server in the same kernel for loopback testing as in this graph.comp excerpt:
...
name=udp protocols=ip;
name=udptest/server protocols=udp;
name=udptest/client protocols=udp;
If you have configured several standard test protocols into the kernel, you can run any subset of them by putting the test names onto the command line, e.g.:
xkernel -testip -testudp
With no command line test selections, all of the configured test protocols will run.
The number of round trips for each packet size can be set with the ``trips'' flag:
xkernel -trips=10000
If you use ``dns'' lines (see 10.7) in your rom file to map host names to IP addresses, then you can use the name in place of the IP address when starting the client, i.e.,
Rom file entry:
dns mars 192.12.69.54
Sample command lines for starting client:
xkernel -c mars
xkernel -cmars
Note that the name on the command line must be an exact match (not a substring) of the rom file entry.
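The role and trip-count flags described in this section can be summarized with a small argument parser. This is a hedged sketch and not the parser used by the test protocols: the types Role and TestOpts and the function parseTestArgs are hypothetical, and test selection flags such as -testudp are simply ignored here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { ROLE_UNSET, ROLE_SERVER, ROLE_CLIENT } Role;

typedef struct {
    Role        role;
    const char *server;   /* host given with -c, or NULL */
    long        trips;    /* value of -trips=, or 0 if not given */
} TestOpts;

/* Hypothetical parser for the flags shown in this section. */
TestOpts parseTestArgs(int argc, char **argv)
{
    TestOpts o = { ROLE_UNSET, NULL, 0 };
    for (int i = 1; i < argc; i++) {
        const char *a = argv[i];
        if (strcmp(a, "-s") == 0) {
            o.role = ROLE_SERVER;
        } else if (strncmp(a, "-trips=", 7) == 0) {
            o.trips = strtol(a + 7, NULL, 10);
        } else if (strncmp(a, "-c", 2) == 0) {
            o.role = ROLE_CLIENT;
            if (a[2] != '\0') {
                o.server = a + 2;       /* attached form: -c192.12.69.54 */
            } else if (i + 1 < argc) {
                o.server = argv[++i];   /* separated form: -c mars */
            }
        }
    }
    return o;
}
```

The server string is kept as given; resolving a name such as mars against the ``dns'' entries in the rom file is left out of the sketch.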
The test protocols all use the common trace variable prottest which can be set in the third section of graph.comp:
@;
...
name=udptest protocols=udp;
@;
name=prottest trace=TR_EVENTS;
If you set a trace level when you declare the test protocol in the second section of graph.comp, it will be ignored.
Remember that if you are using simeth you must use the name of the simulated host when you invoke the client, not the real host.
UDP (User Datagram Protocol)
J. Postel. User Datagram Protocol. Request for Comments 768, USC Information Sciences Institute, Marina del Rey, Calif., Aug. 1980.
UDP is a trivial protocol that dispatches messages that arrive at the host to a process running on the host.
UDP is in the ASYNC realm.
UDP removes a pointer to a long (the UDP port number) from the participant stack. UDP ports must be less than 0x10000. If the local participant is missing, or if the local port number is ANY_PORT, UDP will select an unused local port.
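The local port rule can be illustrated with a short C sketch. It is not taken from the UDP or TCP sources: ANY_PORT_X is a hypothetical stand-in for ANY_PORT (whose real value is not given here), the inUse table and bindLocalPort are invented for the example, and both the starting point of the search and the rejection of an already bound port are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PORT   0x10000L   /* ports must be strictly less than this */
#define ANY_PORT_X (-1L)      /* hypothetical stand-in for ANY_PORT */

static bool inUse[MAX_PORT];  /* hypothetical table of bound local ports */

/* Returns the bound port, or -1 if the request is invalid or nothing is free.
 * `requested` may be NULL (local participant missing) or point to ANY_PORT_X,
 * in which case an unused port is chosen. */
long bindLocalPort(const long *requested)
{
    if (requested != NULL && *requested != ANY_PORT_X) {
        long p = *requested;
        /* Assumption: an out-of-range or already bound port is an error. */
        if (p < 0 || p >= MAX_PORT || inUse[p]) {
            return -1;
        }
        inUse[p] = true;
        return p;
    }
    /* Pick any unused port; starting at 1024 is only illustrative. */
    for (long p = 1024; p < MAX_PORT; p++) {
        if (!inUse[p]) {
            inUse[p] = true;
            return p;
        }
    }
    return -1;
}
```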
name=udp protocols=ip;
Larry Peterson and Sean O'Malley
VCACHE (Virtual Caching Protocol)
VCACHE is a simple session caching protocol which provides a time buffer between when an upper protocol closes a lower session and when that close is propagated to the lower session.
When a VCACHE session is closed, it is placed in a time-expiration cache. When VCACHE receives subsequent opens (active or passive), it first looks in its cache for sessions with the same hlp and the same remote host. After a configurable amount of time, closed VCACHE sessions will be destroyed and the lower session actually closed.
As an example, VCACHE can be inserted between a UDP server and UDP to prevent UDP sessions from being closed when the server releases all references.
Server sessions (those created as a result of a match between an incoming packet and an openenable) are always cached. If an instantiation of VCACHE is running in symmetric mode, client sessions (those created by an xOpen) will be cached as well. It is reasonable to configure VCACHE in symmetric mode above symmetric (ASYNC realm) protocols and in asymmetric mode above asymmetric (RPC realm) protocols, though this usage is certainly not required.
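The time-expiration cache can be pictured with the following C sketch. It is an illustration under stated assumptions, not the VCACHE implementation: the fixed-size table, the integer stand-ins for the hlp and the remote host, and the functions cachePut, cacheTake and cacheExpire are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define CACHE_SLOTS 8

typedef struct {
    bool     used;
    int      hlp;       /* stand-in for the upper protocol identity */
    unsigned host;      /* stand-in for the remote host address */
    long     expires;   /* time at which the lower session is really closed */
} CacheEntry;

static CacheEntry cache[CACHE_SLOTS];

/* The upper protocol closed the session: park it instead of closing below. */
bool cachePut(int hlp, unsigned host, long now, long ttl)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (!cache[i].used) {
            cache[i] = (CacheEntry){ true, hlp, host, now + ttl };
            return true;
        }
    }
    return false;   /* table full; a real protocol would have to close below */
}

/* A later open looks for a parked session with the same hlp and remote host. */
bool cacheTake(int hlp, unsigned host, long now)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].used && cache[i].hlp == hlp && cache[i].host == host
            && now < cache[i].expires) {
            cache[i].used = false;   /* the session is live again */
            return true;
        }
    }
    return false;
}

/* Timer tick: destroy entries whose time has run out; returns how many. */
int cacheExpire(long now)
{
    int n = 0;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].used && now >= cache[i].expires) {
            cache[i].used = false;
            n++;
        }
    }
    return n;
}
```

A hit in cacheTake corresponds to reusing the lower session without reopening it; an entry removed by cacheExpire corresponds to the point where the lower session is actually closed.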
VCACHE can be used with either ASYNC or RPC realm lower protocols and assumes the realm of the protocol below it.
VCACHE passes participants directly to the lower protocol and does not make direct use of any participant information.
VCACHE uses a single lower protocol.
VCACHE recognizes the following ROM options:
Run this instantiation in symmetric mode.
Ed Menze
VCHAN (Channel Virtual-Protocol)
S. O'Malley and L. Peterson. A Dynamic Network Architecture. ACM Transactions on Computer Systems 10, 2 (May 1992), 110--143.
B. Welch. The Sprite remote procedure call. University of California at Berkeley, Tech Report UCB/CSD 86/302, June 1986.
VCHAN is a virtual protocol that multiplexes multiple client procedure invocations over some number of open channels. The call blocks if there are no idle channels available. VCHAN was originally based on the Sprite RPC protocol.
VCHAN initially opens a default number of channels for a new session, though this number can be increased or decreased via control operations.
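The channel multiplexing idea can be sketched as a small pool of channels. This is a simplified model rather than VCHAN itself: ChanPool, poolInit, poolAcquire and poolRelease are hypothetical names, and where a real call would block waiting for an idle channel, the sketch returns -1.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CHANS 16

typedef struct {
    int  nchans;             /* channels currently open for the session */
    bool busy[MAX_CHANS];    /* busy[i] is true while channel i carries a call */
} ChanPool;

/* Open the default number of channels for a new session. */
void poolInit(ChanPool *p, int n)
{
    assert(n >= 0 && n <= MAX_CHANS);
    p->nchans = n;
    for (int i = 0; i < MAX_CHANS; i++) {
        p->busy[i] = false;
    }
}

/* Claim an idle channel for a call. Returns its index, or -1 at the
 * point where the real protocol would block the caller. */
int poolAcquire(ChanPool *p)
{
    for (int i = 0; i < p->nchans; i++) {
        if (!p->busy[i]) {
            p->busy[i] = true;
            return i;
        }
    }
    return -1;
}

/* The call on channel `ch` has completed; the channel is idle again. */
void poolRelease(ChanPool *p, int ch)
{
    assert(ch >= 0 && ch < p->nchans && p->busy[ch]);
    p->busy[ch] = false;
}
```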
VCHAN is in the RPC realm.
VCHAN expects an IPhost pointer on the stack of each participant. It will not remove this pointer before passing the address down to the lower protocol.
VCHAN expects to be configured above another RPC realm protocol. It expects that each xOpen on the lower protocol with the same participants will return a new lower session.
Ed Menze
VDISORDER (Virtual Disorder Protocol)
Shuffles the order of incoming packets. Used to exercise the recovery mechanisms of other protocols.
VDISORDER sessions deliver packets in a somewhat different order than they receive them. The characteristics of the reordering are controlled by per-session parameters which can be set by control operations. VDISORDER will eventually deliver all packets it receives.
VDISORDER has no effect on outgoing packets.
VDISORDER is in the ASYNC realm.
VDISORDER passes participants to the lower protocols without manipulating them.
Erich Nahum
VDROP (Virtual Drop Protocol)
Throws away occasional incoming packets. Used to exercise the recovery mechanisms of other protocols.
VDROP sessions throw away incoming packets at regular intervals. By default, this interval is set in a somewhat random fashion at session creation time, though it can be set explicitly on a per-protocol basis via a ROM option (see CONFIGURATION below) or on a per-session basis via a control operation.
VDROP has no effect on outgoing packets.
VDROP should probably allow sessions to have more interesting distributions of drop intervals than ``once every N packets.''
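The ``once every N packets'' behavior amounts to a per-session counter. The following C sketch shows one way to express it; DropState and vdropShouldDrop are hypothetical names, and treating a non-positive interval as "never drop" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    int interval;   /* N: drop one packet out of every N received */
    int count;      /* packets seen since the last drop */
} DropState;

/* Called once per incoming packet; returns true if it should be dropped. */
bool vdropShouldDrop(DropState *s)
{
    if (s->interval <= 0) {
        return false;           /* assumption: no dropping configured */
    }
    if (++s->count >= s->interval) {
        s->count = 0;
        return true;            /* every Nth packet is thrown away */
    }
    return false;
}
```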
VDROP is in the ASYNC realm.
VDROP passes participants to the lower protocols without manipulating them.
VDROP recognizes the following ROM options:
vdrop/xxx interval N: Instantiation xxx of VDROP will use N as the drop interval for all of its sessions.
Ed Menze
VMUX (Virtual Multiplexing Protocol)
VMUX sits above several lower protocols, making them appear as a single protocol. VMUX makes very simple decisions about which protocol to use for each connection.
VMUX is only active during connection establishment. When openenabled, VMUX openenables all of its lower protocols on behalf of the upper protocol. When opened, VMUX attempts to open its lower protocols in the order in which they were specified in the down vector until a successful open occurs. The successfully opened lower session is then returned.
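The open behavior is a first-success loop over the down vector. The C sketch below models it with function pointers; it does not use the x-kernel interfaces. OpenFn, vmuxOpen, and the two sample lower protocols openLoopback and openNetwork are hypothetical, as are the integer session identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* A lower protocol's open: returns a session id >= 0, or -1 on failure. */
typedef int (*OpenFn)(unsigned host);

/* Try each lower protocol in down-vector order and return the first
 * session that opens. `which`, if not NULL, receives the index used. */
int vmuxOpen(OpenFn *down, size_t n, unsigned host, size_t *which)
{
    for (size_t i = 0; i < n; i++) {
        int s = down[i](host);
        if (s >= 0) {
            if (which != NULL) {
                *which = i;
            }
            return s;
        }
    }
    return -1;   /* no lower protocol could open a session */
}

/* Hypothetical lower protocols used only to exercise the loop. */
int openLoopback(unsigned host) { return host == 0x7F000001u ? 10 : -1; }
int openNetwork(unsigned host)  { (void)host; return 20; }
```

With the down vector { openLoopback, openNetwork }, a loopback address is served by the first entry and every other address falls through to the second.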
Since VMUX is active only during open, it should be considered in the same realm as the protocols below it (which should probably all be in the same realm.)
VMUX passes its participants to the lower protocols unmodified.
VMUX forwards control operations to the last protocol in the down vector list.
The lower protocols will be opened in the order in which they are listed in the configuration entry.
Ed Menze
VNET (Virtual Network Protocol)
VNET is a virtual protocol which manages multiple physical network protocols. When opened with an IP address, VNET determines if the host can be reached directly on one of its physical networks. If it can, a session on that network is opened. If it cannot be reached directly, an ERR_XOBJ is returned.
VNET sits above pairs of network protocols (one per interface) and ARP protocols. When opened with a remote IP address, VNET compares the net number with that of its lower protocols to determine if the host can be reached directly on a local network, opening the appropriate interface protocol (if possible.)
If opened with an IP broadcast address, VNET will determine which networks are matched by the broadcast address and will open a lower session on each of those networks. A push on a VNET broadcast session will result in a push on all of these lower network sessions.
Use of the IP broadcast address 255.255.255.255 will result in a VNET session which broadcasts on all of the local networks.
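The interface selection described above can be sketched as a comparison of network numbers. This is a model under assumptions, not the VNET code: it uses an explicit network mask per interface to extract the net number (the text does not say how VNET derives it), and Iface, IP and vnetMatch are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t addr;   /* this host's address on the interface */
    uint32_t mask;   /* network mask; an assumption, see the text above */
} Iface;

/* Build a host-order IPv4 address from its four octets. */
#define IP(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Returns a bitmask of the interfaces on which `remote` is directly
 * reachable (bit i set means ifs[i]). A result of 0 corresponds to the
 * case where the open fails. At most 32 interfaces are considered. */
uint32_t vnetMatch(const Iface *ifs, size_t n, uint32_t remote)
{
    uint32_t hits = 0;
    for (size_t i = 0; i < n && i < 32; i++) {
        int allNets = (remote == 0xFFFFFFFFu);   /* 255.255.255.255 */
        int sameNet = ((remote & ifs[i].mask) == (ifs[i].addr & ifs[i].mask));
        if (allNets || sameNet) {
            hits |= (uint32_t)1 << i;
        }
    }
    return hits;
}
```

An ordinary remote address sets at most one bit per matching network, while 255.255.255.255 sets a bit for every interface, mirroring a broadcast session that pushes on all lower sessions.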
VNET is in the ASYNC realm.
VNET removes a pointer to an IPhost from the top of the stack of the remote participant. Only the remote participant is processed. New participants are created for opening the lower network protocols.
Determines the address class of the given IP host. The address class is one of the following:
Indicate the number of interfaces used by the VNET protocol (protocol only.)
Indicates (through the xControl return value) whether the given host is on one of VNET's interfaces. When performed on a session, only those interfaces active on that session will be considered (a typical VNET session only uses one interface, though a broadcast session may have more than one.) When performed on a protocol, all interfaces are considered.
Returns sizeof(IPhost) if it is on a local network, 0 if it is not.
Indicates (through the xControl return value) whether the given host is an address which might be used to reach this host on VNET's local networks (i.e., if the address is one of this host's IP addresses or is a broadcast address.) Returns sizeof(IPhost) if it is local (or broadcast), 0 if it is not.
VNET expects its lower protocols to be configured in network/ARP pairs:
name=vnet protocols=eth/1,arp/1,eth/2,arp/2;
Ed Menze
VSIZE (Size Virtual-Protocol)
S. O'Malley and L. Peterson. A Dynamic Network Architecture. ACM Transactions on Computer Systems 10, 2 (May 1992), 110--143.
VSIZE is a virtual protocol that multiplexes messages through N lower-level protocols based on the size of the message being sent. By default, VSIZE determines the maximum packet size that each lower level protocol can handle by performing a GETOPTPACKET control operation on the first N-1 lower protocols (the last lower protocol is assumed to have an infinite maximum packet size). VSIZE sends each message using the lower level protocol with the smallest index whose optimum packet size is greater than the length of the message.
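The selection rule can be written as a short loop. The C sketch below is illustrative only: vsizeSelect and the cutoff array are hypothetical, with cutoff[i] standing for the packet size obtained for lower protocol i (by GETOPTPACKET or otherwise) and the last protocol treated as having no limit.

```c
#include <assert.h>
#include <stddef.h>

/* cutoff[i] is the packet-size limit for lower protocol i, for the first
 * n-1 protocols. The last protocol (index n-1) has no cutoff and takes
 * every message the earlier ones cannot. Returns the index to use. */
size_t vsizeSelect(const size_t *cutoff, size_t n, size_t msgLen)
{
    assert(n >= 1);
    for (size_t i = 0; i + 1 < n; i++) {
        if (cutoff[i] > msgLen) {
            return i;   /* smallest index whose limit exceeds the length */
        }
    }
    return n - 1;
}
```

Note that the comparison is strict, following the text: a message whose length equals a protocol's limit is passed on to the next protocol in the list.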
VSIZE is in the ASYNC realm.
VSIZE passes participants to the lower protocols without manipulating them.
VSIZE forwards control operations to the ``largest message'' protocol.
VSIZE's lower protocols should be ordered by decreasing efficiency and increasing packet size.
VSIZE recognizes the following ROM options:
vsize/xxx cutoff C1 C2: Instantiation xxx of VSIZE should use a cutoff length of C1 bytes for its first down protocol and a cutoff length of C2 bytes for its second down protocol. This option allows the user of VSIZE to override the values obtained via GETOPTPACKET. Note that VSIZE does not check whether a specified cutoff value is less than the maximum packet size of the corresponding lower-level protocol.
Ed Menze
VTAP (Virtual Wiretap Protocol)
VTAP simulates a wiretap by intercepting all messages as they are sent or received.
VTAP is a virtual protocol that intercepts messages. It will print the entire message unless the tap has been disabled.
VTAP is in the ASYNC realm.
VTAP passes participants to the lower protocols without manipulating them.
VTAP expects to be configured between two ASYNC protocols.
David C. Schwartz