--- /dev/null
+Rx protocol specification draft
+Nickolai Zeldovich, kolya@MIT.EDU
+
+Introduction
+============
+
+Rx is a client-server RPC protocol, an extended and combined version
+of the older R and RFTP protocols. This document describes Rx, but
+the details of Rx security protocols (such as Rxkad) are not specified.
+
+Rx communicates via UDP datagrams on a user-specified port. Rx also
+provides for multiplexing of Rx services on a single port, via a
+16-bit service ID that, akin to a port number, identifies a particular
+Rx service listening on a given port. Therefore, an Rx service is
+identified by a triple of <IP address; UDP port number; Rx service ID>.
+
+The protocol is connection-oriented -- a client and a server must
+first hand-shake and establish a connection before Rx calls can be
+made. Said hand-shaking is implicit upon the first request if no
+authentication is desired, or can consist of a pair of Challenge
+and Response requests in order to establish authentication between
+the client and the server.
+
+Protocol Overview
+=================
+
+As mentioned above, Rx uses UDP/IP datagrams on a user-specified
+port to communicate. An optional user-selectable authentication
+and encryption method can be used to achieve desired security.
+Each Rx server may provide multiple services, specified by the
+Service ID. This allows for service multiplexing, much in the
+same way as UDP port numbers allow for multiplexing of UDP
+datagrams addressed to the same host.
+
+Each client and server pair that want to communicate using Rx must
+establish an Rx connection, which can be thought of as a context
+for all subsequent Rx activity between these two parties. An Rx
+connection can only be associated with a single Rx service.
+
+Each Rx connection context contains multiple channels, which are
+used for data transmission and actually performing an RPC call.
+The channels are independent of each other, allowing multiple
+RPC calls to be performed to the same Rx server simultaneously.
+
+An Rx call involves the transmission of call arguments over an Rx
+channel to the server and reception of the reply data. For each
+Rx call, an available Rx channel must be allocated exclusively to
+that call. The channel cannot be used for anything else until the
+call completes. After call completion, the channel may be reused
+for subsequent Rx calls.
+
+Rx Connections
+==============
+
+This section makes many references to fields of an Rx header; see
+the ``Packet Formats'' section for specific layout of the Rx header.
+
+The connection epoch is a unique value chosen by Rx on startup and
+used by the peer both to identify connections to this host and
+to detect when this host's Rx restarts. An Rx connection between
+two hosts is identified by:
+
+ { Epoch, Connection ID, Peer IP, Peer Port },
+ if the high bit of the epoch (+) is not set
+ { Epoch, Connection ID },
+ if the high bit of the epoch (+) is set
+
+This means that if the high epoch bit is set, the recipient of a
+packet should accept packets for this Rx connection from any IP
+address and port number. Conversely, if the high bit is not set,
+the IP and port number must be the same in order for packets to
+be properly recognized as being part of the same connection.
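+
+The identification rule above can be sketched as a lookup-key
+function. This is only an illustration: the Packet type and its
+field names are hypothetical, and the epoch is assumed to be a
+32-bit field whose high bit is 0x80000000.
+
```python
from typing import NamedTuple, Tuple

EPOCH_HIGH_BIT = 0x80000000  # assumed: high bit of a 32-bit epoch field


class Packet(NamedTuple):
    """Hypothetical view of the fields that identify a connection."""
    epoch: int
    connection_id: int
    peer_ip: str
    peer_port: int


def connection_key(pkt: Packet) -> Tuple:
    """Return the key used to look up the connection for this packet."""
    if pkt.epoch & EPOCH_HIGH_BIT:
        # High bit set: packets are accepted from any address and port.
        return (pkt.epoch, pkt.connection_id)
    # High bit clear: the peer address and port are part of the identity.
    return (pkt.epoch, pkt.connection_id, pkt.peer_ip, pkt.peer_port)
```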
+
+Connection ID is chosen by the client that establishes the connection.
+The last two bits of the same 32-bit field are used by Rx to multiplex
+between 4 parallel calls on the same connection. Each one of them is
+called an Rx channel, and therefore the field is denoted "Channel ID".
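+
+As a small sketch of this split, assuming that "the last two bits"
+means the two least-significant bits of the 32-bit field (the
+constant and function names here are illustrative, not taken from
+any implementation):
+
```python
CHANNEL_BITS = 2
CHANNEL_MASK = (1 << CHANNEL_BITS) - 1  # 0x3, selects one of 4 channels


def split_cid(field: int):
    """Split the 32-bit header field into (connection ID, channel number)."""
    connection_id = field & ~CHANNEL_MASK & 0xFFFFFFFF
    channel = field & CHANNEL_MASK
    return connection_id, channel
```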
+
+Call number identifies a particular call within a channel (so there
+are four call numbers associated with an Rx connection). Each new
+call should start with a higher number than the previous call, and
+typically this is just the previous call number + 1. The initial
+call number must be non-zero, since call number zero indicates a
+connection-only Rx packet (see below). The call number is chosen
+by the peer initiating the call. Although only one call can use
+a channel at one time, the call number allows peers to distinguish
+packets on the same channel that belong to different calls.
+
+Sequence numbers are similar to those in TCP, but they count packets
+within a call rather than bytes. Sequence numbers
+always start with 1 at the beginning of each call, and are incremented
+by 1 for each additional packet sent. Retransmissions in Rx are done
+on a packet-by-packet basis, identified by these sequence numbers.
+
+Every outgoing packet associated with a certain connection is stamped
+with a serial number in the serial field, and the serial number is
+incremented by 1 for every packet sent. This is used by the flow
+control mechanisms (described below). The serial number for a
+connection should start out with 1 (i.e., the first packet sent
+should have a serial number of 1.)
+
+Service ID identifies a particular Rx service running on a given
+host/port combination. This is analogous to how UDP port numbers
+allow multiplexing packets to a single IP address. Note that once
+an Rx connection has been created, the service ID may not be changed;
+existing implementations cache the service ID value for a given
+connection, and will ignore service ID values in subsequent packets.
+
+The Checksum field allows for an optional packet checksum. A zero
+checksum field value means that checksums are not being computed.
+An Rx security protocol (identified by the security field, described
+below) may choose to use this field to transport some checksum of
+the packet that is computed and verified by it (for example, rxkad
+uses this field for a cryptographic header checksum). Rx itself
+makes no use of the checksum field.
+
+The status field allows for additional user flags to be transported
+with each packet. These have no significance to the protocol itself.
+These flags are associated with a call rather than an individual
+packet.
+
+The security field specifies the type of security in use on this
+connection. These values don't have a defined mapping in the Rx
+protocol but rather are mapped to specific Rx security types by
+the application using Rx.
+
+An Rx security protocol can use the checksum field as described
+above, and can also modify the packet payload in any way, for
+instance by encrypting the contents or adding headers or trailers
+specific to the security protocol (although the end result must
+be a properly sized packet that Rx will be able to transmit.)
+
+The "Flags" field consists of a number of single-bit flags with
+meanings as follows. The actual bit values are defined below,
+in the ``Protocol Constants'' section.
+
+ * CLIENT-INITIATED
+ This packet originated from an Rx client (as opposed
+ to server). To avoid packet loops, a server should
+ always clear the CLIENT-INITIATED flag on any packets
+ it sends, and discard incoming packets without the
+ CLIENT-INITIATED flag.
+
+ * REQUEST-ACK
+ Sender is requesting acknowledgement of this packet,
+ via an Ack packet response.
+
+ * LAST-PACKET
+ This packet is the last packet in this call from the
+ sender.
+
+ NOTE: some older Rx implementations, which do not
+ support the trailing packet size fields in Rx Ack
+ packets, use the LAST-PACKET flag for computing the
+ MTU. In particular, when a DATA packet with the
+ REQUEST-ACK flag but without the LAST-PACKET flag
+ is received, the MTU is adjusted down to the size
+ of that packet.
+
+ * MORE-PACKETS
+ More packets are going to be following this one. This
+ flag is set on all but the last packet by the sender
+ transmitting a list of packets at once, for possible
+ optimization at the receiver end.
+
+ * SLOW-START-OK
+ In an ack packet, indicates that the sender of this
+ packet supports the slow-start mechanism, described
+ below under ``Flow Control''.
+
+ * JUMBO-PACKET
+ In a data packet, indicates that this packet is part
+ of a jumbogram, and is not the last one. See the
+ ``Jumbograms'' section below for more details.
+
+Packet Types
+============
+
+The "Type" field indicates the contents of this packet. Actual
+values are specified in the ``Protocol Constants'' section.
+This section describes the simpler packet types, and subsequent
+sections cover more complex packet types in more detail.
+
+Certain packet types are connection-only requests (that is, they
+are not associated with an RPC call). A connection-only request
+is indicated by a zero call number. Valid packet types in a
+connection-only context are Abort, Challenge, Response, Debug,
+Version, and the parameter exchange packet types. All other
+packet types can only be used in the context of a call. Of the
+connection-only types, Abort can also be used in a call context.
+
+The payload of the packet following the header depends on the
+Type field, as follows:
+
+ * DATA type (Standard data packet)
+
+ The payload of a data packet is simply the Rx payload,
+ corresponding to the sequence number and call specified
+ in the header. The actual data that is transmitted in
+ Rx data packets is described below.
+
+ The receipt of a data packet by a client implicitly
+ acknowledges that the server has received and processed
+ all the packets that have been transmitted to it as
+ part of this call.
+
+ * ACK type (Acknowledgement of received data)
+
+ An acknowledgement packet provides information about
+ which packets were or were not received by the peer,
+ and other useful parameters. The semantics of these
+ packets are described below in the ``Call Layer''
+ section.
+
+ * BUSY type (Busy response)
+
+ When a client tries to start a new call on a channel
+ which the server still considers active, a busy response
+ is returned. The call and channel number in the packet
+ header indicate which call is being rejected. This packet
+ type has no payload associated with it.
+
+ * ABORT type (Abort packet)
+
+ Indicates that the relevant connection or call (if the
+ call number field is non-zero) has encountered an error
+ and has been terminated. The payload of the packet has
+ a network-byte-order 32-bit user error code.
+
+ * ACKALL type (Acknowledgement of all packets)
+
+ An acknowledge-all packet indicates the obvious: the peer
+ wants to acknowledge the receipt of all packets sent to
+ it. This could be used, for example, when a connection
+ is being closed and the client wants to ensure that no
+ retransmissions are attempted after it exits.
+
+ There is no payload associated with an acknowledge-all
+ packet.
+
+ * CHALLENGE, RESPONSE types (Challenge request/response)
+
+ The payload of the packet is security-layer-specific
+ data, and is used to authenticate an Rx connection.
+
+ Perhaps this should include a reference to some spec
+ on rxkad (or rxkad should just be added to this spec.)
+
+ * DEBUG type (Debug packet)
+
+ Rx supports an optional debugging interface; see the
+ ``Debugging'' section below for more details.
+
+ * PARAMS types (Parameter exchange)
+
+ These types were assigned in AFS 3.2 but never used for
+ anything, and therefore have no protocol significance
+ at this time.
+
+ * VERSION type (Get AFS version)
+
+ If a server receives a packet with a type value of 13, and
+ the client-initiated flag set, it should respond with a
+ 65-byte payload containing a string that identifies the
+ version of AFS software it is running. The response should
+ not have the client-initiated flag set.
+
+ Nothing should respond to a version packet without the
+ client-initiated flag, to avoid infinite packet loops.
+
+Call Layer
+==========
+
+ The call layer provides a reliable data transport over an
+ Rx channel, and is used by the RPC layer to make Rx calls.
+ One of the most important pieces of the call layer is the
+ Rx acknowledgement packet. The acknowledgement packet is
+ used by Rx to determine when retransmissions are needed,
+ as well as to determine the proper transmission / receiving
+ parameters to use (such as the transmit window size and
+ jumbogram length, described in more detail below).
+
+ A new call is established by the client simply sending a
+ data packet to the server on an available channel. Either
+ side can indicate that they have no more data to send by
+ setting the LAST-PACKET flag in their last Rx packet. The
+ call remains open until the upper layer informs Rx that it
+ is done with the call. (The upper layer in this case would
+ most likely be the Rx RPC layer.)
+
+ The structure of an Rx acknowledgement packet is described
+ in the Packet Formats section. We will refer to particular
+ fields of the acknowledgement packet here by names.
+
+ The <Buffer Space> field specifies the number of packets that
+ the sender of the acknowledgement is willing to provide for
+ receiving packets for this call. The sender, presumably,
+ should not send packets beyond the number specified here,
+ without receiving further acknowledgement allowing it.
+
+ The <Max Skew> field indicates the maximum packet skew that
+ the sender of this packet has seen for this call. If a
+ packet is received N packets later than expected (based
+ on the packet's serial number, i.e. if the last received
+ packet's serial number is N higher than this packet's),
+ then it is defined to have a skew of N. This can be used
+ to avoid retransmission because of packet reordering.
+
+ The <First Sequence> number specifies the sequence number of
+ the first packet that is being explicitly acknowledged (either
+ positively or negatively) by this packet. All packets with
+ sequence numbers smaller than this are implicitly acknowledged.
+
+ The <Reserved> field, previously used to indicate the previous
+ received packet, is no longer used. It should be set to zero
+ by the sender and not interpreted by the receiver.
+
+ The <Serial Number> field indicates the serial number of the
+ packet which has triggered this acknowledgement, or zero if there
+ is no such packet (i.e. the ack packet was delayed and should not
+ be used for round-trip time computation). The receiver should
+ note that any transmitted packets with a serial number less than
+ this, which are not acknowledged by this packet, are likely lost
+ or reordered. Thus, these packets should be retransmitted, after
+ a possible delay to allow for packet reordering (as measured by
+ packet skew).
+
+ The trailing fields after the variable-length acknowledgements
+ section are not always 32-bit aligned with respect to the packet,
+ and aren't always present. (Their presence depends on the Rx
+ version of the peer.) The maximum and recommended packet sizes
+ are, respectively, the largest possible packet size that the peer
+ is willing to accept from us, and the size of the packet they
+ would prefer to receive. In absence of these fields, it should
+ be assumed that the maximum allowed packet size is 1444 bytes.
+
+ The receive window size indicates the size of the ACK sender's
+ receive window, in packets. Its use is described below in
+ the "Flow Control" section. If this field is absent, the
+ implementation must assume a maximum window size of 15 packets;
+ older implementations that do not support this trailing field
+ only allow for a window of 15 packets.
+
+ The "Max Packets per Jumbogram" field indicates how many packets
+ the ACK sender is willing to receive in a jumbogram (also
+ described below). All packets in a jumbogram are always of the
+ same size (except the last one), regardless of the maximum and
+ recommended packet sizes described above.
+
+ The <Reason> field specifies a particular type of an ack packet.
+ Valid reason codes are specified in the ``Packet Formats and
+ Protocol Constants'' section; their meanings are as follows:
+
+ REQUESTED
+ Acknowledgement was requested. The peer received
+ a packet from us with the acknowledgement-requested
+ flag set, and is acknowledging it.
+
+ DUPLICATE
+ A duplicate packet was received. The duplicate
+ packet's serial number is in the <Serial> field.
+
+ OUT-OF-SEQUENCE
+ A packet was received out of sequence. The serial
+ number of said packet is in the <Serial> field.
+
+ WINDOW-EXCEEDED
+ A packet was received but exceeded the current
+ receive window, and was dropped.
+
+ NO-SPACE
+ A packet was received, but no buffer space was
+ available and therefore it was dropped.
+
+ PING
+ This is a keep-alive packet, used to verify that
+ the peer is still alive. If the REQUEST-ACK flag
+ in the Rx packet is set, the recipient of this
+ packet should reply with a PING-RESPONSE packet.
+
+ PING-RESPONSE
+ This is a response to a keep-alive ack (ping).
+
+ DELAYED
+ A delayed acknowledgement, usually because a certain
+ amount of time has passed since the receipt of the
+ last packet and there are outstanding unacknowledged
+ packets. Should not be used for RTT computation.
+
+ OTHER
+ Un-delayed general acknowledgement, which does not
+ fall in any of the above categories.
+
+ A peer should never delay the transmission of an ack packet
+ in response to a received packet unless it sets the delayed
+ ack type field. This is because ack packets (except for
+ delayed ones) are used for RTT computation by Rx.
+
+ All acknowledgement packets should have the REQUEST-ACK
+ flag in the Rx header turned off, except for PING type
+ ack packets.
+
+ The <Ack Count> field specifies the number of bytes following
+ in the acknowledgements section. Each of those bytes indicates
+ the acknowledgement status corresponding to a sequence number
+ between firstSequence and firstSequence+ackCount-1, inclusive.
+ There can be up to 255 bytes in the acknowledgements section.
+ Typically the ack count is the receive window size of the
+ ack packet sender, and the individual packet status bytes
+ correspond to the packets in the current receive window.
+ The values in each of those bytes can be as follows:
+
+ 0 Explicit negative acknowledgement: packet with the
+ corresponding sequence number has not been received
+ or has been dropped.
+ 1 Explicit acknowledgement: packet with the corresponding
+ sequence number has been received but not processed by
+ the application yet.
+
+ It's important to note the distinction between packets with
+ sequence numbers before firstSequence, between firstSequence
+ and firstSequence+ackCount-1, and those with sequence numbers
+ of at least firstSequence+ackCount. Those in the first category
+ have been passed up to the application level and the sender
+ (recipient of this ack) can recycle packets with such sequence
+ numbers.
+
+ Packets in the second category are individually acknowledged
+ in the acknowledgements section, either as being queued for
+ the application or not received. The recipient of the ack
+ should keep all packets with sequence numbers in this range,
+ but avoid retransmitting the positively acknowledged ones.
+ Negatively acknowledged packets should be retransmitted.
+ A more detailed explanation of the retransmit strategy is
+ given below.
+
+ Packets in the third category are not acknowledged at all,
+ and the recipient of the ack should assume no knowledge
+ of their state. Since the Rx receive window should not
+ exceed the size of an ack packet, the sender shouldn't
+ have transmitted any packets in this category anyway.
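+
+ The three categories above can be sketched as a function of
+ a sequence number, the <First Sequence> value, and the bytes
+ of the acknowledgements section. The function name and the
+ returned labels are illustrative only.
+
```python
def classify(seq: int, first_sequence: int, acks: bytes) -> str:
    """Classify a sent packet against one received ack packet.

    acks holds the acknowledgements section: 0 = negative, 1 = positive.
    """
    if seq < first_sequence:
        # Implicitly acknowledged: already passed to the application,
        # so the sender may recycle the packet.
        return "delivered"
    if seq < first_sequence + len(acks):
        # Individually acknowledged in the acknowledgements section.
        return "queued" if acks[seq - first_sequence] == 1 else "missing"
    # Beyond the acknowledged range: state is not known from this ack.
    return "unknown"
```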
+
+ * Round-trip time computation
+
+ To determine when packet retransmission is necessary, Rx
+ computes some statistics about the round-trip time between
+ the two hosts: exponentially-decaying averages of the
+ round-trip time and the standard deviation thereof. Each
+ acknowledgement packet which mentions a specific packet in
+ the <Serial> field and is not delayed is used to update the
+ round-trip statistics. First, the round-trip time for this
+ packet (R) is computed as the difference between the arrival
+ time of the ack packet and the time we transmitted the
+ packet with the serial number specified in <Serial>.
+
+ Next, the round-trip time average and standard deviation
+ values are updated. For instance, this algorithm could
+ be used:
+
+ RTTdev = RTTdev * (3/4) + |RTTavg - R| / 4
+ RTTavg = RTTavg * (7/8) + R / 8
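+
+ A minimal sketch of that update, applying the two formulas
+ in the order written (so the deviation uses the average from
+ before this sample). The function name is illustrative, and
+ all values are assumed to share one time unit.
+
```python
def update_rtt(rtt_avg: float, rtt_dev: float, r: float):
    """Fold one round-trip sample r into the decaying statistics."""
    # Deviation first, using the average from before this sample.
    new_dev = rtt_dev * 3 / 4 + abs(rtt_avg - r) / 4
    new_avg = rtt_avg * 7 / 8 + r / 8
    return new_avg, new_dev
```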
+
+ * Packet retransmission
+
+ In order to support reliable data transport, Rx must retransmit
+ packets which are lost in the network. This must not be done
+ too early, otherwise we might retransmit a packet whose first
+ copy is still in transit, thereby wasting bandwidth.
+
+ Rx computes a retransmit timeout value T, and retransmits any
+ packet which hasn't been positively acknowledged since last
+ transmission for at least T seconds. This timeout could be
+ computed as follows from the round-trip statistics above:
+
+ T = RTTavg + 4 * RTTdev + 0.350
+
+ This allows the packet to be up to 4 deviations late and still
+ not be retransmitted. The 350 msec fudge factor is used to
+ compensate for bursty networks, though it is likely becoming
+ less relevant (and accurate) with time.
+
+ A more clever algorithm could take into account the maximum
+ packet skew rate, and improve the retransmission strategy to
+ take into account the likelihood that a given packet has
+ been reordered, and give it extra time before retransmission.
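+
+ The basic timeout formula given earlier in this subsection
+ (not the more clever variant) can be sketched as follows.
+ Times are in seconds; the function names are illustrative.
+
```python
FUDGE_SECONDS = 0.350  # the 350 msec fudge factor from the formula


def retransmit_timeout(rtt_avg: float, rtt_dev: float) -> float:
    """T = RTTavg + 4 * RTTdev + 0.350, all in seconds."""
    return rtt_avg + 4 * rtt_dev + FUDGE_SECONDS


def due_for_retransmit(last_sent: float, now: float,
                       rtt_avg: float, rtt_dev: float) -> bool:
    """True if an unacknowledged packet has waited at least T seconds."""
    return now - last_sent >= retransmit_timeout(rtt_avg, rtt_dev)
```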
+
+ * Keepalive and Timeout
+
+ The upper layer (either the Rx RPC layer or the application)
+ has to specify a timeout, T, to the call layer. If the peer
+ is not heard from within T seconds, the call layer declares
+ the call to be dead and propagates the error to the upper
+ layer.
+
+ In order to determine whether the peer is still alive or not,
+ keepalive requests are used. These take the form of ack PING
+ and PING-RESPONSE packets. When the client has not received
+ any response from the server, either to the original request
+ or the keepalive requests, in T seconds, the call times out.
+
+ The following strategy may be used to determine when to send
+ keepalive requests:
+
+ Compute a keepalive timeout, KT = T/6
+
+ If the call was initiated KT seconds ago, or KT
+ seconds have passed since the last keepalive
+ request transmission, send a keepalive packet.
+
+ This strategy limits the number of transmitted keepalive
+ packets to a fixed number in the case of a dead server,
+ and proportional to the real timeout in case of a slow
+ server. It also allows up to 5 keepalives to be dropped
+ before the server is erroneously declared dead.
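+
+ A sketch of this strategy, with times in seconds. The
+ function and parameter names are illustrative; last_keepalive
+ is None until the first keepalive has been sent.
+
```python
from typing import Optional


def keepalive_interval(call_timeout: float) -> float:
    """KT = T / 6."""
    return call_timeout / 6


def should_send_keepalive(now: float, call_start: float,
                          last_keepalive: Optional[float],
                          call_timeout: float) -> bool:
    """True once KT seconds have passed since the call started or
    since the last keepalive request was transmitted."""
    kt = keepalive_interval(call_timeout)
    reference = call_start if last_keepalive is None else last_keepalive
    return now - reference >= kt


def call_timed_out(now: float, last_heard: float,
                   call_timeout: float) -> bool:
    """True if the peer has not been heard from for T seconds."""
    return now - last_heard >= call_timeout
```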
+
+ * Flow Control
+
+ Every Rx client or server has associated with each Rx call a
+ receive and transmit window. These windows indicate the number
+ of packets that haven't been fully acknowledged (that is, not
+ read by the peer's application) that an Rx sender can have
+ outstanding at any time. A sender's transmit window may
+ never be greater than its peer's receive window for that call.
+ The receive windows are exchanged via the "Receive Window Size"
+ parameter in an Ack packet.
+
+ Rx ``sliding windows'' are similar to those used by TCP, except
+ they measure packets rather than bytes. Also, in TCP the window
+ effectively applies to bytes in flight between the two peers,
+ whereas in Rx the window applies to packets between the user
+ applications. For example, a transmit window of 8 on a certain
+ Rx connection means that at most 8 packets can be transmitted
+ and not yet read by the peer's application at any time. The
+ sequence number of the first packet that hasn't been read by
+ the application is indicated by the First Sequence field of
+ an Ack packet.
+
+ The selection of initial window sizes isn't strictly defined
+ by the Rx protocol, but here are a few things that one might
+ want to consider when choosing initial windows:
+
+ * A useful strategy can be to advertise a small receive
+ window until the application starts reading data, and
+ advertise a larger window afterwards.
+
+ * The transmit window should be initially a conservative
+ small value. Once an Ack packet is received, the peer's
+ advertised receive window can be used to choose a better
+ transmit window.
+
+ Rx uses the slow start, congestion avoidance, and fast recovery
+ algorithms[6]. The algorithms are modified to work in the context
+ of Rx packet-based transmission windows, and are described below.
+
+ These algorithms require two additional variables to be maintained
+ for each active Rx call: a congestion window, cwind, and a slow
+ start threshold, ssthresh.
+
+ Define a "negative ack" as an Ack packet that contains a negative
+ acknowledgement followed by a positive one. Similarly, define a
+ "positive ack" to be any Ack that is not negative. Upon receiving
+ three negative acks for a call in a row since the last congestion
+ avoidance attempt (if any), the Rx protocol enters congestion
+ avoidance for that Rx call.
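+
+ These definitions can be sketched as follows. The names are
+ illustrative, and resetting the counter once the trigger
+ fires is one reading of "since the last congestion avoidance
+ attempt".
+
```python
def is_negative_ack(acks: bytes) -> bool:
    """True if a negative acknowledgement (0) is followed by a
    positive one (1) in the acknowledgements section."""
    seen_nack = False
    for status in acks:
        if status == 0:
            seen_nack = True
        elif status == 1 and seen_nack:
            return True
    return False


class NackTracker:
    """Counts consecutive negative acks for one call."""

    def __init__(self) -> None:
        self.consecutive = 0

    def on_ack(self, acks: bytes) -> bool:
        """Return True when congestion avoidance should be entered."""
        if is_negative_ack(acks):
            self.consecutive += 1
        else:
            self.consecutive = 0  # a positive ack breaks the run
        if self.consecutive >= 3:
            self.consecutive = 0  # assumption: start counting afresh
            return True
        return False
```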
+
+ * Slow start, congestion avoidance, and fast recovery algorithms
+
+ First, the congestion window, cwind, is initialized to 1.
+ The number of unread transmitted packets is now limited not
+ only by the transmission window, but also by the congestion
+ window. The latter limit is a little different: Rx may
+ send up to cwind packets (by sequence number) past the last
+ contiguous positively acknowledged packet. For example,
+ if an Ack packet indicates that packets 1, 2 and 8 were
+ received, and cwind is 2, Rx may transmit packets 3 and 4.
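+
+ The worked example above can be reproduced with a short
+ sketch. It models only the congestion-window limit, not
+ the transmission window, and the function name is
+ illustrative.
+
```python
def allowed_by_cwind(received: set, cwind: int) -> list:
    """Sequence numbers the congestion window permits sending.

    received is the set of positively acknowledged sequence numbers;
    sequence numbers start at 1 within a call.
    """
    last_contiguous = 0
    while last_contiguous + 1 in received:
        last_contiguous += 1
    # Up to cwind packets past the last contiguous acknowledged one.
    return list(range(last_contiguous + 1, last_contiguous + cwind + 1))
```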
+
+ When congestion occurs (indicated by a negative ack or a
+ packet retransmission timeout), Rx enters congestion avoidance
+ and fast recovery. The slow-start threshold, ssthresh, is
+ set to half of the effective transmission window (minimum of
+ cwind and transmit window), but no less than 2 packets.
+
+ If triggered by a negative ack, any negatively acknowledged
+ packets should be retransmitted as soon as possible (i.e.
+ window-permitting).
+
+ If triggered by a retransmission timeout, the congestion
+ window is reset to a single packet.
+
+ When in fast-recovery mode, every additional negative ack
+ packet received causes cwind to be increased by one packet.
+ A positive ack packet causes cwind to be set to ssthresh,
+ and terminates fast recovery. At this point we are back
+ to congestion avoidance, since the cwind is half the original
+ transmission window.
+
+ When packet acknowledgements are received, the congestion
+ window should be increased. If cwind is less than ssthresh,
+ cwind should be increased by 1 for each newly acknowledged
+ packet. If cwind is at least ssthresh, cwind is increased
+ by 1 for each newly received Ack packet.
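+
+ Two of the rules above, the ssthresh computation on
+ congestion and the growth of cwind on acknowledgement,
+ can be sketched as pure functions. Rounding "half" down
+ to a whole packet is an assumption, and the function
+ names are illustrative.
+
```python
def ssthresh_on_congestion(cwind: int, transmit_window: int) -> int:
    """Half of the effective transmission window, at least 2 packets."""
    effective = min(cwind, transmit_window)
    return max(effective // 2, 2)  # assumption: round down


def grow_cwind(cwind: int, ssthresh: int, newly_acked: int) -> int:
    """Return cwind after one Ack packet that newly acknowledges
    newly_acked packets."""
    if cwind < ssthresh:
        # Below the threshold: one step per newly acknowledged packet.
        return cwind + newly_acked
    # At or above the threshold: one step per received Ack packet.
    return cwind + 1
```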
+
+ The size of the receive window should not grow past the size of
+ an Rx ack packet (which can acknowledge up to 255 packets at a
+ time.)
+
+Debugging
+=========
+
+Rx provides for an optional debugging interface, using the Debug AFS
+packet type, allowing remote Rx clients to query an Rx server for
+some Rx protocol statistics. Not all implementations are required
+to implement this interface. Some parts of this interface may also
+be specific to a particular implementation of Rx. In order to prevent
+packet loops, a server should only reply to debug packets with the
+client-initiated flag set.
+
+The payload of a debug request packet is always the same; both of
+the 32-bit quantities are in network byte order:
+
+ 0 1 2 3
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Debug Type |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Debug Index |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+The debug type indicates the kind of debug information being sent
+or requested, and determines the format of the rest of the packet.
+The debug index allows some debug types to export array-like data,
+indexed by this field. The following debug types are defined for
+the Transarc implementation:
+
+ 0x01 Retrieve basic connection statistics
+ 0x02 Get information about some connections
+ 0x03 Get information about all connections
+ 0x04 Get all Rx stats
+ 0x05 Get all peers of this server
+
+The index field in the debug packet indicates which element of the
+debug information the client wants to access, in cases where there
+are multiple entries in question.
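+
+As a sketch, the request payload can be built and parsed with two
+32-bit network-byte-order integers. The constant and function names
+below are illustrative and are not taken from rx/rx.h; only the
+numeric type values come from the list above.
+
```python
import struct

# Debug Type followed by Debug Index, both 32-bit, network byte order.
DEBUG_REQUEST = struct.Struct("!II")

# Hypothetical names for the debug types listed above.
DEBUG_BASIC_STATS = 0x01
DEBUG_SOME_CONNS = 0x02
DEBUG_ALL_CONNS = 0x03
DEBUG_RX_STATS = 0x04
DEBUG_ALL_PEERS = 0x05


def pack_debug_request(debug_type: int, index: int) -> bytes:
    """Build the 8-byte payload of a debug request packet."""
    return DEBUG_REQUEST.pack(debug_type, index)


def unpack_debug_request(payload: bytes):
    """Return (debug type, debug index) from a request payload."""
    return DEBUG_REQUEST.unpack_from(payload)
```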
+
+The responses to each of those debug queries contain the following
+information:
+
+1. Retrieve basic connection stats
+
+ An array of general statistics about packet allocation,
+ server performance, and so on. The first octet in this
+ response represents the debug protocol version being used
+ by the server. See RX_DEBUGI_VERSION* in rx/rx.h.
+
+2, 3. Get information about connections
+
+ Both of these calls return a struct rx_debugConn (see
+ rx/rx.h), indexed by the "index" field.
+
+ The first version of the debug call (type 2) only retrieves
+ information about connections which are deemed interesting,
+ that is, connections which are active, or about to be
+ reaped.
+
+ The end of the list is signaled by a response where the
+ connection ID value is 0xFFFFFFFF.
+
+4. Get Rx stats
+
+ This call returns a struct rx_stats to the client in network
+ byte order, containing various statistics about the state of
+ Rx on the server (see rx/rx.h).
+
+5. Get all Rx peers
+
+ Similar to the connection request above (2, 3) this call
+ returns all the Rx peers of the server (in a network-byte-order
+ struct rx_debugPeer), indexed by the index field in the request.
+ End of list is indicated by a host value of 0xFFFFFFFF. (These
+ are the first 4 octets.)
+
+In response to unknown requests, the server returns 0xFFFFFFF8 in the
+debug type field.
+
+ XXX The response interface should probably be fixed
+ to include a fixed header that indicates whether
+ the request was successfully completed.
+
+Jumbograms
+==========
+
+To be able to transmit more data in a single packet, Rx supports
+``jumbograms'', which are single UDP datagrams containing multiple
+sequential Rx DATA packets. In a jumbogram, all packets except the
+last one must be of a fixed maximal size (1412 bytes). Because all
+the packets in the jumbogram are sequential, only one full header
+is needed. Here is what a jumbogram could look like:
+
+ +-----------+---------------+--------------+---------------+
+ | Rx header | 1412 byte pkt | Short header | 1412 byte pkt | ->
+ +-----------+---------------+--------------+---------------+
+
+ +--------------+- -+-----------------------+
+ -> | Short header | ... | <= 1412 byte last pkt |
+ +--------------+- -+-----------------------+
+
+Every Rx packet in a jumbogram except the first one must be preceded
+by the short Rx header, and all packets except the last one must have
+the JUMBO-PACKET flag set in their respective headers. The number of
+packets in a jumbogram may not exceed the peer's advertised Max Packets
+Per Jumbogram value in the Ack packet.
+
+The maximum number of packets per jumbogram should be assumed to be 1
+(i.e., no jumbograms) unless explicitly specified otherwise by an Ack
+packet. If an Ack packet is received without the packet-per-jumbogram
+field, it might indicate that the peer is now running a version of Rx
+that does not support jumbograms, and therefore no jumbograms should
+be sent until they are explicitly enabled again.
+
+The short header in a jumbogram has the following makeup:
+
+ 0 1
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Flags | Reserved |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Checksum |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+All the packets in the jumbogram have the same Rx header fields
+(from the full Rx header) except for Flags, Checksum, Sequence,
+and Serial. The flags and checksum field for subsequent packets
+are taken from the short header preceding that packet in the
+jumbogram. The sequence and serial numbers are assumed to be
+consecutive, and are incremented by 1 from the first packet in
+the jumbogram (i.e., the full Rx header).
+
+Retransmitted packets should not be sent in a jumbogram.
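+
+A receiver could split a jumbogram along these lines. This is only a
+sketch: it takes the bytes that follow the full Rx header, receives
+the first packet's header fields as arguments, and takes the bit
+value of the JUMBO-PACKET flag as a parameter rather than assuming
+it. The function name is illustrative.
+
```python
import struct

JUMBO_DATA_SIZE = 1412                # size of every packet but the last
SHORT_HEADER = struct.Struct("!BBH")  # Flags, Reserved, Checksum


def split_jumbogram(body: bytes, flags: int, checksum: int,
                    seq: int, serial: int, jumbo_flag: int) -> list:
    """Return (seq, serial, flags, checksum, data) for each packet.

    Raises struct.error if the body is too short for a short header.
    """
    packets = []
    offset = 0
    while flags & jumbo_flag:
        # Not the last packet: fixed-size data, then a short header.
        data = body[offset:offset + JUMBO_DATA_SIZE]
        offset += JUMBO_DATA_SIZE
        packets.append((seq, serial, flags, checksum, data))
        flags, _reserved, checksum = SHORT_HEADER.unpack_from(body, offset)
        offset += SHORT_HEADER.size
        # Sequence and serial numbers are consecutive in a jumbogram.
        seq += 1
        serial += 1
    # Last packet: whatever remains, at most 1412 bytes.
    packets.append((seq, serial, flags, checksum, body[offset:]))
    return packets
```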
+
+RPC Layer
+=========
+
+This section discusses how an RPC call is made using the Rx protocol.
+There are two common ``types'' of Rx calls: simple and streaming.
+These mostly reflect a difference in the upper-level API rather than
+in the Rx protocol. A simple Rx call has a fixed number of input
+variables and a fixed number of output variables. A streaming Rx
+call, in addition to the above, allows the user to send and receive
+arbitrary amounts of data (whose length should be specified as a
+fixed-length argument).
+
+In either case, an Rx call consists of two basic stages: client
+sending the data to the server, and server sending the response
+back to the client. No data can be sent by the client in the
+same call after the server has started sending its response.
+
+Each remote function call associated with a particular Rx service
+(identified by the <IP address; UDP port number; Rx service ID>
+triple, as mentioned above) is assigned a 32-bit integer opcode
+number. To make a simple Rx call, the caller must transmit the
+opcode number followed by the expected arguments for that call over
+an Rx channel using XDR encoding. The callee uses XDR to unmarshal
+the opcode and input arguments, performs the function call
+corresponding to that opcode and arguments, and then uses XDR to
+encode the return values and send them back to the caller. The
+caller then uses XDR to decode the output variables.
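+
+As a rough illustration of the request encoding, the Python sketch
+below marshals an opcode and its arguments, assuming for simplicity
+that every argument is a 32-bit unsigned integer (which XDR encodes
+as 4 bytes in network byte order). The helper names and the opcode
+value 42 are made up for this example; real services define their
+own opcodes and argument types.
+
```python
import struct


def encode_simple_call(opcode, *int_args):
    """XDR-encode a 32-bit opcode followed by 32-bit unsigned arguments."""
    return struct.pack(">%dI" % (1 + len(int_args)), opcode, *int_args)


def decode_simple_call(data, nargs):
    """Callee side: recover the opcode and nargs 32-bit arguments."""
    opcode, *args = struct.unpack_from(">%dI" % (1 + nargs), data)
    return opcode, args


# Hypothetical call: opcode 42 with two integer arguments.
request = encode_simple_call(42, 1, 2)
```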
+
+For streaming calls which send data from the caller to the callee,
+the convention is to include the length of the data to be sent as
+one of the fixed-length arguments, and send the variable-length
+data immediately after the fixed-length portion. For streaming
+calls which receive data, the convention is for the callee to first
+reply with a fixed-length field specifying the number of bytes it's
+about to send, and then send those bytes. Upon completion of the
+streaming part of the call, the output arguments are sent back to
+the caller in fixed-length XDR form, as with simple calls.
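+
+The streaming conventions can be sketched in the same way. The code
+below is only an illustration under several assumptions that this
+document does not fix: all fixed-length arguments are 32-bit unsigned
+integers, the data length is a 32-bit field passed as the last
+fixed-length argument, and the variable-length data is sent without
+padding. All names are hypothetical.
+
```python
import struct


def encode_streaming_send(opcode, fixed_args, data):
    """Caller side: opcode, fixed arguments, data length, then raw data."""
    head = struct.pack(">%dI" % (len(fixed_args) + 2),
                       opcode, *fixed_args, len(data))
    return head + data


def decode_streaming_reply(reply, n_out):
    """Caller side: length field, that many bytes, then n_out outputs."""
    (length,) = struct.unpack_from(">I", reply, 0)
    data = reply[4:4 + length]
    if len(data) != length:
        raise ValueError("reply shorter than its announced length")
    outputs = struct.unpack_from(">%dI" % n_out, reply, 4 + length)
    return data, list(outputs)
```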
+
+Packet Formats and Protocol Constants
+=====================================
+
+ * Rx packet
+
+ Every simple Rx packet has an Rx header, of the form below.
+ All quantities are in network byte order.
+
+  0                   1                   2                   3
+  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |+|                      Connection Epoch                       |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                       Connection ID                       | * |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                          Call Number                          |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                        Sequence Number                        |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                         Serial Number                         |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |     Type      |     Flags     |    Status     |   Security    |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |           Checksum            |          Service ID           |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Payload ....
+ +-+-+-+-+-
+
+ [*] The field marked with * is the Channel ID. The last
+ two bits of the connection ID are used to multiplex
+ between 4 parallel calls.
+
+ [+] The bit marked with + is used to indicate that only
+ the connection ID should be used to identify this
+ connection, and sender host/port should not be used.
+
+ The values for the Flags field are defined as follows:
+
+ 0000 0001 CLIENT-INITIATED
+ 0000 0010 REQUEST-ACK
+ 0000 0100 LAST-PACKET
+ 0000 1000 MORE-PACKETS
+ 0001 0000 - Reserved -
+ 0010 0000 SLOW-START-OK (in Ack packets)
+ 0010 0000 JUMBO-PACKET (in Data packets)
+
+ Commonly, but not necessarily, the following value mappings
+ for the Security field are used:
+
+ 0 No security or encryption
+ 1 bcrypt security, only used in AFS 2.0
+ 2 "krb4" rxkad
+ 3 "krb4" rxkad with encryption (sometimes)
+
+ The following packet type values are defined:
+
+ 1 DATA Standard data packet
+ 2 ACK Acknowledgement of received data
+ 3 BUSY Busy response
+ 4 ABORT Abort packet
+ 5 ACKALL Acknowledgement of all packets
+ 6 CHALLENGE Challenge request
+ 7 RESPONSE Challenge response
+ 8 DEBUG Debug packet
+ 9 PARAMS Exchange of parameters
+ 10 PARAMS Exchange of parameters
+ 11 PARAMS Exchange of parameters
+ 12 PARAMS Exchange of parameters
+ 13 VERSION Get AFS version
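+
+ As an illustration of the header layout above, the following
+ Python sketch unpacks the 28-byte Rx header from a datagram and
+ extracts the two specially marked fields. The names are
+ assumptions made for this example and do not come from any Rx
+ implementation; the field order, widths, and flag values are the
+ ones given in this section.
+
```python
import struct
from collections import namedtuple

# Seven 32-bit words: Epoch, Connection ID, Call Number, Sequence Number,
# Serial Number, then Type/Flags/Status/Security, then Checksum/Service ID.
RX_HEADER = struct.Struct(">IIIIIBBBBHH")

RxHeader = namedtuple(
    "RxHeader",
    "epoch cid call_number seq serial type flags status security "
    "checksum service_id",
)

CID_ONLY_BIT = 0x80000000   # the bit marked [+] (bit 0 of the first word)
CHANNEL_MASK = 0x00000003   # the field marked [*] (last two bits)

FLAG_NAMES = {
    0x01: "CLIENT-INITIATED",
    0x02: "REQUEST-ACK",
    0x04: "LAST-PACKET",
    0x08: "MORE-PACKETS",
    0x20: "SLOW-START-OK/JUMBO-PACKET",
}


def parse_header(datagram):
    """Return (RxHeader, payload) for one received UDP datagram."""
    if len(datagram) < RX_HEADER.size:
        raise ValueError("datagram shorter than the Rx header")
    header = RxHeader._make(RX_HEADER.unpack_from(datagram))
    return header, datagram[RX_HEADER.size:]


def channel_id(header):
    """Channel (0..3) selected by the last two bits of the connection ID."""
    return header.cid & CHANNEL_MASK


def uses_cid_only(header):
    """True if only the connection ID identifies the connection."""
    return bool(header.epoch & CID_ONLY_BIT)


def flag_names(header):
    """Names of the flag bits that are set, lowest bit first."""
    return [name for bit, name in FLAG_NAMES.items() if header.flags & bit]
```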
+
+ * Rx acknowledgement packet
+
+  0                   1                   2                   3
+  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |         Buffer Space          |           Max Skew            |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                        First Sequence                         |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                           Reserved                            |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                            Serial                             |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |    Reason     |   Ack Count   | Acknowledgements ...
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ..
+
+ ...          -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ ...                 Acks        |   Reserved    |   Reserved    |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                      Maximum Packet Size                      |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                    Recommended Packet Size                    |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                      Receive Window Size                      |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                   Max Packets per Jumbogram                   |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ Note that the trailing fields can have arbitrary alignment,
+ determined by the number of individual acks in the packet.
+ There are three reserved octets between the variable acks
+ section and the start of the trailing fields; they also have
+ no particular alignment.
+
+ The valid values for the Reason code are:
+
+ 1 REQUESTED
+ 2 DUPLICATE
+ 3 OUT-OF-SEQUENCE
+ 4 WINDOW-EXCEEDED
+ 5 NO-SPACE
+ 6 PING
+ 7 PING-RESPONSE
+ 8 DELAYED
+ 9 OTHER
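+
+ Because the trailing fields are not aligned, they are easiest to
+ read at computed byte offsets. The Python sketch below is an
+ illustration only, with assumed names: it parses the Ack payload
+ (the bytes after the Rx header), skips the three reserved octets
+ after the acks, treats every trailing field as optional, and
+ falls back to 1 packet per jumbogram when that field is absent.
+ The individual ack octets are returned uninterpreted.
+
```python
import struct

# Buffer Space, Max Skew, First Sequence, Reserved, Serial, Reason, Ack Count
ACK_FIXED = struct.Struct(">HHIIIBB")
RESERVED_OCTETS = 3
TRAILING_FIELDS = (
    "max_packet_size",
    "recommended_packet_size",
    "receive_window_size",
    "max_packets_per_jumbogram",
)


def parse_ack(payload):
    """Parse an Ack packet payload into a dict of its fields."""
    (buffer_space, max_skew, first_seq, _reserved,
     serial, reason, ack_count) = ACK_FIXED.unpack_from(payload)
    acks_end = ACK_FIXED.size + ack_count
    if len(payload) < acks_end:
        raise ValueError("Ack packet shorter than its Ack Count")
    ack = {
        "buffer_space": buffer_space,
        "max_skew": max_skew,
        "first_sequence": first_seq,
        "serial": serial,
        "reason": reason,
        "acks": payload[ACK_FIXED.size:acks_end],
    }
    # Trailing 32-bit fields start right after the reserved octets,
    # at whatever alignment the number of acks produces.
    trailer = payload[acks_end + RESERVED_OCTETS:]
    for i, name in enumerate(TRAILING_FIELDS):
        if len(trailer) >= 4 * (i + 1):
            (ack[name],) = struct.unpack_from(">I", trailer, 4 * i)
    # No field present means the peer has not enabled jumbograms.
    ack.setdefault("max_packets_per_jumbogram", 1)
    return ack
```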
+
+Acknowledgements
+================
+
+Jeffrey Hutzelman <jhutz@cmu.edu> reviewed an early draft of this
+specification, and provided much appreciated feedback on technical
+details as well as document structuring.
+
+Love Hornquist-Astrand <lha@stacken.kth.se> made many corrections
+to this specification, especially regarding backwards-compatibility
+with older Rx implementations.
+
+References
+==========
+
+ [1] /afs/sipb.mit.edu/contrib/doc/AFS/hijacking-afs.ps.gz
+
+ [2] OpenAFS: src/rx/
+
+ [3] /afs/sipb.mit.edu/contrib/doc/AFS/ps/rx-spec.ps
+
+ [4] ftp://ftp.stacken.kth.se/pub/arla/prog-afs/shadow/doc/r.vdoc
+
+ [5] ftp://ftp.stacken.kth.se/pub/arla/prog-afs/shadow/doc/rx.mss
+
+ [6] http://web.mit.edu/rfc/rfc2001.txt
+
+$Id: rx-spec,v 1.22 2002/10/20 06:46:00 kolya Exp $