Drizzle Wiki
STATUS: This is currently a proposed draft as of October 24, 2008

Drizzle Protocol
----------------

The Drizzle protocol works over TCP, UDP, and Unix Domain Sockets
(UDS, also known as IPC sockets), although there are limitations when
using UDP (this is discussed below). In the case of TCP and UDS,
a connection is made, a handshake is performed, and then a command
response loop is started. Socket communication ends when either side
closes the connection or a QUIT command is issued.

TCP and UDS communications will be full duplex. This means that while
the client is sending a command, the server can report an error before
the client has finished sending data. This allows the server to do
preliminary checks (table exists, authentication, ...) before a request
is completely sent, so the client may abort. This will primarily be
used for large requests (INSERTing large BLOBs).

TCP and UDS communications will also allow for pipelining of requests
and concurrent command execution. This means a client does not need
to wait for a command to finish before sending a new command. A later
command may even complete and return a result before an earlier
command. Result packets may also be interleaved, so a client issuing
concurrent commands must be able to parse results concurrently.

UDP sockets are supported to allow small, fast updates for
applications such as statistical gathering. Since UDP does not
guarantee delivery, this method should not be used for applications
that require reliable transport. When using UDP, the client handshake
packet, authentication packet (if needed), and command packet are
bundled into a single UDP packet and sent. This limits the size of
the request being made, and the limit can differ between network
hosts. The absolute limit is 65,507 bytes (the 65,535-byte maximum
IPv4 packet size less 28 bytes for the IPv4 and UDP headers), but
again, the practical limit depends on the network hosts. Responses are
optional when issuing UDP commands, and this preference is specified
in the handshake packet.
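
The 65,507-byte figure follows from simple arithmetic. The Python
sketch below reproduces it and checks whether a bundled request would
fit. The constant and function names are illustrative only and are not
part of the protocol.

```python
# Hypothetical helper names; only the arithmetic comes from the text above.
MAX_IPV4_PACKET = 65535  # largest value of the 16-bit IPv4 total-length field
IPV4_HEADER = 20         # minimum IPv4 header size (no options)
UDP_HEADER = 8           # fixed UDP header size

MAX_UDP_PAYLOAD = MAX_IPV4_PACKET - IPV4_HEADER - UDP_HEADER


def fits_in_udp(request: bytes) -> bool:
    """Return True if the bundled request fits in one IPv4 UDP datagram."""
    return len(request) <= MAX_UDP_PAYLOAD


print(MAX_UDP_PAYLOAD)
```

A request that passes this check may still be dropped or fragmented if
a host along the path enforces a smaller limit.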

All sizes given throughout this document are in bytes. All multi-byte
binary objects, such as lengths and multi-byte bit-fields, are packed
little-endian like MySQL (yes, not big-endian, which is usually used
for network order).
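
As a small illustration of the byte order, the following Python sketch
packs and unpacks a 4-byte integer using the standard struct module.
The helper names are hypothetical.

```python
import struct


def pack_u32_le(value: int) -> bytes:
    """Pack an unsigned 32-bit integer in little-endian byte order."""
    return struct.pack("<I", value)


def unpack_u32_le(data: bytes) -> int:
    """Read the first four bytes as a little-endian unsigned 32-bit integer."""
    return struct.unpack_from("<I", data)[0]


# 500 is 0x000001F4, so the least significant byte (F4) comes first.
print(pack_u32_le(500).hex(" "))
```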


Changes from MySQL protocol
---------------------------

Here is a list of the changes made in Drizzle for those who already
know the internals of the MySQL protocol.

* Length encoding mechanism has changed; there is no longer a 3-byte
  length option.
* Attributes in packets are specified as variable keys and values
  rather than a predefined structure. This allows the protocol to
  stay flexible in the future.
* Packet length removed from packet header. The end of a packet is
  now marked by a special end-of-packet parameter.
* Packet sequence number removed from packet header since this
  duplicates TCP functionality.
* Command identifier added to packet header to track concurrent command
  and result packets on a single connection.
* Pluggable authentication system.
* Pluggable checksum.
* Pluggable compression schemes.


Packet Sequence Overview
------------------------

The sequence of packets for a simple connection and command that
responds with an OK packet:

S: Server Handshake
C: Client Handshake
C: Command
S: OK

Subsequent commands may be issued without the handshake packet
exchange. The sequence of packets for a simple connection and query
command with results:

S: Server Handshake
C: Client Handshake
C: Command
S: Result
S: Fields (multiple packets)
S: EOF
S: Rows (multiple packets)
S: EOF

When authentication is required for a command, the server will ask
for it. For example:

S: Server Handshake
C: Client Handshake
C: Command
S: Authentication Required
C: Authentication Credentials
S: Result
S: Fields (multiple packets)
S: EOF
S: Rows (multiple packets)
S: EOF

The server will use the most recent credential information when
processing subsequent commands. If a client wishes to multiplex
commands on a single connection, it can do so using the Command
Identifiers. Here is an example of how the packets could be ordered,
but this will largely depend on the server's ability to process the
commands concurrently and the processing time for each command.

S: Server Handshake
C: Client Handshake
C: Command (Command ID=1)
C: Command (Command ID=2)
S: Result (Command ID=2)
S: Field (Command ID=2)
S: OK (Command ID=1)
S: Fields (multiple packets with Command ID=2)
S: EOF
S: Rows (multiple packets with Command ID=2)
S: EOF

As you can see, the commands may be executed with results generated
in any order, and the packets containing the results may be interleaved.
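
One way a client might cope with interleaved results is to route each
incoming packet by its command ID. The Python sketch below is only an
illustration: packets are modelled as (command ID, packet kind) tuples
rather than real wire data, the function name is hypothetical, and the
EOF packets are assumed to carry the ID of the command they belong to.

```python
from collections import defaultdict


def demultiplex(packets):
    """Group (command_id, kind) tuples by command ID, keeping arrival order."""
    per_command = defaultdict(list)
    for command_id, kind in packets:
        per_command[command_id].append(kind)
    return dict(per_command)


# Server packets in an interleaved order similar to the sequence above.
stream = [
    (2, "Result"),
    (2, "Field"),
    (1, "OK"),
    (2, "Field"),
    (2, "EOF"),
    (2, "Row"),
    (2, "EOF"),
]

for command_id, kinds in sorted(demultiplex(stream).items()):
    print(command_id, kinds)
```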


Length Encoding
---------------

Some lengths used within the protocol packets are length encoded. This
means the size of the length field will vary between 1 and 9 bytes,
and is determined by the value of the first byte.

0-250 - Actual length
251   - NULL value (only applicable in row results)
252   - Following 2 bytes hold length
253   - Following 8 bytes hold length
254   - Depends on context, usually signifies end-of-file
255   - Depends on context, usually signifies error
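
The table above can be sketched in Python as an encoder and a decoder.
The function names are hypothetical, and treating the context-dependent
markers 254 and 255 as errors in the decoder is an assumption made to
keep the example short.

```python
import struct


def encode_length(n: int) -> bytes:
    """Encode a non-negative integer using the length encoding above."""
    if n < 0:
        raise ValueError("length must be non-negative")
    if n <= 250:
        return bytes([n])                       # value fits in the first byte
    if n <= 0xFFFF:
        return b"\xfc" + struct.pack("<H", n)   # 252, then 2 bytes
    if n <= 0xFFFFFFFFFFFFFFFF:
        return b"\xfd" + struct.pack("<Q", n)   # 253, then 8 bytes
    raise ValueError("length does not fit in 8 bytes")


def decode_length(buf: bytes, offset: int = 0):
    """Return (value, next_offset); value is None for the NULL marker."""
    first = buf[offset]
    if first <= 250:
        return first, offset + 1
    if first == 251:
        return None, offset + 1
    if first == 252:
        return struct.unpack_from("<H", buf, offset + 1)[0], offset + 3
    if first == 253:
        return struct.unpack_from("<Q", buf, offset + 1)[0], offset + 9
    # Assumption: the caller handles 254 and 255 according to context.
    raise ValueError("context-dependent marker %d" % first)


print(encode_length(8192).hex(" "))
```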


Chunked Encoding
----------------

Some parameters within the protocol, such as arguments for query
commands or fields in a result row, may be sent using chunked
lengths. This is very similar to the chunked transfer encoding used
in HTTP. This allows a large piece of data to be sent without knowing
its length up front. It also allows the transfer of a large piece of
data to be aborted gracefully (without having to close the connection)
in the event of an error. To send data with chunked encoding, a series
of length-encoded chunks is sent, one after another, terminated by a
length-encoded value of 0. If the transfer needs to be aborted, a
value of 254 is sent. An example of chunked encoding is:

FC 00 20    (length encoded value, size=8192)
<...8192 bytes of data...>
FC 00 20    (length encoded value, size=8192)
<...8192 bytes of data...>
FC 44 01    (length encoded value, size=324)
<...324 bytes of data...>
00          (length encoded value, size=0, marks end of chunks)
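
A minimal Python sketch of a chunked encoder and decoder follows. The
function names and the 8192-byte default chunk size are assumptions
chosen to match the example, and the decoder simply raises an exception
when it sees the abort marker.

```python
import struct


def encode_length(n: int) -> bytes:
    """Length-encode n (1-byte, 2-byte, or 8-byte form)."""
    if n <= 250:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfc" + struct.pack("<H", n)
    return b"\xfd" + struct.pack("<Q", n)


def encode_chunked(data: bytes, chunk_size: int = 8192) -> bytes:
    """Split data into length-prefixed chunks and add the 0 terminator."""
    out = bytearray()
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        out += encode_length(len(chunk)) + chunk
    out += encode_length(0)  # marks end of chunks
    return bytes(out)


def decode_chunked(buf: bytes, offset: int = 0):
    """Return (data, next_offset); raise if the sender aborted with 254."""
    data = bytearray()
    while True:
        first = buf[offset]
        if first == 254:
            raise ConnectionAbortedError("chunked transfer aborted by sender")
        if first <= 250:
            length, offset = first, offset + 1
        elif first == 252:
            length = struct.unpack_from("<H", buf, offset + 1)[0]
            offset += 3
        elif first == 253:
            length = struct.unpack_from("<Q", buf, offset + 1)[0]
            offset += 9
        else:
            raise ValueError("unexpected marker %d in chunk stream" % first)
        if length == 0:
            return bytes(data), offset
        data += buf[offset:offset + length]
        offset += length


encoded = encode_chunked(b"x" * (8192 + 8192 + 324))
print(len(encoded))
```

A streaming sender would emit each chunk as it becomes available
instead of building the whole buffer first; the buffer is used here
only to keep the sketch self-contained.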


Packets
-------

With the exception of the server handshake packet, a packet consists
of a packet header containing a command ID, followed by a series of
packet parameters. For packets sent from the client to server, the
first parameter is the command type. This is followed by zero or more
parameters that are relevant to the command being issued.  The packet
is completed when a special end-of-packet parameter type is given.


Packet Parameters
-----------------

Packet parameter names are defined in a global namespace, although
not all parameters are relevant for all packet types. Parameters are
enumerated, and the name is specified with a length-encoded value
representing the enumerated name. Each packet parameter may have a
value associated with it, and each parameter defines the size and
format of that value.

"Length-encoded string" means a length-encoded value, followed by a
string of that length.

For plugin parameters such as auth, checksum, and compression,
a server handshake packet will have one parameter for each
supported plugin. In packets sent from the client, only a single
parameter is given to specify the preferred method (if any). A client
may clear a preference by giving a string of length 0.

The possible packet parameters are:

0   END_OF_PARAMETER  - Marks the end of a parameter list.
1   BUILD_NUMBER      - 4-byte integer.
2   BUILD_STRING      - Length-encoded string.
3   CAPABILITIES      - 4-byte bit field.
4   SESSION_ID        - 4-byte integer.
5   STATUS            - 4-byte bit field.
6   AUTH              - Length-encoded string.
7   CHECKSUM          - Length-encoded string.
8   COMPRESSION       - Length-encoded string.
9   COMMAND           - 1-byte command type given in a command
                        packet. Possible commands are:
                        0   HANDSHAKE - Process parameters.
                        1   QUIT      - No arguments.
                        2   DB        - Switch to the database given.
                        3   QUERY     - Run the query given.
                        4   SHUTDOWN  - Shutdown the server.
                        5   PING      - Request a server response.
                        6   AUTH      - Ask server for re-authentication.
10  COMMAND_ARGS      - Chunked-encoded data. There could be multiple
                        packets depending on command type.
11  RESULT            - 1-byte result code.
12  ROWS_EFFECTED     - Length-encoded count of rows affected.
13  ROWS_SCANNED      - Length-encoded count of rows scanned.
14  WARNINGS          - Length-encoded count of warnings encountered.
15  INSERT_ID         - Last insert ID.
16  ERROR_CODE        - 4-byte error code.
17  ERROR_STRING      - Length-encoded string.
18  SQL_STATE         - Length-encoded string.
19  DB_NAME           - Length-encoded string.
20  TABLE_NAME        - Length-encoded string.
21  ORIG_TABLE_NAME   - Length-encoded string.
22  FIELD_NAME        - Length-encoded string.
23  ORIG_FIELD_NAME   - Length-encoded string.
24  FIELD_TYPE        - 4-byte enumerated type.
25  FIELD_TYPE_LENGTH - Length-encoded value.
26  FIELD_FLAGS       - 4-byte bit-field.
27  DEFAULT_VALUE     - Length-encoded string.
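
To illustrate how a parameter and its value might be laid out on the
wire, here is a Python sketch that encodes one 4-byte parameter and one
length-encoded string parameter. The function names are hypothetical,
strings are assumed to be ASCII, and the sketch only handles parameter
numbers and string lengths that fit in a single length-encoded byte
(0-250).

```python
import struct

# Parameter numbers taken from the list above.
END_OF_PARAMETER = 0
BUILD_NUMBER = 1
AUTH = 6


def encode_u32_param(param_id: int, value: int) -> bytes:
    """Parameter number followed by a 4-byte little-endian value."""
    return bytes([param_id]) + struct.pack("<I", value)


def encode_string_param(param_id: int, text: str) -> bytes:
    """Parameter number followed by a length-encoded string."""
    raw = text.encode("ascii")  # assumption: plugin names are ASCII
    if len(raw) > 250:
        raise ValueError("this sketch only handles strings up to 250 bytes")
    return bytes([param_id, len(raw)]) + raw


params = (
    encode_u32_param(BUILD_NUMBER, 500)
    + encode_string_param(AUTH, "scramble")
    + bytes([END_OF_PARAMETER])
)
print(params.hex(" "))
```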


Packet Header
-------------

All communication between the server and client is encapsulated
into application level packets (not to be confused with lower-layer
network packets). A Drizzle communication packet is prefixed with
the following header:

1-9 - Length-encoded command identifier. This is a unique number
      among all other queries currently being executed on the
      connection. The client is responsible for choosing a unique
      number while generating a command packet, and all response
      packets associated with that command must have the same command
      ID. Once a command has been completed, the client may reuse
      the ID.
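
A client needs some way to pick command IDs that are unique among the
commands in flight. The Python sketch below shows one possible
approach, not a required one: it always hands out the smallest unused
ID, which keeps IDs small enough to fit the 1-byte length encoding for
as long as possible. The class name is hypothetical.

```python
class CommandIdAllocator:
    """Track which command IDs are in use on a single connection."""

    def __init__(self):
        self._in_use = set()

    def acquire(self) -> int:
        """Return the smallest ID not used by an outstanding command."""
        command_id = 0
        while command_id in self._in_use:
            command_id += 1
        self._in_use.add(command_id)
        return command_id

    def release(self, command_id: int) -> None:
        """Mark a command as completed so its ID may be reused."""
        self._in_use.discard(command_id)


ids = CommandIdAllocator()
first = ids.acquire()
second = ids.acquire()
ids.release(first)        # the first command completed
third = ids.acquire()     # its ID is handed out again
print(first, second, third)
```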

There is one exception to the packet header, and this is for the
initial server handshake packet sent from the server to the client
on connect. The packet header will consist of a 3-byte length,
followed by a 1-byte sequence number (which will always be 0), and
then a 1-byte protocol version. This is the same format as the MySQL
protocol, and this is being preserved so that the client library can
differentiate between a Drizzle and a MySQL server without having to
be explicitly configured.


Server Handshake
----------------

This is the first packet sent after a connection is established for
TCP and UDS. UDP does not use a server handshake packet. This packet
sends a number of server configuration parameters so that the client
can properly configure itself for communication. The first byte
is the protocol version number (this is currently "1").

Here is an example of a typical server handshake packet:

35 00 00 00      3-byte size=53, 1-byte sequence number=0
01               1-byte protocol version=1
01 F4 01 00 00   Parameter=1, 4-byte server build=500
03 DF EE 0C 00   Parameter=3, 4-byte capabilities=847583
04 0B 00 00 00   Parameter=4, 4-byte session_id=11
05 02 00 00 00   Parameter=5, 4-byte status=2
06 08 73 63 72   Parameter=6, auth plugin="scramble"
   61 6D 62 6C
   65
06 08 6B 65 72   Parameter=6, auth plugin="kerberos"
   62 65 72 6F
   73
07 05 63 72 63   Parameter=7, checksum plugin="crc32"
   33 32
07 03 6D 64 35   Parameter=7, checksum plugin="md5"
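
The following Python sketch parses a packet laid out like the example
above. It builds its own small sample packet rather than reusing the
bytes shown, and it rests on several assumptions: the 3-byte size
counts the bytes after the 4-byte header, strings are ASCII and shorter
than 251 bytes, and only the parameters that appear in a handshake are
recognized. All names are hypothetical.

```python
import struct

U32_PARAMS = {1: "BUILD_NUMBER", 3: "CAPABILITIES", 4: "SESSION_ID", 5: "STATUS"}
STRING_PARAMS = {2: "BUILD_STRING", 6: "AUTH", 7: "CHECKSUM", 8: "COMPRESSION"}


def parse_server_handshake(packet: bytes):
    """Return (protocol_version, sequence, [(name, value), ...])."""
    size = int.from_bytes(packet[0:3], "little")
    sequence = packet[3]
    payload = packet[4:4 + size]
    if len(payload) != size:
        raise ValueError("truncated handshake packet")
    version = payload[0]
    params = []
    pos = 1
    while pos < len(payload):
        param_id = payload[pos]
        pos += 1
        if param_id == 0:  # END_OF_PARAMETER, if the server sends one
            break
        if param_id in U32_PARAMS:
            value = struct.unpack_from("<I", payload, pos)[0]
            pos += 4
            params.append((U32_PARAMS[param_id], value))
        elif param_id in STRING_PARAMS:
            length = payload[pos]  # assumption: 1-byte length (0-250)
            pos += 1
            text = payload[pos:pos + length].decode("ascii")
            pos += length
            params.append((STRING_PARAMS[param_id], text))
        else:
            raise ValueError("unexpected parameter %d" % param_id)
    return version, sequence, params


# Sample: version 1, build 500, auth plugin "scramble", checksum plugin "md5".
sample_payload = (
    b"\x01"
    + b"\x01" + struct.pack("<I", 500)
    + b"\x06\x08scramble"
    + b"\x07\x03md5"
)
sample_packet = len(sample_payload).to_bytes(3, "little") + b"\x00" + sample_payload
print(parse_server_handshake(sample_packet))
```

Repeated parameters are kept as separate list entries because the
server sends one AUTH or CHECKSUM parameter per supported plugin.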


Client Handshake
----------------

The client handshake packet sets connection level preferences based
on the capabilities given by the server handshake packet.

TODO: FINISH. REWORK EXAMPLE

Here is an example of a typical client handshake packet:

14 00            Length encoded size=20, length encoded command id=0
00 F4 01 00 00   Key=0, 4-byte client build=500
01 DD CA 04 00   Key=1, 4-byte capabilities=314077
02 73 63 72 61   Key=2, preferred auth type="scramble"
   6D 62 6C 65
   00

Once the client handshake packet is sent, all options specified in
the returned capabilities are put into effect. This may result in an
SSL handshake being performed, checksums being enabled, and other
possible protocol extensions taking effect. The server sends no
response to the client handshake packet, so the command ID used is
ready for reuse.

For UDP packets, a client handshake packet is included, and contains
a bare-minimum set of configuration parameters so the server can
understand the command being sent.


TODO: GIVE EXAMPLE PACKETS BELOW:

Command
-------

Command packets consist of a required command parameter, followed by
zero or more configuration parameters. These additional parameters
set preferences to be used for just this command, and they override
any connection preferences.
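
Pending official example packets, the Python sketch below shows how a
QUERY command packet could be assembled from the pieces described in
this document: the command ID header, the COMMAND parameter, the
COMMAND_ARGS parameter as a single chunk, and the end-of-parameter
marker. The exact layout is an assumption derived from the parameter
list, not a confirmed wire format. The function name, the UTF-8
encoding of the query text, and the single-chunk limit are likewise
assumptions.

```python
# Numbers taken from the parameter and command lists in this document.
END_OF_PARAMETER = 0
COMMAND = 9
COMMAND_ARGS = 10
QUERY = 3


def build_query_packet(command_id: int, query: str) -> bytes:
    """Assemble a QUERY command packet (assumed layout, short queries only)."""
    if not 0 <= command_id <= 250:
        raise ValueError("this sketch only handles 1-byte command IDs")
    args = query.encode("utf-8")  # assumption: query text is sent as UTF-8
    if not 0 < len(args) <= 250:
        raise ValueError("this sketch only handles one chunk of 1-250 bytes")
    return (
        bytes([command_id])             # packet header: command ID
        + bytes([COMMAND, QUERY])       # parameter 9, 1-byte command type
        + bytes([COMMAND_ARGS])         # parameter 10, chunked data follows
        + bytes([len(args)]) + args     # single chunk
        + b"\x00"                       # 0-length chunk ends the chunks
        + bytes([END_OF_PARAMETER])     # ends the packet
    )


print(build_query_packet(1, "SELECT 1").hex(" "))
```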


Result
------

After receiving a command packet, the server will respond with OK,
Error, or Authentication Required. This value is length-encoded,
where a normal value indicates OK. If the value is 255, this indicates
an error. If the value is 254, this indicates that authentication
is required before the command can be executed. See the following
sections for more details on each.
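
The dispatch on the first value of a result packet can be sketched in
Python as follows. The function name and the returned labels are
hypothetical, and the sketch only looks at the first byte.

```python
def classify_result(first_byte: int) -> str:
    """Map the first byte of a result packet to a response kind."""
    if not 0 <= first_byte <= 255:
        raise ValueError("not a byte value")
    if first_byte == 255:
        return "error"
    if first_byte == 254:
        return "authentication required"
    return "ok"


for value in (0, 3, 254, 255):
    print(value, classify_result(value))
```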


OK
--

If an OK response is non-zero, this represents the number of fields in
the response. After the value, a list of parameters may follow, and the
end is indicated by a byte with value 254. Possible OK parameters are:


Error
-----

If the first byte of a result packet is 255, this signifies an
error. This is followed by a list of parameters, which may be:


Authentication Required
-----------------------

If the first byte of a result packet is 254, authentication is
required. At this point control is passed over to the preferred
authentication plugin, which may send whatever data it wishes over
the socket (and take as many round-trips as required). The client
will also pass control over to the plugin so it may respond in any
way it sees fit. Once the plugin has completed socket communications,
control is passed back to the core protocol library with either success
or failure. Please see the documentation for each authentication plugin
for its protocol specification.


Field
-----

A field packet consists of a number of field parameters such as
DB_NAME, TABLE_NAME, FIELD_NAME, and other field attributes.


Row
---

A row packet consists of a series of chunk-encoded data fields.
The field data may either be in string format or native data type,
depending on connection and command preferences.
