OVSDB(5) | Open vSwitch | OVSDB(5) |
ovsdb - Open vSwitch Database (File Formats)
OVSDB, the Open vSwitch Database, is a database system whose network protocol is specified by RFC 7047. The RFC does not specify an on-disk storage format. The OVSDB implementation in Open vSwitch implements two storage formats: one for standalone (and active-backup) databases, and the other for clustered databases. This manpage documents both of these formats.
Most users do not need to be concerned with this specification. Instead, to manipulate OVSDB files, refer to ovsdb-tool(1). For an introduction to OVSDB as a whole, read ovsdb(7).
OVSDB files explicitly record changes that are implied by the database schema. For example, the OVSDB “garbage collection” feature means that when a client removes the last reference to a garbage-collected row, the database server automatically removes that row. The database file explicitly records the deletion of the garbage-collected row, so that the reader does not need to infer it.
OVSDB files do not include the values of ephemeral columns.
Standalone and clustered database files share the common structure described here. They are text files encoded in UTF-8 with LF (U+000A) line endings, organized as an append-only series of records. Each record consists of two lines of text.
The first line in each record has the format OVSDB <magic> <length> <hash>, where <magic> is JSON for standalone databases or CLUSTER for clustered databases, <length> is a positive decimal integer, and <hash> is a SHA-1 checksum expressed as 40 hexadecimal digits. Words in the first line must be separated by exactly one space.
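As an illustration of the first-line format, the following Python sketch parses a record header. The function name `parse_header` and the choice to accept both uppercase and lowercase hexadecimal digits are assumptions made for this example, not part of the specification.

```python
import re

# "OVSDB", a magic word, a decimal length, and 40 hex digits,
# separated by exactly one space each.
HEADER_RE = re.compile(r"OVSDB (JSON|CLUSTER) ([0-9]+) ([0-9a-fA-F]{40})")


def parse_header(line: str):
    """Return (magic, length, sha1_hex) for a record's first line.

    Raises ValueError if the line does not follow the documented format.
    """
    if line.endswith("\n"):
        line = line[:-1]  # drop the terminating LF, if the caller kept it
    match = HEADER_RE.fullmatch(line)
    if match is None:
        raise ValueError(f"bad record header: {line!r}")
    length = int(match.group(2))
    if length <= 0:
        raise ValueError("length must be a positive integer")
    return match.group(1), length, match.group(3).lower()
```

Using `fullmatch` rejects headers with doubled spaces or trailing words, which mirrors the "exactly one space" rule.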
The second line must be exactly <length> bytes long (including the LF) and its SHA-1 checksum (including the LF) must match <hash> exactly. The line’s contents must be a valid JSON object as specified by RFC 4627. Strings in the JSON object must be valid UTF-8. To ensure that the second line is exactly one line of text, the OVSDB implementation expresses any LF characters within a JSON string as \n. For the same reason, and to save space, the OVSDB implementation does not “pretty print” the JSON object with spaces and LFs. (The OVSDB implementation tolerates LFs when reading an OVSDB database file, as long as <length> and <hash> are correct.)
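The two-line record structure can be sketched in Python as follows. This is a minimal illustration, not the Open vSwitch implementation: `encode_record` and `read_records` are hypothetical helper names, and the reader does only the checks described above (length, SHA-1, and that the body is a JSON object).

```python
import hashlib
import json


def encode_record(obj: dict, magic: str = "JSON") -> bytes:
    """Serialize obj as one record: a header line plus a one-line JSON body."""
    # Compact separators avoid pretty-printing; json.dumps escapes any LF
    # inside a string as \n, so the body stays on a single line.
    body = json.dumps(obj, separators=(",", ":")).encode("utf-8") + b"\n"
    digest = hashlib.sha1(body).hexdigest()  # checksum covers the LF too
    return f"OVSDB {magic} {len(body)} {digest}\n".encode("ascii") + body


def read_records(stream):
    """Yield (magic, json_object) for each record in a binary stream."""
    while True:
        header = stream.readline()
        if not header:
            return  # clean end of file
        words = header.rstrip(b"\n").split(b" ")
        if len(words) != 4 or words[0] != b"OVSDB":
            raise ValueError(f"bad record header: {header!r}")
        length = int(words[2])
        # Read by byte count rather than by line, so a body that contains
        # LFs is still accepted when length and hash are correct.
        body = stream.read(length)
        if len(body) != length:
            raise ValueError("truncated record")
        if hashlib.sha1(body).hexdigest() != words[3].decode("ascii").lower():
            raise ValueError("SHA-1 mismatch")
        obj = json.loads(body)
        if not isinstance(obj, dict):
            raise ValueError("record body is not a JSON object")
        yield words[1].decode("ascii"), obj
```

Because the file is append-only, a writer under these assumptions would simply append the bytes returned by `encode_record` to the end of the file.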
We use notation from RFC 7047 here to describe the JSON data in records. In addition to the notation defined there, we add the following:
The first record in a standalone database contains the JSON schema for the database, as specified in RFC 7047. Only this record is mandatory (a standalone file that contains only a schema represents an empty database).
The second and subsequent records in a standalone database are transaction records. Each record may have the following optional special members, which do not have any semantics but are often useful to administrators looking through a database log with ovsdb-tool show-log:
OVSDB always writes a _date member.
OVSDB only writes a _comment member if it would be a nonempty string.
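The two rules above can be sketched in Python. The helper name `make_txn_record` is hypothetical, and the representation of `_date` as an integer number of milliseconds since the Unix epoch is an assumption of this sketch.

```python
import time


def make_txn_record(table_txns: dict, comment: str = "", now_ms=None) -> dict:
    """Build a transaction record from a {table name: table-txn} mapping."""
    record = dict(table_txns)
    # _date is always written.  ASSUMPTION: milliseconds since the epoch.
    record["_date"] = int(time.time() * 1000) if now_ms is None else now_ms
    # _comment is written only when it would be a nonempty string.
    if comment:
        record["_comment"] = comment
    return record
```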
Each of these records also has one or more additional members, each of which maps from the name of a database table to a <table-txn>:
For new rows, the OVSDB implementation omits columns whose values have the default values for their types defined in RFC 7047 section 5.2.1; for modified rows, the OVSDB implementation omits columns whose values are unchanged.
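The column-omission rule might be sketched as below. Both function names are hypothetical, and the caller is assumed to supply each column's default value in its JSON encoding (for example `""` for a string column or `["set", []]` for an empty set), derived from the schema according to RFC 7047.

```python
_MISSING = object()  # sentinel: column had no previous value


def new_row_columns(row: dict, defaults: dict) -> dict:
    """Columns to log for a new row: drop those equal to their default."""
    return {
        col: val
        for col, val in row.items()
        if col not in defaults or val != defaults[col]
    }


def modified_row_columns(old: dict, new: dict) -> dict:
    """Columns to log for a modified row: drop those that did not change."""
    return {
        col: val
        for col, val in new.items()
        if old.get(col, _MISSING) != val
    }
```

A reader therefore has to apply the inverse rule: a column missing from a new row takes its type's default value, and a column missing from a modified row keeps its previous value.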
The clustered format has the following additional notation:
When a schema is present, the transaction record is relative to an empty database. That is, a schema change effectively resets the database to empty and the transaction record represents the full database contents. This allows readers to be ignorant of the full semantics of schema change.
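The reset-on-schema rule lets a reader replay a log with very little logic, as the following Python sketch suggests. It rests on assumptions that go beyond the text above: each change is taken to be a (schema, transaction) pair in which either part may be absent, and a table transaction is taken to map a row UUID either to null (row deleted) or to an object of column values. The name `apply_change` is hypothetical.

```python
SPECIAL_MEMBERS = {"_date", "_comment"}  # carry no database contents


def apply_change(db: dict, schema, txn) -> dict:
    """Apply one change to db, a {table: {row uuid: {column: value}}} map.

    Returns the resulting database; db is updated in place unless a
    schema is present, in which case a fresh database is built.
    """
    if schema is not None:
        # A schema change resets the database to empty; the accompanying
        # transaction then describes the full contents.
        db = {}
    for table, rows in (txn or {}).items():
        if table in SPECIAL_MEMBERS:
            continue
        table_rows = db.setdefault(table, {})
        for uuid, columns in rows.items():
            if columns is None:
                table_rows.pop(uuid, None)  # ASSUMPTION: null deletes the row
            else:
                table_rows.setdefault(uuid, {}).update(columns)
    return db
```

Note that the reader never has to convert existing rows from the old schema to the new one, which is the point of the rule.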
The first record in a clustered database contains the following members, all of which are required, except prev_election_timer:
The second and subsequent records, if present, in a clustered database represent changes to the database, to the cluster state, or both. There are several types of these records. The most important types of records directly represent persistent state described in the Raft specification:
The following additional types of records aid debugging and troubleshooting, but they do not affect correctness.
The table below identifies the members that each type of record contains. “yes” indicates that a member is required, “?” that it is optional, blank that it is forbidden, and [1] that data and eid must be either both present or both absent.
member | Entry | Term | Vote | Leader | Commit Index | Note |
comment | ? | ? | ? | ? | ? | ? |
term | yes | yes | yes | yes | | |
index | yes | | | | | |
servers | ? | | | | | |
election_timer | ? | | | | | |
data | [1] | | | | | |
eid | [1] | | | | | |
vote | | | yes | | | |
leader | | | | yes | | |
commit_index | | | | | yes | |
note | | | | | | yes |
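The membership rules in the table can be expressed as a small validator. This Python sketch is illustrative only: `check_members` is a hypothetical helper, and it assumes that the vote, leader, commit_index, and note members each belong to the record type of the same name.

```python
# Required members per record type, transcribed from the table.
REQUIRED = {
    "Entry": {"term", "index"},
    "Term": {"term"},
    "Vote": {"term", "vote"},
    "Leader": {"term", "leader"},
    "Commit Index": {"commit_index"},
    "Note": {"note"},
}
# Optional members beyond "comment", which every record type allows.
OPTIONAL = {
    "Entry": {"servers", "election_timer", "data", "eid"},
}


def check_members(record_type: str, record: dict) -> list:
    """Return a list of problems; an empty list means the record conforms."""
    required = REQUIRED[record_type]
    allowed = required | OPTIONAL.get(record_type, set()) | {"comment"}
    present = set(record)
    problems = [f"missing required member: {m}" for m in sorted(required - present)]
    problems += [f"forbidden member: {m}" for m in sorted(present - allowed)]
    # Footnote [1]: data and eid must be both present or both absent.
    if record_type == "Entry" and ("data" in present) != ("eid" in present):
        problems.append("data and eid must be both present or both absent")
    return problems
```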
The members are:
In addition to the general format for a clustered database, there is also a special case for a database file created by ovsdb-tool join-cluster. Such a file contains exactly one record, which conveys the information passed to the join-cluster command. It has the following members:
When the server successfully joins the cluster, the database file is replaced by one described in Clustered Format.
The Open vSwitch Development Community
2023, The Open vSwitch Development Community
April 11, 2023 | 2.15 |