3 stable releases
Uses old Rust 2015
1.0.2 | Jun 8, 2018 |
---|---|
1.0.1 | Jun 7, 2018 |
1.0.0 | Mar 6, 2018 |
DBOR - Dq's Binary Object Representation
DBOR is a serialization format based on CBOR, designed for Rust, and optimized for speed and file size. It uses buffered reading and writing systems when interacting with io streams for maximum efficiency.
Example Usage
(derived from serde_json's tutorial)
Cargo.toml
[dependencies]
serde = "*"
serde_derive = "*"
serde_dbor = "*"
main.rs
extern crate serde;
extern crate serde_dbor;
#[macro_use]
extern crate serde_derive;
use serde_dbor::Error;
#[derive(Serialize, Deserialize)]
struct Person {
    name: String,
    age: u8,
    phones: Vec<String>,
}

fn example<'a>(data: &'a [u8]) -> Result<(), Error> {
    // Parse the data into a Person object.
    let p: Person = serde_dbor::from_slice(data)?;

    // Do things just like with any other Rust data structure.
    println!("Please call {} at the number {}", p.name, p.phones[0]);

    Ok(())
}
Spec
DBOR, just like CBOR, is composed of instruction bytes and additional content bytes. However, in DBOR, every item must be described before its content, so indefinite-length arrays, strings, and maps are not allowed because they would require a termination byte at the end of the item. An instruction byte is split into two sections of 3 bits and 5 bits, respectively. The first 3 bits define the type of the item, and the last 5 are a parameter for that item, which in some cases can be the value of the item itself. For example, an unsigned integer with a value of 21 would be stored as 0x15, or 0b000 10101, because type 0 (0b000) is a uint and the byte has enough space left over to encode the number 21 (0b10101).
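The 3-bit/5-bit split described above can be sketched with plain bit operations. This is only an illustration: the names split_instruction and pack_instruction are hypothetical and are not part of the serde_dbor API.

```rust
/// Number of low bits that hold the parameter.
const PARAM_BITS: u8 = 5;
/// Mask selecting the 5 parameter bits.
const PARAM_MASK: u8 = 0b0001_1111;

/// Splits an instruction byte into (type id, parameter).
/// Hypothetical helper, not taken from the crate.
fn split_instruction(byte: u8) -> (u8, u8) {
    (byte >> PARAM_BITS, byte & PARAM_MASK)
}

/// Packs a 3-bit type id and a 5-bit parameter into one instruction byte.
/// Returns None if either value does not fit in its field.
fn pack_instruction(type_id: u8, param: u8) -> Option<u8> {
    if type_id > 0b111 || param > PARAM_MASK {
        return None;
    }
    Some((type_id << PARAM_BITS) | param)
}
```

With these helpers, split_instruction(0x15) yields type 0 (uint) with parameter 21, matching the example in the paragraph above.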
When an instruction byte indicates that the parameter is of a certain size n, the next n bytes are used for that parameter, and the content of the item described by the instruction byte follows afterwards. For example, a u16 parameter takes up the two bytes immediately after the instruction byte. However, when serializing a u16, it may be shortened into a u8 or into the instruction byte itself. Note also that DBOR stores multi-byte integers and floats in little endian because it makes serialization/deserialization faster on most machines (x86 uses little endian).
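A rough sketch of this shortening for a u16 uint follows. It rests on assumptions: the instruction bytes 0x18 (u8 follows) and 0x19 (u16 follows) are read off the annotated hex dump at the end of this document, and the inline limit of 23 is borrowed from CBOR's convention rather than confirmed by the table below. The function name encode_u16 is hypothetical.

```rust
/// uint whose value follows as a u8 (assumed from the example dump).
const UINT_U8: u8 = 0x18;
/// uint whose value follows as a little-endian u16 (assumed from the example dump).
const UINT_U16: u8 = 0x19;
/// Largest value assumed to fit in the 5 parameter bits (CBOR convention, an assumption here).
const MAX_INLINE: u16 = 23;

/// Encodes a u16 using the shortest of the three forms described above.
fn encode_u16(value: u16) -> Vec<u8> {
    if value <= MAX_INLINE {
        // Type bits are 0b000 for uint, so the instruction byte equals the value.
        vec![value as u8]
    } else if value <= u16::from(u8::MAX) {
        vec![UINT_U8, value as u8]
    } else {
        // Multi-byte integers are stored in little endian.
        let [lo, hi] = value.to_le_bytes();
        vec![UINT_U16, lo, hi]
    }
}
```

Under these assumptions, 21 encodes to the single byte 0x15, 0x27 to 18 27, and 0x1234 to 19 34 12.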
Instruction Bytes
Type ID | Encoded Type | Parameter Descriptions |
---|---|---|
0b000 (0) | uint | |
0b001 (1) | int | |
0b010 (2) | misc | |
0b011 (3) | variant (enum) | |
0b100 (4) | seq (array/tuple/struct) | |
0b101 (5) | bytes (string/byte array) | |
0b110 (6) | map | |
0b111 (7) | reserved | |
Named Variant Byte
0-247 - name length of 0-247
248 - name length as u8
249 - name length as u16
250 - name length as u32
251 - name length as u64 (only on 64-bit machines)
252-255 - reserved
Note: serialization using named variants isn't currently implemented, but deserialization is.
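The ranges listed for the named variant byte map naturally onto a match expression. The sketch below is illustrative only; the NameLength enum and classify_name_length function are hypothetical names, not the crate's internals.

```rust
/// How the length of a variant name is stored, per the list above.
/// Hypothetical type, not taken from the crate.
#[derive(Debug, PartialEq)]
enum NameLength {
    /// The byte itself is the length (0-247).
    Inline(u8),
    /// The length follows as an unsigned integer of this many bytes.
    Follows { bytes: usize },
    /// 252-255 are reserved.
    Reserved,
}

/// Classifies a named variant byte.
fn classify_name_length(byte: u8) -> NameLength {
    match byte {
        0..=247 => NameLength::Inline(byte),
        248 => NameLength::Follows { bytes: 1 }, // u8
        249 => NameLength::Follows { bytes: 2 }, // u16
        250 => NameLength::Follows { bytes: 4 }, // u32
        251 => NameLength::Follows { bytes: 8 }, // u64, only on 64-bit machines
        252..=255 => NameLength::Reserved,
    }
}
```

A decoder would read the indicated number of bytes for the length, then that many bytes for the variant name.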
Example Data
Rust Code
struct Data {
    some_text: String,
    a_small_number: u64,
    a_byte: u8,
    some_important_numbers: Vec<u16>,
}

let data = Data {
    // A string literal is a &str, so convert it to an owned String.
    some_text: "Hello world!".to_string(),
    a_small_number: 0x04,
    a_byte: 0x27,
    some_important_numbers: vec![
        0x1234,
        0x6789,
        0xabcd,
    ],
};
Annotated Hex Dump of DBOR
84 # Seq(4)
ac # Bytes(12)
48 65 6c 6c 6f 20...
77 6f 72 6c 64 21 # "Hello world!"
04 # uint(4)
18 # u8
27 # 0x27
83 # Seq(3)
19 # u16
34 12 # 0x1234
19 # u16
89 67 # 0x6789
19 # u16
cd ab # 0xabcd
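To make the dump easier to follow, the sketch below rebuilds the same 27 bytes by hand, without using serde_dbor at all. The function encode_example is hypothetical, and the instruction bytes are copied from the annotations above rather than derived from the crate.

```rust
/// Rebuilds the annotated dump byte by byte (illustrative only).
fn encode_example() -> Vec<u8> {
    let mut out = Vec::new();

    // Seq with 4 fields: type 0b100, parameter 4.
    out.push(0x84);

    // Bytes with the length (12) stored in the parameter bits: type 0b101.
    let text = "Hello world!";
    out.push(0b101_00000 | text.len() as u8);
    out.extend_from_slice(text.as_bytes());

    // a_small_number = 4 fits in the instruction byte itself.
    out.push(0x04);

    // a_byte = 0x27 is written as a u8 after the instruction byte.
    out.extend_from_slice(&[0x18, 0x27]);

    // Seq with 3 elements, each a u16 in little endian.
    out.push(0x83);
    for n in &[0x1234u16, 0x6789, 0xabcd] {
        out.push(0x19);
        out.extend_from_slice(&n.to_le_bytes());
    }

    out
}
```

Note that the u64 field costs a single byte because its value is small, while the u8 field costs two because 0x27 is written as a separate byte after its instruction byte.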