From Wikipedia, the free encyclopedia
Filename extension: .bson
Internet media type: none[1]
Type of format: Data interchange
Extended from: JSON
Standard(s): no RFC yet

BSON (/ˈbsɒn/) is a computer data interchange format used mainly as a data storage and network transfer format in the MongoDB database. It is a binary form for representing simple data structures and associative arrays (called objects or documents in MongoDB). The name "BSON" is based on the term JSON and stands for "Binary JSON".[2]



Data types and syntax

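BSON documents (objects) consist of an ordered list of elements. Each element consists of a field name, a type, and a value. Field names are strings. Types include strings, 32-bit and 64-bit integers, doubles (64-bit IEEE 754 floating point), UTC datetimes, byte arrays (binary data), booleans, null, embedded documents, arrays, regular expressions, and JavaScript code.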

BSON types are nominally a superset of JSON types (JSON does not have a date or a byte array type, for example[3]), with the one exception that BSON has no universal "number" type as JSON does; a number must be stored as a 32-bit integer, a 64-bit integer, or a double.


Compared to JSON, BSON is designed to be efficient both in storage space and scan-speed. Large elements in a BSON document are prefixed with a length field to facilitate scanning. In some cases, BSON will use more space than JSON due to the length prefixes and explicit array indices.[2]
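The trade-off described above can be checked by hand. The following is a minimal sketch, assuming the element layout given in the version 1.0 grammar later in this document (a type byte, a null-terminated key, then the value); the helper names `int32_element` and `document` are hypothetical and exist only for this illustration. It builds the document {"a": [1, 2, 3]} in both formats and then uses the array's length prefix to skip over it without parsing its contents.

```python
import json
import struct


def int32_element(name, value):
    # Type byte "\x10", null-terminated key, little-endian int32 value.
    return b"\x10" + name.encode("utf-8") + b"\x00" + struct.pack("<i", value)


def document(body):
    # The int32 prefix counts itself (4 bytes), the body, and the final "\x00".
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


# A BSON array is a document whose keys are the indices "0", "1", "2", ...
array_doc = document(
    b"".join(int32_element(str(i), v) for i, v in enumerate([1, 2, 3]))
)
bson_bytes = document(b"\x04" + b"a\x00" + array_doc)
json_bytes = json.dumps({"a": [1, 2, 3]}, separators=(",", ":")).encode("utf-8")

print(len(json_bytes), len(bson_bytes))  # 13 34

# Scanning: the array's length prefix tells a reader how far to jump.
array_start = 4 + 1 + 2  # outer length field, type byte, key "a\x00"
(array_len,) = struct.unpack_from("<i", bson_bytes, array_start)
next_pos = array_start + array_len  # position of the outer trailing "\x00"
```

In this small case BSON is larger than compact JSON, because every array element carries a type byte, an explicit index key, and a fixed-width integer. The length prefix is what pays for fast skipping.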


Version 1.0

BSON is a binary format in which zero or more key/value pairs are stored as a single entity. We call this entity a document.

The following grammar specifies version 1.0 of the BSON standard. We've written the grammar using a pseudo-BNF syntax. Valid BSON data is represented by the document non-terminal.

Basic Types

The following basic types are used as terminals in the rest of the grammar. Each type must be serialized in little-endian format.

byte    1 byte (8 bits)
int32   4 bytes (32-bit signed integer)
int64   8 bytes (64-bit signed integer)
double  8 bytes (64-bit IEEE 754 floating point)
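As an illustration of the four basic types, the sketch below serializes one value of each with Python's standard `struct` module. The `<` prefix selects little-endian byte order with standard sizes, which matches the requirement above; the variable names are chosen only for this example.

```python
import struct

byte_val = struct.pack("<B", 1)         # byte:   1 byte
int32_val = struct.pack("<i", 22)       # int32:  4 bytes, signed
int64_val = struct.pack("<q", 1986)     # int64:  8 bytes, signed
double_val = struct.pack("<d", 5.05)    # double: 8 bytes, IEEE 754

print(int32_val)   # b'\x16\x00\x00\x00'
print(int64_val)   # b'\xc2\x07\x00\x00\x00\x00\x00\x00'

# Reading goes the other way with the same format codes.
(decoded,) = struct.unpack("<i", int32_val)
```

The least significant byte comes first, so the int32 value 22 (0x16) is written as 16 00 00 00.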


The following specifies the rest of the BSON grammar. Note that quoted strings represent terminals, and should be interpreted with C semantics (e.g. "\x01" represents the byte 0000 0001). Also note that we use the * operator as shorthand for repetition (e.g. ("\x01"*2) is "\x01\x01"). When used as a unary operator, * means that the repetition can occur 0 or more times.

document  ::= int32 e_list "\x00"             BSON Document
e_list    ::= element e_list                  Sequence of elements
            | ""
element   ::= "\x01" e_name double            Floating point
            | "\x02" e_name string            UTF-8 string
            | "\x03" e_name document          Embedded document
            | "\x04" e_name document          Array
            | "\x05" e_name binary            Binary data
            | "\x06" e_name                   Undefined (deprecated)
            | "\x07" e_name (byte*12)         ObjectId
            | "\x08" e_name "\x00"            Boolean "false"
            | "\x08" e_name "\x01"            Boolean "true"
            | "\x09" e_name int64             UTC datetime
            | "\x0A" e_name                   Null value
            | "\x0B" e_name cstring cstring   Regular expression
            | "\x0C" e_name string (byte*12)  DBPointer (deprecated)
            | "\x0D" e_name string            JavaScript code
            | "\x0E" e_name string            Symbol (deprecated)
            | "\x0F" e_name code_w_s          JavaScript code w/ scope
            | "\x10" e_name int32             32-bit Integer
            | "\x11" e_name int64             Timestamp
            | "\x12" e_name int64             64-bit integer
            | "\xFF" e_name                   Min key
            | "\x7F" e_name                   Max key
e_name    ::= cstring                         Key name
string    ::= int32 (byte*) "\x00"            String
cstring   ::= (byte*) "\x00"                  CString
binary    ::= int32 subtype (byte*)           Binary
subtype   ::= "\x00"                          Binary / Generic
            | "\x01"                          Function
            | "\x02"                          Binary (Old)
            | "\x03"                          UUID (Old)
            | "\x04"                          UUID
            | "\x05"                          MD5
            | "\x80"                          User defined
code_w_s  ::= int32 string document           Code w/ scope
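To show how the grammar reads in practice, here is a minimal decoder sketch for a subset of the element types (double, string, embedded document, array, boolean, null, int32, int64). It is not a complete or official implementation. The function names are hypothetical, and it assumes, following the full specification, that the int32 in `document` counts the whole document including itself and the trailing "\x00", and that the int32 in `string` counts the string bytes plus the trailing "\x00".

```python
import struct


def read_cstring(buf, pos):
    # cstring ::= (byte*) "\x00"
    end = buf.index(b"\x00", pos)
    return buf[pos:end].decode("utf-8"), end + 1


def decode_document(buf, pos=0):
    """Decode the document starting at pos; return (dict, next position)."""
    (size,) = struct.unpack_from("<i", buf, pos)
    end = pos + size
    if buf[end - 1] != 0:
        raise ValueError("document does not end with \\x00")
    pos += 4
    out = {}
    while pos < end - 1:
        type_byte = buf[pos]
        name, pos = read_cstring(buf, pos + 1)
        if type_byte == 0x01:      # double
            (value,) = struct.unpack_from("<d", buf, pos)
            pos += 8
        elif type_byte == 0x02:    # string: int32 length, bytes, "\x00"
            (n,) = struct.unpack_from("<i", buf, pos)
            value = buf[pos + 4:pos + 4 + n - 1].decode("utf-8")
            pos += 4 + n
        elif type_byte == 0x03:    # embedded document
            value, pos = decode_document(buf, pos)
        elif type_byte == 0x04:    # array: a document keyed "0", "1", ...
            sub, pos = decode_document(buf, pos)
            value = list(sub.values())
        elif type_byte == 0x08:    # boolean
            value = buf[pos] == 1
            pos += 1
        elif type_byte == 0x0A:    # null: no value bytes follow the key
            value = None
        elif type_byte == 0x10:    # int32
            (value,) = struct.unpack_from("<i", buf, pos)
            pos += 4
        elif type_byte == 0x12:    # int64
            (value,) = struct.unpack_from("<q", buf, pos)
            pos += 8
        else:
            raise ValueError(f"unsupported element type 0x{type_byte:02x}")
        out[name] = value
    return out, end


doc, consumed = decode_document(b"\x0c\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x00")
print(doc, consumed)  # {'a': 1} 12
```

The decoder mirrors the grammar one production at a time: each branch of the `if` chain corresponds to one alternative of `element`, and recursion handles the `document` non-terminal used by embedded documents and arrays.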


The following are some example documents (in JavaScript / Python style syntax) and their corresponding BSON representations.

{"hello": "world"}  "\x16\x00\x00\x00\x02hello\x00
{"BSON": ["awesome", 5.05, 1986]}  "\x31\x00\x00\x00\x04BSON\x00\x26\x00
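The byte strings for the two example documents can be reproduced with a small encoder. The sketch below is an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, only a subset of types is handled, and the choice to store a Python `int` as int32 when it fits and as int64 otherwise is a decision made for this example.

```python
import struct


def encode_cstring(text):
    return text.encode("utf-8") + b"\x00"


def encode_element(name, value):
    key = encode_cstring(name)
    if isinstance(value, bool):   # test bool first: it is a subclass of int
        return b"\x08" + key + (b"\x01" if value else b"\x00")
    if isinstance(value, float):
        return b"\x01" + key + struct.pack("<d", value)
    if isinstance(value, str):
        data = encode_cstring(value)   # length prefix includes the "\x00"
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    if isinstance(value, dict):
        return b"\x03" + key + encode_document(value)
    if isinstance(value, list):
        # Arrays are documents with the indices as explicit string keys.
        indexed = {str(i): item for i, item in enumerate(value)}
        return b"\x04" + key + encode_document(indexed)
    if value is None:
        return b"\x0A" + key
    if isinstance(value, int):
        if -2**31 <= value < 2**31:
            return b"\x10" + key + struct.pack("<i", value)
        return b"\x12" + key + struct.pack("<q", value)
    raise TypeError(f"unsupported type: {type(value).__name__}")


def encode_document(doc):
    body = b"".join(encode_element(k, v) for k, v in doc.items())
    # Total size = 4-byte length field + body + trailing "\x00".
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


first = encode_document({"hello": "world"})
second = encode_document({"BSON": ["awesome", 5.05, 1986]})
print(first)        # b'\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00'
print(len(second))  # 49
```

The first document is 22 bytes long (0x16), and the second is 49 bytes (0x31) with an embedded array of 38 bytes (0x26), which matches the leading bytes shown in the examples above.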
posted @ 2012-11-29 13:45  Areas