Variant concepts

Concepts about the variant data type and object

Variant Objects

A variant object is a JSON object containing key information about the variant, including its value, type, value encoding, and storage encoding. It has up to five properties: "schema", "value", "valueEncoding", "type", and "storageEncoding".

The variant object requires the "schema", "value", and "type" properties. 

  • The value of the "schema" property is always set to "jsonaction.org/schemas/variantObject", which uniquely identifies a JSON object as a variant object. 

  • When sending a variant object to the server, the server converts the value in the "value" property to the type specified in the "type" property and stores the converted value. 

  • When receiving a variant object from the server, the "type" property tells the application the underlying type of the variant.

Example variant object

{
  "schema": "jsonaction.org/schemas/variantObject",
  "value": "123", 
  "valueEncoding": ["json"],
  "type": "integer",
  "storageEncoding": ["json"]
}
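
The required and optional properties described above can be checked mechanically. The following Python sketch is only an illustration: the function name check_variant_object and its messages are hypothetical and are not part of any FairCom API.

```python
VARIANT_SCHEMA = "jsonaction.org/schemas/variantObject"
REQUIRED = ("schema", "value", "type")
OPTIONAL = ("valueEncoding", "storageEncoding")


def check_variant_object(obj):
    """Return a list of problems; an empty list means no problem was found.

    Hypothetical helper for illustration, not a FairCom function.
    """
    if not isinstance(obj, dict):
        return ["not a JSON object"]
    problems = []
    for name in REQUIRED:
        if name not in obj:
            problems.append(f"missing required property: {name}")
    if "schema" in obj and obj["schema"] != VARIANT_SCHEMA:
        problems.append("unexpected schema value")
    for name in OPTIONAL:
        steps = obj.get(name)
        # An omitted property, null, and [] all mean "no steps".
        if steps is not None and not isinstance(steps, list):
            problems.append(f"{name} must be an array or null")
    if "type" in obj:
        type_ref = obj["type"]
        # "type" is a type name (string) or a type ID (integer).
        if isinstance(type_ref, bool) or not isinstance(type_ref, (str, int)):
            problems.append("type must be a type name or a type ID")
    return problems


example = {
    "schema": VARIANT_SCHEMA,
    "value": "123",
    "valueEncoding": ["json"],
    "type": "integer",
    "storageEncoding": ["json"],
}
print(check_variant_object(example))
```

The check is deliberately shallow: it does not look up the type in the variant_type table or interpret the encoding steps.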
Table 1. Internal structure of a variant field

| Variant Length | Data Type | Variant Value  |
|----------------|-----------|----------------|
| 4 bytes        | 4 bytes   | Varying length |


  • The first 4 bytes are an integer, the variant length, which is the combined size of the data type and the variant value. 

    • The size of the variant value is the variant length minus the 4 bytes used for the data type.

  • The second 4 bytes are an integer representing the data type of the variant. In JSON, its property name is "type". 

    • Each integer value represents a different data type.

    • A data type defines the behavior, meaning, purpose and algorithmic processing of the data in the field.

    • The FairCom database contains a new table named variant_type that contains one row for each supported data type.

    • This table contains metadata about each data type so that applications can understand the data type – especially user-defined data types and their schemas.

    • The first 1,048,575 (0x0FFFFF in hex) types are reserved for FairCom's built-in SQL types, such as LVARCHAR, INTEGER, TIMESTAMP, and industry standard types that FairCom chooses to support, such as png, xml, html, csv, etc.

    • The remaining ~4 billion types are available for FairCom customers to create their own types, such as a JSON person object validated by a specific JSON schema.
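
The layout in Table 1 can be sketched with Python's struct module. This is a minimal illustration under stated assumptions: the text does not give the byte order or signedness of the two 4-byte integers, so little-endian unsigned values are assumed here, and the function names are hypothetical. Treating every ID up to 0x0FFFFF as built-in is also an assumption about where the reserved range starts.

```python
import struct

# Assumption: two little-endian unsigned 32-bit integers (length, type ID).
HEADER = struct.Struct("<II")
BUILTIN_MAX_TYPE_ID = 0x0FFFFF  # upper bound of the reserved range in the text


def pack_variant(type_id: int, value: bytes) -> bytes:
    """Build the field bytes: variant length, data type, then the value."""
    # The variant length covers the 4-byte data type plus the value.
    length = 4 + len(value)
    return HEADER.pack(length, type_id) + value


def unpack_variant(field: bytes):
    """Split field bytes back into (type_id, value)."""
    length, type_id = HEADER.unpack_from(field, 0)
    value_size = length - 4  # variant length minus the data type bytes
    if value_size < 0 or len(field) < HEADER.size + value_size:
        raise ValueError("variant length does not match the field size")
    value = field[HEADER.size:HEADER.size + value_size]
    return type_id, value


def is_builtin_type(type_id: int) -> bool:
    """True if the ID falls in the range reserved for built-in types."""
    return 0 <= type_id <= BUILTIN_MAX_TYPE_ID


field = pack_variant(5, b"abc")
print(len(field), unpack_variant(field), is_builtin_type(5))
```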

The variant_type table is a physical file that contains information about each variant type. It has many fields, but the first three fields are the most important.

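Table 2. Internal structure of the variant_type table

| data_type_id (primary key) | data_type_name (natural key part 1) | storage_encoding (natural key part 2) | json_schema (optional validation technique for the VARIANT value) |
|----------------------------|-------------------------------------|---------------------------------------|-------------------------------------------------------------------|
| 4-byte integer             | 64-byte integer                     | 256-byte VARCHAR                      | 2 GB JSON                                                         |
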

  • The data_type_id field is the primary key. 

    • There is one record in the table for each unique VARIANT Data Type value. 

    • Each record contains metadata about a specific VARIANT Data Type.

  • The data_type_name field is the first part of the table's natural key. 

    • It is the human-readable name of the variant type, such as "JSON" or "JPG".

    • It corresponds to the "type" property in the JSON serialization of a VARIANT, such as "type": "BMP".

  • The storage_encoding field is the second part of the table's natural key. 

    • It is a JSON array containing a sequence of subtypes that make up the VARIANT type, such as ["json","cbor"].

    • It corresponds to the "storageEncoding" property in the JSON serialization of a VARIANT, such as "storageEncoding": ["json","cbor"].

  • The json_schema field is optional. It only applies to VARIANT values that are encoded using JSON. If a schema is stored in this field, the server uses it to validate the JSON being stored in the VARIANT.

The table has two indexes:  

  1. data_type_id

  2. data_type_name + storage_encoding

Each unique combination of "type" and "storageEncoding" represents a unique type in the variant_type table. Adding a new combination of these two properties will cause another type to be added to the variant_type table. This is not an issue because the variant_type table can hold 2 billion types. 

You can reference a variant's type in two ways: by type ID or by type name plus storage encoding.

When you reference a "type" by ID number, you are referencing a unique type that is a combination of "type" and "storageEncoding".

In the JSON serialization of a VARIANT, either assign an ID number to "type" and leave out "storageEncoding", or assign a string name to "type" and an array of encoding steps to "storageEncoding".
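
The two ways of referencing a type can be modeled with a small in-memory stand-in for the variant_type table and its two indexes. The Python class below is a hypothetical sketch, not FairCom code. It assumes names are matched exactly, including case, and that new IDs are handed out sequentially from a caller-supplied starting value.

```python
class VariantTypeRegistry:
    """Toy model of the variant_type table with its two indexes."""

    def __init__(self, first_id: int):
        self._by_id = {}    # index 1: data_type_id
        self._by_name = {}  # index 2: data_type_name + storage_encoding
        self._next_id = first_id

    def register(self, type_name, storage_encoding=()):
        """Return the ID for this combination, adding a row if it is new."""
        key = (type_name, tuple(storage_encoding))
        if key in self._by_name:
            return self._by_name[key]
        type_id = self._next_id
        self._next_id += 1
        self._by_id[type_id] = key
        self._by_name[key] = type_id
        return type_id

    def resolve(self, variant: dict) -> int:
        """Resolve a variant object's type reference to a type ID."""
        type_ref = variant["type"]
        if isinstance(type_ref, int):
            # An ID already identifies the name plus storage encoding.
            if type_ref not in self._by_id:
                raise KeyError(type_ref)
            return type_ref
        # An omitted, null, or empty "storageEncoding" means no steps.
        key = (type_ref, tuple(variant.get("storageEncoding") or ()))
        return self._by_name[key]


registry = VariantTypeRegistry(first_id=0x100000)
cbor_id = registry.register("personV2", ["json", "cbor"])
json_id = registry.register("personV2", ["json"])
print(cbor_id != json_id)
```

Registering the same name with a different storage encoding yields a second ID, which mirrors the statement that each unique combination is its own type.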

Table 3. Variant data structure

| Property        | Description | Default | Type | Limits (inclusive) |
|-----------------|-------------|---------|------|--------------------|
| value           | contains the value to be stored in the variant field | Required - No default value | JSON value | |
| valueEncoding   | specifies how the "value" property is encoded | [] | array | |
| storageEncoding | specifies additional steps for optimizing the value that will be stored in a variant field, such as compressing the value or converting it to a more efficient format | [] | array | |
| type            | specifies the data type of the "value" property | Required - No default value | enum | "integer", "string" |

The "value" property is a required property that contains the value to be stored in the variant field. It is always a JSON value, which may be a JSON string, number, boolean, object, or array.

The example below shows a JSON string containing a number being stored in the variant as a big integer.

{
  "schema": "jsonaction.org/schemas/variantObject",
  "value": "1234567890123456789", 
  "type": "bigint"
}

Note

In the physical record, the actual value stored inside a variant field is always a binary value. The "type" property defines the binary format of the binary data. In the example above, the data is stored as a binary big integer.

The "valueEncoding" property specifies how the sender encodes the "value" property so the receiver can decode it. For example, when an application calls the "insertRecords" or "updateRecords" action to insert or update a large variant value, it might compress the value to save network bandwidth: the application zips the value, Base64-encodes the result, and puts it into the "value" property. It then sets the "valueEncoding" property to ["base64", "zip"] to tell the FairCom server that the data is Base64-encoded and zipped, so the server can Base64-decode and then unzip it to recreate the variant's binary value. Conversely, when an application requests a variant value from the server, it can use "valueEncoding": ["base64", "zip"] to request that the server zip and Base64-encode the value before sending it.

The "valueEncoding" property is optional. It specifies how the sender encoded the variant value before storing it in the "value" property. It may contain zero or more steps. Each step is specified in the order needed by the receiver to convert the "value" property to the variant's binary value. The type of the variant's binary value is specified by the  "type" property.

When there are no steps in the "valueEncoding" property, it must be omitted, set to [], or set to null.

When an application wants to use the "insertRecords" or "updateRecords" action to send a variant value to a FairCom server, it may take the variant's value and encode it zero or more times before writing the final encoded value to the variant's "value" property. The application must also add each encoding step to the "valueEncoding" list in the order the server needs to decode the "value" property. 

Conversely, when the server returns a record containing a variant field to an application, the application reads the variant object's "value" property and must decode the value by following the steps defined in the "valueEncoding" property in order from first to last.
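
The ordering rule can be sketched in Python: the receiver applies the listed steps from first to last, so the sender applies their inverses from last to first. This is an illustration only. The function names are hypothetical, and zlib stands in for the "zip" step because the text does not specify the exact compression format.

```python
import base64
import zlib

# Each step name maps to (encode, decode). "zip" uses zlib as a stand-in.
STEPS = {
    "base64": (base64.b64encode, base64.b64decode),
    "zip": (zlib.compress, zlib.decompress),
}


def encode_for_value(raw: bytes, value_encoding) -> bytes:
    """Sender side: apply the steps in reverse of the listed (decode) order."""
    data = raw
    for step in reversed(value_encoding or []):
        data = STEPS[step][0](data)
    return data


def decode_from_value(data: bytes, value_encoding) -> bytes:
    """Receiver side: apply the steps in order from first to last."""
    for step in value_encoding or []:
        data = STEPS[step][1](data)
    return data


steps = ["base64", "zip"]
payload = b'{"a":"b"}'
# The sender zips first and Base64-encodes last, producing text for "value".
wire_text = encode_for_value(payload, steps).decode("ascii")
restored = decode_from_value(wire_text.encode("ascii"), steps)
print(restored == payload)
```

An omitted, null, or empty "valueEncoding" is treated as no steps, so both functions return their input unchanged in that case.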

In the following example, the "valueEncoding" property is an empty array because there are no additional decoding steps beyond converting from the JSON number type specified in the "value" property to the big integer number specified in the "type" property.

{
  "schema": "jsonaction.org/schemas/variantObject",
  "value": 123,
  "valueEncoding": [],
 
  "type": "bigint",
  "storageEncoding": []
}

In the following example, the "valueEncoding" property is ["number"] because the "value" property is a JSON string containing an embedded number. The receiver of the variant object must decode the JSON string into the big integer number specified in the "type" property.

{
  "schema": "jsonaction.org/schemas/variantObject",
  "value": "123",
  "valueEncoding": ["number"],
 
  "type": "bigint",
  "storageEncoding": []
}

In the following example, "valueEncoding" lists two steps that convert the string in the "value" property to the JSON type specified in the "type" property. The receiver uses Base64 to decode the string into a binary value, uses the 7z algorithm to uncompress the binary value, and then validates the result as JSON. Because the "storageEncoding" property defines another encoding step, the database converts the JSON into BSON and stores the BSON in the variant as the field's binary value.

{
  "schema": "jsonaction.org/schemas/variantObject",
  "value": "N3q8ryccAAQEJgwBDQAAAAAAAABiAAAAAAAAAHW+XQoBAAh7ImEiOiJiIn0AAQQGAAEJDQAHCwEAASEhAQAMCQAICgGcXPZrAAAFARkMAAAAAAAAAAAAAAAAERsAagBzAG8AbgBfAGEAYgAuAGoAcwBvAG4AAAAZABQKAQAwhdlCD57ZARUGAQCAAAAAAAA=", 

  "valueEncoding": ["base64", "7z"],
  "type": "json",
  "storageEncoding": ["bson"]
}

The "storageEncoding" property is an optional array that specifies additional steps for optimizing the value that will be stored in a variant field, such as compressing the value or converting it to a more efficient format before writing it to the table (in order to save disk space). For example, a JSON type can be encoded as BSON before it is written to the table file. 

Each variant in each record can potentially specify a different set of encodings for the same "value". This is useful for migrating data incrementally from one encoding to another, such as incrementally converting natively stored JSON into CBOR. This is also useful for compressing data with different algorithms depending on which algorithm is more effective, such as using 7z for large strings, using RLE for small strings containing repeated characters, and using no compression for very small strings.

The steps in "storageEncoding" are specified in the order in which they encode data. They start from the value in the format specified by the "type" property and end with the binary value stored in the field. 

When there are no additional steps in the "storageEncoding" property, it must be omitted, set to [],  or set to null.

When storing a variant into the database, the "value" property implicitly specifies the initial data type by how its value is represented in the JSON itself (as a quoted string, number, object, list, etc). The  "valueEncoding" steps, if any, define how to convert the "value" into the data type specified by the "type" property. 

A FairCom server automatically runs the "storageEncoding" steps on the data before storing it and after retrieving it. When retrieving a variant value, the database runs the "storageEncoding" steps in reverse order on the stored binary value to convert it into the field value specified by the "type" property.  
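
The store and retrieve directions can be sketched as a pipeline that runs the steps forward when writing and in reverse when reading. The Python below is a hypothetical illustration, not the server's implementation. It uses a simple run-length encoding for an "rle" step and zlib as a stand-in compressor; the step names and byte formats are assumptions.

```python
import zlib


def rle_encode(data: bytes) -> bytes:
    """Encode as (run length, byte) pairs with runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Expand (run length, byte) pairs back into the original bytes."""
    out = bytearray()
    for i in range(0, len(data), 2):
        out += bytes([data[i + 1]]) * data[i]
    return bytes(out)


# Each step name maps to (encode, decode). Names are illustrative only.
STORAGE_STEPS = {
    "rle": (rle_encode, rle_decode),
    "zlib": (zlib.compress, zlib.decompress),
}


def to_storage(typed_value: bytes, storage_encoding) -> bytes:
    """Before storing: run the steps in the listed order."""
    data = typed_value
    for step in storage_encoding or []:
        data = STORAGE_STEPS[step][0](data)
    return data


def from_storage(stored: bytes, storage_encoding) -> bytes:
    """After retrieving: undo the steps in reverse order."""
    data = stored
    for step in reversed(storage_encoding or []):
        data = STORAGE_STEPS[step][1](data)
    return data


stored = to_storage(b"aaaaaaaabbbb", ["rle", "zlib"])
print(from_storage(stored, ["rle", "zlib"]))
```

Note the contrast with "valueEncoding", whose steps are listed in the order the receiver decodes them rather than the order in which they were applied.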

The following example is a variant containing a user-defined "personV2" type that is a  JSON document validated by a "personV2" JSON schema and stored in the variant field in the "CBOR" binary format in big endian byte order.

{
  "schema": "jsonaction.org/schemas/variantObject",
  "value": 
  {
    "schema":"faircom.com/schemas/personV2",
    "employeeId": 17, 
    "name":"Mike Bowers"
  },
  "valueEncoding": [], 

  "type": "personV2",
  "storageEncoding": ["json","cbor"]
}

The "type" property is a required property that contains the data type of the value. It can be a string or a number. Each type has a unique name and a unique integer identifier. 

The list of types is specified in the variant_type table. The numeric value of "type" must be a value in the data_type_id field. The string value of "type" must be a value in the data_type_name field.