Calculating Index Sizes

Index sizes are difficult to predict accurately. However, a few calculations give you the bounds to expect on index size, based on the following criteria:

  • index node size (page size)
  • key length
  • four- or eight-byte record offset value
  • six-byte transaction number for transaction-controlled files

Consider an index with a four-byte key length and the following information provided by the ctflvrfy utility:

  • Disk size was nearly 1GB
  • 1 root node
  • 17 internal nodes at the first level
  • 2,692 internal nodes at the second level
  • 451,645 leaf nodes (all key values are stored in the leaf nodes)
  • 49,604,909 key values

There are 454,355 total index nodes in this example.

Because this is a transaction-controlled HUGE file (eight-byte offsets), the actual metrics can be computed as follows:

454,355 * 2048 node size = 930,519,040 bytes (~1GB)

49,604,909 key values / 451,645 leaf nodes = ~109 keys / node

key length (4) + 8 (record offset) + 6 (transaction #) = 18 bytes / key

109 * 18 = 1962 bytes / node on average.
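The arithmetic above can be reproduced with a short script. This is only a sketch: the variable names are illustrative, and the figures are the ones quoted in this example rather than values read from ctflvrfy output.

```python
# Figures quoted in the example above (not parsed from ctflvrfy output).
root_nodes = 1
internal_level_1 = 17
internal_level_2 = 2_692
leaf_nodes = 451_645
key_values = 49_604_909

node_size = 2048        # bytes per index node (page size)
key_length = 4          # bytes
offset_bytes = 8        # HUGE file: eight-byte record offset
transaction_bytes = 6   # transaction-controlled file

total_nodes = root_nodes + internal_level_1 + internal_level_2 + leaf_nodes
index_bytes = total_nodes * node_size
keys_per_leaf = key_values / leaf_nodes
bytes_per_key = key_length + offset_bytes + transaction_bytes

print(total_nodes)    # 454355
print(index_bytes)    # 930519040
print(bytes_per_key)  # 18
print(f"{keys_per_leaf:.1f} keys per leaf node on average")
```

The unrounded average is slightly higher than 109 keys per node; the figures above use the whole-key value of 109.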

Each leaf node can hold a maximum of 112 keys.

2048 byte node size - 28 byte node header = 2020 available bytes.

2020 bytes / 18 bytes per key = 112 keys (rounded down).
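The same capacity calculation can be generalized to other key lengths and file types. The helper below is a hypothetical sketch, not part of any product API. It assumes the 28-byte node header from this example applies to other node sizes, and that keys are fixed length and uncompressed.

```python
def max_keys_per_node(node_size, key_length, offset_bytes=8,
                      transaction_controlled=True, header_bytes=28):
    """Upper bound on the key entries one leaf node can hold.

    header_bytes=28 is taken from the example; treating it as constant
    for other node sizes is an assumption.
    """
    transaction_bytes = 6 if transaction_controlled else 0
    entry_bytes = key_length + offset_bytes + transaction_bytes
    # Whole entries only, so use integer (floor) division.
    return (node_size - header_bytes) // entry_bytes


# The example index: 2048-byte nodes, 4-byte keys, HUGE, transaction controlled.
print(max_keys_per_node(2048, 4))  # 112
```

Passing `offset_bytes=4` or `transaction_controlled=False` shows how the smaller per-key overhead raises the number of keys that fit in a node.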

Note that the leaf nodes in this example are nearly full. With normal key distributions, you can expect index nodes to be about half full on average, so this index could theoretically be much larger, roughly twice its current size, with the same number of keys.
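To turn the full and half-full cases into rough size bounds, the leaf-level calculation can be run for both fill levels. The function below is an illustrative sketch with hypothetical names. It counts leaf nodes only (internal nodes, which are a small fraction of the total in this example, are ignored) and assumes the 28-byte header and 18-byte entry layout described above.

```python
def leaf_size_bounds(key_count, node_size, key_length,
                     offset_bytes=8, transaction_bytes=6, header_bytes=28):
    """Return (full_bytes, half_full_bytes) for the leaf level only."""
    entry_bytes = key_length + offset_bytes + transaction_bytes
    max_keys = (node_size - header_bytes) // entry_bytes
    half_keys = max(1, max_keys // 2)

    def nodes_needed(keys_per_node):
        # Ceiling division: a partly filled last node still occupies a full node.
        return -(-key_count // keys_per_node)

    full_bytes = nodes_needed(max_keys) * node_size
    half_full_bytes = nodes_needed(half_keys) * node_size
    return full_bytes, half_full_bytes


full, half = leaf_size_bounds(49_604_909, 2048, 4)
print(full)  # 907061248
print(half)  # 1814122496
```

For the example index, the half-full estimate comes to about 1.8 GB of leaf nodes, compared with roughly 0.9 GB when the nodes are completely full.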
