ma-laforge

model/simulation/measurement analysis

Data Storage Formats

This page collects resources on storing complex hierarchical and/or binary data in files.

Alternatively, taking an excerpt from bsdf.io:

[...] data specification for serializing (scientific) data, for the purpose of storage and (inter process) communication.

Issue

  • Many file formats are designed to store complex hierarchical data in a "human-readable" format.
  • The term is a misnomer: "human-readable" really only means that these files can be read/edited with simple text editors.
  • In practice, these formats are error-prone and difficult for humans to edit, or even to understand.
  • As dataset complexity increases, editing with a simple text editor makes errors all but inevitable.
  • What is really needed to ensure "human-readability" is a simple, flexible application to view/edit files.
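
As a small illustration of the point about hand edits, the sketch below serializes a nested dataset to JSON and then simulates a slip in a text editor by deleting a single closing bracket. The dataset, its field names, and the `try_parse` helper are made up for this example; they are not taken from any of the formats or tools listed on this page.

```python
import json

# Hypothetical dataset, invented for illustration only.
dataset = {"sweep": {"freq_hz": [1e3, 1e4], "gain_db": [20.0, 17.5]}}

# Serialize to "human-readable" text, as a text editor would show it.
text = json.dumps(dataset, indent=2)

# Simulate a hand edit that accidentally deletes the first closing bracket.
broken = text.replace("]", "", 1)


def try_parse(s):
    """Return the parsed data, or a short message describing the parse error."""
    try:
        return json.loads(s)
    except json.JSONDecodeError as err:
        return f"parse error at line {err.lineno}, column {err.colno}"


roundtrip_ok = try_parse(text) == dataset  # the untouched text parses back
result = try_parse(broken)                 # the edited text no longer parses

print(roundtrip_ok)
print(result)
```

One missing character is enough to make the whole file unreadable to a parser, and the reported error position is not necessarily where the slip was made. A dedicated viewer/editor would avoid this class of error by never exposing the raw syntax.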

Goal

Thus, a data storage format might as well provide the following two components:

  1. A solid, flexible, hierarchical binary format.
  2. A simple-to-use editor/reader.

Basic info

Text-based formats

  • CSV (not typically hierarchical)
  • JSON
  • YAML
  • XML
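
To make the "not typically hierarchical" remark concrete, here is a minimal sketch using only Python's standard `json`, `csv`, and `io` modules. The record and the dotted-key flattening convention are assumptions chosen for illustration; CSV itself defines no standard way to encode nesting.

```python
import csv
import io
import json

# Hypothetical nested record, invented for illustration only.
record = {"name": "amp1", "bias": {"vdd": 1.8, "ibias_ua": 50}}

# JSON represents the nesting directly.
json_text = json.dumps(record)


def flatten(d, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots.

    The dotted-path naming is just one possible convention.
    """
    flat = {}
    for key, value in d.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


# CSV is a flat table, so the hierarchy has to be encoded in the column names.
flat = flatten(record)
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(flat))
writer.writeheader()
writer.writerow(flat)
csv_text = buf.getvalue()

print(json_text)
print(csv_text)
```

JSON, YAML, and XML can all hold the nested structure as-is, whereas with CSV the reader and writer must agree on a flattening scheme outside the format itself.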

Binary formats