Diffstat (limited to 'docs/source/Internals.md')
-rwxr-xr-x docs/source/Internals.md 244
1 files changed, 244 insertions, 0 deletions
diff --git a/docs/source/Internals.md b/docs/source/Internals.md
new file mode 100755
index 00000000..b744c784
--- /dev/null
+++ b/docs/source/Internals.md
@@ -0,0 +1,244 @@
+# FlatBuffer Internals
+
+This section is entirely optional for the use of FlatBuffers. In normal
+usage, you should never need the information contained herein. If you're
+interested, however, it should give you more of an appreciation of why
+FlatBuffers is both efficient and convenient.
+
+### Format components
+
+A FlatBuffer is a binary file and in-memory format consisting mostly of
+scalars of various sizes, all aligned to their own size. Each scalar is
+also always represented in little-endian format, as this corresponds to
+all commonly used CPUs today. FlatBuffers will also work on big-endian
+machines, but will be slightly slower because of additional
+byte-swap intrinsics.
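+
+As a minimal sketch of what "always little-endian" means in practice, the
+hypothetical helpers below (`WriteLE32` and `ReadLE32` are illustrative names,
+not part of the FlatBuffers API) store and load a 32-bit scalar one byte at a
+time, so the bytes in the buffer are the same on any host. A real
+implementation would more likely use a byte-swap intrinsic on big-endian
+machines, as noted above.
+
```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: store v at dst, least significant byte first,
// regardless of the byte order of the host CPU.
inline void WriteLE32(uint8_t *dst, uint32_t v) {
  assert(dst != nullptr);
  for (int i = 0; i < 4; ++i) dst[i] = static_cast<uint8_t>(v >> (8 * i));
}

// Hypothetical helper: load a 32-bit value that was stored little-endian.
inline uint32_t ReadLE32(const uint8_t *src) {
  assert(src != nullptr);
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i) v |= static_cast<uint32_t>(src[i]) << (8 * i);
  return v;
}
```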
+
+On purpose, the format leaves a lot of details about where exactly
+things live in memory undefined, e.g. fields in a table can have any
+order, and objects to some extent can be stored in many orders. This is
+because the format doesn't need this information to be efficient, and it
+leaves room for optimization and extension (for example, fields can be
+packed in a way that is most compact). Instead, the format is defined in
+terms of offsets and adjacency only.
+
+### Format identification
+
+The format also doesn't contain information for format identification
+and versioning, which is also by design. FlatBuffers is a statically typed
+system, meaning the user of a buffer needs to know what kind of buffer
+it is. FlatBuffers can of course be wrapped inside other containers
+where needed, or you can use its union feature to dynamically identify
+multiple possible sub-objects stored. Additionally, it can be used
+together with the schema parser if full reflective capabilities are
+desired.
+
+Versioning is something that is intrinsically part of the format (the
+optionality / extensibility of fields), so the format itself does not
+need a version number (it's a meta-format, in a sense). We're hoping
+that this format can accommodate all data needed. If format-breaking
+changes are ever necessary, it would become a new kind of format rather
+than just a variation.
+
+### Offsets
+
+The most important and generic offset type (see `flatbuffers.h`) is
+`offset_t`, which is currently always a `uint32_t`, and is used to
+refer to all tables/unions/strings/vectors. The 32-bit size is
+intentional, since we want to keep the format binary compatible between
+32-bit and 64-bit systems, and a 64-bit offset would bloat the size for almost
+all uses. A version of this format with 64-bit (or 16-bit) offsets would be
+easy to define if needed. Unsigned means they can only point in one direction, which
+typically is forward (towards a higher memory location). Any backwards
+offsets will be explicitly marked as such.
+
+The format starts with an `offset_t` to the root object in the buffer.
+
+We have two kinds of objects, structs and tables.
+
+### Structs
+
+These are the simplest, and as mentioned, intended for simple data that
+benefits from being extra efficient and doesn't need versioning /
+extensibility. They are always stored inline in their parent (a struct,
+table, or vector) for maximum compactness. Structs define a consistent
+memory layout where all components are aligned to their size, and
+structs are aligned to their largest scalar member. This is done independently
+of the alignment rules of the underlying compiler to guarantee a
+cross-platform compatible layout. This layout is then enforced in the generated
+code.
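+
+The layout rule above can be sketched as a small function. `Layout` and
+`LayoutStruct` are hypothetical names used for illustration only, not
+FlatBuffers API, and rounding the total size up to the struct's alignment
+(tail padding) is an assumption of this sketch rather than something stated
+above.
+
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: where each member lands inside the struct.
struct Layout {
  std::vector<size_t> offsets;  // byte offset of each member
  size_t align;                 // alignment of the struct as a whole
  size_t size;                  // total size in bytes, including padding
};

// Lay out scalar members, given by their sizes in bytes (powers of two), so
// that each member is aligned to its own size and the struct is aligned to
// its largest member.
inline Layout LayoutStruct(const std::vector<size_t> &member_sizes) {
  Layout out{{}, 1, 0};
  size_t pos = 0;
  for (size_t sz : member_sizes) {
    assert(sz != 0 && (sz & (sz - 1)) == 0);  // must be a power of two
    pos = (pos + sz - 1) & ~(sz - 1);         // pad up to a multiple of sz
    out.offsets.push_back(pos);
    pos += sz;
    if (sz > out.align) out.align = sz;
  }
  // Assumption: pad the tail so consecutive structs in a vector stay aligned.
  out.size = (pos + out.align - 1) & ~(out.align - 1);
  return out;
}
```
+
+For three 4-byte floats this gives offsets 0, 4 and 8, an alignment of 4 and
+a size of 12, which agrees with the `Vec3` struct in the code example further
+down.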
+
+### Tables
+
+These start with an `soffset_t` to a vtable (signed version of
+`offset_t`, since vtables may be stored anywhere), followed by all the
+fields as aligned scalars. Unlike structs, not all fields need to be
+present. There is no set order and layout.
+
+To be able to access fields regardless of these uncertainties, we go
+through a vtable of offsets. Vtables are shared between any objects that
+happen to have the same vtable values.
+
+The elements of a vtable are all of type `voffset_t`, which is currently
+a `uint16_t`. The first element is the number of elements of the vtable,
+including this one. The second one is the size of the object, in bytes
+(including the vtable offset). This size is used for streaming, to know
+how many bytes to read to be able to access all fields of the object.
+The remaining elements are the N offsets, where N is the number of fields
+declared in the schema when the code that constructed this buffer was
+compiled (thus, the size of the vtable is N + 2 elements).
+
+All accessor functions in the generated code for tables contain the
+offset into this vtable as a constant. This offset is checked against the
+first element (the number of elements), to protect against newer code
+reading older data. If this offset is out of range, or the vtable entry
+is 0, that means the field is not present in this object, and the
+default value is returned. Otherwise, the entry is used as the offset to the
+field to be read.
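+
+The lookup just described can be sketched roughly as follows. This is a
+simplified model, not the library's implementation: `TableView` and
+`GetInt16Field` are hypothetical names, the vtable is stored next to the
+object instead of being reached through the `soffset_t`, and the accessor
+constant is modelled as an element index into the vtable.
+
```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using voffset_t = uint16_t;  // vtable element type, as described above

// Hypothetical in-memory model of one table and its vtable.
struct TableView {
  std::vector<voffset_t> vtable;  // [element count, object size, offsets...]
  std::vector<uint8_t> object;    // table bytes, starting at the vtable offset
};

// Read an int16_t field through the vtable. `slot` is the element index in
// the vtable: 2 for the first field, because two header elements come first.
inline int16_t GetInt16Field(const TableView &t, size_t slot, int16_t def) {
  assert(!t.vtable.empty());
  if (slot >= t.vtable[0]) return def;  // vtable too short: older data
  voffset_t field_offset = t.vtable[slot];
  if (field_offset == 0) return def;    // field not stored in this object
  assert(field_offset + 1u < t.object.size());
  // Scalars are stored little-endian, so assemble the value byte by byte.
  uint16_t raw = static_cast<uint16_t>(t.object[field_offset] |
                                       (t.object[field_offset + 1] << 8));
  return static_cast<int16_t>(raw);
}
```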
+
+### Strings and Vectors
+
+Strings are simply a vector of bytes, and are always
+null-terminated. Vectors are stored as contiguous aligned scalar
+elements prefixed by a count.
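+
+As an illustration only, a string laid out this way could be written and read
+as below. `EncodeString` and `DecodeString` are hypothetical helpers, and two
+details are assumptions of this sketch that the text above does not state:
+the count is a 32-bit little-endian value, and it does not include the
+terminating null byte.
+
```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: count, then the bytes, then a null terminator.
inline std::vector<uint8_t> EncodeString(const std::string &s) {
  std::vector<uint8_t> out;
  uint32_t n = static_cast<uint32_t>(s.size());
  // Assumption: the count is a 32-bit little-endian scalar.
  for (int i = 0; i < 4; ++i) out.push_back(static_cast<uint8_t>(n >> (8 * i)));
  out.insert(out.end(), s.begin(), s.end());
  out.push_back(0);  // assumption: terminator is not included in the count
  return out;
}

// Hypothetical helper: read the count, then take exactly that many bytes.
inline std::string DecodeString(const uint8_t *p) {
  assert(p != nullptr);
  uint32_t n = 0;
  for (int i = 0; i < 4; ++i) n |= static_cast<uint32_t>(p[i]) << (8 * i);
  return std::string(reinterpret_cast<const char *>(p + 4), n);
}
```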
+
+### Construction
+
+The current implementation constructs these buffers backwards, since
+that significantly reduces the amount of bookkeeping and simplifies the
+construction API.
+
+### Code example
+
+Here's an example of the code that gets generated for `samples/monster.fbs`.
+What follows is the entire file, broken up by comments:
+
+    // automatically generated, do not modify
+
+    #include "flatbuffers/flatbuffers.h"
+
+    namespace MyGame {
+    namespace Sample {
+
+Nested namespace support.
+
+    enum {
+      Color_Red = 0,
+      Color_Green = 1,
+      Color_Blue = 2,
+    };
+
+    inline const char **EnumNamesColor() {
+      static const char *names[] = { "Red", "Green", "Blue", nullptr };
+      return names;
+    }
+
+    inline const char *EnumNameColor(int e) { return EnumNamesColor()[e]; }
+
+Enums and convenient reverse lookup.
+
+    enum {
+      Any_NONE = 0,
+      Any_Monster = 1,
+    };
+
+    inline const char **EnumNamesAny() {
+      static const char *names[] = { "NONE", "Monster", nullptr };
+      return names;
+    }
+
+    inline const char *EnumNameAny(int e) { return EnumNamesAny()[e]; }
+
+Unions share a lot with enums.
+
+    struct Vec3;
+    struct Monster;
+
+Predeclare all datatypes since there may be circular references.
+
+    MANUALLY_ALIGNED_STRUCT(4) Vec3 {
+     private:
+      float x_;
+      float y_;
+      float z_;
+
+     public:
+      Vec3(float x, float y, float z)
+        : x_(flatbuffers::EndianScalar(x)), y_(flatbuffers::EndianScalar(y)), z_(flatbuffers::EndianScalar(z)) {}
+
+      float x() const { return flatbuffers::EndianScalar(x_); }
+      float y() const { return flatbuffers::EndianScalar(y_); }
+      float z() const { return flatbuffers::EndianScalar(z_); }
+    };
+    STRUCT_END(Vec3, 12);
+
+These ugly macros do a couple of things: they turn off any padding the compiler
+might normally do, since we add padding manually (though none in this example),
+and they enforce alignment chosen by FlatBuffers. This ensures the layout of
+this struct will look the same regardless of compiler and platform. Note that
+the fields are private: this is because these store little-endian scalars
+regardless of platform (since this is part of the serialized data).
+`EndianScalar` then converts back and forth, which is a no-op on all current
+mobile and desktop platforms, and a single machine instruction on the few
+remaining big-endian platforms.
+
+    struct Monster : private flatbuffers::Table {
+      const Vec3 *pos() const { return GetStruct<const Vec3 *>(4); }
+      int16_t mana() const { return GetField<int16_t>(6, 150); }
+      int16_t hp() const { return GetField<int16_t>(8, 100); }
+      const flatbuffers::String *name() const { return GetPointer<const flatbuffers::String *>(10); }
+      const flatbuffers::Vector<uint8_t> *inventory() const { return GetPointer<const flatbuffers::Vector<uint8_t> *>(14); }
+      int8_t color() const { return GetField<int8_t>(16, 2); }
+    };
+
+Tables are a bit more complicated. A table accessor struct is used to point at
+the serialized data for a table, which always starts with an offset to its
+vtable. It derives from `Table`, which contains the `GetField` helper functions.
+`GetField` takes a vtable offset and a default value. It will look in the vtable
+at that offset. If the offset is out of bounds (data from an older version) or
+the vtable entry is 0, the field is not present and the default is returned.
+Otherwise, it uses the entry as an offset into the table to locate the field.
+
+    struct MonsterBuilder {
+      flatbuffers::FlatBufferBuilder &fbb_;
+      flatbuffers::uoffset_t start_;
+      void add_pos(const Vec3 *pos) { fbb_.AddStruct(4, pos); }
+      void add_mana(int16_t mana) { fbb_.AddElement<int16_t>(6, mana, 150); }
+      void add_hp(int16_t hp) { fbb_.AddElement<int16_t>(8, hp, 100); }
+      void add_name(flatbuffers::Offset<flatbuffers::String> name) { fbb_.AddOffset(10, name); }
+      void add_inventory(flatbuffers::Offset<flatbuffers::Vector<uint8_t>> inventory) { fbb_.AddOffset(14, inventory); }
+      void add_color(int8_t color) { fbb_.AddElement<int8_t>(16, color, 2); }
+      MonsterBuilder(flatbuffers::FlatBufferBuilder &_fbb) : fbb_(_fbb) { start_ = fbb_.StartTable(); }
+      flatbuffers::Offset<Monster> Finish() { return flatbuffers::Offset<Monster>(fbb_.EndTable(start_, 7)); }
+    };
+
+`MonsterBuilder` is the base helper struct to construct a table using a
+`FlatBufferBuilder`. You can add the fields in any order, and the `Finish`
+call will ensure the correct vtable gets generated.
+
+    inline flatbuffers::Offset<Monster> CreateMonster(flatbuffers::FlatBufferBuilder &_fbb, const Vec3 *pos, int16_t mana, int16_t hp, flatbuffers::Offset<flatbuffers::String> name, flatbuffers::Offset<flatbuffers::Vector<uint8_t>> inventory, int8_t color) {
+      MonsterBuilder builder_(_fbb);
+      builder_.add_inventory(inventory);
+      builder_.add_name(name);
+      builder_.add_pos(pos);
+      builder_.add_hp(hp);
+      builder_.add_mana(mana);
+      builder_.add_color(color);
+      return builder_.Finish();
+    }
+
+`CreateMonster` is a convenience function that calls all functions in
+`MonsterBuilder` above for you. Note that if you pass values which are
+defaults as arguments, it will not actually construct that field, so
+you can probably use this function instead of the builder class in
+almost all cases.
+
+    inline const Monster *GetMonster(const void *buf) { return flatbuffers::GetRoot<Monster>(buf); }
+
+This function is only generated for the root table type, to be able to
+start traversing a FlatBuffer from a raw buffer pointer.
+
+    }  // namespace Sample
+    }  // namespace MyGame
+
+