Diffstat (limited to 'Documentation/botr/')
1 files changed, 112 insertions, 0 deletions
diff --git a/Documentation/botr/ b/Documentation/botr/
new file mode 100644
index 0000000000..8eb0b0dbcf
--- /dev/null
+++ b/Documentation/botr/
@@ -0,0 +1,112 @@
+# RyuJIT: Porting to different platforms
+## What is a Platform?
+* Target instruction set and pointer size
+* Target calling convention
+* Runtime data structures (not really covered here)
+* GC encoding
+ * So far there are only two encodings: JIT32_GCENCODER and everything else
+* Debug info (so far mostly the same for all targets?)
+* EH info (not really covered here)
+One advantage of the CLR is that the VM (mostly) hides the (non-ABI) OS differences
+## The Very High Level View
+* 32 vs. 64 bits
+ * This work is not yet complete in the backend, but should be sharable
+* Instruction set architecture:
+ * instrsXXX.h, emitXXX.cpp and targetXXX.cpp
+ * lowerXXX.cpp
+ * codeGenXXX.cpp and simdcodegenXXX.cpp
+ * unwindXXX.cpp
+* Calling Convention: affects code throughout the JIT, in both the front-end and the backend
+## Front-end changes
+* Calling Convention
+ * Struct args and returns seem to be the most complex differences
+ * Importer and morph are highly aware of these
+ * E.g. fgMorphArgs(), fgFixupStructReturn(), fgMorphCall(), fgPromoteStructs() and the various struct assignment morphing methods
+ * HFAs on ARM
+* Tail calls are target-dependent, but probably should be less so
+* Intrinsics: each platform recognizes different methods as intrinsics (e.g. Sin only for x86, Round everywhere BUT amd64)
+* Target-specific morphs such as for mul, mod and div
+## Backend Changes
+* Lowering: fully expose control flow and register requirements
+* Code Generation: traverse blocks in layout order, generating code (InstrDescs) based on register assignments on nodes
+ * Then, generate prolog & epilog, as well as GC, EH and scope tables
+* ABI changes:
+ * Calling convention register requirements
+ * Lowering of calls and returns
+ * Code sequences for prologs & epilogs
+ * Allocation & layout of frame
+## Target ISA "Configuration"
+* Conditional compilation (set in jit.h, based on incoming define, e.g. #ifdef X86)
+ * _TARGET_64BIT_ (a 32-bit target is simply !_TARGET_64BIT_)
+* target.h
+* instrsXXX.h
+## Instruction Encoding
+* The instrDesc is the data structure used for encoding
+ * It is initialized with the opcode bits, and has fields for immediates and register numbers.
+ * instrDescs are collected into groups
+ * A label may only occur at the beginning of a group
+* The emitter is called to:
+ * Create new instructions (instrDescs), during CodeGen
+ * Emit the bits from the instrDescs after CodeGen is complete
+ * Update GC info (live GC vars & safe points)
+## Adding Encodings
+* The instruction encodings are captured in instrsXXX.h. These are the opcode bits for each instruction
+* The structure of each instruction's encoding is target-dependent
+* An "instruction" is just the representation of the opcode
+* An instance of "instrDesc" represents the instruction to be emitted
+* For each "type" of instruction, emit methods need to be implemented. These follow a pattern, but a target may have unique ones, e.g.:
+ * emitter::emitInsMov(instruction ins, emitAttr attr, GenTree* node)
+ * emitter::emitIns_R_I(instruction ins, emitAttr attr, regNumber reg, ssize_t val)
+ * emitter::emitInsTernary(instruction ins, emitAttr attr, GenTree* dst, GenTree* src1, GenTree* src2) (currently Arm64 only)
+## Lowering
+* Lowering ensures that all register requirements are exposed for the register allocator
+ * Use count, def count, "internal" reg count, and any special register requirements
+ * Does half the work of code generation, since all computation is made explicit
+ * But it is NOT necessarily a 1:1 mapping from lowered tree nodes to target instructions
+ * Its first pass does a tree walk, transforming the instructions. Some of this is target-independent. Notable exceptions:
+ * Calls and arguments
+ * Switch lowering
+ * LEA transformation
+ * Its second pass walks the nodes in execution order
+ * Sets register requirements
+ * Sometimes changes the register requirements of children (which have already been traversed)
+ * Sets the block order and node locations for LSRA
+ * LinearScan::startBlockSequence() and LinearScan::moveToNextBlock()
+## Register Allocation
+* Register allocation is largely target-independent
+ * The second phase of Lowering does nearly all the target-dependent work
+* Register candidates are determined in the front-end
+ * Local variables or temps, or fields of local variables or temps
+ * Not address-taken, plus a few other restrictions
+ * Sorted by lvaSortByRefCount(), and marked "lvTracked"
+## Addressing Modes
+* The code to find and capture addressing modes is particularly poorly abstracted
+* genCreateAddrMode(), in CodeGenCommon.cpp, traverses the tree looking for an addressing mode, then captures its constituent elements (base, index, scale & offset) in "out parameters"
+ * It optionally generates code
+ * For RyuJIT, it NEVER generates code, and is only used by gtSetEvalOrder, and by lowering
+## Code Generation
+* For the most part, the code generation method structure is the same for all architectures
+ * Most code generation methods start with "gen"
+* Theoretically, CodeGenCommon.cpp contains code "mostly" common to all targets (this factoring is imperfect)
+ * Method prolog and epilog
+* genCodeForBBList
+ * Walks the trees in execution order, calling genCodeForTreeNode, which needs to handle all nodes that are not "contained"
+ * Generates control flow code (branches, EH) for the block