OP-TEE design
========================

# Contents

1. [Introduction](#1-introduction)
2. [Platform Initialization](#2-platform-initialization)
3. [Secure Monitor Calls - SMC](#3-secure-monitor-calls---smc)
4. [Thread handling](#4-thread-handling)
5. [MMU](#5-mmu)
6. [Stacks](#6-stacks)
7. [Shared Memory](#7-shared-memory)
8. [Pager](#8-pager)
9. [Memory Objects](#9-memory-objects)
10. [Cryptographic abstraction layer](#10-cryptographic-abstraction-layer)
11. [libutee](#11-libutee)
12. [Trusted Applications](#12-trusted-applications)

# 1. Introduction
OP-TEE is a so-called Trusted Execution Environment, in short a TEE, for ARM
based chips supporting TrustZone technology. OP-TEE consists of three
components:

+ [OP-TEE Client], which is the client API running in normal world user space.
+ [OP-TEE Linux Kernel driver], which is the driver that handles the
  communication between normal world user space and secure world.
+ [OP-TEE Trusted OS], which is the Trusted OS running in secure world.

OP-TEE was designed with scalability and portability in mind and has by now
been ported to quite a few different platforms, both ARMv7-A and ARMv8-A,
from different vendors. For a full list, please see [Platforms Supported].

OP-TEE OS is made of two main components: the OP-TEE core and a collection of
libraries designed to be used by Trusted Applications. While the OP-TEE core
executes at the ARM CPU privileged level (also referred to as 'kernel land'),
Trusted Applications execute at the non-privileged level (also referred to as
'userland'). The static libraries provided by the OP-TEE OS enable Trusted
Applications to call secure services executing at a more privileged level.

# 2. Platform initialization
TBD

# 3. Secure Monitor Calls - SMC
## 3.1 SMC handling
TBD

## 3.2 SMC Interface
The OP-TEE SMC interface is defined at two levels, by [optee_smc.h] and
[optee_msg.h]. The former file defines the SMC identifiers and what is passed
in the registers for each SMC. The latter file defines the OP-TEE Message
protocol, which is not restricted to SMC even though that is currently the only
transport available.

## 3.3 Communication using SMC Interface
The main structure used for SMC communication is defined in [struct
optee_msg_arg]. Looking at the source code, communication is mainly achieved
using `optee_msg_arg` and `thread_smc_args`, where `optee_msg_arg` can be seen
as the main structure. The [OP-TEE Linux Kernel driver] gets the parameters
either from [OP-TEE Client] or directly from an internal service in the Linux
kernel. The TEE driver populates the struct `optee_msg_arg` with the parameters
plus some additional bookkeeping information. Parameters for the SMC are passed
in registers 1 to 7; register 0 holds the SMC id, which among other things
tells whether it is a standard or a fast call.

# 4. Thread handling
The OP-TEE core uses a number of threads to be able to support running jobs in
parallel (not fully enabled!). There are handlers for different purposes. In
[thread.c] you will find a function called `thread_init_primary` which assigns
`init_handlers` (functions) that are called when the OP-TEE core receives
standard or fast calls, FIQs and PSCI calls. There are default handlers for
these services, but a platform can implement its own platform-specific handlers
instead.

## Synchronization
OP-TEE has three primitives for synchronization of threads and CPUs:
spin-lock, mutex, and condvar.

### Spin-lock
A spin-lock is represented as an `unsigned int`. This is the most primitive
lock.
Interrupts should be disabled before attempting to take a spin-lock
and should remain disabled until the lock is released. A spin-lock is
initialized with `SPINLOCK_UNLOCK`.

| Function | Purpose |
|----------|---------|
| `cpu_spin_lock()` | Locks a spin-lock |
| `cpu_spin_trylock()` | Locks a spin-lock if unlocked and returns `0`, else the spin-lock is unchanged and the function returns `!0` |
| `cpu_spin_unlock()` | Unlocks a spin-lock |

### Mutex
A mutex is represented by `struct mutex`. A mutex can be locked and
unlocked with interrupts enabled or disabled, but only from a normal
thread. A mutex cannot be used in an interrupt handler, abort handler or
before a thread has been selected for the CPU. A mutex is initialized with
either `MUTEX_INITIALIZER` or `mutex_init()`.

| Function | Purpose |
|----------|---------|
| `mutex_lock()` | Locks a mutex. If the mutex is unlocked this is a fast operation, else the function issues an RPC to wait in normal world. |
| `mutex_unlock()` | Unlocks a mutex. If there are no waiters this is a fast operation, else the function issues an RPC to wake up a waiter in normal world. |
| `mutex_trylock()` | Locks a mutex if unlocked and returns `true`, else the mutex is unchanged and the function returns `false`. |
| `mutex_destroy()` | Asserts that the mutex is unlocked and that there are no waiters; after this the memory used by the mutex can be freed. |

When a mutex is locked it is owned by the thread calling `mutex_lock()` or
`mutex_trylock()`; the mutex may only be unlocked by the thread owning the
mutex. A thread should not exit to TA user space while holding a mutex.

### Condvar
A condvar is represented by `struct condvar`. A condvar is similar to a
`pthread_cond_t` in the pthreads standard, only less advanced. Condition
variables are used to wait for some condition to be fulfilled and are
always used together with a mutex.
Once a condition variable has been used
together with a certain mutex, it must only be used with that mutex until
destroyed. A condvar is initialized with `CONDVAR_INITIALIZER` or
`condvar_init()`.

| Function | Purpose |
|----------|---------|
| `condvar_wait()` | Atomically unlocks the supplied mutex and waits in normal world via an RPC for the condition variable to be signaled; when the function returns, the mutex is locked again. |
| `condvar_signal()` | Wakes up one waiter of the condition variable (waiting in `condvar_wait()`) |
| `condvar_broadcast()` | Wakes up all waiters of the condition variable. |

The caller of `condvar_signal()` or `condvar_broadcast()` should hold the
mutex associated with the condition variable to guarantee that a waiter
does not miss the signal.

# 5. MMU
## Translation tables
OP-TEE uses several L1 translation tables: one large table spanning 4 GiB and
two or more small tables spanning 32 MiB each. The large translation table
handles kernel mode mapping and matches all addresses not covered by the small
translation tables. The small translation tables are assigned per thread
and cover the mapping of the virtual memory space for one TA context.

The split of the address space between the small and the large translation
table is configured by TTBCR. TTBR1 always points to the large translation
table. TTBR0 points to a small translation table when a user mapping is active
and to the large translation table when no user mapping is currently active.

The translation tables have certain alignment constraints: the alignment (of
the physical address) has to be the same as the size of the translation
table. The translation tables are statically allocated to avoid fragmentation
of memory due to the alignment constraints.

Each thread has one small L1 translation table of its own. Each TA context
has a compact representation of its L1 translation table.
The compact
representation is used to initialize the thread-specific L1 translation
table when the TA context is activated.

![Select xlation table](images/xlat_table.png "Select xlation table")

## Translation tables and switching to normal world
When switching to normal world, either via an IRQ or an RPC, there is a chance
that secure world will resume execution on a different CPU. This means that
the new CPU needs to be configured with the context of the currently active
TA. This is solved by always setting the TA context in the CPU when
resuming execution. There is room for improvement here, since it is more
likely than not that it is the same CPU that resumes execution in secure world.

# 6. Stacks
Different stacks are used during different stages. The stacks are:
- Secure monitor stack (128 bytes), bound to the CPU. Only available if
  OP-TEE is compiled with a secure monitor, which is always the case if the
  target is ARMv7-A but never for ARMv8-A.
- Temp stack (small, ~1 KB), bound to the CPU. Used when transitioning from
  one state to another. Interrupts are always disabled when using this
  stack; aborts are fatal when using the temp stack.
- Abort stack (medium, ~2 KB), bound to the CPU. Used when trapping a data
  or prefetch abort. Aborts from user space are never fatal; the TA is only
  killed. Aborts from kernel mode are used by the pager to do the demand
  paging; if the pager is disabled, all kernel mode aborts are fatal.
- Thread stack (large, ~8 KB), not bound to the CPU but used by the current
  thread/task. Interrupts are usually enabled when using this stack.

*Notes for ARMv7/AArch32:*

| Stack | Comment |
|--------|---------|
| Temp | Assigned to `SP_SVC` during entry/exit, always assigned to `SP_IRQ` and `SP_FIQ` |
| Abort | Always assigned to `SP_ABT` |
| Thread | Assigned to `SP_SVC` while a thread is active |

*Notes for AArch64:*
There are only two stack pointers, `SP_EL1` and `SP_EL0`, available for OP-TEE
in AArch64.
When an exception is received, the stack pointer is always `SP_EL1`, which
is used temporarily while an appropriate stack pointer is assigned to `SP_EL0`.
**`SP_EL1` is always assigned the value of `thread_core_local[cpu_id]`.** This
structure has some spare space for temporary storage of registers and also
keeps the relevant stack pointers. In general, when we talk about assigning a
stack pointer to the CPU below, we mean `SP_EL0`.

## Boot
During early boot the CPU is configured with the temp stack, which is used
until OP-TEE exits to normal world the first time.

*Notes for AArch64:*
`SPSEL` is always `0` on entry/exit to have `SP_EL0` acting as stack pointer.

## Normal entry
Each time OP-TEE is entered from normal world, the temp stack is used as the
initial stack. For fast calls, this is the only stack used. For normal calls,
an empty thread slot is selected and the CPU switches to that stack.

## Normal exit
Normal exit occurs when a thread has finished its task and the thread is freed.
When the main thread function, `tee_entry_std()`, returns, interrupts are
disabled and the CPU switches to the temp stack instead. The thread is freed
and OP-TEE exits to normal world.

## RPC exit
RPC exit occurs when OP-TEE needs some service from normal world. RPC can
currently only be performed while a thread is in running state. An RPC is
initiated with a call to `thread_rpc()`, which saves the state in such a way
that when the thread is restored it will continue at the next instruction, as
if this function did a normal return. The CPU switches to the temp stack before
returning to normal world.

## IRQ exit
IRQ exit occurs when OP-TEE receives an IRQ, which is always handled in normal
world. IRQ exit is similar to RPC exit, but it is `thread_irq_handler()` and
`elx_irq()` (for ARMv7-A/AArch32 and AArch64 respectively) that save the
thread state instead. The thread is resumed in the same way though.

*Notes for ARMv7/AArch32:*
`SP_IRQ` is initialized to the temp stack instead of a separate stack. Prior to
exiting to normal world, the CPU state is changed to SVC and the temp stack is
selected.

*Notes for AArch64:*
`SP_EL0` is assigned the temp stack and is selected during IRQ processing. The
original `SP_EL0` is saved in the thread context to be restored when resuming.

## Resume entry
OP-TEE is entered using the temp stack in the same way as for normal entry. The
thread to resume is looked up and the state is restored to resume execution.
The procedure to resume from an RPC exit or an IRQ exit is exactly the same.

## Syscall
Syscalls are executed using the thread stack.

*Notes for ARMv7/AArch32*:
Nothing special; `SP_SVC` is already set with the thread stack.

*Notes for syscall AArch64*:

Early in the exception processing the original `SP_EL0` is saved in `struct
thread_svc_regs`, in case the TA is executed in AArch64.

The current thread stack is assigned to `SP_EL0`, which is then selected.

When returning, `SP_EL0` is assigned what is in `struct thread_svc_regs`. This
allows `tee_svc_sys_return_helper()` to have the syscall exception handler
return directly to `thread_unwind_user_mode()`.

# 7. Shared Memory
Shared Memory is a block of memory that is shared between the non-secure and
the secure world. It is used to transfer data between both worlds.

## Shared Memory Allocation
The shared memory is allocated by the Linux driver from a pool, `struct
shm_pool`. The pool contains:
* The physical address of the start of the pool
* The size of the pool
* Whether or not the memory is cached
* A list of the chunks of memory allocated

Note that:
- The shared memory pool is physically contiguous.
- The shared memory area is not secure as it is used by both non-secure and
  secure world.
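As an illustration, the pool bookkeeping described above could be modeled
roughly as in the following sketch. The types, field names and the simple
bump-style chunk allocator are hypothetical, chosen only to mirror the list of
pool attributes; this is not the actual Linux driver code:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHM_MAX_CHUNKS 16

struct shm_chunk {
	uintptr_t paddr;	/* physical address of the chunk */
	size_t size;		/* size of the chunk in bytes */
	bool in_use;
};

struct shm_pool {
	uintptr_t paddr;	/* physical address of the start of the pool */
	size_t size;		/* size of the pool */
	bool cached;		/* whether the memory is mapped cached */
	struct shm_chunk chunks[SHM_MAX_CHUNKS]; /* allocated chunks */
};

void shm_pool_init(struct shm_pool *pool, uintptr_t paddr, size_t size,
		   bool cached)
{
	*pool = (struct shm_pool){ .paddr = paddr, .size = size,
				   .cached = cached };
}

/* Allocate the next physically contiguous chunk from the pool. */
uintptr_t shm_alloc(struct shm_pool *pool, size_t size)
{
	uintptr_t next = pool->paddr;

	for (int i = 0; i < SHM_MAX_CHUNKS; i++) {
		if (pool->chunks[i].in_use) {
			next = pool->chunks[i].paddr + pool->chunks[i].size;
			continue;
		}
		if (next + size > pool->paddr + pool->size)
			return 0; /* pool exhausted */
		pool->chunks[i] = (struct shm_chunk){ .paddr = next,
						      .size = size,
						      .in_use = true };
		return next;
	}
	return 0; /* no free chunk slot */
}
```

Note that the chunks hand out physical addresses from one contiguous range, as
required by the first point above.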

### Shared Memory Configuration
It is the Linux kernel driver for OP-TEE that is responsible for initializing
the shared memory pool, given information provided by the OP-TEE core. The
Linux driver issues an SMC call `OPTEE_SMC_GET_SHM_CONFIG` to retrieve the
following information:
* Physical address of the start of the pool
* Size of the pool
* Whether or not the memory is cached

The shared memory pool configuration is platform specific. The memory mapping,
including the area `MEM_AREA_NSEC_SHM` (shared memory with non-secure world),
is retrieved by calling the platform-specific function `bootcfg_get_memory()`.
Please refer to this function and the area type `MEM_AREA_NSEC_SHM` to see the
configuration for the platform of interest. The Linux driver will then
initialize the shared memory pool accordingly.

### Shared Memory Chunk Allocation
It is the Linux kernel driver for OP-TEE that is responsible for allocating
chunks of shared memory. The OP-TEE Linux kernel driver relies on the Linux
kernel generic allocation support (`CONFIG_GENERIC_ALLOCATION`) for allocation
and release of the physical chunks of shared memory. It relies on the Linux
kernel dma-buf support (`CONFIG_DMA_SHARED_BUFFER`) to track references to
shared memory buffers.

## Shared Memory Usage

### From the Client Application
The client application can ask for shared memory allocation using the
GlobalPlatform Client API function `TEEC_AllocateSharedMemory()`. The client
application can also provide shared memory through the GlobalPlatform Client
API function `TEEC_RegisterSharedMemory()`. In such a case, the provided memory
must be physically contiguous so that the OP-TEE core, which does not handle
scatter-gather memory, is able to use the provided range of memory addresses.
Note that the reference count of a shared memory chunk is incremented when
shared memory is registered, and initialized to 1 on allocation.
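The reference counting semantics just described can be sketched with a small
model: allocation initializes the count to 1, registration increments it, and
the chunk is released when the count drops to 0. The names below are
hypothetical and the model is illustrative only, not the actual driver code:

```c
#include <stdbool.h>

/* Illustrative model of shared memory chunk reference counting. */
struct shm_ref {
	int refcount;
	bool released;
};

void shm_on_alloc(struct shm_ref *ref)
{
	ref->refcount = 1;	/* initialized to 1 on allocation */
	ref->released = false;
}

void shm_on_register(struct shm_ref *ref)
{
	ref->refcount++;	/* incremented when registered */
}

void shm_put(struct shm_ref *ref)
{
	if (--ref->refcount == 0)
		ref->released = true; /* chunk can go back to the pool */
}
```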

### From the Linux Driver
Occasionally the Linux kernel driver needs to allocate shared memory for the
communication with secure world, for example when using buffers of type
`TEEC_TempMemoryReference`.

### From the OP-TEE core
In case the OP-TEE core needs information from the TEE supplicant (dynamic TA
loading, REE time request, ...), shared memory must be allocated. Allocation
depends on the use case. The OP-TEE core asks for the following shared memory
allocations:
- `optee_msg_arg` structure, used to pass the arguments to the non-secure
  world, where the allocation will be done by sending an
  `OPTEE_SMC_RPC_FUNC_ALLOC` message.
- In some cases, a payload might be needed for storing the result from the TEE
  supplicant, for example when loading a Trusted Application. This type of
  allocation will be done by sending the message
  `OPTEE_MSG_RPC_CMD_SHM_ALLOC(OPTEE_MSG_RPC_SHM_TYPE_APPL,...)`, which then
  will return:
  - the physical address of the shared memory
  - a handle to the memory, which will be used later on when freeing this
    memory.

### From the TEE Supplicant
The TEE supplicant also works with shared memory, used to exchange data
between normal and secure worlds. The TEE supplicant receives a memory address
from the OP-TEE core, used to store the data. This is for example the case when
a Trusted Application is loaded. In this case, the TEE supplicant must register
the provided shared memory in the same way a client application would,
involving the Linux driver.

# 8. Pager
OP-TEE currently requires ~256 KiB RAM for the OP-TEE kernel memory. This is
not a problem if OP-TEE uses TrustZone protected DDR, but for security reasons
OP-TEE may need to use TrustZone protected SRAM instead. The amount of
available SRAM varies between platforms, from just a few KiB up to over
512 KiB. Platforms with just a few KiB of SRAM cannot be expected to be able to
run a complete TEE solution in SRAM.
But those with 128 to 256 KiB of SRAM can be expected to host
a capable TEE solution in SRAM. The pager provides a solution for this by
demand paging parts of OP-TEE using virtual memory.

## Secure memory
TrustZone protected SRAM is generally considered more secure than TrustZone
protected DRAM, as there are usually more attack vectors on DRAM. The attack
vectors are hardware dependent and can be different for different platforms.

## Backing store
TrustZone protected DRAM, or in some cases non-secure DRAM, is used as backing
store. The data in the backing store is integrity protected with one hash
(SHA-256) per page (4 KiB). Read-only pages are not encrypted, since the OP-TEE
binary itself is not encrypted.

## Partitioning of memory
The code that handles demand paging must always be available, as it would
otherwise lead to a deadlock. The virtual memory is partitioned as:

```
     Type            Sections
+--------------+-----------------+
|              | text            |
|              | rodata          |
|              | data            |
| unpaged      | bss             |
|              | heap1           |
|              | nozi            |
|              | heap2           |
+--------------+-----------------+
| init / paged | text_init       |
|              | rodata_init     |
+--------------+-----------------+
| paged        | text_pageable   |
|              | rodata_pageable |
+--------------+-----------------+
| demand alloc |                 |
|              |                 |
+--------------+-----------------+
```
Where "`nozi`" stands for "not zero initialized"; this section contains entry
stacks (thread stacks when the TEE pager is not enabled) and translation tables
(the TEE pager cached translation table when the pager is enabled and the LPAE
MMU is used).

The "`init`" area is available while OP-TEE is initializing and contains
everything that is needed to initialize the pager. After the pager has been
initialized, this area will be used for demand paging instead.

The "`demand alloc`" area is a special area where the pages are allocated and
removed from the pager on demand. Those pages are returned when OP-TEE does not
need them any longer.
The thread stacks currently belong to this area. This means
that when a stack is not used, the physical pages can be used by the pager for
better performance.

The technique to gather code in the different areas is based on compiling all
functions and data into separate sections. The unpaged text and rodata are then
gathered by linking all object files with `--gc-sections` to eliminate sections
that are outside the dependency graph of the entry functions for unpaged
functions. A script analyzes this ELF file and generates the bits of the final
link script. The process is repeated for init text and rodata. What is not
"unpaged" or "init" becomes "paged".

## Partitioning of the binary
The binary is partitioned into four parts as:
```
+----------+
| Header   |
+----------+
| Init     |
+----------+
| Hashes   |
+----------+
| Pageable |
+----------+
```
The header is defined as:
```c
#define OPTEE_MAGIC		0x4554504f
#define OPTEE_VERSION		1
#define OPTEE_ARCH_ARM32	0
#define OPTEE_ARCH_ARM64	1

struct optee_header {
	uint32_t magic;
	uint8_t version;
	uint8_t arch;
	uint16_t flags;
	uint32_t init_size;
	uint32_t init_load_addr_hi;
	uint32_t init_load_addr_lo;
	uint32_t init_mem_usage;
	uint32_t paged_size;
};
```

The header is only used by the loader of OP-TEE, not by OP-TEE itself. To
initialize OP-TEE, the loader loads the complete binary into memory and copies
the `init_size` bytes that follow the header to
`(init_load_addr_hi << 32 | init_load_addr_lo)`. `init_mem_usage` is used by
the loader to check that there is enough physical memory available for OP-TEE
to initialize at all. The loader supplies in `r0/x0` the address of the first
byte following what was not copied and jumps to the load address to start
OP-TEE.

## Initializing the pager
The pager is initialized as early as possible during boot in order to minimize
the "init" area.
The global variable `tee_mm_vcore` describes the virtual memory
range that is covered by the level 2 translation table supplied to
`tee_pager_init()`.

### Assign pageable areas
A virtual memory range to be handled by the pager is registered with a call to
`tee_pager_add_area()`:

```c
bool tee_pager_add_area(tee_mm_entry_t *mm, uint32_t flags, const void *store,
			const void *hashes);
```

which takes a pointer to a `tee_mm_entry_t` to tell the range, flags to tell
how the memory should be mapped (read-only, execute, etc.), and pointers to the
backing store and the hashes of the pages.

### Assign physical pages
Physical SRAM pages are supplied by calling `tee_pager_add_pages()`:

```c
void tee_pager_add_pages(tee_vaddr_t vaddr, size_t npages, bool unmap);
```

`tee_pager_add_pages()` takes the physical address stored in the entry mapping
the virtual address "vaddr" and "npages" entries after that and uses it to map
new pages when needed. The unmap parameter tells whether the pages should be
unmapped immediately, since they do not contain initialized data, or be kept
mapped until they need to be recycled. The pages in the "init" area are
supplied with `unmap == false` since those pages have valid content and are in
use.

## Invocation
The pager is invoked as part of the abort handler. A pool of physical pages is
used to map different virtual addresses. When a new virtual address needs to be
mapped, a free physical page is mapped at the new address; if a free physical
page cannot be found, the oldest physical page is selected instead. When the
page is mapped, new data is copied from the backing store and the hash of the
page is verified. If it is OK, the pager returns from the exception to resume
the execution.

## Paging of user TAs

Paging of user TAs can optionally be enabled with `CFG_PAGED_USER_TA=y`.

Paging of user TAs is analogous to paging of the OP-TEE kernel parts, but with
a few differences:
- Read/write pages are paged in addition to read-only pages
- Page tables are managed dynamically

`tee_pager_add_uta_area()` is used to set up the initial read/write mapping
needed when populating the TA. When the TA is fully populated and relocated,
`tee_pager_set_uta_area_attr()` changes the mapping of the area to the strict
permissions used when the TA is running.

# 9. Memory objects

A memory object, MOBJ, describes a piece of memory. The interface provided
is mostly abstract when it comes to using the MOBJ to populate translation
tables etc.

There are different kinds of MOBJs describing:
- physically contiguous memory
  - created with `mobj_phys_alloc()`
- virtual memory
  - one instance with the name `mobj_virt` available
  - spans the entire virtual address space
- physically contiguous memory allocated from a `tee_mm_pool_t *`
  - created with `mobj_mm_alloc()`
- paged memory
  - created with `mobj_paged_alloc()`
  - only contains the supplied size and makes `mobj_is_paged()` return true if
    supplied as argument
- secure copy paged shared memory
  - created with `mobj_seccpy_shm_alloc()`
  - makes `mobj_is_paged()` and `mobj_is_secure()` return true if supplied as
    argument

# 10. Cryptographic abstraction layer
Cryptographic operations are implemented inside the TEE core by the
[LibTomCrypt] library. An abstraction layer allows for replacing the default
implementation, as explained in [crypto.md].

# 11. libutee

The GlobalPlatform Core Internal API describes services that are provided to
Trusted Applications. libutee is a library that implements this API.

libutee is a static library that Trusted Applications shall statically link
against. Trusted Applications execute in the non-privileged secure userspace,
and libutee is likewise designed to execute in the non-privileged secure
userspace.
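A toy model can illustrate how such a userspace library splits its services:
some are implemented entirely in the library, others are thin wrappers that
trap into the privileged core. All names below are hypothetical; this is not
actual libutee code, and the "syscall" is only a stand-in function:

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the privileged side of a system call into the core. */
static long core_syscall_log(const char *msg, size_t len)
{
	/* A real implementation would trap into the core here. */
	return (long)len;
}

/* Service implemented fully inside the library: no syscall needed. */
size_t ta_lib_strlen(const char *s)
{
	return strlen(s);
}

/* Service backed by the core: the library wrapper issues a "syscall". */
long ta_lib_log(const char *msg)
{
	return core_syscall_log(msg, strlen(msg));
}
```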

Some services of this API are implemented fully statically inside the libutee
library, while other services are implemented inside the OP-TEE core
(privileged level) and libutee calls such services through system calls.

# 12. Trusted Applications

## Pseudo TAs and Dynamically Loaded TAs

There are two ways to implement Trusted Applications (TAs): pseudo TAs and
dynamically loaded TAs. As dynamically loaded TAs are full-featured Trusted
Applications as specified by the GlobalPlatform TEE specifications, these are
simply referred to as 'Trusted Applications'. For most use cases, dynamically
loaded TAs are preferred.

### Pseudo Trusted Applications

These are added directly to the OP-TEE core tree, e.g. in `core/arch/arm/pta`,
and are built along with and statically linked into the OP-TEE core blob.

The pseudo Trusted Applications already included in OP-TEE are OP-TEE secure
privileged-level services hidden behind a "GlobalPlatform TA Client" API. These
pseudo TAs are used for various purposes, such as specific secure services or
embedded test services.

Pseudo TAs do not benefit from the GlobalPlatform Core Internal API support
specified by the GlobalPlatform TEE specs. These APIs are provided to TAs as a
static library each TA shall link against ("libutee") which calls OP-TEE core
services through system calls. As the OP-TEE core does not link against
libutee, pseudo TAs can only use the OP-TEE core internal APIs and routines.

As pseudo TAs run at the same privileged execution level as the OP-TEE core
code itself, this situation may not be desirable for complex TAs.

In most cases a real, dynamically loaded TA is the better choice instead of
adding your code directly to the OP-TEE core. However, if you decide your
application is best handled directly in the OP-TEE core like this, you can look
at `core/arch/arm/pta/stats.c` as a template and just add your pseudo TA based
on that to the `sub.mk` in the same directory.

### Trusted Applications

Trusted Applications (TAs) are applications dynamically loaded by the OP-TEE
core in the Secure World when something in the REE wants to talk to that
particular application UUID. It is similar to the way the Linux kernel can
dynamically load kernel modules, although unlike with Linux, in OP-TEE TAs
actually run at a lower CPU privileged level than the OP-TEE core code.

Because TAs are signed with the same key that was used to build the OP-TEE
core, they can be stored in the untrusted REE filesystem, and tee-supplicant
will take care of passing them to be checked and loaded by the Secure World
OP-TEE core. Again, this is similar to Linux kernel module signature checking.

Trusted Applications benefit from the GlobalPlatform Core Internal API as
specified by the GlobalPlatform TEE specifications.

A Trusted Application consists of a signed cleartext ELF file, named from the
UUID of the TA and the suffix ".ta".

TAs are built separately from the OP-TEE core boot-time blob, although they use
the same build system, and are signed with the key from the build of the
original OP-TEE core blob.

## Special treatment of Trusted Applications

### Syscalls

Dynamically loaded TAs are not directly bound to function exports in the OP-TEE
core blob, both because the TA code is kept at arm's length by executing at a
different privileged level, and because direct binding of TAs to addresses in
the core would require all TAs to be upgraded synchronously with upgrades of
the OP-TEE core blob. Instead, the resolution of OP-TEE core exports in the TA
is done at runtime.

OP-TEE does this by using syscalls, in the same way as the Linux kernel
provides a stable API for its userland programs. TAs are written to use syscall
wrappers to access functions exported from the OP-TEE core, so this all happens
automatically when a TA wants to use an API exported from the OP-TEE core.
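The runtime resolution described above can be sketched as a numbered syscall
table on the privileged side, with thin wrappers on the TA side. This is a toy
model with hypothetical names and trivial services, not the actual OP-TEE
dispatch code:

```c
/* Toy model of syscall-style binding: the TA side only knows stable
 * syscall numbers, and the core resolves them to functions at runtime. */
enum { SYSCALL_ADD = 0, SYSCALL_NEG = 1, SYSCALL_NUM = 2 };

static long sys_add(long a, long b) { return a + b; }
static long sys_neg(long a, long b) { (void)b; return -a; }

/* Privileged side: table indexed by syscall number. */
static long (*const syscall_table[SYSCALL_NUM])(long, long) = {
	[SYSCALL_ADD] = sys_add,
	[SYSCALL_NEG] = sys_neg,
};

/* Stand-in for the trap into the privileged level. */
static long do_syscall(int nr, long a, long b)
{
	if (nr < 0 || nr >= SYSCALL_NUM)
		return -1; /* unknown syscall */
	return syscall_table[nr](a, b);
}

/* TA-side wrappers: what the userland library would expose. */
long ta_add(long a, long b) { return do_syscall(SYSCALL_ADD, a, b); }
long ta_neg(long a)         { return do_syscall(SYSCALL_NEG, a, 0); }
```

Upgrading the "core" side only has to keep the syscall numbers stable; the
addresses of `sys_add()` and `sys_neg()` can change freely without rebuilding
the "TA" side.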

Pseudo TAs and anything else directly built into the OP-TEE core do not require
going through a syscall interface, since they link directly, being part of the
core.

Most of the services defined by the GlobalPlatform Core Internal API are
implemented through syscalls from the TA to the OP-TEE core privileged level:
cryptographic services, communications with other TAs, etc. Some services were
added during OP-TEE development, such as ASCII message tracing.

Syscalls are already provided for all public exports from the OP-TEE core that
a dynamic TA is expected to use, so you only need to care about this if you add
new exports from the OP-TEE core that TAs will want to use.

### Malloc mapping

The OP-TEE core code has its own private memory allocation heap that is mapped
into its MMU view only and cannot be seen by Trusted Applications. The core
code uses `malloc()` and `free()` style APIs.

Trusted Applications also have their own private memory allocation heaps that
are visible to the owning TA, and to the OP-TEE core. TAs manage their heaps
using `TEE_Malloc()` and `TEE_Free()` style APIs.

| Heap | Visible to    | Inaccessible to |
|------|---------------|-----------------|
| core | core          | any TA          |
| TA   | core, same TA | any other TA    |

This enforces "Chinese Walls" between the TA views of Secure World.

Since the OP-TEE core cannot perform allocations in the TA's private heap, and
the TA is not able to access allocations from the OP-TEE core heap, only
allocations from the TA heap are visible to both the TA and the OP-TEE core.
When performing syscalls between a TA and the OP-TEE core, the TA side must
therefore provide all the memory allocations for buffers, etc. used by both
sides.

### Malloc pool

The OP-TEE core malloc heap size is defined by `CFG_CORE_HEAP_SIZE` in
`mk/config.mk`.

For TAs, however, the individual TA `TEE_Malloc()` heap size is defined by
`TA_DATA_SIZE` in `user_ta_header_defines.h`.
Likewise, the TA stack size is
set in the same file, in `TA_STACK_SIZE`.

## File format of a Dynamic Trusted Application

The format of a TA is:
```
<Signed header>
<ELF>
```

Where `<ELF>` is the content of a standard ELF file and `<Signed header>`
consists of:

| Type | Name | Comment |
|------|------|---------|
| `uint32_t` | magic | Holds the magic number `0x4f545348` |
| `uint32_t` | img_type | Image type, values defined by `enum shdr_img_type` |
| `uint32_t` | img_size | Image size in bytes |
| `uint32_t` | algo | Algorithm, defined by the public key algorithms `TEE_ALG_*` from the TEE Internal API specification |
| `uint16_t` | hash_size | Size of the signed hash |
| `uint16_t` | sig_size | Size of the signature |
| `uint8_t[hash_size]` | hash | Hash of the fields above and of the `<ELF>` |
| `uint8_t[sig_size]` | signature | Signature of the hash |

[crypto.md]: crypto.md
[LibTomCrypt]: https://github.com/libtom/libtomcrypt
[OP-TEE Client]: https://github.com/OP-TEE/optee_client
[OP-TEE Linux Kernel driver]: https://github.com/linaro-swg/linux
[OP-TEE Trusted OS]: https://github.com/OP-TEE/optee_os
[optee_smc.h]: ../core/arch/arm/include/sm/optee_smc.h
[optee_msg.h]: ../core/include/optee_msg.h
[Platforms Supported]: https://github.com/OP-TEE/optee_os#3-platforms-supported
[struct optee_msg_arg]: ../core/include/optee_msg.h
[thread.c]: ../core/arch/arm/kernel/thread.c
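As a sketch, the signed-header layout in the table above could be parsed as
follows, assuming little-endian fields, no struct padding for these naturally
aligned members, and simplified error handling. This is not the actual OP-TEE
loader code:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHDR_MAGIC 0x4f545348 /* magic number from the table above */

/* The fixed-size part of the signed header; it is followed by
 * hash[hash_size], signature[sig_size] and then the ELF payload. */
struct shdr {
	uint32_t magic;
	uint32_t img_type;
	uint32_t img_size;
	uint32_t algo;
	uint16_t hash_size;
	uint16_t sig_size;
};

/* Returns the offset of the ELF payload in buf, or 0 if the magic is bad. */
size_t ta_elf_offset(const uint8_t *buf)
{
	struct shdr hdr;

	memcpy(&hdr, buf, sizeof(hdr)); /* avoid unaligned access */
	if (hdr.magic != SHDR_MAGIC)
		return 0;
	return sizeof(hdr) + hdr.hash_size + hdr.sig_size;
}
```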