# SW Detailed Level Design

**Revision history**

| Ver. | Date       | Contents          | Author            | Approver     |
| ---- | ---------- | ----------------- | ----------------- | ------------ |
| 0.1  | 2018.06.20 | Initial version   | Vostokov Sergey   | Sung-Jae Lee |
| 0.2  | 2018.06.21 | SE member review  | Alexey Kondrashov |              |
| 1.0  | 2018.06.22 | Final DR1 version | Vostokov Sergey   | Sung-Jae Lee |

**Terminology and Abbreviation**

|              |                                                               |
| ------------ | ------------------------------------------------------------- |
| OS           | Operating System                                              |
| OS API       | Application interface of OS                                   |
| HW           | Hardware                                                      |
| SW           | Software                                                      |
| NN           | Neural Network                                                |
| NN model     | Neural network model (Instance of NN built with ML framework) |
| NN compiler  | The compiler for neural network                               |
| ML framework | The machine learning framework                                |
| TF/TF Lite   | Tensorflow/Tensorflow Lite ML framework                       |
| IR           | Intermediate representation                                   |
| CI/CI system | Continuous integration system                                 |
| UI           | The user interface                                            |
| GUI          | The graphical user interface                                  |
| CLI          | The command-line interface                                    |

**References**

\[1\] Vostokov Sergey, [SW Requirements Specification](requirements_specification.md)

\[2\] Vostokov Sergey, [SW High-Level Design](high_level_design.md)

## Overview

### Scope

The main goal of the project is to develop a compiler for neural
networks that produces an executable artefact for a specified SW and HW
platform.

The development scope includes the following components:

  - Develop an importer module to parse, verify, and represent an NN
    model for further optimization and compilation
  - Develop code emitters to produce an executable binary for CPU and GPU

  
**Goals for 2018:**

  - Support the TensorFlow Lite NN model format
  - Support the Caffe NN model format
  - Support the Caffe2 NN model format (Optional)
  - Support compilation of the MobileNet NN
  - Support compilation of the Inception v3 NN
  - Support ARM CPU
  - Support ARM GPU (Mali)
  - Support Tizen OS
  - Support SmartMachine OS (Optional)

| Product             | Target Model Name              | Comment          |
| ------------------- | ------------------------------ | ---------------- |
| Tizen phone         | Tizen TM2                      | Reference device |
| Tizen device        | Odroid XU4                     | Reference board  |
| SmartMachine target | Microvision mv8890, exynos8890 | Reference device |

Table 1-1. Target Model

### Design Consideration

Deep learning software demands reliability and performance. The common
approach which comes from the history is to develop a SW framework
(machine learning framework) which would compute each step of the neural
network inference process using supported hardware. This approach is
used in many popular solutions like Google Tensorflow/Tensorflow Lite,
Caffe/2, etc. Traditionally, neural network developers build a
computation graph and then an appropriate machine learning framework
interprets it. The latest discoveries in AI field show that the
node-visitor method of execution is inefficient. As a result, a second
approach has been worked out by the industry, which is a neural network
compiler that executes code more efficiently.

This document presents the design of the *nncc*, a neural network
compiler collection. The design should provide the easiest way to extend
the functionality of the *nncc* by adding new modules with the following
features:

  - Support neural networks produced by various machine learning
    frameworks;
  - Produce an artefact that takes advantage of various hardware,
    including specialized processors like NPUs;
  - Apply new domain specific optimization techniques over given NN.

Non-functional requirements for the developed software are described in
the SW Requirements Specification; they are not repeated here to avoid
duplication.

### Constraints

See constraints in SW Requirements Specification.


<table>
<colgroup>
<col style="width: 24%" />
<col style="width: 64%" />
<col style="width: 10%" />
</colgroup>
<thead>
<tr class="header">
<th>Item</th>
<th>Assumptions, Dependencies and the Constraints</th>
<th>Reference</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Tizen SW Platform</td>
<td><dl>
<dt>The following items should be provided:</dt>
<dd><ul>
<li>Tizen API</li>
<li>Tizen kernel</li>
<li>Tizen FW</li>
<li>Tizen SDK</li>
<li>Tizen naming convention</li>
</ul>
</dd>
</dl></td>
<td>- <a href="www.tizen.org" class="uri">www.tizen.org</a> <br>- <a href="wiki.tizen.org" class="uri">wiki.tizen.org</a> <br>- <a href="developer.tizen.org" class="uri">developer.tizen.org</a></td>
</tr>
<tr class="even">
<td>SmartMachine OS Platform</td>
<td><dl>
<dt>The following items should be provided:</dt>
<dd><ul>
<li>SmartMachine API</li>
<li>SmartMachine kernel</li>
<li>SmartMachine FW</li>
<li>SmartMachine SDK</li>
<li>SmartMachine naming convention</li>
</ul>
</dd>
</dl></td>
<td>- <a href="http://suprem.sec.samsung.net/confluence/pages/viewpage.action?pageId=81833987">Platform confluence</a> <br>- <a href="https://github.sec.samsung.net/RS7-SmartMachine">Github</a> <br>- <a href="http://suprem.sec.samsung.net/confluence/display/ASEC/Adaptive+AUTOSAR">Functional Safety confluence</a></td>
</tr>
<tr class="odd">
<td>Host OS</td>
<td>Linux-based OS (Ubuntu, Archlinux, etc)</td>
<td>- <a href="https://www.ubuntu.com/">Ubuntu site</a> <br>- <a href="https://www.archlinux.org/">Archlinux site</a></td>
</tr>
<tr class="even">
<td>Tizen target HW</td>
<td>The reference device should be provided: Tizen TM2</td>
<td></td>
</tr>
<tr class="odd">
<td>SmartMachine target HW</td>
<td>The reference device should be provided</td>
<td></td>
</tr>
</tbody>
</table>
Table 1-2. Assumptions, Dependencies and the Constraints

## SW Detailed Structure Design

### SW Block Structure

The top-level components of the nncc are described in the HLD. A more
detailed structure and class diagram will be available after development
is complete.

### SW Block Feature

1.  Initialization: configure all internal modules (see
    [{Initialization} Detailed Design](#initialization-detailed-design))
2.  Frontend: Import NN model (see [{Import NN model} Detailed
    Design](#import-nn-model-detailed-design))
      - *Caffe frontend*: includes the parser of the Caffe NN model format,
        a verifier to ensure that the parsed data is valid and consistent,
        and a Caffe-specific converter to Model IR
      - *Caffe2 frontend*: includes the parser of the Caffe2 NN model
        format, a verifier to ensure that the parsed data is valid and
        consistent, and a Caffe2-specific converter to Model IR
      - *Tensorflow Lite frontend*: includes the parser of the Tensorflow
        Lite NN model format with automatic version recognition, a
        verifier to ensure that the parsed data is valid and consistent,
        and a Tensorflow Lite-specific converter to Model IR
3.  Backend: Generate the code (see [{Generate the code} Detailed
    Design](#generate-the-code-detailed-design))
      - *Interpreter*: As described in the SW High-Level Design document,
        an imported NN model may pass through three levels of intermediate
        representation: Model IR, Coarse-Grained IR, and Fine-Grained IR.
        The Interpreter backend uses each of these IRs to run inference
        on the given NN model. As the output, the user gets the results
        of computing all NN operations included in the original
        computation graph.
      - *Binary*: This backend generates binary code that can be
        executed on the target device. The NN compiler can generate code
        that is either executed solely on the CPU or, if the corresponding
        target was specified, takes advantage of the GPU when possible.
        The user may want to incorporate 3rd-party libraries included in
        the target firmware or delivered with the application package. In
        this case, the compiler prepares the data following the EABI
        convention and embeds invocations of the high-level functions by
        the appropriate symbols.
      - *Soft*: The resulting program is generated source code in a
        high-level programming language, C or C++. There are two options
        here: the first is to generate source code that does not depend
        on any libraries outside of itself, with the exception of system
        libraries. The second is to include code that invokes high-level
        functions from 3rd-party libraries. For example, it may be an
        invocation of matrix multiplication from a GEMM library.
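
The frontend/backend split described above can be sketched as a pair of
small interfaces. This is an illustrative sketch only: `ModelIR`,
`Frontend`, `TFLiteFrontend`, and `makeFrontend` are hypothetical names
for this document, not the actual nncc classes.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical sketch of the plugin seam. The unified Model IR that
// every frontend converts into (heavily simplified here).
struct ModelIR {
  std::string sourceFormat;      // which frontend produced it
  std::vector<std::string> ops;  // op names in topological order
};

// Common frontend interface: parse, verify, and convert one format.
class Frontend {
public:
  virtual ~Frontend() = default;
  virtual ModelIR import(const std::string &path) = 0;
};

// Stub standing in for the Tensorflow Lite frontend.
class TFLiteFrontend : public Frontend {
public:
  ModelIR import(const std::string &) override {
    return ModelIR{"tflite", {}};  // real code would parse the file
  }
};

// Choose a frontend by model file extension.
std::unique_ptr<Frontend> makeFrontend(const std::string &path) {
  if (path.size() >= 7 && path.rfind(".tflite") == path.size() - 7)
    return std::make_unique<TFLiteFrontend>();
  return nullptr;  // unsupported format
}
```

A backend interface would mirror this shape, consuming `ModelIR` and
producing an interpreter run, a binary, or generated source.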

## SW Detailed Operation Design

### {Initialization} Detailed Design

#### Major Function

To provide a valid configuration session for all modules of *nncc* using
user input from the command line/config file/environment variables.

#### Operation Sequence

Initialization of the *nncc* includes command line option processing,
configuration of its subsystems as well as any error checking possible
at this stage. It consists of the following steps:

1.  Collect all command line options and verify their format for
    validity (no syntax errors etc.)

2.  Check for validity and then process general options

3.  Load subsystem modules

4.  For each one of them:
    
      - Configure
      - Pass command line options
      - Check command line options for validity (for example, check
        that every required option is present)

At the end of this process each subsystem is configured and has access
to all data needed for its operation.
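
The steps above can be sketched as follows. The `--key=value` option
syntax and the `Subsystem` structure are assumptions made for
illustration, not the real nncc configuration API.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using Options = std::map<std::string, std::string>;

// Step 1: collect "--key=value" arguments, rejecting malformed ones.
Options parseOptions(const std::vector<std::string> &args) {
  Options opts;
  for (const auto &arg : args) {
    if (arg.rfind("--", 0) != 0)
      throw std::runtime_error("bad option: " + arg);
    auto eq = arg.find('=');
    if (eq == std::string::npos)
      throw std::runtime_error("missing '=': " + arg);
    opts[arg.substr(2, eq - 2)] = arg.substr(eq + 1);
  }
  return opts;
}

// Steps 3-4: each subsystem receives the options and verifies that
// every option it requires is present before configuring itself.
struct Subsystem {
  std::string name;
  std::vector<std::string> required;
  void configure(const Options &opts) const {
    for (const auto &key : required)
      if (!opts.count(key))
        throw std::runtime_error(name + ": missing required option --" + key);
  }
};
```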

### {Import NN model} Detailed Design

#### Major Function

To convert given NN model from framework-specific IR to Model IR for
further processing.

#### Operation Sequence

As shown in the diagram, neural network import is the main function of
the compiler front end. The result of this operation is a computation
graph represented as Model IR.

![image](../images/nncc_idef0_a12.png)

The import process consists of three parts:

1.  NN model parsing
2.  Verification of the result from the previous step
3.  Converting the model to the Model IR

During the first step, the file or files containing the model are read
and represented in an in-memory format specific to the NN framework.

The verification step is included to ensure that:

  - None of the files constituting the model are damaged
  - Model format corresponds to the specified one
  - Version of the model format corresponds to the specified one
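
A format check of this kind can be as small as inspecting a file
identifier. The sketch below assumes a FlatBuffers-style layout, in
which Tensorflow Lite stores the 4-byte identifier `TFL3` at byte
offset 4; treat the exact constant and offset as assumptions of this
sketch rather than a specification.

```cpp
#include <string>

// Cheap sanity check before full parsing: does the buffer carry the
// expected (assumed) Tensorflow Lite FlatBuffers file identifier?
bool looksLikeTFLite(const std::string &bytes) {
  return bytes.size() >= 8 && bytes.substr(4, 4) == "TFL3";
}
```

A real verifier would go further, checking the schema version and that
every referenced tensor and operator is well-formed.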

The most important step is accurately converting the model from the
framework-specific representation to the Model IR. This conversion
includes:

  - *Translation of the NN model computation graph to the Model IR
    computation graph.* During the translation, new nodes may be
    introduced; for example, a high-level NN operation may be split
    into a few smaller ones.
  - *NN model parameter layout conversion.* The way parameters (also
    known as weights) of a model are laid out in each specific NN
    framework may differ, and it is necessary to convert such layouts
    into a unified format.
  - *NN operation parameter conversion.* Each NN operation has a set
    of its own parameters describing the way this operation should be
    performed, and these parameters also differ between frameworks.

The resulting Model IR is equivalent to the initial NN model in terms of
how the NN model's inputs would be transformed into its outputs if all
the operations in the Model IR were executed.
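
As a concrete example of the parameter layout conversion step, the
sketch below re-orders a tensor stored in NHWC order (as Tensorflow
typically lays out data) into NCHW order (as Caffe does). The choice of
NCHW as the unified target layout is an assumption for illustration.

```cpp
#include <vector>

// Re-order a flat NHWC tensor of shape (n, h, w, c) into NCHW order.
// Each element's multi-index is kept; only its linear offset changes.
std::vector<float> nhwcToNchw(const std::vector<float> &src,
                              int n, int h, int w, int c) {
  std::vector<float> dst(src.size());
  for (int in = 0; in < n; ++in)
    for (int ih = 0; ih < h; ++ih)
      for (int iw = 0; iw < w; ++iw)
        for (int ic = 0; ic < c; ++ic) {
          int from = ((in * h + ih) * w + iw) * c + ic;  // NHWC offset
          int to   = ((in * c + ic) * h + ih) * w + iw;  // NCHW offset
          dst[to] = src[from];
        }
  return dst;
}
```

For example, a (1, 1, 2, 2) tensor stored as `{0, 1, 2, 3}` in NHWC
comes out as `{0, 2, 1, 3}` in NCHW.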

### {Generate the code} Detailed Design

Development in progress. Will be described on Completion DR.

## Interface Design

Development in progress. Will be described on DR2.

## SW Code Structure

| Directory                | Description                                                          |
| ------------------------ | -------------------------------------------------------------------- |
| /                        | source codes of the build system, main README file                   |
| /contrib                 | Incubating projects                                                  |
| /doc                     | Contains the documentation of the project                            |
| /doc/project             | Contains project management documents (SRS, SDD, STD, HLD, DLD, etc) |
| /libs                    | Contains the source of the libraries which are used by the nncc      |
| /libs/core               | Contains the source code of the core library of nncc                 |
| /libs/frontend           | Contains the source code of supported frontend plugins               |
| /libs/frontend/caffe     | The source code for the Caffe frontend                               |
| /libs/frontend/caffe2    | The source code for the Caffe2 frontend                              |
| /libs/frontend/tflite    | The source code for the Tensorflow Lite frontend                     |
| /libs/backend            | Contains the source code of supported backend plugins                |
| /libs/backend/cpu        | Contains the source code of CPU backend                              |
| /libs/backend/gpu        | Contains the source code of GPU backend                              |
| /libs/backend/3rd\_party | Contains the source code of backend to utilize 3rd party libraries   |
| /scripts                 | Various scripts for building and testing the nncc                    |
| /tools                   | The source code of the executables                                   |