# How to make a sample app on nnfw

Our runtime `neurun` currently supports `NNAPI` as its interface. One way to use `NNAPI` efficiently is to go through tensorflow lite. We provide an additional library in `/libs/tflite` to make tensorflow lite easier to use. (This library is not officially supported.)

To use tensorflow lite, you need to prepare a tensorflow lite model file and know the names of its input/output tensors. Then you can write your sample app.

## Prepare loaded tensorflow lite model object

You can select one of two kernel registers: the official tensorflow lite kernel register or the extended register (for pre-implemented custom ops):
```
#include "tensorflow/lite/kernels/register.h"
#include "tflite/ext/kernels/register.h"
```

To use the tensorflow lite interpreter, include the interpreter session header:
```
#include "tflite/InterpreterSession.h"
```

For NNAPI usage, include the NNAPI session header:
```
#include "tflite/NNAPISession.h"
```

Load the model into a `FlatBufferModel` object, create a tensorflow lite operator resolver `BuiltinOpResolver`, and construct a tensorflow lite interpreter builder using them:
```
tflite::StderrReporter error_reporter;
auto model = tflite::FlatBufferModel::BuildFromFile(model_file.c_str(), &error_reporter);

// The official resolver; if the model needs the pre-implemented custom ops,
// use the extended resolver declared in tflite/ext/kernels/register.h instead.
tflite::ops::builtin::BuiltinOpResolver resolver;

tflite::InterpreterBuilder builder(*model, resolver);
```

Create a tensorflow lite interpreter by invoking the builder:
```
std::unique_ptr<tflite::Interpreter> interpreter;
builder(&interpreter);
```
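
The builder call returns a `TfLiteStatus`, so a real sample app may want to check that the interpreter was actually created. A minimal sketch, replacing the bare `builder(&interpreter);` call above (assumes this runs inside `main()` with `<iostream>` included):
```
if (builder(&interpreter) != kTfLiteOk || interpreter == nullptr)
{
  std::cerr << "Failed to build the tensorflow lite interpreter" << std::endl;
  return 1;
}
```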

Create a tensorflow lite session to use NNAPI:
```
std::shared_ptr<nnfw::tflite::Session> sess = std::make_shared<nnfw::tflite::NNAPISession>(interpreter.get());
```

If you want to use the tensorflow lite interpreter instead of NNAPI:
```
std::shared_ptr<nnfw::tflite::Session> sess = std::make_shared<nnfw::tflite::InterpreterSession>(interpreter.get());
```

`NNAPISession` constructs a computational graph from the interpreter and builds the model.
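
Both session types implement the `nnfw::tflite::Session` interface, so a sample app can choose between them at run time. A minimal sketch, assuming a hypothetical `use_nnapi` flag (e.g. parsed from the command line):
```
std::shared_ptr<nnfw::tflite::Session> sess;
if (use_nnapi)
{
  // Run the model through NNAPI.
  sess = std::make_shared<nnfw::tflite::NNAPISession>(interpreter.get());
}
else
{
  // Fall back to the plain tensorflow lite interpreter.
  sess = std::make_shared<nnfw::tflite::InterpreterSession>(interpreter.get());
}
```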

## Prepare tensor memory allocation and model input for inference

Allocate memory for the tensors of `tflite::Interpreter`:
```
sess->prepare();
```
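
As noted earlier, you need to know the input/output tensor names. If they are not known in advance, they can be listed from the interpreter. A sketch (requires `<iostream>`):
```
for (const auto &id : interpreter->inputs())
{
  std::cout << "input : " << interpreter->tensor(id)->name << std::endl;
}
for (const auto &id : interpreter->outputs())
{
  std::cout << "output: " << interpreter->tensor(id)->name << std::endl;
}
```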

Prepare the inputs. How to prepare them is out of scope of this document and is task specific.<br/>
Copy the input data into the model, i.e. into `interpreter->inputs()`. This is tensorflow lite specific, not nnfw specific, so you can use any method applicable to tensorflow lite, e.g.:
```
for (const auto &id : interpreter->inputs())
{
  if (interpreter->tensor(id)->name == input_name)
  {
    // Copy the HWC-ordered source buffer into the float input tensor.
    float *p = interpreter->tensor(id)->data.f;

    for (int y = 0; y < height; ++y)
    {
      for (int x = 0; x < width; ++x)
      {
        for (int c = 0; c < channel; ++c)
        {
          *p++ = data[y * width * channel + x * channel + c];
        }
      }
    }
  }
}
```
where:<br/>
`input_name` - the name of the model's input tensor;<br/>
`data` - a source buffer of size `height * width * channel`.
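
If `height`, `width`, and `channel` are not known in advance, they can be read from the matched input tensor inside the loop above. A sketch, assuming a 4-D NHWC input shape:
```
// Assumes the input tensor shape is (batch, height, width, channel).
TfLiteTensor *t = interpreter->tensor(id);
const int height = t->dims->data[1];
const int width = t->dims->data[2];
const int channel = t->dims->data[3];
```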

## Run the inference and get outputs

Run the inference:
```
sess->run();
```

Get the result from `interpreter->outputs()`. This is tensorflow lite specific, not nnfw specific, so you can use any method applicable to tensorflow lite, e.g.:
```
for (const auto &id : interpreter->outputs())
{
  if (interpreter->tensor(id)->name == output_name)
  {
    float *p = interpreter->tensor(id)->data.f;

    // Copy the output elements into the pre-sized result vector.
    for (size_t i = 0; i < result.size(); ++i)
    {
      result[i] = p[i];
    }
  }
}
```
where:<br/>
`output_name` - the name of the model's output tensor;<br/>
`result` - a float vector that receives the output; resize it beforehand to the number of output elements, which can be calculated using
```
for (const auto &id : interpreter->outputs())
{
  if (interpreter->tensor(id)->name == output_name)
  {
    TfLiteTensor *t = interpreter->tensor(id);
    int v = 1;
    for (int i = 0; i < t->dims->size; ++i)
    {
      v *= t->dims->data[i];
    }
    return v;
  }
}
// No output tensor with the given name was found.
return -1;
```

Release the session:
```
sess->teardown();
```
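
Putting the pieces together, a minimal sample app could look like the sketch below. The model path `model.tflite`, the zero-filled float input, and the printed output are placeholders, and the exact tensorflow lite header paths depend on the tensorflow version you build against.
```
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"
#include "tflite/NNAPISession.h"

#include <algorithm>
#include <iostream>
#include <memory>
#include <string>

int main()
{
  const std::string model_file = "model.tflite"; // placeholder model path

  tflite::StderrReporter error_reporter;
  auto model = tflite::FlatBufferModel::BuildFromFile(model_file.c_str(), &error_reporter);
  if (model == nullptr)
  {
    std::cerr << "Failed to load " << model_file << std::endl;
    return 1;
  }

  tflite::ops::builtin::BuiltinOpResolver resolver;
  tflite::InterpreterBuilder builder(*model, resolver);

  std::unique_ptr<tflite::Interpreter> interpreter;
  if (builder(&interpreter) != kTfLiteOk || interpreter == nullptr)
  {
    std::cerr << "Failed to build the interpreter" << std::endl;
    return 1;
  }

  // Use NNAPI through the nnfw tflite extension library.
  auto sess = std::make_shared<nnfw::tflite::NNAPISession>(interpreter.get());
  sess->prepare();

  // Placeholder input: fill every (float) input tensor with zeros.
  for (const auto &id : interpreter->inputs())
  {
    TfLiteTensor *t = interpreter->tensor(id);
    int size = 1;
    for (int i = 0; i < t->dims->size; ++i)
    {
      size *= t->dims->data[i];
    }
    std::fill(t->data.f, t->data.f + size, 0.0f);
  }

  sess->run();

  // Print the first value of every (float) output tensor.
  for (const auto &id : interpreter->outputs())
  {
    std::cout << interpreter->tensor(id)->name << ": " << interpreter->tensor(id)->data.f[0]
              << std::endl;
  }

  sess->teardown();
  return 0;
}
```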