Diffstat (limited to 'docs/nnfw/howto/HowtoMakeSampleAppOnNnfw.md')
-rw-r--r-- | docs/nnfw/howto/HowtoMakeSampleAppOnNnfw.md | 132 |
1 files changed, 132 insertions, 0 deletions
diff --git a/docs/nnfw/howto/HowtoMakeSampleAppOnNnfw.md b/docs/nnfw/howto/HowtoMakeSampleAppOnNnfw.md
new file mode 100644
index 000000000..d272a8390
--- /dev/null
+++ b/docs/nnfw/howto/HowtoMakeSampleAppOnNnfw.md
@@ -0,0 +1,132 @@
# How to make a sample app on nnfw

Our runtime `neurun` currently supports `NNAPI` as its interface. One way to use `NNAPI` efficiently is through TensorFlow Lite. We provide an additional library in `/libs/tflite` to help with using TensorFlow Lite. (This library is not officially supported.)

To use TensorFlow Lite, you need a TensorFlow Lite model file and the names of the model's input/output tensors. Then you can write a sample app.

## Prepare a loaded TensorFlow Lite model object

You can select one of two kernel registers: the official TensorFlow Lite kernel register or the extended register (which includes pre-implemented custom ops):
```
#include "tensorflow/lite/kernels/register.h"
#include "tflite/ext/kernels/register.h"
```

To use the TensorFlow Lite interpreter, you need the interpreter session header:
```
#include "tflite/InterpreterSession.h"
```

For NNAPI usage, you need the NNAPI session header:
```
#include "tflite/NNAPISession.h"
```

Load the model into a `FlatBufferModel` object, create a TensorFlow Lite operator resolver `BuiltinOpResolver`, and construct an interpreter builder using them:
```
tflite::StderrReporter error_reporter;
auto model = tflite::FlatBufferModel::BuildFromFile(model_file.c_str(), &error_reporter);

// TODO: determine which BuiltinOpResolver and prepend namespace
// (the resolver comes from whichever register header was included above)
BuiltinOpResolver resolver;

tflite::InterpreterBuilder builder(*model, resolver);
```

Declare a TensorFlow Lite interpreter and let the builder create and initialize it:
```
std::unique_ptr<tflite::Interpreter> interpreter;
builder(&interpreter);
```

Create a TensorFlow Lite session that runs through NNAPI:
```
std::shared_ptr<nnfw::tflite::Session> sess = std::make_shared<nnfw::tflite::NNAPISession>(interpreter.get());
```

If you want to use the TensorFlow Lite interpreter instead of NNAPI:
```
std::shared_ptr<nnfw::tflite::Session> sess = std::make_shared<nnfw::tflite::InterpreterSession>(interpreter.get());
```

`NNAPISession` constructs a computational graph from the interpreter and builds the model.

## Prepare tensor memory allocation and model input for inference

Allocate the memory for the tensors of `tflite::Interpreter`:
```
sess->prepare();
```

Prepare the inputs. How to prepare them is task specific and out of scope here.<br/>
Copy the input data into the model, i.e. into `interpreter->inputs()`. This is TensorFlow Lite specific, not nnfw specific, so you can use any method applicable to TensorFlow Lite, e.g.:
```
for (const auto &id : interpreter->inputs())
{
  if (interpreter->tensor(id)->name == input_name)
  {
    float *p = interpreter->tensor(id)->data.f;

    // Copy HWC-ordered float data into the input tensor
    for (int y = 0; y < height; ++y)
    {
      for (int x = 0; x < width; ++x)
      {
        for (int c = 0; c < channel; ++c)
        {
          *p++ = data[y * width * channel + x * channel + c];
        }
      }
    }
  }
}
```
where:<br/>
`input_name` - the name of the model's input tensor;<br/>
`data` - a source vector of size `height * width * channel`.
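
The copy loop above assumes `height`, `width`, and `channel` are already known. If they are not, they can be read from the input tensor's `dims`, in the same way the output size is computed below. This is only a sketch, assuming a single float input tensor in NHWC layout and reusing the hypothetical `input_name`, `height`, `width`, and `channel` names from above:
```
for (const auto &id : interpreter->inputs())
{
  if (interpreter->tensor(id)->name == input_name)
  {
    // For an NHWC tensor, dims->data holds {batch, height, width, channel}
    TfLiteTensor *t = interpreter->tensor(id);
    height = t->dims->data[1];
    width = t->dims->data[2];
    channel = t->dims->data[3];
  }
}
```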

## Run the inference and get outputs

Run the inference:
```
sess->run();
```

Get the result from `interpreter->outputs()`. This is TensorFlow Lite specific, not nnfw specific, so you can use any method applicable to TensorFlow Lite, e.g.:
```
for (const auto &id : interpreter->outputs())
{
  if (interpreter->tensor(id)->name == output_name)
  {
    float *p = interpreter->tensor(id)->data.f;

    // Assumes result was reserved with exactly the output element count (see below)
    for (int i = 0; i < result.capacity(); ++i)
    {
      result.push_back(p[i]);
    }
  }
}
```
where:<br/>
`output_name` - the name of the model's output tensor;<br/>
`result` - a float vector that receives the output. Its size can be calculated using:
```
for (const auto &id : interpreter->outputs())
{
  if (interpreter->tensor(id)->name == output_name)
  {
    TfLiteTensor *t = interpreter->tensor(id);
    // Multiply all dimensions to get the total number of elements
    int v = 1;
    for (int i = 0; i < t->dims->size; ++i)
    {
      v *= t->dims->data[i];
    }
    return v;
  }
}
return -1;
```

Release the session:
```
sess->teardown();
```
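
For reference, here is a minimal end-to-end sketch of a sample app assembled from the steps above. It is not an official example: the TensorFlow Lite header path, the `nnfw::tflite::BuiltinOpResolver` namespace, and the `fill_inputs()` / `read_outputs()` helpers (stand-ins for the input and output copy loops shown earlier) are assumptions that may need adjusting to your source tree:
```
// Minimal sample app sketch. Header paths, the BuiltinOpResolver namespace, and the
// fill_inputs()/read_outputs() helpers are assumptions; adjust them to your tree.
#include "tensorflow/lite/model.h"       // FlatBufferModel, InterpreterBuilder, StderrReporter
#include "tflite/ext/kernels/register.h" // extended kernel register (custom ops)
#include "tflite/NNAPISession.h"

#include <iostream>
#include <memory>

int main(int argc, char **argv)
{
  if (argc < 2)
  {
    std::cerr << "Usage: " << argv[0] << " <model.tflite>" << std::endl;
    return 1;
  }

  // Load the model and build an interpreter
  tflite::StderrReporter error_reporter;
  auto model = tflite::FlatBufferModel::BuildFromFile(argv[1], &error_reporter);
  nnfw::tflite::BuiltinOpResolver resolver; // assumed namespace of the extended resolver
  tflite::InterpreterBuilder builder(*model, resolver);

  std::unique_ptr<tflite::Interpreter> interpreter;
  builder(&interpreter);

  // Create a session that runs the model through NNAPI
  std::shared_ptr<nnfw::tflite::Session> sess =
      std::make_shared<nnfw::tflite::NNAPISession>(interpreter.get());

  sess->prepare(); // allocate tensor memory

  // fill_inputs(*interpreter);  // hypothetical helper: the input copy loop shown above

  sess->run(); // run the inference

  // read_outputs(*interpreter); // hypothetical helper: the output copy loop shown above

  sess->teardown(); // release the session
  return 0;
}
```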