author: Forrest Iandola <fiandola@gmail.com> 2017-10-16 00:44:24 -0700
committer: GitHub <noreply@github.com> 2017-10-16 00:44:24 -0700
commit: f054c210e493111061a458b887f7c4edaca06a9f
tree: 7d4d52c6330bc4ee8bfcea0141967698be9feb97
parent: bf8b01dfbfdca124673ade33c5eac8f3748d7abd
fix comment
typo kernerl's --> kernel's
-rw-r--r-- src/core/NEON/kernels/NEDirectConvolutionLayerKernel.cpp 2
1 file changed, 1 insertion, 1 deletion
diff --git a/src/core/NEON/kernels/NEDirectConvolutionLayerKernel.cpp b/src/core/NEON/kernels/NEDirectConvolutionLayerKernel.cpp
index 2766d698d..66d6d1fd4 100644
--- a/src/core/NEON/kernels/NEDirectConvolutionLayerKernel.cpp
+++ b/src/core/NEON/kernels/NEDirectConvolutionLayerKernel.cpp
@@ -1082,7 +1082,7 @@ public:
the third thread [16,24] and the fourth thread [25,31].
The algorithm outer loop iterates over Z, P, Y, X where P is the depth/3rd dimension of each kernel. This order is not arbitrary, the main benefit of this
- is that we setup the neon registers containing the kernerl's values only once and then compute each XY using the preloaded registers as opposed as doing this for every XY value.
+ is that we setup the neon registers containing the kernel's values only once and then compute each XY using the preloaded registers as opposed as doing this for every XY value.
The algorithm does not require allocating any additional memory amd computes the results directly in-place in two stages:
1) Convolve plane 0 with kernel 0 and initialize the corresponding output plane with these values.