author    Evan Shelhamer <shelhamer@imaginarynumber.net>    2014-10-14 10:18:20 -0700
committer Evan Shelhamer <shelhamer@imaginarynumber.net>    2014-10-14 10:18:20 -0700
commit    4302304ddc678723d5ea308f637fcd4d45252c76 (patch)
tree      b594d0c465796765a3f2264f190b1a9c05e96abf /examples
parent    5fb32067e059ee1a2c631b2283e17458cee64331 (diff)
[examples] fix reference model name for flickr fine-tuning
Diffstat (limited to 'examples')
 examples/finetune_flickr_style/readme.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/examples/finetune_flickr_style/readme.md b/examples/finetune_flickr_style/readme.md
index e0183510..c4aafbc6 100644
--- a/examples/finetune_flickr_style/readme.md
+++ b/examples/finetune_flickr_style/readme.md
@@ -13,14 +13,14 @@ Let's fine-tune the BVLC-distributed CaffeNet model on a different dataset, [Fli
## Explanation
-The Flickr-sourced images of the Style dataset are visually very similar to the ImageNet dataset, on which the `caffe_reference_imagenet_model` was trained.
+The Flickr-sourced images of the Style dataset are visually very similar to the ImageNet dataset, on which the `bvlc_reference_caffenet` was trained.
Since that model works well for object category classification, we'd like to use its architecture for our style classifier.
We also only have 80,000 images to train on, so we'd like to start with the parameters learned on the 1,000,000 ImageNet images, and fine-tune as needed.
If we provide the `weights` argument to the `caffe train` command, the pretrained weights will be loaded into our model, matching layers by name.
Because we are predicting 20 classes instead of 1,000, we do need to change the last layer in the model.
Therefore, we change the name of the last layer from `fc8` to `fc8_flickr` in our prototxt.
-Since there is no layer with that name in the `caffe_reference_imagenet_model`, that layer will begin training with random weights.
+Since there is no layer with that name in the `bvlc_reference_caffenet`, that layer will begin training with random weights.
We will also decrease the overall learning rate `base_lr` in the solver prototxt, but boost the `blobs_lr` on the newly introduced layer.
The idea is to have the rest of the model change very slowly with new data, but let the new layer learn fast.
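For concreteness, here is a minimal sketch of the fine-tuning invocation the readme describes. The solver and weights paths follow Caffe's usual repository layout and are assumptions, not taken from this diff:

```
# Initialize layers whose names match the pretrained net (conv1..fc7) from
# the .caffemodel, then train with the Flickr Style solver. Matching is by
# name, so fc8_flickr has no match and starts from random weights.
./build/tools/caffe train \
    -solver models/finetune_flickr_style/solver.prototxt \
    -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
```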
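A sketch of how the renamed last layer might look in the train/val prototxt, using the V1 `layers` syntax current at the time of this commit; the exact `blobs_lr` multipliers are illustrative assumptions:

```
# Last layer renamed fc8 -> fc8_flickr; no pretrained blob matches this name.
layers {
  name: "fc8_flickr"
  type: INNER_PRODUCT
  bottom: "fc7"
  top: "fc8_flickr"
  blobs_lr: 10   # learning-rate multiplier for the weights, boosted vs. the rest of the net
  blobs_lr: 20   # learning-rate multiplier for the bias
  inner_product_param {
    num_output: 20   # 20 style classes rather than the 1,000 ImageNet categories
  }
}
```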
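The lowered overall learning rate lives in the solver prototxt. The values below are illustrative assumptions rather than the ones shipped with the example:

```
net: "models/finetune_flickr_style/train_val.prototxt"
base_lr: 0.001       # lowered ~10x from the 0.01 used to train CaffeNet from scratch
lr_policy: "step"    # drop the rate by gamma every stepsize iterations
gamma: 0.1
stepsize: 20000
```

Together with the boosted `blobs_lr` above, this keeps the pretrained layers nearly frozen while the new `fc8_flickr` layer learns quickly.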