author | Evan Shelhamer <shelhamer@imaginarynumber.net> | 2014-05-20 12:44:51 -0700 |
---|---|---|
committer | Evan Shelhamer <shelhamer@imaginarynumber.net> | 2014-05-20 12:44:51 -0700 |
commit | c6b00a34a13209daed0a73ab59e0a279f519f21b | |
tree | d7ce6d002c7ca6799afc8e3eb7f2970ce9b33f61 | |
parent | f048bea989a1fa99fb75cd8ebc384e997e0fe0df | |
parent | 86a3263fe5035a0caa341f88663d61c60efcddad | |
Back-merge changes in master
* master:
bundle presentation in gh-pages for now...
fix typo pointed out by @yinxusen
note support for non-MKL installation in dev
include pretrained snapshot and performance details
Document AlexNet model, include download script
define AlexNet architecture
polished ignore
-rw-r--r-- | README.md | 2 |
-rw-r--r-- | docs/getting_pretrained_models.md | 12 |
-rw-r--r-- | docs/index.md | 2 |
-rw-r--r-- | examples/imagenet/alexnet_deploy.prototxt | 260 |
-rw-r--r-- | examples/imagenet/alexnet_solver.prototxt | 14 |
-rw-r--r-- | examples/imagenet/alexnet_train.prototxt | 332 |
-rw-r--r-- | examples/imagenet/alexnet_val.prototxt | 245 |
-rwxr-xr-x | examples/imagenet/get_caffe_alexnet_model.sh | 28 |
-rwxr-xr-x | examples/imagenet/train_alexnet.sh | 7 |
9 files changed, 899 insertions, 3 deletions
@@ -22,7 +22,7 @@ line of code: `Caffe::set_mode(Caffe::CPU)`.
 Even in CPU mode, computing predictions on an image takes only 20 ms when images
 are processed in batch mode.

-* [Caffe introductory presentation](https://www.dropbox.com/s/10fx16yp5etb8dv/caffe-presentation.pdf)
+* [Caffe introductory presentation](http://berkeleyvision.org/caffe-presentation.pdf)
 * [Installation instructions](http://caffe.berkeleyvision.org/installation.html)

 \* When measured with the [SuperVision](http://www.image-net.org/challenges/LSVRC/2012/supervision.pdf) model that won the ImageNet Large Scale Visual Recognition Challenge 2012.
diff --git a/docs/getting_pretrained_models.md b/docs/getting_pretrained_models.md
index 56a64457..bd342f70 100644
--- a/docs/getting_pretrained_models.md
+++ b/docs/getting_pretrained_models.md
@@ -12,6 +12,16 @@ This page will be updated as more models become available.

 ### ImageNet

-Our reference implementation of the AlexNet model trained on ILSVRC-2012 can be downloaded (232.57MB) by running `examples/imagenet/get_caffe_reference_imagenet_model.sh` from the Caffe root directory.
+**Caffe Reference ImageNet Model**: Our reference implementation of an ImageNet model trained on ILSVRC-2012 can be downloaded (232.6MB) by running `examples/imagenet/get_caffe_reference_imagenet_model.sh` from the Caffe root directory.
+
+- The bundled model is the iteration 310,000 snapshot.
+- The best validation performance during training was iteration 313,000 with
+  validation accuracy 57.412% and loss 1.82328.
+
+**AlexNet**: Our training of the Krizhevsky architecture, which differs from the paper's methodology by (1) not training with the relighting data-augmentation and (2) initializing non-zero biases to 0.1 instead of 1. (2) was found necessary for training, as initialization to 1 gave flat loss. Download the model (243.9MB) by running `examples/imagenet/get_caffe_alexnet_model.sh` from the Caffe root directory.
+
+- The bundled model is the iteration 360,000 snapshot.
+- The best validation performance during training was iteration 358,000 with
+  validation accuracy 57.258% and loss 1.83948.

 Additionally, you will probably eventually need some auxiliary data (mean image, synset list, etc.): run `data/ilsvrc12/get_ilsvrc_aux.sh` from the root directory to obtain it.
diff --git a/docs/index.md b/docs/index.md
index 1950a244..bc1969f6 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -23,7 +23,7 @@ Even in CPU mode, computing predictions on an image takes only 20 ms when images

 ## Documentation

-* [Introductory slides](https://www.dropbox.com/s/10fx16yp5etb8dv/caffe-presentation.pdf): slides about the Caffe architecture, *updated 03/14*.
+* [Introductory slides](http://caffe.berkeleyvision.org/caffe-presentation.pdf): slides about the Caffe architecture, *updated 03/14*.
 * [Installation](/installation.html): Instructions on installing Caffe (works on Ubuntu, Red Hat, OS X).
 * [Pre-trained models](/getting_pretrained_models.html): BVLC provides some pre-trained models for academic / non-commercial use.
 * [Development](/development.html): Guidelines for development and contributing to Caffe.
diff --git a/examples/imagenet/alexnet_deploy.prototxt b/examples/imagenet/alexnet_deploy.prototxt
new file mode 100644
index 00000000..4059fd5d
--- /dev/null
+++ b/examples/imagenet/alexnet_deploy.prototxt
@@ -0,0 +1,260 @@
+name: "AlexNet"
+input: "data"
+input_dim: 10
+input_dim: 3
+input_dim: 227
+input_dim: 227
+layers {
+  layer {
+    name: "conv1"
+    type: "conv"
+    num_output: 96
+    kernelsize: 11
+    stride: 4
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "data"
+  top: "conv1"
+}
+layers {
+  layer {
+    name: "relu1"
+    type: "relu"
+  }
+  bottom: "conv1"
+  top: "conv1"
+}
+layers {
+  layer {
+    name: "norm1"
+    type: "lrn"
+    local_size: 5
+    alpha: 0.0001
+    beta: 0.75
+  }
+  bottom: "conv1"
+  top: "norm1"
+}
+layers {
+  layer {
+    name: "pool1"
+    type: "pool"
+    pool: MAX
+    kernelsize: 3
+    stride: 2
+  }
+  bottom: "norm1"
+  top: "pool1"
+}
+layers {
+  layer {
+    name: "conv2"
+    type: "conv"
+    num_output: 256
+    group: 2
+    kernelsize: 5
+    pad: 2
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "pool1"
+  top: "conv2"
+}
+layers {
+  layer {
+    name: "relu2"
+    type: "relu"
+  }
+  bottom: "conv2"
+  top: "conv2"
+}
+layers {
+  layer {
+    name: "norm2"
+    type: "lrn"
+    local_size: 5
+    alpha: 0.0001
+    beta: 0.75
+  }
+  bottom: "conv2"
+  top: "norm2"
+}
+layers {
+  layer {
+    name: "pool2"
+    type: "pool"
+    pool: MAX
+    kernelsize: 3
+    stride: 2
+  }
+  bottom: "norm2"
+  top: "pool2"
+}
+layers {
+  layer {
+    name: "conv3"
+    type: "conv"
+    num_output: 384
+    kernelsize: 3
+    pad: 1
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "pool2"
+  top: "conv3"
+}
+layers {
+  layer {
+    name: "relu3"
+    type: "relu"
+  }
+  bottom: "conv3"
+  top: "conv3"
+}
+layers {
+  layer {
+    name: "conv4"
+    type: "conv"
+    num_output: 384
+    group: 2
+    kernelsize: 3
+    pad: 1
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "conv3"
+  top: "conv4"
+}
+layers {
+  layer {
+    name: "relu4"
+    type: "relu"
+  }
+  bottom: "conv4"
+  top: "conv4"
+}
+layers {
+  layer {
+    name: "conv5"
+    type: "conv"
+    num_output: 256
+    group: 2
+    kernelsize: 3
+    pad: 1
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "conv4"
+  top: "conv5"
+}
+layers {
+  layer {
+    name: "relu5"
+    type: "relu"
+  }
+  bottom: "conv5"
+  top: "conv5"
+}
+layers {
+  layer {
+    name: "pool5"
+    type: "pool"
+    kernelsize: 3
+    pool: MAX
+    stride: 2
+  }
+  bottom: "conv5"
+  top: "pool5"
+}
+layers {
+  layer {
+    name: "fc6"
+    type: "innerproduct"
+    num_output: 4096
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "pool5"
+  top: "fc6"
+}
+layers {
+  layer {
+    name: "relu6"
+    type: "relu"
+  }
+  bottom: "fc6"
+  top: "fc6"
+}
+layers {
+  layer {
+    name: "drop6"
+    type: "dropout"
+    dropout_ratio: 0.5
+  }
+  bottom: "fc6"
+  top: "fc6"
+}
+layers {
+  layer {
+    name: "fc7"
+    type: "innerproduct"
+    num_output: 4096
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "fc6"
+  top: "fc7"
+}
+layers {
+  layer {
+    name: "relu7"
+    type: "relu"
+  }
+  bottom: "fc7"
+  top: "fc7"
+}
+layers {
+  layer {
+    name: "drop7"
+    type: "dropout"
+    dropout_ratio: 0.5
+  }
+  bottom: "fc7"
+  top: "fc7"
+}
+layers {
+  layer {
+    name: "fc8"
+    type: "innerproduct"
+    num_output: 1000
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "fc7"
+  top: "fc8"
+}
+layers {
+  layer {
+    name: "prob"
+    type: "softmax"
+  }
+  bottom: "fc8"
+  top: "prob"
+}
diff --git a/examples/imagenet/alexnet_solver.prototxt b/examples/imagenet/alexnet_solver.prototxt
new file mode 100644
index 00000000..75d0d5df
--- /dev/null
+++ b/examples/imagenet/alexnet_solver.prototxt
@@ -0,0 +1,14 @@
+train_net: "alexnet_train.prototxt"
+test_net: "alexnet_val.prototxt"
+test_iter: 1000
+test_interval: 1000
+base_lr: 0.01
+lr_policy: "step"
+gamma: 0.1
+stepsize: 100000
+display: 20
+max_iter: 450000
+momentum: 0.9
+weight_decay: 0.0005
+snapshot: 10000
+snapshot_prefix: "caffe_alexnet_train"
diff --git a/examples/imagenet/alexnet_train.prototxt b/examples/imagenet/alexnet_train.prototxt
new file mode 100644
index 00000000..c5394dc5
--- /dev/null
+++ b/examples/imagenet/alexnet_train.prototxt
@@ -0,0 +1,332 @@
+name: "AlexNet"
+layers {
+  layer {
+    name: "data"
+    type: "data"
+    source: "ilsvrc12_train_leveldb"
+    meanfile: "../../data/ilsvrc12/imagenet_mean.binaryproto"
+    batchsize: 256
+    cropsize: 227
+    mirror: true
+  }
+  top: "data"
+  top: "label"
+}
+layers {
+  layer {
+    name: "conv1"
+    type: "conv"
+    num_output: 96
+    kernelsize: 11
+    stride: 4
+    weight_filler {
+      type: "gaussian"
+      std: 0.01
+    }
+    bias_filler {
+      type: "constant"
+      value: 0.
+    }
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "data"
+  top: "conv1"
+}
+layers {
+  layer {
+    name: "relu1"
+    type: "relu"
+  }
+  bottom: "conv1"
+  top: "conv1"
+}
+layers {
+  layer {
+    name: "norm1"
+    type: "lrn"
+    local_size: 5
+    alpha: 0.0001
+    beta: 0.75
+  }
+  bottom: "conv1"
+  top: "norm1"
+}
+layers {
+  layer {
+    name: "pool1"
+    type: "pool"
+    pool: MAX
+    kernelsize: 3
+    stride: 2
+  }
+  bottom: "norm1"
+  top: "pool1"
+}
+layers {
+  layer {
+    name: "conv2"
+    type: "conv"
+    num_output: 256
+    group: 2
+    kernelsize: 5
+    pad: 2
+    weight_filler {
+      type: "gaussian"
+      std: 0.01
+    }
+    bias_filler {
+      type: "constant"
+      value: 0.1
+    }
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "pool1"
+  top: "conv2"
+}
+layers {
+  layer {
+    name: "relu2"
+    type: "relu"
+  }
+  bottom: "conv2"
+  top: "conv2"
+}
+layers {
+  layer {
+    name: "norm2"
+    type: "lrn"
+    local_size: 5
+    alpha: 0.0001
+    beta: 0.75
+  }
+  bottom: "conv2"
+  top: "norm2"
+}
+layers {
+  layer {
+    name: "pool2"
+    type: "pool"
+    pool: MAX
+    kernelsize: 3
+    stride: 2
+  }
+  bottom: "norm2"
+  top: "pool2"
+}
+layers {
+  layer {
+    name: "conv3"
+    type: "conv"
+    num_output: 384
+    kernelsize: 3
+    pad: 1
+    weight_filler {
+      type: "gaussian"
+      std: 0.01
+    }
+    bias_filler {
+      type: "constant"
+      value: 0.
+    }
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "pool2"
+  top: "conv3"
+}
+layers {
+  layer {
+    name: "relu3"
+    type: "relu"
+  }
+  bottom: "conv3"
+  top: "conv3"
+}
+layers {
+  layer {
+    name: "conv4"
+    type: "conv"
+    num_output: 384
+    group: 2
+    kernelsize: 3
+    pad: 1
+    weight_filler {
+      type: "gaussian"
+      std: 0.01
+    }
+    bias_filler {
+      type: "constant"
+      value: 0.1
+    }
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "conv3"
+  top: "conv4"
+}
+layers {
+  layer {
+    name: "relu4"
+    type: "relu"
+  }
+  bottom: "conv4"
+  top: "conv4"
+}
+layers {
+  layer {
+    name: "conv5"
+    type: "conv"
+    num_output: 256
+    group: 2
+    kernelsize: 3
+    pad: 1
+    weight_filler {
+      type: "gaussian"
+      std: 0.01
+    }
+    bias_filler {
+      type: "constant"
+      value: 0.1
+    }
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "conv4"
+  top: "conv5"
+}
+layers {
+  layer {
+    name: "relu5"
+    type: "relu"
+  }
+  bottom: "conv5"
+  top: "conv5"
+}
+layers {
+  layer {
+    name: "pool5"
+    type: "pool"
+    kernelsize: 3
+    pool: MAX
+    stride: 2
+  }
+  bottom: "conv5"
+  top: "pool5"
+}
+layers {
+  layer {
+    name: "fc6"
+    type: "innerproduct"
+    num_output: 4096
+    weight_filler {
+      type: "gaussian"
+      std: 0.005
+    }
+    bias_filler {
+      type: "constant"
+      value: 0.1
+    }
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "pool5"
+  top: "fc6"
+}
+layers {
+  layer {
+    name: "relu6"
+    type: "relu"
+  }
+  bottom: "fc6"
+  top: "fc6"
+}
+layers {
+  layer {
+    name: "drop6"
+    type: "dropout"
+    dropout_ratio: 0.5
+  }
+  bottom: "fc6"
+  top: "fc6"
+}
+layers {
+  layer {
+    name: "fc7"
+    type: "innerproduct"
+    num_output: 4096
+    weight_filler {
+      type: "gaussian"
+      std: 0.005
+    }
+    bias_filler {
+      type: "constant"
+      value: 0.1
+    }
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "fc6"
+  top: "fc7"
+}
+layers {
+  layer {
+    name: "relu7"
+    type: "relu"
+  }
+  bottom: "fc7"
+  top: "fc7"
+}
+layers {
+  layer {
+    name: "drop7"
+    type: "dropout"
+    dropout_ratio: 0.5
+  }
+  bottom: "fc7"
+  top: "fc7"
+}
+layers {
+  layer {
+    name: "fc8"
+    type: "innerproduct"
+    num_output: 1000
+    weight_filler {
+      type: "gaussian"
+      std: 0.01
+    }
+    bias_filler {
+      type: "constant"
+      value: 0.
+    }
+    blobs_lr: 1.
+    blobs_lr: 2.
+    weight_decay: 1.
+    weight_decay: 0.
+  }
+  bottom: "fc7"
+  top: "fc8"
+}
+layers {
+  layer {
+    name: "loss"
+    type: "softmax_loss"
+  }
+  bottom: "fc8"
+  bottom: "label"
+}
diff --git a/examples/imagenet/alexnet_val.prototxt b/examples/imagenet/alexnet_val.prototxt
new file mode 100644
index 00000000..aff33d01
--- /dev/null
+++ b/examples/imagenet/alexnet_val.prototxt
@@ -0,0 +1,245 @@
+name: "AlexNet"
+layers {
+  layer {
+    name: "data"
+    type: "data"
+    source: "ilsvrc12_val_leveldb"
+    meanfile: "../../data/ilsvrc12/imagenet_mean.binaryproto"
+    batchsize: 50
+    cropsize: 227
+    mirror: false
+  }
+  top: "data"
+  top: "label"
+}
+layers {
+  layer {
+    name: "conv1"
+    type: "conv"
+    num_output: 96
+    kernelsize: 11
+    stride: 4
+  }
+  bottom: "data"
+  top: "conv1"
+}
+layers {
+  layer {
+    name: "relu1"
+    type: "relu"
+  }
+  bottom: "conv1"
+  top: "conv1"
+}
+layers {
+  layer {
+    name: "norm1"
+    type: "lrn"
+    local_size: 5
+    alpha: 0.0001
+    beta: 0.75
+  }
+  bottom: "conv1"
+  top: "norm1"
+}
+layers {
+  layer {
+    name: "pool1"
+    type: "pool"
+    pool: MAX
+    kernelsize: 3
+    stride: 2
+  }
+  bottom: "norm1"
+  top: "pool1"
+}
+layers {
+  layer {
+    name: "conv2"
+    type: "conv"
+    num_output: 256
+    group: 2
+    kernelsize: 5
+    pad: 2
+  }
+  bottom: "pool1"
+  top: "conv2"
+}
+layers {
+  layer {
+    name: "relu2"
+    type: "relu"
+  }
+  bottom: "conv2"
+  top: "conv2"
+}
+layers {
+  layer {
+    name: "norm2"
+    type: "lrn"
+    local_size: 5
+    alpha: 0.0001
+    beta: 0.75
+  }
+  bottom: "conv2"
+  top: "norm2"
+}
+layers {
+  layer {
+    name: "pool2"
+    type: "pool"
+    pool: MAX
+    kernelsize: 3
+    stride: 2
+  }
+  bottom: "norm2"
+  top: "pool2"
+}
+layers {
+  layer {
+    name: "conv3"
+    type: "conv"
+    num_output: 384
+    kernelsize: 3
+    pad: 1
+  }
+  bottom: "pool2"
+  top: "conv3"
+}
+layers {
+  layer {
+    name: "relu3"
+    type: "relu"
+  }
+  bottom: "conv3"
+  top: "conv3"
+}
+layers {
+  layer {
+    name: "conv4"
+    type: "conv"
+    num_output: 384
+    group: 2
+    kernelsize: 3
+    pad: 1
+  }
+  bottom: "conv3"
+  top: "conv4"
+}
+layers {
+  layer {
+    name: "relu4"
+    type: "relu"
+  }
+  bottom: "conv4"
+  top: "conv4"
+}
+layers {
+  layer {
+    name: "conv5"
+    type: "conv"
+    num_output: 256
+    group: 2
+    kernelsize: 3
+    pad: 1
+  }
+  bottom: "conv4"
+  top: "conv5"
+}
+layers {
+  layer {
+    name: "relu5"
+    type: "relu"
+  }
+  bottom: "conv5"
+  top: "conv5"
+}
+layers {
+  layer {
+    name: "pool5"
+    type: "pool"
+    kernelsize: 3
+    pool: MAX
+    stride: 2
+  }
+  bottom: "conv5"
+  top: "pool5"
+}
+layers {
+  layer {
+    name: "fc6"
+    type: "innerproduct"
+    num_output: 4096
+  }
+  bottom: "pool5"
+  top: "fc6"
+}
+layers {
+  layer {
+    name: "relu6"
+    type: "relu"
+  }
+  bottom: "fc6"
+  top: "fc6"
+}
+layers {
+  layer {
+    name: "drop6"
+    type: "dropout"
+    dropout_ratio: 0.5
+  }
+  bottom: "fc6"
+  top: "fc6"
+}
+layers {
+  layer {
+    name: "fc7"
+    type: "innerproduct"
+    num_output: 4096
+  }
+  bottom: "fc6"
+  top: "fc7"
+}
+layers {
+  layer {
+    name: "relu7"
+    type: "relu"
+  }
+  bottom: "fc7"
+  top: "fc7"
+}
+layers {
+  layer {
+    name: "drop7"
+    type: "dropout"
+    dropout_ratio: 0.5
+  }
+  bottom: "fc7"
+  top: "fc7"
+}
+layers {
+  layer {
+    name: "fc8"
+    type: "innerproduct"
+    num_output: 1000
+  }
+  bottom: "fc7"
+  top: "fc8"
+}
+layers {
+  layer {
+    name: "prob"
+    type: "softmax"
+  }
+  bottom: "fc8"
+  top: "prob"
+}
+layers {
+  layer {
+    name: "accuracy"
+    type: "accuracy"
+  }
+  bottom: "prob"
+  bottom: "label"
+  top: "accuracy"
+}
diff --git a/examples/imagenet/get_caffe_alexnet_model.sh b/examples/imagenet/get_caffe_alexnet_model.sh
new file mode 100755
index 00000000..167c9371
--- /dev/null
+++ b/examples/imagenet/get_caffe_alexnet_model.sh
@@ -0,0 +1,28 @@
+#!/usr/bin/env sh
+# This scripts downloads the caffe reference imagenet model
+# for ilsvrc image classification and deep feature extraction
+
+MODEL=caffe_alexnet_model
+CHECKSUM=91df0e19290ef78324de9eecb258a77f
+
+if [ -f $MODEL ]; then
+  echo "Model already exists. Checking md5..."
+  os=`uname -s`
+  if [ "$os" = "Linux" ]; then
+    checksum=`md5sum $MODEL | awk '{ print $1 }'`
+  elif [ "$os" = "Darwin" ]; then
+    checksum=`cat $MODEL | md5`
+  fi
+  if [ "$checksum" = "$CHECKSUM" ]; then
+    echo "Model checksum is correct. No need to download."
+    exit 0
+  else
+    echo "Model checksum is incorrect. Need to download again."
+  fi
+fi
+
+echo "Downloading..."
+
+wget --no-check-certificate https://www.dropbox.com/s/rk6nkt0kf109slo/caffe_alexnet_model
+
+echo "Done. Please run this command again to verify that checksum = $CHECKSUM."
diff --git a/examples/imagenet/train_alexnet.sh b/examples/imagenet/train_alexnet.sh
new file mode 100755
index 00000000..c91e9108
--- /dev/null
+++ b/examples/imagenet/train_alexnet.sh
@@ -0,0 +1,7 @@
+#!/usr/bin/env sh
+
+TOOLS=../../build/tools
+
+GLOG_logtostderr=1 $TOOLS/train_net.bin alexnet_solver.prototxt
+
+echo "Done."
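
Reading aid, not part of the commit: `alexnet_solver.prototxt` sets `lr_policy: "step"` with `base_lr: 0.01`, `gamma: 0.1` and `stepsize: 100000`. Assuming Caffe's documented step policy, `base_lr * gamma ^ floor(iter / stepsize)`, the schedule this solver implies can be sketched in Python as below. The helper name `step_lr` is hypothetical and exists only for illustration; the constants are copied from the solver file in this diff.

```python
# Values copied from alexnet_solver.prototxt in this commit.
BASE_LR = 0.01
GAMMA = 0.1
STEPSIZE = 100000
MAX_ITER = 450000


def step_lr(iteration, base_lr=BASE_LR, gamma=GAMMA, stepsize=STEPSIZE):
    """Learning rate under a "step" policy.

    Assumes the rule base_lr * gamma ** floor(iteration / stepsize).
    step_lr is a hypothetical helper, not a Caffe function.
    """
    return base_lr * gamma ** (iteration // stepsize)


if __name__ == "__main__":
    # Rate at each step boundary below max_iter, plus the iteration of the
    # bundled AlexNet snapshot mentioned in docs/getting_pretrained_models.md.
    for it in list(range(0, MAX_ITER, STEPSIZE)) + [360000]:
        print(f"iter {it:>6}: lr = {step_lr(it):.0e}")
```

Under these assumptions, the 360,000-iteration snapshot bundled by `get_caffe_alexnet_model.sh` was taken after three decays, at a learning rate of about 1e-5.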