Add argmax_param "axis" to maximise output along the specified axis
Ensure consistency between memory alloc and free
Add a bool flag to record whether host memory was allocated using malloc or
cudaMallocHost, and free it correspondingly using this flag, instead of depending on Caffe::mode(), which is mutable at runtime.
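A minimal CPU-only sketch of this pattern (names here are hypothetical, and the CUDA calls are shown as comments; the actual change lives in Caffe's synced memory code):

```cpp
#include <cstddef>
#include <cstdlib>

// Remember how the buffer was allocated so it can be freed the same way,
// independent of any global mode that may change later at runtime.
struct HostBuffer {
  void* ptr = nullptr;
  bool cuda_malloc_used = false;  // set once at allocation time

  void allocate(std::size_t size, bool use_cuda) {
    cuda_malloc_used = use_cuda;
    if (use_cuda) {
      // cudaMallocHost(&ptr, size);  // pinned memory (requires CUDA)
      ptr = std::malloc(size);        // stand-in for this CPU-only sketch
    } else {
      ptr = std::malloc(size);
    }
  }

  void release() {
    if (!ptr) return;
    if (cuda_malloc_used) {
      // cudaFreeHost(ptr);  // must pair with cudaMallocHost
      std::free(ptr);
    } else {
      std::free(ptr);        // must pair with malloc
    }
    ptr = nullptr;
  }
};
```

The key point is that the flag is written exactly once, at allocation, so a later switch of Caffe::mode() cannot cause a mismatched free.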
Disallow PythonLayer in Multi-GPU training
LayerSetUp
This also eliminates the extra copying of bottom's shape.
Refine the format of the switch-case statement in the solver
OpenCV, LMDB, LevelDB and Snappy are made optional via switches
(USE_OPENCV, USE_LMDB, USE_LEVELDB) available for Make and CMake
builds. Since Snappy is a LevelDB dependency, its use is determined by
USE_LEVELDB. HDF5 is left bundled because it is used for serializing
weights and solverstates.
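Such a switch typically guards the dependent code at compile time; a small illustrative sketch (the real guards span many files, and this block is compiled here without -DUSE_OPENCV, so the dependency reports as disabled):

```cpp
// When USE_OPENCV is not defined, nothing from OpenCV is included or linked.
#ifdef USE_OPENCV
#include <opencv2/core/core.hpp>
#endif

// Lets the rest of the program ask, at runtime, whether the optional
// dependency was compiled in.
bool opencv_enabled() {
#ifdef USE_OPENCV
  return true;
#else
  return false;
#endif
}
```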
Fix some doxygen warnings about an undocumented argument in Blob and
incorrect documentation for SoftmaxWithLossLayer::Forward_cpu().
Fix SPPLayer top blob num and address `pyramid_height_ == 1`
SPPLayer
Also, do nothing in SPPLayer's Reshape if it has already been reshaped once and the bottom size is unchanged.
Embed layer for lookup table of one hot encodings
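Conceptually, an embed layer replaces a one-hot-vector-times-weight-matrix product with a direct row lookup. A CPU sketch of the forward pass (names hypothetical):

```cpp
#include <vector>

// For each input index, copy the corresponding row of the N x D weight
// table to the output. This equals multiplying a one-hot vector by the
// weight matrix, without ever materializing the one-hot encoding.
std::vector<float> embed_forward(const std::vector<int>& indices,
                                 const std::vector<float>& weight,
                                 int dim) {
  std::vector<float> out;
  out.reserve(indices.size() * dim);
  for (int idx : indices)
    for (int d = 0; d < dim; ++d)
      out.push_back(weight[idx * dim + d]);
  return out;
}
```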
Output accuracies per class.
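A sketch of the per-class accuracy computation (names hypothetical), which also guards the zero-sample case:

```cpp
#include <cstddef>
#include <vector>

// Per-class accuracy: correct predictions divided by the number of samples
// of that class. Classes with zero samples are reported as 0 here to avoid
// dividing by zero.
std::vector<float> per_class_accuracy(const std::vector<int>& labels,
                                      const std::vector<int>& preds,
                                      int num_classes) {
  std::vector<float> correct(num_classes, 0.f), total(num_classes, 0.f);
  for (std::size_t i = 0; i < labels.size(); ++i) {
    total[labels[i]] += 1.f;
    if (preds[i] == labels[i]) correct[labels[i]] += 1.f;
  }
  std::vector<float> acc(num_classes, 0.f);
  for (int c = 0; c < num_classes; ++c)
    if (total[c] > 0.f) acc[c] = correct[c] / total[c];
  return acc;
}
```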
Snapshot on signal
- Fixed case where the number of samples in a class can be zero.
- Fixed ignore_label case, also added a test.
- Two other fixes.
- Fixed lint errors.
- Small fix.
Add signal handler and early exit/snapshot to Solver.
Also check for exit and snapshot when testing.
Skip running test after early exit.
Fix more lint.
Rebase on master.
Finish rebase on master.
Fixups per review comments.
Redress review comments.
Lint.
Correct error message wording.
Useful for validating NetParameters without crashing on SIGABRT
This commit implements the Adam solver by Kingma et al. for CPU and
GPU. All solver parameters are defined in caffe.proto. This also
adds an example for the MNIST dataset.
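One Adam step from the Kingma & Ba paper, as a CPU sketch (hyperparameter names follow the paper; the defaults here are illustrative):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Adam: exponential moving averages of the gradient (m) and of the squared
// gradient (v), each bias-corrected, drive a per-parameter adaptive step.
// t is the 1-based iteration count.
void adam_step(std::vector<float>& w, const std::vector<float>& g,
               std::vector<float>& m, std::vector<float>& v, int t,
               float lr = 0.001f, float b1 = 0.9f, float b2 = 0.999f,
               float eps = 1e-8f) {
  for (std::size_t i = 0; i < w.size(); ++i) {
    m[i] = b1 * m[i] + (1.f - b1) * g[i];
    v[i] = b2 * v[i] + (1.f - b2) * g[i] * g[i];
    float mhat = m[i] / (1.f - std::pow(b1, t));  // bias correction
    float vhat = v[i] / (1.f - std::pow(b2, t));
    w[i] -= lr * mhat / (std::sqrt(vhat) + eps);
  }
}
```

At t = 1 the bias corrections make the first step approximately lr in magnitude, regardless of the gradient's scale.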
Multi-GPU Data Parallelism
Allow data layers (and also PythonLayer when used as a data layer) to be shared
among worker solvers' training nets, and also test nets for future-proofing, in
case one wants to do multi-GPU testing. Data layers are locked during forward to
ensure sequential forwards.
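A hypothetical sketch of the locking: forward on the shared layer is serialized with a mutex, so concurrent worker solvers consume batches one at a time, in order.

```cpp
#include <mutex>

// Shared data layer stand-in: Forward() hands out batch indices; the mutex
// guarantees that two solvers never read the same batch concurrently.
class SharedDataLayer {
 public:
  int Forward() {  // returns the index of the batch served
    std::lock_guard<std::mutex> lock(mutex_);
    return next_batch_++;
  }

 private:
  std::mutex mutex_;
  int next_batch_ = 0;
};
```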
additional test cases
- Parallelize batches among GPUs and tree-reduce the gradients
- The effective batch size scales with the number of devices (the configured
batch size is multiplied by the device count)
- Detect machine topology (twin-GPU boards, P2P connectivity)
- Track device in syncedmem (thanks @thatguymike)
- Insert a callback in the solver for minimal code change
- Accept list for gpu flag of caffe tool, e.g. '-gpu 0,1' or '-gpu all'.
Run on default GPU if no ID given.
- Add multi-GPU solver test
- Deterministic architecture for reproducible runs
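The tree reduction over per-device gradients can be pictured with a CPU sketch over plain buffers (the actual implementation moves data between devices, e.g. via P2P copies; this only shows the reduction shape):

```cpp
#include <cstddef>
#include <vector>

// Tree-reduce: at each level, buffer i accumulates buffer i + stride,
// halving the number of active buffers per level until grads[0] holds the
// sum of all per-device gradients in O(log n) levels.
void tree_reduce(std::vector<std::vector<float>>& grads) {
  const std::size_t n = grads.size();
  for (std::size_t stride = 1; stride < n; stride *= 2)
    for (std::size_t i = 0; i + stride < n; i += 2 * stride)
      for (std::size_t k = 0; k < grads[i].size(); ++k)
        grads[i][k] += grads[i + stride][k];
}
```

Because each level only pairs neighbors at a fixed stride, the reduction order is the same on every run, which supports the deterministic, reproducible behavior noted above.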
thanks to discussion by @thatguymike and @flx42
- Make sure each solver accesses a different subset of the data
- Sequential reading of DB for performance
- Prefetch a configurable amount of data to host memory
- Distribute data to solvers in round-robin way for determinism
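The round-robin distribution reduces to a simple deterministic mapping; a sketch (function name hypothetical):

```cpp
// Batch b from the sequentially-read DB always goes to the same solver,
// so each solver sees a disjoint, reproducible subset of the data.
int solver_for_batch(int batch_index, int num_solvers) {
  return batch_index % num_solvers;
}
```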
- Interrupt the thread before waiting on join
- Provide a method for looping threads to exit on demand
- CHECK if start and stop succeed instead of returning an error
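A sketch of the interrupt-before-join pattern using std::atomic and std::thread (illustrative stand-in; the actual internal-thread code is Boost-based):

```cpp
#include <atomic>
#include <thread>

// Looping worker: the loop polls an atomic flag, giving the owner a way to
// request exit on demand before blocking in join().
class Worker {
 public:
  void Start() {
    stop_ = false;
    thread_ = std::thread([this] { Loop(); });
  }
  void Stop() {
    stop_ = true;                          // interrupt before waiting on join
    if (thread_.joinable()) thread_.join();
  }
  long iterations() const { return iters_; }

 private:
  void Loop() {
    while (!stop_) ++iters_;               // exits promptly once flagged
  }
  std::atomic<bool> stop_{true};
  std::atomic<long> iters_{0};
  std::thread thread_;
};
```

Without the flag, Stop() would deadlock on join() while the loop spins forever.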
Implement the RMSProp solver; clean it up to fit the new solver interface that uses
accumulated gradients and refactored regularization.
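One RMSProp step as a CPU sketch (parameter names illustrative): keep a decaying average of squared gradients and scale the step by its inverse square root.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// RMSProp: msq is a running, exponentially-decayed mean of g^2; dividing
// by its square root normalizes the step size per parameter. delta guards
// against division by zero.
void rmsprop_step(std::vector<float>& w, const std::vector<float>& g,
                  std::vector<float>& msq, float lr = 0.01f,
                  float decay = 0.9f, float delta = 1e-8f) {
  for (std::size_t i = 0; i < w.size(); ++i) {
    msq[i] = decay * msq[i] + (1.f - decay) * g[i] * g[i];
    w[i] -= lr * g[i] / (std::sqrt(msq[i]) + delta);
  }
}
```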
with unit tests
(double impl from NVIDIA dev docs; float impl included in CUDA as
"atomicAdd")
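The double-precision atomicAdd from the NVIDIA dev docs is a compare-and-swap loop over the value's 64-bit representation. That structure can be mirrored on the CPU with std::atomic (an illustrative analogue, not the CUDA kernel itself):

```cpp
#include <atomic>
#include <cstring>

// CAS-loop add on a double stored as 64 raw bits: read the current bits,
// compute old + val, and retry compare_exchange until no concurrent writer
// interfered. Like atomicAdd, returns the previous value.
double atomic_add_double(std::atomic<unsigned long long>& target, double val) {
  unsigned long long expected = target.load();
  unsigned long long desired;
  double old_d;
  do {
    std::memcpy(&old_d, &expected, sizeof(double));   // bits -> double
    const double new_d = old_d + val;
    std::memcpy(&desired, &new_d, sizeof(double));    // double -> bits
  } while (!target.compare_exchange_weak(expected, desired));
  return old_d;
}
```

On failure, compare_exchange_weak refreshes `expected` with the value another thread wrote, so the loop recomputes the sum from the latest state.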
- Params now share diffs as well as data (this works because layers
accumulate gradients into param diffs rather than overwriting them)
- Any lr_mult's and decay_mult's specified for shared params are now
required to match
- TestGradientBasedSolver checks that behavior remains correct with
shared weights
Summary of changes:
- HDF5 helper functions were moved into a separate file util/hdf5.cpp
- hdf5_save_nd_dataset now saves n-d blobs, can save diffs instead of
data
- Minor fix for memory leak in HDF5 functions (delete instead of
delete[])
- Extra methods have been added to both Net/Solver enabling
snapshotting and restoring from HDF5 files
- snapshot_format was added to SolverParameters, with possible values
HDF5 or BINARYPROTO (default HDF5)
- kMaxBlobAxes was reduced to 32 to match the limitations of HDF5
[docs] Fix layer typo