author | Matan Appelbaum <mappelbaum@fb.com> | 2018-02-13 10:58:37 -0800 |
---|---|---|
committer | Facebook Github Bot <facebook-github-bot@users.noreply.github.com> | 2018-02-13 11:14:41 -0800 |
commit | d99d28b3e6d56378f39870f904c190ef02549a8d (patch) | |
tree | a1177ac891385b709dbcb24c600c1b4be6a70c5b /caffe2/python/layers | |
parent | 5fe2f3f9e5c4b3c3a51be977f482cfa8ad20e2b1 (diff) | |
Allow custom component tagging in DeviceOptions.node_name
Summary:
Modify detect_components to take a list of valid node_name prefixes instead of values. Users can set node_name to e.g. `'sparse_component:0'`, `'sparse_component:1'`, etc. and pass `'sparse_component:'` as a valid prefix.
Also add `Tags.SPARSE_COMPONENT` in addition to `Tags.SPARSE_SHARDED` and `Tags.SPARSE_DONT_SHARD` and update all calls to `detect_device_components`.
Reviewed By: azzolini
Differential Revision: D6952599
fbshipit-source-id: e1b1e6b146a6bd053b295690016044fd5990c893
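The prefix-based matching described in the summary can be sketched roughly as follows. This is only an illustration: `match_component` and the sample `node_names` are hypothetical and are not the actual `detect_device_components` implementation, which is not shown in this diff.

```python
def match_component(node_name, valid_prefixes):
    """Return True if node_name starts with any of the valid prefixes.

    Hypothetical helper for illustration; not part of caffe2.
    """
    return any(node_name.startswith(prefix) for prefix in valid_prefixes)


# Assumed example values for DeviceOption.node_name, following the summary.
node_names = ['sparse_component:0', 'sparse_component:1', 'dense', '']
valid_prefixes = ['sparse_component:']

# Each distinct matching node_name is treated as its own component.
components = sorted(
    {name for name in node_names if match_component(name, valid_prefixes)}
)
print(components)
```

Matching on a prefix rather than an exact value means callers do not need to enumerate every component index up front.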
Diffstat (limited to 'caffe2/python/layers')
-rw-r--r-- | caffe2/python/layers/tags.py | 26 |
1 files changed, 20 insertions, 6 deletions
diff --git a/caffe2/python/layers/tags.py b/caffe2/python/layers/tags.py
index 0195563c0d..9f7a20315d 100644
--- a/caffe2/python/layers/tags.py
+++ b/caffe2/python/layers/tags.py
@@ -56,14 +56,28 @@ class Tags(object):
     PREFER_GPU = 'prefer_gpu'
     CPU_ONLY = 'cpu_only'
 
-    # The following two tags are hints to **distributed training framework**.
-    # The first tag is to determine a layer contains a sparse parameter and the
-    # parameter should be sharded and operators on those parameters should be
-    # done on distributed parameter servers. The second tag is to determine a
-    # layer contains a sparse parameters among others, and that the parameters
-    # should not be sharded (i.e. should be placed together on a node)
+    # The following three tags are hints to **distributed training framework**.
+    """
+    Indicates a layer contains a sparse shardable parameter. The parameter
+    should be sharded nd operators on those parameters should be done on
+    distributed parameter servers.
+    """
     SPARSE_SHARDED = 'sparse_sharded'
+    """
+    Indicates a layer contains a sparse parameters among others, and that the
+    parameters should not be sharded (i.e. should be placed together on a node).
+    """
     SPARSE_DONT_SHARD = 'sparse_dont_shard'
+    """
+    Used to manually indicate a component for an operator. Parameters for
+    all operators with the same component should be colocated on the same
+    parameter server.
+    """
+    COMPONENT = 'component:'
+    """
+    Valid tag prefixes for distributed training framework.
+    """
+    DT_TAGS = (SPARSE_SHARDED, SPARSE_DONT_SHARD, COMPONENT)
 
     # In certain cases we want to have different schema for training and
     # prediction, as an example in prediction we might need to have only
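As a rough sketch of how a tuple of tag prefixes such as `DT_TAGS` might be consumed, the snippet below copies the three constants from the diff and filters a list of tags with `str.startswith`, which accepts a tuple of prefixes. The `distributed_tags` helper and the sample tag `'component:user_embeddings'` are assumptions made for illustration and do not appear in the commit.

```python
# Values copied from the diff above.
SPARSE_SHARDED = 'sparse_sharded'
SPARSE_DONT_SHARD = 'sparse_dont_shard'
COMPONENT = 'component:'
DT_TAGS = (SPARSE_SHARDED, SPARSE_DONT_SHARD, COMPONENT)


def distributed_tags(tags):
    """Keep only tags that start with one of the DT_TAGS prefixes.

    Hypothetical helper; the real consumer of DT_TAGS is not shown in the diff.
    """
    return [tag for tag in tags if tag.startswith(DT_TAGS)]


layer_tags = ['prefer_gpu', 'component:user_embeddings', 'sparse_sharded']
selected = distributed_tags(layer_tags)
print(selected)
```

Because the check is a prefix match, any tag that merely begins with `'sparse_sharded'` would also be selected; a stricter consumer might compare the two sparse tags for equality and reserve prefix matching for `COMPONENT`.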