author    Edward Yang <ezyang@fb.com>  2018-11-11 12:08:57 -0800
committer Facebook Github Bot <facebook-github-bot@users.noreply.github.com>  2018-11-11 12:11:10 -0800
commit    e35418b3bec2f64814919f60aae9d98650771a3a (patch)
tree      5d654a7195e0c59983a60b9584397816cc77b4a3 /c10/macros
parent    4b86a215cad4bbada9fdb52a63f68bc6db52d696 (diff)
New implementations of DeviceGuard, StreamGuard and MultiStreamGuard (with CUDA specializations) (#13342)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13342

This PR introduces a few new concepts:

- DeviceGuardImplInterface, and implementations for CPU and CUDA, which provide a generic interface to device and stream state without requiring a direct dependency on the code in question.
- InlineDeviceGuard, a general template for generating both specialized and dynamically dispatched device guard implementations. Dynamic dispatch is done by specializing it on a VirtualGuardImpl.
- A device-independent DeviceGuard class, which can be used even from CPU code. It uses the aforementioned dynamic dispatch.
- A CUDA-specialized CUDAGuard class, which doesn't use dynamic dispatch but can only be used from CUDA code.
- StreamGuard, which is the same as above, but for streams rather than devices.
- Optional variants of all the aforementioned guards, which are a no-op if no device/stream is specified.
- CUDAMultiStreamGuard, specifically for the case when we want to set a stream on every device.

There are some subtle semantic changes, which have been thoroughly documented in the class definitions.

BC-breaking changes:

- Move constructor/assignment have been removed from all device guard implementations.
- In some cases where you previously wrote 'set_device' (or 'set_stream'), you now must write 'reset_device', because if you switch devices/device types, the stream/device on the previous device is unset. This is different from previous behavior.
- CUDAGuard no longer handles streams, or multiple streams. Use CUDAStreamGuard or CUDAMultiStreamGuard as appropriate for your use case.

Reviewed By: dzhulgakov

Differential Revision: D12849620

fbshipit-source-id: f61956256f0b12be754b3234fcc73c2abc1be04e
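The guards described above follow RAII semantics: constructing a guard switches the current device (and/or stream), and the destructor restores the previous state when the guard goes out of scope. A minimal usage sketch follows; the exact header paths and namespace placement (e.g. c10/core/DeviceGuard.h) are assumptions and may differ in this revision of the tree.

```cpp
// Sketch only: illustrates the RAII device-guard pattern introduced by this PR.
// Header paths are assumptions for this revision of the codebase.
#include <c10/core/Device.h>
#include <c10/core/DeviceGuard.h>

void run_on(c10::Device dev) {
  // Device-agnostic guard: dispatches dynamically to the matching
  // DeviceGuardImplInterface (a no-op for CPU, cudaSetDevice for CUDA).
  c10::DeviceGuard guard(dev);
  // ... launch work on `dev`; the previous device is restored when
  // `guard` is destroyed, even if an exception is thrown.
}

void run_on_two(c10::Device first, c10::Device second) {
  c10::DeviceGuard guard(first);
  // ... work on `first` ...
  // Per the BC note above: when the device (or device type) may change,
  // switch the guard with reset_device() rather than set_device().
  guard.reset_device(second);
  // ... work on `second`; the original device is restored at scope exit.
}
```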
Diffstat (limited to 'c10/macros')
-rw-r--r--  c10/macros/Macros.h | 2
1 file changed, 2 insertions(+), 0 deletions(-)
diff --git a/c10/macros/Macros.h b/c10/macros/Macros.h
index 2dcf0062da..0138a9cb79 100644
--- a/c10/macros/Macros.h
+++ b/c10/macros/Macros.h
@@ -53,6 +53,7 @@
// Simply define the namespace, in case a dependent library want to refer to
// the c10 namespace but not any nontrivial files.
namespace c10 {} // namespace c10
+namespace c10 { namespace detail {} }
// Since C10 is the core library for caffe2 (and aten), we will simply reroute
// all abstractions defined in c10 to be available in caffe2 as well.
@@ -60,6 +61,7 @@ namespace c10 {} // namespace c10
// c10 namespace where possible.
namespace caffe2 {using namespace c10;}
namespace at {using namespace c10;}
+namespace at { namespace detail { using namespace c10::detail; }}
// C10_NORETURN
#if defined(_MSC_VER)
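The two added lines mirror the pattern already present in this header: c10 declares an (empty) detail namespace up front, and at re-exports it with a using-directive, so helpers defined in c10::detail are also reachable as at::detail::*. A minimal sketch of that effect is below; `log_impl` is a hypothetical helper, not something defined in this commit.

```cpp
// Sketch of the effect of the added using-directive; `log_impl` is hypothetical.
#include <c10/macros/Macros.h>

namespace c10 { namespace detail {
inline int log_impl() { return 42; }
}} // namespace c10::detail

int main() {
  // Because `at::detail` nominates `c10::detail` via a using-directive,
  // qualified lookup finds the same symbol under either namespace.
  return (c10::detail::log_impl() == at::detail::log_impl()) ? 0 : 1;
}
```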