path: root/docs/cpp
authorPeter Goldsborough <psag@fb.com>2018-11-05 16:09:00 -0800
committerFacebook Github Bot <facebook-github-bot@users.noreply.github.com>2018-11-05 16:10:58 -0800
commit84cfc28f235cf01dd79553fbdf6665a4246454fb (patch)
tree64acecc28d25a9ae2e09e228cedc3eaa64cc1227 /docs/cpp
parentf6ff5d8934073a613d7ff86c82cab97c799242e5 (diff)
Note on Tensor Creation
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13517 Differential Revision: D12914271 Pulled By: goldsborough fbshipit-source-id: df64fca6652525bc814f6fd3e486c87bff29b5b5
Diffstat (limited to 'docs/cpp')
-rw-r--r--docs/cpp/source/frontend.rst6
-rw-r--r--docs/cpp/source/index.rst9
-rw-r--r--docs/cpp/source/notes/faq.rst (renamed from docs/cpp/source/faq.rst)0
-rw-r--r--docs/cpp/source/notes/tensor_creation.rst353
4 files changed, 363 insertions, 5 deletions
diff --git a/docs/cpp/source/frontend.rst b/docs/cpp/source/frontend.rst
index d94f8a8ef5..5855c3ea43 100644
--- a/docs/cpp/source/frontend.rst
+++ b/docs/cpp/source/frontend.rst
@@ -1,8 +1,8 @@
-The PyTorch C++ Frontend
-========================
+The C++ Frontend
+================
The PyTorch C++ frontend is a C++11 library for CPU and GPU
-tensor computation, with automatic differentation and high level building
+tensor computation, with automatic differentiation and high level building
blocks for state of the art machine learning applications.
Description
diff --git a/docs/cpp/source/index.rst b/docs/cpp/source/index.rst
index 85b7e814cd..6bf375e051 100644
--- a/docs/cpp/source/index.rst
+++ b/docs/cpp/source/index.rst
@@ -159,12 +159,17 @@ Contents
.. toctree::
:maxdepth: 2
- frontend
installing
- faq
+ frontend
contributing
api/library_root
+.. toctree::
+ :glob:
+ :maxdepth: 1
+ :caption: Notes
+
+ notes/*
Indices and tables
==================
diff --git a/docs/cpp/source/faq.rst b/docs/cpp/source/notes/faq.rst
index 37a1a609f3..37a1a609f3 100644
--- a/docs/cpp/source/faq.rst
+++ b/docs/cpp/source/notes/faq.rst
diff --git a/docs/cpp/source/notes/tensor_creation.rst b/docs/cpp/source/notes/tensor_creation.rst
new file mode 100644
index 0000000000..4905c75e57
--- /dev/null
+++ b/docs/cpp/source/notes/tensor_creation.rst
@@ -0,0 +1,353 @@
+Tensor Creation API
+===================
+
+This note describes how to create tensors in the PyTorch C++ API. It highlights
+the available factory functions, which populate new tensors according to some
+algorithm, and lists the options available to configure the shape, data type,
+device and other properties of a new tensor.
+
+Factory Functions
+-----------------
+
+A *factory function* is a function that produces a new tensor. There are many
+factory functions available in PyTorch (both in Python and C++), which differ
+in the way they initialize a new tensor before returning it. All factory
+functions adhere to the following general "schema":
+
+.. code-block:: cpp
+
+ torch::<function-name>(<function-specific-options>, <sizes>, <tensor-options>)
+
+Let's dissect the various parts of this "schema":
+
+1. ``<function-name>`` is the name of the function you would like to invoke,
+2. ``<function-specific-options>`` are any required or optional parameters a particular factory function accepts,
+3. ``<sizes>`` is an object of type ``IntList`` and specifies the shape of the resulting tensor,
+4. ``<tensor-options>`` is an instance of ``TensorOptions`` and configures the data type, device, layout and other properties of the resulting tensor.
+
+Picking a Factory Function
+**************************
+
+The following factory functions are available at the time of this writing (the
+hyperlinks lead to the corresponding Python functions, since they often have
+more thorough documentation -- the options are the same in C++):
+
+- `arange <https://pytorch.org/docs/stable/torch.html#torch.arange>`_: Returns a tensor with a sequence of integers,
+- `empty <https://pytorch.org/docs/stable/torch.html#torch.empty>`_: Returns a tensor with uninitialized values,
+- `eye <https://pytorch.org/docs/stable/torch.html#torch.eye>`_: Returns an identity matrix,
+- `full <https://pytorch.org/docs/stable/torch.html#torch.full>`_: Returns a tensor filled with a single value,
+- `linspace <https://pytorch.org/docs/stable/torch.html#torch.linspace>`_: Returns a tensor with values linearly spaced in some interval,
+- `logspace <https://pytorch.org/docs/stable/torch.html#torch.logspace>`_: Returns a tensor with values logarithmically spaced in some interval,
+- `ones <https://pytorch.org/docs/stable/torch.html#torch.ones>`_: Returns a tensor filled with all ones,
+- `rand <https://pytorch.org/docs/stable/torch.html#torch.rand>`_: Returns a tensor filled with values drawn from a uniform distribution on ``[0, 1)``.
+- `randint <https://pytorch.org/docs/stable/torch.html#torch.randint>`_: Returns a tensor with integers randomly drawn from an interval,
+- `randn <https://pytorch.org/docs/stable/torch.html#torch.randn>`_: Returns a tensor filled with values drawn from a unit normal distribution,
+- `randperm <https://pytorch.org/docs/stable/torch.html#torch.randperm>`_: Returns a tensor filled with a random permutation of integers in some interval,
+- `zeros <https://pytorch.org/docs/stable/torch.html#torch.zeros>`_: Returns a tensor filled with all zeros.
+
+Specifying a Size
+*****************
+
+Functions that do not require specific arguments by nature of how they fill the
+tensor can be invoked with just a size. For example, the following line creates
+a vector with 5 components, initially all set to 1:
+
+.. code-block:: cpp
+
+ torch::Tensor tensor = torch::ones(5);
+
+
+What if we wanted to instead create a ``3 x 5`` matrix, or a ``2 x 3 x 4``
+tensor? In general, an ``IntList`` -- the type of the size parameter of factory
+functions -- is constructed by specifying the size along each dimension in
+curly braces. For example, ``{2, 3}`` for a tensor (in this case matrix) with
+two rows and three columns, ``{3, 4, 5}`` for a three-dimensional tensor, and
+``{2}`` for a one-dimensional tensor with two components. In the one
+dimensional case, you can omit the curly braces and just pass the single
+integer like we did above. Note that the curly braces are just one way of
+constructing an ``IntList``. You can also pass an ``std::vector<int64_t>`` and
+a few other types. Either way, this means we can construct a three-dimensional
+tensor filled with values from a unit normal distribution by writing:
+
+.. code-block:: cpp
+
+ torch::Tensor tensor = torch::randn({3, 4, 5});
+ assert(tensor.sizes() == torch::IntList{3, 4, 5});
+
+Notice how we use ``tensor.sizes()`` to get back an ``IntList`` containing the
+sizes we passed to the tensor. You can also write ``tensor.size(i)`` to access
+a single dimension, which is equivalent to but preferred over
+``tensor.sizes()[i]``.
+
+Passing Function-Specific Parameters
+************************************
+
+Neither ``ones`` nor ``randn`` accept any additional parameters to change their
+behavior. One function which does require further configuration is ``randint``,
+which takes an upper bound on the value for the integers it generates, as well
+as an optional lower bound, which defaults to zero. Here we create a ``5 x 5``
+square matrix with integers drawn uniformly from ``[0, 10)``:
+
+.. code-block:: cpp
+
+ torch::Tensor tensor = torch::randint(/*high=*/10, {5, 5});
+
+And here we raise the lower bound to 3:
+
+.. code-block:: cpp
+
+ torch::Tensor tensor = torch::randint(/*low=*/3, /*high=*/10, {5, 5});
+
+The inline comments ``/*low=*/`` and ``/*high=*/`` are not required of course,
+but aid readability just like keyword arguments in Python.
+
+.. tip::
+
+ The main takeaway is that the size always follows the function-specific
+ arguments.
+
+.. attention::
+
+ Sometimes a function does not need a size at all. For example, the size of
+ the tensor returned by ``arange`` is fully specified by its function-specific
+ arguments -- the lower and upper bound of a range of integers. In that case
+ the function does not take a ``size`` parameter.
+
+Configuring Properties of the Tensor
+************************************
+
+The previous section discussed function-specific arguments. Function-specific
+arguments can only change the values with which tensors are filled, and
+sometimes the size of the tensor. They never change things like the data type
+(e.g. ``float32`` or ``int64``) of the tensor being created, or whether it
+lives in CPU or GPU memory. The specification of these properties is left to
+the very last argument to every factory function: a ``TensorOptions`` object,
+discussed below.
+
+``TensorOptions`` is a class that encapsulates the construction axes of a
+Tensor. By *construction axis* we mean a particular property of a Tensor that
+can be configured before its construction (and sometimes changed afterwards).
+These construction axes are:
+
+- The ``dtype`` (previously "scalar type"), which controls the data type of the
+ elements stored in the tensor,
+- The ``layout``, which is either strided (dense) or sparse,
+- The ``device``, which represents a compute device on which a tensor is stored (like a CPU or CUDA GPU),
+- The ``requires_grad`` boolean to enable or disable gradient recording for a tensor.
+
+If you are used to PyTorch in Python, these axes will sound very familiar. The
+allowed values for these axes at the moment are:
+
+- For ``dtype``: ``kUInt8``, ``kInt8``, ``kInt16``, ``kInt32``, ``kInt64``, ``kFloat32`` and ``kFloat64``,
+- For ``layout``: ``kStrided`` and ``kSparse``,
+- For ``device``: Either ``kCPU``, or ``kCUDA`` (which accepts an optional device index),
+- For ``requires_grad``: either ``true`` or ``false``.
+
+.. tip::
+
+ There exist "Rust-style" shorthands for dtypes, like ``kF32`` instead of
+ ``kFloat32``. See `here
+ <https://github.com/pytorch/pytorch/blob/master/torch/csrc/api/include/torch/types.h>`_
+ for the full list.
+
+
+An instance of ``TensorOptions`` stores a concrete value for each of these
+axes. Here is an example of creating a ``TensorOptions`` object that represents
+a 32-bit float, strided tensor that requires a gradient, and lives on CUDA
+device 1:
+
+.. code-block:: cpp
+
+ auto options =
+ torch::TensorOptions()
+ .dtype(torch::kFloat32)
+ .layout(torch::kStrided)
+ .device({torch::kCUDA, 1})
+ .requires_grad(true);
+
+
+Notice how we use the "builder"-style methods of ``TensorOptions`` to
+construct the object piece by piece. If we pass this object as the last
+argument to a factory function, the newly created tensor will have these
+properties:
+
+.. code-block:: cpp
+
+ torch::Tensor tensor = torch::full({3, 4}, /*fill_value=*/123, options);
+
+ assert(tensor.dtype() == torch::kFloat32);
+ assert(tensor.layout() == torch::kStrided);
+ assert(tensor.device().type() == torch::kCUDA); // or device().is_cuda()
+ assert(tensor.device().index() == 1);
+ assert(tensor.requires_grad());
+
+ assert(tensor.options() == options);
+
+Now, you may be thinking: do I really need to specify each axis for every new
+tensor I create? Fortunately, the answer is "no", as **every axis has a default
+value**. These defaults are:
+
+- ``kFloat32`` for the dtype,
+- ``kStrided`` for the layout,
+- ``kCPU`` for the device,
+- ``false`` for ``requires_grad``.
+
+What this means is that any axis you omit during the construction of a
+``TensorOptions`` object will take on its default value. For example, this is
+our previous ``TensorOptions`` object, but with the ``dtype`` and ``layout``
+defaulted:
+
+.. code-block:: cpp
+
+ auto options = torch::TensorOptions().device({torch::kCUDA, 1}).requires_grad(true);
+
+In fact, we can even omit all axes to get an entirely defaulted
+``TensorOptions`` object:
+
+.. code-block:: cpp
+
+ auto options = torch::TensorOptions(); // or `torch::TensorOptions options;`
+
+A nice consequence of this is that the ``TensorOptions`` object we just spoke
+so much about can be entirely omitted from any tensor factory call:
+
+.. code-block:: cpp
+
+ // A 32-bit float, strided, CPU tensor that does not require a gradient.
+ torch::Tensor tensor = torch::randn({3, 4});
+ torch::Tensor range = torch::arange(5, 10);
+
+But the sugar gets sweeter: In the API presented here so far, you may have
+noticed that the initial ``torch::TensorOptions()`` is quite a mouthful to
+write. The good news is that for every construction axis (dtype, layout, device
+and ``requires_grad``), there is one *free function* in the ``torch::``
+namespace to which you can pass a value for that axis. Each function then returns
+a ``TensorOptions`` object preconfigured with that axis, but allowing even
+further modification via the builder-style methods shown above. For example,
+
+.. code-block:: cpp
+
+ torch::ones(10, torch::TensorOptions().dtype(torch::kFloat32))
+
+is equivalent to
+
+.. code-block:: cpp
+
+ torch::ones(10, torch::dtype(torch::kFloat32))
+
+and further, instead of
+
+.. code-block:: cpp
+
+ torch::ones(10, torch::TensorOptions().dtype(torch::kFloat32).layout(torch::kStrided))
+
+we can just write
+
+.. code-block:: cpp
+
+ torch::ones(10, torch::dtype(torch::kFloat32).layout(torch::kStrided))
+
+which saves us quite a bit of typing. What this means is that in practice, you
+should barely, if ever, have to write out ``torch::TensorOptions``. Instead use
+the ``torch::dtype()``, ``torch::device()``, ``torch::layout()`` and
+``torch::requires_grad()`` functions.
+
+A final bit of convenience is that ``TensorOptions`` is implicitly
+constructible from individual values. This means that whenever a function has a
+parameter of type ``TensorOptions``, like all factory functions do, we can
+directly pass a value like ``torch::kFloat32`` or ``torch::kStrided`` in place
+of the full object. Therefore, when there is only a single axis we would like
+to change compared to its default value, we can pass only that value. As such,
+what was
+
+.. code-block:: cpp
+
+ torch::ones(10, torch::TensorOptions().dtype(torch::kFloat32))
+
+became
+
+.. code-block:: cpp
+
+ torch::ones(10, torch::dtype(torch::kFloat32))
+
+and can finally be shortened to
+
+.. code-block:: cpp
+
+ torch::ones(10, torch::kFloat32)
+
+Of course, it is not possible to modify further properties of the
+``TensorOptions`` instance with this short syntax, but if all we needed was to
+change one property, this is quite practical.
+
+In conclusion, we can now compare how ``TensorOptions`` defaults, together with
+the abbreviated API for creating ``TensorOptions`` using free functions, allow
+tensor creation in C++ with the same convenience as in Python. Compare this
+call in Python::
+
+ torch.randn(3, 4, dtype=torch.float32, device=torch.device('cuda', 1), requires_grad=True)
+
+with the equivalent call in C++:
+
+.. code-block:: cpp
+
+ torch::randn({3, 4}, torch::dtype(torch::kFloat32).device({torch::kCUDA, 1}).requires_grad(true))
+
+Pretty close!
+
+Conversion
+----------
+
+Just as we can use ``TensorOptions`` to configure how new tensors should be
+created, we can also use ``TensorOptions`` to convert a tensor from one set of
+properties to a new set of properties. Such a conversion usually creates a new
+tensor and does not occur in-place. For example, if we have a ``source_tensor``
+created with
+
+.. code-block:: cpp
+
+ torch::Tensor source_tensor = torch::randn({2, 3}, torch::kInt64);
+
+we can convert it from ``int64`` to ``float32``:
+
+.. code-block:: cpp
+
+ torch::Tensor float_tensor = source_tensor.to(torch::kFloat32);
+
+.. attention::
+
+ The result of the conversion, ``float_tensor``, is a new tensor pointing to
+ new memory, unrelated to the source ``source_tensor``.
+
+We can then move it from CPU memory to GPU memory:
+
+.. code-block:: cpp
+
+ torch::Tensor gpu_tensor = float_tensor.to(torch::kCUDA);
+
+If you have multiple CUDA devices available, the above code will copy the
+tensor to the *default* CUDA device, which you can configure with a
+``torch::DeviceGuard``. If no ``DeviceGuard`` is in place, this will be GPU
+0. If you would like to specify a different GPU index, you can pass it to
+the ``Device`` constructor:
+
+.. code-block:: cpp
+
+ torch::Tensor gpu_two_tensor = float_tensor.to(torch::Device(torch::kCUDA, 1));
+
+For copies between CPU and GPU memory (in either direction), we can also make
+the copy *asynchronous* by passing ``/*non_blocking=*/true`` as the last
+argument to ``to()``:
+
+.. code-block:: cpp
+
+ torch::Tensor async_cpu_tensor = gpu_tensor.to(torch::kCPU, /*non_blocking=*/true);
+
+Conclusion
+----------
+
+This note hopefully gave you a good understanding of how to create and convert
+tensors in an idiomatic fashion using the PyTorch C++ API. If you have any
+further questions or suggestions, please use our `forum
+<https://discuss.pytorch.org/>`_ or `GitHub issues
+<https://github.com/pytorch/pytorch/issues>`_ to get in touch.