PyTorch Tutorial - 2.1. Data Manipulation - 电子发烧友网

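In order to get anything done, we need some way to store and manipulate data. Generally, there are two important things we need to do with data: (i) acquire them; and (ii) process them once they are inside the computer. There is no point in acquiring data without some way to store it, so to start, let's get our hands dirty with n-dimensional arrays, which we also call tensors. If you already know the NumPy scientific computing package, this will be a breeze. For all modern deep learning frameworks, the tensor class (ndarray in MXNet, Tensor in PyTorch and TensorFlow) resembles NumPy's ndarray, with a few killer features added. First, the tensor class supports automatic differentiation. Second, it leverages GPUs to accelerate numerical computation, whereas NumPy only runs on CPUs. These properties make neural networks both easy to code and fast to run.

2.1.1. Getting Started

To start, we import the PyTorch library. Note that the package name is torch.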

import torch

To start, we import the np (numpy) and npx (numpy_extension) modules from MXNet. Here, the np module includes functions supported by NumPy, while the npx module contains a set of extensions developed to empower deep learning within a NumPy-like environment. When using tensors, we almost always invoke the set_np function: this is for compatibility of tensor processing by other components of MXNet.

from mxnet import np, npx
npx.set_np()

import jax
from jax import numpy as jnp

To start, we import tensorflow. For brevity, practitioners often assign the alias tf.

import tensorflow as tf

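A tensor represents a (possibly multidimensional) array of numerical values. With one axis, a tensor is called a vector. With two axes, a tensor is called a matrix. With k > 2 axes, we drop the specialized names and just refer to the object as a k-th order tensor.

PyTorch provides a variety of functions for creating new tensors prepopulated with values. For example, by invoking arange(n), we can create a vector of evenly spaced values, starting at 0 (included) and ending at n (not included). By default, the interval size is 1. Unless otherwise specified, new tensors are stored in main memory and designated for CPU-based computation.

x = torch.arange(12, dtype=torch.float32)
x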

tensor([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11.])

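Each of these values is called an element of the tensor. The tensor x contains 12 elements. We can inspect the total number of elements in a tensor via its numel method.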

x.numel()

MXNet provides a variety of functions for creating new tensors prepopulated with values. For example, by invoking arange(n), we can create a vector of evenly spaced values, starting at 0 (included) and ending at n (not included). By default, the interval size is 1. Unless otherwise specified, new tensors are stored in main memory and designated for CPU-based computation.

x = np.arange(12)
x

array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11.])

Each of these values is called an element of the tensor. The tensor x contains 12 elements. We can inspect the total number of elements in a tensor via its size attribute.

x.size

x = jnp.arange(12)
x

No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)

Array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], dtype=int32)

x.size

TensorFlow provides a variety of functions for creating new tensors prepopulated with values. For example, by invoking range(n), we can create a vector of evenly spaced values, starting at 0 (included) and ending at n (not included). By default, the interval size is 1. Unless otherwise specified, new tensors are stored in main memory and designated for CPU-based computation.

x = tf.range(12, dtype=tf.float32)
x

<tf.Tensor: shape=(12,), dtype=float32, numpy=
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11.],
   dtype=float32)>

Each of these values is called an element of the tensor. The tensor x contains 12 elements. We can inspect the total number of elements in a tensor via the size function.

tf.size(x)

<tf.Tensor: shape=(), dtype=int32, numpy=12>

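We can access a tensor's shape (the length along each axis) by inspecting its shape attribute. Because we are dealing with a vector here, the shape contains just a single element and is identical to the size.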

x.shape

torch.Size([12])

x.shape

(12,)

x.shape

(12,)

x.shape

TensorShape([12])

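We can change the shape of a tensor without altering its size or values by invoking reshape. For example, we can transform our vector x, whose shape is (12,), into a matrix X with shape (3, 4). This new tensor retains all elements but reconfigures them into a matrix. Notice that the elements of our vector are laid out one row at a time, and thus x[3] == X[0, 3].

X = x.reshape(3, 4)
X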

tensor([[ 0., 1., 2., 3.],
    [ 4., 5., 6., 7.],
    [ 8., 9., 10., 11.]])

X = x.reshape(3, 4)
X

array([[ 0., 1., 2., 3.],
    [ 4., 5., 6., 7.],
    [ 8., 9., 10., 11.]])

X = x.reshape(3, 4)
X

Array([[ 0, 1, 2, 3],
    [ 4, 5, 6, 7],
    [ 8, 9, 10, 11]], dtype=int32)

X = tf.reshape(x, (3, 4))
X

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[ 0., 1., 2., 3.],
    [ 4., 5., 6., 7.],
    [ 8., 9., 10., 11.]], dtype=float32)>
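The row-at-a-time layout behind x[3] == X[0, 3] can be sketched in plain Python. The helper names flat_to_row_col and row_col_to_flat are hypothetical and exist only for this illustration; they are not part of any framework.

```python
def flat_to_row_col(k, shape):
    """Map flat index k to (row, col) for a row-major matrix of shape (h, w)."""
    h, w = shape
    if not 0 <= k < h * w:
        raise IndexError(f"flat index {k} out of range for shape {shape}")
    # Each row holds w consecutive elements of the original vector.
    return divmod(k, w)


def row_col_to_flat(i, j, shape):
    """Inverse mapping: (row, col) back to the flat index."""
    _, w = shape
    return i * w + j


print(flat_to_row_col(3, (3, 4)))     # (0, 3), matching x[3] == X[0, 3]
print(flat_to_row_col(6, (3, 4)))     # (1, 2)
print(row_col_to_flat(1, 2, (3, 4)))  # 6
```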

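Note that specifying every shape component to reshape is redundant. Because we already know our tensor's size, we can work out one component of the shape given the rest. For example, given a tensor of size n and target shape (h, w), we know that w = n/h. To automatically infer one component of the shape, we can place a -1 for the shape component that should be inferred automatically. In our case, instead of calling x.reshape(3, 4), we could have equivalently called x.reshape(-1, 4) or x.reshape(3, -1).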
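As a rough sketch of how a -1 entry can be resolved, the plain-Python helper below divides the total number of elements by the product of the known components. The function infer_shape is hypothetical and written only for illustration; it is not how any framework necessarily implements reshape.

```python
def infer_shape(numel, shape):
    """Resolve at most one -1 entry in `shape` so that the product equals `numel`."""
    shape = list(shape)
    unknown = [i for i, s in enumerate(shape) if s == -1]
    if len(unknown) > 1:
        raise ValueError("at most one shape component may be -1")
    # Product of all explicitly given components.
    known = 1
    for s in shape:
        if s != -1:
            known *= s
    if unknown:
        if known == 0 or numel % known != 0:
            raise ValueError(f"cannot fit {numel} elements into shape {tuple(shape)}")
        shape[unknown[0]] = numel // known
    elif known != numel:
        raise ValueError(f"cannot fit {numel} elements into shape {tuple(shape)}")
    return tuple(shape)


print(infer_shape(12, (-1, 4)))  # (3, 4)
print(infer_shape(12, (3, -1)))  # (3, 4)
```

With n = 12 and w = 4, the missing component is h = 12 / 4 = 3, which is the same arithmetic as w = n/h in the text.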

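Practitioners often need to work with tensors initialized to contain all zeros or all ones. We can construct a tensor with all elements set to zero and a shape of (2, 3, 4) via the zeros function.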

torch.zeros((2, 3, 4))

tensor([[[0., 0., 0., 0.],
     [0., 0., 0., 0.],
     [0., 0., 0., 0.]],

    [[0., 0., 0., 0.],
     [0., 0., 0., 0.],
     [0., 0., 0., 0.]]])

np.zeros((2, 3, 4))

array([[[0., 0., 0., 0.],
    [0., 0., 0., 0.],
    [0., 0., 0., 0.]],

    [[0., 0., 0., 0.],
    [0., 0., 0., 0.],
    [0., 0., 0., 0.]]])

jnp.zeros((2, 3, 4))

Array([[[0., 0., 0., 0.],
    [0., 0., 0., 0.],
    [0., 0., 0., 0.]],

    [[0., 0., 0., 0.],
    [0., 0., 0., 0.],
    [0., 0., 0., 0.]]], dtype=float32)

tf.zeros((2, 3, 4))

<tf.Tensor: shape=(2, 3, 4), dtype=float32, numpy=
array([[[0., 0., 0., 0.],
    [0., 0., 0., 0.],
    [0., 0., 0., 0.]],

    [[0., 0., 0., 0.],
    [0., 0., 0., 0.],
    [0., 0., 0., 0.]]], dtype=float32)>

Similarly, we can create a tensor with all ones by invoking ones.

torch.ones((2, 3, 4))

tensor([[[1., 1., 1., 1.],
     [1., 1., 1., 1.],
     [1., 1., 1., 1.]],

    [[1., 1., 1., 1.],
     [1., 1., 1., 1.],
     [1., 1., 1., 1.]]])

np.ones((2, 3, 4))

array([[[1., 1., 1., 1.],
    [1., 1., 1., 1.],
    [1., 1., 1., 1.]],

    [[1., 1., 1., 1.],
    [1., 1., 1., 1.],
    [1., 1., 1., 1.]]])

jnp.ones((2, 3, 4))

Array([[[1., 1., 1., 1.],
    [1., 1., 1., 1.],
    [1., 1., 1., 1.]],

    [[1., 1., 1., 1.],
    [1., 1., 1., 1.],
    [1., 1., 1., 1.]]], dtype=float32)

tf.ones((2, 3, 4))

<tf.Tensor: shape=(2, 3, 4), dtype=float32, numpy=
array([[[1., 1., 1., 1.],
    [1., 1., 1., 1.],
    [1., 1., 1., 1.]],

    [[1., 1., 1., 1.],
    [1., 1., 1., 1.],
    [1., 1., 1., 1.]]], dtype=float32)>

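We often wish to sample each element randomly (and independently) from a given probability distribution. For example, the parameters of neural networks are often initialized randomly. The following snippet creates a tensor with elements drawn from a standard Gaussian (normal) distribution with mean 0 and standard deviation 1.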

torch.randn(3, 4)

tensor([[ 1.4251, -1.4341, 0.2826, -0.4915],
    [ 0.1799, -1.1769, 2.3581, -0.1923],
    [ 0.8576, -0.0719, 1.4172, -1.3151]])

np.random.normal(0, 1, size=(3, 4))

array([[ 2.2122064 , 1.1630787 , 0.7740038 , 0.4838046 ],
    [ 1.0434403 , 0.29956347, 1.1839255 , 0.15302546],
    [ 1.8917114 , -1.1688148 , -1.2347414 , 1.5580711 ]])

# Any call of a random function in JAX requires a key to be
# specified, feeding the same key to a random function will
# always result in the same sample being generated
jax.random.normal(jax.random.PRNGKey(0), (3, 4))

Array([[ 1.1901639 , -1.0996888 , 0.44367844, 0.5984697 ],
    [-0.39189556, 0.69261974, 0.46018356, -2.068578 ],
    [-0.21438177, -0.9898306 , -0.6789304 , 0.27362573]],   dtype=float32)

tf.random.normal(shape=[3, 4])

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[-1.198586 , -0.04204642, -0.6005369 , 1.4371548 ],
    [ 0.08375237, 0.947974 , 1.6228461 , 1.1598791 ],
    [ 0.58289856, -0.76583815, -0.36692864, 1.727855 ]],
   dtype=float32)>

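Finally, we can construct tensors by supplying the exact values for each element as (possibly nested) Python lists containing numerical literals. Here, we construct a matrix with a list of lists, where the outermost list corresponds to axis 0, and the inner list corresponds to axis 1.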

torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

tensor([[2, 1, 4, 3],
    [1, 2, 3, 4],
    [4, 3, 2, 1]])

np.array([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

array([[2., 1., 4., 3.],
    [1., 2., 3., 4.],
    [4., 3., 2., 1.]])

jnp.array([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

Array([[2, 1, 4, 3],
    [1, 2, 3, 4],
    [4, 3, 2, 1]], dtype=int32)

tf.constant([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[2, 1, 4, 3],
    [1, 2, 3, 4],
    [4, 3, 2, 1]], dtype=int32)>

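2.1.2. Indexing and Slicing

As with Python lists, we can access tensor elements by indexing (starting with 0). To access an element based on its position relative to the end of the list, we can use negative indexing. We can also access whole ranges of indices via slicing (e.g., X[start:stop]), where the returned value includes the first index (start) but not the last (stop). Finally, when only one index (or slice) is specified for a k-th order tensor, it is applied along axis 0. Thus, in the following code, [-1] selects the last row and [1:3] selects the second and third rows.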

X[-1], X[1:3]

(tensor([ 8., 9., 10., 11.]),
 tensor([[ 4., 5., 6., 7.],
     [ 8., 9., 10., 11.]]))

Beyond reading, we can also write elements of a matrix by specifying indices.

X[1, 2] = 17
X

tensor([[ 0., 1., 2., 3.],
    [ 4., 5., 17., 7.],
    [ 8., 9., 10., 11.]])

X[-1], X[1:3]

(array([ 8., 9., 10., 11.]),
 array([[ 4., 5., 6., 7.],
    [ 8., 9., 10., 11.]]))

Beyond reading, we can also write elements of a matrix by specifying indices.

X[1, 2] = 17
X

array([[ 0., 1., 2., 3.],
    [ 4., 5., 17., 7.],
    [ 8., 9., 10., 11.]])

X[-1], X[1:3]

(Array([ 8, 9, 10, 11], dtype=int32),
 Array([[ 4, 5, 6, 7],
    [ 8, 9, 10, 11]], dtype=int32))

# JAX arrays are immutable. jax.numpy.ndarray.at index
# update operators create a new array with the corresponding
# modifications made
X_new_1 = X.at[1, 2].set(17)
X_new_1

Array([[ 0, 1, 2, 3],
    [ 4, 5, 17, 7],
    [ 8, 9, 10, 11]], dtype=int32)

X[-1], X[1:3]

(<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 8., 9., 10., 11.], dtype=float32)>,
 <tf.Tensor: shape=(2, 4), dtype=float32, numpy=
 array([[ 4., 5., 6., 7.],
    [ 8., 9., 10., 11.]], dtype=float32)>)

Tensors in TensorFlow are immutable, and cannot be assigned to. Variables in TensorFlow are mutable containers of state that support assignments. Keep in mind that gradients in TensorFlow do not flow backwards through Variable assignments.

Beyond assigning a value to the entire Variable, we can write elements of a Variable by specifying indices.

X_var = tf.Variable(X)
X_var[1, 2].assign(9)
X_var

<tf.Variable 'Variable:0' shape=(3, 4) dtype=float32, numpy=
array([[ 0., 1., 2., 3.],
    [ 4., 5., 9., 7.],
    [ 8., 9., 10., 11.]], dtype=float32)>

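If we want to assign multiple elements the same value, we apply the indexing on the left-hand side of the assignment operation. For instance, [:2, :] accesses the first and second rows, where : takes all the elements along axis 1 (column). While we discussed indexing for matrices, this also works for vectors and for tensors of more than two dimensions.

X[:2, :] = 12
X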

tensor([[12., 12., 12., 12.],
    [12., 12., 12., 12.],
    [ 8., 9., 10., 11.]])

X[:2, :] = 12
X

array([[12., 12., 12., 12.],
    [12., 12., 12., 12.],
    [ 8., 9., 10., 11.]])

X_new_2 = X_new_1.at[:2, :].set(12)
X_new_2

Array([[12, 12, 12, 12],
    [12, 12, 12, 12],
    [ 8, 9, 10, 11]], dtype=int32)

X_var = tf.Variable(X)
X_var[:2, :].assign(tf.ones(X_var[:2,:].shape, dtype=tf.float32) * 12)
X_var

<tf.Variable 'Variable:0' shape=(3, 4) dtype=float32, numpy=
array([[12., 12., 12., 12.],
    [12., 12., 12., 12.],
    [ 8., 9., 10., 11.]], dtype=float32)>
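The same indexing rules carry over to vectors and to tensors with more than two axes. The following PyTorch sketch uses a made-up 2 x 3 x 4 tensor T and a made-up vector v purely for illustration.

```python
import torch

# A third-order tensor: 2 blocks, each a 3 x 4 matrix.
T = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)

# A single index applies along axis 0 and returns the last 3 x 4 block.
print(T[-1].shape)      # torch.Size([3, 4])

# Slices can be given per axis; unspecified trailing axes are taken whole.
print(T[:, 1:3].shape)  # torch.Size([2, 2, 4])

# Assign one value to the first row of every block.
T[:, 0, :] = 12
print(T[1, 0])          # tensor([12., 12., 12., 12.])

# Slice assignment on a vector works the same way.
v = torch.arange(5)
v[1:3] = 0
print(v)                # tensor([0, 0, 0, 3, 4])
```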

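2.1.3. Operations

Now that we know how to construct tensors and how to read from and write to their elements, we can begin to manipulate them with various mathematical operations. Among the most useful of these are the elementwise operations. These apply a standard scalar operation to each element of a tensor. For functions that take two tensors as inputs, elementwise operations apply some standard binary operator on each pair of corresponding elements. We can create an elementwise function from any function that maps from a scalar to a scalar.

In mathematical notation, we denote such unary scalar operators (taking one input) by the signature $f: \mathbb{R} \rightarrow \mathbb{R}$. This just means that the function maps from any real number onto some other real number. Most standard operators, including unary ones like $e^x$, can be applied elementwise.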

torch.exp(x)

tensor([162754.7969, 162754.7969, 162754.7969, 162754.7969, 162754.7969,
    162754.7969, 162754.7969, 162754.7969,  2980.9580,  8103.0840,
     22026.4648, 59874.1406])

np.exp(x)

array([1.0000000e+00, 2.7182817e+00, 7.3890562e+00, 2.0085537e+01,
    5.4598148e+01, 1.4841316e+02, 4.0342880e+02, 1.0966332e+03,
    2.9809580e+03, 8.1030840e+03, 2.2026465e+04, 5.9874141e+04])

jnp.exp(x)

Array([1.0000000e+00, 2.7182817e+00, 7.3890562e+00, 2.0085537e+01,
    5.4598152e+01, 1.4841316e+02, 4.0342880e+02, 1.0966332e+03,
    2.9809580e+03, 8.1030840e+03, 2.2026465e+04, 5.9874141e+04],   dtype=float32)

tf.exp(x)

<tf.Tensor: shape=(12,), dtype=float32, numpy=
array([1.0000000e+00, 2.7182817e+00, 7.3890562e+00, 2.0085537e+01,
    5.4598148e+01, 1.4841316e+02, 4.0342877e+02, 1.0966332e+03,
    2.9809580e+03, 8.1030840e+03, 2.2026465e+04, 5.9874141e+04],
   dtype=float32)>
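The PyTorch output of torch.exp(x) above begins with eight copies of e^12 (about 162754.8) rather than e^0, e^1, and so on, unlike the other frameworks. The likely explanation is that X = x.reshape(3, 4) returned a view sharing memory with x, so the earlier assignment X[:2, :] = 12 also overwrote the first eight entries of x. The sketch below reproduces that effect under this assumption; the use of clone() to obtain an independent copy is an addition for illustration and does not appear in the original text.

```python
import torch

x = torch.arange(12, dtype=torch.float32)
X = x.reshape(3, 4)   # for a contiguous tensor, reshape returns a view
X[:2, :] = 12         # writes through to the memory shared with x

print(x[:8])                         # all eight entries are now 12
print(x.data_ptr() == X.data_ptr())  # True: same underlying storage

# Cloning before writing keeps the original vector untouched.
x_safe = torch.arange(12, dtype=torch.float32)
X_copy = x_safe.reshape(3, 4).clone()
X_copy[:2, :] = 12
print(x_safe[:8])                    # still 0 through 7
```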

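Likewise, we denote binary scalar operators, which map pairs of real numbers to a (single) real number, via the signature $f: \mathbb{R}, \mathbb{R} \rightarrow \mathbb{R}$. Given any two vectors $\mathbf{u}$ and $\mathbf{v}$ of the same shape, and a binary operator $f$, we can produce a vector $\mathbf{c} = F(\mathbf{u}, \mathbf{v})$ by setting $c_i \gets f(u_i, v_i)$ for all $i$, where $c_i$, $u_i$, and $v_i$ are the $i$-th elements of vectors $\mathbf{c}$, $\mathbf{u}$, and $\mathbf{v}$. Here, we produced the vector-valued $F: \mathbb{R}^d, \mathbb{R}^d \rightarrow \mathbb{R}^d$ by lifting the scalar function to an elementwise vector operation. The common standard arithmetic operators for addition (+), subtraction (-), multiplication (*), division (/), and exponentiation (**) have all been lifted to elementwise operations for identically-shaped tensors of arbitrary shape.

x = torch.tensor([1.0, 2, 4, 8])
y = torch.tensor([2, 2, 2, 2])
x + y, x - y, x * y, x / y, x ** y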

(tensor([ 3., 4., 6., 10.]),
 tensor([-1., 0., 2., 6.]),
 tensor([ 2., 4., 8., 16.]),
 tensor([0.5000, 1.0000, 2.0000, 4.0000]),
 tensor([ 1., 4., 16., 64.]))

x = np.array([1, 2, 4, 8])
y = np.array([2, 2, 2, 2])
x + y, x - y, x * y, x / y, x ** y

(array([ 3., 4., 6., 10.]),
 array([-1., 0., 2., 6.]),
 array([ 2., 4., 8., 16.]),
 array([0.5, 1. , 2. , 4. ]),
 array([ 1., 4., 16., 64.]))

x = jnp.array([1.0, 2, 4, 8])
y = jnp.array([2, 2, 2, 2])
x + y, x - y, x * y, x / y, x ** y

(Array([ 3., 4., 6., 10.], dtype=float32),
 Array([-1., 0., 2., 6.], dtype=float32),
 Array([ 2., 4., 8., 16.], dtype=float32),
 Array([0.5, 1. , 2. , 4. ], dtype=float32),
 Array([ 1., 4., 16., 64.], dtype=float32))

x = tf.constant([1.0, 2, 4, 8])
y = tf.constant([2.0, 2, 2, 2])
x + y, x - y, x * y, x / y, x ** y

(<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 3., 4., 6., 10.], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([-1., 0., 2., 6.], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 2., 4., 8., 16.], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([0.5, 1. , 2. , 4. ], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 1., 4., 16., 64.], dtype=float32)>)
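The idea of lifting a scalar function f to an elementwise function F, with c_i = f(u_i, v_i), can be sketched without any framework. The helper lift below is hypothetical and operates on plain Python lists; real tensor libraries implement the same idea far more efficiently.

```python
import math


def lift(f):
    """Turn a binary scalar function f into an elementwise function F on equal-length lists."""
    def F(u, v):
        if len(u) != len(v):
            raise ValueError("u and v must have the same length")
        # c_i = f(u_i, v_i) for every position i
        return [f(ui, vi) for ui, vi in zip(u, v)]
    return F


u = [1.0, 2.0, 4.0, 8.0]
v = [2.0, 2.0, 2.0, 2.0]

add = lift(lambda a, b: a + b)
power = lift(math.pow)

print(add(u, v))    # [3.0, 4.0, 6.0, 10.0]
print(power(u, v))  # [1.0, 4.0, 16.0, 64.0]
```

These values agree with the x + y and x ** y outputs shown above for the same inputs.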

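In addition to elementwise computations, we can also perform linear algebraic operations, such as dot products and matrix multiplications. We will elaborate on these in Section 2.3.

We can also concatenate multiple tensors, stacking them end-to-end to form a larger one. We just need to provide a list of tensors and tell the system along which axis to concatenate. The example below shows what happens when we concatenate two matrices along rows (axis 0) instead of columns (axis 1). We can see that the first output's axis-0 length (6) is the sum of the two input tensors' axis-0 lengths (3+3), while the second output's axis-1 length (8) is the sum of the two input tensors' axis-1 lengths (4+4).

X = torch.arange(12, dtype=torch.float32).reshape((3,4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
torch.cat((X, Y), dim=0), torch.cat((X, Y), dim=1)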

(tensor([[ 0., 1., 2., 3.],
     [ 4., 5., 6., 7.],
     [ 8., 9., 10., 11.],
     [ 2., 1., 4., 3.],
     [ 1., 2., 3., 4.],
     [ 4., 3., 2., 1.]]),
 tensor([[ 0., 1., 2., 3., 2., 1., 4., 3.],
     [ 4., 5., 6., 7., 1., 2., 3., 4.],
     [ 8., 9., 10., 11., 4., 3., 2., 1.]]))

X = np.arange(12).reshape(3, 4)
Y = np.array([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
np.concatenate([X, Y], axis=0), np.concatenate([X, Y], axis=1)

(array([[ 0., 1., 2., 3.],
    [ 4., 5., 6., 7.],
    [ 8., 9., 10., 11.],
    [ 2., 1., 4., 3.],
    [ 1., 2., 3., 4.],
    [ 4., 3., 2., 1.]]),
 array([[ 0., 1., 2., 3., 2., 1., 4., 3.],
    [ 4., 5., 6., 7., 1., 2., 3., 4.],
    [ 8., 9., 10., 11., 4., 3., 2., 1.]]))

X = jnp.arange(12, dtype=jnp.float32).reshape((3, 4))
Y = jnp.array([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
jnp.concatenate((X, Y), axis=0), jnp.concatenate((X, Y), axis=1)

(Array([[ 0., 1., 2., 3.],
    [ 4., 5., 6., 7.],
    [ 8., 9., 10., 11.],
    [ 2., 1., 4., 3.],
    [ 1., 2., 3., 4.],
    [ 4., 3., 2., 1.]], dtype=float32),
 Array([[ 0., 1., 2., 3., 2., 1., 4., 3.],
    [ 4., 5., 6., 7., 1., 2., 3., 4.],
    [ 8., 9., 10., 11., 4., 3., 2., 1.]], dtype=float32))

X = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))
Y = tf.constant([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
tf.concat([X, Y], axis=0), tf.concat([X, Y], axis=1)

(<tf.Tensor: shape=(6, 4), dtype=float32, numpy=
 array([[ 0., 1., 2., 3.],
    [ 4., 5., 6., 7.],
    [ 8., 9., 10., 11.],
    [ 2., 1., 4., 3.],
    [ 1., 2., 3., 4.],
    [ 4., 3., 2., 1.]], dtype=float32)>,
 <tf.Tensor: shape=(3, 8), dtype=float32, numpy=
 array([[ 0., 1., 2., 3., 2., 1., 4., 3.],
    [ 4., 5., 6., 7., 1., 2., 3., 4.],
    [ 8., 9., 10., 11., 4., 3., 2., 1.]], dtype=float32)>)

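Sometimes, we want to construct a binary tensor via logical statements. Take X == Y as an example. For each position i, j, if X[i, j] and Y[i, j] are equal, then the corresponding entry in the result takes value 1 (displayed as True), otherwise it takes value 0 (displayed as False).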

X == Y

tensor([[False, True, False, True],
    [False, False, False, False],
    [False, False, False, False]])

X == Y

array([[False, True, False, True],
    [False, False, False, False],
    [False, False, False, False]])

X == Y

Array([[False, True, False, True],
    [False, False, False, False],
    [False, False, False, False]], dtype=bool)

X == Y

<tf.Tensor: shape=(3, 4), dtype=bool, numpy=
array([[False, True, False, True],
    [False, False, False, False],
    [False, False, False, False]])>

Summing all the elements in the tensor yields a tensor with only one element.

X.sum()

tensor(66.)

X.sum()

array(66.)

X.sum()

Array(66., dtype=float32)

tf.reduce_sum(X)

<tf.Tensor: shape=(), dtype=float32, numpy=66.0>

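2.1.4. Broadcasting

By now, you know how to perform elementwise binary operations on two tensors of the same shape. Under certain conditions, even when shapes differ, we can still perform elementwise binary operations by invoking the broadcasting mechanism. Broadcasting works according to the following two-step procedure: (i) expand one or both arrays by copying elements along axes with length 1 so that after this transformation, the two tensors have the same shape; (ii) perform an elementwise operation on the resulting arrays.

a = torch.arange(3).reshape((3, 1))
b = torch.arange(2).reshape((1, 2))
a, b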

(tensor([[0],
     [1],
     [2]]),
 tensor([[0, 1]]))

a = np.arange(3).reshape(3, 1)
b = np.arange(2).reshape(1, 2)
a, b

(array([[0.],
    [1.],
    [2.]]),
 array([[0., 1.]]))

a = jnp.arange(3).reshape((3, 1))
b = jnp.arange(2).reshape((1, 2))
a, b

(Array([[0],
    [1],
    [2]], dtype=int32),
 Array([[0, 1]], dtype=int32))

a = tf.reshape(tf.range(3), (3, 1))
b = tf.reshape(tf.range(2), (1, 2))
a, b

(<tf.Tensor: shape=(3, 1), dtype=int32, numpy=
 array([[0],
    [1],
    [2]], dtype=int32)>,
 <tf.Tensor: shape=(1, 2), dtype=int32, numpy=array([[0, 1]], dtype=int32)>)

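Since a and b are 3×1 and 1×2 matrices, respectively, their shapes do not match up. Broadcasting produces a larger 3×2 matrix by replicating matrix a along the columns and matrix b along the rows before adding them elementwise.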

a + b

tensor([[0, 1],
    [1, 2],
    [2, 3]])

a + b

array([[0., 1.],
    [1., 2.],
    [2., 3.]])

a + b

Array([[0, 1],
    [1, 2],
    [2, 3]], dtype=int32)

a + b

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[0, 1],
    [1, 2],
    [2, 3]], dtype=int32)>
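The shape produced by broadcasting can be worked out with a few lines of plain Python. The helper broadcast_shape below is hypothetical; it also left-pads the shorter shape with 1s, which follows the usual NumPy and PyTorch convention and is an assumption beyond the two-step description in this section.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the shape that results from broadcasting two shapes, or raise ValueError."""
    # Left-pad the shorter shape with 1s so both have the same number of axes.
    ndim = max(len(shape_a), len(shape_b))
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for da, db in zip(a, b):
        if da == db or db == 1:
            result.append(da)   # b is copied along this axis (or lengths agree)
        elif da == 1:
            result.append(db)   # a is copied along this axis
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} cannot be broadcast")
    return tuple(result)


print(broadcast_shape((3, 1), (1, 2)))     # (3, 2)
print(broadcast_shape((2, 3, 4), (3, 1)))  # (2, 3, 4)
```

The first call mirrors the a + b example: a is copied along axis 1 and b along axis 0, giving a 3×2 result.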

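2.1.5. Saving Memory

Running operations can cause new memory to be allocated to host results. For example, if we write Y = X + Y, we dereference the tensor that Y used to point to and instead point Y at the newly allocated memory. We can demonstrate this issue with Python's id() function, which gives us the exact address of the referenced object in memory. Note that after we run Y = Y + X, id(Y) points to a different location. That is because Python first evaluates Y + X, allocating new memory for the result, and then points Y to this new location in memory.

before = id(Y)
Y = Y + X
id(Y) == before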

False

before = id(Y)
Y = Y + X
id(Y) == before

False

before = id(Y)
Y = Y + X
id(Y) == before

False

before = id(Y)
Y = Y + X
id(Y) == before

False

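This might be undesirable for two reasons. First, we do not want to run around allocating memory unnecessarily all the time. In machine learning, we often have hundreds of megabytes of parameters and update all of them multiple times per second. Whenever possible, we want to perform these updates in place. Second, we might point at the same parameters from multiple variables. If we do not update in place, we must be careful to update all of these references, lest we spring a memory leak or inadvertently refer to stale parameters.

Fortunately, performing in-place operations is easy. We can assign the result of an operation to a previously allocated array Y by using slice notation: Y[:] = <expression>. To illustrate this concept, we overwrite the values of tensor Z, after initializing it, using zeros_like, to have the same shape as Y.

Z = torch.zeros_like(Y)
print('id(Z):', id(Z))
Z[:] = X + Y
print('id(Z):', id(Z))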

id(Z): 139763606871712
id(Z): 139763606871712

If the value of X is not reused in subsequent computations, we can also use X[:] = X + Y or X += Y to reduce the memory overhead of the operation.

before = id(X)
X += Y
id(X) == before

True
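The concern about stale references can be made concrete with a small PyTorch sketch. The names params and alias are made up for this illustration: an in-place update is visible through every name bound to the tensor, whereas rebinding one name leaves the others pointing at the old values.

```python
import torch

params = torch.ones(3)
alias = params           # a second name for the same tensor object

params += 1              # in-place: both names see the update
print(alias)             # tensor([2., 2., 2.])

params = params + 1      # allocates a new tensor and rebinds only `params`
print(params)            # tensor([3., 3., 3.])
print(alias)             # tensor([2., 2., 2.]), now stale
print(alias is params)   # False
```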

Fortunately, performing in-place operations is easy. We can assign the result of an operation to a previously allocated array Y by using slice notation: Y[:] = <expression>. To illustrate this concept, we overwrite the values of tensor Z, after initializing it, using zeros_like, to have the same shape as Y.

Z = np.zeros_like(Y)
print('id(Z):', id(Z))
Z[:] = X + Y
print('id(Z):', id(Z))

id(Z): 140447312694464
id(Z): 140447312694464

If the value of X is not reused in subsequent computations, we can also use X[:] = X + Y or X += Y to reduce the memory overhead of the operation.

before = id(X)
X += Y
id(X) == before

True

# JAX arrays do not allow in-place operations

Variables are mutable containers of state in TensorFlow. They provide a way to store your model parameters. We can assign the result of an operation to a Variable with assign. To illustrate this concept, we overwrite the values of Variable Z after initializing it, using zeros_like, to have the same shape as Y.

Z = tf.Variable(tf.zeros_like(Y))
print('id(Z):', id(Z))
Z.assign(X + Y)
print('id(Z):', id(Z))

id(Z): 140457113440208
id(Z): 140457113440208

Even once you store state persistently in a Variable, you may want to reduce your memory usage further by avoiding excess allocations for tensors that are not your model parameters. Because TensorFlow Tensors are immutable and gradients do not flow through Variable assignments, TensorFlow does not provide an explicit way to run an individual operation in-place.

However, TensorFlow provides the tf.function decorator to wrap computation inside of a TensorFlow graph that gets compiled and optimized before running. This allows TensorFlow to prune unused values, and to reuse prior allocations that are no longer needed. This minimizes the memory overhead of TensorFlow computations.

@tf.function
def computation(X, Y):
    Z = tf.zeros_like(Y)  # This unused value will be pruned out
    A = X + Y  # Allocations will be reused when no longer needed
    B = A + Y
    C = B + Y
    return C + Y

computation(X, Y)

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[ 8., 9., 26., 27.],
    [24., 33., 42., 51.],
    [56., 57., 58., 59.]], dtype=float32)>

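2.1.6. Conversion to Other Python Objects

Converting to a NumPy tensor (ndarray), or vice versa, is easy. The torch Tensor and the NumPy array will share their underlying memory, so changing one through an in-place operation will also change the other.

A = X.numpy()
B = torch.from_numpy(A)
type(A), type(B)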

(numpy.ndarray, torch.Tensor)
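A short sketch of the memory sharing just described, assuming a CPU tensor. The variable names are made up, and the clone() call is one way to obtain an independent copy that is not mentioned in the original text.

```python
import torch

t = torch.zeros(3)
a = t.numpy()            # NumPy array backed by the same memory as t
t += 1                   # in-place update on the tensor
print(a)                 # [1. 1. 1.]

b = torch.from_numpy(a)  # tensor that again shares memory with a (and t)
a[0] = 5                 # in-place write on the NumPy side
print(t[0].item(), b[0].item())  # 5.0 5.0

c = t.clone().numpy()    # independent copy
t += 1
print(c)                 # [5. 1. 1.], unaffected by the later update
```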

Converting to a NumPy tensor (ndarray), or vice versa, is easy. The converted result does not share memory. This minor inconvenience is actually quite important: when you perform operations on the CPU or on GPUs, you do not want to halt computation, waiting to see whether the NumPy package of Python might want to be doing something else with the same chunk of memory.

A = X.asnumpy()
B = np.array(A)
type(A), type(B)

(numpy.ndarray, mxnet.numpy.ndarray)

A = jax.device_get(X)
B = jax.device_put(A)
type(A), type(B)

(numpy.ndarray, jaxlib.xla_extension.Array)

A = X.numpy()
B = tf.constant(A)
type(A), type(B)

(numpy.ndarray, tensorflow.python.framework.ops.EagerTensor)

To convert a size-1 tensor to a Python scalar, we can invoke the item function or Python's built-in functions.

a = torch.tensor([3.5])
a, a.item(), float(a), int(a)

(tensor([3.5000]), 3.5, 3.5, 3)

a = np.array([3.5])
a, a.item(), float(a), int(a)

(array([3.5]), 3.5, 3.5, 3)

a = jnp.array([3.5])
a, a.item(), float(a), int(a)

(Array([3.5], dtype=float32), 3.5, 3.5, 3)

a = tf.constant([3.5]).numpy()
a, a.item(), float(a), int(a)

(array([3.5], dtype=float32), 3.5, 3.5, 3)

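2.1.7. Summary

The tensor class is the main interface for storing and manipulating data in deep learning libraries. Tensors provide a variety of functionalities including construction routines; indexing and slicing; basic mathematics operations; broadcasting; memory-efficient assignment; and conversion to and from other Python objects.

2.1.8. Exercises

1. Run the code in this section. Change the conditional statement X == Y to X < Y or X > Y, and then see what kind of tensor you can get.
2. Replace the two tensors that operate by element in the broadcasting mechanism with other shapes, e.g., 3-dimensional tensors. Is the result the same as expected?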

Reviewing editor: 汤梓红
