PyTorch Tutorial - 20.2. Deep Convolutional Generative Adversarial Networks


In Section 20.1, we introduced the basic ideas behind how GANs work. We showed that they can draw samples from some simple, easy-to-sample distribution, such as a uniform or normal distribution, and transform them into samples that appear to match the distribution of some dataset. While our example of matching a 2D Gaussian distribution got the point across, it was not especially exciting.

In this section, we will demonstrate how to use GANs to generate photorealistic images. Our models will be based on the deep convolutional GANs (DCGAN) introduced in Radford et al. (2015). We will borrow the convolutional architecture that has proven so successful for discriminative computer vision problems and show how, via GANs, it can be leveraged to generate photorealistic images. (Each code block below appears three times: first the PyTorch implementation, then MXNet, then TensorFlow.)

# PyTorch
import warnings
import torch
import torchvision
from torch import nn
from d2l import torch as d2l

# MXNet
from mxnet import gluon, init, np, npx
from mxnet.gluon import nn
from d2l import mxnet as d2l

npx.set_np()

# TensorFlow
import tensorflow as tf
from d2l import tensorflow as d2l

20.2.1. The Pokemon Dataset

The dataset we will use is a collection of Pokemon sprites obtained from pokemondb. First download, extract, and load this dataset.

# PyTorch
#@save
d2l.DATA_HUB['pokemon'] = (d2l.DATA_URL + 'pokemon.zip',
              'c065c0e2593b8b161a2d7873e42418bf6a21106c')

data_dir = d2l.download_extract('pokemon')
pokemon = torchvision.datasets.ImageFolder(data_dir)
Downloading ../data/pokemon.zip from http://d2l-data.s3-accelerate.amazonaws.com/pokemon.zip...
# MXNet
#@save
d2l.DATA_HUB['pokemon'] = (d2l.DATA_URL + 'pokemon.zip',
              'c065c0e2593b8b161a2d7873e42418bf6a21106c')

data_dir = d2l.download_extract('pokemon')
pokemon = gluon.data.vision.datasets.ImageFolderDataset(data_dir)
Downloading ../data/pokemon.zip from http://d2l-data.s3-accelerate.amazonaws.com/pokemon.zip...
# TensorFlow
#@save
d2l.DATA_HUB['pokemon'] = (d2l.DATA_URL + 'pokemon.zip',
              'c065c0e2593b8b161a2d7873e42418bf6a21106c')

data_dir = d2l.download_extract('pokemon')
batch_size = 256
pokemon = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir, batch_size=batch_size, image_size=(64, 64))
Downloading ../data/pokemon.zip from http://d2l-data.s3-accelerate.amazonaws.com/pokemon.zip...
Found 40597 files belonging to 721 classes.
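
Before any preprocessing, it helps to peek at the raw data. A minimal sketch (PyTorch, using the torchvision ImageFolder loaded above):

print(len(pokemon))      # total number of sprite images
img, label = pokemon[0]  # ImageFolder yields (PIL image, class index) pairs
print(img.size, label)   # size of the first sprite and its class index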

We resize each image into $64\times 64$. The ToTensor transformation projects the pixel values into $[0, 1]$, while our generator will use the tanh function to obtain outputs in $[-1, 1]$. Therefore we normalize the data with a mean of $0.5$ and a standard deviation of $0.5$ to match the value range.

# PyTorch
batch_size = 256
transformer = torchvision.transforms.Compose([
  torchvision.transforms.Resize((64, 64)),
  torchvision.transforms.ToTensor(),
  torchvision.transforms.Normalize(0.5, 0.5)
])
pokemon.transform = transformer
data_iter = torch.utils.data.DataLoader(
  pokemon, batch_size=batch_size,
  shuffle=True, num_workers=d2l.get_dataloader_workers())
# MXNet
batch_size = 256
transformer = gluon.data.vision.transforms.Compose([
  gluon.data.vision.transforms.Resize(64),
  gluon.data.vision.transforms.ToTensor(),
  gluon.data.vision.transforms.Normalize(0.5, 0.5)
])
data_iter = gluon.data.DataLoader(
  pokemon.transform_first(transformer), batch_size=batch_size,
  shuffle=True, num_workers=d2l.get_dataloader_workers())
# TensorFlow
def transform_func(X):
  X = X / 255.
  X = (X - 0.5) / (0.5)
  return X

# For TF>=2.4 use `num_parallel_calls = tf.data.AUTOTUNE`
data_iter = pokemon.map(lambda x, y: (transform_func(x), y),
            num_parallel_calls=tf.data.experimental.AUTOTUNE)
data_iter = data_iter.cache().shuffle(buffer_size=1000).prefetch(
  buffer_size=tf.data.experimental.AUTOTUNE)
WARNING:tensorflow:From /home/d2l-worker/miniconda3/envs/d2l-en-release-1/lib/python3.9/site-packages/tensorflow/python/autograph/pyct/static_analysis/liveness.py:83: Analyzer.lamba_check (from tensorflow.python.autograph.pyct.static_analysis.liveness) is deprecated and will be removed after 2023-09-23.
Instructions for updating:
Lambda fuctions will be no more assumed to be used in the statement where they are used, or at least in the same block. https://github.com/tensorflow/tensorflow/issues/56089
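
As a quick sanity check, the normalization above maps $[0, 1]$ to $[-1, 1]$, matching the tanh output range of the generator. A minimal PyTorch sketch with illustrative values:

x = torch.tensor([0.0, 0.5, 1.0])
print((x - 0.5) / 0.5)  # tensor([-1., 0., 1.])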

Let's visualize the first 20 images.

# PyTorch
warnings.filterwarnings('ignore')
d2l.set_figsize((4, 4))
for X, y in data_iter:
  imgs = X[:20,:,:,:].permute(0, 2, 3, 1)/2+0.5
  d2l.show_images(imgs, num_rows=4, num_cols=5)
  break
https://file.elecfans.com/web2/M00/A9/CE/poYBAGR9PgeAcQuvAAUFYoaPyj0359.svg
# MXNet
d2l.set_figsize((4, 4))
for X, y in data_iter:
  imgs = X[:20,:,:,:].transpose(0, 2, 3, 1)/2+0.5
  d2l.show_images(imgs, num_rows=4, num_cols=5)
  break
https://file.elecfans.com/web2/M00/AA/49/pYYBAGR9PgqAFv1VAAWC4qBkgJs206.svg
# TensorFlow
d2l.set_figsize(figsize=(4, 4))
for X, y in data_iter.take(1):
  imgs = X[:20, :, :, :] / 2 + 0.5
  d2l.show_images(imgs, num_rows=4, num_cols=5)
https://file.elecfans.com/web2/M00/A9/CE/poYBAGR9Pg6AP2sgAAWKkkbhRrg987.svg

20.2.2. The Generator

The generator needs to map the noise variable $\mathbf{z} \in \mathbb{R}^d$, a length-$d$ vector, to an RGB image of width and height $64\times 64$. In Section 14.11 we introduced the fully convolutional network, which uses transposed convolution layers (refer to Section 14.10) to enlarge the input size. The basic block of the generator contains a transposed convolution layer followed by batch normalization and a ReLU activation.

# PyTorch
class G_block(nn.Module):
  def __init__(self, out_channels, in_channels=3, kernel_size=4, strides=2,
         padding=1, **kwargs):
    super(G_block, self).__init__(**kwargs)
    self.conv2d_trans = nn.ConvTranspose2d(in_channels, out_channels,
                kernel_size, strides, padding, bias=False)
    self.batch_norm = nn.BatchNorm2d(out_channels)
    self.activation = nn.ReLU()

  def forward(self, X):
    return self.activation(self.batch_norm(self.conv2d_trans(X)))
# MXNet
class G_block(nn.Block):
  def __init__(self, channels, kernel_size=4,
         strides=2, padding=1, **kwargs):
    super(G_block, self).__init__(**kwargs)
    self.conv2d_trans = nn.Conv2DTranspose(
      channels, kernel_size, strides, padding, use_bias=False)
    self.batch_norm = nn.BatchNorm()
    self.activation = nn.Activation('relu')

  def forward(self, X):
    return self.activation(self.batch_norm(self.conv2d_trans(X)))
# TensorFlow
class G_block(tf.keras.layers.Layer):
  def __init__(self, out_channels, kernel_size=4, strides=2, padding="same",
         **kwargs):
    super().__init__(**kwargs)
    self.conv2d_trans = tf.keras.layers.Conv2DTranspose(
      out_channels, kernel_size, strides, padding, use_bias=False)
    self.batch_norm = tf.keras.layers.BatchNormalization()
    self.activation = tf.keras.layers.ReLU()

  def call(self, X):
    return self.activation(self.batch_norm(self.conv2d_trans(X)))

By default, the transposed convolution layer uses a $k_h = k_w = 4$ kernel, an $s_h = s_w = 2$ stride, and a $p_h = p_w = 1$ padding. With an input shape of $n_h \times n_w = 16 \times 16$, the generator block will double the input's width and height.

$$
\begin{aligned}
n_h' \times n_w' &= [n_h k_h - (n_h - 1)(k_h - s_h) - 2p_h] \times [n_w k_w - (n_w - 1)(k_w - s_w) - 2p_w] \\
&= [k_h + s_h (n_h - 1) - 2p_h] \times [k_w + s_w (n_w - 1) - 2p_w] \\
&= [4 + 2 \times (16 - 1) - 2 \times 1] \times [4 + 2 \times (16 - 1) - 2 \times 1] \\
&= 32 \times 32 .
\end{aligned}
\tag{20.2.1}
$$
x = torch.zeros((2, 3, 16, 16))
g_blk = G_block(20)
g_blk(x).shape
torch.Size([2, 20, 32, 32])
x = np.zeros((2, 3, 16, 16))
g_blk = G_block(20)
g_blk.initialize()
g_blk(x).shape
(2, 20, 32, 32)
x = tf.zeros((2, 16, 16, 3)) # Channel last convention
g_blk = G_block(20)
g_blk(x).shape
TensorShape([2, 32, 32, 20])

If we change the transposed convolution layer to a $4\times 4$ kernel, $1\times 1$ strides, and zero padding, then with an input size of $1\times 1$, the output will have its width and height increased by 3 each.

x = torch.zeros((2, 3, 1, 1))
g_blk = G_block(20, strides=1, padding=0)
g_blk(x).shape
torch.Size([2, 20, 4, 4])
x = np.zeros((2, 3, 1, 1))
g_blk = G_block(20, strides=1, padding=0)
g_blk.initialize()
g_blk(x).shape
(2, 20, 4, 4)
x = tf.zeros((2, 1, 1, 3))
# `padding="valid"` corresponds to no padding
g_blk = G_block(20, strides=1, padding="valid")
g_blk(x).shape
TensorShape([2, 4, 4, 20])
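
Both checks agree with Eq. (20.2.1). For square inputs, the arithmetic can be wrapped in a tiny helper (a hypothetical convenience function, not part of d2l):

def trans_conv_out(n, k, s, p):
    """Output size of a transposed convolution, per Eq. (20.2.1)."""
    return s * (n - 1) + k - 2 * p

print(trans_conv_out(16, 4, 2, 1))  # 32, the doubling case above
print(trans_conv_out(1, 4, 1, 0))   # 4, the 1x1-input case above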

The generator consists of four basic blocks that increase the input's width and height from 1 to 32. At the same time, it first projects the latent variable into $64\times 8$ channels, and then halves the channels each time. At last, a transposed convolution layer is used to generate the output. It further doubles the width and height to match the desired $64\times 64$ shape, and reduces the channel size to 3. The tanh activation function is applied to project output values into the $(-1, 1)$ range.

# PyTorch
n_G = 64
net_G = nn.Sequential(
  G_block(in_channels=100, out_channels=n_G*8,
      strides=1, padding=0),         # Output: (64 * 8, 4, 4)
  G_block(in_channels=n_G*8, out_channels=n_G*4), # Output: (64 * 4, 8, 8)
  G_block(in_channels=n_G*4, out_channels=n_G*2), # Output: (64 * 2, 16, 16)
  G_block(in_channels=n_G*2, out_channels=n_G),  # Output: (64, 32, 32)
  nn.ConvTranspose2d(in_channels=n_G, out_channels=3,
            kernel_size=4, stride=2, padding=1, bias=False),
  nn.Tanh()) # Output: (3, 64, 64)
# MXNet
n_G = 64
net_G = nn.Sequential()
net_G.add(G_block(n_G*8, strides=1, padding=0), # Output: (64 * 8, 4, 4)
     G_block(n_G*4), # Output: (64 * 4, 8, 8)
     G_block(n_G*2), # Output: (64 * 2, 16, 16)
     G_block(n_G),  # Output: (64, 32, 32)
     nn.Conv2DTranspose(
       3, kernel_size=4, strides=2, padding=1, use_bias=False,
       activation='tanh')) # Output: (3, 64, 64)
# TensorFlow
n_G = 64
net_G = tf.keras.Sequential([
  # Output: (4, 4, 64 * 8)
  G_block(out_channels=n_G*8, strides=1, padding="valid"),
  G_block(out_channels=n_G*4), # Output: (8, 8, 64 * 4)
  G_block(out_channels=n_G*2), # Output: (16, 16, 64 * 2)
  G_block(out_channels=n_G), # Output: (32, 32, 64)
  # Output: (64, 64, 3)
  tf.keras.layers.Conv2DTranspose(
    3, kernel_size=4, strides=2, padding="same", use_bias=False,
    activation="tanh")
])

Generate a 100-dimensional latent variable to verify the generator's output shape.

x = torch.zeros((1, 100, 1, 1))
net_G(x).shape
torch.Size([1, 3, 64, 64])
x = np.zeros((1, 100, 1, 1))
net_G.initialize()
net_G(x).shape
(1, 3, 64, 64)
x = tf.zeros((1, 1, 1, 100))
net_G(x).shape
TensorShape([1, 64, 64, 3])
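
To see how each block contributes, we can also trace the intermediate shapes layer by layer. A minimal PyTorch sketch (net_G is a plain nn.Sequential, so it can be iterated directly):

x = torch.zeros((1, 100, 1, 1))
for layer in net_G:
    x = layer(x)
    print(layer.__class__.__name__, 'output shape:', x.shape)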

20.2.3. The Discriminator

The discriminator is a normal convolutional network, except that it uses a leaky ReLU as its activation function. Given $\alpha \in [0, 1]$, its definition is

$$
\textrm{leaky ReLU}(x) =
\begin{cases}
x & \text{if } x > 0 \\
\alpha x & \text{otherwise}
\end{cases}
\tag{20.2.2}
$$

As we can see, it is the normal ReLU if $\alpha = 0$, and an identity function if $\alpha = 1$. For $\alpha \in (0, 1)$, leaky ReLU is a nonlinear function that gives a non-zero output for a negative input. It aims to fix the "dying ReLU" problem, where a neuron might always output a negative value and therefore cannot make any progress, since the gradient of ReLU is 0.

# PyTorch
alphas = [0, .2, .4, .6, .8, 1]
x = torch.arange(-2, 1, 0.1)
Y = [nn.LeakyReLU(alpha)(x).detach().numpy() for alpha in alphas]
d2l.plot(x.detach().numpy(), Y, 'x', 'y', alphas)
https://file.elecfans.com/web2/M00/A9/CE/poYBAGR9PhKAAphBAACN8Sd5kmw122.svg
# MXNet
alphas = [0, .2, .4, .6, .8, 1]
x = np.arange(-2, 1, 0.1)
Y = [nn.LeakyReLU(alpha)(x).asnumpy() for alpha in alphas]
d2l.plot(x.asnumpy(), Y, 'x', 'y', alphas)
https://file.elecfans.com/web2/M00/A9/CE/poYBAGR9PhSAVJfHAACN_mMaeFE528.svg
# TensorFlow
alphas = [0, .2, .4, .6, .8, 1]
x = tf.range(-2, 1, 0.1)
Y = [tf.keras.layers.LeakyReLU(alpha)(x).numpy() for alpha in alphas]
d2l.plot(x.numpy(), Y, 'x', 'y', alphas)
https://file.elecfans.com/web2/M00/A9/CE/poYBAGR9PhaAZDDrAACPk58eqYI298.svg
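
The plots can be spot-checked numerically. A minimal PyTorch sketch: with alpha=0.2, negative inputs are scaled by 0.2 while positive inputs pass through unchanged.

x = torch.tensor([-1.0, 0.0, 2.0])
print(nn.LeakyReLU(0.2)(x))  # tensor([-0.2000, 0.0000, 2.0000])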

The basic block of the discriminator is a convolution layer followed by a batch normalization layer and a leaky ReLU activation. The hyperparameters of the convolution layer are similar to those of the transposed convolution layer in the generator block.

# PyTorch
class D_block(nn.Module):
  def __init__(self, out_channels, in_channels=3, kernel_size=4, strides=2,
        padding=1, alpha=0.2, **kwargs):
    super(D_block, self).__init__(**kwargs)
    self.conv2d = nn.Conv2d(in_channels, out_channels, kernel_size,
                strides, padding, bias=False)
    self.batch_norm = nn.BatchNorm2d(out_channels)
    self.activation = nn.LeakyReLU(alpha, inplace=True)

  def forward(self, X):
    return self.activation(self.batch_norm(self.conv2d(X)))
# MXNet
class D_block(nn.Block):
  def __init__(self, channels, kernel_size=4, strides=2,
         padding=1, alpha=0.2, **kwargs):
    super(D_block, self).__init__(**kwargs)
    self.conv2d = nn.Conv2D(
      channels, kernel_size, strides, padding, use_bias=False)
    self.batch_norm = nn.BatchNorm()
    self.activation = nn.LeakyReLU(alpha)

  def forward(self, X):
    return self.activation(self.batch_norm(self.conv2d(X)))
# TensorFlow
class D_block(tf.keras.layers.Layer):
  def __init__(self, out_channels, kernel_size=4, strides=2, padding="same",
         alpha=0.2, **kwargs):
    super().__init__(**kwargs)
    self.conv2d = tf.keras.layers.Conv2D(out_channels, kernel_size,
                       strides, padding, use_bias=False)
    self.batch_norm = tf.keras.layers.BatchNormalization()
    self.activation = tf.keras.layers.LeakyReLU(alpha)

  def call(self, X):
    return self.activation(self.batch_norm(self.conv2d(X)))

As we demonstrated in Section 7.3, a basic block with default settings will halve the width and height of the inputs. For example, given an input shape $n_h = n_w = 16$, with a kernel shape $k_h = k_w = 4$, a stride shape $s_h = s_w = 2$, and a padding shape $p_h = p_w = 1$, the output shape will be:

$$
\begin{aligned}
n_h' \times n_w' &= \lfloor (n_h - k_h + 2p_h + s_h)/s_h \rfloor \times \lfloor (n_w - k_w + 2p_w + s_w)/s_w \rfloor \\
&= \lfloor (16 - 4 + 2 \times 1 + 2)/2 \rfloor \times \lfloor (16 - 4 + 2 \times 1 + 2)/2 \rfloor \\
&= 8 \times 8 .
\end{aligned}
\tag{20.2.3}
$$
x = torch.zeros((2, 3, 16, 16))
d_blk = D_block(20)
d_blk(x).shape
torch.Size([2, 20, 8, 8])
x = np.zeros((2, 3, 16, 16))
d_blk = D_block(20)
d_blk.initialize()
d_blk(x).shape
(2, 20, 8, 8)
x = tf.zeros((2, 16, 16, 3))
d_blk = D_block(20)
d_blk(x).shape
TensorShape([2, 8, 8, 20])
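
The same arithmetic as Eq. (20.2.3) can be wrapped in a small helper for ordinary strided convolutions (again a hypothetical convenience function, mirroring the transposed-convolution helper above):

def conv_out(n, k, s, p):
    """Output size of a strided convolution, per Eq. (20.2.3)."""
    return (n - k + 2 * p + s) // s

print(conv_out(16, 4, 2, 1))  # 8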

The discriminator is a mirror of the generator.

# PyTorch
n_D = 64
net_D = nn.Sequential(
  D_block(n_D), # Output: (64, 32, 32)
  D_block(in_channels=n_D, out_channels=n_D*2), # Output: (64 * 2, 16, 16)
  D_block(in_channels=n_D*2, out_channels=n_D*4), # Output: (64 * 4, 8, 8)
  D_block(in_channels=n_D*4, out_channels=n_D*8), # Output: (64 * 8, 4, 4)
  nn.Conv2d(in_channels=n_D*8, out_channels=1,
       kernel_size=4, bias=False)) # Output: (1, 1, 1)
# MXNet
n_D = 64
net_D = nn.Sequential()
net_D.add(D_block(n_D),  # Output: (64, 32, 32)
     D_block(n_D*2), # Output: (64 * 2, 16, 16)
     D_block(n_D*4), # Output: (64 * 4, 8, 8)
     D_block(n_D*8), # Output: (64 * 8, 4, 4)
     nn.Conv2D(1, kernel_size=4, use_bias=False)) # Output: (1, 1, 1)
# TensorFlow
n_D = 64
net_D = tf.keras.Sequential([
  D_block(n_D), # Output: (32, 32, 64)
  D_block(out_channels=n_D*2), # Output: (16, 16, 64 * 2)
  D_block(out_channels=n_D*4), # Output: (8, 8, 64 * 4)
  D_block(out_channels=n_D*8), # Output: (4, 4, 64 * 8)
  # Output: (1, 1, 1)
  tf.keras.layers.Conv2D(1, kernel_size=4, use_bias=False)
])

It uses a convolution layer with output channel 1 as the last layer to obtain a single prediction value.

x = torch.zeros((1, 3, 64, 64))
net_D(x).shape
torch.Size([1, 1, 1, 1])
x = np.zeros((1, 3, 64, 64))
net_D.initialize()
net_D(x).shape
(1, 1, 1, 1)
x = tf.zeros((1, 64, 64, 3))
net_D(x).shape
TensorShape([1, 1, 1, 1])

20.2.4. Training

Compared to the basic GAN in Section 20.1, we use the same learning rate for both the generator and the discriminator since they are similar to each other. In addition, we change $\beta_1$ in Adam (Section 12.10) from 0.9 to 0.5. It decreases the smoothness of the momentum, the exponentially weighted moving average of past gradients, to take care of the rapidly changing gradients, because the generator and the discriminator fight with each other. Besides, the randomly generated noise Z is a 4-D tensor, and we use the GPU to accelerate the computation.

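The training loop below relies on d2l.update_D and d2l.update_G, which were defined in Section 20.1. For reference, here is a PyTorch sketch of their logic (paraphrased from that section; treat the details as approximate):

def update_D(X, Z, net_D, net_G, loss, trainer_D):
    """Update the discriminator: real images labeled 1, fakes labeled 0."""
    batch_size = X.shape[0]
    ones = torch.ones((batch_size,), device=X.device)
    zeros = torch.zeros((batch_size,), device=X.device)
    trainer_D.zero_grad()
    real_Y = net_D(X)
    # Detach the fake images so no gradients flow into the generator here
    fake_Y = net_D(net_G(Z).detach())
    loss_D = (loss(real_Y, ones.reshape(real_Y.shape)) +
              loss(fake_Y, zeros.reshape(fake_Y.shape))) / 2
    loss_D.backward()
    trainer_D.step()
    return loss_D

def update_G(Z, net_D, net_G, loss, trainer_G):
    """Update the generator: try to make the discriminator label fakes as 1."""
    batch_size = Z.shape[0]
    ones = torch.ones((batch_size,), device=Z.device)
    trainer_G.zero_grad()
    fake_Y = net_D(net_G(Z))
    loss_G = loss(fake_Y, ones.reshape(fake_Y.shape))
    loss_G.backward()
    trainer_G.step()
    return loss_G
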
# PyTorch
def train(net_D, net_G, data_iter, num_epochs, lr, latent_dim,
     device=d2l.try_gpu()):
  loss = nn.BCEWithLogitsLoss(reduction='sum')
  for w in net_D.parameters():
    nn.init.normal_(w, 0, 0.02)
  for w in net_G.parameters():
    nn.init.normal_(w, 0, 0.02)
  net_D, net_G = net_D.to(device), net_G.to(device)
  trainer_hp = {'lr': lr, 'betas': [0.5,0.999]}
  trainer_D = torch.optim.Adam(net_D.parameters(), **trainer_hp)
  trainer_G = torch.optim.Adam(net_G.parameters(), **trainer_hp)
  animator = d2l.Animator(xlabel='epoch', ylabel='loss',
              xlim=[1, num_epochs], nrows=2, figsize=(5, 5),
              legend=['discriminator', 'generator'])
  animator.fig.subplots_adjust(hspace=0.3)
  for epoch in range(1, num_epochs + 1):
    # Train one epoch
    timer = d2l.Timer()
    metric = d2l.Accumulator(3) # loss_D, loss_G, num_examples
    for X, _ in data_iter:
      batch_size = X.shape[0]
      Z = torch.normal(0, 1, size=(batch_size, latent_dim, 1, 1))
      X, Z = X.to(device), Z.to(device)
      metric.add(d2l.update_D(X, Z, net_D, net_G, loss, trainer_D),
            d2l.update_G(Z, net_D, net_G, loss, trainer_G),
            batch_size)
    # Show generated examples
    Z = torch.normal(0, 1, size=(21, latent_dim, 1, 1), device=device)
    # Rescale the synthetic data from [-1, 1] to [0, 1] for display
    fake_x = net_G(Z).permute(0, 2, 3, 1) / 2 + 0.5
    imgs = torch.cat(
      [torch.cat([
        fake_x[i * 7 + j].cpu().detach() for j in range(7)], dim=1)
       for i in range(len(fake_x)//7)], dim=0)
    animator.axes[1].cla()
    animator.axes[1].imshow(imgs)
    # Show the losses
    loss_D, loss_G = metric[0] / metric[2], metric[1] / metric[2]
    animator.add(epoch, (loss_D, loss_G))
  print(f'loss_D {loss_D:.3f}, loss_G {loss_G:.3f}, '
     f'{metric[2] / timer.stop():.1f} examples/sec on {str(device)}')
# MXNet
def train(net_D, net_G, data_iter, num_epochs, lr, latent_dim,
     device=d2l.try_gpu()):
  loss = gluon.loss.SigmoidBCELoss()
  net_D.initialize(init=init.Normal(0.02), force_reinit=True, ctx=device)
  net_G.initialize(init=init.Normal(0.02), force_reinit=True, ctx=device)
  trainer_hp = {'learning_rate': lr, 'beta1': 0.5}
  trainer_D = gluon.Trainer(net_D.collect_params(), 'adam', trainer_hp)
  trainer_G = gluon.Trainer(net_G.collect_params(), 'adam', trainer_hp)
  animator = d2l.Animator(xlabel='epoch', ylabel='loss',
              xlim=[1, num_epochs], nrows=2, figsize=(5, 5),
              legend=['discriminator', 'generator'])
  animator.fig.subplots_adjust(hspace=0.3)
  for epoch in range(1, num_epochs + 1):
    # Train one epoch
    timer = d2l.Timer()
    metric = d2l.Accumulator(3) # loss_D, loss_G, num_examples
    for X, _ in data_iter:
      batch_size = X.shape[0]
      Z = np.random.normal(0, 1, size=(batch_size, latent_dim, 1, 1))
      X, Z = X.as_in_ctx(device), Z.as_in_ctx(device),
      metric.add(d2l.update_D(X, Z, net_D, net_G, loss, trainer_D),
            d2l.update_G(Z, net_D, net_G, loss, trainer_G),
            batch_size)
    # Show generated examples
    Z = np.random.normal(0, 1, size=(21, latent_dim, 1, 1), ctx=device)
    # Rescale the synthetic data from [-1, 1] to [0, 1] for display
    fake_x = net_G(Z).transpose(0, 2, 3, 1) / 2 + 0.5
    imgs = np.concatenate(
      [np.concatenate([fake_x[i * 7 + j] for j in range(7)], axis=1)
       for i in range(len(fake_x)//7)], axis=0)
    animator.axes[1].cla()
    animator.axes[1].imshow(imgs.asnumpy())
    # Show the losses
    loss_D, loss_G = metric[0] / metric[2], metric[1] / metric[2]
    animator.add(epoch, (loss_D, loss_G))
  print(f'loss_D {loss_D:.3f}, loss_G {loss_G:.3f}, '
     f'{metric[2] / timer.stop():.1f} examples/sec on {str(device)}')
# TensorFlow
def train(net_D, net_G, data_iter, num_epochs, lr, latent_dim,
     device=d2l.try_gpu()):
  loss = tf.keras.losses.BinaryCrossentropy(
    from_logits=True, reduction=tf.keras.losses.Reduction.SUM)

  for w in net_D.trainable_variables:
    w.assign(tf.random.normal(mean=0, stddev=0.02, shape=w.shape))
  for w in net_G.trainable_variables:
    w.assign(tf.random.normal(mean=0, stddev=0.02, shape=w.shape))

  optimizer_hp = {"lr": lr, "beta_1": 0.5, "beta_2": 0.999}
  optimizer_D = tf.keras.optimizers.Adam(**optimizer_hp)
  optimizer_G = tf.keras.optimizers.Adam(**optimizer_hp)

  animator = d2l.Animator(xlabel='epoch', ylabel='loss',
              xlim=[1, num_epochs], nrows=2, figsize=(5, 5),
              legend=['discriminator', 'generator'])
  animator.fig.subplots_adjust(hspace=0.3)

  for epoch in range(1, num_epochs + 1):
    # Train one epoch
    timer = d2l.Timer()
    metric = d2l.Accumulator(3) # loss_D, loss_G, num_examples
    for X, _ in data_iter:
      batch_size = X.shape[0]
      Z = tf.random.normal(mean=0, stddev=1,
                 shape=(batch_size, 1, 1, latent_dim))
      metric.add(d2l.update_D(X, Z, net_D, net_G, loss, optimizer_D),
            d2l.update_G(Z, net_D, net_G, loss, optimizer_G),
            batch_size)

    # Show generated examples
    Z = tf.random.normal(mean=0, stddev=1, shape=(21, 1, 1, latent_dim))
    # Rescale the synthetic data from [-1, 1] to [0, 1] for display
    fake_x = net_G(Z) / 2 + 0.5
    imgs = tf.concat([tf.concat([fake_x[i * 7 + j] for j in range(7)],
                  axis=1)
             for i in range(len(fake_x) // 7)], axis=0)
    animator.axes[1].cla()
    animator.axes[1].imshow(imgs)
    # Show the losses
    loss_D, loss_G = metric[0] / metric[2], metric[1] / metric[2]
    animator.add(epoch, (loss_D, loss_G))
  print(f'loss_D {loss_D:.3f}, loss_G {loss_G:.3f}, '
     f'{metric[2] / timer.stop():.1f} examples/sec on {str(device._device_name)}')

We train the model with a small number of epochs just for demonstration. For better performance, the variable num_epochs can be set to a larger number.

# PyTorch
latent_dim, lr, num_epochs = 100, 0.005, 20
train(net_D, net_G, data_iter, num_epochs, lr, latent_dim)
loss_D 0.030, loss_G 7.203, 1026.4 examples/sec on cuda:0
https://file.elecfans.com/web2/M00/AA/49/pYYBAGR9PhmAKNvvAAQ_akulsoA525.svg
# MXNet
latent_dim, lr, num_epochs = 100, 0.005, 20
train(net_D, net_G, data_iter, num_epochs, lr, latent_dim)
loss_D 0.224, loss_G 6.386, 2260.7 examples/sec on gpu(0)
https://file.elecfans.com/web2/M00/A9/CE/poYBAGR9PhyANJTuAAQWXCMFF5g049.svg
# TensorFlow
latent_dim, lr, num_epochs = 100, 0.0005, 40
train(net_D, net_G, data_iter, num_epochs, lr, latent_dim)
loss_D 0.112, loss_G 4.952, 1968.2 examples/sec on /GPU:0
https://file.elecfans.com/web2/M00/AA/49/pYYBAGR9Ph6ANe_HAAP3DIt7UTs347.svg
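
Once trained, the generator alone suffices to synthesize new sprites. A minimal PyTorch sketch (assuming net_G has been trained as above):

Z = torch.normal(0, 1, size=(1, 100, 1, 1), device=d2l.try_gpu())
# Map the tanh output from [-1, 1] back to [0, 1] for display
img = (net_G(Z).permute(0, 2, 3, 1) / 2 + 0.5).cpu().detach()
d2l.show_images(img, num_rows=1, num_cols=1)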

20.2.5. Summary

  • The DCGAN architecture has four convolutional layers for the discriminator and four "fractionally-strided" convolutional layers for the generator.

  • The discriminator is a four-layer network of strided convolutions with batch normalization (except on its input layer) and leaky ReLU activations.

  • Leaky ReLU is a nonlinear function that gives a non-zero output for a negative input. It aims to fix the "dying ReLU" problem and helps the gradients flow more easily through the architecture.

20.2.6. Exercises

  1. What will happen if we use standard ReLU activation rather than leaky ReLU? (A starting sketch follows the exercises.)

  2. Apply DCGAN on Fashion-MNIST and see which category works well and which does not.
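
For Exercise 1, one possible starting point is to subclass the PyTorch D_block above and swap the activation (a hypothetical helper, not part of d2l):

class D_block_relu(D_block):
    """D_block with a standard ReLU in place of the leaky ReLU."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.activation = nn.ReLU()

Rebuilding net_D from D_block_relu and rerunning train then makes the comparison with the leaky ReLU version direct.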

