COMP9444 Neural Networks and Deep Learning

Quiz 8: Deep RL and Unsupervised Learning

This is an optional quiz to test your understanding of Deep RL and Unsupervised Learning.
  1. Write out the steps in the REINFORCE algorithm, making sure to define any symbols you use.

  2. In the context of Deep Q-Learning, explain the following:

    1. Experience Replay
    2. Double Q-Learning

  3. What is the energy function for each of these architectures?

    1. Boltzmann Machine
    2. Restricted Boltzmann Machine

    Remember to define any variables you use.

  4. The Variational Auto-Encoder is trained to maximize

    \[ \mathbb{E}_{z \sim q_\phi(z \mid x^{(i)})}\left[ \log p_\theta(x^{(i)} \mid z) \right] \;-\; D_{\mathrm{KL}}\left( q_\phi(z \mid x^{(i)}) \,\|\, p(z) \right) \]

    Briefly state what each of these two terms aims to achieve.

  5. Generative Adversarial Networks traditionally made use of a two-player zero-sum game between a Generator Gθ and a Discriminator Dψ, to compute

    \[ \min_\theta \max_\psi \; V(G_\theta, D_\psi) \]

    1. Give the formula for V(Gθ, Dψ).
    2. Explain why it may be advantageous to change the GAN algorithm so that the game is no longer zero-sum, and write the formula that the Generator would try to maximize in that case.

  6. In the context of GANs, briefly explain what is meant by mode collapse, and list three different methods for avoiding it.
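
As a worked illustration of how the expression in Question 4 can be evaluated, here is a minimal sketch of a single-sample Monte Carlo estimate. It is not part of the quiz and rests on assumptions the quiz does not state: qφ(z | x) is a diagonal Gaussian with mean `mu` and log-variance `log_var`, p(z) is a standard normal, and pθ(x | z) is a product of Bernoullis. The function names and the `toy_decoder` are hypothetical stand-ins for trained networks.

```python
import math
import random


def kl_diag_gaussian_to_standard_normal(mu, log_var):
    """Closed-form D_KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def bernoulli_log_likelihood(x, probs, eps=1e-12):
    """log p(x | z) for binary x under independent Bernoulli outputs."""
    return sum(
        xi * math.log(p + eps) + (1 - xi) * math.log(1.0 - p + eps)
        for xi, p in zip(x, probs)
    )


def toy_decoder(z):
    """Hypothetical decoder (assumption): element-wise sigmoid of z."""
    return [1.0 / (1.0 + math.exp(-zi)) for zi in z]


def objective_estimate(x, mu, log_var, rng):
    """One-sample estimate of E_q[log p(x|z)] - D_KL(q(z|x) || p(z))."""
    # Draw z ~ q(z | x) as mu + sigma * eps, with eps ~ N(0, 1).
    z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]
    first_term = bernoulli_log_likelihood(x, toy_decoder(z))
    second_term = kl_diag_gaussian_to_standard_normal(mu, log_var)
    return first_term - second_term, first_term, second_term


rng = random.Random(0)
value, first_term, second_term = objective_estimate(
    x=[1, 0], mu=[0.5, -0.5], log_var=[0.0, 0.0], rng=rng
)
print(value, first_term, second_term)
```

The first term is estimated from a single sample and so varies with the random seed, while the second term is exact under the Gaussian assumptions above.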