Sample Questions - Part A

The following questions are meant to give you some orientation about the kinds of questions and the range of topics you may see in the exam. The total value of all questions on the part taught by Anthony Knittel will be 60 marks (or 70 marks for 9844 students), corresponding to 60 minutes of allocated time.

1) Perceptron (10 marks)

Provide a schematic diagram of a simple perceptron neuron and describe its function mathematically. Give the Perceptron training algorithm in pseudocode. Does the Perceptron algorithm perform gradient descent? Justify your answer.
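For orientation, the kind of algorithm this question refers to can be sketched as follows. This is a minimal illustration, not a model answer: the step activation with threshold 0, the learning rate `lr`, the epoch limit, and the AND data set are all assumptions chosen for the example.

```python
def predict(weights, bias, x):
    """Step activation applied to the weighted sum of the inputs."""
    s = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if s > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron rule: adjust weights only on misclassified samples."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            delta = target - predict(weights, bias, x)  # -1, 0 or +1
            if delta != 0:
                weights = [w + lr * delta * xi for w, xi in zip(weights, x)]
                bias += lr * delta
                errors += 1
        if errors == 0:  # every sample classified correctly
            break
    return weights, bias


# Assumed toy data: the logical AND function (linearly separable).
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_DATA)
```

Calling `predict(weights, bias, x)` on each input of `AND_DATA` is a quick way to check that training has separated the two classes.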

2) Backpropagation (10 marks)

Explain the basic idea of the backpropagation learning algorithm. Explain the role of the learning rate: What effect does increasing the learning rate have? What effect does decreasing it have?
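The role of the learning rate can be explored with a small experiment. The sketch below is not backpropagation itself; it assumes plain gradient descent on the one-dimensional error function E(w) = w^2, and the function name `gradient_descent` and the three learning rates are chosen purely for illustration.

```python
def gradient_descent(lr, start=1.0, steps=20):
    """Minimise E(w) = w**2 by repeated steps against the gradient."""
    w = start
    for _ in range(steps):
        grad = 2.0 * w  # dE/dw for E(w) = w**2
        w -= lr * grad
    return w


small = gradient_descent(0.01)    # tiny steps: slow progress towards 0
moderate = gradient_descent(0.1)  # larger steps: faster progress
large = gradient_descent(1.1)     # too large: each step overshoots further
```

Comparing the magnitudes of `small`, `moderate` and `large` after the same number of steps shows the three typical regimes on this toy problem.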

3) MLPs (10 marks)

What would happen if the transfer functions (at the hidden and output layers) in a multi-layer perceptron were omitted, i.e. if the activation were simply the weighted sum of all input signals? Explain why this (simpler) activation scheme is not normally used in MLPs, although it would simplify and accelerate the calculations for the backpropagation algorithm.
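A small numerical experiment can help when thinking about this question. The sketch below assumes a network with two inputs, three hidden units, one output, no biases, and arbitrary integer weights; `W1`, `W2` and the helper names are illustrative only.

```python
def matvec(a, x):
    """Multiply matrix a (list of rows) by vector x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def matmul(a, b):
    """Multiply matrix a by matrix b (both given as lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


W1 = [[1, -2], [0, 3], [2, 1]]  # hidden layer: 3 units, 2 inputs
W2 = [[1, 1, -1]]               # output layer: 1 unit, 3 hidden inputs
x = [4, 5]

# Forward pass through both layers with no transfer function.
two_layer = matvec(W2, matvec(W1, x))

# Forward pass through a single layer whose weights are the product W2 * W1.
single_layer = matvec(matmul(W2, W1), x)
```

Comparing `two_layer` with `single_layer` for several inputs indicates how much representational power the hidden layer adds in this setting.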

4) Computational Learning Theory (10 marks)

Explain the definition of the Vapnik-Chervonenkis dimension. What is the Vapnik-Chervonenkis dimension (VC dimension) of the following set of sets F over the domain X={a,b,c,d,e,f,g}?

F={{},{a},{b,c,d,f},{d,e,f},{a,b,d,e},{a,d,e,f},{b,c,e},{a,b,c,d},{c,d,f},{a,c,e,g}}

That is, each set in F corresponds to a function f that can be learned. Such a function f classifies the objects belonging to the set positively and classifies all other objects negatively. Justify your answer.
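An answer worked out by hand can be checked with a brute-force search. The sketch below assumes the standard definition: a subset S of X is shattered by F if intersecting S with the members of F produces all 2^|S| subsets of S. The function names are illustrative.

```python
from itertools import combinations

DOMAIN = "abcdefg"
F = [set(), set("a"), set("bcdf"), set("def"), set("abde"),
     set("adef"), set("bce"), set("abcd"), set("cdf"), set("aceg")]


def is_shattered(points, family):
    """True if the family realises every subset of `points` by intersection."""
    points = frozenset(points)
    traces = {frozenset(points & member) for member in family}
    return len(traces) == 2 ** len(points)


def vc_dimension(domain, family):
    """Size of the largest subset of `domain` shattered by `family`."""
    best = 0
    for size in range(1, len(domain) + 1):
        if any(is_shattered(subset, family)
               for subset in combinations(domain, size)):
            best = size
        else:
            # Subsets of a shattered set are shattered too, so if no set of
            # this size is shattered, no larger set can be.
            break
    return best


vc = vc_dimension(DOMAIN, F)
```

Since a family of n sets can shatter at most floor(log2(n)) points, the search never needs to go far for a family of this size.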

5) Support Vector Machines (10 marks)

What is the optimal hyperplane? What is the input space as opposed to the feature space? What is the purpose of kernel functions in Support Vector Machines? Justify your answers.
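The relation between input space, feature space and a kernel function can be made concrete with a small example. The sketch below assumes a two-dimensional input space and the homogeneous polynomial kernel of degree 2; the explicit map `feature_map` is one feature space in which that kernel is an ordinary inner product. All names and sample points are illustrative.

```python
import math


def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed entirely in input space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def feature_map(x):
    """Explicit map from the 2-D input space to a 3-D feature space."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(u, v):
    """Ordinary inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


x, z = (1.0, 2.0), (3.0, 4.0)
k_input = poly_kernel(x, z)                      # no feature vectors built
k_feature = dot(feature_map(x), feature_map(z))  # inner product in feature space
```

Up to floating-point rounding, `k_input` and `k_feature` agree, although only the second computation constructs the feature vectors explicitly.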

6) Boosting (10 marks)

What is AdaBoost? Explain in intuitive terms what it does. Provide a description of it in pseudocode. Explain in intuitive terms why it often improves the accuracy of the learned function. How does it differ from the original boosting algorithm? With which neural network learning algorithms can it be combined?
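As a point of reference, the reweighting scheme can be sketched in a few lines. This is a minimal illustration of discrete AdaBoost for labels in {-1, +1}, under several assumptions: one-dimensional inputs, threshold classifiers (decision stumps) as the weak learner, and a toy data set that no single stump classifies correctly. The helper names and the data are hypothetical.

```python
import math


def stump_predict(threshold, polarity, x):
    """Weak classifier: `polarity` above the threshold, `-polarity` otherwise."""
    return polarity if x > threshold else -polarity


def best_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump."""
    thresholds = [min(xs) - 0.5] + [x + 0.5 for x in sorted(set(xs))]
    best = None
    for threshold in thresholds:
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(threshold, polarity, x) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n  # start with a uniform distribution
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote of this weak learner
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(threshold, polarity, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def ensemble_predict(ensemble, x):
    """Sign of the alpha-weighted vote of all weak classifiers."""
    score = sum(alpha * stump_predict(t, p, x) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1


# Assumed toy data: positive at both ends, negative in the middle.
XS = [0, 1, 2, 3, 4, 5]
YS = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(XS, YS, rounds=3)
```

Evaluating `ensemble_predict(ensemble, x)` on the training points shows how a weighted vote of stumps can fit data that a single stump cannot.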