Artificial Intelligence

Solutions to Revision Exercises - Prolog, Knowledge Representation, Natural Language Processing, Machine Learning

  1. (15 marks) Prolog Programming

    1. Given the Prolog database and rules shown below, what is the output of the Prolog query that follows?

         child_of(mary, steve).
         child_of(mary, anne).
         child_of(alice, anne).
         child_of(alice, steve).
         child_of(jane, steve).
         child_of(leslie, steve).
         sisters(Person1, Person2) :-
            child_of(Person1, Parent1),
            child_of(Person1, Parent2),
            child_of(Person2, Parent1),
            child_of(Person2, Parent2),
            not(Parent1 = Parent2),
            not(Person1 = Person2).


      ?- sisters(Sis1, Sis2).


      Sis1 = mary,
      Sis2 = alice;
      Sis1 = mary,
      Sis2 = alice;
      Sis1 = alice,
      Sis2 = mary;
      Sis1 = alice,
      Sis2 = mary;

      Comment: Essentially there are two people here who are sisters, namely mary and alice, so we intuitively expect a single answer, effectively verifying that sisters(mary, alice) is true. However, Prolog also proves sisters(alice, mary), which doubles the number of solutions found. It also proves each of these in two ways, binding Parent1 to steve and Parent2 to anne in one case, and vice versa in the other, again doubling the number of solutions (to give 4 altogether).
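As an illustrative cross-check (not part of the Prolog answer), the same four solutions can be enumerated in Python by mirroring the rule body and the clause order of the facts:

```python
# Facts in the same order as the Prolog database.
child_of = [("mary", "steve"), ("mary", "anne"), ("alice", "anne"),
            ("alice", "steve"), ("jane", "steve"), ("leslie", "steve")]

def sisters():
    # Mirror the rule body: Person1 and Person2 share Parent1 and Parent2,
    # with the two parents distinct and the two persons distinct.
    for p1, par1 in child_of:
        for p1b, par2 in child_of:
            if p1b != p1:
                continue
            for p2, par1b in child_of:
                if par1b != par1:
                    continue
                for p2b, par2b in child_of:
                    if (p2b == p2 and par2b == par2
                            and par1 != par2 and p1 != p2):
                        yield (p1, p2)

print(list(sisters()))
# [('mary', 'alice'), ('mary', 'alice'), ('alice', 'mary'), ('alice', 'mary')]
```

The duplicated pairs correspond exactly to the two (Parent1, Parent2) bindings described in the comment above.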

    2. Does the following Prolog query succeed? If not, explain why not. If so, what bindings will be reported?

      ?- believes(X, likes(mary, pizza)) = believes(frank, likes(Y, pizza)).

      Answer: It succeeds, with bindings:

      X = frank,
      Y = mary

      Comment: Because X and Y start with capital letters, they are variables, so they match the "atoms" frank and mary on the opposite side of the equals sign.
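A toy Python sketch of this matching step (tuples model compound terms; the names in VARS stand for the Prolog variables). This is an illustration only, not real Prolog: it has no occurs-check and no chained-binding resolution, which is enough for this example:

```python
VARS = {"X", "Y"}  # names we treat as variables in this toy example

def unify(a, b, env):
    """Return an extended binding environment, or None on mismatch."""
    a, b = env.get(a, a), env.get(b, b)   # follow existing bindings (one level)
    if a == b:
        return env
    if a in VARS:
        return {**env, a: b}
    if b in VARS:
        return {**env, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):            # unify argument by argument
            env = unify(x, y, env)
            if env is None:
                return None
        return env
    return None                           # mismatch: the query would fail

lhs = ("believes", "X", ("likes", "mary", "pizza"))
rhs = ("believes", "frank", ("likes", "Y", "pizza"))
print(unify(lhs, rhs, {}))  # {'X': 'frank', 'Y': 'mary'}
```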

      Integers and floating point numbers are built into most programming languages, including Prolog. However, suppose that numbers and arithmetic operations were not available. It is still possible to define numbers in Prolog and to write programs to implement simple arithmetic operations. In the following question, you MUST NOT use any of Prolog's built-in arithmetic predicates such as is, <, +, *, etc. (except in part (f)). Also, do not use the cut operation (!) in your program.

      A positive integer can be defined as zero or the successor of another number. Thus, zero can be represented by 0 and a number, such as two, can be represented by s(s(0)), where two is the successor of the successor of 0. Thus s(X) can be thought of as X+1. We call numbers written like this s-expressions.

      In parts (c) to (f), don't worry about what happens if X (and Y and Z) are not s-expressions. You don't have to cope with this, and in particular, you don't have to check for this.

      See also the Wikipedia entry on the Peano axioms. This link is just for interest - the material at the link is not examinable.

    3. Write a Prolog predicate, s_number(X), which is true when X is a valid number. For example:
      ?- s_number(s(s(0))).


      s_number(0).
      s_number(s(X)) :-
          s_number(X).

    4. Write a Prolog predicate, minus1(X,Y), which is true if X is the successor of Y, i.e. if X = Y+1, or Y=X-1. Naturally, this predicate will fail if X is 0.
      ?- minus1(s(s(0)), Y).
      Y = s(0)
      ?- minus1(0, Y).
      false.


      minus1(s(X), X).

    5. Write a Prolog predicate, subtract(X, Y, Z), which is true if Z represents the difference between X and Y, assuming X is greater than or equal to Y. If X is less than Y, then this predicate is undefined.
      ?- subtract(s(s(s(s(0)))), s(s(0)), Z).
      Z = s(s(0))


      subtract(X, 0, X).
      subtract(s(X), s(Y), Z) :-
          subtract(X, Y, Z).

    6. Write a Prolog predicate value_of(X, Y), which is true if Y is the natural number, in normal notation, corresponding to the s(s(...)) expression X.
      ?- value_of(s(s(s(s(0)))), Y).
      Y = 4


      value_of(0, 0).
      value_of(s(X), Y) :-
          value_of(X, Z),
          Y is Z+1.
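The four answers above can be cross-checked with a small Python model of s-expressions (an illustration, not part of the Prolog answers): 0 is zero and the tuple ("s", X) stands for the successor term s(X).

```python
def s(x):
    return ("s", x)          # the successor term s(X)

ZERO = 0

def s_number(x):             # models s_number/1
    return x == ZERO or (isinstance(x, tuple) and x[0] == "s"
                         and s_number(x[1]))

def minus1(x):               # models minus1/2: returns None (failure) on zero
    return x[1] if isinstance(x, tuple) and x[0] == "s" else None

def subtract(x, y):          # models subtract/3, assuming x >= y
    if y == ZERO:
        return x
    return subtract(x[1], y[1])

def value_of(x):             # models value_of/2
    return 0 if x == ZERO else 1 + value_of(x[1])

two = s(s(ZERO))
four = s(s(s(s(ZERO))))
print(value_of(subtract(four, two)))  # prints 2, i.e. Z = s(s(0))
```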

  2. Knowledge Representation (5 marks)

    1. Draw a semantic network to represent the sentence "Fred finds a furry purple cube."

      NB: This question is about semantic nets, not NLP.

      Semantic net diagram

    2. Briefly describe the function of each part of the match-resolve-act cycle in a rule-based system.


      • match: compare the conditions of rules with the contents of working memory
      • if no rule matches, then stop
      • resolve conflicts: if more than one rule matches, and so is apparently eligible to fire, then choose one of them
      • act: perform the conclusion (action) part of the chosen rule
      • repeat (end of loop)

    3. Briefly describe two types of demon used in the frame system described in lectures.

      Answer: Pick any two of:

      • an if-added demon is triggered when a new value is put into a slot.
      • an if-removed demon is triggered when a value is removed from a slot.
      • an if-replaced demon is triggered when a slot value is replaced.
      • an if-needed demon is triggered when there is no value present in an instance frame and a value must be computed from a generic frame.
      • an if-new demon is triggered when a new frame is created.
      • a range demon is triggered when a new value is added; the value must satisfy the range constraint specified for the slot.
      • a help demon is triggered when the range demon is triggered and returns false.

  3. Natural Language Processing - Short Answers (13 marks) Answer each of the following in no more than a few words each:

    1. What is the significance of a feature like VAR being a head feature for categories like S, VP, NP?

      Answer: Head feature values are inherited from the head subconstituent of the constituent: thus an S inherits its VAR feature from its VP, a VP from its V, and an NP from its head N, NAME, or PRO.

    2. What form can grammar rules take in a regular grammar?

      Answer: A → t or A → t N, where A and N are non-terminals, and t is a terminal symbol.

    3. Give an example of a predicate operator in the logical form language.

      Answer: plur transforms a predicate like dog1, which is true of a single dog, into a predicate plur(dog1) which is true of a set of more than one dog.

    4. Name three more components of the logical form language described in lectures, and give an example of them.

      Answer: As well as predicate operators, there are the following components to the logical form language:
      terms, like fido1 and name(j1, 'John')
      predicates, like bites1
      logical operators, like not, &
      propositions, like bites1(fido1, postman1)
      quantifiers, like every, most, the
      modal operators, like believe, pres, assert
      term constructors, like pro(—, —) and name(—, —)

    5. What is the result of applying the λ-expression λ x likes(x, pizza) to mary, after λ-reduction?

      Answer: likes(mary, pizza)
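The reduction can be mimicked directly with a Python lambda (purely illustrative): applying the function substitutes mary for x in the body.

```python
# λ x likes(x, pizza) as a Python lambda; application performs the substitution.
likes_pizza = lambda x: f"likes({x}, pizza)"
print(likes_pizza("mary"))  # likes(mary, pizza)
```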

    6. What is meant by the term subcategorization?

      Answer: Subcategorization refers to the property of verbs (and adjectives) of allowing only certain patterns of complement structure to follow them. For example, laugh cannot be followed by an NP in English, (so it subcategorizes none) and tell can be followed by a NP and then a VP in the infinitive form, such as tell Peter to buy petrol (so it subcategorizes np_vp:inf).

    7. Give an example of a sentence that exhibits structural ambiguity, and draw two trees showing the ambiguity.

      Answer: The major exhibits age. (Either [NP The major] [VP exhibits age], where a person, the major, shows their age, or [NP The major exhibits] [VP age], where the main exhibits grow old.)

    8. What general form can the rules of a context-sensitive grammar take?

      Answer: Two possible answers; here is one: rules must be of the form λAρ → λσρ, where λ and ρ are (possibly empty) strings of alphabet symbols, A is a non-terminal, and σ is a non-empty string of alphabet symbols.

    9. Write down the condition under which two feature values agree, and apply it to the two feature values {3s,3p} and {1s,2s,1p,2p,3p}.

      Answer: Two feature values unify by intersection, and they agree if they have non-empty intersection.
      {3s,3p} ∩ {1s,2s,1p,2p,3p} = {3p}, which is non-empty, so these two features agree.
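This agreement check is a direct set-intersection computation, shown here in Python for illustration:

```python
# Feature agreement as set intersection, transcribing the rule above.
a = {"3s", "3p"}
b = {"1s", "2s", "1p", "2p", "3p"}
unified = a & b
print(sorted(unified), bool(unified))  # ['3p'] True -> the features agree
```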

  4. Given the lexicon and grammar information below, trace the steps in chart-parsing the two sentences Time flies. It flies like an arrow.

    Lexicon (must cover any words to be processed):
    time: NOUN, VERB
    flies: NOUN, VERB

    S → NP VP          (G-S1)
    S → VERB NP        (G-S2)
    NP → NOUN          (G-NP1)
    NP → PRON          (G-NP2)
    NP → DET NOUN      (G-NP3)
    VP → VERB          (G-VP1)
    VP → VERB PP       (G-VP2)
    PP → PREP NP       (G-PP1)

    Note: The numbers that are part of active arc names are not especially significant. All that matters is that different arcs have different names.

    *sentence*: time *current-position*: 1
    NOUN1: NOUN "time" from 0 to 1
    VERB1: VERB "time" from 0 to 1
    NP1: NP NOUN1 by rule G-NP1 from 0 to 1
    VP1: VP VERB1 by rule G-VP1 from 0 to 1
    ARC1: S NP1 • VP by rule G-S1 from 0 to 1
    ARC2: S VERB1 • NP by rule G-S2 from 0 to 1
    ARC3: VP VERB1 • PP by rule G-VP2 from 0 to 1

    *sentence*: time flies *current-position*: 2
    *constituents*: add
    NOUN2: NOUN "flies" from 1 to 2
    VERB2: VERB "flies" from 1 to 2
    NP2: NP NOUN2 by rule G-NP1 from 1 to 2
    VP2: VP VERB2 by rule G-VP1 from 1 to 2
    S1: S VERB1 NP2 by rule G-S2 from ARC2 from 0 to 2
    S2: S NP1 VP2 by rule G-S1 from ARC1 from 0 to 2
    *active-arcs*: add
    ARC4: S NP2 • VP by rule G-S1 from 1 to 2
    ARC5: S VERB2 • NP by rule G-S2 from 1 to 2
    ARC6: VP VERB2 • PP by rule G-VP2 from 1 to 2

    Full parses:
    (S (NP (NOUN "time")) (VP (VERB "flies")))
    (S (VERB "time") (NP (NOUN "flies")))
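A rough Python cross-check of the two "time flies" parses (an illustration, not the chart parser itself): apply the unary rules (G-NP1, G-VP1) to each word's lexical categories, then see which pairs of resulting categories combine into S under G-S1 and G-S2.

```python
LEX = {"time": {"NOUN", "VERB"}, "flies": {"NOUN", "VERB"}}
UNARY = {"NOUN": "NP", "VERB": "VP"}                    # G-NP1, G-VP1
BINARY = {("NP", "VP"): "G-S1", ("VERB", "NP"): "G-S2"}  # S rules

def closure(cats):
    # Add every category reachable by one unary rule application.
    return cats | {UNARY[c] for c in cats if c in UNARY}

def s_parses(w1, w2):
    # All (left category, right category, rule) triples yielding an S.
    return {(a, b, BINARY[(a, b)])
            for a in closure(LEX[w1]) for b in closure(LEX[w2])
            if (a, b) in BINARY}

print(sorted(s_parses("time", "flies")))
# Two parses, matching constituents S1 and S2 in the trace above.
```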

    *sentence*: it *current-position*: 1
    PRON2: PRON "it" from 0 to 1
    NP1: NP PRON2 by rule G-NP2 from 0 to 1
    ARC1: S NP1 • VP by rule G-S1 from 0 to 1

    *sentence*: it flies *current-position*: 2
    *constituents*: add
    NOUN1: NOUN "flies" from 1 to 2
    VERB1: VERB "flies" from 1 to 2
    NP2: NP NOUN1 by rule G-NP1 from 1 to 2
    VP1: VP VERB1 by rule G-VP1 from 1 to 2
    S1: S NP1 VP1 from ARC1 from 0 to 2
    *active-arcs*: add
    ARC2: S NP2 • VP by rule G-S1 from 1 to 2
    ARC3: VP VERB1 • PP by rule G-VP2 from 1 to 2
    ARC4: S VERB1 • NP by rule G-S2 from 1 to 2

    *sentence*: it flies like *current-position*: 3
    *constituents*: add
    NOUN2: NOUN "like" from 2 to 3
    CONJ2: CONJ "like" from 2 to 3
    ADV1: ADV "like" from 2 to 3
    ADJ1: ADJ "like" from 2 to 3
    PREP1: PREP "like" from 2 to 3
    VERB2: VERB "like" from 2 to 3
    NP3: NP NOUN2 by rule G-NP1 from 2 to 3
    S2: S VERB1 NP3 from ARC4 from 1 to 3
    VP2: VP VERB2 by rule G-VP1 from 2 to 3
    S3: S NP2 VP2 from ARC2 from 1 to 3
    *active-arcs*: add
    ARC5: S NP3 • VP by rule G-S1 from 2 to 3
    ARC6: PP PREP1 • NP by rule G-PP1 from 2 to 3
    ARC7: VP VERB2 • PP by rule G-VP2 from 2 to 3
    ARC8: S VERB2 • NP by rule G-S2 from 2 to 3

    *sentence*: it flies like an *current-position*: 4
    *constituents*: add
    DET1: DET "an" from 3 to 4
    *active-arcs*: add
    ARC9: NP DET1 • NOUN by rule G-NP3 from 3 to 4

    *sentence*: it flies like an arrow *current-position*: 5
    *constituents*: add
    NOUN3: NOUN "arrow" from 4 to 5
    NP5: NP NOUN3 by rule G-NP1 from 4 to 5
    NP4: NP DET1 NOUN3 from ARC9 from 3 to 5
    PP1: PP PREP1 NP4 from ARC6 from 2 to 5
    S4: S VERB2 NP4 from ARC8 from 2 to 5
    VP3: VP VERB1 PP1 from ARC3 from 1 to 5
    S5: S NP1 VP3 from ARC1 from 0 to 5
    *active-arcs*: add
    ARC10: S NP5 • VP by rule G-S1 from 4 to 5
    ARC11: S NP4 • VP by rule G-S1 from 3 to 5

    Full parse:
    (S (NP (PRON "it"))
       (VP (VERB "flies")
           (PP (PREP "like") (NP (DET "an") (NOUN "arrow")))))

  5. Machine Learning (11 marks)

    1. (6 marks) Describe the information gain calculations that are used in ID3 to decide which attribute to "split" on.

      Answer: For solution, see lecture notes. Here's a concrete example, from question 3 of the Machine Learning Exercises.

      The attributes and their possible values are:
      texture: smooth wavy rough
      temperature: cold cool warm hot
      size: small medium large


      There are 7 positive examples and 4 negative ones overall, so the overall entropy is –(7/11)×log2(7/11) – (4/11)×log2(4/11) which is about 0.94566.

      In the exam, you will not have a calculator. You will not, of course, be expected to work out the logarithms by hand. :-) You should therefore leave the entropy as an unevaluated expression: –(7/11)×log2(7/11) – (4/11)×log2(4/11).

      There are 3 values for the texture attribute: smooth, wavy and rough.

      For smooth, there are 4 positive examples and 1 negative, so the entropy is -(4/5)×log2(4/5) - (1/5)×log2(1/5) which is about 0.7219.

      (Again, in the exam, you would leave this as an unevaluated expression.)
      For wavy there is one positive and one negative example, so the entropy is -(1/2)×log2(1/2) - (1/2)×log2(1/2) = 1.

      (I would expect you to know that log2(1/2) = –1, and so evaluate this expression.)

      For rough there are two positive and two negative examples, so the entropy is -(1/2)×log2(1/2) - (1/2)×log2(1/2) = 1.

      Expected information for a split on texture is thus:

      0.7219×(5/11) + 1×(2/11) + 1×(4/11) = 0.8736

      Expected information gain for a split on texture is thus 0.94566 - 0.8736 = 0.07206.
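The arithmetic above can be verified numerically in Python (using the example counts from the exercise: 7+/4− overall; smooth 4+/1−, wavy 1+/1−, rough 2+/2−):

```python
from math import log2

def entropy(pos, neg):
    """Entropy of a pos/neg split, skipping zero-probability terms."""
    total = pos + neg
    return -sum(p * log2(p) for p in (pos / total, neg / total) if p > 0)

overall = entropy(7, 4)                       # about 0.94566
# Expected information after splitting on texture, weighted by subset size.
expected = (5/11)*entropy(4, 1) + (2/11)*entropy(1, 1) + (4/11)*entropy(2, 2)
gain = overall - expected                     # about 0.07206
print(round(overall, 5), round(expected, 4), round(gain, 5))
```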

    2. (5 marks)

      1. State the delta rule for backpropagation learning and indicate the significance of each symbol used in its statement

        Answer: Δwji = η × δj × yi
        where Δwji is the change to the weight on the connection from unit i to unit j, η is the learning rate, δj is the local gradient at unit j (whose definition is complex and goes beyond what is required for this question), and yi is the input signal to unit j (i.e. the output of unit i).
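One update under the delta rule, with made-up numbers (η, δj and yi below are assumptions chosen only to show how the symbols combine):

```python
eta = 0.1        # learning rate η (assumed value)
delta_j = 0.5    # local gradient δj at unit j (assumed value)
y_i = 0.8        # input signal to unit j from unit i (assumed value)

dw_ji = eta * delta_j * y_i   # the delta rule: Δwji = η × δj × yi
print(round(dw_ji, 3))        # 0.04
```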

      2. Draw a diagram of a 3-layer feedforward neural network, with 3 neurons in each layer, labelling the layers appropriately and indicating the connections between neurons.

        Diagram of 3-layer feedforward net

Good luck in the exam!

CRICOS Provider Code No. 00098G