• Course Overview

    • What is the course about?
      • AI as what humans do
      • AI as what we cannot do yet
      • AI as systems to achieve weak goals
    • Overview vs detail - some of each
    • Aiming at top of class
    • Discussion in class - need to take own notes - almost problem based learning
    • Plagiarism policy
    • 4 main things we focus on:
      • Systems
        • Code reviews
        • Version control
      • Search/Dynamic Programming
      • Bayes
        • Subjective probability
      • Representation
        • How the world is represented in your agent
        • The agent cannot understand anything that is not in this representation
          • e.g. In Robocup, the field is 2D, so the robots cannot comprehend the ball being in the air
  • Agent building = system building

    • KISS (Keep It Simple, Stupid)
      • Premature optimization is the root of all evil
      • Good software engineering helps
      • Code reviews
      • Only as strong as weakest link
      • Trade-offs between approaches
  • Perception:

    • Bayes' Rule
    • (and Take a machine learning course)
  • Overview of Agent in world

    • Sense/Act cycle (+reward)
      • Consider the agent and world to be 2 separate entities
      • Information passes between these 2 entities (information interface)
      • The agent senses things going on in the world, and then it carries out an action upon the world
      • Observations and actions can also have a quality, called a "reward", associated with them
      • This reward can be instantaneous, i.e. for the current observation/action, or accumulated over time (simplified explanation)
    • Definition of a history sequence
      • The history sequence is a series of observations (o) and actions (a)
        • e.g. o1a1o2a2o3a3
      • Then you can also include the reward as part of this sequence, either as part of the observation/action or separate (? not sure)
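A minimal sketch of the sense/act cycle and the history sequence described above. The corridor world, the `always_right` policy, and all names here are hypothetical, chosen only for illustration; the reward is stored as a separate field of each history entry, which is one of the two options mentioned in the notes.

```python
class CorridorWorld:
    """Hypothetical world: a corridor of cells 0..length-1 with the goal at the far end."""

    def __init__(self, length=4):
        self.length = length
        self.position = 0

    def observe(self):
        # Information passed from the world to the agent.
        return self.position

    def act(self, action):
        # Information passed from the agent to the world; returns an instantaneous reward.
        if action == "right":
            self.position = min(self.position + 1, self.length - 1)
        elif action == "left":
            self.position = max(self.position - 1, 0)
        return 1.0 if self.position == self.length - 1 else 0.0


def always_right(observation):
    """Trivial agent: ignores the observation and always moves right."""
    return "right"


def run_episode(world, policy, num_steps):
    """Run the sense/act cycle and return the history o1 a1 r1, o2 a2 r2, ..."""
    history = []
    for _ in range(num_steps):
        observation = world.observe()   # sense
        action = policy(observation)    # decide
        reward = world.act(action)      # act on the world
        history.append((observation, action, reward))
    return history


history = run_episode(CorridorWorld(), always_right, 3)
accumulated_reward = sum(reward for _, _, reward in history)
```

Here `history` plays the role of o1a1o2a2o3a3, and `accumulated_reward` is the sum of the instantaneous rewards.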
  • Many assumptions/constraints to make things tractable

    • My approach:
      • Introduce the general theory
      • Give specific examples, some simple, some complex
  • Types of systems:

    • Discrete vs Continuous
      • Finite vs Infinite
      • Continuous (Smooth, Discontinuous, etc)
        • Discrete is usually linked with Discontinuous because it tends to have "jumps"
    • Fully Observable vs Partially Observable
      • Fully Observable - when the agent senses the world, it knows everything about the world's state
      • Partially Observable - the agent only knows what it can observe through its sensors; almost all real agents are in this category
    • Markov vs. Non-Markov
      • Assuming time is discrete
      • In a Markov model, the next state depends only on the current state and action, not on states and actions from earlier time steps
      • It is therefore not necessary to remember the full history; we only need to record sufficient statistics to predict what happens next
    • Deterministic vs Non-deterministic vs Stochastic (Non-deterministic with probabilities)
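      • Deterministic: each action has exactly one outcome, so no probabilities are needed
      • Non-deterministic: several outcomes are possible, but probabilities are not always given
      • Stochastic: several outcomes are possible and probabilities are always given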
    • Implicit vs explicit time
      • Systems often have collection of states & actions
      • Implicit time means every action that changes one state into another is assumed to take 1 time step
    • Tracking vs Acting
      • Acting --> at every time step, perform an action, given the sufficient statistics
      • Tracking --> the world performs the changes for the agent; the agent finds sufficient statistics to determine a state or an action to perform (can be thought of as a subset of Acting)
    • Online vs Offline (Anytime algs)
      • Offline - Not attached to robot
      • Online - continuous problems
    • Stationary vs Non-stationary (world changes over time without actions of agent)
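A small sketch of how the Markov and deterministic/non-deterministic/stochastic distinctions above might look in code. The door example, the probabilities, and the function names are all hypothetical. Each function takes only the current state and action, never the history, which is the Markov assumption.

```python
import random

# Hypothetical transition model: (state, action) -> {next_state: probability}.
TRANSITIONS = {
    ("closed", "push"): {"open": 0.8, "closed": 0.2},  # more than one outcome
    ("closed", "wait"): {"closed": 1.0},               # exactly one outcome
    ("open", "push"): {"open": 1.0},
    ("open", "wait"): {"open": 1.0},
}


def is_deterministic(state, action):
    """Deterministic: exactly one possible next state."""
    return len(TRANSITIONS[(state, action)]) == 1


def possible_next_states(state, action):
    """Non-deterministic view: the set of possible outcomes, ignoring probabilities."""
    return set(TRANSITIONS[(state, action)])


def sample_next_state(state, action, rng):
    """Stochastic view: draw the next state using the given probabilities."""
    distribution = TRANSITIONS[(state, action)]
    states = list(distribution)
    weights = [distribution[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded only so that runs are repeatable
next_state = sample_next_state("closed", "push", rng)
```

The same table serves all three views: dropping the probabilities gives the non-deterministic model, and a table in which every entry has a single outcome is deterministic.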
  • Goal types:

    • Goal = state
    • Goal = trajectory
      • Goal is specified as a path: go through a given sequence of states (exact goal)
    • Maintenance goals
    • Reward (Loss)
  • Plan types:

    • Complete policy
    • Markov policy
      • Only the current state (not the prior history) is relevant
    • Linear plan
    • Teleo-reactive plan
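A rough sketch contrasting three of the plan types listed above, under assumptions the notes do not spell out: a Markov policy is taken to be a lookup from the current state to an action, a linear plan a fixed action sequence, and a teleo-reactive plan an ordered list of condition/action rules where the first rule whose condition holds fires. The four-cell corridor and all names are hypothetical.

```python
GOAL = 3  # hypothetical corridor with cells 0..3


def move(state, action):
    """Deterministic toy dynamics: 'right' moves one cell, anything else stays put."""
    return min(state + 1, GOAL) if action == "right" else state


# Markov policy: the action depends only on the current state.
MARKOV_POLICY = {0: "right", 1: "right", 2: "right", 3: "stay"}

# Linear plan: a fixed sequence, executed without looking at the state.
LINEAR_PLAN = ["right", "right", "right"]

# Teleo-reactive plan: ordered (condition, action) rules, highest priority first.
TR_RULES = [
    (lambda state: state == GOAL, "stay"),
    (lambda state: True, "right"),
]


def run_markov_policy(start, steps):
    state = start
    for _ in range(steps):
        state = move(state, MARKOV_POLICY[state])
    return state


def run_linear_plan(start):
    state = start
    for action in LINEAR_PLAN:
        state = move(state, action)
    return state


def tr_action(state):
    """Return the action of the first rule whose condition is true."""
    for condition, action in TR_RULES:
        if condition(state):
            return action
    raise ValueError("no rule applies")
```

In this toy world all three reach the goal from cell 0; the difference is that the policy and the rule list consult the state at every step, while the linear plan does not.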