• Course Overview

• AI as what humans do
• AI as what we cannot do yet
• AI as systems to achieve weak goals
• Overview vs detail - some of each
• Aiming at top of class
• Discussion in class - need to take own notes - almost problem based learning
• Plagiarism policy
• 4 main things we focus on:
• Systems
• Code reviews
• Version control
• Search/Dynamic Programming
• Bayes
• Subjective probability
• Representation
• How the world is represented in your agent
• The agent cannot understand anything that is not in this representation
• e.g. In Robocup, the field is 2D, so the robots cannot comprehend the ball being in the air
• Agent building = system building

• KISS (Keep It Simple, Stupid)
• Premature optimization is the root of all evil
• Good software engineering helps
• Code reviews
• Only as strong as weakest link
• Perception:

• Bayes' Rule
• (and Take a machine learning course)
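A minimal numeric illustration of Bayes' rule; the sensor scenario and all numbers here are invented for illustration:

```python
# Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)
# Hypothetical example: a noisy sensor reports "ball seen" (E);
# how likely is it that the ball is actually present (H)?
p_h = 0.3               # prior: P(ball present)
p_e_given_h = 0.9       # sensor true-positive rate
p_e_given_not_h = 0.2   # sensor false-positive rate

# total probability: P(E) = P(E|H)P(H) + P(E|~H)P(~H)
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
posterior = p_e_given_h * p_h / p_e
print(round(posterior, 3))  # 0.27 / 0.41 -> 0.659
```

One positive reading more than doubles the agent's belief that the ball is there (0.3 to ~0.66), which is the core of probabilistic perception.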
• Overview of Agent in world

• Sense/Act cycle (+reward)
• Consider the agent and world to be 2 separate entities
• Information passes between these 2 entities (information interface)
• The agent senses things going on in the world, and then it carries out an action upon the world
• Observations and actions can also have a quality, called a "reward", associated with them
• This reward can be instantaneous, i.e. for the current observation/action, or accumulated over time (simplified explanation)
• Definition of a history sequence
• The history sequence is a series of observations (o) and actions (a)
• e.g. o1a1o2a2o3a3
• The reward can also be included in this sequence, either folded into the observation/action or kept separate (? not sure)
• Many assumptions/constraints to make things tractable
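The sense/act cycle above can be sketched as a loop; the `World` and `Agent` classes here are hypothetical stand-ins invented for illustration, not from any library:

```python
import random

class World:
    """The world is a separate entity from the agent."""
    def __init__(self):
        self.state = 0

    def observe(self):
        return self.state              # observation o_t crosses the interface

    def act(self, action):
        self.state += action           # the action changes the world
        return 1 if self.state == 3 else 0   # instantaneous reward r_t

class Agent:
    def __init__(self):
        self.history = []              # history sequence o1 a1 o2 a2 ...

    def choose(self, obs):
        return random.choice([0, 1])   # placeholder decision rule

world, agent = World(), Agent()
for t in range(3):
    o = world.observe()
    a = agent.choose(o)
    r = world.act(a)
    agent.history.extend([o, a])       # the reward could also be appended here
print(agent.history)                   # e.g. [0, 1, 1, 0, 1, 1]
```

After three cycles the history holds three (observation, action) pairs, matching the o1a1o2a2o3a3 form above.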

• My approach:
• Introduce the general theory
• Give specific examples, some simple, some complex
• Types of systems:

• Discrete vs Continuous
• Finite vs Infinite
• Continuous (Smooth, Discontinuous, etc)
• Discrete is usually linked with Discontinuous because it tends to have "jumps"
• Fully Observable vs Partially Observable
• Fully Observable - when the agent senses the world, it learns everything about the world's state
• Partially Observable - almost all real agents can only see what their sensors let them observe
• Markov vs. Non-markov
• Assuming time is discrete
• In a Markov model, the next state depends only on the current state & action, not on earlier states & actions
• It is not necessary to remember the full history, since we only need sufficient statistics to predict what happens next
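A tiny sketch of the Markov property: prediction needs only the current state and action, never the history. The transition table is a made-up example:

```python
# Hypothetical transition table for a watering robot (invented states/actions)
transitions = {
    ("dry", "water"): "wet",
    ("dry", "wait"): "dry",
    ("wet", "water"): "wet",
    ("wet", "wait"): "dry",
}

def next_state(state, action):
    # (state, action) is a sufficient statistic: no history needed
    return transitions[(state, action)]

s = "dry"
for a in ["water", "wait", "wait"]:
    s = next_state(s, a)
print(s)  # "dry"
```

A non-Markov system would need extra arguments (past states or actions) to make the same prediction.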
• Deterministic vs Non-deterministic vs Stochastic (Non-deterministic with probabilities)
• Deterministic, no probability
• Non-deterministic, several outcomes are possible but no probabilities are given
• Stochastic, probabilities over the outcomes are always given
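The three transition types can be contrasted for a single (state, action) pair; the states and probabilities below are invented for illustration:

```python
import random

def deterministic(state):
    return "B"                       # exactly one successor, no probability

def nondeterministic(state):
    return {"B", "C"}                # a set of possible successors, no probabilities

def stochastic(state):
    # probabilities are always given: P(B) = 0.8, P(C) = 0.2
    return random.choices(["B", "C"], weights=[0.8, 0.2])[0]

print(deterministic("A"))            # B
print(nondeterministic("A"))         # possible successors {'B', 'C'}
print(stochastic("A"))               # B about 80% of the time
```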
• Implicit vs explicit time
• Systems often have collection of states & actions
• Implicit time means every action that changes state I into state II is assumed to take 1 time step
• Tracking vs Acting
• Acting --> at every time step, choose and perform an action, given sufficient statistics
• Tracking --> the world changes on its own; the agent only maintains sufficient statistics to estimate the current state (can be thought of as a subset of Acting)
• Online vs Offline (Anytime algs)
• Offline - computation is done ahead of time, not on the running robot
• Online - computation is interleaved with acting in the world
• Stationary vs Non-stationary (world changes over time without actions of agent)
• Goal types:

• Goal = state
• Goal = trajectory
• Goal is specified as a path: go through a given sequence of states (exact goal)
• Maintenance goals
• Reward (Loss)
• Plan types:

• Complete policy
• Markov policy
• Only the current state (not the prior history) is relevant
• Linear plan
• Teleo-reactive plan
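The difference between a Markov policy and a complete policy can be sketched as two lookup rules; the states and actions here are invented for illustration:

```python
# Hypothetical Markov policy: maps the current state alone to an action
markov_policy = {"start": "forward", "obstacle": "turn", "goal": "stop"}

def act_markov(state):
    return markov_policy[state]          # depends only on the current state

def act_complete(history):
    # A complete policy may inspect the full history sequence,
    # e.g. give up after hitting an obstacle twice
    if history.count("obstacle") >= 2:
        return "give_up"
    return act_markov(history[-1])

print(act_markov("obstacle"))                           # turn
print(act_complete(["start", "obstacle", "obstacle"]))  # give_up
```

When the world is Markov, the Markov policy loses nothing; when it is not, the complete policy's access to history can matter.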