Course Overview
- What is the course about?
- AI as what humans do
- AI as what we cannot do yet
- AI as systems to achieve weak goals
- Overview vs detail - some of each
- Aiming at top of class
- Discussion in class - take your own notes - almost problem-based learning
- Plagiarism policy
- 4 main things we focus on:
- Systems
- Code reviews
- Version control
- Search/Dynamic Programming
- Bayes
- Subjective probability
- Representation
- How the world is represented in your agent
- The agent cannot understand anything that is not in this representation
- e.g. in RoboCup, the field is represented in 2D, so the robots cannot comprehend the ball being in the air
Agent building = system building
- KISS (Keep It Simple, Stupid)
- Premature optimization is the root of all evil
- Good software engineering helps
- Code reviews
- Only as strong as weakest link
- Trade-offs between approaches
Perception:
- Bayes' Rule
- (and take a machine learning course)
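As a reminder of how Bayes' Rule is used for perception, here is a minimal sketch in Python. The states ("near", "far"), the sensor reading and all the numbers are made up for illustration; they are not from the lecture.

```python
def bayes_posterior(prior, likelihood):
    """Return P(state | observation) for every state.

    prior:      dict mapping state -> P(state)
    likelihood: dict mapping state -> P(observation | state)
    """
    # Numerator of Bayes' Rule: P(observation | state) * P(state)
    unnormalized = {s: likelihood[s] * prior[s] for s in prior}
    # Denominator: P(observation), summed over all states
    evidence = sum(unnormalized.values())
    return {s: p / evidence for s, p in unnormalized.items()}


# Hypothetical example: the sensor reports "near".
prior = {"near": 0.5, "far": 0.5}
likelihood_of_reading = {"near": 0.8, "far": 0.2}  # P(reading "near" | state)
posterior = bayes_posterior(prior, likelihood_of_reading)
```

With these assumed numbers the belief that the object is near rises from 0.5 to about 0.8 after the reading.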
Overview of Agent in world
- Sense/Act cycle (+reward)
- Consider the agent and world to be 2 separate entities
- Information passes between these 2 entities (information interface)
- The agent senses what is going on in the world, and then carries out an action upon the world
- Observations and actions can also have a quality, called a "reward", associated with them
- This reward can be instantaneous, i.e. for the current observation/action, or accumulated over time (simplified explanation)
- Definition of a history sequence
- The history sequence is a series of observations (o) and actions (a)
- e.g. o1 a1 o2 a2 o3 a3
- The reward can also be included in this sequence, either folded into the observation or as a separate element, e.g. o1 a1 r1 o2 a2 r2 (conventions differ)
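The sense/act cycle and the history sequence can be sketched as a short loop. This is only an illustration: `BitWorld`, `echo_agent` and `run` are hypothetical names, and the reward is stored as a separate element of each history entry, which is one of the two options mentioned above.

```python
import random


class BitWorld:
    """Hypothetical toy world: it holds one bit, which changes randomly each step."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.bit = 0

    def observe(self):
        # What the agent senses (here the world is fully observable).
        return self.bit

    def step(self, action):
        # Instantaneous reward: 1 if the action matches the current bit.
        reward = 1 if action == self.bit else 0
        self.bit = self.rng.randint(0, 1)
        return reward


def echo_agent(observation):
    """Hypothetical agent: act by repeating the last observation."""
    return observation


def run(world, agent, steps):
    history = []       # sequence of (o, a, r) triples
    accumulated = 0    # accumulated reward
    for _ in range(steps):
        o = world.observe()   # sense
        a = agent(o)          # decide
        r = world.step(a)     # act upon the world, receive reward
        history.append((o, a, r))
        accumulated += r
    return history, accumulated


history, accumulated = run(BitWorld(), echo_agent, 5)
```

The agent and the world only exchange observations, actions and rewards, which is the "information interface" between the two entities.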
Many assumptions/constraints to make things tractable
- My approach:
- Introduce the general theory
- Give specific examples, some simple, some complex
Types of systems:
- Discrete vs Continuous
- Finite vs Infinite
- Continuous (Smooth, Discontinuous, etc)
- Discrete is usually linked with Discontinuous because it tends to have "jumps"
- Fully Observable vs Partially Observable
- Fully Observable - when the agent senses the world, it knows everything about the state of the world
- Partially Observable - the agent only knows what its sensors let it observe; this is the case for almost all real agents
- Markov vs. Non-Markov
- Assuming time is discrete
- In a Markov model, the next state depends only on the current state and action, not on the states and actions at earlier time steps
- It is not necessary to remember the full history, since we only want to record sufficient statistics to be able to predict what happens next
- Deterministic vs Non-deterministic vs Stochastic (Non-deterministic with probabilities)
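- Deterministic: each action has exactly one outcome, so no probabilities are needed
- Non-deterministic: several outcomes are possible, and probabilities are not necessarily given
- Stochastic: a probability is always given for each outcome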
- Implicit vs explicit time
- Systems often have collection of states & actions
- Implicit time means every action that changes state I into state II is assumed to take 1 time step
- Tracking vs Acting
- Acting --> at every time step, perform an action, given sufficient statistics
- Tracking --> the world performs the changes rather than the agent; the task is to find sufficient statistics for estimating the state or for choosing an action (can be thought of as a subset of Acting)
- Online vs Offline (Anytime algs)
- Offline - computed in advance, not attached to the robot
- Online - computed while the agent is running (continuing problems)
- Stationary vs Non-stationary (world changes over time without actions of agent)
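To make the deterministic/stochastic and Markov distinctions concrete, here is a small sketch of a discrete, finite system with two states. The state names, action names and probabilities are invented for illustration. Both step functions are Markov because they take only the current state and action as input, never the earlier history.

```python
import random

# Deterministic model: exactly one successor per (state, action).
DETERMINISTIC = {
    ("left", "go_right"): "right",
    ("right", "go_left"): "left",
}

# Stochastic model: a probability for every possible successor.
STOCHASTIC = {
    ("left", "go_right"): {"right": 0.9, "left": 0.1},
    ("right", "go_left"): {"left": 0.9, "right": 0.1},
}


def step_deterministic(state, action):
    return DETERMINISTIC[(state, action)]


def step_stochastic(state, action, rng):
    distribution = STOCHASTIC[(state, action)]
    successors = list(distribution)
    weights = [distribution[s] for s in successors]
    # Sample one successor according to the given probabilities.
    return rng.choices(successors, weights=weights, k=1)[0]


rng = random.Random(0)
next_state = step_stochastic("left", "go_right", rng)
```

A non-deterministic model without probabilities could be sketched the same way by mapping each (state, action) pair to a plain set of possible successors.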
Goal types:
- Goal = state
- Goal = trajectory
- Goal is specified as a path: go through a given sequence of states (exact goal)
- Maintenance goals
- Reward (Loss)
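One way to read the four goal types is as different tests applied to the trajectory of states an agent passes through. The sketch below is an interpretation, not a definition from the lecture: all function names are hypothetical, and treating a maintenance goal as "a condition that must hold in every state" is an assumption.

```python
def reached_goal_state(trajectory, goal_state):
    """Goal = state: only the final state matters."""
    return trajectory[-1] == goal_state


def followed_goal_trajectory(trajectory, path):
    """Goal = trajectory: the exact sequence of states matters."""
    return list(trajectory) == list(path)


def maintained(trajectory, condition):
    """Maintenance goal (assumed meaning): the condition holds in every state."""
    return all(condition(state) for state in trajectory)


def total_reward(trajectory, reward):
    """Reward (loss): score the trajectory instead of accepting or rejecting it."""
    return sum(reward(state) for state in trajectory)


# Hypothetical trajectory of integer positions.
trajectory = [0, 1, 2, 3]
```

For this trajectory, `reached_goal_state(trajectory, 3)` holds, `maintained(trajectory, lambda s: s < 3)` does not, and a reward of -1 per state gives a total of -4.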
Plan types:
- Complete policy
- Markov policy
- Only the most recent state (not the prior history) is relevant
- Linear plan
- Teleo-reactive plan
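The plan types differ mainly in what the choice of action depends on. The sketch below compares three of them in a hypothetical 1-D world with positions 0 to 2. The world, the action names and the helper functions are all assumptions for illustration. The teleo-reactive plan is modelled as an ordered list of condition/action rules in which the first rule whose condition holds fires, which is the usual reading of the term.

```python
def move(state, action):
    """Hypothetical world: positions 0..2, 'right' moves one step, anything else stays."""
    return min(state + 1, 2) if action == "right" else state


# Markov policy: the action depends only on the current state.
markov_policy = {0: "right", 1: "right", 2: "stay"}

# Linear plan: a fixed action sequence, executed without looking at the state.
linear_plan = ["right", "right", "stay"]

# Teleo-reactive plan: ordered (condition, action) rules, goal condition first.
teleo_reactive_plan = [
    (lambda s: s == 2, "stay"),   # goal reached: do nothing
    (lambda s: True, "right"),    # otherwise: move towards the goal
]


def act_teleo_reactive(rules, state):
    for condition, action in rules:
        if condition(state):
            return action
    raise ValueError("no rule applies")


def run_linear_plan(state, plan):
    for action in plan:
        state = move(state, action)
    return state


def run_policy(state, choose_action, steps):
    for _ in range(steps):
        state = move(state, choose_action(state))
    return state


end_linear = run_linear_plan(0, linear_plan)
end_markov = run_policy(0, markov_policy.get, 3)
end_teleo = run_policy(1, lambda s: act_teleo_reactive(teleo_reactive_plan, s), 3)
```

All three reach position 2 here. The difference would show up if the world moved the agent unexpectedly: the linear plan keeps executing its fixed sequence, while the other two re-read the state at every step.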