de Blanc, Peter. 2011. Ontological Crises in Artificial Agents' Value Systems. The Singularity Institute, San Francisco, CA, May 19.


MIRI MACHINE INTELLIGENCE RESEARCH INSTITUTE

Ontological Crises in Artificial Agents' Value Systems

Peter de Blanc
Machine Intelligence Research Institute

Abstract

Decision-theoretic agents predict and evaluate the results of their actions using a model, or ontology, of their environment. An agent's goal, or utility function, may also be specified in terms of the states of, or entities within, its ontology. If the agent may upgrade or replace its ontology, it faces a crisis: the agent's original goal may not be well-defined with respect to its new ontology. This crisis must be resolved before the agent can make plans towards achieving its goals. We discuss in this paper which sorts of agents will undergo ontological crises and why we may want to create such agents. We present some concrete examples, and argue that a well-defined procedure for resolving ontological crises is needed. We point to some possible approaches to solving this problem, and evaluate these methods on our examples.

de Blanc, Peter. 2011. Ontological Crises in Artificial Agents' Value Systems. The Singularity Institute, San Francisco, CA, May 19. The Machine Intelligence Research Institute was previously known as the Singularity Institute.

1. Introduction: Goals and Utility Functions

An agent is any person or thing that performs actions in order to achieve a goal. These goals may involve anything of which the agent is aware, from its own inputs and outputs to distant physical objects.

When creating an artificial agent, it is natural to be interested in which goals we choose to give it. When we create something, we usually do so because we expect it to be useful to us. Thus the goals we give to artificial agents should be things that we want to see accomplished.

Programmers of artificial agents are then faced with the task of specifying a goal. In our discussion we assume that goals take the form of utility functions defined on the set of possible states within the agent's ontology. If a programmer is specifying a utility function by hand, that is, by looking at the ontology and directly assigning utilities to different states, then the ontology must be comprehensible to the programmer. This will typically be the case for an ontology that the programmer has designed, but not necessarily so for one that an agent has learned from experience.

An agent with a fixed ontology is not a very powerful agent, so we would like to discuss an agent that begins with an ontology that its programmers understand and have specified a utility function over, and then upgrades or replaces its ontology. If the agent's utility function is defined in terms of states of, or objects within, its initial ontology, then it cannot evaluate utilities within its new ontology unless it translates its utility function somehow.

Consider, for example, an agent schooled in classical physics. Perhaps this agent has a goal that is easy to specify in terms of the movement of atoms, such as to maintain a particular temperature within a given region of space. If we replace our agent's ontology with a quantum one, it is no longer obvious how the agent should evaluate the desirability of a given state. If its utility function is determined by temperature, and temperature is determined by the movement of atoms, then the agent's utility function is determined by the movement of atoms. Yet in a quantum worldview, atoms are not clearly-defined objects. Atoms are not even fundamental to a quantum worldview, so the agent's ontology may contain no reference to atoms whatsoever. How, then, can the agent define its utility function?

One way to sidestep the problem of ontological crises is to define the agent's utility function entirely in terms of its percepts, as the set of possible percept-sequences is one aspect of the agent's ontology that does not change. Marcus Hutter's universal agent AIXI (Hutter 2007) uses this approach, and always tries to maximize the values in its reward channel. Humans and other animals partially rely on a similar sort of reinforcement learning, but not entirely so.

We find the reinforcement learning approach unsatisfactory. As builders of artificial agents, we care about the changes to the environment that the agent will effect; any reward signal that the agent processes is only a proxy for these external changes. We would like to encode this information directly into the agent's utility function, rather than in an external system that the agent may seek to manipulate.

2. Our Approach

We will approach this problem from the perspective of concrete, comprehensible ontologies. An AI programmer may specify an ontology by hand, and then specify a utility function for that ontology. We will then try to devise a systematic way to translate this utility function to different ontologies.

When using this method in practice, we might expect the agent to have a probability distribution over many ontologies, perhaps specified concisely by the programmer as members of a parametric family. The programmer would specify a utility function on some concrete ontology, which would be automatically translated to all other ontologies before the agent is turned on. In this way the agent has a complete utility function. However, for the purposes of this discussion, we may imagine that the agent has only two ontologies, one old and one new, which we may call O_0 and O_1. The agent's utility function is defined in terms of states of O_0, but it now believes O_1 to be a more accurate model of its environment. The agent now faces an ontological crisis: the problem of translating its utility function to the new ontology O_1.

In this paper we will present a method for addressing these problems. Our intention, however, is not to close the book on ontological crises, but rather to open it. Our method is of an ad hoc character and only defined for a certain class of ontologies. Furthermore, it is not computationally tractable for large ontologies. We hope that this discussion will inspire other thinkers to consider the problem of ontological crises and develop new solutions.

3. Finite State Models

We will now consider a specific kind of ontology, which we may call a finite state model. These models have some finite set of possible hidden states, which the agent does not directly observe. On each time step, the model inputs some symbol (the agent's output), enters some hidden state, and outputs some symbol (the agent's input). The model's output depends (stochastically) only on its current state, while its state depends (stochastically) on both the input and the previous state.
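A minimal sketch of one step of such a model, assuming the column-stochastic matrix parameterization described below; the function and variable names are illustrative rather than canonical, and the code is not part of the method itself.

```python
# Illustrative sketch: one interaction step of a finite state model.
# T[x] is an n x n column-stochastic transition matrix for motor symbol x,
# A is an s x n column-stochastic output matrix (these conventions are assumptions).
import numpy as np

rng = np.random.default_rng(0)

def step(state_dist, motor_symbol, T, A):
    """Advance the hidden-state distribution by one time step.

    state_dist   : length-n probability vector over hidden states
    motor_symbol : the agent's output symbol x
    T            : dict mapping motor symbol x -> n x n matrix whose column j
                   gives P(next state | previous state j, action x)
    A            : s x n matrix whose column j gives P(sensor symbol | state j)
    Returns the updated state distribution and a sampled sensor symbol.
    """
    new_dist = T[motor_symbol] @ state_dist   # state depends on input and previous state
    obs_probs = A @ new_dist                  # output depends only on the current state
    sensor_symbol = rng.choice(len(obs_probs), p=obs_probs)
    return new_dist, sensor_symbol
```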

Let us call the agent's output symbols motor symbols and the agent's input symbols sensor symbols. We will call the sets of symbols the motor alphabet and the sensor alphabet, denoted M and S respectively. We will assume that the alphabets are fixed properties of our agent's embodiment; we will not consider models with different alphabets.

Let m = |M| and s = |S|. Then a model with n states may be completely specified by m different n × n transition matrices and one s × n output matrix. For each x ∈ M, let us call the state transition matrix T^x. Note that the superscript here is not an exponent. We may call the output probability matrix A. Since we will be speaking of two ontologies, O_0 and O_1, we will use subscripts to indicate which ontology we are taking these matrices from; for instance, T_0^x is the state transition matrix for action x in the O_0 ontology.

4. Maps between Ontologies

Our basic approach to translating our utility function from O_0 to O_1 will be to construct a function from O_1 to O_0 and compose our utility function with this function. If

\[
U : O_0 \to \mathbb{R} \tag{1}
\]

is a utility function defined on O_0, and φ : O_1 → O_0, then U ∘ φ is a utility function defined on O_1.

The function φ we will seek to define will be a stochastic function; its output will not be a single state within O_0, but a probability distribution over states. Thus if O_0 has n_0 states while O_1 has n_1 states, φ will be most naturally expressed as an n_0 × n_1 matrix.

Let us consider some desiderata for φ:

1. φ should be determined by the structure of the models O_0 and O_1; the way in which the states are labeled is irrelevant.

2. If O_0 and O_1 are isomorphic to each other, then φ should be an isomorphism.

This may seem irrelevant, for if O_1 is isomorphic to O_0, then there is no need to change models at all. Nevertheless, few would object to 2 on grounds other than irrelevance, and 2 may be seen as a special case of a more general statement:

3. If O_0 and O_1 are nearly isomorphic to each other, then φ should nearly be an isomorphism.

This criterion is certainly relevant; since O_0 and O_1 are both models of the same reality, they can be expected to be similar to that reality, and thus similar to each other.

In accordance with these desiderata, we will try to construct a function that is as much like an isomorphism as possible. To accomplish this, we will define in quantitative terms what we mean by "like an isomorphism." First, we observe that isomorphisms are invertible functions; thus, we will define a second function, which we fancifully call φ^{-1} : O_0 → O_1, even though it may not be a true inverse of φ, and we will optimize both φ and φ^{-1} to be like isomorphisms.

Our criterion is a combination of the computer science notion of bisimulation with the information-theoretic idea of Kullback-Leibler divergence. Bisimulation means that either model may be used to simulate the other, using φ and φ^{-1} to translate states between models. Thus, for any action x, we would like φ^{-1} T_0^x φ to approximate T_1^x. By this we mean that we should be able to predict as accurately as possible the result of some action x in O_1 by translating our distribution for the initial state in O_1 to a distribution over O_0 (using the function φ), predicting the result of action x within O_0, and translating this result back to O_1 using φ^{-1}. Similarly, we would like to use O_1 to predict the behavior of O_0.

Furthermore, we want to optimize φ and φ^{-1} so that both models will make similar predictions about sensory data. Thus A_0 φ should be close to A_1, and A_1 φ^{-1} should be close to A_0.

To measure distance between two matrices, we treat the column vectors as probability distributions and sum the Kullback-Leibler divergences of the columns. For two matrices P and Q, let D_KL(P ‖ Q) be the sum of the Kullback-Leibler divergences of the columns. When calculating Kullback-Leibler divergence, we consider the columns of the A and T matrices to be the true distributions, while those depending on φ or φ^{-1} are regarded as the approximations. So we choose φ and φ^{-1} to minimize the quantity

\[
\sum_{x \in M} \Big( D_{KL}\big(T_1^x \,\big\|\, \phi^{-1} T_0^x \phi\big) + D_{KL}\big(A_1 \,\big\|\, A_0 \phi\big) \Big)
+ \sum_{x \in M} \Big( D_{KL}\big(T_0^x \,\big\|\, \phi\, T_1^x \phi^{-1}\big) + D_{KL}\big(A_0 \,\big\|\, A_1 \phi^{-1}\big) \Big)
\]

Using a simple hill-climbing algorithm, we have tested our criterion on a simple example.
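The criterion can be written out concretely as in the sketch below. This is an illustration under assumptions (column-stochastic matrices throughout, a small smoothing constant inside the KL computation, and a naive Gaussian-perturbation hill climber started once from random values), not a reference implementation; all names are illustrative.

```python
# Illustrative sketch of the bisimulation-style KL objective and a crude hill climber.
import numpy as np

rng = np.random.default_rng(1)
EPS = 1e-12  # smoothing to avoid log(0); an assumption of this sketch

def kl_columns(P, Q):
    """Sum of column-wise KL divergences D_KL(P_col || Q_col), P being the true distributions."""
    P = P + EPS
    Q = Q + EPS
    return float(np.sum(P * (np.log(P) - np.log(Q))))

def objective(phi, phi_inv, T0, T1, A0, A1):
    """The quantity minimized over phi (n0 x n1) and phi_inv (n1 x n0)."""
    total = 0.0
    for x in T0:  # both ontologies share the same motor alphabet
        total += kl_columns(T1[x], phi_inv @ T0[x] @ phi)   # predict O_1 via O_0
        total += kl_columns(A1, A0 @ phi)
        total += kl_columns(T0[x], phi @ T1[x] @ phi_inv)   # predict O_0 via O_1
        total += kl_columns(A0, A1 @ phi_inv)
    return total

def normalize_columns(M):
    """Project a matrix back onto column-stochastic matrices."""
    M = np.clip(M, EPS, None)
    return M / M.sum(axis=0, keepdims=True)

def hill_climb(T0, T1, A0, A1, n0, n1, iters=20000, step=0.05):
    """Naive hill climbing from one random initialization; the paper reports
    taking the best of several such runs."""
    phi = normalize_columns(rng.random((n0, n1)))
    phi_inv = normalize_columns(rng.random((n1, n0)))
    best = objective(phi, phi_inv, T0, T1, A0, A1)
    for _ in range(iters):
        cand_phi = normalize_columns(phi + step * rng.normal(size=phi.shape))
        cand_inv = normalize_columns(phi_inv + step * rng.normal(size=phi_inv.shape))
        cand = objective(cand_phi, cand_inv, T0, T1, A0, A1)
        if cand < best:
            phi, phi_inv, best = cand_phi, cand_inv, cand
    return phi, phi_inv, best
```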

5. Example: The Long Corridor

The agent initially believes that it is standing in a corridor consisting of four discrete locations. The agent's actions are to move left or right. If the agent is already at the end of the corridor and attempts to move further in that direction, it will remain where it is. The agent can see whether it is standing at the left end, the right end, or at neither end of the corridor. The agent's goal is to stand at the right end of the corridor.

Now the agent discovers that this ontology is incorrect; the corridor actually consists of five discrete locations. What, then, should the agent do? Intuitively, it seems most plausible that the agent should stand at the right end of the corridor. Stretching plausibility a bit, perhaps the agent should stand one step away from the right end of the corridor, since the corridor is longer than expected. Any other solution seems counterintuitive.

Our initial, four-state ontology O_0 can be represented in matrix form as follows:

\[
T_0^L = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad
T_0^R = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}, \quad
A_0 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{2}
\]

And the five-state ontology O_1 can be represented as:

\[
T_1^L = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \quad
T_1^R = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}, \quad
A_1 = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \tag{3}
\]
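As a usage sketch, the corridor ontologies can be entered directly and passed to the hypothetical hill_climb helper from the previous sketch; composing the original utility vector with the resulting φ, as in Section 4, then yields utilities over the five-state ontology. Again, this is illustrative code with our own names, not the program used for the results reported below.

```python
# Usage sketch for the corridor example, reusing the illustrative
# `hill_climb` helper defined above.
import numpy as np

# Four-state ontology O_0: actions L and R; sensor symbols are
# (left end, neither end, right end).
T0 = {
    "L": np.array([[1, 1, 0, 0],
                   [0, 0, 1, 0],
                   [0, 0, 0, 1],
                   [0, 0, 0, 0]], dtype=float),
    "R": np.array([[0, 0, 0, 0],
                   [1, 0, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, 1, 1]], dtype=float),
}
A0 = np.array([[1, 0, 0, 0],
               [0, 1, 1, 0],
               [0, 0, 0, 1]], dtype=float)

# Five-state ontology O_1 with the same motor and sensor alphabets.
T1 = {
    "L": np.array([[1, 1, 0, 0, 0],
                   [0, 0, 1, 0, 0],
                   [0, 0, 0, 1, 0],
                   [0, 0, 0, 0, 1],
                   [0, 0, 0, 0, 0]], dtype=float),
    "R": np.array([[0, 0, 0, 0, 0],
                   [1, 0, 0, 0, 0],
                   [0, 1, 0, 0, 0],
                   [0, 0, 1, 0, 0],
                   [0, 0, 0, 1, 1]], dtype=float),
}
A1 = np.array([[1, 0, 0, 0, 0],
               [0, 1, 1, 1, 0],
               [0, 0, 0, 0, 1]], dtype=float)

# Utility over O_0: 1 for the rightmost square, 0 elsewhere.
U0 = np.array([0.0, 0.0, 0.0, 1.0])

phi, phi_inv, value = hill_climb(T0, T1, A0, A1, n0=4, n1=5)

# Translate the utility function to O_1 by composing with phi:
# U1[j] is the expected O_0 utility of the distribution phi[:, j].
U1 = U0 @ phi
print(U1)  # expected to place (almost) all utility on the rightmost of the five states
```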

By hill-climbing from random initial values, our program found several local optima. After 10 runs, our best result, to three significant figures, was:

\[
\phi = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0.503 & 0 & 0 \\ 0 & 0 & 0.496 & 1 & 0 \\ 0 & 0 & 0.001 & 0 & 1 \end{bmatrix}, \quad
\phi^{-1} = \begin{bmatrix} 1 & 0.014 & 0.001 & 0 \\ 0 & 0.715 & 0 & 0 \\ 0 & 0.270 & 0.283 & 0 \\ 0 & 0 & 0.715 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{4}
\]

We can now make an interesting observation: φφ^{-1} is close to an identity matrix, as is φ^{-1}φ. Thus, after mapping from one ontology to the other, we can nearly recover our initial information.

\[
\phi\phi^{-1} = \begin{bmatrix} 1 & 0.014 & 0.137 & 0 \\ 0 & 0.851 & 0.142 & 0 \\ 0 & 0.134 & 0.856 & 0 \\ 0 & 0 & 0.001 & 1 \end{bmatrix}, \quad
\phi^{-1}\phi = \begin{bmatrix} 1 & 0.014 & 0.008 & 0.001 & 0 \\ 0 & 0.715 & 0.360 & 0 & 0 \\ 0 & 0.270 & 0.276 & 0.283 & 0 \\ 0 & 0 & 0.355 & 0.715 & 0 \\ 0 & 0 & 0.001 & 0 & 1 \end{bmatrix} \tag{5}
\]

The matrix φ represents the following function mapping the 5-state ontology to the 4-state ontology:

[Figure: arrows mapping each of the five corridor locations of O_1 to locations of O_0.]

The black arrows indicate near-certainty; the gray arrows indicate probabilities of about 1/2.

If we compose φ with our utility function, we obtain a utility of 1 for the right square and a utility of 0 for the other squares, which agrees with our intuitions.

6. Outlook

Those wishing to extend our algorithm as presented may consider what to do when the agent's sensors or motors are replaced, how to deal with differently-sized time steps, how to deal with continuous models, and how to efficiently find mappings between larger, structured ontologies.

Furthermore, there remain difficult philosophical problems. We have made a distinction between the agent's uncertainty about which model is correct and the agent's uncertainty about which state the world is in within the model. We may wish to eliminate this distinction; we could specify a single model, but only give utilities for some states of the model. We would then like the agent to generalize this utility function to the entire state space of the model.

Human beings also confront ontological crises. We should find out what cognitive algorithms humans use to solve the same problems described in this paper. If we wish to build agents that maximize human values, this may be aided by knowing how humans re-interpret their values in new ontologies. We hope that other thinkers will consider these questions carefully.

Acknowledgments

Thanks to Roko Mijic, with whom I discussed these ideas in 2009, and to Steve Rayhawk, who gave extensive criticism on earlier versions of this paper.

References

Hutter, Marcus. 2007. "Universal Algorithmic Intelligence: A Mathematical Top-Down Approach." In Artificial General Intelligence, edited by Ben Goertzel and Cassio Pennachin, 227–290. Cognitive Technologies. Berlin: Springer. doi:10.1007/978-3-540-68677-4_8.