
Logical Induction
Andrew Critch [email protected]

Logical Induction

Scott Garrabrant, Andrew Critch, Tsvi Benson-Tilsen, Nate Soares, Jessica Taylor

(scott|critch|tsvi|nate|jessica)@intelligence.org

Machine Intelligence Research Institute

http://intelligence.org/


Outline

Rough plan for this talk:
[5 mins] The problem of logical induction
[10 mins] Motivation from AI safety and other fields
[30 mins] Beamer presentation of technical results
[15 mins] Implications and take-aways


                                    1 min   1 day   ∞
#1. P(D10 = 7)                      10%     10%     10%
#2. P(D10 = 7 | snapshot)           10%     15%     16%
#3. P(10th digit of √(10) = 7)      10%     1%      0%

Snapshot for #2:

Credences should change with time spent thinking/computing:

Probability theory gives rules for how probabilities should relate to each other and change with new observations, assuming logical omniscience…

…but what rules should credences follow over time, as computation is carried out on observations that have already been made?

Also, 50% would be a worse answer to start with here... can we make a principled theory from which this claim would follow?

Goal: call the purple processes “logical induction” and figure out how it should work.
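
Row #3 of the table can be settled by computation alone, which is what separates it from rows #1 and #2. The sketch below is an illustration only: it uses exact integer square roots from the Python standard library, and because the slide does not say whether "10th digit" counts the leading 3, it checks both conventions (that choice is an assumption).

```python
from math import isqrt

def sqrt10_digits(n_decimals: int) -> str:
    """Return the digits of floor(sqrt(10) * 10**n_decimals) as a string."""
    # isqrt works on exact integers, so no floating-point rounding is involved.
    return str(isqrt(10 * 10 ** (2 * n_decimals)))

digits = sqrt10_digits(12)        # leading "3" followed by 12 decimal digits
tenth_overall = int(digits[9])    # assumption: counting the leading 3 as digit 1
tenth_decimal = int(digits[10])   # assumption: counting from the first decimal place

print(digits[0] + "." + digits[1:])
print("10th digit is 7?", tenth_overall == 7, tenth_decimal == 7)
```

Before running this, a credence near 10% is reasonable; after running it, the credence should move to 0% or 100%, with no new observation of the world in between.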


Why develop a theoretical model of logical induction?

Q: How can we reason about a highly capable AI system before it exists?

A: One approach is to model it as “good at stuff”, like:

•  choosing actions to achieve objectives given beliefs → it roughly obeys rational choice theory (e.g. VNM theorem)

•  updating beliefs according to new evidence → it roughly obeys probability theory (e.g. Bayes’ theorem)

•  computing belief updates with resource limitations → it roughly obeys <?????> theory (e.g. <*****> theorem)

In hopes of developing it, <?????> has been called “logical uncertainty”, and we call the process of refining logical uncertainties “logical induction”.


Past desiderata for “good reasoning” under logical uncertainty:

1.   computable approximability: the process should be approximable by a Turing Machine. (Demski, 2012)

2.   coherent limit: after infinite time, credences should satisfy the laws of probability theory, such as (A → B) ⇒ (P(A) ≤ P(B)). (Gaifman, 1964)

3.   partial coherence: credences at finite times should roughly satisfy some coherence properties, such as Q(A ∧ B) + Q(A ∨ B) ≈ Q(A) + Q(B). (Good, 1950; Hacking, 1967)

4.   calibration: the process should be right roughly 90% of the time when it’s 90% confident. (Savage, 1967)

5.   introspection: the process should be able to describe and reason about itself. (Hintikka, 1962; Fagin, 1995; Christiano, 2013; Campbell-Moore, 2015)

6.   self-trust: it should understand that it is reliable and that it will become more reliable with time. (Hilbert, 1900)

7.   non-dogmatism: it does not assign 100% or 0% credence to claims unless they have been proven or disproven, respectively. (Carnap, 1962; Gaifman, 1982; Snir, 1982)

8.   PA-capable: it should assign non-zero probability to the consistency of Peano Arithmetic, i.e. to the set of consistent completions of PA.

9.   rough inexploitability: it should not be easy to “Dutch book” the process / make bets against it that are guaranteed to win. (von Neumann and Morgenstern, 1944; de Finetti, 1979)

10.   Gaifman inductivity: it should come to believe (∀x, f(x)) in the limit as it examines every example of x and confirms f(x). (Gaifman, 1964; Hutter, 2013)

11.   Efficiency: it runs in polynomial (preferably quadratic) time.

12.   Decision-relevant: should be able to focus computation on questions relevant to decisions.

13.   Updates on old evidence. (Glymour, 1980)
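
Two of the desiderata above, partial coherence (#3) and calibration (#4), are easy to state as measurements. The following is a minimal sketch, not anything from the paper: the function names, the dictionary keys, the tolerance `tol`, and the sample credences are all made up for illustration.

```python
from typing import Dict, List, Tuple

def coherence_gap(q: Dict[str, float]) -> float:
    """Absolute violation of Q(A and B) + Q(A or B) = Q(A) + Q(B)."""
    return abs(q["A and B"] + q["A or B"] - q["A"] - q["B"])

def calibration(records: List[Tuple[float, bool]], level: float, tol: float = 0.05) -> float:
    """Fraction of true claims among those given a credence within tol of level."""
    hits = [truth for credence, truth in records if abs(credence - level) <= tol]
    if not hits:
        raise ValueError("no claims near the requested credence level")
    return sum(hits) / len(hits)

# Hypothetical credences at some finite time: slightly incoherent.
credences = {"A": 0.6, "B": 0.5, "A and B": 0.3, "A or B": 0.75}
gap = coherence_gap(credences)

# Hypothetical track record: (stated credence, whether the claim turned out true).
history = [(0.9, True)] * 9 + [(0.9, False)] + [(0.5, True), (0.5, False)]
rate = calibration(history, level=0.9)

print(f"coherence gap: {gap:.2f}, hit rate at 90% confidence: {rate:.2f}")
```

A reasoner with a small gap and a hit rate close to the stated level does well on these two items; the other desiderata (limits, self-trust, inexploitability) are not captured by such simple checks.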


Let’s defer applications until later in the talk, when the idea has been made more precise.

Any questions so far about the problem itself before we get into formal definitions?


Formalizing logical induction

PowerPoint → Beamer


Formalizing logical induction

Beamer → PowerPoint


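The current state of logical uncertainty theory

Domain of Study: rational choice theory / economics
    Agent Concept: VNM utility maximizer
    Minimalistic Sufficient Conditions: VNM axioms
    Desirability Arguments: Dutch book arguments, compelling axioms, …
    Feasibility: AIXI, POMDP solvers, …

Domain of Study: probability theory
    Agent Concept: Bayesian updater
    Minimalistic Sufficient Conditions: axioms of probability theory
    Desirability Arguments: Dutch book arguments, compelling axioms, …
    Feasibility: Solomonoff induction

Domain of Study: logical uncertainty theory (recent progress)
    Agent Concept: Garrabrant inductor
    Minimalistic Sufficient Conditions: ???
    Desirability Arguments: Dutch book arguments, historical desiderata, …
    Feasibility: LIA2016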
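
Dutch book arguments appear as a desirability argument for every domain listed here. As a reminder of the classical version, this sketch shows an agent whose credences in A and in not-A sum to more than 1 losing money whichever way A turns out. The function name, prices, and stake are illustrative assumptions; the logical-uncertainty version of the argument is more involved than this.

```python
def agent_net(p_a: float, p_not_a: float, stake: float = 1.0) -> float:
    """Agent's net result after buying a bet on A and a bet on not-A.

    Each bet costs credence * stake and pays stake if it wins.
    Exactly one of the two bets wins, so the payout is the same in both outcomes.
    """
    cost = (p_a + p_not_a) * stake
    payout = stake
    return payout - cost

# Hypothetical incoherent credences: P(A) = 0.6 and P(not A) = 0.6.
loss = agent_net(0.6, 0.6)
# Coherent credences break even on this pair of bets.
fair = agent_net(0.6, 0.4)

print(f"incoherent agent nets {loss:.2f}; coherent agent nets {fair:.2f}")
```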


Paths forward

1.   Improving logical uncertainty theory (minimalistic conditions, more consequences…)

2.   Using Garrabrant inductors / LIA2016 to pose and solve new problems in AI alignment

3.   Other approaches to AI alignment*

* Must eventually address logical uncertainty implicitly or explicitly, so expect some convergence.

MIRI’s focus


How will logical induction be applicable?

Conceptual tools for reasoning about incentives, competition, and goal pursuit are under-developed for computationally bounded agents. They presume agents are logically omniscient, because we already had good theoretical models for developing them that way:

•  Game theory and economics:
    –  Von Neumann-Morgenstern utility theorem
    –  Nash equilibria and correlated equilibria
    –  Efficient market theory:
        •  Fundamental theorems of welfare economics
        •  Coase’s Theorem
    –  Value of Information (VOI)

•  Mechanism design:
    –  Gibbard–Satterthwaite theorem
    –  Myerson–Satterthwaite theorem
    –  Revenue Equivalence theorem

We can use our theoretical model of logical induction to refine and expand these fields for better application to artificial agents.


Visualizing a theoretical application

Currently, game theory analyzes scenarios with logically omniscient agents…

Now we can better theoretically analyze scenarios with bounded reasoners:


What have we learned so far?

The following are more feasible than one might think:

•  Inexploitability. An algorithm can satisfy a fairly arbitrary set of inexploitability conditions using Brouwer’s FPT.

•  Self-trust. Introspection and self-trust need not lead to mathematical paradoxes.

•  [email protected], by an uncomputably large margin on poly-time generable questions.


What have we learned so far?

The following are less “required” than one might think for a rational gambler to avoid exploitation:

•  Calibration. So far it looks like one need only be calibrated about logical bets that are settled sufficiently quickly (this is being actively researched).

•  Hard-coded belief coherence. A powerful bet-balancing procedure can and must learn to “mimic” deductive rules used to settle its bets.


Meta updates

MIRI’s general approach includes developing “big” questions about how AI can and should work, past the stages of philosophical conversation and into the domain of math and CS.

Philosophy: big questions about AI

Mathematics/CS: technical answers


Meta updates

I was not personally expecting logical induction to be “solved” in this way for at least a decade, so I’ve updated that:

•  the methodology of breaking unsettled philosophical questions down into math/CS and grinding through them is more fruitful than I thought; and

•  perhaps other seemingly “out of reach” problems in AI alignment, like decision theory and logical counterfactuals, might be amenable to this approach.


Thanks!

To

•  Scott Garrabrant, for the core idea and many rapid subsequent insights

•  Tsvi Benson-Tilsen, Nate Soares, and Jessica Taylor for coauthoring the paper

•  Jimmy Rintjema for a lot of help with LaTeX bugs and collaborative editing issues


Slides from other talks I could end up wanting to use in response to questions:


Some questions

Is it feasible to build a useful superintelligence that, e.g.,

•  Shares our values, and will not take them to extremes? (“value learning”)

•  Will not compete with us for resources? (“convergent incentives”)

•  Will not resist us modifying its goals or shutting it down? (“corrigibility”)

•  Can understand itself without deriving contradictions via bounded Löb’s Theorem? (“self-reflective stability”)


Examples of technical understanding

•  Vickrey second-price auctions (1961):
    –  Well-understood optimality results (truthful bidding is optimal)
    –  Real-world applications (network routing)
    –  Decades of peer-review
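
The optimality result mentioned for Vickrey auctions can be checked on a small case. This is a sketch under stated assumptions: the bidder names, valuations, and the handful of alternative bids are invented, ties are broken by dictionary order, and trying a few bids illustrates the result rather than proving it.

```python
from typing import Dict, Tuple

def second_price_auction(bids: Dict[str, float]) -> Tuple[str, float]:
    """Return (winner, price): the highest bidder wins and pays the second-highest bid."""
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids, key=bids.get, reverse=True)
    return ranked[0], bids[ranked[1]]

def utility(value: float, my_bid: float, other_bids: Dict[str, float]) -> float:
    """Bidder 'me' gets value minus price when winning, and 0 otherwise."""
    winner, price = second_price_auction({"me": my_bid, **other_bids})
    return value - price if winner == "me" else 0.0

others = {"rival_1": 6.0, "rival_2": 4.0}   # hypothetical competing bids
value = 7.0                                  # hypothetical true valuation

truthful = utility(value, value, others)
alternatives = [utility(value, b, others) for b in (0.0, 3.0, 5.0, 6.5, 9.0, 20.0)]

# Overbidding can hurt when a rival bids above the true value.
overbid = utility(value, 9.0, {"rival_1": 8.0})
honest = utility(value, value, {"rival_1": 8.0})

print(truthful, max(alternatives), overbid, honest)
```

No alternative bid beats the truthful one in the first scenario, and in the second scenario overbidding wins the item at a loss while truthful bidding simply loses the auction.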


•  Nash equilibria (1951):


•  Classical Game Theory (1953):

An extensive form game.


Problem: Counterfactuals for Self-Reflective Agents

What does it mean for a program A to improve some feature of a larger program E in which A is running, and which A can understand?

def Environment():
    …
    def Agent(senseData):
        …
    def Utility(globalVariables):
        …
    …
    do Agent(senseData1)
    …
    do Agent(senseData2)
    …
end


(optional pause for discussion of IndignationBot)


Example: π maximizing

What would happen if I changed the first digit of π to 9? This seems absurd because π is logically determined. However, the result of running a computer program (e.g. the evolution of the Schrödinger equation) is logically determined by its source code and inputs…


…when an agent reasons to do X “because X is better than Y”, considering what would happen if it did Y instead means considering a mathematical impossibility. (If the agent has access to its own source code, it can derive a contradiction from the hypothesis “I do Y”, from which anything follows.) This is clearly not how we want our AI to reason. How do we want it to reason?
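
The step "from which anything follows" is the principle of explosion. A minimal Lean statement of it, with hypothetical proposition names chosen for this slide (`DoesY` for "the agent does Y", `Anything` for an arbitrary conclusion):

```lean
-- If the agent can prove it does not do Y, then assuming "I do Y"
-- yields a contradiction, and any conclusion at all can be derived.
example (DoesY Anything : Prop) (h : DoesY) (hn : ¬DoesY) : Anything :=
  absurd h hn
```

So a naive "suppose I do Y" evaluates every consequence of Y as provable, good and bad alike, which gives the agent no usable comparison between X and Y.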


Current formalisms are “Cartesian” in that they separate an agent’s source code and cognitive machinery from its environment.

This is a type error, and in combination with other subtleties, it has some serious consequences.


Examples (page 1)

•  Robust Cooperation in the Prisoners’ Dilemma (LaVictoire et al, 2014) demonstrates non-classical cooperative behavior in agents with open source codes;

•  Memory Issues of Intelligent Agents (Orseau and Ring, AGI 2012) notes that Cartesian agents are oblivious to damage to their cognitive machinery;
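
To give a feel for what "agents with open source codes" means in the first bullet, here is a toy sketch. It is deliberately much simpler than the provability-based agents studied in that line of work: each program sees only a `source` string standing in for real source code, and the `Program` class, `play` function, and bot names are all invented for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass(frozen=True)
class Program:
    source: str                          # stand-in for the program's source code
    decide: Callable[[str, str], str]    # (own_source, opponent_source) -> "C" or "D"

def play(p: Program, q: Program) -> Tuple[str, str]:
    """One-shot Prisoners' Dilemma where each side reads the other's source."""
    return p.decide(p.source, q.source), q.decide(q.source, p.source)

# Cooperates only with exact copies of itself (a source-matching strategy).
clique_bot = Program("clique_bot_v1", lambda own, other: "C" if other == own else "D")
# Always defects, regardless of what it reads.
defect_bot = Program("defect_bot_v1", lambda own, other: "D")

print(play(clique_bot, clique_bot))   # two copies cooperate
print(play(clique_bot, defect_bot))   # the copy-checker is not exploited
```

Even this crude strategy achieves mutual cooperation without being exploitable by a defector, which classical analysis of the one-shot game rules out. Its weakness is brittleness: any change to the opponent's source, however harmless, breaks cooperation.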


Examples (page 2)

•  Space-Time Embedded Intelligence (Orseau and Ring, AGI 2012) provides a more naturalized framework for agents inside environments;

•  Problems of self-reference in self-improving space-time embedded intelligence (Fallenstein and Soares, AGI 2014) identifies problems persisting in the Orseau-Ring framework, including procrastination and issues with self-trust arising from Löb’s theorem;


Examples (page 3)

•  Vingean Reflection: Reliable Reasoning for Self-Improving Agents (Fallenstein and Soares, 2015) provides some approaches to resolving some of these issues;

•  …lots more; see intelligence.org/research for additional reading.
