A Minimal Architecture for General Cognition

Michael S. Gashler, Department of Computer Science and Computer Engineering, University of Arkansas, Fayetteville, Arkansas 72701, USA, [email protected]
Zachariah Kindle, Department of Computer Science and Computer Engineering, University of Arkansas, Fayetteville, Arkansas 72701, USA, [email protected]
Michael R. Smith, Department of Computer Science, Brigham Young University, Provo, Utah 84602, USA, [email protected]

Abstract—A minimalistic cognitive architecture called MANIC is presented. The MANIC architecture requires only three function-approximating models and one state machine. Even with so few major components, it is theoretically sufficient to achieve functional equivalence with all other cognitive architectures, and can be practically trained. Instead of seeking to transfer architectural inspiration from biology into artificial intelligence, MANIC seeks to minimize novelty and follow the most well-established constructs that have evolved within various sub-fields of data science. From this perspective, MANIC offers an alternate approach to a long-standing objective of artificial intelligence. This paper provides a theoretical analysis of the MANIC architecture.

I. INTRODUCTION

When artificial intelligence was first discussed in the 1950s as a research project at Dartmouth [1], many of the early pioneers in AI optimistically looked forward to creating intelligent agents. In 1965, Herbert Simon predicted that "machines will be capable, within twenty years, of doing any work a man can do" [2]. In 1967, Marvin Minsky agreed, writing "within a generation . . . the problem of creating 'artificial intelligence' will be substantially solved" [3]. Since that time, artificial general intelligence has not yet been achieved as predicted. Artificial intelligence has been successfully applied in several specific applications such as natural language processing and computer vision. However, these domains are now subfields, and sometimes even divided into sub-subfields, where there is often little communication [4].

Identifying the minimal architecture necessary for a particular task is an important step for focusing subsequent efforts to implement practical solutions. For example, the Turing machine, which defined the minimal architecture necessary for general computation, served as the basis for subsequent implementations of general-purpose computers. In this paper, we present a Minimal Architecture Necessary for Intelligent Cognition (MANIC). We show that MANIC simultaneously achieves theoretical sufficiency for general cognition while being practical to train. Additionally, we identify a few interesting parallels that may be analogous with human cognition.

In biological evolution, the fittest creatures are more likely to survive. Analogously, in computational science, the most effective algorithms are more likely to be used. It is natural to suppose, therefore, that the constructs that evolve within the computational sciences might begin, to some extent, to mirror those that have already evolved in biological systems. Thus, inspiration for an effective cognitive architecture need not necessarily originate from biology, but may also come from the structures that self-organize within the fields of machine learning and artificial intelligence. Our ideas are not rooted in observations of human intelligence, but in seeking a simplistic architecture that could plausibly explain high-level human-like cognitive functionality.

In this context, we propose that high-level cognitive functionality can be explained with three primary capabilities: 1) environmental perception, 2) planning, and 3) sentience. For this discussion, we define environmental perception as an understanding of the agent's environment and situation.
An agent perceives its environment if it models the environment to a sufficient extent that it can accurately anticipate the consequences of candidate actions. We define planning to refer to an ability to choose actions that lead to desirable outcomes, particularly under conditions that differ from those previously encountered. We define sentience to refer to an awareness of feelings that summarize an agent's condition, coupled with a desire to respond to them. We carefully show that MANIC satisfies criteria 1 and 2. Criterion 3 contains aspects that are not yet well defined, but we also propose a plausible theory for sentience, with which MANIC can achieve functional equivalence.

Figure 1 shows a subjective plot of several representative cognitive challenges. This plot attempts to rank these challenges according to our first two capabilities: perception and planning. Due to the subjective nature of these evaluations, exact positions on this chart may be debated, but it is significant to note that the challenges typically considered to be most representative of human cognition are those that simultaneously require capabilities in both perception and planning. Problems requiring only one of these two abilities have been largely solved in the respective sub-disciplines of machine learning and artificial intelligence, in some cases even exceeding human capabilities. It follows that human-like cognition requires an integration of the recent advances in both machine learning and artificial intelligence. MANIC seeks to identify a natural integration of components developed in these respective fields.

Fig. 1. A subjective plot of several cognitive challenges, with axes running from shallow to deep perception and from near-term to long-term planning. (Plotted challenges include tic-tac-toe, Pong, checkers, chess, CAPTCHAs, face recognition, chat bots, equation solving, classification, robot vacuums, maze-running, 2D side-scroller and 3D first-person shooter games, RTS games, mechanical assembly, missile guidance, soccer, driving cars, cooking, babysitting, dating, politics, and research.) Advances in machine learning, especially with deep artificial neural networks, have solved many problems that require deep perception (top-left quadrant). Advances in artificial intelligence have solved many problems that require long-term planning (bottom-right quadrant). The unsolved challenges (mostly in the top-right quadrant) require a combination of both deep perception and long-term planning. Hence the necessity of cognitive architectures, which combine advances in both sub-disciplines to address problems that require a combination of cognitive abilities.

This document is laid out as follows: Section II describes the MANIC cognitive architecture. Section III shows that MANIC is theoretically sufficient to accomplish general cognition. Section IV describes how the MANIC cognitive architecture can be practically trained using existing methods. Section V discusses a plausible theory for sentience, and describes how the MANIC architecture can achieve functional equivalence with it. Finally, Section VI concludes by summarizing the contributions of this paper.

II. ARCHITECTURE

An implementation of MANIC can be downloaded from https://github.com/mikegashler/manic.

A cognitive architecture describes a type of software agent. It operates in a world that may be either physical or virtual. It observes its world through a set of percepts, and operates on its world through a set of actions. Consistent with other simple artificial architectures, MANIC queries its percepts at regular time intervals, t = 0, 1, 2, ..., to receive corresponding vectors of observed values, x_0, x_1, x_2, .... It chooses action vectors to perform at each time, u_0, u_1, u_2, .... For our discussion here, MANIC assumes that its percepts are implemented with a camera (or renderer in the case of a virtual world), such that each x_t is a visual digital image. We use vision because it is well understood in the context of demonstrating cognitive abilities. Other percepts, such as a microphone, could also be used to augment the agent's observations.
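To make this interface concrete, the following Python sketch shows the percept/action cycle described above. It is our own minimal illustration, not drawn from the reference implementation; the World and Agent classes, their method names, and the vector sizes are all hypothetical stand-ins.

    import numpy as np

    class World:
        """A stand-in environment: yields an observation vector each step."""
        def __init__(self, obs_dims=9216, action_dims=2):
            self.obs_dims, self.action_dims = obs_dims, action_dims
        def observe(self):
            return np.random.rand(self.obs_dims)  # x_t, e.g. a flattened camera image
        def act(self, u):
            pass                                  # apply action vector u_t

    class Agent:
        """Skeleton of the percept/action cycle: observe x_t, choose u_t."""
        def __init__(self, action_dims):
            self.action_dims = action_dims
        def choose_action(self, x):
            return np.zeros(self.action_dims)     # placeholder decision-making

    world, agent = World(), Agent(action_dims=2)
    for t in range(10):                           # t = 0, 1, 2, ...
        x_t = world.observe()                     # percept vector x_t
        u_t = agent.choose_action(x_t)            # action vector u_t
        world.act(u_t)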
At a high level, MANIC divides into two systems, which we call the learning system and the decision-making system. The learning system draws its architecture from constructs that are studied predominantly in the field of machine learning, and the decision-making system draws its architecture from the constructs that are studied predominantly in the partly overlapping field of artificial intelligence. The agent's percepts feed into the learning system, providing the source of data necessary for learning, and the agent's actions derive from the decision-making system, which determines the action vectors for the agent to perform. A diagram of the MANIC architecture is given in Figure 2.

A. Learning system

The purpose of the learning system is to learn from past experiences. A large portion of the research effort in the field of machine learning thus far has focused on static systems. These are systems where the outputs (or labels) depend only on the current inputs (or features) and some component of unobservable noise. With a software agent, however, it is necessary for the agent to model its world as a dynamical system. (A dynamical system refers to one that changes over time, and should not be confused with a dynamic system, which is a different concept.)

Although many approaches exist for modeling dynamical systems in machine learning, nearly all of them either explicitly or implicitly divide the problem into three constituent components: a transition model, f, a belief vector, v_t, and an observation model, g. The belief vector is an internal representation of the state of the system at any given time. In the context of a cognitive architecture, we refer to it as "belief" rather than "state" to emphasize that it is an intrinsic representation that is unlikely to completely represent the dynamical system, which in this case is the agent's entire world, that it attempts to model. The transition model maps from current beliefs to the beliefs in the next time step, v_{t+1} = f(v_t). It is implemented using some function-approximating regression model, such as a feed-forward multilayer perceptron. The observation model is a bi-directional mapping between anticipated beliefs and anticipated observations, x̂_t = g(v_t) and v_t = g+(x_t), where g+ approximates the inverse of g.

When trained, this learning system enables the agent to anticipate observations into the arbitrarily distant future (with decaying accuracy) by beginning with the current beliefs, v_t, then repeatedly applying the transition function to estimate the beliefs at some future time step, v_{t+i}. The future beliefs may be passed through the observation model to anticipate what the agent expects to observe at that time step. Because the agent is unlikely to ever successfully model its complex world with perfect precision, the anticipated observations may differ somewhat from the actual observations that occur when the future time step arrives. This difference provides a useful error signal, e = x̂_t − x_t, which can be utilized to refine the learning system over time. In other words, the agent knows its learning system is well trained when the error signal converges toward a steady stream of values close to zero.
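As a hedged illustration of this common structure, the sketch below steps the belief vector forward with f, anticipates an observation with g, and computes the error signal e = x̂_t − x_t. It is our own simplification: linear maps stand in for the function approximators, and all sizes and names are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    BELIEF_DIMS, OBS_DIMS, ACTION_DIMS = 8, 64, 2

    # Linear stand-ins for the function approximators. In a real agent these
    # would be neural networks (see Sections II-A.1 and II-A.2).
    F = rng.normal(scale=0.1, size=(BELIEF_DIMS, BELIEF_DIMS + ACTION_DIMS))
    G = rng.normal(scale=0.1, size=(OBS_DIMS, BELIEF_DIMS))
    G_plus = np.linalg.pinv(G)             # g+ approximates the inverse of g

    def f(v, u):                           # transition model: v_{t+1} = f(v_t, u_t)
        return np.tanh(F @ np.concatenate([v, u]))

    def g(v):                              # observation model (decoder): x̂_t = g(v_t)
        return G @ v

    def g_plus(x):                         # observation model (encoder): v_t = g+(x_t)
        return G_plus @ x

    v = np.zeros(BELIEF_DIMS)              # current beliefs v_t
    u = np.zeros(ACTION_DIMS)              # candidate action u_t
    v_next = f(v, u)                       # anticipated beliefs v_{t+1}
    x_hat = g(v_next)                      # anticipated observation
    x_actual = rng.normal(size=OBS_DIMS)   # what the world actually shows
    e = x_hat - x_actual                   # error signal that refines the models
    print("mean |e| =", np.abs(e).mean())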
This learning system represents the common intersection among a wide diversity of models for dynamical systems. For example, an Elman network [5] is a well-established recurrent neural network suitable for modeling dynamical systems. It is typically considered as a single network, but it can be easily segmented into transition and observation components, and its internal activations may be termed a belief vector. A NARMAX model, which is commonly used in system identification, more explicitly segments into these three components [6]. A Jordan network is a recurrent neural network that lacks an observation model component, but it may still be considered to be a degenerate case of an Elman network that uses the identity function for its observation component. Many other approaches, including the extended Kalman filter [7] and LSTM networks [8], naturally fit within this learning system design.

1) Transition model: Although the learning system as a whole is a recurrent model, the transition model may be implemented as a simple feed-forward model, because it only needs to predict v_{t+1} from v_t and u_t.

Fig. 2. A high-level diagram of MANIC, which consists of just 3 function-approximating models (blue rectangles), one of which is bi-directional, and 1 state machine (red rectangle); vectors are shown in rounded bubbles. (The diagram shows observations from the world feeding through the encoder g+ into the beliefs, the transition model f advancing beliefs from time t to t+1, the decoder g producing anticipated observations for an optional teacher, and the planning module consulting the contentment model h to select candidate actions.) At the highest level, MANIC divides its artificial brain in accordance with functional divisions that have evolved between the sub-disciplines of machine learning and artificial intelligence.

2) Observation model: The purpose of the observation model is to map between beliefs and observations. It is a bi-directional mapping between states and observations. Of all the components in MANIC, the observation model may arguably be the most critical for the overall success of the agent in achieving proficiency in general settings. If g and g+ are well implemented, then the agent's internal representation of beliefs will reflect a rich understanding of its world. This is one aspect of cognition where humans have traditionally excelled over machines, as evidenced by our innate ability to recognize and recall images so proficiently [9]. By contrast, machines have long been able to navigate complex decision chains with greater effectiveness than humans. For example, machines are unbeatable at checkers, and they can consistently trounce most humans at chess. Even decision-making processes that only utilize short-term planning may appear to exhibit much intelligence if they are based on a rich understanding of the situation.

Since about 2006, a relatively new research community has formed within the field of artificial neural networks to study deep learning architectures [10], [11]. These deep learning architectures have demonstrated significant advances in ability for mapping graphical images to hierarchical feature representations. For example, in 2009, Lee and others showed that deep networks decompose images into constituent parts, much as humans understand the same images [12]. In 2012, Krizhevsky demonstrated unprecedented accuracy at visual recognition tasks using a deep learning architecture trained on the ImageNet dataset [13], and many other developments in this community have eclipsed other approaches for image recognition. Since MANIC assumes that observations are given in the form of digital images, and the task of the observation model is to map from these observations to meaningful internal representations, the developments of this community are ideally suited to provide the best implementation for the observation model. The encoding component of the observation model, g+, may be implemented as a deep convolutional neural network, which is known to be particularly effective for encoding images [14], [15], [16]. The decoding component, g, may be implemented as a classic fully-connected deep neural network. Instead of predicting an entire observation image, it models the image as a continuous function. That is, it predicts all the color channel values for only a single pixel, but additional inputs are added to specify the pixel of interest. Such an architecture can be shown to implement the decoding counterpart of a deep convolutional neural network.
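The pixel-wise decoding scheme can be sketched as follows. This is a minimal illustration of the idea, not the paper's actual network: the two-layer perceptron, its sizes, and the coordinate normalization are our assumptions. The decoder receives the belief vector concatenated with the coordinates of the pixel of interest and emits that pixel's color channels; rendering a full anticipated image is then a loop over coordinates.

    import numpy as np

    rng = np.random.default_rng(0)
    BELIEF_DIMS, HIDDEN, W, H = 8, 32, 64, 48

    # A small multilayer perceptron standing in for the decoder g.
    W1 = rng.normal(scale=0.3, size=(HIDDEN, BELIEF_DIMS + 2))
    W2 = rng.normal(scale=0.3, size=(3, HIDDEN))  # 3 color channels out

    def decode_pixel(v, px, py):
        """Predict the RGB values of one pixel from beliefs + coordinates."""
        inputs = np.concatenate([v, [px / W, py / H]])  # pixel of interest
        return W2 @ np.tanh(W1 @ inputs)

    def render(v):
        """Anticipate a whole image by querying the decoder at every pixel."""
        img = np.empty((H, W, 3))
        for y in range(H):
            for x in range(W):
                img[y, x] = decode_pixel(v, x, y)
        return img

    image = render(np.zeros(BELIEF_DIMS))  # a 64x48 anticipated observation
    print(image.shape)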
If an absolutely minimal architecture is desired, the encoder, g+, may be omitted. Rather than calculating beliefs from the current observations, beliefs may be refined to make the anticipated observations match the actual observations. This inference-based approach has the advantage of remembering believed state even when it is not visible in the current observations. Hence, the encoder is only used in the rare cases when the agent is activated in a new context, but repeated applications of inference with the decoder would likely accomplish the same objective.

B. Decision-making system

The decision-making system produces plans to accomplish tasks that are anticipated to maximize the system's contentment. Some clear patterns have also emerged among the many decision-making systems developed for artificial intelligence. Nearly all of them divide the problem into some form of model for estimating the utility of various possible states, and some method for planning actions that are expected to lead to desirable states. Let h(v_t) be the utility that the agent believes to be associated with being in state v_t. Let P be a pool of candidate plans for maximizing h that the agent considers, where each p_i in P is a sequence of action vectors, p_i = (u_t, u_{t+1}, u_{t+2}, ...). At each time step, the agent performs some amount of refinement to its pool of plans, selects the one that yields the biggest expected utility, and chooses to perform the first action in that plan.

Fig. 3. The planning module is a state machine that makes plans by using the 3 function-approximating models to anticipate the consequences and desirability of candidate action sequences. (The flowchart cycles as follows: generate a population of random plans; refine the plans by evolutionary optimization until a satisficing plan is found; if the teacher is available, pick a small but diverse set of the best plans, convert them to videos of imagined observations, and present them to the teacher for ranking; if the teacher is not available, pick the plan that maximizes contentment; then execute the plan and learn from the experience until execution is complete.)

At this high level, the model is designed to be sufficiently general to encapsulate most decision-making processes. For example, those that maintain only one plan, or look only one time-step ahead, may be considered to be degenerate cases of this model with a small pool or short plans. Processes that do not refine their plans after each step may implement the regular refinement as an empty operation. Implementation details of this nature are more properly defined in the lower-level components.
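A minimal sketch of this selection step follows. It is our own illustration: f and h are toy placeholders for the trained transition and contentment models, and the pool sizes are arbitrary. Each candidate plan is rolled forward through the transition model, its final anticipated beliefs are scored with h, and only the first action of the best plan is performed.

    import numpy as np

    rng = np.random.default_rng(0)
    BELIEF_DIMS, ACTION_DIMS = 8, 2

    def f(v, u):                  # placeholder transition model
        return np.tanh(v + 0.1 * np.sum(u))

    def h(v):                     # placeholder contentment model h(v_t);
        return -np.sum(v ** 2)    # here contentment peaks at one homeostatic state

    def expected_utility(v, plan):
        """Anticipate the beliefs at the end of a plan and score them with h."""
        for u in plan:
            v = f(v, u)
        return h(v)

    # P: a pool of candidate plans, each a sequence of action vectors.
    pool = [rng.uniform(-1, 1, size=(5, ACTION_DIMS)) for _ in range(20)]
    v_t = np.zeros(BELIEF_DIMS)
    best = max(pool, key=lambda p: expected_utility(v_t, p))
    u_t = best[0]                 # perform only the first action of the best plan
    print("chosen action:", u_t)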
1) Contentment model: Because MANIC is designed to be a long-living agent, we think of utility as being maintained over time, rather than only being maximized in carrying out a single plan. Thus, we refer to h as the contentment model. We assume that any tasks the agent is intended to accomplish are achieved through the agent's constant efforts to preserve a state of homeostasis. For example, a robotic vacuum would be most content when the floor has been recently cleaned. In order to maintain this condition of contentment, however, it may need to temporarily stop cleaning in order to empty its bag of dust, recharge its batteries, or avoid interfering with the owner. The contentment model, h, is trained to capture all of the factors that motivate the agent.

The contentment model is the only component of MANIC that benefits from a human teacher. Because the models in the learning system learn from unlabeled observations, direction from a teacher can be focused toward this one model. Competitive approaches can also be used to train the contentment model without a human teacher, as described in Section IV.

2) Planning module: The planning module is a simple state machine that utilizes the three models to anticipate the outcome of possible plans, and to evaluate their utility. A flowchart for this module is given in Figure 3. Much work has been done in the field of artificial intelligence to develop systems that guarantee convergence toward optimal plans. Unfortunately, these systems typically offer limited scalability with respect to dimensionality [17]. Early attempts at artificial intelligence found that it is much easier to exceed human capabilities for planning than it is to exceed human capabilities for recognition [9]. This implies that humans utilize rich internal representations of their beliefs to compensate for their relative inability to plan very far ahead. To approach human cognition, therefore, it is necessary to prioritize the scalability of the belief space over the optimality of the planning. A good compromise with many desirable properties is found in genetic algorithms that optimize by simulated evolution. These methods maintain a population of candidate plans, can benefit from prior planning, are easily parallelized, and scale very well into high-dimensional belief spaces.
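The following sketch shows one generation of such evolutionary refinement under our own simplifying assumptions: truncation selection, single-point crossover, and Gaussian mutation, with a toy fitness function standing in for anticipated contentment. The real module, per Figure 3, additionally consults the teacher and converts plans to imagined video.

    import numpy as np

    rng = np.random.default_rng(0)
    PLAN_LEN, ACTION_DIMS, POOL = 5, 2, 40

    def fitness(plan):
        """Stand-in for the anticipated contentment of a plan (Section II-B)."""
        return -np.sum(plan ** 2)

    def evolve(pool, keep=10):
        """One generation: keep the best plans, refill by crossover + mutation."""
        pool.sort(key=fitness, reverse=True)
        survivors = pool[:keep]
        children = []
        while len(survivors) + len(children) < POOL:
            a, b = rng.choice(keep, size=2, replace=False)
            cut = rng.integers(1, PLAN_LEN)               # single-point crossover
            child = np.vstack([survivors[a][:cut], survivors[b][cut:]])
            child += rng.normal(scale=0.1, size=child.shape)  # mutation
            children.append(child)
        return survivors + children

    pool = [rng.uniform(-1, 1, size=(PLAN_LEN, ACTION_DIMS)) for _ in range(POOL)]
    for generation in range(25):                          # refine the pool of plans
        pool = evolve(pool)
    print("best fitness:", fitness(max(pool, key=fitness)))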
III. SUFFICIENCY

Occam's razor suggests that additional complexity should not be added to a model without necessity. In the case of cognitive architectures, additional complexity should probably give the architecture additional cognitive capabilities, or else its necessity should be challenged. In this section, we evaluate MANIC against Occam's razor, and contrast it with two even simpler models to highlight its desirable properties.

Fig. 4. A simple architecture that is not sufficient for general cognition. (A policy agent: observations from the world feed into a single policy model, which produces actions.)

A diagram of a simple policy agent is given in Figure 4. This agent uses a function-approximating model to map from the current observations to actions. (It differs from a reflex agent [18] in that its model is not hard-coded for a particular problem, but may be trained to approximate solutions to new problems that it encounters.) The capabilities of this architecture are maximized when it is implemented with a model that is known to be capable of approximating arbitrary functions, such as a feed-forward artificial multilayer perceptron with at least one hidden layer [19], [20]. However, no matter how powerful its one model may be, there are many problems this architecture cannot solve that more capable architectures can solve. For example, if observations are not sufficient to uniquely identify state, an architecture with memory could do better. Therefore, we say that this policy agent is not sufficient to implement general cognition.

Fig. 5. A simple architecture that is sufficient for general cognition, but is not practical to train. (A memory+policy agent: observations feed into a memory model that maintains beliefs, and a policy model maps beliefs to actions.)

A diagram of a memory+policy agent is given in Figure 5. This architecture extends the policy agent with memory. It uses two models, one to update its internal beliefs from new observations, and one that maps from its current beliefs to actions. This architecture can be shown to be theoretically sufficient for general cognition. If its belief vector is sufficiently large, then it can represent any state that might occur in its world. (This may not always be practical, but we are currently evaluating only the theoretical sufficiency of this architecture.) If its memory model is implemented with an arbitrary function approximator, then it can theoretically update its beliefs as well as any other architecture could from the given observations. If its policy model is implemented with an arbitrary function approximator, then it can theoretically choose actions as well as any other architecture could from accurate beliefs. Therefore, we can say that the memory+policy architecture is theoretically sufficient for general cognition.

Since the memory+policy model is theoretically sufficient for general cognition, it would generally not be reasonable to resort to greater complexity for the purpose of trying to achieve new theoretical capabilities. However, the memory+policy architecture also has a significant practical limitation: it is very difficult to train. It requires a teacher that can unambiguously tell it which actions to choose in every region of its belief space. Even if a teacher is available to provide such thorough training, a memory+policy agent would never be able to exceed the capabilities of its teacher, which renders it of limited practical value.

The MANIC architecture adds some additional complexity so that it can separate its learning system from its decision-making system. This separation enables it to learn from unlabeled observations. That is, it can refine the learning system's ability to accurately anticipate the consequences of actions whether or not the teacher is available. With MANIC, supervision is only needed to guide its priorities, not its beliefs or choices. Yet, while being much more practical to train, MANIC is also provably sufficient for general cognition.

If we allow its belief vector, v_t, to be arbitrarily large, then it is sufficient to encode a correct representation of the agent's world, no matter how complex that world may be. If the transition function, f, is implemented with an arbitrary function approximator, then it is theoretically sufficient to correctly anticipate any state transitions. And, if the decoder, g, is also implemented with an arbitrary function approximator, then it will be able to accurately anticipate observations from correct beliefs. If we allow the genetic algorithm of the planning system to utilize an arbitrarily large population, then it approximates an exhaustive search through all possible plans, which will find the optimal plan. Since the action it chooses is always the first step in an optimal plan, its actions will be optimal for maximizing contentment. Finally, if the contentment model, h, is also implemented with a universal function approximator, then it is sufficient to approximate the ideal utility metric for arbitrary problems. Therefore, MANIC is sufficient for general cognition. In the next section, we discuss additional details about why MANIC is also practical for training.
IV. TRAINING

Perhaps the most significant insight for making MANIC practical for training comes from the high-level division between the learning system and the decision-making system. Because it divides where machine learning and artificial intelligence typically separate, well-established training methods become applicable, whereas such methods cannot be used with other cognitive architectures. Because the learning system does not choose candidate actions, but only anticipates their effect, it can learn from every observation that the agent makes, even if the actions are chosen randomly, and even when no teacher is available.

The learning system in MANIC is a recurrent architecture because the current beliefs, v_t, are used by the transition model to anticipate subsequent beliefs, v_{t+1}. Recurrent architectures are notoriously difficult to train, due to the chaotic effects of feedback that result from the recurrence [21], [22], [23], [24], [25]. This tends to create both large hills and valleys throughout the error surface, so local optimization methods must use very small learning rates. Further, local optimization methods are susceptible to the frequent local optima that occur throughout the error space of recurrent architectures, and global optimization methods tend to be extremely slow.

However, recent advances in nonlinear dimensionality reduction provide a solution for cases where observations lie on a high-dimensional manifold. Significantly, this situation occurs when observations consist of images that derive from continuous space. In other words, MANIC can handle the case of a robot equipped with digital cameras, which has obvious analogy with humans equipped with optical vision.

Fig. 6. A system consisting of a simulated crane viewed through a camera. Each observation consists of 9216 values, but the system itself only exhibits 2 degrees of freedom, so these images lie on a 2-dimensional non-linear manifold embedded in 9216-dimensional space. This figure depicts the entire manifold as represented by uniform sampling over its nonlinear surface.

When observations are high-dimensional, a good initial estimate of state (or in this case, beliefs) can be obtained by reducing the dimensionality of those observations. Co-author Gashler demonstrated this approach in 2011 with a method that trained deep recurrent artificial neural networks to model dynamical systems [26]. For example, consider the crane system depicted in Figure 6. We used images containing 64 × 48 pixels in 3 color channels, for a total of 9216 values. We performed a random walk through the state space of the crane by moving the crane left, right, up, or down, at random, to collect observations. (To demonstrate robustness, random noise was injected into state transitions as well as the observed images.) Using a nonlinear dimensionality reduction technique, we reduced the 9216-dimensional sequence of observed images down to just 2 dimensions (because the system has only 2 degrees of freedom) to obtain an estimate of the state represented in each high-dimensional image. (See Figure 7.) Significant similarity can be observed between the actual (left) and estimated (right) states. Consequently, this approach is ideal for bootstrapping the training of a recurrent model of system dynamics.

Fig. 7. Left: The hidden states visited in a random walk with a simulated crane system. Color is used to depict time, starting with red and ending with purple. The horizontal axis shows boom position, and the vertical axis shows cable length. The model was shown images of the crane, but was not allowed to view the hidden state. Right: Estimated states calculated by reducing the dimensionality of observed images. Although differing somewhat from the actual states, these estimates were close enough to bootstrap training of a recurrent model of system dynamics.

When the beliefs are initialized to reasonable initial values, local optima are much less of a problem, so regular stochastic gradient descent can be used to refine the model from subsequent observations.

In the context of MANIC, this implies that nonlinear dimensionality reduction can be used to estimate each v_t. Then, g can be trained to map from each v_t to x_t, g+ can be trained to map from each x_t to v_t, and f can be trained to map from each v_t to v_{t+1}. Note that each of these mappings depends on having a reasonable estimate of v_t. These values are not typically known with recurrent architectures, but digital images provide sufficient information that unsupervised dimensionality reduction methods can estimate v_1, v_2, ... from x_1, x_2, ... very well. When an estimate for each v_t is known, training the various components of the learning system reduces to a simple supervised learning problem.

A similar three-step approach can be used to bootstrap the learning system of MANIC (a minimal sketch follows the list):
- Gather high-dimensional observations.
- Reduce observations to an initial estimate of beliefs.
- Use beliefs to train the transition model.
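Under our own simplifying assumptions (PCA in place of the nonlinear dimensionality reduction used in [26], least-squares fits in place of neural network training, and synthetic data in place of crane images), the three steps above reduce to the sketch below: estimate each v_t from the observations, then fit the three mappings as ordinary supervised problems.

    import numpy as np

    rng = np.random.default_rng(0)
    T, OBS_DIMS, BELIEF_DIMS = 200, 100, 2

    # Step 1: gather a sequence of high-dimensional observations x_1, ..., x_T.
    X = rng.normal(size=(T, OBS_DIMS))

    # Step 2: reduce the observations to initial belief estimates v_1, ..., v_T.
    # (PCA via SVD here; [26] uses a nonlinear reduction for curved manifolds.)
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Xc @ Vt[:BELIEF_DIMS].T            # estimated beliefs, one row per t

    # Step 3: with each v_t estimated, every model is a simple supervised fit.
    G, _, _, _ = np.linalg.lstsq(V, Xc, rcond=None)         # g:  v_t -> x_t
    G_plus, _, _, _ = np.linalg.lstsq(Xc, V, rcond=None)    # g+: x_t -> v_t
    F, _, _, _ = np.linalg.lstsq(V[:-1], V[1:], rcond=None) # f:  v_t -> v_{t+1}

    print("one-step belief prediction error:",
          np.abs(V[1:] - V[:-1] @ F).mean())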
The last step, training the transition model, is difficult with recurrent models because gradient-based methods tend to get stuck in local optima. However, because dimensionality reduction can estimate beliefs prior to training, it reduces to a simple supervised training approach. It has been shown that bootstrapping the training of neural networks can effectively bypass the local optima that otherwise cause problems for refining with gradient-based approaches [27]. This effect is much more dramatic with recurrent neural networks, since they create local optima throughout their model space [21]. Thus, after initial training, the system can be maintained using regular backpropagation to refine the learning system from subsequent observations.

With the crane dynamical system, we were able to accurately anticipate dynamics several hundred time-steps into the future [26], even with injected noise. To validate the plausibility of our planning system, we also demonstrated this approach on another problem involving a robot that navigates within a warehouse. We first trained the learning system on a sequence of random observations. Then, using only observations predicted by the learning system, we were able to successfully plan an entire sequence of actions by using the learning system to anticipate the beliefs of MANIC. The path we planned is shown in Figure 8.B. We then executed this plan on the actual system. The actual states through which it passed are shown in Figure 8.C. Even though MANIC represented its internal belief states differently from the actual system, the anticipated observations were very close to the actual observations.

Fig. 8. A: A model was trained to capture the dynamics of a robot that uses a camera to navigate within a warehouse. B: A sequence of actions was planned entirely within the belief-space of the model. C: When executed, the plan resulted in actual behavior corresponding with the agent's anticipated beliefs. D: A comparison of the agent's anticipated observations with those actually obtained when the plan was executed.

Many other recent advances in dimensionality reduction with deep artificial neural networks validate that this general approach is effective for producing internal intrinsic representations of external observations [28], [29], [30], [31], [32].

The decision-making system contains one model, h, which needs to be trained to learn what constitutes homeostasis (or contentment) for the system. This is done using a type of reinforcement learning. Because motivations are subjectively tied to human preferences, the motivations for an agent that humans would perceive as intelligent necessarily depend on human teaching. Therefore, we assume that a human teacher is periodically available to direct the MANIC agent. In cases where no human teacher is available, the contentment model could also be trained using a competitive or evolutionary approach. This is done by instantiating multiple MANIC agents with variations in their contentment functions, and allowing the more fit instantiations to survive.
When MANIC makes plans, it utilizes its learning system to convert each plan from a sequence of actions to a corresponding video of anticipated observations. In many ways, these videos of anticipated observations may be analogous with the dreams or fantasies that humans produce internally as they sleep or anticipate future encounters. Although differences certainly exist, this similar side-effect may indicate that the architecture within the human brain has certain similarities with MANIC. Ironically, the imagination of the artificial system is more accessible than that of biological humans, enabling humans to examine the inner imaginings of the artificial system more intimately than they can with each other.

The videos of imagined observations are presented to the human teacher (when he or she is available) for consideration. The human then ranks these videos according to the desirability of the anticipated outcome. Note that these plans need not actually be executed to generate the corresponding video. Rather, the teacher only ranks imagined scenes. The precision with which the imagination of the agent corresponds with reality when it actually executes a plan depends only on the learning system (which is continually refined), and does not depend on the teacher. Because the teacher is only needed to refine the contentment model, only a reasonable amount of human attention is ever needed.
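One plausible way to turn such rankings into a training signal for h is a pairwise-preference loss: each teacher judgment becomes a constraint that the preferred plan's anticipated end-state should score higher. This is our own sketch, using a logistic loss over a linear stand-in for h; the paper does not prescribe a specific loss or model.

    import numpy as np

    rng = np.random.default_rng(0)
    BELIEF_DIMS = 8

    # Each teacher judgment: (beliefs at the end of the preferred plan,
    #                         beliefs at the end of the less-preferred plan).
    pairs = [(rng.normal(size=BELIEF_DIMS), rng.normal(size=BELIEF_DIMS))
             for _ in range(100)]

    w = np.zeros(BELIEF_DIMS)              # linear stand-in for h(v) = w . v

    def h(v):
        return w @ v

    for epoch in range(50):                # logistic pairwise-preference loss
        for v_good, v_bad in pairs:
            margin = h(v_good) - h(v_bad)
            grad = -(v_good - v_bad) / (1.0 + np.exp(margin))
            w -= 0.05 * grad               # gradient step: raise h on v_good

    wins = sum(h(a) > h(b) for a, b in pairs)
    print(f"h now ranks {wins}/{len(pairs)} teacher preferences correctly")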
MANIC encapsulates a diversity of learning paradigms. The observation model, g, is trained in an unsupervised manner from camera-based observations. The transition model, f, and the observation model, g and g+, are trained by supervised learning. The contentment model, h, is trained by reinforcement from a human oracle. In 1999, Doya identified anatomical, physiological, and theoretical evidence to support the hypotheses that the cerebellum specializes in supervised learning, the cerebral cortex specializes in unsupervised learning, and the basal ganglia specialize in reinforcement learning [33]. This suggests a possible correlation between the components of MANIC and those in the brain. Finally, the planning module in MANIC ties all of its pieces together to make intelligent plans by means of a genetic algorithm. We consider it to be a positive property that the MANIC architecture unifies such a wide diversity of learning techniques in such a natural manner. As each of these methods exists for the purpose of addressing a particular aspect of cognition, we claim that a general cognitive architecture must necessarily give place to each of the major paradigms that have found utility in artificial intelligence and machine learning.

V. SENTIENCE THROUGH SELF PERCEPTION

Sentience is a highly subjective and ill-defined concept, but it is a critical aspect of the human experience. Consequently, it has been the subject of much focus in cognitive architectures. We propose a plausible theory that might explain sentience, and show that MANIC can achieve functional equivalence with it.

It is clear that MANIC perceives its environment, because it responds intelligently to the observations it makes. Its perception is implemented by its beliefs, which describe its understanding of the observations it makes, as well as its observation model, which connects its beliefs to the world. The term "awareness" is sometimes used with humans to imply a higher level of perception. It is not clear whether MANIC achieves awareness, because the difference between awareness and perception is not yet well defined. However, because we have proven that MANIC is sufficient for general cognition, we can say for sure that MANIC achieves something functionally equivalent with awareness. That is, we know it can behave as if it is aware, but we do not know whether human awareness requires certain immeasurable properties that MANIC lacks.

Similarly, the term "sentience" contains aspects that are not yet well defined, but other aspects of sentience are well established. Specifically, sentience requires that an agent possess feelings that summarize its overall well-being, as well as an awareness of those feelings, and a desire to act on them. An agent that implements the well-defined aspects of sentience can be said to implement something functionally equivalent.

Our definition of sentience is best expressed as an analogy with perception: perception requires the ability to observe the environment, beliefs that summarize or represent those observations, and a model to give the beliefs context. Similarly, we propose that sentience requires the ability to make introspective observations, "feelings" that summarize or represent them, and a model to give the feelings context. In other words, if sentience arises from self-awareness, then MANIC can achieve something functionally equivalent through self-perception. In addition to a camera that observes the environment, MANIC can be equipped with the ability to observe its own internal state. (For example, it might be enabled to additionally observe the weights of its three models, f, g, and h, and its belief vector, v.) Since MANIC is already designed for operating with high-dimensional observations, these introspective observations could simply be concatenated with the external observations it already makes. This would cause MANIC to utilize a portion of v to summarize its introspective observations.

Thus, v would represent both the "beliefs" and "feelings" of a MANIC agent. Its planning system would then implicitly make plans to maintain homeostasis in both its beliefs and feelings. And its observation model would give context to its feelings by mapping between feelings and introspective observations, just as it does between beliefs and external observations. This theory of sentience is plausible with humans, because humans plan with regard to both their feelings and beliefs as a unified concept, maintaining both their external objectives and internal well-being, and because a well-connected brain, which humans have, would be sufficient to provide the introspective observations necessary to facilitate it.

Since MANIC learns to model its priorities from a human teacher via its contentment function, h, it will learn to give appropriate regard to its own feelings when they are relevant for its purpose. Presumably, this will occur when the human teacher directs it to maintain itself. Using the same observation model with both introspective and external observations, and using the same vector to model both feelings and beliefs, are both plausible because these design choices will enable MANIC to entangle its feelings with its environment, behavior that humans are known to exhibit.
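A minimal sketch of this self-perception mechanism follows. It is ours, not from the paper: the flattening-and-concatenation scheme and all sizes are illustrative assumptions. It simply appends a flattened snapshot of the agent's own parameters and beliefs to the external observation before the combined percept is encoded.

    import numpy as np

    rng = np.random.default_rng(0)

    external = rng.random(9216)                  # camera observation, as before
    model_weights = [rng.normal(size=(8, 8)),    # snapshots of f, g, h parameters
                     rng.normal(size=(64, 8)),
                     rng.normal(size=(1, 8))]
    beliefs = np.zeros(8)                        # current belief vector v_t

    # Introspective observation: the agent's own internals, flattened.
    introspective = np.concatenate([w.ravel() for w in model_weights] + [beliefs])

    # The combined percept feeds the same observation model; a portion of v_t
    # will come to summarize the introspective part, i.e. the agent's "feelings".
    x_t = np.concatenate([external, introspective])
    print("external:", external.size, "introspective:", introspective.size,
          "combined:", x_t.size)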
VI. CONCLUSION

We presented a cognitive architecture called MANIC. This architecture unifies a diversity of techniques in the sub-disciplines of machine learning and artificial intelligence without introducing much novelty. Yet, while relying on existing methods, and with minimal complexity, MANIC is a powerful cognitive architecture. We showed that it is sufficiently general to accomplish arbitrary cognitive tasks, and that it can be practically trained using recent methods.

We supported MANIC's design by referring to existing works that validate its individual components, and we made theoretical arguments about the capabilities that should emerge from combining them in the manner outlined by MANIC. The primary contribution of this paper is to show that these existing methods can already accomplish more of cognitive intelligence than is generally recognized. Our ultimate intent is to argue that if general intelligence is one of the ultimate objectives of the fields of machine learning and artificial intelligence, then they are very much on the right track, and it is not clear that any critical piece of understanding necessary for implementing a rudimentary consciousness is definitely missing.

REFERENCES

[1] J. McCarthy, M. Minsky, N. Rochester, and C. E. Shannon, "A proposal for the Dartmouth summer research project on artificial intelligence, August 31, 1955," AI Magazine, vol. 27, no. 4, pp. 12-14, 2006.
[2] H. A. Simon, The Shape of Automation for Men and Management. New York: Harper and Row, 1967.
[3] D. Crevier, AI: The Tumultuous History of the Search for Artificial Intelligence. New York: Basic Books, 1993.
[4] P. McCorduck, Machines Who Think, 2nd ed. Natick, MA: A. K. Peters, Ltd., 2004.
[5] X. Gao, X. Gao, and S. Ovaska, "A modified Elman neural network model with application to dynamical systems identification," in Systems, Man, and Cybernetics, 1996 IEEE International Conference on, vol. 2. IEEE, 1996, pp. 1376-1381.
[6] S. Chen and S. Billings, "Representations of non-linear systems: the NARMAX model," International Journal of Control, vol. 49, no. 3, pp. 1013-1032, 1989.
[7] S. S. Haykin, Ed., Kalman Filtering and Neural Networks. Wiley Online Library, 2001.
[8] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[9] N. Pinto, D. D. Cox, and J. J. DiCarlo, "Why is real-world visual object recognition hard?" PLoS Computational Biology, vol. 4, no. 1, p. e27, 2008.
[10] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.
[11] K.-S. Oh and K. Jung, "GPU implementation of neural networks," Pattern Recognition, vol. 37, no. 6, pp. 1311-1314, 2004.
[12] H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng, "Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations," in Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 2009, pp. 609-616.
[13] A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems 25, 2012, pp. 1106-1114.
[14] S. Lawrence, C. L. Giles, A. C. Tsoi, and A. D. Back, "Face recognition: A convolutional neural-network approach," Neural Networks, IEEE Transactions on, vol. 8, no. 1, pp. 98-113, 1997.
[15] Y. LeCun, K. Kavukcuoglu, and C. Farabet, "Convolutional networks and applications in vision," in Circuits and Systems (ISCAS), Proceedings of 2010 IEEE International Symposium on. IEEE, 2010, pp. 253-256.
[16] D. Ciresan, U. Meier, J. Masci, and J. Schmidhuber, "A committee of neural networks for traffic sign classification," in Neural Networks (IJCNN), The 2011 International Joint Conference on. IEEE, 2011, pp. 1918-1921.
[17] R. Bellman, Adaptive Control Processes: A Guided Tour. Princeton, NJ, USA: Princeton University Press, 1961.
[18] S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach. Englewood Cliffs, NJ: Prentice-Hall, 1995.
[19] G. Cybenko, "Approximation by superpositions of a sigmoidal function," Mathematics of Control, Signals and Systems, vol. 2, no. 4, pp. 303-314, 1989.
[20] G. Cybenko, "Continuous valued neural networks with two hidden layers are sufficient," Tufts University, Medford, MA, Tech. Rep., 1988.
[21] M. P. Cuéllar, M. Delgado, and M. C. Pegalajar, "An application of non-linear programming to train recurrent neural networks in time series prediction problems," Enterprise Information Systems VII, pp. 95-102, 2006.
[22] E. Sontag, "Neural networks for control," Essays on Control: Perspectives in the Theory and its Applications, vol. 14, pp. 339-380, 1993.
[23] J. Sjöberg, Q. Zhang, L. Ljung, A. Benveniste, B. Deylon, P. Y. Glorennec, H. Hjalmarsson, and A. Juditsky, "Nonlinear black-box modeling in system identification: a unified overview," Automatica, vol. 31, pp. 1691-1724, 1995.
[24] D. Floreano and F. Mondada, "Automatic creation of an autonomous agent: Genetic evolution of a neural-network driven robot," From Animals to Animats, vol. 3, pp. 421-430, 1994.
[25] A. Blanco, M. Delgado, and M. C. Pegalajar, "A real-coded genetic algorithm for training recurrent neural networks," Neural Networks, vol. 14, no. 1, pp. 93-105, 2001.
[26] M. S. Gashler and T. R. Martinez, "Temporal nonlinear dimensionality reduction," in Proceedings of the International Joint Conference on Neural Networks. IEEE Press, 2011, pp. 1959-1966.
[27] Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle, "Greedy layer-wise training of deep networks," Advances in Neural Information Processing Systems, vol. 19, p. 153, 2007.
[28] G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, 2006.
[29] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proceedings of the 25th International Conference on Machine Learning. ACM, 2008, pp. 1096-1103.
[30] S. Rifai, P. Vincent, X. Muller, X. Glorot, and Y. Bengio, "Contractive auto-encoders: Explicit invariance during feature extraction," in Proceedings of the 28th International Conference on Machine Learning (ICML-11), 2011, pp. 833-840.
[31] Y. Bengio, A. C. Courville, and J. S. Bergstra, "Unsupervised models of images by spike-and-slab RBMs," in Proceedings of the 28th International Conference on Machine Learning (ICML-11), 2011, pp. 1145-1152.
[32] D. P. Kingma and M. Welling, "Auto-encoding variational Bayes," arXiv preprint arXiv:1312.6114, 2013.
[33] K. Doya, "What are the computations of the cerebellum, the basal ganglia and the cerebral cortex?" Neural Networks, vol. 12, no. 7, pp. 961-974, 1999.