Towards Human Level AI

Jitendra Malik

U.C. Berkeley

This talk

• Approaches to computational intelligence

• Review of human intelligence and child development

• Review of computer vision and its problems

• The problem of visual categories

• Proposed agenda

Paradigms for mechanizing intelligence ~1960

• Classic AI (McCarthy, Minsky, Newell, Simon)
  – Games, theorem-proving, reasoning
  – Search, represent and reason in first-order logic

• Pattern Recognition/Neural Networks (Rosenblatt)
  – Classification, Associative memory
  – Learning (Perceptrons …)

• Estimation and Control (Kalman)
  – Decide action in uncertain, time-varying environment
  – Markov Decision Processes, adaptive control …

Achievements (1960-1990)

• Classic AI (McCarthy, Minsky, Newell, Simon)
  – Chess programs, planning systems, first-generation expert systems
  – [[ relational databases ]]

• Pattern Recognition/Neural Networks (Rosenblatt)
  – Various applications of MLPs and other learning techniques
  – HMMs for speech recognition

• Estimation and Control (Kalman)
  – Man on Moon
  – Self-driving cars …

John von Neumann’s warning

• "As a mathematical discipline travels far from its empirical source, or still more, if it is a second and third generation only indirectly inspired from ideas coming from 'reality', it is beset with very grave dangers. It becomes more and more purely aestheticizing, more and more purely l'art pour l'art. This need not be bad, if the field is surrounded by correlated subjects, which still have closer empirical connections, or if the discipline is under the influence of men with an exceptionally well-developed taste.

But there is a grave danger that the subject will develop along the line of least resistance, that the stream, so far from its source, will separate into a multitude of insignificant branches, and that the discipline will become a disorganized mass of details and complexities.

In other words, at a great distance from its empirical source, or after much 'abstract' inbreeding, a mathematical subject is in danger of degeneration. At the inception the style is usually classical; when it shows signs of becoming baroque the danger signal is up. It would be easy to give examples, to trace specific evolutions into the baroque and the very high baroque, but this would be too technical.

In any event, whenever this stage is reached, the only remedy seems to me to be the rejuvenating return to the source: the reinjection of more or less directly empirical ideas. I am convinced that this is a necessary condition to conserve the freshness and the vitality of the subject, and that this will remain so in the future."

Brain Sub-Systems

• Sensory
  – Vision (30-50%)
  – Audition
  – Somatic
  – Chemical (Taste, Smell)

• Motor
  – Manipulation
  – Locomotion
  – Speech

• Language

• Central
  – Planning and problem solving
  – …

What do we know from human child development?

• It is nature AND nurture

Visual Development

• Axon growth guided by chemical gradients (in turn due to gene expression)

• Critical period for development of orientation selectivity (Hubel & Wiesel)

• New-born babies sensitive to faces

• Visual tracking ~3 mo

• Binocularity/Stereopsis ~4 mo

Language Development

• Babbling & tuning phonemes

• Developing link between words and objects

• Words refer to objects

• They denote categories

• Objects have only one name

• Vocabulary growth: slow between 12 and 18 mo (median no. of words at 20 mo is 169) and very rapid afterwards (13k words at 6 yrs)

• Word pairs (18-24 mo)

• Grammar takes off after that

Cognitive development

• Categorization

• Perception/reality distinction

• Self-Awareness (mirror test)

Many types of memory

• Semantic memory

• Episodic memory

• Skill memory

The Hilbert Problems of Computer Vision

Jitendra Malik

Forty years of computer vision 1963-2003

• 1960s: Beginnings in artificial intelligence, image processing and pattern recognition

• 1970s: Foundational work on image formation: Horn, Koenderink, Longuet-Higgins …

• 1980s: Vision as applied mathematics: geometry, multi-scale analysis, control theory, optimization …

• 1990s:
  – Geometric analysis largely completed
  – Probabilistic/Learning approaches in full swing
  – Successful applications in graphics, biometrics, HCI …

And now …

• Back to basics: the classic problem of understanding the scene from its image/s

• Central question: Interplay of bottom-up and top-down information

Early Vision

• What can we learn from image statistics that we didn't know already?

• How far can bottom-up image segmentation go?

• How do we make inferences from shading and texture patterns in natural images?

Static Scene Understanding

• What is the interaction between segmentation and recognition?

• What is the interaction between scenes, objects, and parts?

• What is the role of design vs. learning in recognition systems?

Dynamic Scene Understanding

• What is the role of high-level knowledge in long range motion correspondence?

• How do we find and track articulated structures?

• How do we represent "movemes" and actions?

Proposed Research Agenda

Child Language Acquisition with visual input
