
Page 1: Computer-on-Watch: Imagery Analysis

Greg Angelides

MIT ILP R&D Conference

14 November 2019

Computer-on-Watch: Imagery Analysis

DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited.

This material is based upon work supported by the Under Secretary of Defense for Research and Engineering under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Under Secretary of Defense for Research and Engineering.

© 2019 Massachusetts Institute of Technology.

Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.

Page 2: Computer-on-Watch: Imagery Analysis

Global AI Technology Race

“AI is probably the most important thing humanity has ever worked on. I think of it as something more profound than electricity or fire.”

Google CEO Sundar Pichai, January 2018

“For years I’ve been telling people that the internet was the appetizer, and that AI is the main course. AI is the way of the future, something whose impact will be broader and deeper.”

Baidu CEO Robin Li, November 2018

“Artificial intelligence is the future, not only for Russia, but for all humankind… Whoever becomes the leader in this sphere will become the ruler of the world.”

Russian President Vladimir Putin, September 2017

Significant global efforts to develop advanced AI systems and lead in this key field

Page 4: Computer-on-Watch: Imagery Analysis

Example Applications of AI

* UAV – Unmanned Aerial Vehicle

• Communications: Organization and Retrieval
• Logistics: Resource Allocation
• Hazardous Area Assessment: UAVs* Surveil Danger Zones
• Autonomous Transport: UAV* Autonomous Resupply
• High-Throughput Data Collection: AI Supporting Data Analysis
• Pattern-of-Life: Anomaly Detection (illustrated with a shipping lane in which most tracks follow the identified pattern and one is flagged as an anomaly)

Focus of brief: High-Throughput Data Collection with AI Supporting Data Analysis

Page 6: Computer-on-Watch: Imagery Analysis

Cognitive Computer on Watch Framework

A cognitive assistant to support 24/7 data analysis

Key Technical Components:

1. Perception: process sensor data to turn unstructured content into structured information
2. Memory: information is indexed and stored to facilitate search and retrieval of historical data
3. Sense-Making: advanced analytics predict, reason, and support decision making
4. Human-Machine Teaming: effective teaming requires efficient interaction and a natural language interface

[Framework diagram: data from maritime, space, cyber, ground, air, and PAI* sources flows into Perception (language processing, computer vision, speech recognition, signal processing), Memory (information storage), and Sense-Making (anomaly detection, alerts, pattern recognition, plan analysis, reasoning); analysts and decision makers query, interact, and give feedback through Human-Machine Teaming across the Observe-Orient-Decide-Act loop.]

* PAI – publicly available information

Today's Focus: Imagery Analysis (the computer vision and reasoning components)

Page 7: Computer-on-Watch: Imagery Analysis

Outline

• Introduction

• Accelerating Imagery Analysis

• Interactive and Interpretable AI

• Summary

Page 8: Computer-on-Watch: Imagery Analysis

Current Image Processing Pipeline

Data (large volume with varying resolution, modality, etc.) → Imagery Analysis (skilled analysts required) → Data Products (distilled information)

• Challenging for limited numbers of skilled analysts to assess the large volume of collected imagery

Page 9: Computer-on-Watch: Imagery Analysis

Image Processing Pipeline Leveraging AI

Data (large volume with varying resolution, modality, etc.) → AI Algorithms (deep neural networks providing robust automated systems) → Prioritized and Annotated Data → Imagery Analysis (analysts perform high-level reasoning) → Data Products (distilled information)

• Challenging for limited numbers of skilled analysts to assess the large volume of collected imagery
• Goal: Leverage advances in computer vision to extract relevant information from traditional sensors

Page 10: Computer-on-Watch: Imagery Analysis

Example Application

Vehicle Classification and Detection

Training Data: Cars Overhead with Context (COWC) Research Dataset*

• Overhead imagery with 15 cm resolution
• Data collected from six locations across four countries
• 5,284 100 m x 100 m images
• 32,716 annotated cars, 58,247 annotated negative examples

[Example image with ground-truth car annotations]

* Mundhenk, T. Nathan, et al. "A Large Contextual Dataset for Classification, Detection and Counting of Cars with Deep Learning." European Conference on Computer Vision. Springer International Publishing, 2016.

Page 11: Computer-on-Watch: Imagery Analysis

Classification Model

Training Data: Manually labeled 90,963 car and background examples

– Estimated labeling time: >250 hours*

*Based on a rate of 10 seconds per sample

• Very high classification accuracy achievable with sufficient training data

Example Manually Annotated Training Images

• Goal: Differentiate cars from car-like objects (boats, trailers, etc.)

[Example training patches labeled Car and Background, each showing the classification region with additional surrounding context]

[Chart: classification model performance, accuracy (%)]
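To make the setup concrete, here is a minimal sketch of the kind of patch classifier this slide describes, assuming PyTorch and 64 x 64 car/background crops; the architecture, patch size, and training details are illustrative stand-ins, not the presenters' model.

# Minimal sketch (not the presenters' model): a small CNN that labels
# fixed-size overhead-imagery patches as "car" vs. "background".
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class PatchClassifier(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(128, num_classes)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

# Stand-in data: in practice these would be labeled 64x64 crops from a dataset like COWC.
patches = torch.rand(256, 3, 64, 64)
labels = torch.randint(0, 2, (256,))          # 1 = car, 0 = background
loader = DataLoader(TensorDataset(patches, labels), batch_size=32, shuffle=True)

model = PatchClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):                         # short demo training loop
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()

In practice the manually labeled patches and a held-out validation split would replace the random stand-in tensors.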

Page 12: Computer-on-Watch: Imagery Analysis

AI Data Challenges

[Chart: learning curve of accuracy vs. number of labeled samples; deep learning breakthroughs approach human-level performance in data-rich environments, while data-starved environments sit on the low end of the curve]

Data Rich Environments
• Labels are free or crowd-sourced
• Data is easy to collect

Data Starved Environments
• Data may be challenging to label for machine learning applications
• Data are difficult to collect because content of interest is rare

Page 14: Computer-on-Watch: Imagery Analysis

Impact of Data Quantity on System Performance

Baseline labeling approach: an analyst labels a subset of the unordered, unlabeled data, and the model is trained on that labeled subset.

[Chart: baseline learning curve of accuracy (%, roughly 92-100) vs. number of labeled samples (k, 0-60), with the maximum baseline accuracy marked]

*Labeled data augmented 5x for training
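A sketch of how such a learning curve can be traced: train on randomly selected labeled subsets of increasing size and record validation accuracy. The model, subset sizes, and stand-in tensors below are assumptions for illustration.

# Sketch (random-subset baseline): train on increasingly large random subsets
# of the labeled pool and record validation accuracy to trace a learning curve.
import torch
import torch.nn as nn

def make_model():
    # Tiny stand-in classifier; the brief does not specify the architecture.
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 64), nn.ReLU(), nn.Linear(64, 2))

def train(model, x, y, epochs=20):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return model

def accuracy(model, x, y):
    with torch.no_grad():
        return (model(x).argmax(1) == y).float().mean().item()

# Stand-in pools; in practice these are labeled car/background patches.
x_pool, y_pool = torch.rand(2048, 3, 64, 64), torch.randint(0, 2, (2048,))
x_val, y_val = torch.rand(512, 3, 64, 64), torch.randint(0, 2, (512,))

curve = []
for n in [128, 256, 512, 1024, 2048]:               # labeled-subset sizes
    idx = torch.randperm(len(x_pool))[:n]           # baseline: random selection
    model = train(make_model(), x_pool[idx], y_pool[idx])
    curve.append((n, accuracy(model, x_val, y_val)))
print(curve)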

Page 15: Computer-on-Watch: Imagery Analysis

Efficient Approach to Labeling Data: Active Learning

Overview:
• In an active learning framework, training samples are prioritized for labeling according to model uncertainty
• This prioritizes samples that will have the largest impact on model performance

Technical Challenge:
• Limitation: Accurate estimation of model uncertainty is challenging
• Solution Employed: Uncertainty estimated by sampling multiple model architectures*

Active Learning Cycle:
1. Model predictions with uncertainty are calculated on the unlabeled data, which is prioritized by uncertainty
2. Analyst labels the prioritized data subset
3. Model training is performed with the labeled data subset

* Gal, Yarin, and Zoubin Ghahramani. "Bayesian Convolutional Neural Networks with Bernoulli Approximate Variational Inference." arXiv preprint arXiv:1506.02158 (2015).
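As an illustration of the uncertainty-sampling idea, here is a minimal Monte Carlo dropout sketch in the spirit of the Gal and Ghahramani reference: dropout stays active at inference so repeated stochastic forward passes sample different effective architectures, and predictive entropy ranks the unlabeled pool. The network, number of passes, and selection size are assumptions, not the system's actual implementation.

# Sketch: estimate per-sample predictive uncertainty with Monte Carlo dropout,
# then rank unlabeled samples so the most uncertain ones are labeled first.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DropoutClassifier(nn.Module):
    def __init__(self, num_classes=2, p=0.5):
        super().__init__()
        self.body = nn.Sequential(
            nn.Flatten(),
            nn.Linear(3 * 64 * 64, 128), nn.ReLU(), nn.Dropout(p),
            nn.Linear(128, 64), nn.ReLU(), nn.Dropout(p),
            nn.Linear(64, num_classes))

    def forward(self, x):
        return self.body(x)

def mc_dropout_uncertainty(model, x, passes=20):
    """Predictive entropy averaged over stochastic forward passes."""
    model.train()                      # keep dropout active at inference time
    with torch.no_grad():
        probs = torch.stack([F.softmax(model(x), dim=1) for _ in range(passes)])
    mean_probs = probs.mean(dim=0)     # average over the sampled sub-networks
    return -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=1)

model = DropoutClassifier()
x_unlabeled = torch.rand(1024, 3, 64, 64)             # stand-in unlabeled pool
scores = mc_dropout_uncertainty(model, x_unlabeled)
priority = scores.argsort(descending=True)             # most uncertain first
to_label = priority[:100]                               # sent to the analyst

Training several independent models and measuring their disagreement would fit the same score-then-rank pattern.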

Page 16: Computer-on-Watch: Imagery Analysis

Impact of Data Quantity on System Performance

Baseline labeling approach: an analyst labels a subset of the unordered, unlabeled data, and the model is trained on that labeled subset.

Efficient labeling with active learning: the model prioritizes the unlabeled data, the analyst labels the prioritized subset, and the model is retrained on the growing labeled subset.

[Chart: learning curves of accuracy (%, roughly 92-100) vs. number of labeled samples (k, 0-60) for active learning and the baseline, with the maximum baseline accuracy marked]

*Labeled data augmented 5x for training
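A sketch of the prioritize-label-retrain loop this comparison depicts, written around the uncertainty scoring idea above; the training function, scoring function, and oracle labeler are passed in, with the oracle standing in for the analyst. Everything here is illustrative rather than the fielded workflow.

# Sketch of an active learning loop: score the unlabeled pool, have an
# "analyst" (here, an oracle function) label the most uncertain samples,
# retrain on everything labeled so far, and repeat.
import torch

def active_learning_loop(model, x_pool, oracle_label, train_fn, score_fn,
                         rounds=5, batch=200):
    labeled_x, labeled_y = [], []
    remaining = torch.arange(len(x_pool))
    for _ in range(rounds):
        scores = score_fn(model, x_pool[remaining])      # e.g. MC-dropout entropy
        pick = scores.argsort(descending=True)[:batch]    # most uncertain first
        chosen = remaining[pick]
        labeled_x.append(x_pool[chosen])
        labeled_y.append(oracle_label(chosen))             # analyst stand-in
        keep = torch.ones(len(remaining), dtype=torch.bool)
        keep[pick] = False
        remaining = remaining[keep]
        model = train_fn(torch.cat(labeled_x), torch.cat(labeled_y))
    return model

# Example wiring with the helpers sketched earlier:
# model = active_learning_loop(DropoutClassifier(), x_pool,
#                              oracle_label=lambda idx: y_pool[idx],
#                              train_fn=lambda x, y: train(DropoutClassifier(), x, y),
#                              score_fn=mc_dropout_uncertainty)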

Page 17: Computer-on-Watch: Imagery Analysis

Impact of Active Learning on Data Requirements

Highlights:
• Efficient labeling reveals that only ~1/5 of the data need to be labeled to achieve maximum baseline performance

[Chart: learning curves of accuracy (%) vs. number of labeled samples (k)* for active learning and the baseline; the gap between where the two curves reach the maximum baseline accuracy indicates the labeling effort saved]

[Samples prioritized with active learning: challenging positives and challenging negatives]

*Labeled data augmented 5x for training

Page 19: Computer-on-Watch: Imagery Analysis

Data Simulation

• Limited real data available for training AI systems on many targets of interest
• Simulation can be used to generate training data on these targets
• Unity game engine simulations incorporate real-world information:
  – Height maps
  – Tree locations
  – Road locations
  – Building locations
• Implemented semi-automated procedural pipeline to build real-world environments

Example Simulation Environments Developed:
• Toronto, Ontario: 0.21 km²
• Ft. Devens (Ayer, MA): 8.3 km²
• Joint Base Cape Cod (JBCC): 340 km²

Page 20: Computer-on-Watch: Imagery Analysis

Simulation Data Products

• Training data: EO imagery, LWIR imagery*
• Object-level truth data: category, positions, detection boxes
• Pixel-level truth data: category (segmentation masks), range, incident angle to ground

[Example EO image with corresponding ground truth]

• Simulation enables generation of massive amounts of accurately labeled and diverse training data
• Utilizing simulated data in the training of real-world models is an open area of research

* Hsu et al. Empirical LWIR Scene Simulation Based on E/O Satellite and Airborne LWIR Imagery
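To show what exported truth data of this kind could look like, here is a hypothetical object-level record in a JSON layout; the file name, field names, and values are invented for illustration and are not the simulation pipeline's actual schema.

# Hypothetical example of object-level truth exported per simulated frame;
# every field name and value here is illustrative, not the actual schema.
import json

frame_truth = {
    "image": "jbcc_frame_000123_eo.png",          # paired EO/LWIR renders
    "sensor": {"modality": "EO", "gsd_cm": 15},
    "objects": [
        {
            "category": "vehicle",
            "position_m": {"x": 412.7, "y": 88.3, "z": 1.4},   # world position
            "bbox_px": [1024, 512, 38, 22],                     # x, y, w, h detection box
        }
    ],
}
with open("frame_000123_truth.json", "w") as f:
    json.dump(frame_truth, f, indent=2)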

Page 21: Computer-on-Watch: Imagery Analysis

Target Detection in Simulation

• Vehicle detection model trained on large amounts of simulated overhead imagery

• Very high performance achieved on simulation environments (~90% precision and recall)
  – Simulation currently leveraged for initial algorithm development and hardware-in-the-loop testing
  – Approaches for transferring simulation-trained models to real-world applications currently being investigated

[Example EO and IR vehicle detections]
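For reference, a sketch of how detection precision and recall such as the ~90% figures are commonly computed: predicted boxes are greedily matched to ground-truth boxes by intersection-over-union. The 0.5 IoU threshold and the (x1, y1, x2, y2) box format are assumptions, not details from the brief.

# Sketch: precision/recall for a detector by IoU matching of predicted boxes
# (x1, y1, x2, y2) to ground-truth boxes. The 0.5 threshold is an assumption.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def precision_recall(pred_boxes, gt_boxes, thresh=0.5):
    matched_gt = set()
    true_pos = 0
    for p in pred_boxes:                              # assume already NMS-filtered
        best_j, best_iou = None, 0.0
        for j, g in enumerate(gt_boxes):
            if j not in matched_gt and iou(p, g) > best_iou:
                best_j, best_iou = j, iou(p, g)
        if best_j is not None and best_iou >= thresh:
            matched_gt.add(best_j)
            true_pos += 1
    precision = true_pos / max(len(pred_boxes), 1)
    recall = true_pos / max(len(gt_boxes), 1)
    return precision, recall

# Toy usage with made-up boxes: one correct detection, one false alarm, one miss.
print(precision_recall([(10, 10, 50, 40), (100, 100, 130, 120)],
                       [(12, 11, 52, 42), (300, 300, 330, 330)]))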

Page 22: Computer-on-Watch: Imagery Analysis

Outline

• Introduction

• Accelerating Imagery Analysis

• Interactive and Interpretable AI

• Summary

Page 23: Computer-on-Watch: Imagery Analysis

Example Human-Machine Teaming: Visual Question Answering Problem

Desired Future: Ask questions through natural language

1. How many cars are south of the large building?
2. Are there any airplanes in the region?
3. How many helicopters are pointing east?

Today: Ask questions by querying a database

SELECT COUNT(*)
FROM cars, buildings
WHERE ST_Contains(ST_GeomFromGeoHash('9qqj7nmxncg'), ST_GeomFromGeoHash(building.geo))
  AND ST_Contains(ST_GeomFromGeoHash('9qqj7nmxncg'), ST_GeomFromGeoHash(car.geo))
  AND ST_Azimuth(ST_GeomFromGeoHash(building.geo), ST_GeomFromGeoHash(car.geo)) < 3*pi()/2
  AND ST_Azimuth(ST_GeomFromGeoHash(building.geo), ST_GeomFromGeoHash(car.geo)) > pi()/2

Page 24: Computer-on-Watch: Imagery Analysis

Visual Question Answering with Machine Learning

Transparency by Design Visual Reasoning Network

[Diagram: the question "How many spheres are left of the cyan object?" is parsed by a language parsing network into a series of sub-tasks (Find Cyan Objects, Look Left, Find Spheres, Count); an image processing network executes the sub-tasks on the image and outputs the answer: 4]

• Transparency by Design¹ (TbD) networks are performant (99.1% accuracy) and produce interpretable outputs
• CLEVR Visual Reasoning Research Dataset

1. David Mascharka, Philip Tran, Ryan Soklaski, and Arjun Majumdar. "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
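To convey the modular structure behind this interpretability, here is a toy sketch in which each sub-task is a small neural module that consumes image features and an attention map and emits a refined attention map, and the parsed program simply chains modules. The module design, feature shapes, and readout are simplified stand-ins rather than the published TbD architecture.

# Toy sketch of composing attention modules the way a TbD-style program does.
# Each module maps (image features, attention) -> attention; Count reads out a number.
import torch
import torch.nn as nn

class AttentionModule(nn.Module):
    """Stand-in for Find/Look modules: refines a spatial attention map."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels + 1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 1), nn.Sigmoid())

    def forward(self, feats, attention):
        return self.conv(torch.cat([feats, attention], dim=1))

def count(attention):
    # Simplified readout: total attention mass as a proxy for an object count.
    return attention.sum(dim=(1, 2, 3))

channels = 128
feats = torch.rand(1, channels, 28, 28)              # image features from a CNN backbone
attn = torch.ones(1, 1, 28, 28)                       # start by attending everywhere

# Program for "How many spheres are left of the cyan object?"
program = [AttentionModule(channels),                  # find cyan objects
           AttentionModule(channels),                  # look left
           AttentionModule(channels)]                  # find spheres
for module in program:
    attn = module(feats, attn)
answer = count(attn)                                    # final Count readout

Because each intermediate attention map can be visualized, an analyst can inspect where the network looked at every step, which is the interpretability property highlighted here.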

Page 25: Computer-on-Watch: Imagery Analysis

Transparency by Design Approach to Visual Question Answering

Question: How many cars are south of the large building?

Parse question and identify sub-tasks:
1. Find large building
2. Look south of the large building
3. Find cars in region south of the large building
4. Count the cars

Answer: 23

• Transparency by Design networks can be combined with advanced analytics to provide rich access to imagery data

• Interpretable decisions allow for review and understanding of algorithm results

Page 26: Computer-on-Watch: Imagery Analysis

Summary

• Modern machine learning techniques provide a means of extracting information from high-throughput sensors on relevant time scales
  – Automated exploitation for rapid analysis

• Methods for developing AI systems in environments with limited training data are an active area of research
  – Active learning approaches can make more efficient use of data, reducing labeling requirements
  – Simulation can provide a large volume of supplemental data when real data is limited
  – Research ongoing into the best way to leverage simulated environments for real-world use

• Continued development of interactive and interpretable machine learning systems is key for providing advanced decision support tools
  – Transparency by Design networks can be combined with advanced analytics to provide rich access to imagery data

Page 27: Computer-on-Watch: Imagery Analysis

Acknowledgements

Bob Bond, Curt Davis, Constantine Frost, Vijay Gadepally, Dan Griffith, Rick Heinrichs, Bernadette Johnson, Samantha Jones, Alicia Kendall, Ben Landon, Arjun Majumdar, David Mascharka, Paul Metzger, Sanjeev Mohindra, Paul Monticciolo, Tommy O'Connell, Bob Shin, Ben Smith, Michael Snyder, Ryan Soklaski, Kim Tanguay, Philip Tran, Marc Viera, Joseph Zipkin