

Source: web.eecs.umich.edu/~jahausw/media/sirius-poster.pdf

Sirius: An Open End-to-End Voice and Vision Personal Assistant and Its Implications for Future Warehouse Scale Computers

Johann Hauswald, Michael A. Laurenzano, Yunqi Zhang, Cheng Li, Austin Rovinski, Arjun Khurana, Ron Dreslinski, Trevor Mudge, Vinicius Petrucci, Lingjia Tang, Jason Mars

Intelligent Personal Assistants (IPAs) are standard in today’s mobile devices. The rapid rise in IPA-equipped devices means more compute-intensive queries will hit current datacenters, which are ill-suited to this type of workload.

1. Problem Statement: Redesigning the Datacenter for Intelligent Personal Assistants

2. Sirius: An Open End-to-End Voice and Vision Personal Assistant

Figure 1: End-to-End Sirius Pipeline (labels: Users; Mobile; Server; Voice: Question or Action; Image: Image Data; Automatic Speech-Recognition; Query Classifier; Question: Question-Answering, Search Database; Action: Execute Action; Image Matching, Image Database; Answer: Display Answer)
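The stages named in Figure 1 can be read as a dispatch flow. The following is a minimal sketch of that reading, not the Sirius implementation: every function name, the keyword-based classifier, the tiny image table, and the way an image match is merged into the question text are hypothetical stand-ins. Only the stage names come from the figure.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed keyword list for the toy classifier; not taken from Sirius.
ACTION_VERBS = ("call", "set", "open", "play")


def speech_to_text(audio: str) -> str:
    """Stub for Automatic Speech-Recognition: the 'audio' is already text."""
    return audio.strip()


def classify(text: str) -> str:
    """Toy Query Classifier: returns 'action' or 'question'."""
    words = text.split()
    first = words[0].lower() if words else ""
    return "action" if first in ACTION_VERBS else "question"


def match_image(image: str) -> str:
    """Stub for Image Matching against a hypothetical image database."""
    image_db = {"bar.jpg": "Example Bar"}
    return image_db.get(image, "unknown")


def answer_question(question: str) -> str:
    """Stub for Question-Answering; a real service would search a database."""
    return f"answer({question})"


@dataclass
class Result:
    kind: str      # "action" or "answer"
    payload: str


def handle_query(audio: str, image: Optional[str] = None) -> Result:
    text = speech_to_text(audio)
    if classify(text) == "action":
        # Returned for execution ("Execute Action" in Figure 1).
        return Result("action", text)
    if image is not None:
        # Assumption: the matched image label is appended to the question.
        text = f"{text} [image: {match_image(image)}]"
    return Result("answer", answer_question(text))
```

Calling `handle_query("Call my doctor.")` takes the action branch, while `handle_query("When does this bar close?", "bar.jpg")` passes through the image-matching stub before question answering.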

Figure 2: Top-down view of Sirius (layers: Users; Query Taxonomy: Voice Command (VC), Voice Query (VQ), Voice-Image Query (VIQ); IPA Services: Automatic Speech-Recognition (ASR), Question Answering (QA), Image Matching (IMM); Algorithmic Components: Gaussian Mixture Model (GMM) or Deep Neural Network (DNN), Stemmer, Regular Expression, Conditional Random Fields, Feature Extraction, Feature Description; Tasks: Signal Processing, Natural Language Processing, Image Processing; Open Source Tools: CMU Sphinx)
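The layers of Figure 2 can also be written down as lookup tables. This is a hedged sketch: the grouping of algorithmic components under each service is read from the order in which the labels appear in the figure, and the mapping from query type to services (VC uses ASR only, VQ adds QA, VIQ adds IMM) is an assumption that the poster text does not state explicitly. The names `SERVICE_COMPONENTS`, `SERVICES_BY_QUERY`, and `components_for` are illustrative.

```python
from enum import Enum


class QueryType(Enum):
    VC = "Voice Command"
    VQ = "Voice Query"
    VIQ = "Voice-Image Query"


# Component labels from Figure 2, grouped by the service they appear next to.
SERVICE_COMPONENTS = {
    "ASR": ["Gaussian Mixture Model or Deep Neural Network"],
    "QA": ["Stemmer", "Regular Expression", "Conditional Random Fields"],
    "IMM": ["Feature Extraction", "Feature Description"],
}

# Assumed mapping from query type to the services it exercises.
SERVICES_BY_QUERY = {
    QueryType.VC: ["ASR"],
    QueryType.VQ: ["ASR", "QA"],
    QueryType.VIQ: ["ASR", "QA", "IMM"],
}


def components_for(query_type: QueryType) -> list:
    """List every algorithmic component a query type would touch."""
    return [
        component
        for service in SERVICES_BY_QUERY[query_type]
        for component in SERVICE_COMPONENTS[service]
    ]
```

Under these assumptions, `components_for(QueryType.VIQ)` returns all six component labels, which matches the intuition that a voice-image query is the most demanding of the three query types.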

Sirius is built from the latest open source tools and resembles current production intelligent personal assistants in its algorithmic components.

Figure 3: Sirius Service Cycle Breakdown

Clarity Lab, University of Michigan, Ann Arbor, MI, USA

3. Accelerating Sirius-suite

Sirius-suite: extracted from Sirius, this is a suite of the 7 most computationally demanding kernels in Sirius, representing 92% of the total execution time.

Figure 4: Heat-map of Sirius-suite Acceleration (labels: Sirius-suite, Speedup)

Platform    Model                    Clock      Threads
CMP         Intel Xeon E3-1240 V3    3.40 GHz   8
GPU         NVIDIA GTX 770           1.05 GHz   12288
Intel Phi   Phi 5110P                1.05 GHz   240
FPGA        Xilinx Virtex-6 ML605    400 MHz    N/A

Table 1: Sirius-suite Ported Platforms

4. Implications for Future Warehouse Scale Computers

Latency Reduction: FPGA: 16x, GPU: 10x

Figure 5: Latency Reduction Across Sirius Services

Total Cost of Ownership (TCO) Improvement: GPU: 2.6x, FPGA: 1.4x
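To make the reported factors concrete, the short sketch below converts each one into the fraction of the baseline that remains. It assumes that an "Nx" reduction or improvement means baseline divided by N, which the poster does not spell out. The dictionary and function names are illustrative, and only the four factors come from the poster.

```python
# Factors as reported on the poster.
LATENCY_REDUCTION = {"FPGA": 16.0, "GPU": 10.0}
TCO_IMPROVEMENT = {"GPU": 2.6, "FPGA": 1.4}


def remaining_fraction(factor: float) -> float:
    """Fraction of the baseline left after an N-fold reduction.

    Assumes factor = baseline / accelerated value.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / factor


for platform in ("GPU", "FPGA"):
    latency = remaining_fraction(LATENCY_REDUCTION[platform])
    tco = remaining_fraction(TCO_IMPROVEMENT[platform])
    print(f"{platform}: latency {latency:.2%} of baseline, "
          f"TCO {tco:.2%} of baseline")
```

Read this way, the platform with the larger latency reduction (FPGA) is not the one with the larger TCO improvement (GPU), which is the trade-off the two sets of numbers point to.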

Example queries: "Call my doctor." "Who’s the lead singer of U2?" "When does this bar close?"

Figure 6: Latency Reduction

Figure 7: Performance/Watt Improvement

Figure 8: TCO Reduction