Big Data and Analytics - Eddie Satterly and Michael Stevens

Upload: erepublic

Post on 14-Apr-2018


  • 7/27/2019 Big Data and Analytics - Eddie Satterly and Michael Stevens


    Big Data & Analytics

    Los Angeles Digital Government Summit

    September 4, 2013

    Eddie Satterly - Splunk

    Michael D. Stevens - IBM


    What is Big Data & Analytics?

    Why is it Important to Government?


    Big data embodies new data characteristics created by today's digitized environment

    [Figure: Characteristics of big data. Only the label "Variability" survives in this transcript.]

    Source: IBM methodology

    2013 IBM Corporation


    An Explosion of Data

    150 Exabytes: global size of Big Data in Healthcare, growing between 1.2 and 2.4 EB / year

    AT&T transfers about 30 Petabytes of data through its network daily

    For every session, the NY Stock Exchange captures 1 Terabyte of trade information

    The Hadron Collider at CERN generates 40 Terabytes of usable data / day

    Facebook processes 500+ Terabytes of data daily

    Twitter processes 12 Terabytes of data daily

    Google processes > 24 Petabytes of data in a single day

    By 2016, annual Internet traffic will reach 1.3 Zettabytes (1 ZB = 1,000,000,000,000,000,000,000 bytes = 10^21 bytes)
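The figures on this slide mix terabytes, petabytes, exabytes and zettabytes, which makes them hard to compare at a glance. The sketch below is one way to put them on a common scale. It assumes decimal (SI) units, consistent with the slide's own definition of a zettabyte. The function names and the dictionary are illustrative, and the volumes are copied from the slide without independent verification.

```python
# Decimal (SI) byte units, consistent with the slide's "1 ZB = 10^21 bytes".
UNITS = {"TB": 10**12, "PB": 10**15, "EB": 10**18, "ZB": 10**21}


def to_bytes(value, unit):
    """Convert a quantity such as (30, "PB") to bytes."""
    return value * UNITS[unit]


def convert(value, from_unit, to_unit):
    """Express a quantity in another unit, e.g. petabytes as terabytes."""
    return to_bytes(value, from_unit) / UNITS[to_unit]


# Daily volumes as quoted on the slide. "500+" and "> 24" are lower bounds.
DAILY_VOLUMES = {
    "Twitter": (12, "TB"),
    "CERN (usable data)": (40, "TB"),
    "Facebook": (500, "TB"),
    "Google": (24, "PB"),
    "AT&T network": (30, "PB"),
}

if __name__ == "__main__":
    # Print the sources from smallest to largest daily volume, all in TB.
    ordered = sorted(DAILY_VOLUMES.items(), key=lambda item: to_bytes(*item[1]))
    for name, (value, unit) in ordered:
        print(f"{name}: {convert(value, unit, 'TB'):,.0f} TB/day")
```

Expressed this way, the quoted AT&T figure (30 PB) is 30,000 TB, several orders of magnitude above the Twitter figure.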


    The Growth of Big Data

    [Figure: Global Data Volume in Exabytes (0 to 9,000) and Aggregate Uncertainty % (10 to 100), 2005 to 2015. Series: Sensors (Internet of Things), VoIP, Enterprise Data, Social Media (video, audio and text). Multiple sources: IDC, Cisco.]

    By 2015 the number of networked devices will be double the entire global population.

    The total number of social media accounts exceeds the entire global population.

    Warehouses will not be enough. The future will be based on those that learn how to leverage the power of Big Data.


    Where is the Big Data Coming From?

    Sources: Text Documents, Blogs, Web Logs, Mfg. Equipment, Email, Weather Data, Social Media, Stock Trades, Utility Meters, Medical Equip., Call Data Records, Point of Sale Data, Video Cameras, Audio Devices, Oil Rigs

    Data at rest

    Data is stored on disk

    Huge volumes of unstructured data

    No pre-defined schemas

    Too large for traditional tools to process in a timely manner

    Data in motion

    Data is typically not stored

    Tremendous velocity

    Multiple data sources

    Huge volumes of unstructured data

    Ultra low latency required


    Copyright 2012, Splunk Inc. Listen to your data.

    What Does Big Data Look Like?

    GPS, RFID, Hypervisor, Web Servers, Email, Messaging, Clickstreams, Mobile, Telephony, IVR, Databases, Sensors, Telematics, Storage, Servers, Security devices, Desktops

    Machine-generated data is one of the fastest growing, most complex and most valuable segments of big data


    Big Data Enables Different Kinds of Analytics

    Structured Data & Unstructured Content

    Descriptive Analytics

    What is happening?

    What exactly is the problem?

    How many, how often, where?

    What actions are needed?

    Predictive Analytics

    Simulation: What could happen?

    Forecasting: What if these trends continue?

    Predictive Modelling: What will happen next if?

    Prescriptive Analytics

    Optimization: How can we achieve the best outcome?

    Stochastic Optimisation: How can we achieve the best outcome and address variability?

    Content Analytics: Extracting concepts and relationships

    Web Analytics

    Language & Sentiment: What are people talking about & feeling?

    Made consumable and accessible to everyone


    New Challenges & Big Data Require A Different Approach

    Leaders Are Breaking The Traditional Information Management Model

    Traditional Approach: Structured & Repeatable Analytics

    Business Users: Determine what question to ask

    IT: Structures the data to answer that question

    Query Based -- Questions Drive Data

    Customer Surveys & Focus Groups

    Monthly, Weekly, Daily

    Data At Rest

    VS.

    Big Data Approach: Iterative & Exploratory Analytics

    IT: Delivers a platform to enable creative discovery

    Business: Explores what questions could be asked

    Autonomic -- Insight Drives Answers

    Customer Sentiment

    Persistent & Ad Hoc

    Data In Motion & at rest


    Big Data Better Context

    Sense Making: We understand something better by taking into account the things around it

    Context Accumulation: The incremental process of integrating new

    observations with previous observations.

    "@Steve Rocked The Slopes Today!" (1 minute ago)

    [Hardly actionable]

    "@Steve Rocked The Slopes Today!" (1 minute ago) + Back Injury Work Comp Claim + Dr. Blacklist

    [Substantially more actionable]
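Context accumulation, as defined on this slide, can be illustrated with a small sketch: each new observation about an entity is merged with what is already known, and the combined picture is what gets evaluated. Everything in the code is hypothetical. The class name, the source labels and especially the "two distinct sources" rule are assumptions made for illustration, not a method described by the presenters.

```python
from collections import defaultdict


class ContextStore:
    """Accumulates observations per entity (illustrative only)."""

    def __init__(self):
        # entity -> list of (source, fact) observations, in arrival order
        self._observations = defaultdict(list)

    def observe(self, entity, source, fact):
        """Integrate a new observation with the previous ones."""
        self._observations[entity].append((source, fact))

    def context(self, entity):
        """Return everything known so far about an entity."""
        return list(self._observations[entity])

    def is_actionable(self, entity, min_sources=2):
        # Hypothetical rule: a picture corroborated by several
        # independent sources is worth an analyst's attention.
        sources = {source for source, _ in self._observations[entity]}
        return len(sources) >= min_sources


if __name__ == "__main__":
    store = ContextStore()
    store.observe("steve", "social_media", "Rocked the slopes today!")
    print(store.is_actionable("steve"))  # a single tweet on its own
    store.observe("steve", "claims", "Back injury work comp claim")
    print(store.is_actionable("steve"))  # tweet plus an open claim
```

The point of the design is that no single record triggers anything; the value comes from holding observations together under one entity.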


    Big Data Technologies

    [Diagram labels: Relational Database (highly structured); Teradata; Greenplum; Cassandra; CouchDB; MongoDB; SQL & Map / Reduce; NoSQL; Temporal, Unstructured, Heterogeneous; Hadoop; RDBMS; Sharding; HDFS Storage + Map / Reduce; Real Time Indexing; IBM]


    Next Generation Architecture for Delivering Information and Insights

    Big Data Platform

    Data Sources: Structured, Operational, Unstructured, External, Social, Sensor, Geospatial, Time Series, Streaming

    Platform components: Real-Time Analytics; Landing, Exploration & Archive; Enterprise Warehouse; Data Marts; Analytic Appliances

    Information Movement, Matching & Transformation

    Security, Governance and Business Continuity

    Information & Insight: BI & Performance Management; Predictive Analytics & Modeling; Exploration & Discovery


    Big Data Use Cases


    Government Organizations can Improve Operations and Outcomes With Big Data

    Data sources: Massive Amounts of Internal Data; Satellite/Aerial Imagery; Email, Fax; Sensors; Video; Social Media; GPS; Current Events

    Social Services: Predict current and future needs of citizens and design programs accordingly, while preventing fraud and abuse.

    Revenue Management & Tax Compliance: Have real time insights into program budgets, and address the tax gap through optimized auditing.

    Police: Have a holistic view of perpetrators, suspects and victims and have insights to beat criminals to the scene.

    National Borders & Security: Identify and respond to potential threats before they materialize.

    Defense: Make better command and control decisions and improve the tracking of strategic and operational assets.

    Transportation: Improve traffic flows and reduce emissions based on real-time traffic and weather data.


    Tax and social program fraud, abuse and errors

    An integrated approach to fighting fraud, abuse and error in tax and social programs

    Data sources: Geospatial; Social Media; Tax Files; Benefit Payments; Medical Files

    New Capability

    New fraud clues revealed

    Real-time information sharing across government & private industry

    Deep medical & benefits records text analytics

    Faster and more accurate predictive models

    Outcomes

    Reduce overpayments

    Minimize tax gap

    Proactively detect & deter fraud

    Reduce analysis time


    Reducing Fraud and Enabling Better Outcomes

    Major government medical and social benefits agency

    Identified an improper payment level for a particular benefit of over 40%, worth over $140 Million

    Performed analysis in hours, instead of weeks

    Ad-hoc analysis of over 70 data sources, including: in-patient, out-patient, prescriptions, financial records, notices of death, criminal data, many others

    Utilizes analytic data warehouse appliance


    Threat & Crime prediction and prevention

    Identify and respond to threats and crime before they materialize

    Data sources

    Geospatial: Location data

    Social Media: Search, blogs, tweets, text messages

    Entities & Relationships: Persons of interest, targets, watch lists

    Sensors: optical, acoustic, thermal, chemical, etc.

    Imagery: Satellite, aerial, camera

    New Capability

    Continuous ingest of relevant structured and unstructured data

    Holistic entity or activity-centric picture across multiple data sources and types of intelligence

    Outcomes

    More reliable understanding of a suspect, target or area of interest

    Finds the dots, connects them

    Helps analysts understand what they don't know


    Threat and Crime Prediction & Prevention

    Memphis Police Department

    Recognize crime trends as they are happening; enables changing tactics and redirecting resources before crime happens

    Integrates heterogeneous data, statistical modeling/analysis and GIS

    30% reduction in serious crime overall; 36% reduction in one targeted area

    U.S. High Security Facility

    Needed a physical intrusion detector system able to detect, classify, locate and track potential threats above and below ground

    Data arrives at the extremely high data rate of 1.6 GB per second and is processed and transmitted in real-time

    Sensitive enough to distinguish between an animal and an intruder

    Uses stream computing platform


    Smarter Traffic

    Dublin City Council improves traffic flow by using big data analytics to predict bus arrival and transit times

    Improves bus service for citizens by helping ensure buses stay on schedule

    Reduces congestion by increasing visibility into traffic delays and speeding decision making by controllers

    Optimizes planning with deeper insight into long-term traffic and usage patterns

    Business Challenge: To improve public transport services, Dublin City Council sought a way to dynamically monitor the movement of each of the city's 1,000 buses and better gauge if each one was operating on time.

    The Smarter Solution: The city deployed an intelligent traffic control solution that uses geospatial data from GPS-equipped buses to visually display the near-real-time position of each bus on a digital city map. Controllers can locate areas experiencing delays at a glance and instantly drill down to live camera feeds to identify root causes. Predictive analytics take into account speed, traffic flow and other factors to continually generate up-to-date estimates for bus arrival and transit times.

    "Our traffic managers can make more informed decisions, based on what's happening on our bus routes at any point in time. That's a powerful tool."

    Brendan O'Brien, head of technical services, roads and traffic department


    Social Media Analytics (Citizen Insight)

    Create Relationships. Build Advocacy. Improve Service.

    Analytics that listen, measure and analyze social media performance to more effectively:

    Enhance citizen relationships: Understand citizens' needs to target new services cost-effectively through different social media channels

    Enhance Service Outcomes: Evaluate your reputation and make evidence-based decisions that target the right stakeholders at the right time

    Improve citizen experience: Respond more quickly with accurate, timely and relevant insight into citizens' requests to ensure a consistent experience across all channels


    Smarter Cities

    A city in France uses social data analysis to better understand and respond to citizens' top issues, including their misconceptions

    1.6 million comments analyzed to pinpoint 100,000 unique comments for more precise analysis

    93% reduction in average response time to road maintenance issues, from 15 days to 1 day

    Boosts public relations as well as urban planning and policy development

    Business Challenge: This city in France was challenged with deciphering the concerns, ideas and expectations its residents were voicing on a wide range of topics. With a booming population, the city government struggled to keep up with the flood of comments streaming through social media.

    The Smarter Solution: The city uses a social data analytics solution to analyze citizens' opinions posted on public social media, taking into account factors including context, content and sentiment. The insights help the city identify and prioritize citizens' most prevalent and pressing issues as well as understand how messages may be resonating. This has helped the city clear up misunderstandings about a major revitalization of the city center, allaying anxieties about traffic and construction and gaining support for the project.

    "The city government is now perfectly in tune with its citizens, enabling it to offer them solutions and responses to accurately meet their expectations. This is a significant step in managing its urban policy."

    Director of communications


    Use Cases


    Major International Airport

    Challenges: PCI Compliance due to credit card transactions processed by Airport Agency

    Solution: PCI app

    Delivered a centralized view into users and in-scope system activities


    Federal Agencies

    Challenges: Ability to respond to incidents by analyzing massive amounts of network and IT infrastructure logs; FISMA / NIST Compliance

    Solution: FISMA app


    Healthcare Information Delivery

    Challenges: Frequent outages; Long processing time for claims; Claim Fraud

    Solution: Analyze data from healthcare claim processing platform; Monitoring and alerting, capacity planning


    Military Base

    Challenges: Fault Detection and Diagnosis; Continuous Commissioning; Energy Monitoring; Control Costs; Improve Operations; Save Energy

    Solution: bdoc Analytics

    Gaining insight into buildings' energy use and expense


    State Health & Human Services Agency

    Challenges: Frequent outages of online system; Lack of visibility on how public accesses the eligibility and welfare system

    Solution:

    Web Intelligence App: insight into user activity on site/app

    Operational Dashboards: Optimize developer time based on platform usage

    Detecting Fraud, e.g. number of lost/stolen EBT cards correlated with replacement cards ordered

    Reporting on State Wide Benchmarks, e.g.:

    Average amount recovered and saved per completed Medicaid provider investigation

    Number of citizens using call centers and the Internet to apply for Medicaid, benefits, and services

    Average daily case load for Child or Adult Protective Services

    Enhance and Improve customer satisfaction
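The fraud example on this slide, correlating lost/stolen EBT card reports with replacement cards ordered, comes down to counting both kinds of event per recipient and flagging unusually high counts. The sketch below shows the idea in plain Python rather than in any vendor's search language. The record layout (one recipient ID per event), the function name and the threshold of three are all assumptions chosen for illustration; a real agency would set its own rule.

```python
from collections import Counter


def flag_frequent_replacements(lost_reports, replacement_orders, threshold=3):
    """Return recipient IDs that meet the threshold for both event types.

    lost_reports and replacement_orders are iterables of recipient IDs,
    one entry per event (a hypothetical layout). The threshold is an
    illustrative value, not an official rule.
    """
    lost = Counter(lost_reports)
    replaced = Counter(replacement_orders)
    # A recipient is flagged only when lost/stolen reports and
    # replacement orders both reach the threshold.
    return sorted(
        recipient
        for recipient in lost
        if min(lost[recipient], replaced[recipient]) >= threshold
    )


if __name__ == "__main__":
    lost_events = ["A17", "A17", "A17", "B02", "C44", "C44"]
    replacement_events = ["A17", "A17", "A17", "B02", "C44"]
    print(flag_frequent_replacements(lost_events, replacement_events))
```

A flagged ID is only a lead for an investigator to review, since repeated replacements can have legitimate causes.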


    Monitoring Health Information Network

    Key Issues:

    Challenging to get visibility across complex platform

    Long time to resolve problems

    Reactive in addressing issues

    Need for high accuracy / uptime in information exchange

    200,000+ physicians | 1,000+ hospitals | 1,300+ health plans | 450+ industry partners


    City Police Department

    Challenges: Visualization of crime locations/types; Lack of visibility into communication

    Solution: Crime analysis statistics; Monitoring the logs of the radio communications; Audit who is accessing what system


    Customer Profiling with Set Top Box Activity

    Data Source: Provisioning; Customer searching & viewing behavior

    Business value:

    Customer intelligence to drive marketing / promotion campaigns

    Geo location mapping for better localized promotions

    Negotiate price/licensing rights based on actual movie/content demand


    Where & How Do I Start?


    TechAmerica Big Data Report Findings

    1. Understand the Art of the Possible

    2. Identify 2-4 key business or mission requirements and develop underpinning use cases that would create value for both the agency and the public.

    3. Take inventory of your data assets. Explore the data available both within the agency enterprise and across the government ecosystem within the context of use cases.

    4. Assess your current capabilities and architecture against what is required to support your goals.

    5. Explore which data assets can be made open and available to the public to help spur innovation outside the agency.

    http://www.techamericafoundation.org/bigdata


    Practical Big Data Roadmap

    Define: Define the Big Data opportunity including the key business and mission challenges, the initial use case or set of use cases, and the value Big Data can deliver

    Assess: Assess the organization's currently available data and technical capabilities against the data and technical capabilities required to satisfy the defined set of business requirements and use cases

    Plan: Select the most appropriate deployment pattern and entry point, design the "to be" technical architecture, and identify potential policy, privacy and security considerations

    Execute: Deploy the current phase Big Data project, maintaining the flexibility to leverage its investment to accommodate subsequent business requirements and use cases

    Review: Continually review progress, adjust the deployment plan as required, and test business process, policy, governance, privacy and security considerations


    For more information:

    ibm.com/bigdata splunk.com/bigdata

    [email protected]@us.ibm.com

    IBM Government Big Data E-book:

    http://www.ibm.com/common/ssi/cgi-bin/ssialias?subtype=BK&infotype=PM&appname=SWGE_IM_EZ_USEN&htmlfid=IMM14130USEN&attachment=IMM14130USEN.PDF


    Questions?