From What? To Discovery Enablement: Presented by Gabe Arnett, Moody's Analytics
TRANSCRIPT
October 13-16, 2016 • Austin, TX
From What? To Discovery Enablement
Gabe Arnett
Senior Director Search & Data Strategy
Moody’s Analytics
0.21 centuries playing with the 1s & 0s
Presently wrangling data management wild west and enabling discovery with a little help from the incredible folks @Lucidworks using #Fusion
Moody’s Analytics helps capital markets and risk management professionals worldwide respond to an evolving marketplace with confidence.
The company offers unique tools and best practices for measuring and managing risk through expertise and experience in credit analysis, economic research and financial risk management.
By providing leading-edge software, advisory services, and research, including the proprietary analysis of Moody’s Investors Service, Moody’s Analytics integrates and customizes its offerings to address specific business challenges.
Moody's Analytics is a subsidiary of Moody's Corporation (NYSE: MCO), which reported revenue of $3.3 billion in 2014, employs approximately 9,900 people worldwide and maintains a presence in 33 countries.
Further information is available at www.moodysanalytics.com
• search engine transformed into primary data store
  • Endeca != Database
  • Difficult to expand/integrate outside of Search
  • Just really bad decision!
• no joins + poor data model + duplication
• index & query performance <
• architectural dinosaur
  • Fault tolerance severe
  • Disk/IO HOG
  • Inefficient memory management
  • Bulky infrastructure
• Changes are painful and time-consuming
• Forget about sandboxing
• Development effort required ALWAYS!
• Regression testing required ALWAYS!
• Did I mention it's also the primary data store?
ITL(writes) MDEX(reads)
• Partial indexing runs every 4 mins
• Crawls MS SQL Server and SharePoint Document Library
• Average time to web ~5-10 mins
• > 16 million records with total index size ~100GB (BLOAT)
• > 400k page views a day across multiple applications
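To put the figures above in perspective, here is a rough back-of-the-envelope sketch. It uses only the numbers on the slide, plus two assumptions that are not from the talk: gigabytes are decimal, and changes arrive uniformly at random between partial-index runs.

```python
# Back-of-the-envelope figures derived from the numbers on this slide.
# Assumptions (not from the talk): decimal gigabytes, and changes arriving
# uniformly at random between partial-index runs.

RECORDS = 16_000_000          # slide says "> 16 million", so this is a floor
INDEX_BYTES = 100 * 10**9     # "~100GB" total index size
POLL_INTERVAL_MIN = 4         # partial indexing runs every 4 minutes


def bytes_per_record(index_bytes: int, records: int) -> float:
    """Average index footprint per record."""
    return index_bytes / records


def mean_wait_for_next_run(interval_min: float) -> float:
    """Expected wait until the next scheduled run for a uniformly timed change."""
    return interval_min / 2


per_record = bytes_per_record(INDEX_BYTES, RECORDS)
wait = mean_wait_for_next_run(POLL_INTERVAL_MIN)
print(f"{per_record:.0f} bytes of index per record (upper bound)")
print(f"{wait:.1f} min average wait before a change is even picked up")
```

Under these assumptions the index carries at most roughly 6 KB per record, and a change waits about 2 minutes on average just to be picked up. The rest of the reported ~5-10 minutes would then be crawl and indexing time.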
• evolve to survive
• significantly improve our users' experience
• Google-like precision and speed
• Single administration & configuration management interface
• Index data in clicks not hours/days/weeks
• Relevant results every time
• As it happens search analytics
• Superb scale
• Robust and simple interface
• Indexing is real time
• Crawls NoSQL cluster (MongoDB) (Eval Couchbase)
• Average time to web: see first bullet above
• Index size significantly smaller
• Able to support same load and then some
Fusion Solr NoSql Cluster
SEARCH ONLY
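The "search only" split can be sketched as follows: the search tier returns matching IDs, and the document store stays the system of record for full records. This is a minimal illustration with in-memory stand-ins. `DocumentStore` and `SearchIndex` are hypothetical names invented here, not Fusion, Solr, or MongoDB APIs, and the sample records are made up.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentStore:
    """Hypothetical stand-in for the NoSQL cluster: the system of record."""
    _docs: dict = field(default_factory=dict)

    def put(self, doc_id: str, doc: dict) -> None:
        self._docs[doc_id] = doc

    def get_many(self, doc_ids: list) -> list:
        return [self._docs[i] for i in doc_ids if i in self._docs]


@dataclass
class SearchIndex:
    """Hypothetical stand-in for the search tier: searchable terms and IDs only."""
    _postings: dict = field(default_factory=dict)

    def index(self, doc_id: str, text: str) -> None:
        for term in set(text.lower().split()):
            self._postings.setdefault(term, set()).add(doc_id)

    def search(self, query: str) -> list:
        # AND semantics: a document must contain every query term.
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self._postings.get(t, set()) for t in terms))
        return sorted(hits)


store = DocumentStore()
index = SearchIndex()

# Made-up sample records, for illustration only.
records = {
    "r1": {"title": "Credit risk outlook", "body": "full report text"},
    "r2": {"title": "Economic research brief", "body": "full brief text"},
}
for doc_id, doc in records.items():
    store.put(doc_id, doc)             # full record lives in the store
    index.index(doc_id, doc["title"])  # only searchable fields are indexed

ids = index.search("credit risk")      # search tier answers with IDs
results = store.get_many(ids)          # store supplies the full records
print([r["title"] for r in results])
```

Because the index holds only what is needed for matching, it can stay small, and it can be rebuilt from the store without risking the primary data.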
• Recommendation system
• Analytics
• ?