Adding Value To Information


Post on 08-Dec-2014








<ul><li> 1. Adding Value to Information via Analytics. Perspective from BA&amp;MS Research and Projects, May 2008</li></ul> <p> 2. Outline </p> <ul><li>Historical perspective. When can analytics enhance value of information? </li></ul> <ul><li>Using analytics to utilize information. </li></ul> <ul><li><ul><li>Supply chain </li></ul></li></ul> <ul><li><ul><li>Workforce management </li></ul></li></ul> <ul><li><ul><li>Carbon management </li></ul></li></ul> <ul><li>Using analytics to extract information. </li></ul> <ul><li><ul><li>Collaborative filtering, Netflix challenge </li></ul></li></ul> <ul><li><ul><li>ASCOT </li></ul></li></ul> <ul><li><ul><li>BANTER </li></ul></li></ul> <ul><li>Using analytics to collect information. </li></ul> <ul><li><ul><li>Prediction markets </li></ul></li></ul> <ul><li><ul><li>Peer-to-peer services </li></ul></li></ul> <ul><li><ul><li>Personal benchmarking </li></ul></li></ul> <p> 3. Information / analytic services start up when a new sector of economic activity begins to take off. Early-mover position in an emerging market is critical. [Timeline, 1900 to 2000: Information / Analytic Service Starting Points.] IMS Health: brand pharmaceutical market begins to take off. Polk Auto Registry Database: R.L. Polk meets with Alfred Sloan to discuss information needs in the growing auto market. A.C. Nielsen: network TV advertising opens up. Getty Images: digital photography takes over. Navteq: GPS becomes commercially usable. Moody's: stock market crash of 1907. aQuantive: Internet advertising begins to grow. Morningstar: take-off in individual mutual fund investing. Fair-Isaac: consumer credit goes mass market. 4. Outline </p> <ul><li>Historical perspective. When can analytics enhance value of information? </li></ul> <ul><li>Using analytics to utilize information. 
</li></ul> <ul><li><ul><li>Supply chain </li></ul></li></ul> <ul><li><ul><li>Workforce management </li></ul></li></ul> <ul><li><ul><li>Carbon management </li></ul></li></ul> <ul><li>Using analytics to extract information. </li></ul> <ul><li><ul><li>Collaborative filtering, Netflix challenge </li></ul></li></ul> <ul><li><ul><li>ASCOT </li></ul></li></ul> <ul><li><ul><li>BANTER </li></ul></li></ul> <ul><li>Using analytics to collect information. </li></ul> <ul><li><ul><li>Prediction markets </li></ul></li></ul> <ul><li><ul><li>Peer-to-peer services </li></ul></li></ul> <ul><li><ul><li>Personal benchmarking </li></ul></li></ul> <p> 5. Utilizing Information </p> <ul><li>We consider situations where information is already available</li></ul> <ul><li><ul><li>From ERP or other business process automation tools </li></ul></li></ul> <ul><li><ul><li><ul><li>Historical data</li></ul></li></ul></li></ul> <ul><li><ul><li><ul><li>Some enterprise generated view of the future </li></ul></li></ul></li></ul> <ul><li><ul><li>May be combined with purchased data from information services </li></ul></li></ul> <ul><li><ul><li>Most examples now are within an enterprise or an enterprise driven value net </li></ul></li></ul> <ul><li>We focus on the case where analytics are applied to the information with the goal of optimizing the use of resources</li></ul> <ul><li>Examples: </li></ul> <ul><li><ul><li>Supply Chain </li></ul></li></ul> <ul><li><ul><li>Workforce management </li></ul></li></ul> <ul><li><ul><li>Carbon management </li></ul></li></ul> <p> 6. 
Supply Chain Collaboration: IBM Buy Analysis Tool (iBAT). Improve Inventory Cost in IBM's Extended Supply Chain. Business Problem, Solution, Business Value </p> <ul><li><ul><li>A significant percentage of IBM's hardware sales in high-velocity servers goes through major channel partners such as Arrow, Ingram, and Tech Data.</li></ul></li></ul> <ul><li><ul><li>Lack of alignment between procurement, manufacturing, and channel sales resulted in significant price protection and sales incentive costs for IBM and high inventory-related costs for our channel partners </li></ul></li></ul> <ul><li><ul><li>Web-based collaboration platform for IBM's channel replenishment planning that combines innovative forecasting and inventory analytics with up-to-date visibility of channel sales and inventory data </li></ul></li></ul> <ul><li><ul><li>Optimized buy recommendations for channel partners based on statistical forecasting techniques and risk-optimized inventory replenishment models </li></ul></li></ul> <ul><li><ul><li>Proactive review system that initiates demand shaping based on supply and demand imbalances </li></ul></li></ul> <ul><li><ul><li>Standard SOA-based solution design which can easily be adapted to specific ERP environments </li></ul></li></ul> <ul><li><ul><li>Patent-pending methodology </li></ul></li></ul> <ul><li><ul><li>Cornerstone of IBM Server Group's Business Partner Transformation Initiative </li></ul></li></ul> <ul><li><ul><li>Fully deployed with IBM's largest channel partners across the United States, Canada, and Europe </li></ul></li></ul> <ul><li><ul><li>Solution enables business partners to carry 15-25% less inventory without negatively impacting their delivery performance </li></ul></li></ul> <ul><li><ul><li>Lower channel inventory resulted in lower price protection expenses for IBM, improved cash flow, and higher operating margins</li></ul></li></ul> <p> 7. 
Available to Sell (ATS). Find saleable product recommendations to consume excess inventory. Business Problem, Solution, Business Value </p> <ul><li><ul><li>With shrinking product lifecycles, component supply overages can quickly lead to obsolescence requiring costly inventory write-offs. One way to avoid these costs is to find products to build and sell that would consume the excess supply. </li></ul></li></ul> <ul><li><ul><li>In a complex product environment such as IBM Servers, product build-out typically requires additional procurement of non-excess parts to square with the excess supplies. With part commonality across many possible product configurations, this leads to an enormous number of potential build-out strategies to choose from. Additional factors such as part substitution, re-work costs, and marketing constraints make this a difficult optimization problem. </li></ul></li></ul> <ul><li><ul><li>ATS Engine uses IBM's Watson Implosion Technology to find the optimal sales recommendation portfolio given: excess part supplies, bill of material, procurement and value-add costs, product demand upper bounds, and product pricing. </li></ul></li></ul> <ul><li><ul><li>Pegging module assigns excess consumption and additional costs to each product in the sales recommendation, allowing users to pick which build-outs to execute and promote in the market. </li></ul></li></ul> <ul><li><ul><li>What-if capability enables users to cost a targeted build-out plan, supporting end-of-life processes. </li></ul></li></ul> <ul><li><ul><li>ATS Engine and Process fully deployed in IBM's Systems Technology Group since 2002. 
</li></ul></li></ul> <ul><li><ul><li>Solution drove build-outs and sales recommendations which consumed $200 million worth of excess inventory in 2002.</li></ul></li></ul> <ul><li><ul><li>Ongoing usage of the tool keeps excess supply from becoming obsolete.</li></ul></li></ul> <ul><li><ul><li>System is integrated with IBM's Central Planning Engine with Web-based, on-demand availability within IBM STG. </li></ul></li></ul> <p> 8. Application Areas in Workforce Management. Many opportunities to improve workforce management through utilization of information. [Figure: January-to-December timeline of application areas: demand forecasting, capacity planning, strategic planning, training and learning, skill &amp; engagement analytics, matching &amp; scheduling; markers for Now and Target.] 9. Workforce challenges: the data is distributed across many enterprise applications </p> <ul><li>There is no single Enterprise Resource Planning tool for labor management </li></ul> <ul><li>Supply (given in terms of roles or skills) </li></ul> <ul><li><ul><li>Traditional HR systems contain information about the current job </li></ul></li></ul> <ul><li><ul><li><ul><li>Structured: Position code, salary, location, shift, etc. </li></ul></li></ul></li></ul> <ul><li><ul><li><ul><li>Unstructured: Education, IBM courses, dept history, awards</li></ul></li></ul></li></ul> <ul><li><ul><li>New Job Role/Skill Set with job taxonomy and skill list </li></ul></li></ul> <ul><li><ul><li><ul><li>Full Text Resumes </li></ul></li></ul></li></ul> <ul><li>Demand (given in terms of engagements or contracts) </li></ul> <ul><li><ul><li>Past and Current Contracts (and history of deal closure)</li></ul></li></ul> <ul><li><ul><li>New opportunities: Sales Opportunity Database </li></ul></li></ul> <ul><li>Missing link </li></ul> <ul><li><ul><li>Bill of resources = set of skills required to deliver an engagement </li></ul></li></ul> <ul><li><ul><li>But billing database includes detail (by individual) on employees' participation in engagements </li></ul></li></ul> <ul><li><ul><li>And 
additional sources include contractor/engagement data</li></ul></li></ul> <p> 10. Business Consulting Examples. Engagements can range from one month and one skill set to more than 10 months, 16K hours, and a wide range of job roles/skill sets. Weekly variations appear to be driven by calendar effects, vacation schedules, and resource availability. [Figure: Supply Chain-PLM Engagements.] 11. </p> <ul><li>Several different sources of data: high-level account information, such as </li></ul> <ul><li><ul><li><ul><li>Client name </li></ul></li></ul></li></ul> <ul><li><ul><li><ul><li>Account description </li></ul></li></ul></li></ul> <ul><li><ul><li><ul><li>Offering information </li></ul></li></ul></li></ul> <ul><li><ul><li><ul><li>Billing (Fixed price, best estimate)</li></ul></li></ul></li></ul> <ul><li><ul><li>Ledger information </li></ul></li></ul> <ul><li><ul><li><ul><li>Project cost, revenue </li></ul></li></ul></li></ul> <ul><li><ul><li>Labor claiming information </li></ul></li></ul> <ul><li><ul><li><ul><li>Hours claimed per week by each employee on a project </li></ul></li></ul></li></ul> <ul><li><ul><li>Employee information </li></ul></li></ul> <ul><li><ul><li><ul><li>Line of Business, Job Role, Skill Set, global resource, etc.</li></ul></li></ul></li></ul> <ul><li>For US contracts over past 18 months </li></ul> <ul><li><ul><li>Approximately 10K accounts </li></ul></li></ul> <ul><li><ul><li>More than 2M labor claim records </li></ul></li></ul> <p>Analysis of Data to Estimate Bill of Resources</p> <ul><li>Data Issues</li></ul> <ul><li><ul><li>Can't tell if an individual is deployed in their primary Job Role/Skill Set (JR/SS)</li></ul></li></ul> <ul><li><ul><li>JR/SS table has current state only </li></ul></li></ul> <ul><li><ul><li><ul><li>Beginning to collect longitudinal data</li></ul></li></ul></li></ul> <ul><li><ul><li>High % of missing JR/SS information </li></ul></li></ul> <ul><li><ul><li><ul><li>JR/SS not tracked consistently at subcontractor or global resource level</li></ul></li></ul></li></ul> 
<ul><li><ul><li><ul><li>No information for consultants no longer with IBM </li></ul></li></ul></li></ul> <ul><li>Over 400 valid JR/SS combinations </li></ul> <ul><li>Account descriptions give little to no indication of scope of work </li></ul> <p>History reflects what actually happened, not necessarily best practice. 12. Engagement Profiling </p> <ul><li>Service offerings/opportunities are typically specified in terms of revenue and solution </li></ul> <ul><li><ul><li>Using statistical analysis and clustering, develop template staffing structures for offerings, which can be used to translate offering revenue forecasts and opportunity revenue into staffing resource requirements </li></ul></li></ul> <ul><li><ul><li>Semi-automated and parameterized process for generating staffing templates and supporting software </li></ul></li></ul> <ul><li>Value </li></ul> <ul><li><ul><li>Standardized project templates allow for planning of staffing decisions at earlier stages of the engagement process, more reliable forecasting of resource needs, and better workforce planning </li></ul></li></ul> <ul><li><ul><li>Enables partners/project managers to quickly develop staffing plans early in the opportunity cycle </li></ul></li></ul> <ul><li><ul><li>Predictive accuracy of 70-80% at engagement level and 90-95% at aggregate level for major job roles </li></ul></li></ul> <ul><li><ul><li>Deployed by GBS in the Demand Capture Tool 2.1 released in December 2006 </li></ul></li></ul> <p>Client Name: ABC; Plan Names: (no value shown); Linked to other projects?: No; Estimated Revenue: 4700000; Start Date: 1/2/2004; End Date: 12/31/2004; Project Type: Package Configuration and Implementation; Modules: SAP.SCM; ISV: SAP; Service: Supply Chain Management; Sector: Industrial. 13. Risk-Based Capacity Planning. Allows development of capacity plans according to business strategy. 
The best solution will be based on a combination of expected revenues/costs/profits, allowed risk tolerances with respect to revenue loss, and other business concerns, such as market share and growth. [Figure: Technology Adoption Product Services, US, 3Q05. Labels: Revenue at Risk ($M), Capacity. Curves: revenue, labor cost, gross profit.] 14. Workforce Does Not Happen Overnight. The use of analytics and optimization in workforce management applications requires significant maturity levels in terms of data, process, and business understanding. [Figure labels: Automation; Job taxonomies (how to describe skills and activities); View of supply (infrastructure and process to capture available resources); Bills of materials (templates to describe projects/tasks to be performed); View of demand (infrastructure, process, and analytics to forecast demand); Analytics &amp; Optimization; Nothing.] 15. Carbon as a New Variable in Supply Chain Decisions </p> <ul><li>Typical supply chain optimization only considers the direct monetary costs </li></ul> <ul><li>Inventory and supply policies can be significantly different with the inclusion of broader environmental costs and constraints </li></ul> <ul><li>A good model can quantify both the cost and the carbon impact of various supply chain policies. </li></ul> <ul><li>A comprehensive model can identify areas where carbon and cost reduction can be achieved simultaneously (e.g., minimization of wastage, rework, etc.) </li></ul> <p>[Figure: Supply Chain Trade-offs among quality, CO2, cost, and service, across transportation, inventory policy, design, energy, packaging, process, and component options.] 16. 
Any Supply Chain Carbon View Must Be Multi-Dimensional. [Figure: cost dimensions: shrinkage ($, CO2 cost); breakage ($, CO2 cost); real estate ($ cost); handling ($, CO2 cost); transportation ($, CO2 cost); utilities ($, CO2 cost); manufacturing ($, CO2 cost); component supply ($, CO2 cost). Option categories: packaging, transportation, energy, inventory policy, process, supply.] 17. Green Sigma™ Carbon Management Dashboard 18. Outline </p> <ul><li>Historical perspective. When can analytics enhance value of information? </li></ul> <ul><li>Using analytics to utilize information. </li></ul> <ul><li><ul><li>Supply chain </li></ul></li></ul> <ul><li><ul><li>Workforce management </li></ul></li></ul> <ul><li><ul><li>Carbon management </li></ul></li></ul> <ul><li>Using analytics to extract information. </li></ul> <ul><li><ul><li>Collaborative filtering, Netflix challenge </li></ul></li></ul> <ul><li><ul><li>ASCOT </li></ul></li></ul> <ul><li><ul><li>BANTER </li></ul></li></ul> <ul><li>Using analytics to collect information. </li></ul> <ul><li><ul><li>Prediction markets </li></ul></li></ul> <ul><li><ul><li>Peer-to-peer services </li></ul></li></ul> <ul><li><ul><li>Personal benchmarking </li></ul></li></ul> <p> 19. Extracting Information </p> <ul><li>We consider situations where vast amounts of data are available.</li></ul> <ul><li><ul><li>Typically a mix of structured and unstructured data </li></ul></li></ul> <ul><li><ul><li>Often incomplete and/or noisy data </li></ul></li></ul> <ul><li>Data may come from multiple sources, but typically includes at least some private data. </li></ul> <ul><li>The data owner wants to use the data to improve some aspect of the business operations, but a specific business objective is typically not fully articulated. </li></ul> <ul><li>Analysis (and pre-analysis data preparation) needs to be automated. </li></ul> <ul><li>Examples: </li></ul> <ul><li>KDD Cup and Netflix Challenge </li></ul> <ul><li>ASCOT </li></ul> <ul><li>BANTER </li></ul> <p> 20. 21. 
October 2006 Announcement of the Netflix Competition</p> <ul><li>USA Today headline:</li></ul> <ul><li> Netflix offers $1 million prize for better movie recommendations </li></ul> <ul><li>Details: </li></ul> <ul><li>Beat Netflix's current recommender model, Cinematch, by 10% based on absolute rating error prior to 2011 </li></ul> <ul><li>$50,000 for the annual progress prize (relative to baseline) </li></ul> <ul><li>Data contains a subset of 100 million movie ratings from Netflix, including 480,189 users and 17,770 movies </li></ul> <ul><li>Performance is evaluated on holdout movie-user pairs </li></ul> <ul><li>The Netflix competition has attracted 24,396 contestants on 19,799 teams from 155 different countries</li></ul> <ul><li>25,115 valid submissions from 3,335 different teams</li></ul> <ul><li>Current best result is 9.08% better than baseline (up from 6.7% as of March 2007) </li></ul> <p> 22. KDD-Cup 2007 </p> <ul><li>The 2007 KDD-Cup was based on a subset of the Netflix prize data </li></ul> <ul><li><ul><li>The Netflix grand prize competition (a different task on the same data) has attracted 24,396 contestants on 19,799 teams from 155 different countries (no IBM participants due to IP issues) </li></ul></li></ul> <ul><li><ul><li>The data contains a subset of 100 million movie ratings from Netflix, including 480,189 users and 17,770 movies </li></ul></li></ul> <ul><li><ul><li>Ratings of users and movies were collected from Nov-1999 until Dec-2005 </li></ul></li></ul> <ul><li>Task 1: Who rated what in 2006 </li></ul> <ul><li><ul><li>Given a list of 100,000 pairs of users and movies, predict for each pair the probability that the user rated the movie in 2006 </li></ul></li></ul> <ul><li>Task 2: Number of ratings per movie in 2006 </li></ul> <ul><li><ul><li>Given a list of 8,863 movies, predict the number of additional reviews that all existing users will give in 2006 </li></ul></li></ul> <p> 23. 
Task 1: Probability of a member rating a movie </p> <ul><li>Extracted features: </li></ul> <ul><li><ul><li>Movie-based features </li></ul></li></ul> <ul><li><ul><li><ul><li>Graph topology: # of ratings per movie (across different years), adjacent scores between movies calculated using SVD on the graph matrix </li></ul></li></ul></li></ul> <ul><li><ul><li><ul><li>Movie content: similarity of two movies calculated using Latent Semantic Indexing based on bag of words from (1) plots of the movie and (2) other information, such as director, actors</li></ul></li></ul></li></ul> <ul><li><ul><li>User profile </li></ul></li></ul> <ul><li><ul><li><ul><li>Graph topology: # of ratings per user (across different years), adjacent scores between users in the graph calculated using SVD </li></ul></li></ul></li></ul> <ul><li><ul><li><ul><li>User content: user preference based on the movies being rated: keyword match count </li></ul></li></ul></li></ul> <ul><li>Learning Algorithm: </li></ul> <ul><li><ul><li>Single classifiers: logistic regression, ridge regression, decision tree, support vector machines (best run: RMSE = 0.2647) </li></ul></li></ul> <ul><li><ul><li>Naive ensemble: combining sub-classifiers built on different types of features with pre-set weights (best run: RMSE = 0.2642) </li></ul></li></ul> <ul><li><ul><li>Ensemble classifiers: combining sub-classifiers with weights learnt from the development set (best run: RMSE = 0.2629) </li></ul></li></ul> <p> 24. Task 2: Number of additional ratings per movie</p> <ul><li>Perfo