Analytics Magazine



http://www.analytics-magazine.com

May/June 2011 | Driving Better Business Decisions

ALSO INSIDE:
• Behavior Segmentation: Five best practices
• Data Mining Survey: Trends and new insights
• Sports Law Analytics: High-stakes litigation
• Simulation Frameworks: Keys to dashboarding

BANKING ON BETTER DAYS
POST-CRISIS ANALYSIS: CREDIT RISK MANAGEMENT
LESSONS LEARNED THE HARD WAY

EXECUTIVE EDGE: Michael Kubica of Applied Quantitative Sciences on simulation in strategic forecasting


INSIDE STORY

Failure to communicate

When confronted with complex business analytics problems that beg for mathematical modeling, the reactive first response is, “Show me the data.” However, based on one of the recurring themes that came out of the recent INFORMS Conference on Business Analytics & Operations Research held in Chicago, the proper first response is, “Tell me about your business.”

After all, how can you solve someone’s business problem if you don’t first thoroughly understand their business and all of the behind-the-scenes issues, constraints and personality conflicts that will ultimately – and perhaps surprisingly – impact the outcome of the project and implementation of the solution?

“Expert interviewing,” as it’s sometimes referred to, is the unappreciated yet critical art of ascertaining client information up front that will often determine the success or failure of an analytics project. Who hasn’t had a promising project and/or elegant solution scuttled or never applied because someone forgot to ask the right questions early on?

I was reminded of this basic tenet of analytical problem-solving on my way home from the Chicago conference. A fellow attendee and I, over lunch, determined that we both had 7 p.m. flights home that evening. We agreed to share a taxi to the airport, a more efficient mode of transportation and a means of saving us both a few bucks. We met in the hotel lobby at the appointed hour as planned, climbed into a taxi, and the driver asked where we were going. I said, “Midway,” at the exact moment my fellow attendee said, “O’Hare.”

We had never asked each other during our initial conversation which airport we were flying from. We scrambled out of the taxi and looked around for other options, our original “efficient” plan doomed because of a classic “failure to communicate.”

By the way, the conference was terrific, from UPS Vice President Chuck Holland’s powerful opening plenary to the Oscar-like Edelman Award gala. See you at the 2012 event April 15-17 in Huntington Beach, Calif. ❙

– Peter Horner, [email protected]


CONTENTS

FEATURES

Economically Calibrated Models
By Andrew Jennings and Carolyn Wang
Lessons learned the hard way: the secret to better credit risk management.

Risk in Revenue Management
By Param Singh
Acknowledging risk’s existence and knowing how to minimize it.

Behavioral Segmentation
By Talha Omer
Five best practices in making embedded segmentation highly relevant.

Understanding Data Miners
By Karl Rexer, Heather N. Allen and Paul Gearan
Data miner survey examines trends and reveals new insights.

Sports Law Analytics
By Ryan M. Rodenberg and Anastasios Kaburakis
Analytics proving to be dispositive in high-stakes sports industry litigation.

Simulation Frameworks
By Zubin Dowlaty, Subir Mansukhani and Keshav Athreya
The key to building successful dashboards for displaying and deploying metrics.


Register for a free subscription: http://analytics.informs.org

INFORMS BOARD OF DIRECTORS

President: Rina Schneur, Verizon Network & Tech.
President-Elect: Terry P. Harrison, Penn State University
Past President: Susan L. Albin, Rutgers University
Secretary: Anton J. Kleywegt, Georgia Tech
Treasurer: Nicholas G. Hall, The Ohio State University
Vice President-Meetings: William Klimack, Decision Strategies, Inc.
Vice President-Publications: Linda Argote, Carnegie Mellon University
Vice President-Sections and Societies: Barrett Thomas, University of Iowa
Vice President-Information Technology: Bjarni Kristjansson, Maximal Software
Vice President-Practice Activities: Jack Levis, UPS
Vice President-International Activities: Jionghua “Judy” Jin, Univ. of Michigan
Vice President-Membership and Professional Recognition: Ozlem Ergun, Georgia Tech
Vice President-Education: Joel Sokol, Georgia Tech
Vice President-Marketing and Outreach: Anne G. Robinson, Cisco Systems Inc.
Vice President-Chapters/Fora: Stefan Karisch, Jeppesen

INFORMS OFFICES

www.informs.org • Tel: 1-800-4INFORMS
Executive Director: Melissa Moore
Marketing Director: Gary Bennett
Communications Director: Barry List

Corporate, Member, Publications and Subdivision Services
INFORMS (Maryland)
7240 Parkway Drive, Suite 300
Hanover, MD 21076 USA
Tel.: 443.757.3500
E-mail: [email protected]

Meetings Services
INFORMS (Rhode Island)
12 Breakneck Hill Road, Suite 102
Lincoln, RI 02865 USA
Tel.: 401.722.2595
E-mail: [email protected]

ANALYTICS EDITORIAL AND ADVERTISING

Lionheart Publishing Inc., 506 Roswell Street, Suite 220, Marietta, GA 30060 USA
Tel.: 770.431.0867 • Fax: 770.432.6969

President & Advertising Sales: John Llewellyn, [email protected], Tel.: 770.431.0867, ext. 209
Editor: Peter R. Horner, [email protected], Tel.: 770.587.3172
Art Director: Lindsay Sport, [email protected], Tel.: 770.431.0867, ext. 223
Advertising Sales (A-K): Aileen Kronke, [email protected], Tel.: 770.431.0867, ext. 212
Advertising Sales (L-Z): Sharon Baker, [email protected], Tel.: 813.852.9942

Analytics (ISSN 1938-1697) is published six times a year by the Institute for Operations Research and the Management Sciences (INFORMS). For a free subscription, register at http://analytics.informs.org. Address other correspondence to the editor, Peter Horner, [email protected]. The opinions expressed in Analytics are those of the authors, and do not necessarily reflect the opinions of INFORMS, its officers, Lionheart Publishing Inc. or the editorial staff of Analytics. Analytics copyright ©2011 by the Institute for Operations Research and the Management Sciences. All rights reserved.

DEPARTMENTS

Inside Story, by Peter Horner
Executive Edge, by Michael Kubica
Profit Center, by E. Andrew Boyd
Analyze This!, by Vijay Mehrotra
Newsmakers
Corporate Profile, by Chris Holliday
The Five-Minute Analyst, by Harrison Schramm
Thinking Analytically, by John Toczek

EXECUTIVE EDGE

Executive briefing on simulation in strategic forecasting
By Michael Kubica

Any strategic forecast is by definition a representation of the future. Understanding that a single estimate of the future is not truly representative, Monte Carlo simulation is a powerful alternative to the “best estimate” forecast. Business leaders are becoming increasingly aware of the deficiencies inherent in traditional forecasting methods. And the past two decades have ushered in an explosion of tools to facilitate novice and expert alike in applying Monte Carlo simulation.

Though the growth and accessibility of these tools have been staggering, less prolific has been the adoption of the methodologies these powerful tools enable. Why? Part of the answer lies in a lack of understanding of what a simulation forecast is, what the relative merits and limitations are, and when it is most appropriate to consider using it. In this article I address these questions.

WHAT IS A SIMULATION FORECAST?

In a traditional forecast, input assumptions are mathematically related to each other in a model. Based on these defined mathematical relationships, model outputs are calculated, such as market units sold, market share and revenue. The model may be simple or very elaborate. The defining characteristic is that inputs are defined as single point values, or “best estimates.”

This type of model has been the staple of business for many years. It can answer questions such as, “If all of our assumptions are perfectly accurate, we can expect ...” However, experience has shown that all of the assumptions are not perfectly accurate. We are forecasters after all, not fortunetellers!

Simulation can remedy this problem. Instead of defining input variables as single point estimates, we define them as probability distributions representing the range of uncertainty associated with the variable being defined. These “ranged” variables are fed into the exact same forecast model. When the simulation model is run, we sample each input variable’s distribution thousands of times and relate each instance of the distribution samples within the forecast model structure. Because we have defined the inputs as uncertainties, the outputs represent all of these uncertainties in a simulation forecast. Instead of a single line on the graph, we may represent an infinite number of lines, bounded by the possibilities constrained by the input distributions. Of course, we summarize these probabilistic outputs according to the confidence intervals relevant to the decision at hand.
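To make the mechanics concrete, here is a minimal sketch of such a simulation forecast in Python. The model (units × share × price) and every distribution parameter are invented for illustration and are not taken from the author’s work; triangular distributions are used because they are specified by exactly the minimum, best estimate and maximum discussed below.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 10_000  # number of simulation trials

# Each assumption is a distribution rather than a single point estimate.
market_units = rng.triangular(80_000, 100_000, 140_000, N)  # total market size
market_share = rng.triangular(0.10, 0.15, 0.22, N)          # share captured
price = rng.triangular(40.0, 50.0, 55.0, N)                 # net price per unit

# The forecast model itself is unchanged from the point-estimate version.
revenue = market_units * market_share * price

# Summarize the probabilistic output at decision-relevant confidence levels.
p10, p50, p90 = np.percentile(revenue, [10, 50, 90])
print(f"P10 ${p10:,.0f}   P50 ${p50:,.0f}   P90 ${p90:,.0f}")
```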

THE RELATIVE MERITS AND LIMITATIONS

Simulation forecasts have several important advantages over single point estimates. First, assuming that the input variables are representative of the full range of possible values, along with best estimates where available, we have what may be referred to as a representative forecast. A representative forecast incorporates all currently available information, including uncertainty about future values. In this sense it is a truthful forecast.

A simulation forecast allows for an examination of both what is possible and how likely each of those possibilities is. We can examine the best estimate forecast in the context of the full range of possibility and discern true upside and downside risk. We gain these advantages without losing the ability to do specific scenario analyses. But now we can peer into the risk associated with achieving any defined scenario.

Simulation modeling does come at some cost, though. Rather than having a single input per assumption, you will define anywhere from one to four inputs, depending on the type of distribution being represented. This is because, in order to create a probability distribution to represent your uncertainty regarding the assumption, you will need to define the bounds of possibility for that variable (minimum, maximum) in addition to a best estimate, and potentially a “peakedness” variable.

This makes the model appear more complex and can make it seem more daunting to users (the truth is, the model itself has not changed from the point-estimate version, given that it was appropriately specified to begin with). This appearance of increased complexity can contribute to a “black box” perception among model consumers. Avoiding this issue is often as simple as explaining that the expanded input set is nothing more than a representation of the diligence that (hopefully) already goes into formulating the best estimate in a traditional forecast, and that leveraging all of this additional information improves understanding and decision-making.

Simulation outputs cannot always be interpreted the same way as traditional forecast outputs. It is therefore prudent to hold an orientation meeting with model consumers to discuss how to interpret results and to address common misapplications of simulation outputs. While it is not necessary for users to understand the theory per se, it is important to keep them from multiplying percentiles together or misinterpreting what the probabilistic outputs mean. A small investment here can go a very long way in creating value from the forecasting process.
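One common misapplication mentioned above, multiplying percentiles of independent inputs, is easy to demonstrate with a few lines of Python (the distributions are again invented for illustration): the 10th percentile of a product is not the product of the 10th percentiles.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.triangular(8, 10, 15, 100_000)    # two independent uncertain inputs
b = rng.triangular(40, 50, 70, 100_000)

p10_of_product = np.percentile(a * b, 10)
product_of_p10s = np.percentile(a, 10) * np.percentile(b, 10)
print(p10_of_product, product_of_p10s)    # the two values differ noticeably
```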

WHEN SHOULD SIMULATION BE CONSIDERED AS THE METHODOLOGY OF CHOICE?

I once attended a pharmaceutical portfolio management conference where I heard one of the speakers say (in the context of creating forecasts to drive portfolio analysis): “Simulation is OK, but you better be really sure you are right about your assumptions if you are going to use it!” I was astounded that such misinformation was coming from someone put forward as an expert. Nothing could be further from the truth. The less certain you are about the assumptions, and the more there is at stake in the decisions being made from the model, the more appropriate and important simulation forecasting is. This is especially true if the cost associated with being wrong significantly exceeds the cost of the incremental resources to define that uncertainty.

In summary, simulation forecasting is a powerful methodology not just for understanding the possible future outcomes, but for establishing a truthful representation of how likely any single scenario is within the range of possibilities. Because strategic (long-term) forecasting is inherently risky and driven by many uncertain variables, adding Monte Carlo simulation to your forecasting tool chest can create enormous value. ❙

Michael Kubica is president of Applied Quantitative Sciences, Inc. Please send comments to [email protected].


PROFIT CENTER

Learning by example
By E. Andrew Boyd

Three traits of successful analytics projects.

When speaking about analytics, or any other topic for that matter, it’s easy to be drawn into generalities. “Forecasts improve profits.” “Information on past purchases can be used to increase sales.” Generalities are important. They help us navigate environments crowded with details. But details provide important lessons that generalities can’t, helping us learn by example.

In this column we look at one particular screen in one particular software system. It’s not overly complicated, but it illustrates three general traits common to many successful applications of analytics.

To understand the context in which the screen is used, consider the example of a charitable organization preparing to mail requests for donations. At its disposal is a large database of past donors. The charity has a fixed budget for mailing. The question is, “Who among the many past donors should receive a mailer?”

Analytics can be used to evaluate any number of factors. Are recent contributors more likely to give again, or is it better to target individuals who haven’t contributed in a while? Are people from certain geographic regions more likely to give than others? Analytics offers a multitude of mathematical tools for answering these questions and determining which customers are most likely to send a donation.

Whatever mathematical tools are chosen, however, the results can be easily and clearly communicated. The screen capture shown in Figure 1 is taken from SAS Enterprise Miner. On the horizontal axis is the percent of the population the charity might send mailers to. For example, 20 percent on the horizontal axis corresponds to the question, “Suppose we send mailers to the 20 percent of donors most likely to respond?” The vertical axis then fills in the blank. “By choosing the 20 percent of donors most likely to respond, we can expect a response (cumulative lift) about 1.7 times greater than if we send mailers to 20 percent of the donor population at random.” (The charity’s budget corresponds to a mailing that reaches 20 percent of the donor population.) The system arrives at this number by determining which customers are most likely to respond.

Figure 1: Screen shot illustrates three general traits common to many successful applications of analytics.
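The chart itself comes from SAS Enterprise Miner, but the quantity it plots is easy to reproduce. The sketch below, using invented scores and responses rather than any real donor data, shows how a cumulative-lift figure such as the 1.7 above can be computed from scored historical records.

```python
import numpy as np

def cumulative_lift(scores, responded, pct):
    """Lift of mailing the top `pct` highest-scored donors vs. mailing at random."""
    n_top = int(len(scores) * pct)
    top = np.argsort(scores)[::-1][:n_top]   # donors ranked by model score
    rate_top = responded[top].mean()         # response rate among targeted donors
    rate_all = responded.mean()              # baseline response rate
    return rate_top / rate_all

# Hypothetical history: a model score per donor and whether each one donated.
rng = np.random.default_rng(0)
scores = rng.random(10_000)
responded = (rng.random(10_000) < 0.02 + 0.08 * scores).astype(int)

print(f"Cumulative lift at 20%: {cumulative_lift(scores, responded, 0.20):.2f}")
```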

The application and the screen vividly illustrate three fundamental characteristics of a successful analytics endeavor.

1. A “must answer” question is addressed. Contributors need reminding. Donations fall if charities don’t reach out. Without an unlimited mailing budget, the charity is forced to ask, “Who should we contact?” The question must be answered one way or another. Analytics provides an answer through the logical analysis of facts.

It’s useful to contrast the question faced by the charity with a question such as, “Should I change the price of a gallon of milk?” Retailers need to set prices, but once prices are set, there’s considerable inertia for leaving them unchanged. A retailer doesn’t need to change prices tomorrow. Analytics can still bring tremendous value in this case, and pricing has received considerable attention from analytics practitioners. Nonetheless, it’s easier for analytics to be adopted in applications where there’s a question that unequivocally must be answered.

2. The solution is simple. It doesn’t take an advanced degree in mathematics to understand either the problem or, at a general level, the logic behind the solution. Some people are more likely to respond to a mailer than others, and it’s possible to take an educated guess about who those people are based on historical data. And, recognizing that the question must be answered, it’s better to take an educated guess than a shot in the dark. The SAS system, along with similar systems offered by other analytics software vendors, allows users to pick among different mathematical methods for predicting who is most likely to respond. Modelers can then choose the method they feel most comfortable with to support an educated guess.

3. A specific action is proposed. The screen shows the expected lift from mailing to the right customers, but more importantly, in the background it identifies those customers who should receive mailers. Once the analysis is run, a very specific action results: mail to these customers.

Not all analytics applications provide such explicit actions. Reports provide useful information, but what to do with that information isn’t always clear. It’s of value to know that David closed $80,000 in business last month while his peers averaged $100,000, but what action should David’s manager take? When the action isn’t obvious, neither is the value. The value only becomes apparent when good business processes are put in place. In cases where the action is immediately apparent, the value is much easier to see.

It isn’t necessary for a successful application of analytics to demonstrate all three traits. Not all applications are fortunate enough to have all of them. But when all three are present, the case for analytics is extremely compelling, making life easier for everyone involved. We’ll return to look at other detailed examples in future columns. ❙

Andrew Boyd served as executive and chief scientist at an analytics firm for many years. He can be reached at [email protected]. Thanks to Jodi Blomberg of SAS for her help in preparing Figure 1.


ANALYZE THIS!

Let’s get this analytics party hopping
By Vijay Mehrotra

In my ongoing quest to figure out what’s going on in the world of analytics, I’ve recently been to the Predictive Analytics World (PAW) conference (www.predictiveanalyticsworld.com/sanfrancisco/2011/) and the INFORMS Business Analytics and Operations Research conference in Chicago (http://meetings2.informs.org/Analytics2011/). I have seen dozens of presentations, heard scores of panel discussions and had a million or so conversations. In no particular order, here are some of my impressions.

Big numbers: Attendance at both conferences was way up this year. PAW attracted 550 people, up 78 percent from the previous year. PAW was one of several workshops and conferences (along with Marketing Optimization Summit, Conversion Conference and Google Analytics Users’ Great Event) bundled into Data Driven Business Week, which drew more than 1,300 attendees.

Meanwhile, the INFORMS conference drew 623 registrants, or 52 percent more than 2010. The event’s new title obviously resonated more deeply than the old one (“INFORMS Practice Conference”) with many first-time attendees. Moreover, conference organizers clearly achieved their objective of engaging folks from outside the traditional INFORMS community; I bumped into several people who had never had any interaction with the organization before, and many non-members delivered compelling presentations.

Other numbers that caught my attention: 1,000+ (number of analytics professionals working at consulting firm Mu Sigma www.mu-sigma.com, probably the largest company of its kind in the world) and 2,000+ (the number of SAS licensees within Wells Fargo).

Groundhog Day: At lunch in Chicago one day, someone pointed out that one of the Edelman Award finalists’ projects was an intelligent way to reposition empty shipping containers, something we had talked about ad nauseam in the early 1990s when consulting with Sea-Land. Other old-timers reported similar moments of déjà vu. What’s different now? Two decades later, the data is just a lot better. Sigh.

Whose party is this anyway? Many people from the Knowledge Discovery and Data Mining (KDD) community attended PAW. While some were heavy-duty statistics folks, one could sense a strong vibe that this conference was about something other than statistics. Indeed, at one session, a very technical speaker confessed that he had never taken a statistics class and knew only what he had needed to learn for a couple of consulting projects.

Of course, the INFORMS conference was attended by lots of us operations research and industrial engineering types, though not nearly enough folks from other classical analytics disciplines such as business intelligence and machine learning. INFORMS has just launched a new section on analytics. A big shout out to founding officers Michael Gorman (president), Zahir Balaporia (president-elect), Warren Lieberman (treasurer) and Doug Mohr (secretary) for making this happen, and for keeping their eyes and minds open for opportunities to collaborate with the KDD community and other professional groups, because we’re all on the same side, and we all have lots to learn.

Data hopelessness: Despite the many inspiring success stories at both conferences and the many instructive presentations about “the mainstreaming of analytics,” I also had several dark coffee and cocktail conversations about some old familiar struggles. Kudos to Sam Eldersveld for his treatise on a (still) common disease. “Data hopelessness,” he explains, is what happens when an analyst has no hope of ever getting data in time to possibly answer anything but the most short-term business questions. In its most common form, getting useful data requires its own herculean and unsustainable effort. In its strongest form, even data obtained under great duress is hopelessly incomplete/insufficient to answer the questions that are being asked with any level of confidence – and those questions are inevitably the wrong ones.

On the plus side for those dealing with data hopelessness, 25 employers participated in the informal job fair in Chicago, and many potential employers trolled the hallways at both PAW and INFORMS. Clearly, the professional opportunities are out there, but beware: the data may not be any cleaner on the other side.

Whose party is this anyway? (part 2): The two major sponsors for both of these events were SAS and IBM. These companies are the two biggest players in this space, but they could hardly be less alike: SAS is a privately held firm that has been focused on this business since its founding in 1976, while IBM is a publicly traded global behemoth that has spent more than $14 billion on analytics acquisitions such as ILOG, SPSS and Netezza over the past four years. After visiting with folks from both of these companies, it is clear to me that each is grappling with serious challenges. For SAS, its growing on-demand business requires it to learn to operate data centers, manage service level agreements and track clients’ data flows, a far cry from its roots as a tools vendor. In addition, bigger data sets and smarter algorithms continue to put pressure on the SAS folks to get smarter about how their software utilizes the capability of modern hardware platforms. For IBM, the success of its acquisition strategy depends heavily on its ability to integrate these capabilities into its global services organization and communicate to customers how these pieces can be leveraged. It, too, has a long road ahead.

Certainly it is terrific that these two companies are supporting these conferences and others like them. We also saw a ton of small entrepreneurial companies (consulting firms, software vendors, data aggregators, executive recruiters) at both conferences. But it sure would be nice to see some more big players bring their resources to the analytics space. Is anyone listening over there at SAP, Oracle, salesforce.com? We would all love to see you at the next conference … because this party is really just getting started.

Vijay Mehrotra ([email protected]) is an associate professor, Department of Finance and Quantitative Analytics, School of Business and Professional Studies, University of San Francisco. He is also an experienced analytics consultant and entrepreneur and an angel investor in several successful analytics companies.


NEWSMAKERS

Midwest ISO earns Edelman honors

Moments after Midwest ISO, which manages one of the world’s largest energy markets, won the 2011 Franz Edelman Award for Achievement in Operations Research and the Management Sciences, company CEO and President John Bear was asked how many people comprise his high-end analytics (operations research, i.e. “O.R.”) group.

“That’s a trick question,” Bear said. “In theory, there are 850 of us involved with O.R. All of us – everyone who works for the company – try to focus every day on how can we continuously improve, how can we do things better, how can we move ourselves forward, whether it’s incrementally or fundamentally.”

The Edelman competition, sponsored by the Institute for Operations Research and the Management Sciences (INFORMS) and considered the “Super Bowl of O.R.,” is an annual event that recognizes outstanding examples of operations research-based projects that have transformed companies, entire industries and people’s lives. The 2011 winner was announced at an awards gala April 11 in conjunction with the INFORMS Conference on Business Analytics & Operations Research in Chicago.

For many years, the U.S. power industry consisted of utilities that focused locally and ignored the possibility that there might be better, regional solutions. Midwest ISO was the nation’s first regional transmission organization (RTO) to emerge following the Federal Energy Regulatory Commission’s push in the 1990s to restructure and boost efficiency throughout the power industry. Headquartered in Carmel, Ind., Midwest ISO has operational control over more than 1,500 power plants and 55,000 miles of transmission lines throughout a dozen Midwest states, as well as Manitoba, Canada.

Driven by the goal of minimizing delivered wholesale energy costs reliably, Midwest ISO, with the support of Alstom Grid, The Glarus Group, Paragon Decision Technology and Utilicast, used operations research and analytics to design and launch its energy-only market on April 1, 2005, and introduced its energy and ancillary services markets on Jan. 6, 2009.

Midwest ISO improved reliability and increased efficiencies of the region’s power plants and transmission assets. Based on its annual value proposition study, the Midwest ISO region realized between $2.1 and $3.0 billion in cumulative savings from 2007 through 2010. Midwest ISO estimated an additional $6.1 to $8.1 billion of value will be achieved through 2020. The savings translate into lower energy bills for millions of customers throughout the region.

Along with Midwest ISO, the other 2011 Edelman finalists included CSAV (Chilean shipping company), Fluor Corporation, Industrial and Commercial Bank of China (ICBC), InterContinental Hotels Group (IHG) and the New York State Department of Taxation and Finance. ❙


The Edelman-winning team, including Midwest ISO CEO and President John Bear (fifth from right).

BONUS LINKS

See the presentation of Midwest ISO’s award-winning analytics project: http://livewebcast.net/infoRMs_ac_edelman_award_2011

See Midwest ISO receive the coveted INFORMS Edelman Award: http://www.youtube.com/user/infoRMsonline#p/u/9/exl_9Mrblfa

BANKING SECTOR

The secret to better credit risk management: economically calibrated models
By Andrew Jennings and Carolyn Wang

As the banking sector gradually rebounds from the global recession, many bank executives and boards are focused on incorporating the painful lessons learned during the past three years into their business operations. Chief among those lessons learned is the need to strengthen the management of credit risk as economic conditions fluctuate.

It is clear that many banks weren’t prepared for the economic and financial storms that struck in 2008. Not only were the analytic models employed by banks ill-prepared for the depth and length of the recession, but bank executives were also caught off guard by their inability to do more to manage through the onslaught of consumer defaults.

The good news is that these hard times spurred analytic innovation and produced useful data to strengthen risk management going forward. Our post-crisis research has revealed three important lessons regarding credit risk that can be instructive for banks everywhere:

1. Risk is dynamic. A bank’s risk-management strategy must be agile enough to keep pace with a risk environment that is evolving continuously.

2. Rapid and significant changes in economic and market forces can render traditional risk-management approaches less reliable.

3. Credit providers need better economic forecasting relative to risk management for loan origination and portfolio management.

ECONOMICALLY CALIBRATED RISK MODELS

Risk models that are used to originate loans or make credit decisions on existing customers need to take an economically sensitive approach that offers the guidance and insight banks require for effective risk management. Such an approach will enable models to provide decision makers with more reliable and actionable information. While most of today’s credit-risk models continue to rank-order risk properly during turbulent times, we now know that immediate past default experience can be a weak indicator of future payment performance when economic conditions change significantly and unexpectedly.

Empirical evidence shows that default rates can shift substantially even when credit scores stay the same (see Figure 1). For example, in 2005 and 2006, a 2 percent default rate was associated with a FICO Score of 650-660. By 2007, a 2 percent default rate was associated with a score of about 710 as rapidly worsening economic conditions (and the impact of prior weak underwriting standards) affected loan performance.

Although most banks already incorporate some type of economic forecasting into their policies, our research and experience indicate a substantial portion of this input is static and may not provide useful guidance for risk managers. As a result, there is a tendency to over-correct and miss key revenue opportunities, or under-correct and retain more portfolio risk than desired. Fortunately, progress in predictive analytics over the past three years now allows forecasting based on a more empirical foundation that is far more adept at managing risk in a dynamic environment.

Such forecasts can augment existing credit-risk predictions in two ways:

1. They can improve predictions for payment performance. These improved predictions can be incorporated into individual lending decisions, and they can be used at the aggregate level to predict portfolio performance.

2. They can be used to predict the migration of assets between tranches of risk grades. When used in conjunction with aggregate portfolio default probabilities, this can form the basis of forecasting risk-weighted assets for the purpose of Basel capital calculations (and other types of regulatory compliance).

RISK SHIFTS AS ECONOMY SHIFTS

During economic downturns, many lower-risk consumers may refinance their debt obligations, leaving their previous lenders with portfolios full of riskier consumers. Other borrowers who were lower-risk in the past may reach their breaking points through job loss or increased payment requirements. And higher-risk consumers may get stretched further, resulting in more frequent and severe delinquencies and defaults.

Economically calibrated analytics give lenders a way to understand the complex dynamics at work during unstable economic times. The resulting models provide an additional dimension to risk prediction that enables lenders to:

1. Grow portfolios in a less risky and more sustainable manner by identifying more profitable customers and extending more appropriate offers.

2. Limit losses by tightening credit policies sooner and targeting appropriate customer segments more precisely for early-stage collections.

3. Prepare for the future with improvements in long-term strategy and stress testing.

4. Achieve compliance with capital regulations more efficiently. (Improved accuracy in reserving will also reduce the cost of capital.)

At the simplest level, next-generation analytics provides lenders with an understanding of how the future risk level associated with a given credit score will change under current and projected economic conditions. These sophisticated analytic models are able to derive the relationship between historical changes in economic conditions and the default rates at different score ranges (i.e., the odds-to-score relationship) in a lender’s portfolio.

Using this derived relationship, lenders can input current and anticipated economic conditions into their models to project the expected odds-to-score outcome under those conditions. They can model their portfolio performance under a variety of scenarios utilizing economic indicators such as the unemployment rate, key interest rates, Gross Domestic Product (GDP), housing price changes and many other variables. These models can be constructed regionally or locally to account for the fact that economic conditions may not be homogenous across an entire country.
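FICO’s production models are proprietary, but the general idea of an odds-to-score relationship that shifts with economic inputs can be sketched with an ordinary logistic regression. Everything below, including the variables, coefficients and scenarios, is synthetic and for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical loan-level history: credit score plus the economic conditions
# (unemployment %, GDP growth %) prevailing when each loan was observed.
rng = np.random.default_rng(1)
n = 50_000
score = rng.integers(500, 850, n)
unemployment = rng.uniform(4, 10, n)
gdp_growth = rng.uniform(-3, 4, n)

# Synthetic "truth": log-odds of default worsen as scores fall and unemployment rises.
logit = 8.0 - 0.02 * score + 0.25 * unemployment - 0.15 * gdp_growth
defaulted = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([score, unemployment, gdp_growth])
model = LogisticRegression(max_iter=1000).fit(X, defaulted)

# Project the expected default rate at a 660 score under two economic scenarios.
scenarios = np.array([[660, 5.0, 2.5],    # benign economy
                      [660, 9.5, -1.0]])  # stressed economy
print(model.predict_proba(scenarios)[:, 1])
```

With the same 660 score, the projected default probability differs materially between the two scenarios, which is the behavior Figure 1 describes.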

The odds-to-score relationship can be studied at an overall portfolio level or it can be scrutinized more finely for key customer segments that may behave differently under varying economic conditions. And, it can be applied to a variety of score types, such as origination scores, behavior scores, broad-based bureau scores like the FICO Score and Basel II risk metrics.

Economically calibrated analytics can be particularly valuable when examining the behavior scores that lenders utilize to manage accounts already on their books for actions such as credit line increases/decreases, authorizations, loan re-pricing and cross selling. An economically calibrated behavior score could be used in place of, or along with, the traditional behavior score across the full range of account-management actions.

SIGNIFICANT VALUE ADD FOR COMPLIANCE

In addition to operational risk management, the incorporation of economic factors into portfolio performance modeling can be valuable for regulatory compliance. When lenders set aside provisions and capital reserves, it is important that they understand the risks in their portfolios under stressed economic conditions because that is when the reserves are likely to be needed most. In fact, forward-looking risk prediction is explicitly mandated in Basel II regulations, and such predictive analytics should be part of any lender’s best practices for risk management.

FICO has been working for some time now with European lenders to add economic projections into Basel II Probability of Default (PD) models. Using the derived odds-to-score relationship between a lender’s PD score and various economic conditions, lenders can simulate the expected PD at a given risk-grade level in many different scenarios. Thus, lenders can more accurately calculate forward-looking, long-term PD estimates to meet regulatory requirements and calculate capital reserves in a more efficient and reliable manner.

This can help banks free up more capital for lending and credit without taking on unreasonable risk. It can also help improve the transparency of a bank’s compliance program and reduce the time and resources that must be dedicated to compliance.

APPROACH ALREADY BEARING FRUIT

We recently applied our economically calibrated risk-management methodology to the portfolio of a top-10 U.S. credit card issuer. We compared the actual bad rate in the portfolio to predictions from both the traditional historical odds approach as well as the economically calibrated methodology. We found that the latter would have reduced the issuer’s error rate (the difference between the actual and predicted bad rates) by 73 percent over three years, resulting in millions of dollars of loss avoidance.
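As a simple illustration of how an error-rate comparison of that kind might be computed (the bad rates below are invented, not the issuer’s data):

```python
# Hypothetical annual portfolio bad rates (%): actual vs. two prediction methods.
actual      = [4.0, 6.5, 5.2]
traditional = [3.0, 3.4, 3.6]   # static historical-odds prediction
calibrated  = [3.8, 6.1, 5.0]   # economically calibrated prediction

error = lambda pred: sum(abs(a - p) for a, p in zip(actual, pred))
reduction = 1 - error(calibrated) / error(traditional)
print(f"Error reduced by {reduction:.0%}")   # about 86% with these made-up numbers
```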

In a second example, European lender Raiffeisen Bank International (RBI) is using an economically calibrated risk-management technique to complement its more traditional credit scoring information.

RBI overlays macroeconomic information on the bank’s traditional credit-scoring process, creating a system that leverages and extends the value of RBI’s in-house economic research.

This provides the bank with a forward-looking element to its credit scoring following concerns about the creditworthiness of some of the central and eastern European countries in which the bank operates. RBI is using this new approach on its credit card, personal loan and mortgage portfolios to build future economic expectations into credit risk analysis.

Regulatory compliance was the initial driver of this move, but RBI quickly realized the new approach could help it achieve stability in the overall capital requirements for its retail business segment.


Each market the bank serves faces different economic prospects, and calibrating risk strategies for each market can help the bank grow during good and bad economic periods.

In another real-world case, a U.S. credit card issuer retroactively applied this economic-impact methodology to its credit-line-decrease and collections strategies. An analysis of its 2008 data (conducted with the new methodology) found that the predicted bad rate for its portfolio rose more than 250 basis points compared to predictions based on a more traditional approach. The new approach would have decreased the amount of credit extended to a larger portion of the portfolio (and not decreased credit to those less sensitive to the downturn). The lender would have realized millions of dollars in yearly loss savings.

For the same U.S. card issuer, we retroactively used an economically adjusted behavior score in place of the traditional behavior score to treat early-stage (cycle 1) delinquent accounts. Prioritizing accounts by risk, the strategy would have targeted 41 percent of the population for more aggressive treatment in April 2008. We then examined the resulting bad rates six months later (October 2008) and saw that these accounts resulted in higher default rates than the accounts that weren’t targeted. In other words, the economically adjusted scores improved the identification of accounts that should have received more aggressive treatment in anticipation of the economic downturn. Using this strategy, the lender would have been ahead of its competition in collecting on the same limited dollars.

The lender could have saved approximately $4 million by taking aggressive action earlier. FICO calculated this figure using the number of actual bad accounts that would have received accelerated treatment, the average account balance, and industry roll rates. The combination of this loss prevention through more aggressive collection and the millions of dollars the lender could have saved from an improved credit-line-decrease strategy would have made a material impact on the lender’s earnings. This underscores the aggregate benefits of economically calibrated risk management when used across a customer lifecycle. And the benefits are scalable for larger portfolios.

These are just a few examples among many worldwide that illustrate the value of economically calibrated analytics for risk management. In fact, one of the largest financial institutions in South Korea recently adopted this same approach to help it derive forward-looking estimates on the probability of default in its consumer finance portfolio. The lender will be using these predictions to continuously adjust its operational decisions depending on anticipated economic conditions.

NOW IS THE TIME TO ACT

Smart lenders are reevaluating their risk-management practices now – when economic conditions are somewhat calm and there is no immediate crisis that requires their full attention. A reevaluation of risk-management practices can enable measured growth while simultaneously preparing a lender for the next recession.

The use of forward-looking analytic tools will become the risk-management best practice of tomorrow. With improved risk predictions that are better aligned to current and future economic conditions, lenders can more quickly adjust to dynamic market conditions and steer their portfolios through uncertain times. ❙

Andrew Jennings is chief analytics officer at FICO and the head of FICO Labs. Carolyn Wang is a senior manager of analytics at FICO. To read more commentary from Dr. Jennings and other FICO banking experts, visit http://bankinganalyticsblog.fico.com/.


CRUISE LINE EXPERIMENT

Risk in revenue management
By Param Singh

Acknowledging risk’s existence and knowing how to minimize it.

Most of us go about our daily actions in our personal lives constantly evaluating risk vs. reward elements. Most actions are based on decisions that are made quickly and subconsciously, with barely a thought regarding risk vs. reward, but others are more deliberate and calculating, such as financial decisions to invest in the stock market.

We naturally carry our disposition regarding risk into the workplace, too, making on-the-job decisions based on our general attitude toward risk. People differ in the level of risk with which they are comfortable. So the question, then, is how does the variety in people’s willingness to take risk affect decisions that have an impact on the company’s bottom line?

Let’s narrow this question down to the world of revenue management (RM). RM systems help in the decision-making process by evaluating vast amounts of complex demand and supply information and recommending optimal actions to maximize revenues. But even so, RM analysts have a role, rightly so, in deciding whether to approve or adjust these system recommendations. But can they do that without imposing their own view of risk on the equation?

CRUISE LINE’S EXPERIMENT EVALUATES EFFECT OF PROPENSITY TO TAKE RISKS

As a real-life example to understand this phenomenon, a revenue management department undertook an experiment with analysts using its RM system. The company was a cruise line, so the resources being priced were cabins for future sailings of varying durations on ships with various itineraries. With several hundred sailings each year for this cruise line, the RM workload of the department was divided among its dozen analysts on the basis of the ship type, sailing duration, season and destinations.

The RM system evaluated the data and performed its modeling, forecasting and optimization steps to recommend prices for its products. The analysts either approved the system recommendations or adjusted the prices up or down.


The key metrics for evaluating individual sailings were occupancy and various flavors of net revenue. Revenue came from tickets for the cabins and cruise, and onboard revenues from shopping, casinos, liquor, offshore excursions, etc. High occupancy was desirable – sometimes at the cost of low ticket prices – for both the onboard revenue component and the positive psychological effect on passengers (similar to customers feeling somewhat let down if the restaurant they went to dine in was sparsely occupied). Also, the cruise line preferred to raise ticket prices as the sail date approached, though this was not always upheld for various reasons such as poor forecasts, disbelief in forecasts or a variety of market conditions, giving rise to confusion among customers – some of whom thought it better to wait to get good deals on cruise prices.

As part of this experiment, a single small sample of future sailings of varying durations, itineraries, etc. was assigned to all RM analysts. This was workload over and above the individual collection of sailings each was already responsible for. They were asked to evaluate the system recommendations for this handful of sailings and decide whether to accept them or assign new lower or higher prices. All analysts had the same data available to them. This experiment lasted several months since the sample consisted of sailings a few weeks from departure and others several months from departure. Even though only the “true owners” of the sample of sailings made the real implementable pricing decisions, all analysts recorded their pricing decisions and the reasons behind them. This was done once a week, at the same frequency as the RM system forecasting/optimization runs, until the sailings departed.

The recorded results of decisions that would have been made had different RM analysts been in charge of these sailings were very informative. It became obvious that different people viewed the same information differently, sometimes to the point of making opposite decisions: If the system’s recommendation was to raise prices from their current level, some analysts suggested raising the price even higher whereas others suggested lowering the prices, the recommendations notwithstanding! And all this based on the same RM data elements.

The risk tolerance in its most extreme form was expressed by two divergent camps:

1. The pessimists. Analysts who would rather not wait until close to departure for the higher-revenue demand and filled the ship somewhat earlier by accepting demand sooner rather than later, thereby reducing the risk of empty cabins at sailing but also getting lower total revenues.

2. The optimists. Analysts who waited too long for the close-to-departure, higher-revenue demand and thus either suffered lower occupancy or resorted to a last-minute fire sale, resulting in lower total revenues.

One can debate which risk tolerance approach was best for the cruise line. The latter certainly sent the wrong message to the marketplace in terms of waiting for deals close to sailing, especially if it occurred often.

Another interesting observation was that senior management’s risk tolerance also played into the analyst decisions. Since all pricing decisions had to be approved by the managers and/or directors, their risk tolerance and preferences were superimposed upon each analyst’s decision-making process. In this case, a systemic shift of metrics occurred for the department as a whole during the time preceding the sailings: Early on, far from the sail date, the metric was net revenue (i.e., hold out and wait for higher-valued demand), and, closer in as the sail date approached, the metric shifted toward occupancy.

STEPS TO MINIMIZE THE EFFECT OF RISK PREDISPOSITIONS

Although it was reassuring that analysts did not blindly accept the RM system recommendations, it’s clear companies can better direct their efforts and minimize the risk-taking element through good RM models, training and metrics.

RM models. It’s vitally important to ensure the effectiveness of the five main pieces of your RM system:

1. Data: Good, clean and timely data in a single location provide a reliable foundation for downstream RM processes.

2. Estimation models: Accurate and frequently updated models provide the best supporting parameters used in the RM system. These include cancellation rates, segmentation, unconstraining and price elasticities.

3. Forecasting: Accurate prediction of demand as best as data will allow, and flexibility to incorporate new business conditions or information without delay.

4. Optimization: Good recommendations based on valid representations of the real world’s business constraints and market conditions, built to take advantage of advances from evolving mathematical techniques.

5. Tracking and reporting: Visibility into knowing that the models are working well and that optimal revenue opportunities are being captured.

Training. Training provides both an understanding of and a belief in the RM system. Training underlines that RM models, if stochastic in nature, are generally risk neutral and on average will provide superior revenue results compared to the “risk” taking by analysts, which is akin to gambling. In the short term, it may pay off, but in the long term it will generate sub-optimal levels of revenues. If the analysts are trained to understand how the data is used, various parameters estimated, demand forecasted and optimization recommendations produced, they are more likely to know where to focus their efforts in determining the validity of the RM system decisions.

Metrics. Confidence in the recommendations produced by an RM system comes from producing and reviewing post-sailing metrics, such as accuracy metrics for the forecasts and other parameters used in the RM models, and metrics of revenue opportunity captured. Showing analysts how well the forecasting models predict when the various demand streams can be expected to occur, and when they did occur, will take them a long way toward not unnecessarily second-guessing the demand forecasts. And viewing revenue-opportunity-captured metrics (actual revenue captured on a scale from no-RM revenue to the optimal revenue possible) also shows them the direct results of RM actions, whether positive or negative in nature.

RISK-SENSITIVE MODELS

Practitioners and researchers using RM in several industries have observed that risk averseness is a common and natural human behavior. That’s especially true as RM analysts make their decisions under the generally difficult condition where the higher revenue customers’ demand occurs toward the end of the booking cycle. That’s when compensating for poor RM decisions or sub-par models is most difficult.

Most of the mathematics used in RM optimization models relies on the long run – on the average, based on a high volume of flight departures, cruise sailings, hotel nights and car rentals – and therefore uses risk-neutral, revenue-maximizing objective functions. But these models don’t directly consider the fact that RM industries may sometimes prefer stable financial results in the short term over some of the inherent volatility produced with the use of risk-neutral models and market randomness.

Recent research and development of mathematical formulations incorporate a variety of mechanisms – called risk-sensitive formulations – into the RM models to mitigate these risk elements. Following are a number of different risk-sensitive methods incorporating a variety of levers to achieve an acceptable risk objective:

• various utility functions as a way to reflect the level of risk that is acceptable;
• variance of sales as a function of price by using weighted penalty functions;
• value at risk or conditional value at risk functions;
• relative revenue per available seat mile at risk metric, for airlines;
• maximizing revenues, using constraints of minimum levels of revenue with associated probabilities; and
• target percentile risk measures that prevent falling short of a revenue target.

(For more information and a comprehensive bibliography, see “Risk Minimizing Strategies for Revenue Management Problems with Target Values” by Matthias Koenig and Joern Meissner.)
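To make one of these levers concrete, the toy sketch below compares a risk-neutral choice of protection level with a conditional-value-at-risk (CVaR) choice for a single departure. The capacity, fares and Poisson demand are all invented; this is not the cruise line’s system, just an illustration of how a risk-sensitive criterion can pull the decision toward the “pessimist” end.

```python
import numpy as np

rng = np.random.default_rng(3)
CAPACITY, LOW_FARE, HIGH_FARE = 100, 100.0, 300.0
high_demand = rng.poisson(30, 20_000)   # uncertain late-booking, high-fare demand

def revenue(protect):
    """Revenue if `protect` cabins are held back for late high-fare demand."""
    return LOW_FARE * (CAPACITY - protect) + HIGH_FARE * np.minimum(high_demand, protect)

def cvar(outcomes, alpha=0.10):
    """Average revenue over the worst `alpha` fraction of simulated outcomes."""
    k = max(1, int(len(outcomes) * alpha))
    return np.sort(outcomes)[:k].mean()

levels = np.arange(0, 61)
expected = np.array([revenue(y).mean() for y in levels])
worst_tail = np.array([cvar(revenue(y)) for y in levels])

print("Risk-neutral protection level:", levels[expected.argmax()])
print("CVaR(10%) protection level:  ", levels[worst_tail.argmax()])
```

With these made-up numbers, the risk-neutral objective protects more cabins for the chance of late high-fare demand, while the CVaR objective, which looks only at the worst outcomes, protects fewer, mirroring the pessimist camp described earlier.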

Even though most current RM models are risk-neutral models, RM practitioners have to ensure that they do not make risk predisposition-based, sub-optimal decisions while trying to maximize revenues. If the RM models in use, whether forecasting or optimization, indeed are in need of risk adjustments, then those enhancements should be made. However, incremental benefits are possible from using good models to begin with, supported by frequent training and analytical review of results before incorporating additional risk-sensitive components into the RM models. ❙

Param Singh, SAS Worldwide Marketing, has gained, over the past 15 years, a variety of cross-industry revenue management experience working in airlines, cruise lines, hotels and transportation. His responsibilities in RM have spanned all facets of revenue management systems including data management, forecasting, optimization, performance evaluation and metrics, reporting, GUI design, model calibration, testing and maintenance. Singh has also provided RM consulting services to several companies. Prior to RM, he worked in the application of a variety of operations research techniques and solutions in the airline industry in the areas of airport operations, food and beverage, maintenance and engineering.


Five best practices in behavioral segmentation

By Talha Omer

How should organizations embed segmentation to become highly relevant?

To marketers, it is fairly well established that enticing all customers with the same offer or campaign is useless. The No. 1 reason why people unsubscribe or opt out is irrelevant messaging. A while back, marketers moved to grouping customers on the basis of certain metrics that gave a bit more context to the marketing strategies. Example: offering an entertainment service to customers who (1) use a lot of MMS for sending videos and pictures, (2) download songs using GPRS and (3) have a very high ARPU (average revenue per user).

Segmentation drives conversion and avoids erosion. That's a bold statement, but the reality is that subscribers are not "monotonous" – they do not come to your organization for a single activity, and your organization does not exist for a singular reason either. The core drivers of behavior are very different for each core group of subscribers. When you look at all of that in aggregate, you get nothing. You think "average duration of calls" means something, or that "revenue of calls" and "overall duration of calls" give you insights, but do they? Probably not much.

The problem is that all business reporting and analysis is data in aggregate, e.g. total number of daily calls, total daily revenue, average monthly call duration, total weekly volume of GPRS, overall customer satisfaction and many more – gigabytes of data reports and megabytes of analysis, all just aggregates. In the tiny percentage of time that analysts do spend on segmentation, it seems to stop at ARPU. Segmenting by ARPU gives you segments, but they are so basic that you will not find anything insightful in them.

So how can you make sure you are highly relevant to drive conversion and avoid erosion? If you want to find actionable insights you need to segment your data. Only then will you understand the behavior of micro-segments of your customers, which in turn will lead you to actionable insights because you are not focusing on the "whole" but rather on a "specific."

The power of segmenting your subscribers is that you get a 360-degree view of your customer while exploring such questions as, "Whom am I going to sell a certain product to?" To answer this and similar questions, we'll focus on the five best practices in behavioral segmentation.

BEST PRACTICE NO. 1: FIRST, DISCOVER THE CLIENT'S BUSINESS. START WITH QUESTIONS, NOT THE ANALYTICAL MODEL.

Business leaders feel frustrated when they don't get insights that they can act on. Similarly, from the analyst's point of view, it can't be fun to hold the title of "senior analyst" only to be reduced to running aimless analytical models. Hence, the most important element of any effective analytics project is to discover the client's business dynamics by asking real business questions, understand those business questions and have the freedom to do what it takes to find answers to those questions by using analytical strategies.

You need to ask business questions because, rather than simply being told what metrics or attributes to deliver, you need business context: What's driving the request for the model? What is the client really looking for? Once you have context, then you apply your smart analytical skills.


The business questions should have these three simple characteristics:

1. They should be at a very high level, leaving you room to think and add value.

2. They should have a greater focus on achievable steps, because each step enables you to focus your efforts and resources narrowly rather than implementing universal changes, making every step easier to accomplish.

3. They should focus on the biggest and highest-value opportunities, because the momentum of a single big idea and potentially game-changing insight will incite attention and action.

The goal of business discovery is to pull analysts up to do something they do less than 1 percent of the time in the analytics world – look at the bigger business picture. It is nearly impossible to find eye-catching, actionable insights if you just build a model straight away. Efforts will be wasted and the project will stall if you don't start by asking business questions. Along with wasting resources, failing to ask the right business questions up front risks creating widespread skepticism about the real value of segmentation analytics.

The reason for asking business questions can be summed up in one word: context. We are limited by what we know – and what we don't know.

BEST PRACTICE NO. 2: RECONCILE THE DATA, BUILD BUSINESS-RELEVANT SEGMENTS. HAVE AN OPEN MIND; TOO MANY OR TOO FEW NATURAL SEGMENTS MAY BE JUST RIGHT.

Big data is getting bigger. Information is coming from instrumented, interconnected systems transmitting real-time data about everything from market demand and customer behavior to the weather. Additionally, strategic information has started arriving through unstructured digital channels: social media, smart phone applications and an ever-increasing stream of emerging Internet-based gadgets. It's no wonder that perhaps no other activity is as much a bane of our existence in analytical decision-making as reconciling data. Most of the numbers don't seem to tie to anything; each time you present the outcomes, the executives are fanning the flames of mismatched numbers.

All of the attributes created for any analytical project are available to the stakeholders via standard BI reporting – simply compare the attributes with the reported numbers. If the numbers are off by less than 5 percent, resist the urge to dive deep into the data to find the root cause.

A comprehensive agenda enables the reconciliation of the numbers. A senior analyst at one company, for example, stated that they were blindsided when it came to reconciling the data. But once they started checking every number against the ones reported to the business, they found themselves able to go forward.
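A minimal sketch of that reconciliation habit – comparing each modeling attribute's total with the figure from standard BI reporting and flagging anything off by 5 percent or more – might look like the following. The metric names and values are hypothetical.

```python
# Compare model attribute totals against BI-reported totals; flag gaps >= 5%.
import pandas as pd

def reconcile(model_totals: dict, bi_totals: dict, tolerance: float = 0.05) -> pd.DataFrame:
    rows = []
    for metric, modeled in model_totals.items():
        reported = bi_totals.get(metric)
        gap = abs(modeled - reported) / reported if reported else float("nan")
        rows.append({"metric": metric, "modeled": modeled, "reported": reported,
                     "gap_pct": round(100 * gap, 2), "investigate": gap >= tolerance})
    return pd.DataFrame(rows)

print(reconcile({"monthly_revenue": 1_020_000, "active_subs": 98_500},
                {"monthly_revenue": 1_000_000, "active_subs": 101_000}))
```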

Cluster techniques transform data into insights. Cluster techniques are a powerful tool to embed insights by generating segments that can be readily understood and acted upon. These methods make it possible for decision-makers to identify customers having similar purchases, payments, interactions and other behavior, and to "listen" to customers' unique wants and needs about channel and product preferences. As an analyst, you'll no longer have to hypothesize the conditions and criteria on which to segment customers. Clustering techniques provide a 360-degree view of all customers, not just a segment of high-revenue customers.

Running the statistical process for clustering customers creates clusters that are statistically significant. The question then becomes: Are the clusters significant from a business perspective? To answer that, ask the following questions:

• Do you have enough customers in a segment to warrant a marketing intervention?

• What, and how many, attributes do they differ on? Are those attributes critical enough to the business to warrant different segments?

Once the above questions have been sufficiently answered, the project team can determine if there are customer behaviors important enough from a business perspective to justify a marketing initiative.
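A minimal clustering sketch along these lines – standardize a handful of behavioral attributes, fit k-means and then profile each cluster so its size and business relevance can be judged – is shown below. The column names (mms_count, gprs_mb, night_call_share, arpu) and the choice of k are illustrative assumptions, not prescriptions.

```python
# Standardize behavioral attributes, fit k-means, then profile each cluster.
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def build_segments(df: pd.DataFrame, k: int = 6):
    features = ["mms_count", "gprs_mb", "night_call_share", "arpu"]
    X = StandardScaler().fit_transform(df[features])
    out = df.copy()
    out["segment"] = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    profile = out.groupby("segment")[features].mean()
    profile["size"] = out["segment"].value_counts().sort_index()
    return out, profile   # review the profile before treating clusters as segments
```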

BEST PRACTICE NO. 3: REFRESH THE SEGMENTS. WHEN TO REVISE SEGMENTS TO ENSURE THEY ARE ALWAYS ACTIONABLE.

Segmentation sets the stage for how the organization is going to behave for a given time period. This is analytics at its best and one of the most resource-intensive analytics initiatives that will add huge value. As executives start using segmentation more frequently to inform day-to-day decisions and strategic actions, this increasing demand will require that the information provided is recent and reliable. Therefore, it is necessary to keep the segments up to date.


A senior executive told me his company built a perfect statistical model that created highly actionable segments, but it soon became useless because a majority of subscriber profiles had changed over time. This was due to the dynamic and competitive market the segmentation was focused on. In such environments, new campaigns, pricing and products are launched every day, causing instant behavioral changes and hence accelerating the model decay. The executive said they had to streamline the operational processes and automate them so that the company could rebuild segmentation every month. At one time they even considered moving to real-time segmentation, since the benefits they reaped were unparalleled.

Therefore, to keep the three gears moving together – up-to-date segmentation, actionable insights and timely actions – the overriding business purpose must always be in view. New analytic insights are embedded into the segments as business changes, as new products are launched and as new strategic developments happen, and a virtuous cycle of feedback and improvement takes hold. It starts with a foundation of analytical capabilities built on organizational context that delivers better insights, backed by a systematic review to continuously improve the decision process.

BEST PRACTICE NO. 4: MAKE SEGMENTS COME ALIVE. ANALYZE SEGMENTS TO DRIVE ACTIONS AND DELIVER VALUE.

New methods and tools to embed information into segments – analytics solutions, inter/intra-segment highlights, psychographic and demographic analysis – are making segments more understandable and actionable. Organizations expect the value from these techniques to soar, making it possible for segments to be used at all levels of the organization, e.g. for brand positioning or allowing marketers to see how their brands are perceived.

Innovative uses of this type of information layering will continue to grow as a means to help individuals across the organization consume and act upon insights derived through segmentation that would otherwise be hard to piece together. These techniques to embed insights will add value by generating results that can be readily understood and acted upon (a minimal code sketch appears after this list):

• Intra-segment analysis evaluates the preferences of a segment, such as the highest proportion of revenue being realized from calls during the night. Measuring the proportion of traffic of an attribute for a segment will tell you the inclination and motivation for that segment.

• Inter-segment analysis reflects the actual rank of a segment for an attribute across all segments – a technique that gives you the best/worst segments with respect to a particular attribute, e.g. highest revenue, second lowest GPRS users, etc.

• Psychographic and demographic analysis is a fantastic way to understand the demographic (male, female, age, education, household income) and psychographic (Why do they call during the night? What do they use the Internet for?) makeup of any segment. For example, if you are interested in the technology-savvy segment, targeted surveying of each segment and analysis will tell you what zip codes these subscribers are likely in, why they are using so much GPRS, what websites they visit, etc.

Once you establish the segments, you may then merge and/or discard segments that are business insignificant. The rule of thumb for merging segments: If you believe that you cannot devise distinctive campaigns for two segments, merge them.
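The intra- and inter-segment views referenced above can be computed directly from a segmented subscriber table, as in the sketch below. The attribute columns are assumed to be comparable traffic measures (e.g. night_calls, day_calls, mms, gprs_sessions); all names are hypothetical.

```python
# Intra- and inter-segment views on a segmented subscriber table.
import pandas as pd

def segment_views(df: pd.DataFrame, attributes: list) -> pd.DataFrame:
    seg = df.groupby("segment")[attributes].mean()
    # Intra-segment view: each attribute as a share of the segment's total
    # traffic, showing where that segment's activity is concentrated.
    intra = seg.div(seg.sum(axis=1), axis=0).add_suffix("_share")
    # Inter-segment view: rank of each segment on each attribute across all
    # segments (1 = highest), e.g. "second lowest GPRS users."
    inter = seg.rank(ascending=False).astype(int).add_suffix("_rank")
    return pd.concat([seg, intra, inter], axis=1)
```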

These methods will make it possible for decision-makers to more fully understand their segments of subscribers and boost business value. Businesses will be able to listen to customers' unique wants and needs about channel and product preferences. In fact, making customers, as well as information, come to life within complex organizational systems may well become the biggest benefit of making data-driven insights real to those who need to use them.

BEST PRACTICE NO. 5: SPEED INSIGHTS INTO THE SEGMENTATION PROCESS. WHAT SEGMENTATION-FOCUSED COMPANIES DO.

Most often, organizations start off their segmentation analysis by gathering all available data. This results in an all-encompassing focus on data management – collecting, reconciling and transforming. This eventually leaves little time, energy or resources to focus on the rest of the segmentation process. Actions taken, if any, might not be the most valuable ones. Instead, organizations should start in what might seem like the middle of the process: implementing segmentation by first defining the business questions needed to meet the big objective and then identifying those pieces of data needed for answers.

By defining the business objective first, organizations can target specific subject areas and use readily available data in the initial analytic models. The insights delivered through these initial models will identify gaps in the data infrastructure and business processes. Time that would have been spent collecting and pre-processing all the data can be redirected toward targeted data needs and specific process improvements that the insights identify, enabling a successful segmentation.

Companies that make data their overriding priority often lose momentum long before the first iteration is delivered, frequently because a data-first approach takes too long before delivering an actionable segmentation. In cases where the market is very volatile, by the time you deliver the segments, time for a "refresh" arrives and you are back to square one. By narrowing the scope of these tasks to the specific subject areas needed to answer key questions, value can be realized more quickly, while the insights are still relevant.

Organizations that incorporate segmentation must be good at data capture and processing and have plenty of space available in their warehouse. In these areas, they must outperform the competition up to tenfold in their ability to execute. Time to market is short, and market dynamics change quickly in highly competitive and saturated markets.

SET YOURSELF UP FOR SUCCESS

Remember, segmentation analysis is a tough game. The good news is that it is very far from daily business reporting and analysis. It requires more intense and focused effort, and it truly is advanced analysis. Not every company will be ready to leverage all of the above practices. You are encouraged to perform a self-critical analysis of your own abilities before you go into segmenting your subscribers, even though the upside is substantial revenue and a strategic advantage that will influence your fundamental business strategy in a very positive way.

For your company and business, maybe revenue from off-net calls is not as important as duration of calls, the number of MMS, the volume of GPRS or bundled subscriptions. Understand what your business is and what the areas of strategic focus are, and then segment away.

Likewise, the more you understand what your customers are doing, the more likely it is that you'll stop the irrelevance of your marketing campaigns. You'll also likely find the optimum balance between what you want to have happen and what your customers want. You'll make happier customers, who will in turn make you happy.

Summing up, start on the path to segmentation, keep everyone focused on the big business issues and select the business problems that segmentation can solve today with new thinking and a framework for the future. Build on the operational and strategic capabilities you already have, and always keep pressing to embed the insights you've gained into your business strategy. ❙

Talha Omer is an analytics professional and researcher. He currently serves as an analyst at a major telecommunication company. He holds a Master’s degree from Cornell University in operations research. He can be reached at [email protected].


Understanding data miners

By Karl Rexer, Heather N. Allen and Paul Gearan

Data miner survey examines trends and reveals new insights.

For four years, Rexer Analytics has conducted annual surveys of the data mining community to assess the experiences and perspectives of data miners in a variety of areas. In 2010, Rexer sent out more than 10,000 invitations, and the survey was promoted in a variety of newsgroups and blogs. Each year the number and variety of respondents has increased, from 314 in the inaugural year (2007) to 735 respondents in 2010. [Rexer Analytics did not specifically define "data miner" or "data mining"; the decision to participate in the survey was an individual choice.]

The data miners who responded to the 2010 survey come from more than 60 countries and represent many facets of the data mining community. The respondents include consultants, academics, data mining practitioners in companies large and small, government employees and representatives of data mining software companies. They are generally experienced and come from a variety of educational backgrounds.

Each year the survey asks data miners about the specifics of their modeling process and practices (algorithms, fields of interest, technology use, etc.), their priorities and preferences for analytic software, the challenges they face and how they address them, and their thoughts on the future of data mining. Each year the survey also includes several questions on special topics. Often these questions are selected from the dozens of suggestions we receive from members of the data mining community. For example, the 2010 survey included questions about text mining and also gathered information about best practices in overcoming the key challenges that data miners face.

For a free copy of the 37-page summary report of the 2010 survey findings, e-mail [email protected].

DATA MINING PRACTICES

The data miners responding to the survey apply data mining in a diverse set of industries and fields. In all, more than 20 fields were mentioned in last year's survey, from telecommunications to pharmaceuticals to military security. In each of the four years, CRM/marketing has been the field mentioned by the greatest number of respondents (41 percent in 2010). Many data miners also report working in financial services and in academia. Fittingly, "improving the understanding of customers," "retaining customers" and other CRM goals were the goals identified by the most data miners.

Decision trees, regression and cluster analysis form a triad of core algorithms for most data miners. However, a wide variety of algorithms are being used. Time series, neural nets, factor analysis, text mining and association rules were all used by at least one quarter of respondents. This year, for the first time, the survey asked about ensemble models and uplift modeling. Twenty-seven percent of data mining consultants report using ensemble models; about 20 percent of corporate, academic and non-government organization data miners report using them. About 10 percent of corporate and consulting data miners report using uplift modeling, whereas this technique was only used by about 5 percent of academic and NGO/government data miners. Model "size" varied widely. About one-third of data miners typically utilize 10 or fewer variables in their final models, while about 28 percent generally construct models with more than 45 variables.

Text mining has emerged as a hot data-mining topic in the past few years, and the 2010 survey asked several questions about it. About a third of data miners currently incorporate text mining into their analyses, while another third plan to do so. Most data miners using text mining employ it to extract key themes for analysis (sentiment analysis) or as inputs in a larger model. However, a notable minority use text mining as part of social network analyses. According to the survey respondents, data miners employ STATISTICA Text Miner and IBM SPSS Modeler most frequently for text mining.

The survey also asked data miners working in companies whether most data mining is handled internally or externally (through consultants or vendor arrangements). Thirty-nine percent indicated that data mining is handled entirely internally, and 43 percent reported that it is handled mostly internally, while only 1 percent reported that it was entirely external. Additionally, 14 percent of data miners reported that their organization offshores some of its data analytics (an increase from the 8 percent reported in the previous year).

SOFTWARE

One of the centerpieces of the data miner survey over the years has been assessing priorities and preferences for data mining software packages. Data miners consistently indicate that the quality and accuracy of model performance, the ability to handle very large datasets and the variety of available algorithms are their top priorities when selecting data mining software.

Data miners report using an average of 4.6 software tools. After a steady rise across the past few years, the open source data mining software R overtook other tools to become the tool used by more data miners (43 percent) than any other. SAS and IBM SPSS Statistics are also used by more than 30 percent of data miners. STATISTICA, which has also been climbing in the rankings, was selected this year as the primary data-mining tool by the most data miners (18 percent).

The summary report shows the differences in software preferences among corporate, consulting, academic and NGO/government data miners. For example, STATISTICA, SAS, IBM SPSS Modeler and R all have strong penetration in corporate environments, whereas Matlab, the open source tools Weka and R, and the IBM SPSS tools have strong penetration among academic data miners.

The survey also asked data miners about their satisfaction with their tools. STATISTICA, IBM SPSS Modeler and R received the strongest satisfaction ratings in both 2010 and 2009. Data miners were most satisfied with their primary software on two of the items most important to them – quality and accuracy of performance and variety of algorithms – but not as satisfied with the ability of their software to handle very large datasets. They were also highly satisfied with the dependability/stability of their software and its data manipulation capabilities. STATISTICA and R users were the most satisfied across a wide range of factors.

Data miners report that the computing environment for their data mining is frequently a desktop or laptop computer, and often the data is stored locally. Only a small number of data miners report using cloud computing. Model scoring typically happens using the same software that developed the models. STATISTICA users are more likely than other tool users to deploy models using PMML.

CHALLENGES

In each of Rexer Analytics' previous Data Miner Surveys, respondents were asked to share their greatest challenges as data miners. In each year, "dirty data" emerged as the No. 1 challenge.


THE 5TH ANNUAL DATA MINER SURVEY

Rexer Analytics recently launched its fifth annual Data Miner Survey. In addition to continuing to collect data on trends in data miners' practices and views, this year Rexer Analytics has included additional questions on data visualization, best practices in analytic project success measurement and online analytic resources. To participate in the 2011 survey, use access code INF28.


Explaining data mining to others and difficulties accessing data have also persisted as top challenges year after year. Other challenges commonly identified include limitations of tools, difficulty finding qualified data miners and coordination with IT departments.

In the 2010 survey, data miners also shared best practices for overcoming the top challenges. Respondents shared a wide variety of best practices, coming up with some innovative approaches to these perennial challenges. Their ideas are summarized, along with verbatim comments (196 suggestions), on the website: www.rexeranalytics.com/Overcoming_Challenges.html.

Key challenge No. 1: Dirty data. Eighty-five data miners described their experiences in overcoming this challenge. Key themes were the use of descriptive statistics, data visualization, business rules and consultation with data content experts (business users). Some example responses (one of which is sketched in code after this list):

• In terms of dirty data, we use a combination of two methods: informed intuition and data profiling. Informed intuition required our human analysts to really get to know their data. Data profiling entails checking to see if the data falls into pre-defined norms. If it is outside the norms, we go through a data validation step to ensure that the data is in fact correct.

• Don't forget to look at a missing data plot to easily identify systematic pattern of missing data (MD). Multiple imputation of MD is much better than not to calculate MD and suffer from "amputation" of your data set. Alternatively flag MD as new category and model it actively. MD is information! Use random forest (RF) as feature selection. I used to incorporate often too many variables which models just noise and is complex. With RF before modeling, I end up with only 5-10 variables and brilliant models.

• A quick K-means clustering on a data set reveals the worst as they often end up as single observation clusters.

• We calculate descriptive statistics about the data and visualize before starting the modeling process. Discussions with the business owners of the data have helped to better understand the quality. We try to understand the complexity of the data by looking at multivariate combinations of data values.
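The second response combines two ideas that can be sketched roughly as below: treat missingness as its own signal, then use a random forest's importances as a first feature-selection pass. The column handling and thresholds are illustrative assumptions, not a respondent's actual workflow.

```python
# Flag missing values as information, then rank features with a random forest.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def rank_features(df: pd.DataFrame, target: str, n_keep: int = 10) -> list:
    X = df.drop(columns=[target]).copy()
    for col in X.columns:
        X[f"{col}_missing"] = X[col].isna().astype(int)   # flag missingness explicitly
        if X[col].dtype == object:
            X[col] = X[col].fillna("MISSING")             # missing category, not a gap
    X = pd.get_dummies(X)                                 # one-hot encode categoricals
    X = X.fillna(X.median(numeric_only=True))             # simple fill for numeric gaps
    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    rf.fit(X, df[target])
    ranked = sorted(zip(X.columns, rf.feature_importances_),
                    key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:n_keep]]
```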


Key challenge No. 2: Explaining data mining to others. Sixty-five data miners described their experiences in overcoming this challenge. Key themes were the use of graphics, very simple examples and analogies, and focusing on the business impact of the data mining initiative. Some example responses:

• Leveraging "competing on analytics" and case studies from other organizations help build the power of the possible. Taking small impactful projects internally and then promoting those projects throughout the organization helps adoption. Finally, serving the data up in a meaningful application – BI tool – shows our stakeholders what data mining is capable of delivering.

• The problem is in getting enough time to lay out the problem and showing the solution. Most upper management wants short presentations but don't have the background to just get the results. They often don't buy into the solutions because they don't want to see the background. Thus we try to work with their more ambitious direct reports who are more willing to see the whole presentation and, if they buy into it, will defend the solution with their immediate superiors.

• I've brought product managers (clients) to my desk and had them work with me on what analyses was important to them. That way I was able to manipulate the data on the fly based on their expertise to analyze different aspects that were interesting to them.

Key challenge No. 3: Difficulty accessing data. Forty-six data miners described their experiences in overcoming this challenge. Key themes were devoting resources to improving data availability and methods of overcoming organizational barriers. Some example responses:

• I usually would confer with the appropriate content experts in order to devise a reasonable heuristic to deal with unavailable data or impute variables. Difficult to access data means typically we don't have a good plan for what needs to be collected. I talk with the product managers and propose data needs for their business problems. If we can match the business issues with the needs, data access and availability is usually resolved.

• A lot of traveling to the business unit site to work with the direct "customer" and local IT ... generally put best practices into place after cleaning what little data we can find. Going forward we generally develop a project plan around better, more robust data collection.

THE FUTURE OF DATA MINING

Data miners are optimistic about continued growth in the number of projects they will be conducting in the near future. Seventy-three percent reported they conducted more projects in 2010 than they did in 2009, a trend that is expected to continue in 2011. This optimism is shared across data miners working in a variety of settings.

When asked about future trends in data mining, the largest number of respondents identified the growth in adoption of data mining as a key trend. Other key trends identified by multiple data miners are increases in text mining, social network analysis and automation. ❙

Karl Rexer ([email protected]) is president of Rexer Analytics, a Boston-based consulting firm that specializes in data mining and analytic CRM consulting. He founded Rexer Analytics in 2002 after many years working in consulting, retail banking and academia. He holds a Ph.D. in Experimental Psychology from the University of Connecticut. Heather Allen ([email protected]) is a senior consultant at Rexer Analytics. She has built predictive models, customer segmentation, forecasting and survey research solutions for many Rexer Analytics clients. Prior to joining the company she designed financial aid optimization solutions for colleges and universities. She holds a Ph.D. in Clinical Psychology from the University of North Carolina at Chapel Hill. Paul Gearan ([email protected]) is a senior consultant at Rexer Analytics. He has built attrition analyses, text mining, predictive models and survey research solutions for many Rexer Analytics clients. His 2006 in-depth analyses of the NBA draft resulted in an appearance on ESPNews. He holds a master’s degree in Clinical Psychology from the University of Connecticut. More information about Rexer Analytics is available at www.RexerAnalytics.com. Questions about this research and requests for the free survey summary reports should be e-mailed to [email protected].


Sports law analytics

By Ryan M. Rodenberg and Anastasios Kaburakis

Analytics are proving to be dispositive in high-stakes sports industry litigation.

As highlighted by James C. Cochran in the January/February 2010 issue of Analytics and a forthcoming special issue of Interfaces co-edited by Michael J. Fry and Jeffrey W. Ohlman, the sports industry has firmly embraced the use of analytics in the decision-making process. Such methods have similarly been adopted in sports law, a corollary field inextricably intertwined with the dynamic sports business. As a prime example, Shaun Assael of ESPN [1] recently described the ongoing litigation involving the National Collegiate Athletic Association's (NCAA) licensing of former student-athletes' names and likenesses in video games (a now-consolidated case [2] that started with the filing of two separate actions – Keller v. NCAA, et al and O'Bannon v. NCAA, et al) as one of "five lawsuits that will change sports," giving credence to the relevancy and importance of how analytics can and will be used in furthering specific arguments arising in the lawsuit. The purpose of this article is to provide an overview of sports law analytics and discuss the role of analytics in sports law cases moving forward, with a pointed discussion of the aforementioned consolidated Keller and O'Bannon case and the U.S. Supreme Court's recent decision in American Needle v. NFL, et al [3].

OVERVIEW OF SPORTS LAW ANALYTICS

The interdisciplinary methods employed in sports law analytics are derived from statistics, management science, operations research, economics, psychology and sociology. However, the practice parameters of sports law analytics are set by evidentiary rules and relevant case law precedent. The consolidated case encompassing both Keller and O'Bannon implicates important intellectual property principles such as publicity rights and consent. Similarly, American Needle revolves around antitrust law and the complex competition-centered analysis that goes along with it.

Daubert v. Merrell Dow Pharmaceuticals [4] was a U.S. Supreme Court opinion that addressed the admissibility of expert testimony within the context of a drug-related birth defect case. Since being decided in 1993, Daubert has been the seminal case on the issue of whether expert testimony should be admitted or excluded. As binding precedent on every federal trial court in the United States, an understanding of the Daubert standard is a prerequisite to applying sports law analytics in pending litigation. Daubert requires courts to adopt a standard that determines whether the proffered evidence "both rests on a reliable foundation and is relevant to the task at hand" (597). In addition, the judge must consider "whether the reasoning or methodology underlying the testimony is scientifically valid" (592-93). The case has had the effect of limiting the use of the so-called "hired gun" expert.

Daubert requires the trial court judge to act as a gatekeeper to protect against unreliable expert testimony being admitted into evidence (592-94). As summarized in Nelson v. Tennessee Gas Pipeline [5], the Daubert case set forth several factors to be considered:

…(1) whether a "theory or technique…can be (and has been) tested"; (2) whether the theory "has been subjected to peer review and publication"; (3) whether there is a high "known or potential rate of error" and whether there are "standards controlling the technique's operation"; and (4) whether the theory or technique enjoys "general acceptance" within the scientific community (251).

The U.S. Supreme Court, in cases such as Castaneda v. Partida [6], has offered guidance on the admissibility standards for quantitative evidence. As outlined by Winston [7], the nation's highest court has "accepted the 5 percent level of significance or two standard deviation rule as the level of evidence needed to shift the burden of proof from plaintiff to defendant or vice versa" (96).
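A minimal illustration of that two-standard-deviation screen, in the Castaneda-style setting of comparing an observed count with its binomial expectation, is sketched below. The numbers are hypothetical and are not drawn from any of the cases discussed here.

```python
# Flag a disparity when the z-statistic exceeds two in absolute value.
from math import sqrt

def two_sd_test(observed: int, n: int, expected_rate: float):
    expected = n * expected_rate
    sd = sqrt(n * expected_rate * (1 - expected_rate))
    z = (observed - expected) / sd
    return z, abs(z) > 2      # |z| > 2 roughly corresponds to the 5 percent level

# e.g. 100 selections from a pool that is 40 percent group X, but only 25 chosen
print(two_sd_test(observed=25, n=100, expected_rate=0.40))
```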

Kentucky Speedway v. NASCAR, et al [8], a December 2009 case out of the U.S. Court of Appeals for the Sixth Circuit, illustrates the power and pitfalls of sports law analytics in litigation. The plaintiff alleged that NASCAR and an affiliate violated federal antitrust laws when the plaintiff's application for an elite-level sanction was not granted and the plaintiff's attempts to purchase pre-sanctioned races proved unsuccessful. The case also evidences how Daubert is applied in sports industry legal disputes. In Kentucky Speedway, NASCAR and its co-defendants prevailed after the court of appeals upheld the district court's determination that the plaintiff's primary expert witness was unreliable. Specifically, the expert retained by the plaintiffs was deemed to have applied his own (incorrect) analytical test when testifying. Pointedly, the Kentucky Speedway court found that the expert's "own version" of the well-accepted analytic pertaining to consumer substitution in the marketplace "has not been tested, has not been subjected to peer review and publication; there are no standards controlling it, and there is no showing that it enjoys general acceptance within the scientific community…[f]urther, it was produced solely for this litigation" (918).

Federal Rule of Evidence 702 [9] is the primary rule that guides the admissibility of evidence in the federal court system and was revised after the U.S. Supreme Court decided Daubert. In relevant part, the rule provides:

If scientific, technical, or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training, or education, may testify thereto in the form of an opinion or otherwise, if (1) the testimony is based upon sufficient facts or data, (2) the testimony is the product of reliable principles and methods, and (3) the witness has applied the principles and methods reliably to the facts of the case.

Federal Rule of Evidence 702, coupled with Daubert, forms the parameters of sports law analytics in the courtroom, regardless of whether the dispute pertains to intellectual property law (Keller and O'Bannon), antitrust law (American Needle) or otherwise.

ANALYTICS IN INTELLECTUAL PROPERTY LITIGATION

Quantitative-based analysis is playing a major part in the outcomes of virtually all contemporary intellectual property litigation that reaches trial. In fact, the absence of such analytics has been held as a shortcoming in the context of some intellectual property litigation. Commercial publicity rights have been represented by a number of malleable concepts on which there is no uniformity of acceptance and no dispositive codified law, and jurisdictions across the U.S. have been split. The student-athletes in the Keller/O'Bannon class action video game case are alleging, among other things, that the NCAA impermissibly licensed their name and likeness, a violation of their right of publicity.

Section 46 of the Restatement (Third) of Unfair Competition [10] sets the burden of proof for establishing a violation of a right of publicity as: (i) use of the plaintiff's identity; (ii) identity has commercial value; (iii) appropriation of commercial value for purposes of trade; (iv) lack of consent; and (v) resulting commercial injury. The aforementioned fourth prong will likely be at issue in the video game litigation, as the NCAA's defense will probably include a claim that the student-athletes depicted in the interactive games provided de facto consent to such licensing via their scholarship agreement, letter of intent or other related document. Moreover, the second prong has traditionally been decided after consideration of marketing research surveys and several analytical tools attempting to establish whether there is indeed commercial value; e.g. whether consumers can sufficiently identify the plaintiff and, in turn, make the clear connection between the plaintiff and the digital expression in a video game.

Sports law analytics will almost certainly play an integral part in the resolution of the consolidated class action containing both Keller and O'Bannon if the case goes to trial. A bevy of expert witnesses will testify. Analytics-driven evidence will be proffered by both sides.

Page 33: Analytics Magzine

W W W. i n f o r m s . o r g32 | a n a ly t i c s - m aga z i n e . co m a n a ly t i c s | m ay / j u n e 2 011

Both Keller and O'Bannon were seeded in the use of former players' images in college sports video games, for which the NCAA, Collegiate Licensing Company (CLC) and NCAA member schools had contracted with Electronic Arts (EA), a leading video game manufacturer. Per NCAA policies on amateurism, student-athletes are not permitted to use their athletic skill to endorse commercial products or services.

Similarly, the NCAA has taken the position that former student-athletes depicted in video games years after their collegiate careers have ended are not entitled to receive compensation in exchange for the licensing of their name and likeness. Keller filed his complaint in May 2009 and, among other things, alleged that the NCAA, CLC and EA violated his rights of publicity under Indiana and California law. O'Bannon and several co-plaintiffs, all former college basketball and football players, filed a related lawsuit two months later. Analytics presented on the twin issues of the right of publicity and the presence of consent will be influential, if not dispositive, in the case's resolution.

ANALYTICS IN ANTITRUST LITIGATION

The importance of sports law analytics will also be realized in American Needle if the dispute reaches trial following remand by the U.S. Supreme Court on May 24, 2010. The American Needle case involved an antitrust challenge by a Chicago-area headwear manufacturer against the NFL following the league's decision to enter into an exclusive arrangement with Reebok for the manufacture of officially licensed headwear. The Supreme Court unanimously reversed a lower court summary judgment motion in favor of the NFL, concluding that the league is not immune from antitrust scrutiny in connection with its intellectual property licensing activities. Barring settlement, the now-remanded case will go to trial. There, plaintiff American Needle will have the opportunity to present evidence showing that the NFL-Reebok agreement stifled competition in the marketplace, damaged the company's book of business and adversely impacted consumers.

Analytics will play a role on two levels. First, macro-level experts for both sides will testify about economics-heavy antitrust principles, gauging whether the pro-competitive effects of the exclusive arrangement are outweighed by the anti-competitive impact of the NFL-Reebok exclusivity. Second, a narrow investigation will be undertaken to ascertain the impact on consumers. American Needle's micro-level analytics will be aimed at showing how a purported decreased level of competition has affected customers. Such analytics will likely focus on costs at the retail level before and after the NFL granted Reebok an exclusive license. In response, the NFL will likely retain experts capable of testifying about how consumers are benefitted by the economies of scale resulting from an all-encompassing agreement in the form of greater selection, uniformity and quality control, for example. Finally, American Needle will need to demonstrate the extent of its lost profits following the NFL-Reebok licensing pact.

CONCLUSION

High-stakes litigation in the sports industry often turns on analytics. The consolidated class action containing Keller/O'Bannon and the American Needle v. NFL case are current examples. While this article explained the federal evidentiary rules and U.S. Supreme Court opinions that set the parameters for the admissibility of statistical evidence and expert testimony in sports-related trials, such parameters can be generalized to non-sports contexts, as the legal rules are equally applicable. Experts with analytical acumen and some baseline level of sport-specific institutional knowledge frequently provide expert witness and consulting services, as the underlying legal disputes are often nuanced and technical, making them ripe for analytics. ❙

Ryan M. Rodenberg ([email protected]) is an assistant professor at Florida State University. He earned a Ph.D. from Indiana University-Bloomington and a JD from the University of Washington-Seattle. Anastasios Kaburakis ([email protected]) is an assistant professor at Saint Louis University. He earned a Ph.D. from Indiana University-Bloomington and a law degree from Aristotle University in Thessaloniki, Greece.


REFERENCES

1. Shaun Assael, "Five Lawsuits That Will Change Sports," ESPN.com, Nov. 8, 2010.

2. In Re Student Athlete Name and Likeness Licensing Litigation, C 09-01967 CW (N.D. Cal. 2010).

3. American Needle v. NFL, et al, 130 S.Ct. 2201 (2010).

4. Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579 (1993).

5. Nelson v. Tennessee Gas Pipeline, 243 F.3d 244 (6th Cir. 2001).

6. Castaneda v. Partida, 430 U.S. 482 (1977).

7. Wayne L. Winston, 2009, "Mathletics," Princeton, N.J.

8. Kentucky Speedway v. NASCAR, et al, 588 F.3d 908 (6th Cir. 2009).

9. Federal Rule of Evidence 702 (2011).

10. Restatement (Third) of Unfair Competition §46 (1995).


Simulation frameworks: the key to dashboard success

By Zubin Dowlaty, Subir Mansukhani and Keshav Athreya

Regardless of the organization that you work for, chances are that you use dashboards to display and deploy metrics.

The technology for building dashboards has continuously evolved, so much so that it is now possible for a non-technical person to "build" a dashboard. Despite their ubiquity, whether dashboards have been able to achieve their utmost potential is subject to debate.

Most dashboards typically start life in a business function (e.g. a spreadsheet tracking report). With increasing use, more data integration is required and the number of users burgeons, spawning the need for a full-fledged dashboarding solution. Departments (or governance bodies, in some instances) typically determine key metrics that must be part of the dashboarding solution, and IT is brought in to gather requirements and select the technology for a successful implementation.

Independent of the hierarchy of implementation, each such exercise must attempt to answer two key questions:

• What metrics must be chosen to maximize impact on business?

• What is the relationship between metrics, and is there an overarching framework into which these KPIs slot in?

Often the latter of the two – the focus on the big picture – is lost during the development of dashboards.

DASHBOARDS TO "COCKPITS" – A SYSTEM DYNAMICS APPROACH

Imagine that you're piloting a space shuttle. Would you prefer a conventional dashboard displaying certain choice metrics and trends, or would you prefer a control panel, a "cockpit," with "actionable insights" to negotiate the vagaries of interstellar travel? Piloting an organization is often not very different from helming a space shuttle, and the future of dashboards depends on the extent to which they can emulate "cockpits," "flight simulators" and "auto-pilot mode," a notion first explored by Rob Walker [1].

Figure 1: The stock flow concept.

The secret to developing dashboards of such astounding efficacy and power could lie in the disciplines of simulation and system dynamics. A commonly studied concept in simulation is the "stock flow" (see Figure 1), where a "stock" is simply an accumulation of an entity over time, and the status of the stock varies depending on the "flow" variable. The mathematical equivalents of stock and flow are the "integral" and "partial derivative," respectively. This metaphor is appealing given its simplicity of explanation and intuitive appeal; stocks can be thought of as a bathtub, and a flow will fill or drain the stock. Using these building blocks, one can then visually build a system complete with graphics and metrics that derive from the model. Someone non-technical could intuitively verify the model assumptions.
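In the simplest single-stock case – a sketch of standard system dynamics notation, not a formula given in the article – the relationship can be written as:

```latex
\frac{dS}{dt} = \text{inflow}(t) - \text{outflow}(t),
\qquad
S(t) = S(0) + \int_{0}^{t} \big(\text{inflow}(\tau) - \text{outflow}(\tau)\big)\, d\tau
```

so the stock S accumulates (integrates) the net flow over time, just as a bathtub level accumulates the difference between the faucet and the drain.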

CUSTOMER LOYALTY PROGRAM – THE DASHBOARD FRAMEWORK

Let's say you're responsible for creating a tool to monitor the health of a loyalty program. Following the system dynamics approach would first entail the creation of a stock-flow map (see Figure 2). Performing this exercise early on in the life of a dashboard ensures that the subsequent steps are grounded in theory and are sufficiently representative of reality. For a loyalty program, the key actors in the map are the customers ("stock"), and their inflow/outflow represents the "flow" variable.

Prospects flow into the enrollee customer state, and enrollees either activate into customers or they never conduct business with the loyalty program. Customers that flow from the active state to inactive are considered the loyalty program's churn flow.

Figure 2: A stock flow map for a typical loyalty program.

Now that we have "The Big Picture" in place, designing the dashboard is a straightforward process. We have an accurate view of the interrelationships governing the key metrics. The map display as a navigation device is a useful addition to any dashboard. Metric trends may be animated on the map. When one clicks on a stock or a flow, all the key metrics describing that state are displayed.

For example, if one clicks the "Actives" stock, one could then see the number of active customers, customer segment distributions, "recency" and frequency tables, revenue and OLAP-style drill-downs displayed in a dedicated dashboard view. A benefit of this approach is that it immediately segments metrics into two groups: stock or flow.


With practice, one can utilize a similar template for stock variables and another for flow variables. The variable segmentation promotes re-use of the designed templates, thus enabling simpler implementations from a technology standpoint.

CUSTOMER LOYALTY PROGRAM – ONWARD TO SIMULATION AND OPTIMIZATION

With the stock flow map in hand, one can then form the basis for constructing a mathematical model of the system. The mathematical model opens the door for robust simulation and optimization as one matures beyond the dashboard reporting view. In its simplest form, the evolution of the system over time is constructed using the stocks and flows in the published map. For example, an analyst observes that the active customer base in the firm's customer loyalty program has begun to stagnate. The number of active customers is not increasing over time as expected.

You need to intervene and try to boost active customers, but what do you do? Viewing Figure 2, let's increase spending in the prospecting area of the map and boost the flow of spending dollars into the prospect stock. What would be the outcome of this action with respect to active customers? For example, increased prospect spending would likely cause an increase in the number of prospects, given an estimate of the response activation rate. You then can calculate the new stock of enrollees. Increased enrollees translate into a boost of active customers through the new enrollee activation rate.
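A minimal discrete-time sketch of that walkthrough – stepping the stocks forward month by month under assumed flow rates, then rerunning it with higher prospecting spend – might look like the following. All rates, starting stocks and names are invented for illustration, not loyalty-program benchmarks.

```python
# Discrete-time sketch of the loyalty program stock-flow map.
def simulate(months=12, prospects=50_000.0, enrollees=5_000.0, actives=20_000.0,
             inactives=0.0, prospects_added_per_month=10_000,
             enrollment_rate=0.04, activation_rate=0.30, churn_rate=0.02):
    history = []
    for _ in range(months):
        new_enrollees = enrollment_rate * prospects   # prospect -> enrollee flow
        new_actives = activation_rate * enrollees     # enrollee -> active flow
        churned = churn_rate * actives                # active -> inactive (churn) flow

        prospects += prospects_added_per_month - new_enrollees
        enrollees += new_enrollees - new_actives
        actives += new_actives - churned
        inactives += churned
        history.append({"prospects": prospects, "enrollees": enrollees,
                        "actives": actives, "inactives": inactives})
    return history

# Scenario comparison: what does boosting prospecting spend do to active customers?
base = simulate()
boosted = simulate(prospects_added_per_month=15_000)
print(round(base[-1]["actives"]), round(boosted[-1]["actives"]))
```

Comparing the final "actives" figure between the base and boosted runs is exactly the kind of scenario question the map is meant to answer.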

Improving response activation performance, attempting to reduce churn or some combination of these strategies present other scenarios to focus on. All these example scenarios are estimable from the map. In order to further enhance simulation accuracy, one could introduce hierarchy. For each stock of customers, utilize customer segmentation to form subgroups. A customer segment is treated like a sub-stock of the parent stock, and one can track the inflow and outflow of each customer segment. This will reconcile in the parent stock, and one would gain considerable improvement in tactical ability.

As the process of scenario analysis matures, users will most likely begin asking for the dashboarding system to recommend optimal scenarios given constraints. Optimization naturally extends the simulation apparatus; one can link the optimization engine with the automated output of the simulator, then iterate and search for an optimal condition or control rule. Stochastic optimization as well as probabilistic metaheuristic approaches such as simulated annealing work fine in these applications.

LAST WORD

In sum, incorporating the "stock flow" mapping technique empowers the developer and end-user by giving them an extensible framework for understanding dashboards. Furthermore, this approach paves the way for successful implementation and is a natural step in the progression toward flight simulator and auto-pilot dashboards. ❙

Zubin Dowlaty ([email protected]) is vice president/head of Innovation & Development with Mu Sigma Inc., a provider of decision sciences and analytics services. Subir Mansukhani is senior innovation analyst and Keshav Athreya is a senior business analyst with Mu Sigma.


REFERENCES

1. Rob Walker, 2009, "The Evolution and Future of Business Intelligence," Information Management, Sept. 24, 2009.

2. Barry Richmond, 1994, "Systems Dynamics/Systems Thinking: Let's Just Get On With It," International Systems Dynamics Conference, Sterling, Scotland.

3. Lawrence Evans, "An Introduction to Mathematical Optimal Control Theory," Department of Mathematics, University of California, Berkeley.


FedEx presents a "playground" of analytical problems

By Chris Holliday

Big companies have complex systems.

Wait. That sentence was not finished.

Big companies have complex systems to design and operate.

Hold on. There is more.

Big companies have complex systems to design and operate, which makes them a playground for operations research practitioners.

FedEx falls into the big company category. The recent 2010 FedEx Annual Report shows that the company had $34.7 billion in revenue. More than 280,000 team members provide service to over 220 countries. There are 664 aircraft and more than 80,000 vehicles moving eight million packages a day. All those employees with all those vehicles moving all those packages on a daily basis provide problems that need to be modeled and solved.

FedEx Express is the express airline subsidiary of FedEx Corporation and is the world's largest express company.

The operations research group at FedEx Express has been solving operational challenges since the early stages of the company. The group operates as an internal consultant, working on specific issues for various departments. Customers within FedEx Express include Air Operations, U.S. Operations, Central Support Services, Air Ground Freight Services and International Operations.

FedEx founder Fred Smith introduced the "People, Service, Profit" philosophy at FedEx: if you put your people first, they will in turn provide quality service, and profit will be the end result. People, Service, Profit also works well for grouping operations research (O.R.) problems. Without getting into too much detail on solutions, the playground of problems includes the six listed in the following groups:


PEOPLE:

1. Problem: FedEx Express must schedule tens of thousands of workers to match the anticipated work. Work fluctuates with the volume of packages at any specific location. Specific workers have specific skills and must be matched to specific work.

Requirements: FedEx Express needs to match available workers to the shifts that need to be covered.

Approach: A multi-stage mixed integer program is used to solve this problem. All work tasks must be identified by time of day. Workers and skill sets are documented. Work must be grouped into shifts for full-time and part-time work. Specific employees are assigned.
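
This is not FedEx's production model, but a toy version of the matching step can be written with an open-source solver such as PuLP. The workers, skills and staffing requirements below are invented for illustration; the real model also handles shift construction, full-time/part-time rules and many more constraints.

```python
# Toy version of the worker/shift matching step, written with PuLP
# (pip install pulp). Workers, skills and staffing needs are invented.
import pulp

workers = {"W1": {"sort"}, "W2": {"sort", "ramp"}, "W3": {"ramp"}}
shifts = {"early_sort": ("sort", 2), "night_ramp": ("ramp", 1)}  # (skill, staff needed)

model = pulp.LpProblem("shift_assignment", pulp.LpMinimize)

# Only create a variable where the worker actually has the shift's skill.
x = {(w, s): pulp.LpVariable(f"x_{w}_{s}", cat="Binary")
     for w in workers for s, (skill, _) in shifts.items() if skill in workers[w]}

model += pulp.lpSum(x.values())                      # objective: fewest paid assignments

for s, (_, need) in shifts.items():                  # cover every shift's staffing need
    model += pulp.lpSum(x[w, s] for w in workers if (w, s) in x) >= need

for w in workers:                                    # each worker works at most one shift
    model += pulp.lpSum(x[w, s] for s in shifts if (w, s) in x) <= 1

model.solve(pulp.PULP_CBC_CMD(msg=False))
print(sorted(key for key, var in x.items() if var.value() > 0.5))
```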

2. Problem: FedEx Express has a group of analysts who design the delivery and pickup routes for the couriers. The need for a new route structure varies based on package growth and facility limitations. The work must be done with the help of local operators to make the implementation successful.

Requirements: FedEx Express needs to balance the workload for analysts planning the courier routes.

Approach: A generalized assignment approach is used and solved with an integer program. The current workload of the analysts must be reviewed, as well as their availability for future work. A list of facilities that require a restructure must be compiled. Other considerations included in forming the list of facilities are the total number of courier routes, geography and route complexity. With this information, an optimized assignment of the analysts must be provided to management.
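
A stripped-down sketch of such a generalized assignment model, again in PuLP, might look like the following. The facility names, restructure hours and analyst capacities are made up, and the objective simply balances the busiest analyst's load.

```python
# Sketch of the analyst-balancing step as a (small) generalized assignment
# model in PuLP. Facility names, restructure hours and capacities are invented.
import pulp

hours = {"STL": 120, "MEM": 200, "IND": 80, "OAK": 150}      # estimated restructure effort
capacity = {"analyst_A": 350, "analyst_B": 300}               # available hours

m = pulp.LpProblem("analyst_assignment", pulp.LpMinimize)
y = {(f, a): pulp.LpVariable(f"y_{f}_{a}", cat="Binary") for f in hours for a in capacity}
peak = pulp.LpVariable("peak_load", lowBound=0)

m += peak                                          # objective: minimize the busiest analyst's load

for f in hours:                                    # every facility is restructured exactly once
    m += pulp.lpSum(y[f, a] for a in capacity) == 1
for a in capacity:
    load = pulp.lpSum(hours[f] * y[f, a] for f in hours)
    m += load <= capacity[a]                       # respect each analyst's availability
    m += load <= peak                              # peak is at least every analyst's load

m.solve(pulp.PULP_CBC_CMD(msg=False))
print({f: a for (f, a), var in y.items() if var.value() > 0.5})
```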

SERVICE:

3. Problem: FedEx Express volume fluctuates from day to day. The delivery routes are designed to meet a specific demand, but the couriers must expand or reduce route coverage based on volume changes.

Requirements: FedEx Express needs to balance the workload and optimize the routes for the delivery couriers.

Approach: A heuristic-based vehicle routing approach is used to solve this problem. All deliveries for a specific day are verified for a facility. The delivery routes are then optimized based on volume and drive time. The solutions are provided to the delivery couriers.
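
The production routing engine is far more sophisticated, but a bare-bones nearest-neighbor heuristic conveys the flavor of the approach; the stop coordinates and the stops-per-route cap below are invented stand-ins for the real volume and drive-time limits.

```python
# A deliberately simple nearest-neighbor routing heuristic, just to show the
# flavor of the approach. Stops and the per-route cap are invented.
import math

depot = (0.0, 0.0)
stops = {"s1": (2, 3), "s2": (5, 1), "s3": (6, 4), "s4": (1, 7), "s5": (8, 2)}
stops_per_route = 3

unassigned, routes = dict(stops), []
while unassigned:
    route, here = [], depot
    while unassigned and len(route) < stops_per_route:
        nxt = min(unassigned, key=lambda s: math.dist(here, unassigned[s]))  # closest remaining stop
        route.append(nxt)
        here = unassigned.pop(nxt)
    routes.append(route)

print(routes)   # each inner list is one courier's delivery sequence
```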

4. Problem: FedEx Express has facilities throughout major metropolitan areas. The couriers work at these facilities sorting and processing packages. The routes driven by the couriers must begin and end at these facilities. Being close to the customer leads to better service.

Requirements: FedEx Express needs to determine the optimal location for facilities.

Approach: A classic location analysis model is used to solve this problem. The number of shipments to and from every customer must be determined. Those packages must be divided into courier routes. The distance to begin each route, as well as to return to the building for each route, must be determined. The best location for a facility will include this input and then be passed on to management for review.
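
A minimal location-analysis sketch, assuming a small set of invented candidate sites, customer locations and shipment volumes, could score each candidate by total shipment-weighted round-trip distance:

```python
# Toy location analysis: score each candidate site by total shipment-weighted
# round-trip distance and keep the cheapest. All points and volumes are invented.
import math

customers = {"c1": ((2, 9), 40), "c2": ((7, 3), 25), "c3": ((4, 4), 60)}   # (location, daily packages)
candidates = {"site_north": (3, 8), "site_south": (5, 2)}

def weighted_cost(site):
    return sum(2 * volume * math.dist(site, location)        # out-and-back trip to each customer
               for location, volume in customers.values())

best = min(candidates, key=lambda s: weighted_cost(candidates[s]))
print(best, round(weighted_cost(candidates[best]), 1))
```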

PROFIT:

5. Problem: FedEx Express must invest in aircraft and facilities. As the international market continues to grow, larger aircraft are desired. Fuel efficiency is a major factor. These larger, newer airplanes are costly. They must be parked at airport facilities, some of the most expensive property in the world. These facilities must have support equipment able to move and sort packages. The purchase of aircraft and airport facilities requires significant lead time.

Requirements: FedEx Express needs to determine the number and size of aircraft required by the system five to 10 years in the future. The size of the supporting facilities and equipment needed must also be estimated.

Approach: A multi-stage mixed integer program is used to solve this problem. The operations research group must put together a forecast of packages and weight for five to 10 years in the future. The information must include package flow from each airport to each airport. The number of aircraft available and connections to hub facilities must be determined. Input also includes the cost of operating aircraft as well as capital. With this information, an optimized network must be built.
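
The real model is multi-stage and far richer, but a single-period fleet-mix sketch shows the basic trade-off of covering forecast lift at minimum cost. The aircraft types, payloads, costs and lane demands below are illustrative numbers only.

```python
# Greatly simplified, single-period fleet-mix sketch in PuLP. Aircraft types,
# payloads, costs and the lane forecast are illustrative numbers only.
import pulp

lanes = {"MEM-CDG": 900, "MEM-NRT": 700}          # forecast daily tonnes per lane
aircraft = {"wide_body": (100, 180),              # (payload tonnes, daily cost in $K)
            "narrow_body": (40, 90)}

m = pulp.LpProblem("fleet_plan", pulp.LpMinimize)
n = {(t, l): pulp.LpVariable(f"n_{t}_{l}", lowBound=0, cat="Integer")
     for t in aircraft for l in lanes}

m += pulp.lpSum(aircraft[t][1] * n[t, l] for t in aircraft for l in lanes)   # total operating cost
for l, demand in lanes.items():                                              # lift the forecast on every lane
    m += pulp.lpSum(aircraft[t][0] * n[t, l] for t in aircraft) >= demand

m.solve(pulp.PULP_CBC_CMD(msg=False))
print({k: int(v.value()) for k, v in n.items() if v.value() > 0.5})
```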

6. Problem: FedEx Express has 664 aircraft that move throughout the world. Making sure that the right aircraft are in the right place at the right time is an ongoing task.


Flight schedules are created months in advance and are refined as the actual implementation date moves closer. During implementation, the actual available aircraft are assigned to the flight schedule.

Requirements: FedEx Express needs to match aircraft with the flight schedule.

Approach: The operations research technique known as the tanker scheduling approach is used to solve this problem. The actual aircraft total must include those available for service, those in maintenance plus those being used as spares. Certain aircraft are not allowed to fly into certain airports; restrictions include noise and time-of-day. The actual assignment of specific aircraft to the flight schedule is optimized.
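
The sketch below is not the tanker scheduling formulation itself, just a much smaller assignment illustration in which some tail/flight pairings are forbidden (standing in for noise or time-of-day restrictions) and the remaining pairings carry invented costs.

```python
# Not the tanker scheduling formulation itself -- just a tiny illustration of
# assigning tails to flights when some pairings are forbidden. Tails, flights,
# costs and restrictions are invented.
import numpy as np
from scipy.optimize import linear_sum_assignment

tails = ["N101FE", "N102FE", "N103FE"]
flights = ["MEM-SYD", "MEM-ANC", "OAK-MEM"]

cost = np.array([[5.0, 2.0, 4.0],        # operating/repositioning cost of each pairing
                 [3.0, 6.0, 1.0],
                 [4.0, 3.0, 2.0]])

forbidden = {("N101FE", "MEM-SYD")}      # this tail may not serve this flight
BIG = 1e6                                # big-M penalty keeps forbidden pairings out
for i, t in enumerate(tails):
    for j, f in enumerate(flights):
        if (t, f) in forbidden:
            cost[i, j] = BIG

rows, cols = linear_sum_assignment(cost)             # min-cost one-to-one assignment
for i, j in zip(rows, cols):
    print(f"{tails[i]} -> {flights[j]} (cost {cost[i, j]:.0f})")
```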

The six problems described above are just a few of the many opportunities to apply O.R. and other advanced analytics techniques at FedEx Express. One of the ongoing challenges is to apply a technique to each problem that will provide a solution that is easily understood and successfully implemented. ❙

Chris Holliday ([email protected]), P.E., has been with FedEx 29 years and manages a group of operations research practitioners. Multiple FedEx team members contributed to this article. A version of this article appeared in OR/MS Today.


THE FIVE-MINUTE ANALYST

Police vs. smartphone DUI apps

By Harrison Schramm


I was struck by something I saw in the news this morning – lawmakers are concerned that so-called "DUI checkpoint apps" for smartphones would help drunk drivers avoid capture and abet them in breaking the law [1]. The story nagged at me all day; it was the sort of issue that I couldn't let go of. So I decided to ply my trade as an operations researcher and put a nickel's worth of analysis against the problem [2].

The first thing I did was field research. I downloaded two such apps, "Checkpoint Wingman" and "Phantom Alert." These apps work basically as a message board; persons who have the app can "report" a DUI checkpoint that they come across, and then these reports become part of a database. Owners of the app may then "pull" from the database the reported checkpoints and (theoretically) know whether they are at risk of getting "busted" with a DUI.

Let's assume there's a strong correlation between a person's propensity to drive intoxicated and the odds that they would be willing to post to the database [3]. If this assumption stands, then the database relies on persons who drive intoxicated frequently but don't get caught at the checkpoint to make the updates. The updates could be so time-late as to be useless. Because this is the Five-Minute Analyst, we'll assume that (substantial) problem away with a hand-wave.

Now, we can take cases on checkpoints. If the checkpoint is optimally situated, that is, in a "chokepoint" that must be crossed for the drunk to get from his starting location to his destination, there are two outcomes: either he elects to make the trip while intoxicated and is arrested, which is counted as a "win" for law enforcement; or he is deterred from making the trip and does "something else" – takes a cab, gets a driver, sleeps it off – which is also a "win" for law enforcement.

Easy enough. Now let's extend this to the case where there are two routes from the starting point to the destination. It would seem at first that the drunks would now have an advantage because they could gain knowledge about the risk of the paths. However:

1. As we discussed above, the information could be time-late.
2. The police get the same information.

There's no reason that the police can't download the DUI apps and gain intelligence about where the drunks think the checkpoints are. Because both sides have the same information stream [4], this breaks down into a two-player game (payoffs are relative to the drunks; see Table 1).

The solution to this game is a mixed strategy for both players, and any individual drunk playing against the police in this situation will have a 50 percent chance of being caught [5], the same as if there were no app at all! An identical argument will show that the odds of escaping the checkpoint are 1/n, where n is the number of possible (different) routes across the checkpoint plane.
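
That claim is easy to check numerically. The sketch below feeds the Table 1 payoff matrix to scipy's LP solver (my choice of tool, not the author's) using the standard linear-programming formulation of a zero-sum game, and it recovers the 50/50 mixed strategy and the game value of -1/2; replacing the 2x2 matrix with an n x n version reproduces the -1/n result.

```python
# Numerical check of the game value, using scipy's LP solver. Rows are the
# drunk's strategies from Table 1; entries are payoffs to the drunk.
import numpy as np
from scipy.optimize import linprog

A = np.array([[-1.0, 0.0],    # believe the app:       caught vs. "opposite", escapes vs. "as posted"
              [0.0, -1.0]])   # don't believe the app: escapes vs. "opposite", caught vs. "as posted"

n_rows = A.shape[0]
# Standard zero-sum LP: maximize v subject to (A^T p)_j >= v for every police
# column j, with p a probability vector. Variables are [p_1, ..., p_n, v].
c = np.zeros(n_rows + 1)
c[-1] = -1.0                                           # maximize v == minimize -v
A_ub = np.hstack([-A.T, np.ones((A.shape[1], 1))])     # v - (A^T p)_j <= 0
b_ub = np.zeros(A.shape[1])
A_eq = np.append(np.ones(n_rows), 0.0).reshape(1, -1)  # probabilities sum to 1
b_eq = [1.0]
bounds = [(0, 1)] * n_rows + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
print("drunk's mixed strategy:", res.x[:n_rows])   # [0.5, 0.5]
print("value of the game:", res.x[-1])             # -0.5, i.e. caught half the time
```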

The police could take this a step further and post false information about the checkpoints. From a practical standpoint, the drunks may see “Checkpoints everywhere” and simply choose to do something else [6].

With a small amount of data and a short amount of time, we have shown that the DUI-avoidance apps are no better than useless to the user (i.e., the drunk) and no worse than harmless to law enforcement. ❙

Harrison Schramm ([email protected]) is a military instructor in the Operations Research Department at the Naval Postgraduate School in Monterey, Calif.

Drunks \ Police           Deploy opposite DUI app     Deploy where DUI app says
Believe DUI app                      -1                           0
Don't believe DUI app                 0                          -1

Table 1: Police vs. drunks: a two-player game.

REFERENCES

1. http://techland.time.com/2011/03/23/senators-to-app-stores-get-rid-of-pro-drunk-driving-apps/

2. I do not intend to comment on the policy implications; I'm not particularly convinced whether these apps should be legal or not. What I am interested in from the O.R. point of view is what effect these apps have on the common good.

3. Justification: If you have a disposition to drive intoxicated, you consider knowing where the checkpoints are to be a "public good"; conversely, if you do not drive intoxicated, you consider enforcement of checkpoints to be a "public good."

4. Assuming, of course, that the police can re-deploy (which is a good assumption).

5. For the game theorists in the audience: because the value of the game is -1/2, for any mixed strategy the drunks pick, the police can choose a corresponding mixed strategy and achieve the same result. This extends to the multi-road case as well, where the value of the game is -1/n.

6. Critics will note that I have valued deterring DUI equally with punishing drunk drivers. Those who weight punishment above deterrence will naturally come to a different conclusion.

THINKING ANALYTICALLY

The traveling spaceman problem

By John Toczek

Figure 1 is a three-dimensional map of the universe containing nine galaxies that you, as the traveling spaceman, wish to visit. Each galaxy’s position in the universe is indicated by Table 1.

QUESTIONS:

1. Starting (and ending) at galaxy “a,” in what order should you visit each galaxy to minimize the traveled distance? You must visit each galaxy and you cannot visit any galaxy more than once.

2. What is the total distance traveled?

HINTS:

1. Larger galaxies indicate that they are closer to your viewpoint.

2. This problem can be solved using AMPL; a brute-force script also works at this size (see the sketch below).
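
For readers who would rather script than build an AMPL model, brute-force enumeration is entirely feasible here, since there are only 8! = 40,320 candidate tours. The coordinates below are placeholders only; substitute the actual values from Table 1.

```python
# Brute-force alternative to an AMPL model: enumerate every tour of the eight
# non-starting galaxies and keep the shortest. The coordinates below are
# placeholders -- substitute the actual values from Table 1.
import itertools
import math

galaxies = {                                   # hypothetical (x, y, z) positions
    "a": (0, 0, 0), "b": (1, 4, 2), "c": (3, 1, 5), "d": (6, 2, 1), "e": (2, 7, 3),
    "f": (5, 5, 5), "g": (7, 0, 2), "h": (4, 6, 0), "i": (1, 2, 6),
}

def tour_length(order):
    path = ("a",) + order + ("a",)             # start and end at galaxy "a"
    return sum(math.dist(galaxies[p], galaxies[q]) for p, q in zip(path, path[1:]))

others = [g for g in galaxies if g != "a"]
best = min(itertools.permutations(others), key=tour_length)
print("a -> " + " -> ".join(best) + " -> a")
print("total distance:", round(tour_length(best), 2))
```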

Send your answer to [email protected] by July 15. The winner, chosen randomly from the correct answers, will re-ceive an “Analytics: Driving Better Business Decisions” T-shirt. ❙

John Toczek is a risk analyst for ARAMARK Corporation in the Decision Support group. He earned his Bachelor of Science degree in Chemical Engineering at Drexel University (1996) and his Master of Science in Operations Research from Virginia Commonwealth University (2005).


Figure 1: Nine-galaxy universe.

Table 1: Coordinates for nine galaxies.