SUNZ 2011 - Susan Needham - NZ Post case study - Genius
TRANSCRIPT
New Zealand Post Group: New Zealand Post Limited.
Sexy Analytics. Pure Genius!
Susan Needham
24 February 2011
... a new kind of professional has emerged, the data scientist, who combines the skills of software programmer,
statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data.
Hal Varian, Google’s chief economist, predicts that the job of statistician will become the “sexiest” around. Data, he
explains, are widely available; what is scarce is the ability to extract wisdom from them.
From The Economist, February 2010
http://annezelenka.com/2010/02/27/the-economist-on-the-data-deluge/
Statisticians are sexy!
Key challenges faced in most organisations
Business people and analysts don’t always speak the same language
Organisations either don’t have advanced analytical capability or tools, or they are under-utilising them
Data we have available
Genius™
The birth of Genius…
We kicked off the project in September 2009 to develop our own geo-demographic segmentation model
The clients told us what they wanted: a model based on family mix, lifestyle patterns, where people spend their money and their attitudes, as this would be most beneficial to the market
We formed a project team with a mix of technical, business and marketing expertise
We then had to determine which data to include – what information we already had and what to purchase externally
Information at sub-meshblock level was critical to our clients, as was the frequency of the updates
Successful segmentation factors
To develop a successful segmentation, the approach must ensure a happy marriage between what are often seen as opposing poles of a continuum – Statistics and Pragmatism
A good segmentation solution should have the following characteristics:
Large enough to be worth targeting
Sufficient variance between clusters
Minimal variance within clusters
Easy to identify in the real world
Stability over time
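The two variance criteria in this list can be checked numerically. The following is a minimal sketch in Python, not the SAS code used for Genius. It assumes a single numeric variable per household and made-up segment labels, and the function name and figures are hypothetical:

```python
from statistics import mean

def variance_split(values_by_segment):
    """Return (between, within) sums of squares for one variable."""
    all_values = [v for vals in values_by_segment.values() for v in vals]
    grand_mean = mean(all_values)
    between = 0.0
    within = 0.0
    for vals in values_by_segment.values():
        seg_mean = mean(vals)
        # Spread of segment means around the overall mean
        between += len(vals) * (seg_mean - grand_mean) ** 2
        # Spread of households around their own segment mean
        within += sum((v - seg_mean) ** 2 for v in vals)
    return between, within

# Hypothetical household incomes ($k) in two made-up segments
example = {"seg_x": [150, 160, 170], "seg_y": [40, 50, 60]}
between, within = variance_split(example)
share_explained = between / (between + within)
```

A share close to 1 means most of the variation sits between segments rather than inside them, which is what the list asks for.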
Data strategy to maximise segment granularity
We wanted as much data at DPID (household) level as possible
We utilised our own Rural (~200K households) and Lifestyle data (~200K households) to achieve this
For remaining households we determined their segment at a higher geographic level, and distilled it down to the household level
For medium to large meshblocks we got as much data as possible at a sub-meshblock level
─ We ended up with 55k sub-meshblock partitions vs 37k meshblock partitions: ~48% increase in granularity
Segmentation objective and approach
Objective (for Urban Segmentation): To create 20 to 40 segments that are distinct for key variables (relating to lifestage & affluence) from our collection of household, sub-meshblock and meshblock data
Data challenge
─ 55k sub-meshblock partitions and 1,300+ variables
─ Development data files total more than 30 GB
Key Tools
SAS – for the heavy-duty data crunching and model building
─ The main procedure utilised was Proc Cluster
Excel – for data visualisation
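Proc Cluster performs hierarchical (agglomerative) clustering: every unit starts as its own cluster and the two closest clusters are merged repeatedly. The slides do not say which linkage method or options were used, so the sketch below is only an illustration of the general idea in Python, using average linkage as an assumed choice and made-up data:

```python
from math import dist

def agglomerate(points, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def avg_distance(a, b):
        # Mean pairwise Euclidean distance between members of clusters a and b
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (p < q) and merge q into p
        _, p, q = min(
            ((avg_distance(clusters[p], clusters[q]), p, q)
             for p in range(len(clusters))
             for q in range(p + 1, len(clusters))),
            key=lambda t: t[0],
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters

# Hypothetical standardised (affluence, lifestage) scores for six areas
areas = [(0.9, 0.8), (1.0, 0.9), (0.95, 0.85),
         (-1.0, -0.9), (-0.9, -1.0), (-0.95, -0.8)]
segments = agglomerate(areas, n_clusters=2)
```

This brute-force version recomputes every pairwise distance on each pass, so it only suits toy inputs; at 55k partitions and 1,300+ variables a dedicated procedure is needed.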
Segment Visualisation
Over-represented
Under-represented
Distinctive Features
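The slides do not give the formula behind the over- and under-represented shading. A common way to build such a view is an index that compares a segment's rate with the overall rate, where 100 means "same as everyone else". The Python sketch below assumes that approach; the function names, the tolerance band and the figures are all hypothetical:

```python
def representation_index(segment_rate, overall_rate):
    """Index of 100 = same as overall; above 100 = over-represented."""
    return round(100 * segment_rate / overall_rate)

def label(index, tolerance=20):
    # The tolerance band around 100 is an arbitrary illustrative choice
    if index >= 100 + tolerance:
        return "Over-represented"
    if index <= 100 - tolerance:
        return "Under-represented"
    return "Average"

# Hypothetical: 30% of a segment hold a degree vs 15% of all households
idx = representation_index(0.30, 0.15)
shading = label(idx)
```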
The final step
As you can see, the colourful spreadsheet would make most people’s eyes glaze over!
Our next challenge was to bring it to life in a way that would make it meaningful to marketers and other business people
We did this by assigning each of the 36 segments a name, and grouping these up into similar clusters
─ The segment names are very “kiwi”, and as descriptive as possible
We also trawled through very detailed information to produce a summary for each segment
NZ population by Genius™ Clusters
The NZ population is divided into 9 clusters and 36 segments
A. Urban Affluence
The most affluent cluster of NZ, representing the top 12% of NZ households
More likely to have a university degree or diploma, and to have post-grad qualifications
Median house value of this cluster is $775k, mostly in the decile 9 to 10 school zones
This cluster is over-represented by those with homes in family trust
Much more likely to have their own business, with income from self employment and / or have investment income
Skewed to age 45 to 64, self-employed or have investment income
Main location – Remuera, Mt Eden, Oriental Bay, Khandallah, Fendalton, Chatswood
A1 Cream of the Crop
• Top 1% of NZ Population
• High spenders in most categories, e.g. dry cleaning, gardening, beauty & hair salon etc
A2 Flushed with success
• 2nd highest spender in a broad range of categories, e.g. home electronics, furnishings, cafes
• Average house value $894k
A3 Saffron & Silk Ties
• Likely to be born overseas with overseas qualifications, skew to Asians
• Top spenders on recorded music and smash repair
A4 Secure Urban Families
• Slight skew to couple with two children
• Average house value of $647k
A5 Stable Futures
• Slight skew to having no children
• Houses more likely to be built before 1940, with average value of $558k
Genius™ map - Karori area
A few challenges
Our old SAS server was having a “melt-down” during the Genius development
This did result in things taking longer than planned
Working with some of the larger datasets did cause the server to slow down or freeze
We did eventually get a new SAS server right towards the end of the development
─ At the same time we moved to using EG, which presented other “challenges”
Our new server is already having space issues – after only a year!
Things did take a lot longer than we originally anticipated
We produced weekly progress reports to keep management informed of delays
But wait there’s more…
Car prediction model
Due to changes in privacy legislation, refreshed NZTA vehicle registration data will no longer be available
NZ Post have purchased NZTA registration data, with car ownership history for 2.4 million car owners
We have combined this with other NZ Post data (including Genius™), and created a single customer view for each NZ car owner, containing:
─ Name and latest address
─ Number of cars owned – may be used for inferring family composition
─ Car age and price at the time of purchase – for inferring the owner’s preference for new cars and budget for purchases
─ Car make and model – for inferring preferences (e.g. European cars, coupés)
─ Car type – for inferring family composition and lifestage (e.g. people movers)
─ Time between purchases – for inferring when the next purchase will be
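The slides do not describe how the next purchase is inferred from the time between purchases. One simple possibility, sketched here in Python under that assumption, is to project the owner's average gap forward from the latest purchase. The function name and the purchase history are hypothetical:

```python
from datetime import date, timedelta

def estimate_next_purchase(purchase_dates):
    """Project the next purchase as last purchase + mean gap between past purchases."""
    ordered = sorted(purchase_dates)
    if len(ordered) < 2:
        return None  # a single purchase gives no interval to work from
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return ordered[-1] + timedelta(days=round(mean_gap))

# Hypothetical purchase history for one car owner
history = [date(2001, 3, 1), date(2005, 3, 1), date(2009, 3, 1)]
next_date = estimate_next_purchase(history)
```

A real model would likely use more than the mean gap (car age, segment, price band), but the single customer view described above supplies the inputs for either approach.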
New Car Purchase & Genius™
Pct New Cars Purchased by Genius Segment (e.g. 32% of cars owned by A1 customers are new)
[Bar chart. Y-axis: Pct New Cars Purchased by Genius Segment, 0% to 35%. X-axis: A1 A2 A3 A4 A5 B1 B2 C1 C2 D1 D2 D3 E1 E2 E3 E4 F1 F2 F3 F4 G1 G2 H1 H2 H3 H4 H5 R1 R2 R3 R4 R5 R6 R7 R8]
Ethnicity prediction
A number of our clients wanted to be able to predict the ethnicity of every household in NZ
We are currently developing this utilising the names we have from various data sources
Using a combination of a proprietary look-up dictionary and the probability of occurrence from the Lifestyle Survey, we have predicted the actual ethnicity of the person or the ethnic origin based on their surname
MAORI: Hape, Moka
PACIFIC ISLANDERS: Folau, Tuipulotu
ASIAN: Cho, Huang
VIETNAMESE: Banh, Bui
JAPANESE: Fujimori, Fujino
INDIAN: Guha, Gupta
OTHER EUROPEAN: Cloete, Taliaard
EUROPEAN: Bolton, Gifford
Will be used to further enhance the accuracy and granularity of Genius
Making analytics sexy to marketers
First we whip the data into shape
Before we can begin, we need to append DPIDs to the client database
─ A DPID is a Delivery Point Identifier that has been allocated to every house, business, church, school in New Zealand, some 2.2 million points
─ Each one is flagged as delivered or not delivered to
Then we remove duplicate records to obtain a single customer view
Then we can use the DPID to append other data sources…
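Once every record carries a DPID, de-duplication can be keyed on it. The Python sketch below is a simplified illustration, not NZ Post's matching logic: it assumes duplicates share a DPID and a name that matches after trimming and lower-casing, and it keeps the most recently updated record. The field names and DPID values are made up:

```python
from datetime import date

def single_customer_view(records):
    """Keep one record per (DPID, normalised name), preferring the latest update."""
    best = {}
    for rec in records:
        key = (rec["dpid"], rec["name"].strip().lower())
        if key not in best or rec["updated"] > best[key]["updated"]:
            best[key] = rec
    return list(best.values())

# Hypothetical client records; the DPID values are invented
client_db = [
    {"dpid": 1000001, "name": "J Smith", "updated": date(2009, 5, 1)},
    {"dpid": 1000001, "name": "j smith ", "updated": date(2010, 8, 15)},
    {"dpid": 1000002, "name": "A Jones", "updated": date(2010, 1, 20)},
]
deduped = single_customer_view(client_db)
```

Real customer matching usually needs fuzzier name comparison than this exact-match key.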
Let’s take a random address from your database
Suburb = Remuera
Deprivation Index 1 (in the wealthiest 10% of the country)
The meshblock for that random address has approx:
126 residents in 42 dwellings
Median Age 41
Median 4 bedrooms per dwelling
75% of households with household income (HHI) of $100k+
44% of residents have a Bachelor’s degree or higher
From Movers/NZCOA
Mr and Mrs X moved address in July 2008
They are owners not renters
They moved from Wellington
From Lifestyle Survey
They have two children, a cat and a dog
They drive a BMW and a VW Golf
They have household income of over $150k
They like water sports
They have a two year fixed rate loan reviewed in May 2011
They are considering a trip to Europe
The address is a residential property, with a “no circulars” sign
They are in Genius Segment “A1 Cream of the Crop”
By linking the DPID of your customer to other data sources you build a much bigger picture of who they are
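In data terms, this linking is a join on DPID across sources. A minimal Python sketch, assuming each source is a lookup table keyed by DPID. The function name, field names and DPID value are hypothetical, and the attribute values are taken from the example address above:

```python
def build_profile(dpid, **sources):
    """Merge every source's attributes for one DPID into a single dict."""
    profile = {"dpid": dpid}
    for source_name, table in sources.items():
        # Prefix each field with its source so origins stay traceable
        for field, value in table.get(dpid, {}).items():
            profile[f"{source_name}.{field}"] = value
    return profile

dpid = 1234567  # made-up identifier
meshblock = {dpid: {"suburb": "Remuera", "median_age": 41}}
movers = {dpid: {"moved": "2008-07", "tenure": "owner"}}
genius = {dpid: {"segment": "A1 Cream of the Crop"}}

profile = build_profile(dpid, meshblock=meshblock, movers=movers, genius=genius)
```

A DPID missing from a source simply contributes no fields, so partial coverage does not break the profile.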
Then we profile a client’s database
We would normally start by profiling the client database by Genius segment to get an understanding of existing customers
─ We would sometimes do sub-profiles focusing on their “best” customers
And we usually produce some maps
Then find prospects that are similar
Acquisition
Now we know the Genius segments to focus on, we can target suitable acquisition lists to find prospects that fit this Genius profile
We can then target the best areas for:
─ Unaddressed letterbox drops
─ Semi addressed mailing
─ Addressed mailing
Cross-sell
We can profile the Genius segments of customers that are the most profitable, then find more customers that fit this profile from the wider customer base to cross-sell to
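The profile-then-target flow can be sketched as two small steps: count which Genius segments the best customers fall into, then keep the prospects in those segments. This Python sketch assumes each record already carries a segment code. The function names, the counts and the DPID values are hypothetical:

```python
from collections import Counter

def focus_segments(best_customer_segments, n=2):
    """Return the n segments that hold the most 'best' customers."""
    counts = Counter(best_customer_segments)
    return [seg for seg, _ in counts.most_common(n)]

def select_prospects(prospects, segments):
    """Keep only the prospects whose segment is in the focus list."""
    wanted = set(segments)
    return [p for p in prospects if p["segment"] in wanted]

# Hypothetical segment codes of a client's best customers
best_customers = ["A1", "A1", "A1", "A2", "A2", "H3"]
# Hypothetical acquisition list; the DPID values are invented
prospect_list = [
    {"dpid": 2000001, "segment": "A1"},
    {"dpid": 2000002, "segment": "F2"},
    {"dpid": 2000003, "segment": "A2"},
]

focus = focus_segments(best_customers)
targets = select_prospects(prospect_list, focus)
```

The same two functions cover the cross-sell case if the prospect list is replaced by the client's wider customer base.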
Summary
We started off with the right people – with technical capability and business nous
We had the right tools
─ SAS has been an integral part of the development of Genius™, and other work we do in our team
We had the right data to make the project a success
And we had the organisational vision to support a project such as this
A marriage made in heaven!
Questions & Discussion