Efficient Marketing by Customer Record Deduplication: UBM Asia
How Reifier is helping UBM Asia gain a single view of customers at low cost

About UBM Asia

UBM Asia (www.ubmasia.com) is Asia’s major trade fair and exhibition organizer. Owned by UBM plc, which is listed on the London Stock Exchange, UBM Asia is headquartered in Hong Kong and has subsidiary companies in Asia and the US, spanning 24 cities and 31 offices with a staff of 1,300 people. With a track record spanning over 30 years, UBM Asia operates in 20 market sectors with 230 dynamic face-to-face exhibitions and high-level professional conferences, 21 targeted trade publications and 18 round-the-clock online products for over 2,000,000 quality exhibitors, visitors, conference delegates, advertisers and subscribers from all over the world.

The problem

UBM Asia collects contact information of visitors to its leading trade fairs. Due to the international nature of the visitorship, the data is multilingual, with a mix of English, Chinese, Thai, Turkish, Korean and Japanese. The contact information is collected at various points in the cycle: registration desks, online portals, survey forms, social media, various directories and so on. It is primarily obtained through paper forms, which are digitized or manually entered into a database. Typical record volumes are about a million entries.

At the end of the fair, the data is collated and fed into a CRM (Client Relationship Management) system for future correspondence, offerings and promotions. The CRM is the cornerstone of UBM’s business. The sales and marketing team uses the system heavily to invite attendees and to send marketing promotions, service emails and critical event planning information to prospects, both electronically and via traditional means.

The contact information is riddled with poor-quality data: missing fields, typographical and lexical differences, and field swapping within multiple entries of the same person.
Many times, visitors provide a common company sales or marketing email address, phone number and postal address instead of their own personal email addresses, phone numbers and addresses. Other times the same visitor may provide different emails or phone numbers, or an official address in one case and a personal address in another. There are also misspellings, partial names with missing first, middle or last names, leading and trailing spaces and other typographical variations across all the fields.

As a consequence of having these duplicates, UBM Asia was:
- Missing cost-saving opportunities
- Delivering a suboptimal customer experience, because the same customer was approached multiple times for the same offer
www.nubetech.co | [email protected]
The sheer size of the data as well as the nuanced differences make manual deduplication impossible. As exact matches are rare, database joins and filtering are ruled out too.

Requirements

UBM Asia wanted a solution for data matching and quality which could:
a. Handle different variations in fields across records: missing middle, first and last names, abbreviations in different parts of names and addresses, typographical errors etc. The tool also needed to handle a mix of Chinese and English characters within the same record, as well as datasets containing both Chinese and English records
b. Support different geographies: even when the names are in English, there are regional differences between an event held in India and one held in Singapore
c. Yield results faster
d. Work without data massaging, normalization and preprocessing
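To illustrate why exact joins miss duplicates with the kinds of variation listed above, here is a minimal sketch using only the Python standard library. The two records are hypothetical, and the character-level similarity from `difflib` is a stand-in chosen for illustration; it is not Reifier's algorithm, and the light normalization step is an assumption of this sketch.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace (assumption: a deliberately light touch)."""
    return " ".join(value.lower().split())


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] computed by the standard library."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# Hypothetical records, invented for illustration only.
rec_a = {"name": "Jonathan Lee", "company": "Acme Trading Ltd"}
rec_b = {"name": " Jonathon Lee ", "company": "ACME Trading Limited"}

# What an exact database join on these fields would effectively test.
exact_match = rec_a == rec_b

# Per-field fuzzy scores for the same pair of records.
scores = {field: similarity(rec_a[field], rec_b[field]) for field in rec_a}

print("exact match:", exact_match)
for field, score in scores.items():
    print(f"{field}: {score:.2f}")
```

The exact comparison fails because of a misspelling, stray spaces and an abbreviation, while the per-field scores stay high. Turning such scores into a match decision still needs thresholds or weights, which is the tuning burden of conventional tools.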
Approach

UBM Asia tried multiple existing solutions, but none of them could handle the complexity, volume and variety of the data and provide a usable level of accuracy. Existing deduplication solutions are rule- and dictionary-based; defining and managing the rules is a complex and time-consuming activity performed by a developer who has a background in matching algorithms and tweaks the weights assigned to different fields. To create precise rules, a lot of data cleansing and preprocessing is also needed. Rules and dictionaries need modification when the context of the data changes or when the language or locale changes. For example, a rule mapping the English name Jonathan to Jo is invalid in an Asian context, where Jo is a name in itself. Thus learnings from one set of data cannot easily be reused on another set of data; doing so requires costly and time-consuming intervention from an expert.

UBM Asia’s Business Intelligence team uses Reifier’s fuzzy machine engine for smart matching. With minimal setup time, Reifier matches and links contact records containing different languages as well as variations across fields, yielding an accuracy of 70% or more. The same training model works with English and Chinese records. UBM Asia is also able to successfully match and link Japanese records on the same setup without any configuration changes. Using Reifier’s smart web interface, UBM Asia’s Business Intelligence team performs their matching tasks with ease, deduplicating and linking data within minutes instead of days. [1]
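The Jonathan/Jo example above can be sketched in a few lines of Python. The dictionary and function names here are hypothetical and only illustrate how a rule-based matcher might encode a nickname rule; they are not taken from any specific product.

```python
# Hypothetical nickname dictionary of the kind a rule-based matcher might use.
NICKNAME_RULES = {"jo": "jonathan"}


def canonical_first_name(name: str) -> str:
    """Apply the nickname rule after trimming and lowercasing."""
    key = name.strip().lower()
    return NICKNAME_RULES.get(key, key)


def same_first_name(a: str, b: str) -> bool:
    """Rule-based comparison: equal after nickname expansion."""
    return canonical_first_name(a) == canonical_first_name(b)


# On an English-language dataset the rule behaves as its author intended.
print(same_first_name("Jo", "Jonathan"))

# The rule has no notion of locale. On a dataset where "Jo" is a complete
# given name, the same call still reports a match, so two different people
# can be linked by mistake until an expert edits the dictionary.
```

Because the rule lives in a hand-maintained dictionary, moving to a new language or locale means revisiting every such entry.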
Reifier’s innovative fuzzy machine algorithms use machine learning to overcome the limitations of traditional systems. Reifier is directly managed by the business user, data scientist or data engineer, who can train Reifier to identify duplicates just like a human would, without the need for a data matching developer or expert.

[1] As per industry average and UBM Asia’s internal findings, a temporary worker can manually verify up to 1000 records a day.
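For a rough sense of scale, the footnoted figure of up to 1000 manually verified records a day can be combined with the typical volume of about a million entries mentioned earlier. The sketch below assumes each record is checked exactly once by a single worker, which is a simplification; it also counts the distinct record pairs a naive all-against-all comparison would have to consider.

```python
import math

records = 1_000_000             # "about a million entries" per the text
records_per_worker_day = 1_000  # upper bound from the footnote

# Assumption: one pass, each record verified once.
worker_days = math.ceil(records / records_per_worker_day)

# Number of distinct record pairs in a naive all-against-all comparison.
candidate_pairs = records * (records - 1) // 2

print(f"worker-days for one pass: {worker_days}")
print(f"candidate pairs: {candidate_pairs:,}")
```

Under these assumptions a single pass takes on the order of a thousand worker-days, and the pair count runs to hundreds of billions, which is consistent with the claim that manual deduplication is impractical at this volume.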
“Before Reifier, we had to use a lot of manual effort to identify potential duplicates in customer data; now the system can learn patterns and find duplicates for us intelligently. It’s a breakthrough to a longstanding issue of our businesses.”
Mr. Dave Chan, Regional Director Business Intelligence, UBM Asia
Reifier’s automated learning engine brings up the deduplication system 5 times faster and identifies 2 times more duplicates than conventional tools. [2] As Reifier learns from the data itself, it works seamlessly with different datasets: products, people, organizations, addresses etc. Built on Apache Spark, Reifier is highly scalable to billions of records. Reifier can be deployed on premises or in the cloud, providing sufficient ROI to the end user.

To see how Reifier can help you, write to us at [email protected] / tweet to @nubetech / call +918800541717 today.
[2] Comparison performed independently by another customer; reference available on request.