IBM Research Noisy Text Correction – an exercise in futility? Sreeram Balakrishnan IBM India Research Lab




Noisy Text Analytics Workshop IIIT Jan 8th 2007

Aggregate versus Instance analysis

Applications for noisy text can be divided into two broad categories:

1. Applications that look at individual text instances

E.g. search, transcription (OCR, speech-to-text)

2. Applications that look at aggregate features of the text

E.g. document classification, aggregate text analytics

Aggregate analysis is more robust to noise since errors can be averaged out.

Text correction techniques can help improve accuracy of aggregate statistics

Applications that require accurate correction of each text instance may be an exercise in futility (at least in the short term)

E.g. SMS messages that need knowledge of the whole context of the conversation even to be corrected manually
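
The point that text correction can improve aggregate statistics can be illustrated with a minimal sketch. It assumes a small, hypothetical domain lexicon and uses fuzzy string matching from Python's standard `difflib` module; the lexicon, the cutoff value, and the sample messages are illustrative assumptions, not part of the talk.

```python
from collections import Counter
from difflib import get_close_matches

# Hypothetical domain lexicon; a real system would use a much larger vocabulary.
LEXICON = ["customer", "service", "roaming", "payment", "activate"]

def normalize(token, lexicon=LEXICON, cutoff=0.8):
    """Map a noisy token to its closest lexicon entry, or keep it unchanged."""
    matches = get_close_matches(token.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else token.lower()

def aggregate_counts(messages):
    """Count normalized tokens over a whole collection of messages."""
    counts = Counter()
    for message in messages:
        counts.update(normalize(tok) for tok in message.split())
    return counts

# Illustrative messages using misspellings of the kind seen in SMS data.
messages = [
    "custmer care not working",
    "custamer service poor",
    "customor wants roming",
]
counts = aggregate_counts(messages)
print(counts["customer"], counts["roaming"])
```

Individual corrections can still be wrong, but the aggregate count for a term such as "customer" no longer fragments across its spelling variants.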


Example – Customer Contact Records

Date: 19990425, ID: 13163548 PRELA 04/25/1999 20:46 - Call started by John Velocci (MOB_NORTH). Q: wants to know if he's protected againt the CIH virus. a: has ibm anti virus installed. A: told him to goto the web site for upgrade patches. told him to fax pop. s: self st: closed 04/25/1999 20:51 - Call closed by John Velocci (MOB_NORTH).

Date: 19990426, ID: 13171316 POWER b04/26/1999 18:50 - Call started by Scott MacDonald (MOB_NORTH). q: CX'S DOG ATE HER POWER SUPPLY a: looked up the pn for ac adapter 02k6496 and transfered her to parts 04/26/1999 18:55 - Call closed by Scott MacDonald (MOB_NORTH).

Date: 19990604, ID: 13376646 MONIT 06/04/1999 22:16 - Call started by Barry O'Kelly (IREL_MOB3). Q:Tp attached to dock with external monitor.....black border around LCD and monitor A:Undocked Tp......booted.....screen full Only gets border when attached to dock and monitor Was reinstalling monitor when cus disconnected S:Training 06/04/1999 23:11 - Call placed in Mobiles call back queue by Barry O'Kelly (IREL_MOB3). 06/04/1999 18:14 - Call taken by Andrew Atias (TAG4). Q: Customer calling back, customer still getting black border on LCD and monitor. A: I explained to the customer that this will will happen when using a simultanious display. S: SOP 06/04/1999 19:04 - Call closed by Andrew Atias (TAG4).
CALL TYPE: Technical Information for Purchased Equipment
CODENAME: MICH-2
MACHINE TYPE: 9546
COMPONENT TYPE: Monitor/Display

::

Date: 19990605, ID: 13376581 MONIT 06/03/1999 10:39 - Call started by Robert Dennis (MOB_NORTH). Machine is oow warranty Q:machine lcd panel will cut out on cust with the machine sitting normally cust states that presure on the under side of machine at or near the F10 key is what is needed to keep machine running....advised billable repair cust agreed, seeking R3 service 10:46:02 * MSG FROM EZSRV : EasyServe R3 pickup request received for 10:46:02 * MSG FROM EZSRV : Machine Type: 2640 Serial# 78GM283 advised customer that data may be lost when sending in to ez serve.... advised customer to back up all personal data...if possible also machine may be reloaded as part of pd/repair process please have all software, product licenses and COA's available when machine is returned also please write the case number on the outside of the box before sending in to ez serve repair 06/03/1999 10:49 - Call closed by Robert Dennis (MOB_NORTH). 06/04/1999 22:52 - Case Number: 13366697 continued by Margaret Butler ::

Date: 19991001, ID: 8629697

inquiry about TP8:memory upgrade

customer would like to upgrade TP8:memory but needs information.

gave customer information about memory products.

Diverse contact media

Large number of customer claim logs

• IBM PC help centers received over 500,000 calls per year
• Agents produce summary transcripts for each call

Date: 19991051, ID: 8630655

complaint about TPxx

customer reports paint is peeling around palm rest

• Summary records contain details of why customers are unhappy
• Aggregate analysis of the key phrases reveals that paint peeling at the palm rest is a common complaint of many TPxx users
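
The aggregate key-phrase analysis described above can be sketched as a simple count of phrases per product. This is only an illustration under stated assumptions: the phrase list, the `(product, summary)` record shape, and the sample records are hypothetical, loosely modelled on the summaries shown on this slide.

```python
from collections import Counter, defaultdict

# Hypothetical key phrases; a real system would derive these from the domain.
KEY_PHRASES = ["paint is peeling", "memory upgrade", "black border"]

def phrase_counts_by_product(records, phrases=KEY_PHRASES):
    """Return {product: Counter of key phrases} over (product, summary) pairs."""
    table = defaultdict(Counter)
    for product, summary in records:
        text = summary.lower()
        for phrase in phrases:
            if phrase in text:
                table[product][phrase] += 1
    return table

# Illustrative records modelled on the summaries above (not real data).
records = [
    ("TPxx", "customer reports paint is peeling around palm rest"),
    ("TPxx", "Paint is peeling near the palm rest again"),
    ("TP8", "customer would like information on a memory upgrade"),
]
table = phrase_counts_by_product(records)
print(table["TPxx"].most_common(1))
```

With enough records, a phrase that recurs for one product stands out even if some individual summaries are too noisy to match.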


Value from wider sources of data

Customer Care

How can customer service be improved?

Customer survey: I was unhappy with service because the agent did not answer my question

Market Intelligence

How can campaigns for upsell be better targeted?

Customer contact: I would like information on mutual funds

Quality Insights

Why are warranty claims high for product XX?

Claim form: Hinge cracked on product XX after 3 months

Epinions.com: Don’t buy product XX, the hinge cracks within 1 year

Research & Discovery

What diseases are associated with gene-xyx?

Medline documents: Gene-xyx expressed in mouse with zzz


Some examples of SMS data

• Please send me about yyy card
• What is the no. that i may have to dial for kaun banega lakhpati ?
• Tell me about new plans plese thanks
• Mera custmbar care kayu band kar diya gaya hai kirpa karke aap mera custmbar chalu karen kayu ki mera ko newplan ki jankare chahi
• Sir mere custmer care nall gall nahi ho rahi..
• Please activate my wap over gprs
• Gup shup pack not activating .unable to connect 656.
• Pl. confirm the receipt of payment of Rs. 500 paid on 19.05.06 vide receipt 0244213 at Karanagar. Thanks
• Request har never made by me for ISD. I dont need ISD
• 24 hrs over what is the reply
• I am post paid customor & i have quary about my bill but custamer ex are not there. What can i do now
• 3 din ho gaye aap ko aap ke 24 hour kab pure honge
• Xxx ki service ko kya ho gaya hai.custmer ko satisfied hi nahi karte.
• Tell me where i can contact othewise i would take another connection.
• The service of xxx is extremly bad and some of the senior employee are irresponsible regarding their work.e.g. (XYZ)
• Xxx service veri poor.
• No care for customer is what Xxx focus on.
• I've to leave xxx as it is not solving my problem. Gudbye
• Keep NOT care customers
• I am very distrebed to xxx massangar
• I riqvest 3rd time complained
• Bhaji plz custmar care service chalu kar do nahi ta mai no. Band kar devaga.Menu bahut mushkil aa rahi plz.Kal spice da no chalega


Some examples of call centre notes

no.fwd to unbarr pls actv. AR on cst reqt ............10:19am. cust wants to actv roam as he don't understand the ivr roming actv on cust req ,,,,,,,,,,charges told,,,,reena no. unbarred as pymt reflected on cust req xxx roaming deactv on cust req the cust secratory called up and he inf tht he was not able to access

GPRS ,he was not able to confirm whether its masala or MO,and he told that he will call back with other details later and disconn teh call

No waiver given to him at any cost........ promotional mssg restricted as on cst req......11:14am Customer was charged SMS for Rs.3074.But customer didnt give request

for deactivation of 10000sms pack.Since om dwn,not able to chk active or not.But its shows active in new crm window.

resume no. as pymt is reflected.....................9.20am ar deactivated on cust request case escallated : HEALTH ALERTS to be deactivated ......11:15 am
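
Agent notes like these rely heavily on ad hoc abbreviations and filler punctuation. A minimal sketch of dictionary-based expansion is shown below; the abbreviation table is a hypothetical guess made from the notes above for illustration, and a real deployment would need a curated, domain-specific table.

```python
import re

# Hypothetical abbreviation table, guessed from the notes above for illustration.
ABBREVIATIONS = {
    "cust": "customer", "cst": "customer", "pymt": "payment",
    "actv": "activate", "deactv": "deactivate", "req": "request",
    "reqt": "request", "pls": "please", "chk": "check",
}

def expand_note(note, table=ABBREVIATIONS):
    """Strip runs of filler punctuation and expand known abbreviations."""
    # Runs such as "......" or ",,,," act as separators in the notes.
    cleaned = re.sub(r"[.,]{2,}", " ", note)
    words = []
    for token in cleaned.split():
        core = token.strip(".,")          # drop leading/trailing dots and commas
        if core:
            words.append(table.get(core.lower(), core))
    return " ".join(words)

print(expand_note("pls actv. AR on cst reqt ............10:19am."))
```

Tokens that are not in the table (for example "AR") are passed through unchanged, so an unknown abbreviation is left visible rather than guessed.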


Enriching structured BI with unstructured data

Augment classic structured data warehouses with information extracted from unstructured sources

Domain specialized annotators embedded in UIMA (open source) extract structured attributes from unstructured sources

Figure (diagram labels): Unstructured sources; UIMA processing; Unstructured enriched data; Structured sources; Link/Cleanse/Transform; Warehouse; Reporting, OLAP tools, Modeling, Mining.

UIMA: Unstructured Information Management Architecture
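
The idea of an annotator that extracts structured attributes from unstructured text, which are then merged into a structured record, can be sketched in plain Python. This does not use the UIMA API; the regular expressions, field names, and the sample row are assumptions for illustration, modelled on the contact-record text shown earlier in the deck.

```python
import re

# Hypothetical patterns modelled on the contact-record text shown earlier.
MACHINE_TYPE = re.compile(r"Machine Type:\s*(\d{4})")
SERIAL = re.compile(r"Serial#\s*([A-Z0-9]+)")

def annotate(text):
    """Extract structured attributes from free text; missing fields are None."""
    mt = MACHINE_TYPE.search(text)
    sn = SERIAL.search(text)
    return {
        "machine_type": mt.group(1) if mt else None,
        "serial": sn.group(1) if sn else None,
    }

def enrich(structured_row, note_text):
    """Merge extracted attributes into a structured record (a plain dict here)."""
    return {**structured_row, **annotate(note_text)}

# Illustrative structured row; the field names are assumptions.
row = {"case_id": "13376581", "status": "closed"}
note = "MSG FROM EZSRV : Machine Type: 2640 Serial# 78GM283 advised customer"
print(enrich(row, note))
```

Once the attributes sit alongside the structured fields, they can be queried with the same reporting and OLAP tools as the rest of the warehouse.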


Analysis of Agent Performance (AT002)

Scenario: IBM BPO business with agents handling car rentals wants higher check-out rate

Solution: Extract and correlate key phrases from call transcripts with outcomes

Value Selling Phrases

Mention of Good Vehicle

Mention of Good Rate


Analysis of Agent Performance

Figure (chart): outcomes shown are checked-out cars, no-shows, and cancelled; phrase categories are value selling phrases and pick-up information; values shown: 47% and 25%.

Higher use of value selling phrases mentioning good rate for checked out cars versus no show
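
Correlating key phrases with outcomes, as in this analysis, amounts to comparing how often a phrase category appears per outcome. The sketch below is a simplified illustration: the phrase list, the outcome labels, and the sample transcripts are hypothetical and do not reproduce the figures on the slide.

```python
from collections import defaultdict

# Hypothetical value-selling phrases, for illustration only.
VALUE_PHRASES = ["good rate", "good vehicle"]

def phrase_rate_by_outcome(calls, phrases=VALUE_PHRASES):
    """Return {outcome: fraction of calls whose transcript mentions any phrase}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for transcript, outcome in calls:
        totals[outcome] += 1
        text = transcript.lower()
        if any(phrase in text for phrase in phrases):
            hits[outcome] += 1
    return {outcome: hits[outcome] / totals[outcome] for outcome in totals}

# Illustrative (transcript, outcome) pairs; not real call data.
calls = [
    ("We have a good rate on that model this week", "checked_out"),
    ("Your pick up is at the airport counter", "checked_out"),
    ("Pick up is at 9am tomorrow", "no_show"),
    ("Please bring your licence", "no_show"),
]
rates = phrase_rate_by_outcome(calls)
print(rates)
```

Because the comparison is made over many calls, occasional transcription errors in individual transcripts shift the rates only slightly.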