Interactive Voice Response (IVR) Systems: Mobile Applications for Low-Literate Users
Juan Roldan, Usha Chandna, Kautilya Nalubolu, Alex Mitchell
November 11, 2013
Outline
I. Mobile Phone Technology and the global illiteracy problem
II. Designing Mobile Interfaces for Novice and Low-Literacy Users
III. IVR System: Voice Query Voice Response Model
IV. Polly
V. Additional IVR applications
VI. Conclusions and challenges
I. Mobile Phone Technology and the global illiteracy problem
• Obsolescence of PDAs and other handheld devices, with a sustained or increasing need for mobility
– Laptop computers are less portable and tablets more costly
• Increasing sophistication of applications/programs available on a mobile platform
• The great number of mobile phone users/subscribers already in developing countries
– Illiterate populations in India, in parts of Africa and throughout much of Latin America have at least a passing familiarity with mobile technology
Why Mobile Phones?
Literacy Rates by Continent
http://www.maps.com/ref_map.aspx?pid=12877, 2011
• A. Illiteracy: the inability to read and write in one’s native tongue
– a. We distinguish between nonliterate and semiliterate populations
• 1. Nonliterate: having no reading/writing ability
• 2. Semiliterate: unable to read more than basic or perfunctory sentences; may be fluent in numeracy
• B. Technological illiteracy: inexperience with, or limited facility for, using and applying (mobile) technology
Two Types of Illiteracy
Mobile Phone Ownership by Continent
http://www.trendhunter.com/trends/waste-ventures, 2012
• In India, calls are billed at a per-minute rate of less than $0.01, one-eighteenth to one-twentieth of rates observed in the UK, US and Japan
• Per-minute/text rates in Latin America begin around the penny mark in some countries, and exceed $0.10 in others
– Onerous taxation in Chile, monopolies in Mexico
– Data plans are priced commensurate with the American market, despite enormous differences in GDP per capita
• In Africa, call rates vary significantly by country
– In developing countries, mobile phone costs account for as much as 30% of household income
– Mobile costs exacerbate income inequalities
Phone costs in India and Latin America
Barrantes, Roxana, and Hernan Galperin. "Can the Poor Afford Mobile Telephony? Evidence from Latin America." Telecommunications Policy 32.8 (2008): 521-30. http://www.sciencedirect.com/science/article/pii/S0308596108000554. Elsevier, Sept. 2008. Web.
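The ratios quoted on this slide can be turned into rough numbers. The sketch below is only illustrative: it takes $0.01 as an upper bound for the Indian per-minute rate, and the household figures in the second example are hypothetical values chosen to land on the 30% share mentioned above, not data from the cited study.

```python
# Upper bound on India's per-minute rate, as quoted on the slide ("less than $0.01").
INDIA_RATE_USD = 0.01


def implied_rate(base_rate, ratio):
    """Rate in a market where calls cost `ratio` times the base rate."""
    return base_rate * ratio


def mobile_cost_share(monthly_mobile_cost, monthly_income):
    """Fraction of household income spent on mobile service."""
    return monthly_mobile_cost / monthly_income


# "One-eighteenth to one-twentieth" means the other markets cost 18x to 20x more.
low = implied_rate(INDIA_RATE_USD, 18)
high = implied_rate(INDIA_RATE_USD, 20)
print(f"Implied UK/US/Japan rate: under ${low:.2f} to ${high:.2f} per minute")

# Hypothetical household: $15 of mobile costs on $50 of monthly income.
print(f"Share of income: {mobile_cost_share(15, 50):.0%}")
```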
II. Designing Mobile Interfaces for Novice and Low-Literacy Users
Indrani Medhi, Microsoft Research India
Somani Patnaik, Massachusetts Institute of Technology
Emma Brunskill, University of California, Berkeley
S. N. Nagasena Gautama and William Thies, Microsoft Research India
Kentaro Toyama, University of California, Berkeley
Medhi, Indrani, Somani Patnaik, Emma Brunskill, S. N. Nagasena Gautama, William Thies, and Kentaro Toyama. "Designing Mobile Interfaces for Novice and Low-literacy Users." ACM Transactions on Computer-Human Interaction (TOCHI) 18.1 (April 2011): 2.1+. Web.
Set out to describe barriers to mobile use and design possibilities for better engaging illiterate users, who occupy an increasing market share
Illiterate users were most likely to use phones exclusively for synchronous calling, and rarely exploited higher-order applications
Focus on low-cost mobile phone development projects
• Examples from mobile health programs and mobile banking
• In many cases observed, respondents were already phone owners
Designing Mobile Interfaces for Novice and Low-Literacy Users
Nonnumeric inputs: nonliterate populations struggled to identify and use unfamiliar symbols (*, #, &), as well as letters, for messages requiring text input
Soft-key mapping: difficulties with unlabeled and ambiguously labeled navigation keys
Discoverability: features or attributes laid out incoherently in a mobile interface
• Scrollbars: novice and inexperienced users may be unaware that some features are “hidden” below those appearing on the main menu
Hierarchical navigation: pertinent features and applications are buried in unreadable blocks of text
• Graphics not intuitively designed for navigation or to reflect the purpose of a button
Barriers to mobile use by nonliterate populations
Language barriers occur where nonliterate and semiliterate populations cannot read/write in their own dialect, and where, even among literate users, the language and terminology of an application are foreign
Mobile banking and healthcare apps: language characterized by technical jargon, alien phrases/idioms
Many apps produced for a global market use a single language, English, as a means of capturing many users with minimal investment
• Still other apps, produced for foreign markets, use English prompts exclusively, or an unintelligible mix of domestic and foreign terms
The Peculiar Primacy of the English Language in Mobile Applications
Study 1
• Tested 58 subjects in Bangalore, India on fluency with mobile banking technology, each with absent or limited writing/reading capabilities
• 3 conditions:
– Text-based: control group
– Voice UI (IVR): spoken options for menu selection; speech-based feedback
– Graphical UI: picture-based menus; static, hand-drawn and culturally relevant graphical representations
• Three groups:
– (a) novice users
– (b) seasoned users
– (c) no experience with mobile devices
Two experiments
Results
Illiterate users were uniformly unable to complete a transaction on the text-based UI
Voice-based UI transactions were completed with a 72% success rate, and in less than half the time of graphical UI trials
Graphical UIs saw a 100% completion rate, at an average completion time of 13 minutes
Speed differentials are thought to be related to users’ familiarity with voice-based technologies generally
• A natural affinity for voice-based UIs given experiences with synchronous calling, etc.
• Less hesitation, and a reduced fear of “breaking” or “spoiling” the phone, fears which are likely to abate with experience on graphical UIs
Results (Cont’d)
(1) Provide graphical cues.
(2) Provide voice annotation support wherever possible.
(3) Provide local language support, both in text and audio.
(4) Minimize hierarchical structures.
(5) Avoid requiring nonnumeric text input.
(6) Avoid menus that require scrolling.
(7) Minimize soft-key mappings.
(8) Integrate human mediators into the overall system, to familiarize potential users with scenarios and UIs.
Design Recommendations for Mobile Phone Technology
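Several of these recommendations translate directly into how a voice menu is structured. The following is a minimal sketch, not taken from the study: the menu entries, language codes and audio file names are all hypothetical. It shows a single-level menu (recommendation 4) driven only by digits (5), with per-language recorded prompts (2, 3) and a route to a human agent (8).

```python
# Hypothetical menu for a voice-based banking line; all names are illustrative.
MENU = {"1": "check_balance", "2": "send_money", "3": "talk_to_agent"}

# One recorded prompt per supported language (file names are placeholders).
PROMPT_AUDIO = {"kn": "menu_kn.wav", "hi": "menu_hi.wav", "en": "menu_en.wav"}
DEFAULT_LANGUAGE = "kn"


def prompt_for(language):
    """Return the recorded menu prompt, falling back to the default language."""
    return PROMPT_AUDIO.get(language, PROMPT_AUDIO[DEFAULT_LANGUAGE])


def handle_keypress(key):
    """Map a single digit to an action; replay the prompt for anything else.

    Symbols such as * and # are never required, and there are no submenus,
    so every action is one keypress away from the top-level prompt.
    """
    return MENU.get(key, "replay_prompt")


print(prompt_for("hi"))
print(handle_keypress("2"))
print(handle_keypress("#"))
```

Replaying the prompt on unexpected input, rather than reporting an error, is a design choice meant to reduce the fear of "breaking" the phone noted in the results.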
Technology resistance
Temporary service without durable solutions to the illiteracy problem
Programs do not provide mobile technology, but merely make it more accessible to current users
– High vulnerability to financial shocks, theft, etc.
Complexity of creating UIs for countries with multiple dialects/languages
– A limitation felt more strongly by voice UIs than by graphically-oriented ones
Program costs and sustainability
– Donor attrition rates
Limitations
III. Interactive Voice Response System (IVR)
Interactive Voice Response (IVR) System?
An automated telephony system that
• interacts with callers,
• gathers information, and
• routes calls to the appropriate recipient.
It comprises
• telephony equipment,
• software applications,
• a database, and
• a supporting infrastructure.
Vashistha, Aditya, and Rajarathnam Nalluswamy. "Voice Based Social Networking and Information Delivery System for Farmers." Convergence Lab, n.d. Web.
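The three functions in the definition above (interact, gather, route) can be sketched in a few lines. This is a simplified simulation under stated assumptions: the routing table, queue names and the in-memory list standing in for the database are hypothetical, and real telephony equipment is out of scope.

```python
from dataclasses import dataclass, field

# Hypothetical routing table: digit pressed -> destination queue.
ROUTES = {"1": "billing", "2": "technical_support"}
FALLBACK_ROUTE = "operator"


@dataclass
class CallRecord:
    caller_id: str
    keypresses: list = field(default_factory=list)
    routed_to: str = ""


def handle_call(caller_id, keypresses, database):
    """Simulate one call: gather input, log it, and pick a destination."""
    record = CallRecord(caller_id=caller_id)
    for key in keypresses:               # interact with the caller
        record.keypresses.append(key)    # gather information
        if key in ROUTES:
            record.routed_to = ROUTES[key]
            break
    if not record.routed_to:
        record.routed_to = FALLBACK_ROUTE  # no valid choice: hand off to a person
    database.append(record)              # persist the interaction
    return record.routed_to


calls = []  # stands in for the database component
print(handle_call("caller-1", ["9", "2"], calls))
print(handle_call("caller-2", [], calls))
```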
IVR: Challenges in Scaling Voice Forum
Moderating content at scale
• Possible solutions:
a. Hire a large fleet of dedicated moderators
b. Utilize community moderators
Managing call cost at scale
• Possible solutions:
a. Reduce call charges by leveraging local calls
b. Broadcast audio via mobile internet
Vashistha, Aditya. "IVR Junction: Building Scalable and Distributed Voice Forums in the Developing World." Microsoft Research.
IVR Junction
Connects internet-based users with phone-based users
Enables information exchange at an international level
Saves users the cost of long-distance phone calls
IVR Junction
IVR Junction stores all voice data using online Cloud storage
www.microsoftresearch.com
IVR Junction
IVR + cloud-based technology = IVR Junction
IVR Junction integrates IVR service with social media services
IVR Junction Users
Applications of IVR Junction
CGNet Swara
Avaaj Otalo
Health line
Viral Entertainment Platform-Polly
IV. Polly
Polly
Polly is a telephone-based, voice-based application which allows users to make a short recording of their voice, modify it and send the modified version to friends.
Video: http://www.cs.cmu.edu/~Polly/
Source: Job Opportunities through Entertainment: Virally Spread Speech-Based Services for Low-Literate Users. CHI13 presentation.
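The slide does not say how Polly modifies a recording. As one illustration of a simple voice effect, and only as an assumption about what "modify" could mean, the sketch below changes playback speed by nearest-neighbour resampling of a list of audio samples. The function name and the toy sample values are hypothetical.

```python
def change_speed(samples, factor):
    """Resample by nearest-neighbour lookup; factor > 1 shortens the clip.

    Played back at the original sample rate, the result sounds faster and
    higher-pitched (factor > 1) or slower and lower-pitched (factor < 1).
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    new_length = int(len(samples) / factor)
    last = len(samples) - 1
    return [samples[min(int(i * factor), last)] for i in range(new_length)]


recording = [0, 1, 2, 3, 4, 5, 6, 7]   # toy stand-in for audio samples
print(change_speed(recording, 2))       # every second sample is kept
print(change_speed(recording[:4], 0.5)) # every sample is repeated
```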
Voice-based entertainment service
• Entertainment as a “viral conduit”
• Disseminate development-focused, telephone-based services
• Incentivize people to train themselves
Polly
Job ad browsing
• For low-skilled, low-literate workers
Additional voice-based applications?
Polly
Polly – goals
• Can a system like Polly be scalable?
• Demographic characteristics of Polly users.
• Cost-sensitivity: are users willing to pay?
• First development-focused service: response of Polly users to the job information service.
Polly – 2012 scale deployment
Initially, Polly’s phone number was given to 5 low-literate people in May 2012:
• 85,000 users in ~4.5 months
• 495,000 interactions
• 1,000 new people daily
As of April 2013:
• 163,000 users
• 630,000 interactions
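The deployment figures above can be reduced to per-day and per-user averages. This is a back-of-the-envelope sketch that assumes a 30-day month. The average daily growth it yields is lower than the 1,000 new people daily quoted on the slide, which may describe a peak or later rate rather than the whole-period average.

```python
DAYS_PER_MONTH = 30  # assumption: rough month length for the ~4.5-month window


def per_day(total, months):
    """Average count per day over a period given in months."""
    return total / (months * DAYS_PER_MONTH)


def per_user(interactions, users):
    """Average number of interactions per user."""
    return interactions / users


print(round(per_day(85_000, 4.5)))           # → 630 new users per day on average
print(round(per_user(495_000, 85_000), 1))   # → 5.8 interactions per user
print(round(per_user(630_000, 163_000), 1))  # → 3.9 interactions per user
```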
Polly – Demographics
Polly – Controlled trials
Polly – Effect on daily quota of 7 calls
Polly – Job information service
In the first ~4.5 months:
• 27,000 people used the job search service
• Listened 270,000 times to job ads
• Forwarded them 22,000 times to friends
As of April 2013:
• 34,000 people used the job search service
• Listened 385,000 times to job ads
• Forwarded them 33,500 times to friends
57% of the interviewed users had used job search
Only a handful of them applied.
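A rough way to read these counts is as adoption and engagement ratios. The sketch below pairs them with the total user counts from the 2012 scale deployment slide (85,000 and 163,000), on the assumption that both sets of figures cover the same periods. The function and key names are illustrative.

```python
def job_service_summary(total_users, job_users, listens, forwards):
    """Derive simple ratios from the raw job-service counts."""
    return {
        "adoption": job_users / total_users,          # share of users who tried it
        "listens_per_job_user": listens / job_users,  # average ads heard per user
        "forwards_per_listen": forwards / listens,    # how often an ad was passed on
    }


first_window = job_service_summary(85_000, 27_000, 270_000, 22_000)
april_2013 = job_service_summary(163_000, 34_000, 385_000, 33_500)

for label, summary in (("First ~4.5 months", first_window), ("April 2013", april_2013)):
    print(label)
    print(f"  adoption:             {summary['adoption']:.1%}")
    print(f"  listens per job user: {summary['listens_per_job_user']:.1f}")
    print(f"  forwards per listen:  {summary['forwards_per_listen']:.1%}")
```

Under that assumption, roughly a third of early users tried the job service, and about one listen in twelve led to a forward.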
Polly – Conclusions and challenges
Scalability? Infrastructure capacity? How to achieve cost-efficiency? Willingness to pay? Long-term users? Impact on job offers? Additional applications?
V. Additional IVR applications
Video Kheti
Partnership with Digital Green
Digital Green – Demonstrates farming practices using videos
Designed to address Digital Green’s constraints
Video Kheti uses IVR to provide video content for farmers through a multimodal interface similar to Siri, Google Voice, etc.
Medhi, Indrani, Kalika Bali, and Edward Cutrell. "Pages 2833-2842." http://chi2013.acm.org/. Proc. of CHI 2013: Changing Perspectives, Paris, France. ACM, New York, 2013. Web.
Video Kheti: Is it effective?
Targets rural users in developing countries
5 billion mobile subscriptions in 2011, growing at 20% a year
Graphical interfaces are more successful than text-based interfaces for illiterate and novice users.
Success is correlated with users’ education.
Applications : Avaaj Otalo
Similar to Polly
Users can record, browse and respond to agricultural questions and answers
http://www.sautiyawakulima.net/research/wp-content/uploads/2011/11/howitworks1_avaajotalo.jpg
Applications: CGNet Swara
An effort to involve the underprivileged in mainstream media; started in Chhattisgarh.
Mobile interface that allows users to either record or listen to a 3-minute message.
Recorded messages are available on the phone and the web. The website also features text versions of these messages, which are mailed to a mailing list.
Mudliar, Preeti, Jonathan Donner, and William Thies. "Emergent Practices Around CGNet Swara: A Voice Forum for Citizen Journalism in Rural India." Information Technologies and International Development 9.2 (2012): 65-79. http://itidjournal.org/index.php/itid/article/view/1053/433. Information Technologies and International Development, 2012. Web.
Source: http://harrysurjadi.files.wordpress.com/2012/09/swara-system.png
CGNet Swara : How does it work?
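The record-or-listen interaction described for CGNet Swara can be sketched as a tiny in-memory voice forum. This is an illustration only: the key assignments (1 to record, 2 to listen), the class and function names, and the choice to truncate over-long recordings are assumptions, not details of the real system.

```python
MAX_MESSAGE_SECONDS = 180  # the 3-minute message length mentioned above


class VoiceForum:
    """Keeps recorded messages in memory; a real system would store audio."""

    def __init__(self):
        self.messages = []

    def record(self, caller, duration_seconds):
        """Store a message, truncating anything past the time limit."""
        kept = min(duration_seconds, MAX_MESSAGE_SECONDS)
        self.messages.append({"caller": caller, "seconds": kept})
        return kept

    def listen(self, count=3):
        """Return the most recent messages, newest first."""
        return list(reversed(self.messages[-count:]))


def handle_choice(forum, key, caller, duration_seconds=0):
    """Hypothetical key mapping: 1 records a message, 2 plays recent ones."""
    if key == "1":
        return forum.record(caller, duration_seconds)
    if key == "2":
        return forum.listen()
    return None  # unrecognized key: the caller would hear the prompt again


forum = VoiceForum()
handle_choice(forum, "1", "caller-a", duration_seconds=200)
handle_choice(forum, "1", "caller-b", duration_seconds=60)
print(handle_choice(forum, "2", "caller-c"))
```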
CGNet Swara: Is it effective?
A participatory approach called citizen journalism
Illiterate people are now able to voice their problems and also learn about other communities.
This leads to transparency, as any government or corporate misdeeds will be brought to everyone’s notice.
Challenges
Multiple languages in developing countries
Training a single automatic speech recognition (ASR) system for a language requires many hours of manually annotated speech.
Farmers are not equipped with devices that display videos.
Discussion
Q1: Why is scaling up Polly a challenge?
Q2: How successful has the job information sharing service been?
Q3: Can you think of a few development challenges that Polly can address?