Before the
FEDERAL COMMUNICATIONS COMMISSION
Washington, D.C. 20554
)
In the Matter of )
)
Misuse of Internet Protocol (IP) Captioned ) CG Docket No. 13-24
Telephone Service )
)
Telecommunications Relay Services and Speech- ) CG Docket No. 03-123
to-Speech Services for Individuals with Hearing )
and Speech Disabilities )
)
COMMENTS OF CAPTIONCALL, LLC ON THE AMENDMENT OF
MEZMO CORP. (DBA INNOCAPTION)
FOR CERTIFICATION TO PROVIDE
AUTOMATIC SPEECH RECOGNITION BASED
INTERNET PROTOCOL CAPTIONED TELEPHONE SERVICES
Cindy Williams
General Counsel
Sorenson Holdings, LLC
4192 South Riverboat Road
Salt Lake City, UT 84123
April 20, 2020
TABLE OF CONTENTS
INTRODUCTION AND SUMMARY
I. The Commission Must Adopt Service-Quality Standards for IP CTS and Other
Mandatory Minimum Standards for ASR-Only/Hybrid Services Before Certifying
Provision of Such Services.
II. InnoCaption’s Amendment Does Not Provide Sufficient Information to
Demonstrate that Its ASR Feature Satisfies the Declaratory Ruling, and Its
Testing Data Is Incomplete and Confirms that ASR Services Are Generally Not
Ready.
A. InnoCaption Should Provide Additional Details Regarding its Testing.
B. InnoCaption’s User Control Settings Do Not Correct Deficiencies with
ASR.
CONCLUSION
COMMENTS OF CAPTIONCALL, LLC
CaptionCall, LLC (“CaptionCall”) submits these comments on Mezmo Corp.’s (dba
InnoCaption’s) amendment (“Amendment”) to provide Internet Protocol Captioned Telephone
Service (“IP CTS”) using automatic speech recognition (“ASR”) technology.1
INTRODUCTION AND SUMMARY
CaptionCall is committed to innovation and believes that ASR holds tremendous
promise. Indeed, the recent COVID-19 pandemic demonstrates the public interest benefits that
ASR will offer, once the technology is sufficiently well developed to satisfy the Americans with
Disabilities Act’s (“ADA”) mandate of “functional equivalen[ce]”2 and the Commission’s
mandatory minimum standards—especially being able to handle all “type[s] of call[s].”3
However, CaptionCall has concerns regarding InnoCaption’s Amendment for two reasons.
1 See Comment Sought on Amendment to Application of InnoCaption for Certification as a Provider of IP Captioned
Telephone Service, Public Notice, CG Docket Nos. 03-123, 13-24, DA 20-313 (rel. Mar. 20, 2020).
2 47 U.S.C. § 225(a)(3) (requiring that TRS enable “functionally equivalent” communications by telephone).
3 47 C.F.R. § 64.604(a)(3)(ii); see also In re Misuse of Internet Protocol (IP) Captioned Telephone Service, Report
and Order, Declaratory Ruling, Further Notice of Proposed Rulemaking, and Notice of Inquiry, 33 FCC Rcd 5800,
5832 ¶ 60 (2018) (“Declaratory Ruling”) (requiring ASR providers to “be capable of handling all types of calls”).
First, as CaptionCall and numerous other stakeholders have pointed out on the record, the
Commission should adopt service-quality and ASR-specific mandatory minimum standards
before certifying any ASR-only/hybrid providers.4 Indeed, InnoCaption itself has previously
joined with other IP CTS providers in advocating, among other things, that “[q]uality standards
are essential to ensure that IP CTS consumers are receiving high quality services.”5 The
Commission has not yet adopted such standards, which are necessary to determine whether ASR-
based services comply with the ADA. The Commission should thus defer consideration of
InnoCaption’s Amendment, as well as other pending applications, until meaningful standards for
evaluating them are in place.
4 See, e.g., Comments of CaptionCall, LLC on the Applications of MachineGenius Inc., VTCSecure, LLC, and
Clarity Products, LLC for Certification To Provide Automatic Speech Recognition Based Internet Protocol
Captioned Telephone Services, CG Docket No. 03-123, at 3-13 (Sept. 25, 2019) (“CaptionCall Comments”);
Recommendation of the FCC Disability Advisory Committee, Relay and Equipment Distribution Subcommittee:
Internet Protocol Captioned Telephone Relay Service Metrics ¶ 6 (adopted Oct. 3, 2018),
https://docs.fcc.gov/public/attachments/DOC-354522A1.pdf; Letter from Blake E. Reid, Counsel to Telecommunications for the Deaf and
Hard of Hearing, Inc., to Marlene H. Dortch, Secretary, FCC, CG Docket Nos. 03-123 and 13-24, at 3-7 (July 26,
2018); Letter from Clear2Connect Coalition to Marlene H. Dortch, Secretary, FCC, CG Docket Nos. 13-24, 03-123
(May 14, 2019) (“Clear2Connect Coalition 5-14-19 Ex Parte”); Comments of Hamilton Relay, Inc., CG Docket
Nos. 13-24, 03-123, at 13 (Sept. 7, 2018); Sprint Petition for Clarification or, in the Alternative, Reconsideration,
CG Docket Nos. 13-24, 03-123, at 2-3 (July 9, 2018); accord Reply Comments of CaptionCall, CG Docket Nos.
13-24, 03-123, at 2, 6 (Nov. 15, 2018) (discussing importance of establishing “consistent, objective, and technology
neutral” service quality metrics, testing methodologies, and standards that should be applied to all prospective IP
CTS providers).
5 Letter from Dixie Ziegler, Vice President of Relay, Hamilton Relay, Inc.; Cristina Duarte, Director of Regulatory
Affairs, Mezmo Corp.; Michael Strecker, Vice President of Regulatory and Strategic Policy, ClearCaptions, LLC;
Scott Freiermuth, Counsel, Government Affairs, Federal Regulatory, Sprint Corp.; Bruce Peterson, Vice President
Government & Community Relations, CaptionCall, LLC; and Kevin Colwell, Vice President of Engineering,
Ultratec, Inc. to Marlene H. Dortch, Secretary, FCC, CG Docket Nos. 13-24, 03-123 (Sept. 20, 2019) (“Joint
Provider Group Ex Parte”); see also IP CTS Quality Metrics: Provider Recommendations, at 3 (Aug. 21, 2018),
Attachment to Letter from Bruce Peterson, Vice President of Marketing, CaptionCall, LLC; Cristina Duarte,
Director of Regulatory Affairs, Mezmo Corp.; Michael Strecker, Vice President of Regulatory and Strategic Policy,
ClearCaptions, LLC; Dixie Ziegler, Vice President of Relay, Hamilton Relay, Inc.; and Scott Freiermuth, Counsel,
Government Affairs, Federal Regulatory, Sprint Corp. to Marlene H. Dortch, Secretary, FCC, CG Docket Nos.
13-24, 03-123 (Aug. 21, 2018) (“To monitor the performance of . . . fully automated speech recognition (ASR) applied
in the industry, it is important to establish a common set of metrics that will allow IP CTS users, regulators and other
stakeholders to set meaningful standards by which new approaches to providing IP CTS can be measured.”).
Second, even if the Commission does not adopt such a framework, InnoCaption’s
Amendment does not provide sufficient evidence to show that its service satisfies the mandatory
minimum standards as clarified by the Declaratory Ruling. In particular, InnoCaption’s reliance
on users’ five-star surveys, while an improvement over other applications, is not a meaningful
substitute for testing the service quality of ASR for different callers and call types.6 InnoCaption
itself has asserted that “[p]olicy decisions” about IP CTS “must be based on [such] data.”7
Moreover, even if InnoCaption’s survey results were a reasonable proxy for testing data,
InnoCaption’s own data shows that its ASR software does not meet the standard of its
Communications Assistants (“CAs”), and thus raises questions as to whether any ASR service is
ready for certification.8 Finally, InnoCaption’s reliance on user controls to address these
concerns imposes significant burdens onto consumers in contravention of the ADA.
I. The Commission Must Adopt Service-Quality Standards for IP CTS and Other
Mandatory Minimum Standards for ASR-Only/Hybrid Services Before Certifying
Provision of Such Services.
In the Declaratory Ruling, the Commission determined that IP CTS using ASR is an
eligible form of relay service so long as it complies with the applicable Telecommunications
Relay Service (“TRS”) mandatory minimum standards.9 To that end, the Commission directed
the Consumer and Governmental Affairs Bureau to approve applications to provide ASR service
only if an applicant demonstrates that its service meets the ADA’s and the Commission’s standards
“through documentary and other evidence,” including “quantitative test results demonstrating
6 Amendment to Application of [InnoCaption] for Certification as a Provider of IP Captioned Telephone Service,
CG Docket Nos. 03-123 and 13-24, at 5-7 (filed Mar. 13, 2020) (“InnoCaption Amendment”).
7 Joint Provider Group Ex Parte at 3.
8 InnoCaption Amendment at 3, 6-7.
9 Declaratory Ruling, 33 FCC Rcd at 5827, 5832-35 ¶¶ 48, 60-65.
that the applicant’s service will afford a level of quality that is at least comparable to currently
available CA-assisted IP CTS with respect to captioning transcription, delays, accuracy, speed,
and readability.”10 Among other things, the Commission specifically directed applicants to show
that their services are capable of handling “all types of calls” and will “ensure that conversations
are kept confidential.”11
Yet there are currently no concrete service-quality standards for evaluating whether an IP
CTS offering that incorporates ASR is comparable to a CA-assisted service. The lack of
meaningful standards presents a standing risk that the Commission will approve ASR services
that cannot deliver functionally equivalent service to users. The record reflects widespread
concern that services incorporating current ASR technology (whether ASR-only or hybrid
models) do not match the quality of CA-assisted services.12
In particular, current ASR technology is vulnerable to bias and prone to errors in several
important call contexts. These include emergency calls; calls with minority speakers, speakers
with accents, speakers who are soft-spoken, speakers with high- or low-pitched voices, and
speakers with speech impairments; calls with specialized or personalized jargon or speech
content; and calls with difficult background conditions.13 Although CaptionCall is optimistic
that the quality and capabilities of ASR will improve, these shortcomings will degrade service
quality in the near term. The Commission should therefore adopt standards that will ensure that
10 Id. at 5834 ¶ 63.
11 Id. at 5832-33 ¶ 60.
12 See Reply Comments of Hamilton Relay, Inc., CG Docket Nos. 13-24, 03-123, at 2-3 & nn.3-5 (Oct. 10, 2019)
(“Hamilton Reply Comments”) (collecting similar concerns expressed by other commenters); Clear2Connect
Coalition 5-14-19 Ex Parte at 2-3, 7-11; supra note 3.
13 CaptionCall Comments at 4-5.
certified ASR-based services account for these well-known weaknesses in ASR and mitigate
their effect on consumers.
In addition, the record reflects widespread agreement that, before certifying the provision
of IP CTS incorporating ASR technology, the Commission must adopt ASR-specific mandatory
minimum standards, including as necessary substitutes for current standards that refer
specifically to CAs—and which applicants have argued are therefore not applicable to ASR.14
Most importantly, the Commission must address how the “all types of calls” standard
applies to ASR,15 a technology that relies on algorithms that are trained to recognize speech
patterns and convert human speech inputs into reliable transcriptions. Because of variability in
how people speak, and in the background environments in which calls are made, these
algorithms must be trained to reliably recognize the speech patterns of a variety of different
speakers and in a variety of different contexts. If the data sets used to train these algorithms are
skewed by oversampling the speech patterns of certain groups, the resulting engines could be
biased to perform reliably for some speakers but to struggle to recognize and transcribe the
speech of others.
It is well documented that current ASR technology often exhibits bias that results in
unreliable transcription for a variety of different speakers. Professor Richard M. Stern of
Carnegie Mellon University, who has worked on automatic speech recognition since 1982, has
observed that ASR struggles to provide accurate transcriptions for “women, the elderly, young
children, members of minority groups and individuals who speak in dialects, non-native
14 Hamilton Reply Comments at 2-3; supra note 3.
15 Declaratory Ruling, 33 FCC Rcd at 5832 ¶ 60; see also 47 C.F.R. § 64.604(a)(3)(ii).
speakers, speakers under stress, hearing-impaired speakers, and individuals suffering from
various types of neurological impairment including Parkinson’s disease, cerebral palsy, etc.”16
Moreover, in a recent study, researchers at Stanford University found that ASR systems
misidentified 35% of the words spoken by black participants, compared to only 19% of the
words spoken by white participants.17 When the quality of the transcriptions produced by an
ASR system depends heavily on the identity of the speaker, no IP CTS offering relying on that
system can truly be said to provide functionally equivalent service to persons with hearing loss
because such services are not capable of handling “all types of calls.”
The Commission also must establish appropriate privacy protections for ASR-
only/hybrid services, given that its current mandatory minimum standard for confidentiality is
specific to CAs.18 Specifically, the Commission should adopt rules that require ASR providers
to offer a comparable level of privacy to CA-assisted services and provide transparency about the
third parties to whom an ASR provider discloses user data.19
The COVID-19 pandemic is a stark reminder of the importance of making sure that ASR-
only/hybrid services satisfy the ADA and the Commission’s mandatory minimum standards as
clarified in the Declaratory Ruling. Now more than ever, individuals with hearing loss rely on
captioning to interact with their loved ones, to continue their education, and to participate fully in
the American economy. But ASR is not yet ready for these critical activities. In the educational
16 CaptionCall Comments, App. A at 1-2 (“Stern Report”).
17 See, e.g., Cade Metz, There Is a Racial Divide in Speech-Recognition Systems, Researchers Say, N.Y. Times
(Mar. 23, 2020), https://www.nytimes.com/2020/03/23/technology/speech-recognition-bias-apple-amazon-google.html
(reporting on the results of a Stanford University study finding that ASR systems misidentified 35% of the
words spoken by black participants, compared to only 19% for white participants); Stern Report at 17-19.
18 See 47 C.F.R. § 64.604(a)(2).
19 See CaptionCall Comments at 11-13.
context, for example, the CEO of the National Association of the Deaf recently remarked that
“[c]olleges should not look to [ASR] for live video formats . . . . We challenge any claim of
accuracy measurements of ASR given that there is absolutely no valid metric to assess the
accuracy of captioning at this time.”20 Thus, although ASR shows promise and may be an
important component of IP CTS in the future, the Commission should proceed with caution by
adopting objective standards to evaluate proposed ASR-only/hybrid services before certifying
them.
II. InnoCaption’s Amendment Does Not Provide Sufficient Information to Demonstrate
that Its ASR Feature Satisfies the Declaratory Ruling, and Its Testing Data Is
Incomplete and Confirms that ASR Services Are Generally Not Ready.
InnoCaption’s Amendment requests approval to incorporate an ASR feature into its IP
CTS that would require users to opt into their preferred modality of captioning in one of three
possible default settings—and, eventually, to switch between ASR and CART captioning during
telephone calls. In support of its Amendment, InnoCaption relies on user trials. But
InnoCaption’s testing is insufficient and actually appears to confirm that the user experience
during ASR calls is considerably worse than during CA calls. Moreover, InnoCaption’s
purported solution to ASR service-quality problems is to place the burden on users to determine
when ASR should be used. This approach is contrary to the ADA, which places the burden of
providing functionally equivalent communications on providers,21 and is likely to prove difficult
20 Greta Anderson, Accessibility Suffers During Pandemic, Inside Higher Ed (Apr. 6, 2020),
https://www.insidehighered.com/news/2020/04/06/remote-learning-shift-leaves-students-disabilities-behind?utm_source=Inside+Higher+Ed&utm_campaign=6555d0b586-DNU_2019_COPY_02&utm_medium=email&utm_term=0_1fcbc04421-6555d0b586-215563457&mc_cid=6555d0b586&mc_eid=893cb07197.
21 See Declaratory Ruling, 33 FCC Rcd at 5834-35 ¶ 63 (explaining that the applicant has the burden of demonstrating
satisfaction of mandatory minimum standards and that “no application to provide ASR will be approved unless the
applicant demonstrates that the specific ASR technology described in the application meets applicable FCC
requirements” intended to ensure functional equivalence); see also 47 U.S.C. § 225(d)(1) (requiring Commission to
establish regulations for TRS operators).
and complex for IP CTS users. And, in the case of the in-call switching functionality,
InnoCaption’s technology has not yet even been fully developed, and thus cannot be approved at
this time.22
A. InnoCaption Should Provide Additional Details Regarding its Testing.
In support of its Amendment, InnoCaption reports that it has tested ASR and gathered
user feedback about its ASR service during calls over the course of a year, asking users to rank
their experience from 1 to 5 stars.23 InnoCaption claims that the “5-star ratings provided by [its]
users represent a[n] accurate representation of the overall usability of the ASR Calling
Feature.”24 But InnoCaption’s survey results, without more, do not demonstrate that the ASR
feature satisfies the mandatory minimum standards as clarified in the Declaratory Ruling for
several reasons.
First, a five-star survey does not actually demonstrate the quality of InnoCaption’s ASR
feature, because the survey results are inherently subjective and do not say anything about how
the service performs against the critical service-quality metrics (e.g., accuracy, latency,
readability, etc.). A user might award five stars to a call with her grandchild based on her
22 InnoCaption’s Amendment also fails to request partial waivers of all of the necessary mandatory minimum
standards that are expressly applicable to CAs and not its ASR feature. See Mezmo Corporation Request for
Limited Waiver, CG Docket Nos. 13-24, 03-123 (Mar. 13, 2020). Critically, by not requesting these waivers,
InnoCaption also has bypassed explaining salient operational parameters for its incorporation of the ASR feature.
For example, unlike the other ASR applicants, InnoCaption has not requested waiver of the CA training and skills
standards insofar as it provides ASR-only captioning. See 47 C.F.R. § 64.604(a)(1). In requesting a waiver of these
requirements, InnoCaption should address how it has designed and implemented its ASR feature to address the
specialized needs and culture of individuals with disabilities. Likewise, InnoCaption has not requested a waiver of
the requirement to provide to the TRS Fund Administrator a “CA ID number” in connection with requests for
compensation. See 47 C.F.R. § 64.604(b)(8). In requesting a waiver of this requirement, InnoCaption should
explain how it will track and report its usage of ASR. InnoCaption also has not addressed how it will change its IP
CTS registration and certification communications to reflect that captioning may not always be provided by a live
CA, see 47 C.F.R. § 64.611(j), which are intended to ensure users understand how IP CTS is provided.
23 See InnoCaption Amendment at 5-6.
24 See id. at 6.
affective response to the call, irrespective of the fact that the captions were highly inaccurate
during the call. And during a life-and-death call (such as with a doctor), the fact that a user
believes he or she had a positive experience is small comfort if the information conveyed is
inaccurate. That is why the Commission must subject services to testing that is “objective and
generalizable”—a proposition which InnoCaption itself once endorsed.25
Second, InnoCaption does not provide any information about the types of calls that were
included in this survey. For its survey results to have any generalizable significance,
InnoCaption would have to confirm that the tests apply to the wide spectrum of callers (e.g.,
women, children, racial and ethnic minorities, non-native English speakers, etc.) and call types
(e.g., residential, commercial, technical, etc.).26 In the absence of such testing, it would be
impossible to conclude that the feature can handle all “type[s] of call[s].”27
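One way to picture the disaggregated, objective testing described above is a word error rate computed separately for each caller group. The following Python sketch is illustrative only. Word error rate is a common measure of ASR accuracy, but it is not a metric the Commission has adopted, and nothing here reflects InnoCaption’s actual testing. The function names, group labels, and sample transcripts are hypothetical and invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

def word_errors(reference: str, hypothesis: str) -> Tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1], len(ref)

def wer_by_group(samples: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """samples holds (group, reference, hypothesis); returns WER per group."""
    errors, words = defaultdict(int), defaultdict(int)
    for group, ref, hyp in samples:
        e, n = word_errors(ref, hyp)
        errors[group] += e
        words[group] += n
    return {g: errors[g] / words[g] for g in words}

# Invented sample data, for illustration only (not test results).
samples = [
    ("group_a", "please call the doctor today", "please call the doctor today"),
    ("group_b", "please call the doctor today", "please all the doctor to day"),
]
print(wer_by_group(samples))
```

Reporting the figure per group, rather than one pooled number, is what lets a reviewer see whether a service performs well for some callers and poorly for others.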
Third, InnoCaption’s survey results appear to confirm that ASR does not match CA-assisted
services and is not ready for certification. InnoCaption candidly admits that only 74
percent of its calls involving its ASR feature received a rating of 4 or 5 stars, in contrast to the
86 percent of its calls involving a CA that received 4- or 5-star ratings.28 This means that, compared
with CA calls, nearly twice the share of ASR calls (26 percent versus 14 percent) failed to achieve a 4- or 5-star
25 CaptionCall Comments at 7; see also Joint Provider Group Ex Parte at 3 (“[I]t is critical that the Commission
obtain data developed through a robust experimental design, including statistically representative sample sizes and
valid research methods. . . . [I]t is necessary to develop IP CTS quality standards based on comprehensive, uniform,
objective, and replicable testing.”).
26 See supra notes 13-15 and accompanying text (discussing evidence that ASR is likely to systematically
underperform on calls involving difficult speakers, including protected classes of speakers).
27 47 C.F.R. § 64.604(a)(3); see also CaptionCall Comments at 8-10. See Joint Provider Group Ex Parte at 3
(“Samples used for testing must represent the wide range of actual IP CTS calls, users, and user experiences. . . .
Testing must be performed and reported over a wide variety of call types. The use of aggregate data masks
important variations in performance.”).
28 InnoCaption Amendment at 6.
rating. Likewise, InnoCaption admits that its ASR calls received “double the proportion of 1-
star ratings . . . compared to . . . CA captioning [calls].”29 Yet InnoCaption leaps strangely to
the conclusion that these results “indicate[] that the ASR Calling Feature is . . . able to meet the
accessibility needs of our users.”30 In fact, these survey results, without more, suggest that the
ASR feature fails to satisfy the mandatory minimum standards as explained in the Declaratory
Ruling.31
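The comparison drawn in this paragraph can be checked with simple arithmetic. The following Python sketch is illustrative only and is not part of any filing. It uses only the two percentages reported in the Amendment (74 and 86), and the function name is hypothetical.

```python
from fractions import Fraction

def share_below_four_stars(pct_four_or_five: int) -> int:
    """Percent of calls that did NOT receive a 4- or 5-star rating."""
    return 100 - pct_four_or_five

asr_low = share_below_four_stars(74)  # 74 percent of ASR calls rated 4 or 5 stars
ca_low = share_below_four_stars(86)   # 86 percent of CA calls rated 4 or 5 stars
ratio = Fraction(asr_low, ca_low)     # exact ratio of the two shares

print(asr_low, ca_low, round(float(ratio), 2))  # prints: 26 14 1.86
```

A ratio of roughly 1.86 is the basis for describing the ASR share as nearly twice the CA share.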
B. InnoCaption’s User Control Settings Do Not Correct Deficiencies with ASR.
InnoCaption’s Amendment acknowledges that its ASR feature does not match its CAs in
terms of service quality, but it proposes to remedy these weaknesses by having consumers decide
whether and when to use ASR—including, eventually, by toggling between CA and ASR modes
during calls, as they deem necessary.32 This approach impermissibly shifts the burden of
ensuring functionally equivalent service onto consumers. If the user chooses the wrong mode of
service for his or her needs (for example, using ASR for a phone call with a grandchild), the user
will get sub-standard service, yet InnoCaption will still receive full compensation. Further, for a
user to be able to assess the accuracy of captioning, he or she must match the captions he or she
reads to the speech that he or she hears. However, since IP CTS users are, by definition, hearing
29 Id. at 7 (emphasis added).
30 Id. at 6.
31 See Declaratory Ruling, 33 FCC Rcd at 5834 ¶ 63 (“In accordance with our rules, applicants should support all
claims . . . through documentary and other evidence. For example, this could include trials and quantitative test
results demonstrating that the applicant’s service will afford a level of quality that is at least comparable to currently
available CA-assisted IP CTS.” (footnote omitted)).
32 InnoCaption Amendment at 3 (“InnoCaption users are able to select ASR captioning as a default captioning mode
on their mobile app if they feel it meets their accessibility needs generally or, alternatively, may use ASR captioning
for a particular call.” (emphasis added)); id. at 4 (“we are in final stages of developing an additional function that
allows users to switch between CART CA and ASR captioning during a call. This functionality is essential in
providing functional equivalence and maximum service reliability due to the current limitations of ASR technology,
especially on calls with poor connections or speakers with accents.” (emphasis added)).
impaired, it is unreasonable to expect them to judge the accuracy of different captioning
modalities. Until ASR is shown to match CA quality, the risk remains that users may select ASR
for the wrong types of calls and receive poor quality service. Moreover, as ClearCaptions
recently explained, this approach to ASR-hybrid service actually creates additional burdens for
users, and does not cure (or even necessarily mitigate) concerns with ASR.33
InnoCaption recounts its “firm[] belie[f] that having dual caption modes” available as
default settings “will resolve [ASR’s] technological limitation[s].”34 Specifically, InnoCaption
plans to offer users three “default” captioning settings: (1) “CA Priority,” which routes calls for
ASR captioning only in the event of unprojected call surges; (2) “CA Only,” which always routes
calls to CAs and never uses ASR captioning; and (3) “ASR Only,” which routes all non-911 calls
to ASR captioning.35 InnoCaption further states that it is in the “final stages of developing an
additional function that allows users to switch between CART CA and ASR captioning during a
call.”36 This approach raises a number of concerns.
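The three default settings described above can be sketched as a simple routing rule. The Python sketch below is illustrative only. It is based solely on the description in this section and is not InnoCaption’s implementation. The names `DefaultSetting` and `route_call` and the `surge` flag are hypothetical. Routing 911 calls to a CA under “ASR Only” is an assumption inferred from the phrase “all non-911 calls,” and the Amendment as described here does not say how 911 calls are handled under “CA Priority” during a surge, so the sketch does not special-case them.

```python
from enum import Enum

class DefaultSetting(Enum):
    CA_PRIORITY = "CA Priority"
    CA_ONLY = "CA Only"
    ASR_ONLY = "ASR Only"

def route_call(setting: DefaultSetting, is_911: bool = False, surge: bool = False) -> str:
    """Return "CA" or "ASR" for a call under the user's default setting."""
    if setting is DefaultSetting.CA_ONLY:
        return "CA"  # always a CA, never ASR
    if setting is DefaultSetting.CA_PRIORITY:
        # ASR only in the event of an unprojected call surge.
        return "ASR" if surge else "CA"
    # ASR Only: all non-911 calls go to ASR. Sending 911 calls to a CA
    # is an assumption, not something stated in the source.
    return "CA" if is_911 else "ASR"

print(route_call(DefaultSetting.CA_PRIORITY, surge=True))
```

Even in this simplified form, the outcome depends on a setting the user must choose in advance plus conditions the user cannot observe, such as whether a surge is occurring.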
IP CTS users are generally older and may suffer from other disabilities, and they will
likely have a very difficult time learning, understanding, and using the default settings and in-call
switching feature on their mobile devices. Introducing any complexity into a captioning service
or end-user equipment tends to deter users from accessing and using IP CTS. InnoCaption’s
ASR feature appears to be highly complex: It requires users to understand two different
captioning modalities, and three different default settings, which all relate to the underlying
33 See Letter from Tamar E. Finn, Counsel to ClearCaptions, LLC, to Marlene H. Dortch, Secretary, FCC, CG
Docket Nos. 13-24, 03-123 (Apr. 13, 2020) (“ClearCaptions 4-13-20 Ex Parte”).
34 InnoCaption Amendment at 7.
35 Id. at 3-4.
36 Id. at 4.
generation of captions, not to the user-facing attributes of captions.37 Moreover, even if, as
suspected, InnoCaption’s user base is younger and more technology savvy than the rest of the
market’s, this feature will be a hurdle to access for most IP CTS users and for the growing
population of Americans aging into hearing loss.38
InnoCaption appears to contemplate making its ASR feature more accessible by engaging
in outreach and customer education to teach its users about when ASR is and is not likely to
match CA service quality.39 But before accepting that InnoCaption can make the new feature
accessible, the Commission should require more information about its planned education to
determine whether it will (or indeed even can) be effective. For example, as Dr. Stern has
explained, a number of factors can influence ASR performance, including the identity of the
speaker, the content of his or her speech, background noise, and so forth.40 Moreover, on live
telephone calls, users will encounter myriad permutations of callers, speech, and background
conditions—all of which may even change over the course of a single call (e.g., as a family
hands around a phone, or as a mobile user walks down a street). For InnoCaption’s users to
understand when ASR is and is not appropriate, its education will either have to be incredibly
37 See id. at 3-4.
38 See Comments of CaptionCall, LLC, Docket Nos. 13-24, 03-123, at 8, 14-15 (Sept. 17, 2018) (discussing
difficulties that IP CTS users will have with RTT solutions, which, like InnoCaption’s ASR feature, require careful
manipulation on mobile devices); see also ClearCaptions 4-13-20 Ex Parte at 2 (“The consumer, who is typically a
senior, should not be expected to self-select ASR or a live CA.”).
39 See Letter from Cristina O. Duarte, Director of Regulatory Affairs, InnoCaption, to Marlene H. Dortch, Secretary,
FCC, Docket Nos. 13-24, 03-123, at 2 (Mar. 2, 2020) (“InnoCaption 3-2-20 Ex Parte”).
40 See Stern Report at 9-17.
detailed—and thus likely too complicated for most users to understand—or it will not provide
enough information for users to make informed decisions.41
Thus, at least initially, as ClearCaptions explains, ASR should be incorporated into IP
CTS offerings that are “intelligent enough, based on real-time analytics and metrics, to know
when to switch between a live captioning agent and ASR. . . . [T]he[] analytics and metrics”—
not the user—“should determine when ASR will provide functionally equivalent
communications to an IP CTS customer.”42
Finally, even if InnoCaption’s in-call switching features could redress some of the
concerns with ASR—which is unlikely, for the reasons discussed above in this section—the
Commission should not authorize InnoCaption’s incorporation of in-call switching until the
functionality has been fully developed. InnoCaption has admitted that it cannot release this
functionality until it hires an additional engineer to support it.43 It would be risky for the
Commission to authorize a functionality that cannot currently be supported; it is also doubtful
that a single engineer could support this functionality at any meaningful scale. Thus, the
Commission should require InnoCaption to seek a further amendment, supported by sufficient
information and evidence, once this functionality is ready to be deployed, so that the
Commission can fully determine that it improves, rather than hinders, functionally equivalent
communications by telephone.
41 See ClearCaptions 4-13-20 Ex Parte at 2 (“IP CTS consumers are not likely to have access to the data necessary
to make an informed decision to select ASR over a live captioner.”).
42 Id.
43 See InnoCaption 3-2-20 Ex Parte at 2.
CONCLUSION
For the foregoing reasons, the Commission should not grant InnoCaption’s Amendment.
Respectfully submitted,
/s/ Cindy Williams
Cindy Williams
General Counsel
Sorenson Holdings, LLC
4192 South Riverboat Road
Salt Lake City, UT 84123
April 20, 2020