Pr𝜀𝜀ch: A System for Privacy-Preserving Speech Transcription
Shimaa Ahmed, Amrita Roy Chowdhury, Kassem Fawaz, and Parmesh Ramanathan
USENIX Security Symposium 2020
Speech Transcription*
Speech transcription applications:
• Scalable
• Privacy-preserving
• Accurate
* Transcription by IBM Speech-to-Text Demo
Speech Transcription Services
Cloud-based transcription
Open-source offline transcription: Deep Speech
Performance Comparison
Standard datasets
[Figure: bar chart of WER* (%) for Google, AWS, and Deep Speech on LibriSpeech, DAPS, and TIMIT]
* Word-Error-Rate: $\mathrm{WER} = \frac{D + I + S}{N}$, where $D$ = # deletions, $I$ = # insertions, $S$ = # substitutions, and $N$ = # reference words
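The WER definition can be made concrete with a small word-level edit-distance computation. This is a minimal sketch, assuming whitespace tokenization and exact string matching; the function name `word_error_rate` and the example sentences are ours and do not come from the talk.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (D + I + S) / N using a word-level edit distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    # The minimum number of edits equals D + I + S.
    return prev[-1] / len(ref)


# One substitution (sat -> sit) and one deletion (the) over 6 reference words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Multiplying the result by 100 gives the percentage form used on the chart axes.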
Performance Comparison
Real world use-cases
• Facebook hearing before the US Senate
• Supreme Court case “Carpenter v. United States”
• VCTK: non-American accent dataset
  • Speaker p266 with an Irish accent
  • Speaker p262 with a Scottish accent
Standard vs Real World Performance
[Figure: bar chart of WER (%) for Google, AWS, and Deep Speech. Standard datasets: LibriSpeech, DAPS, TIMIT. Real-world datasets: VCTK p266, VCTK p262, Facebook1, Facebook2, Facebook3, Carpenter1, Carpenter2]
Off-the-shelf offline transcribers are not reliable for real world applications
Speech is a Rich Source of Sensitive Information
Voice biometrics
• Personal attributes
• Identity
• Impersonation: technology can clone a speaker’s voice from a short segment of their speech
Textual content
• Sensitive words
• Statistical analysis of the entire transcript
  • Topic model
  • Stylometry analysis
  • Document classification
  • Sentiment analysis
Utility-Privacy Trade-off
[Figure: cloud service providers and offline service providers placed along a utility axis and a privacy axis]
Goal: Design an end-to-end transcription system that provides an intermediate solution along the utility-privacy spectrum
Pr𝜀𝜀ch: Privacy-Preserving Speech Transcription
• Obfuscates the users’ voice biometrics
• Protects the sensitive textual content
• Improves on the transcription accuracy compared to offline systems
• Provides control knobs to customize its utility and privacy levels
Pipeline: Original Speech → Privacy-Preserving Operations → High-accuracy Transcription → De-noising → Final Transcript
Voice Biometrics
Many-to-One Voice Conversion
[Figure: speech from Senator Harris, Senator Thune, and Mark Zuckerberg passing through voice conversion]
0% accuracy in matching original speakers with their voice-converted speech using Azure speaker identification API
Break the context
• Segmentation
• Sensitive words scrubbing
• The textual content is transformed into a bag-of-words model
[Figure: Original speech → Segmentation (~3 non-stop words) → Sensitive Words Scrubbing; tools shown: Deep Speech transcription, PocketSphinx]
Example: “information about 87 million Facebook users being obtained by the company Cambridge Analytica”
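The effect of segmentation can be illustrated on text. This is a minimal sketch under loud assumptions: the real system segments audio rather than a known transcript, and the stop-word list, the function name `segment`, and the fixed seed are ours. It shows that once segments of about 3 non-stop words are shuffled, the order is lost but the bag of words (the word histogram) stays the same.

```python
import random
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for this example only.
STOP_WORDS = {"a", "an", "the", "by", "of", "about", "being", "in", "on", "is"}


def segment(words, non_stop_per_segment=3):
    """Group words into segments that each hold about 3 non-stop words."""
    segments, current, count = [], [], 0
    for word in words:
        current.append(word)
        if word.lower() not in STOP_WORDS:
            count += 1
        if count == non_stop_per_segment:
            segments.append(current)
            current, count = [], 0
    if current:  # keep a shorter trailing segment, if any
        segments.append(current)
    return segments


words = ("information about 87 million Facebook users being obtained "
         "by the company Cambridge Analytica").split()
segments = segment(words)

# Shuffling breaks the sentence-level context between segments.
shuffled = segments[:]
random.Random(0).shuffle(shuffled)

# The word histogram is unchanged by segmentation and shuffling.
original_bag = Counter(words)
shuffled_bag = Counter(w for seg in shuffled for w in seg)
```

Here `segments` holds three groups, and `shuffled_bag` equals `original_bag`, which is why the remaining leakage is described in terms of a word histogram.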
Differentially Private (DP) Words’ Histogram
• Bag-of-words: histogram of words
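• Apply DP to the true histogram
  • A randomized mechanism $A: \mathbb{N}^{|\mathcal{V}|} \to \mathbb{N}^{|\mathcal{V}|}$ satisfies $(\varepsilon, \delta)$-DP if, for any pair of histograms $H_1$ and $H_2$ such that $\lVert H_1 - H_2 \rVert_1 = d$, and for any set $O \subseteq \mathbb{N}^{|\mathcal{V}|}$,
    $\Pr[A(H_1) \in O] \le e^{\varepsilon} \Pr[A(H_2) \in O] + \delta$
[Figure: word histogram with words on the horizontal axis and counts on the vertical axis]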
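One way to picture a noisy word histogram is to add a non-negative random count to every word in the vocabulary. The sketch below is an illustration only, not the mechanism from the talk: the shifted and clipped Laplace noise, the `shift` parameter, and the names `laplace` and `noisy_histogram` are our assumptions, and the code makes no claim about which $(\varepsilon, \delta)$ a given setting achieves. Noise is kept non-negative because words can only be added as dummy speech, never removed.

```python
import random
from collections import Counter


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_histogram(true_hist, vocabulary, epsilon, d=1, shift=10, seed=None):
    """Add a non-negative integer noise count to every vocabulary word.

    `shift` is a hypothetical offset that keeps most samples positive
    before clipping at zero; its calibration is outside this sketch.
    """
    rng = random.Random(seed)
    scale = d / epsilon
    noisy = Counter()
    for word in vocabulary:
        noise = max(0, round(shift + laplace(scale, rng)))
        noisy[word] = true_hist.get(word, 0) + noise
    return noisy


vocabulary = ["federal", "registration", "trademarks", "speech", "program"]
true_hist = Counter({"federal": 2, "registration": 1, "speech": 1})
noisy = noisy_histogram(true_hist, vocabulary, epsilon=1.0, seed=0)
```

Each noisy count is at least the true count, and words absent from the speech (such as "program" here) can still appear with a positive count.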
DP Challenges
• Pr𝜀𝜀ch has access only to the speech, but not the transcript
  • No access to the true histogram
• The noise ‘dummy words’ must be added in the speech domain
• The dummy words must be indistinguishable from the true speech
  • Segment length
  • Voice
  • Language model
Dummy Segments Generation
[Figure: Original Speech → Offline transcriber → Domain (legal, court case, violation) → Dummy Text Generation → TTS, with the DP Mechanism]
Original speech example: “The Lanham Act's ban on federal registration of scandalous trademarks is not a restriction on speech but a valid condition on participation in a federal program.”
Example segments: “ban on federal registration”, “trademarks is not a restriction”, “in a federal program”
End-to-End System
Inputs: original speech; domain (legal, court case, supreme court)
1. Segmentation
2. Sensitive word scrubbing
3. Dummy Text Generation
4. TTS
5. Voice Conversion
6. Partitioning, adding noise, and shuffling
7. Transcription
8. De-noising and de-shuffling
Output: Final Transcript
Callouts: (1) Hide voice biometrics (2) Ensure noise indistinguishability; Protect the textual content
Example texts:
• “The Lanham Act's ban on federal registration of scandalous trademarks is not a restriction on speech but a valid condition on participation in a federal program.”
• “At issue in this case is the government's warrantless collection of 127 days of Petitioner's cell site location information”
Shuffled segments shown in the figure:
• “government's At on scandalous”
• “cell issue of location”
• “speech not a registration”
• “in condition a federal restriction”
• “valid Petitioner's 127 Lanham on collection”
• “Participation in this program”
In Pr𝜀𝜀ch, the DP noise does NOT deteriorate the utility; instead it adds monetary cost overhead
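Steps 6 to 8 can be sketched in a few lines. This is a simplified illustration under stated assumptions: segments are strings instead of audio, partitioning is omitted, the transcription service is simulated as the identity function, and the names `obfuscate` and `restore` as well as the choice of dummy segments are ours. It shows why the noise costs extra transcription work without hurting accuracy: the client keeps a private key recording which segments are dummies and where the true ones belong.

```python
import random


def obfuscate(true_segments, dummy_segments, seed=None):
    """Mix true and dummy segments, shuffle them, and keep a private key."""
    rng = random.Random(seed)
    tagged = [(i, True, seg) for i, seg in enumerate(true_segments)]
    tagged += [(None, False, seg) for seg in dummy_segments]
    rng.shuffle(tagged)
    outgoing = [seg for _, _, seg in tagged]                   # sent out
    key = [(index, is_true) for index, is_true, _ in tagged]   # kept local
    return outgoing, key


def restore(transcripts, key):
    """De-noise (drop dummies) and de-shuffle (restore original order)."""
    kept = [(index, text)
            for (index, is_true), text in zip(key, transcripts)
            if is_true]
    return [text for _, text in sorted(kept)]


true_segments = ["ban on federal registration",
                 "trademarks is not a restriction",
                 "in a federal program"]
# Illustrative dummy segments; in the system they come from dummy text + TTS.
dummy_segments = ["warrantless collection of 127 days",
                  "cell site location information"]

outgoing, key = obfuscate(true_segments, dummy_segments, seed=1)
transcripts = list(outgoing)  # stand-in for the transcription step
final_transcript = restore(transcripts, key)
```

The service transcribes five segments instead of three, which is the monetary overhead, while `final_transcript` contains exactly the three true segments in their original order.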
Utility: Transcription Accuracy
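Table: WER (%) at different settings of Pr𝜀𝜀ch vs Deep Speech

| Dataset | No Voice Conversion | One-to-one Voice Conversion | Many-to-One Voice Conversion | Deep Speech |
|---|---|---|---|---|
| VCTK p266 | 5.15 | 16.55 | 21.92 | 26.72 |
| VCTK p262 | 4.53 | 7.39 | 10.82 | 15.97 |
| Facebook1 | 8.26 | 14.60 | 20.30 | 24.72 |
| Facebook2 | 9.75 | 18.27 | 19.44 | 26.61 |
| Facebook3 | 14.93 | 23.25 | 27.06 | 30.72 |
| Carpenter1 | 14.43 | 23.88 | 22.63 | 25.85 |
| Carpenter2 | 13.53 | 33.71 | 38.90 | 39.71 |

Relative WER improvement over Deep Speech: 44% − 80% with no voice conversion; 2% − 32.5% with many-to-one voice conversion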
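The percentage ranges on this slide can be checked against the table as relative WER improvement over Deep Speech, computed as (baseline − system) / baseline. The sketch below is our own arithmetic on the rounded table values, so the upper end of the many-to-one range comes out slightly below the quoted figure; the dictionary layout and names are ours.

```python
# WER (%) per dataset: (no VC, one-to-one VC, many-to-one VC, Deep Speech)
wer = {
    "VCTK p266":  (5.15, 16.55, 21.92, 26.72),
    "VCTK p262":  (4.53, 7.39, 10.82, 15.97),
    "Facebook1":  (8.26, 14.60, 20.30, 24.72),
    "Facebook2":  (9.75, 18.27, 19.44, 26.61),
    "Facebook3":  (14.93, 23.25, 27.06, 30.72),
    "Carpenter1": (14.43, 23.88, 22.63, 25.85),
    "Carpenter2": (13.53, 33.71, 38.90, 39.71),
}


def relative_improvement(system_wer, baseline_wer):
    """Relative WER reduction of the system over the baseline, in percent."""
    return 100 * (baseline_wer - system_wer) / baseline_wer


no_vc = [relative_improvement(row[0], row[3]) for row in wer.values()]
many_to_one = [relative_improvement(row[2], row[3]) for row in wer.values()]

no_vc_range = (min(no_vc), max(no_vc))
many_to_one_range = (min(many_to_one), max(many_to_one))
```

With these values, the no-conversion setting spans roughly 44% to 81%, and the many-to-one setting spans roughly 2% to 32%.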
Textual Content Privacy
[Figure: word clouds of the textual content, from the original word cloud through increasing noise levels]
Formal Privacy Guarantee
For a speech file 𝑆, Pr𝜀𝜀ch provides perfect voice privacy using many-to-one voice conversion and an (𝜀, 𝛿)-DP guarantee on the word histogram for the domain considered, under the assumption that the dummy segments are indistinguishable from the true segments.
Post-processing of DP: any statistical analysis on the noisy words’ histogram does not cause loss in privacy
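The post-processing statement can be written in symbols. This is a sketch of the standard property, where $f$ stands for any analysis run on the noisy histogram and $O'$ for any set of its outputs; both names are ours, not the slides'.

```latex
% If A satisfies (\varepsilon, \delta)-DP, then for any function f
% (deterministic or randomized) the composition f \circ A also satisfies
% (\varepsilon, \delta)-DP: for the same pairs of histograms H_1, H_2
% and any set O' of outputs of f,
\Pr[f(A(H_1)) \in O'] \le e^{\varepsilon} \Pr[f(A(H_2)) \in O'] + \delta
```

Topic modeling, stylometry, document classification, and sentiment analysis on the received text are all instances of such an $f$.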
Takeaways
Pr𝜀𝜀ch as a privacy-preserving speech transcription system:
• Provides an improved performance relative to offline transcription
  • 2% to 32.52% relative improvement in WER
• Obfuscates the speakers’ voice biometrics
  • 0% accuracy in matching real speakers with their voice-converted speech
• Allows only a DP view of the textual content
Contact us: [email protected]