Redaction, a silent foundation of the debt collections industry
Redaction of sensitive information and PII is one of the most important and complex cogs in the tech stack that drives the financial services industry, and debt collections in particular. As a debt travels from origination to default to collections and then to maturity or charge-off, it accumulates a massive amount of information, a significant share of which resides in audio recordings and transcripts of interactions with the debtor. A debt is owned and managed by several organizations through its lifecycle, and its data is exposed to the companies that work with those organizations.
To reduce the risk this exposure creates, organizations must control the visibility of not just the structured quantitative data but also the unstructured interactions with the debtor (primarily audio recordings). Redaction, while a silent tool, is therefore critical to a seamless flow of information across entities.
Prodigal’s Redaction model in use
Automated redaction of various kinds of sensitive information is one of the most recent and crucial solutions offered by Prodigal, augmenting our productivity and compliance suite of products. Its applications cut across the debt collections industry, ranging from debt originators sharing recordings of past debtor conversations with contact centers to masking data from third-party software used for internal productivity.
Examples of Redaction Indicator Language:
- “Card number”
- “CVV code”
- “Code on the back/front”
- “three/four-digit code”
- “Go ahead with that number”
- “Give me that number”
- “Bank account number”
- “Routing number”
- “Expiration date”
- “Repeat that card number”
- “Sixteen digits”
- “last four of your SSN”
- “last four of your social”
- “ZIP code for address ..”
- “date/year of birth”
- “alternate phone number”
- “best number to reach”
- “balance on the account”
Numerical data includes:
- Spoken digits (e.g. “one”, “two”, “nineteen”, “twenty”, “thirty-one”, etc.)
Prodigal’s off-the-shelf redaction algorithm offers several modules, including Social Security number, address, credit card details, payment mode information (routing number, checking account number), debtor demographics, and debtor name, among others. Prodigal offers different levels of Redaction, based on how comfortable customers are with their personnel accessing sensitive data.
The specific data types targeted for Redaction depend on the level of Redaction the customer chooses. Sensitive data redacted from call transcripts are replaced with a special token (typically “*”). Redacted audio in call recordings is replaced with a soft beep.
When Redaction is enabled, data is redacted from transcripts and audio in a fully automated manner and is permanently and irreversibly destroyed; no un-redacted data is stored anywhere in the Prodigal system.
What are the Redaction Levels available?
Prodigal offers 3 different levels of Redaction, namely:
PCI Data Redaction (Level 1)
Within PCI Data Redaction, the system removes information such as debit and credit card numbers (15 or 16 digits), expiration dates, CVV codes, and PINs.
PCI and PII Data Redaction (Level 2)
In addition to PCI Data, at Level 2 the system removes numeric data such as account and routing numbers for bank accounts, the last four digits of the SSN, numeric portions of mailing addresses, ZIP codes, and information about the date and year of birth. Collectively, this is PII that is often used in consumer debt collection conversations to complete Right Party Verification as mandated by FDCPA regulations. As one would expect, more data is redacted at Level 2 than at Level 1.
Numeric Data Redaction (Level 3)
For extreme safety, Prodigal offers a Level 3 redaction option in which all numeric data is redacted. In addition to the data redacted at Level 2, this may include information such as balance due, payment amounts, settlement offers, phone numbers, and specific dates mentioned in the conversation. With this information redacted, the audio and call transcripts are free of sensitive numeric data and can be viewed or heard freely without a security risk.
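The three levels are cumulative, which can be sketched as a small lookup structure. This is only an illustration under stated assumptions: the category names below are hypothetical labels derived from the descriptions above, not Prodigal's actual configuration, and the Level 3 entries are examples only, since Level 3 covers all numeric data.

```python
from enum import IntEnum


class RedactionLevel(IntEnum):
    PCI = 1          # Level 1: PCI data
    PCI_PII = 2      # Level 2: PCI and PII data
    ALL_NUMERIC = 3  # Level 3: all numeric data


# Hypothetical category names: what each level adds on top of the previous one.
ADDED_AT_LEVEL = {
    RedactionLevel.PCI: {"card_number", "expiration_date", "cvv", "pin"},
    RedactionLevel.PCI_PII: {
        "bank_account_number", "routing_number", "ssn_last_four",
        "address_numbers", "zip_code", "date_of_birth",
    },
    # Examples only: Level 3 redacts every numeric token, not a fixed list.
    RedactionLevel.ALL_NUMERIC: {
        "balance_due", "payment_amount", "settlement_offer",
        "phone_number", "other_dates",
    },
}


def categories_for(level):
    """Return every category redacted at `level`, including lower levels."""
    categories = set()
    for lvl in RedactionLevel:
        if lvl <= level:
            categories |= ADDED_AT_LEVEL[lvl]
    return categories


if __name__ == "__main__":
    for level in RedactionLevel:
        print(level.name, sorted(categories_for(level)))
```

Modeling each level as "what it adds" keeps the cumulative relationship explicit: choosing a higher level can only widen the set of redacted categories.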
How does Redaction work?
Although redaction is one of the most critical components of the financial services industry, technological research and literature on it remain severely limited. Prodigal’s machine learning team explored some of the most popular approaches used across industries to identify the right blend for the accounts receivable management (ARM) industry.
The Prodigal Redaction process searches call transcripts for topics, phrases, and keywords (collectively called Redaction Indicator Language) that indicate the likely presence of sensitive data nearby. When Redaction Indicator Language is found, nearby “numerical” words and phrases are redacted: they are removed from both transcripts and audio.
Redaction is therefore targeted and proximity-based. This approach allows data that is potentially valuable for analytics or call review to remain intact in transcripts and audio. Upon customer request, the Redaction algorithms can be relaxed or tightened, leading to less or more aggressive data redaction respectively.
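A minimal sketch of the proximity-based idea follows. It is not Prodigal's implementation: the phrase subset, the number-word vocabulary, the `window` parameter, and all function names are assumptions chosen for illustration. Numeric tokens within `window` tokens of an indicator phrase are replaced with the mask token.

```python
import re

# Assumption: a small subset of the indicator phrases listed earlier.
INDICATOR_PHRASES = ["card number", "cvv code", "routing number", "expiration date"]

# Assumption: a minimal vocabulary of spoken number words.
NUMBER_WORDS = set(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen twenty "
    "thirty forty fifty sixty seventy eighty ninety hundred thousand".split()
)


def normalize(token):
    """Lowercase a token and strip punctuation other than hyphens."""
    return re.sub(r"[^a-z0-9-]", "", token.lower())


def is_numeric(word):
    """True for digit strings and spoken numbers such as 'thirty-one'."""
    if word.isdigit():
        return True
    return all(part in NUMBER_WORDS for part in word.split("-"))


def find_indicator_spans(words):
    """Return (start, end) token indices of every indicator phrase found."""
    spans = []
    for phrase in INDICATOR_PHRASES:
        parts = phrase.split()
        for i in range(len(words) - len(parts) + 1):
            if words[i:i + len(parts)] == parts:
                spans.append((i, i + len(parts) - 1))
    return spans


def redact_tokens(tokens, window=8, mask="*"):
    """Mask numeric tokens within `window` tokens of an indicator phrase."""
    words = [normalize(t) for t in tokens]
    spans = find_indicator_spans(words)
    redacted = list(tokens)
    for i, word in enumerate(words):
        near = any(start - window <= i <= end + window for start, end in spans)
        if near and is_numeric(word):
            redacted[i] = mask
    return redacted


if __name__ == "__main__":
    text = ("can I get the card number please sure it is "
            "four one one one and I called at nine")
    print(" ".join(redact_tokens(text.split())))
```

In this example the four digits spoken after "card number" are masked, while "nine" at the end of the utterance falls outside the window and is kept. A larger `window` redacts more aggressively; a smaller one redacts less.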
Removal of audio data is achieved by noting the time ranges at which redacted words occur (as time-stamped in the transcript), and writing segments of silence to the audio file for those time ranges.
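The time-range step can be sketched as follows, assuming mono PCM samples already loaded into memory and redaction ranges given in seconds. The function name and the in-memory representation are hypothetical; a real pipeline would also handle file I/O, multiple channels, and codecs.

```python
import math
from array import array


def silence_ranges(samples, sample_rate, ranges):
    """Return a copy of `samples` with each (start_s, end_s) range zeroed.

    samples     -- array of signed PCM samples (mono), e.g. array("h", ...)
    sample_rate -- samples per second
    ranges      -- iterable of (start_seconds, end_seconds) from the transcript
    """
    out = array(samples.typecode, samples)  # copy; the input is left untouched
    for start_s, end_s in ranges:
        # Widen to whole samples and clamp to the bounds of the recording.
        first = max(0, math.floor(start_s * sample_rate))
        last = min(len(out), math.ceil(end_s * sample_rate))
        for i in range(first, last):
            out[i] = 0  # a zero-valued sample is digital silence
    return out


if __name__ == "__main__":
    audio = array("h", [1000] * 8)  # one second of audio at 8 samples/second
    cleaned = silence_ranges(audio, sample_rate=8, ranges=[(0.25, 0.5)])
    print(cleaned.tolist())  # [1000, 1000, 0, 0, 1000, 1000, 1000, 1000]
```

Rounding the start down and the end up errs on the side of silencing slightly more audio rather than leaving a fragment of a redacted word audible.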
Because Prodigal’s redaction uses a targeted, proximity-based approach to finding numerical data near Redaction Indicator Language, it is possible for “over-redaction” (redaction applied to non-sensitive data) or “under-redaction” (redaction not applied to sensitive data) to occur.
For example, if a credit card number is spoken outside of any typical “payment” context, it may not be detected and redacted in Level 1. Such occurrences are extremely rare but theoretically possible in Level 1 redaction. They are even rarer in Level 2 and impossible in Level 3.
Likewise, if numerical data occurs near Redaction Indicator Language, it may be redacted even though the data is not sensitive.
In general, however, Prodigal’s redaction can be appropriately “tuned” to suit customer needs and balance the security requirements (redact more aggressively) with usability of transcripts and audio for analytics (redact less aggressively).
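One hypothetical way to reason about this trade-off is to count over- and under-redaction on a hand-labeled call while varying how far from the indicator language the algorithm looks. The sketch below assumes token positions as integers and a single tunable `window`; neither is confirmed as how Prodigal tunes its models.

```python
def redaction_errors(numeric_positions, sensitive_positions,
                     indicator_positions, window):
    """Return (over_redacted, under_redacted) counts for a given window.

    numeric_positions   -- token indices of all numeric words in the call
    sensitive_positions -- the subset a human labeled as truly sensitive
    indicator_positions -- token indices where indicator language occurs
    window              -- how many tokens away from an indicator to redact
    """
    sensitive = set(sensitive_positions)
    redacted = {
        p for p in numeric_positions
        if any(abs(p - i) <= window for i in indicator_positions)
    }
    over = len(redacted - sensitive)   # redacted but not sensitive
    under = len(sensitive - redacted)  # sensitive but not redacted
    return over, under


if __name__ == "__main__":
    numeric = {7, 8, 9, 10, 14, 30}
    sensitive = {7, 8, 9, 10}
    indicators = [5]
    for window in (3, 5, 10, 30):
        print(window, redaction_errors(numeric, sensitive, indicators, window))
```

In this toy data a window of 3 misses two sensitive tokens, a window of 5 catches exactly the sensitive ones, and wider windows begin to redact non-sensitive numbers, which mirrors the security-versus-usability balance described above.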