Forensic voice analysis has become commonplace in criminal and intelligence cases. While it is useful for identifying suspects, it is not immune to error, and some experts have called for this type of evidence to be excluded from courtrooms altogether.
Voice analysis has become a prominent feature of criminal forensics
Voice analysis has become fairly common in criminal cases, and several high-profile cases have memorably featured it, demonstrating its use in a variety of investigations. For example, after journalist James Foley, who was kidnapped in 2012, was beheaded by ISIS in 2014, the group released a video of the murder featuring a masked terrorist who spoke on camera. Experts from around the world attempted to identify the speaker by analyzing his voice.
Additionally, in 2013, former computer intelligence consultant Edward Snowden, a one-time Central Intelligence Agency (CIA) employee who was then working as a contractor for the National Security Agency (NSA), leaked classified NSA information. The documents revealed that the NSA had extracted and analyzed millions of phone conversations, demonstrating the extent to which voice analysis is currently being used in the US.
Given the frequency of phone calls now made each day, and the ability to retain phone conversation data, it is unsurprising that voice analysis has become a key tool in the criminal forensic scientist’s toolbox.
What is the process of voice analysis?
Voice analysis in forensics can involve several different techniques. Which ones are used depends on the case and the type of evidence the investigator has to work with.
Usually, the process of voice analysis involves one or more of the following techniques: interpreting noises or verifying a recording’s authenticity; transcribing dialogue from a voice recording; profiling a speaker based on features of their speech, such as dialect, language spoken, or the content of the conversation; and placing a suspect’s voice in a lineup of voices.
Speech fragments are most often obtained from phone calls, including ransom demands, hoax calls, and calls to the emergency services. Voicemails are also common sources of vocal information, as are secretly recorded conversations and voices captured in videos.
Challenges in forensic voice analysis
Perhaps the biggest challenge in forensic voice analysis is the often poor quality of the fragments experts have to work with. In general, recordings of telephone conversations do not contain enough information to support the fine-grained analysis needed to make detailed distinctions between speech sounds. Usually, a frequency band twice as broad as that provided by phone call data is required to distinguish between similar-sounding consonants, such as m and n.
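The effect of a narrow band can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the passband edges (300 to 3,400 Hz is used as a stand-in for phone audio), the "twice as broad" band, and the two toy spectra, which are not measurements of real consonants. It only shows the general idea that two sounds differing solely above the band's upper edge become indistinguishable once everything outside the band is discarded.

```python
import math

# Toy illustration only: the passband edges and both spectra below are
# assumptions chosen for the example, not measurements of real speech.
NARROW_BAND = (300.0, 3400.0)   # assumed telephone-style passband, in Hz
WIDE_BAND = (300.0, 6500.0)     # same lower edge, twice the width (3100 Hz -> 6200 Hz)


def band_limit(spectrum, band):
    """Keep only the components whose frequency lies inside the band."""
    low_hz, high_hz = band
    return {f: a for f, a in spectrum.items() if low_hz <= f <= high_hz}


def spectral_distance(s1, s2):
    """Euclidean distance between two sparse spectra; missing components count as 0."""
    freqs = set(s1) | set(s2)
    return math.sqrt(sum((s1.get(f, 0.0) - s2.get(f, 0.0)) ** 2 for f in freqs))


# Two hypothetical sounds that share their low-frequency components and
# differ only above the narrow band's upper edge (frequency in Hz -> amplitude).
sound_a = {500.0: 1.0, 1500.0: 0.6, 4500.0: 0.4}
sound_b = {500.0: 1.0, 1500.0: 0.6, 5500.0: 0.4}

narrow_gap = spectral_distance(band_limit(sound_a, NARROW_BAND),
                               band_limit(sound_b, NARROW_BAND))
wide_gap = spectral_distance(band_limit(sound_a, WIDE_BAND),
                             band_limit(sound_b, WIDE_BAND))

print(f"narrow band distance: {narrow_gap:.3f}")
print(f"wide band distance:   {wide_gap:.3f}")
```

In this toy setup the two sounds are identical within the narrow band (distance of zero) and only separate once the wider band is available. Real speech is far messier, so this should be read as an intuition aid rather than a model of how examiners compare consonants.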
Additionally, voice recordings are often subject to background noise, which reduces the quality of the sample and makes the speech harder to distinguish. Recordings can also be short or years old, making them even more difficult to analyze. In situations where experts need to recreate the conversation to analyze the call, background noise, short duration, and the age of the recording can make these recreations incredibly difficult.
The reliability of voice analysis
While voice recognition has become commonplace in criminal investigations and particularly in intelligence investigations, experts often dispute its reliability and believe its applications are limited.
One particular concern is the use of forensic phonetic expertise in courts. Over the years, many experts have argued that the forensic vocal evidence presented in court relies on techniques that have been discredited. For example, recent data published by INTERPOL revealed that as many as half of forensic experts are still using audio analysis techniques that have been widely discredited. The danger is that misinterpretations of voice recordings can lead to injustices, such as wrongful convictions.
A related concern is the CSI effect, as it is known: the phenomenon in which jurors have unrealistic expectations of the capabilities of forensic analyses because popular crime television programs, like CSI, portray forensic science as highly accurate, indisputable, and often the only source of evidence on which to base a criminal conviction. As a result, guilty defendants may be wrongfully acquitted where no forensic evidence is presented. Equally, the accuracy of forensic techniques is overestimated, so inferences drawn from forensic analyses, which may be flawed, can contribute to wrongful convictions.
For example, in the 1990s, Jerome Prieto was wrongfully convicted of a car bombing after his voice was incorrectly identified as that of the person who made a phone call claiming responsibility for the crime. Following this incident, the French Acoustical Society issued a public request in 1997 to end the use of forensic voice analysis in the courtroom.
Given that hundreds of voice investigations are estimated to be conducted in each country every year, and that as few as 20% contain as much as 20 seconds of usable voice, it is vital that regulations are established to govern the use of voice analysis in criminal cases. Additionally, it is important to manage jurors’ expectations of forensic science, particularly in terms of the capabilities and accuracy of voice analysis.
Further Reading