Voicesense makes an attractive promise to its customers: give us someone's voice and we will tell you what they are going to do. The Israeli company analyzes voice on real-time calls to judge whether customers can repay a debt, whether they can afford a more expensive product, or whether they are the best candidates for a job.

In recent years, researchers and startups have begun to pay attention to the surprising amount of information that can be extracted from voices, especially as the popularity of voice assistants like Amazon's Alexa makes consumers ever more comfortable talking to their devices. According to a report by the market research firm IDTechEx, the voice technology market is growing and is expected to reach $15.5 billion by 2029. "Almost everyone talks, and there are lots of devices that can capture voice, from your phone to assistants like Alexa or Google Home," said Satrajit Ghosh, a research scientist at MIT's McGovern Institute for Brain Research. "Voice is gradually being used everywhere in today's life."

Voice is not only everywhere; it is also deeply personal and hard to fake. People talk to Alexa in their homes, and digital voice assistants are increasingly being used in hospitals. Apps such as Maslo let users speak directly about their issues. Voice is a form of data that can tell us about ourselves and tell others about us. That has inspired a wave of interesting research into how voice analysis can enrich our lives, along with growing concerns about privacy and how this data will be used.

The key to successful voice analysis research lies not only in the content of a sentence, but in how it is said: the tone, speed, emphasis, and pauses. With machine learning, researchers take labeled samples from two groups (say, people with anxiety and people without) and feed the data into an algorithm. The algorithm learns to pick out the subtle signals that indicate whether someone belongs to group A or group B, and it can then do the same with new samples in the future.
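To make that two-group setup concrete, here is a minimal sketch in Python with scikit-learn. The three acoustic features (pitch variability, speaking rate, pause ratio), the synthetic data, and the choice of classifier are all illustrative assumptions, not any company's actual pipeline; real systems extract many more features from raw audio.

```python
# Minimal sketch of supervised two-group voice classification.
# Features and data are synthetic stand-ins for real acoustic measurements.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Pretend each row is one speaker: [pitch variability, speaking rate, pause ratio]
group_a = rng.normal([0.8, 3.5, 0.30], 0.1, size=(100, 3))  # label 1: "anxious"
group_b = rng.normal([1.1, 4.2, 0.15], 0.1, size=(100, 3))  # label 0: "not anxious"
X = np.vstack([group_a, group_b])
y = np.array([1] * 100 + [0] * 100)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The algorithm learns which combinations of cues separate the groups...
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# ...and can then score voices it has never heard before.
print("held-out accuracy:", clf.score(X_test, y_test))
```

In practice the hard part is upstream of this sketch: extracting reliable acoustic features from noisy recordings and collecting honestly labeled samples at scale.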

Louis-Philippe Morency, a computer scientist at Carnegie Mellon University and a founder of SimSensei, a project that can help detect depression from a person's voice, says: "The results can sometimes be contrary to our intuition." In some early studies attempting to relate characteristics of a voice to suicide risk, Morency's team found that people with pleasant-sounding voices were more likely to be suicidal than those with angry-sounding voices. But that research is only preliminary, and the links are often not so simple. Often there are many complex features and patterns of speech that only algorithms can detect.

"We can make predictions about your health, work and entertainment"

Researchers have developed voice-based algorithms to help diagnose many conditions, from Parkinson's disease to trauma-related disorders. For many, the technology's greatest potential lies in the link between speech analysis and mental health, in the hope of unobtrusively monitoring and helping people at risk of relapse.

People with unstable health conditions are closely monitored while they are in the hospital, "but there are many problems that can occur in their daily lives," said David Ahern, who leads the Digital Behavioral Health program at Brigham and Women's Hospital. Outside the hospital's supervision, he said, daily life can wear people down, and a person with depression may not even realize they have relapsed. "These events happen when they are not connected to any health system. And if a situation gets so bad that they have to go to the emergency room, it's too late," Ahern said. "The idea of a sensor in your pocket that can track patients' activities and behaviors is quite appealing. It could help warn us sooner."

Ahern is currently investigating a health monitoring system called CompanionMx, which launched in December 2018. Patients record audio diaries in the app, and the program analyzes those recordings along with metadata such as call logs and location to score the patient's status on four factors: depressed mood, diminished interest, avoidance, and fatigue. It then tracks how those scores change over time. The information, protected under the federal HIPAA privacy law, is shared with patients and also presented in a dashboard for clinicians.
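As a rough illustration of what tracking those scores over time could look like, here is a hypothetical sketch in Python. The four factor names come from the text above; the 0-to-1 scoring scale, the seven-day window, and the alert threshold are invented for this example and are not CompanionMx's actual model.

```python
# Hypothetical four-factor tracking: each audio diary yields per-factor
# scores, and the system flags factors whose recent average has worsened.
from dataclasses import dataclass
from datetime import date
from statistics import mean

FACTORS = ("depressed_mood", "diminished_interest", "avoidance", "fatigue")

@dataclass
class DiaryEntry:
    day: date
    scores: dict  # factor name -> 0.0 (absent) .. 1.0 (severe)

def worsening_factors(entries: list[DiaryEntry], window: int = 7,
                      threshold: float = 0.15) -> list[str]:
    """Flag factors whose recent average rose past the earlier baseline."""
    recent, earlier = entries[-window:], entries[:-window]
    if not earlier:
        return []  # not enough history to establish a baseline
    flags = []
    for f in FACTORS:
        baseline = mean(e.scores[f] for e in earlier)
        current = mean(e.scores[f] for e in recent)
        if current - baseline > threshold:
            flags.append(f)
    return flags

# Ten stable days, then a week in which fatigue scores climb.
entries = [DiaryEntry(date(2019, 1, d), {f: 0.2 for f in FACTORS})
           for d in range(1, 11)]
entries += [DiaryEntry(date(2019, 1, 10 + d),
                       {**{f: 0.2 for f in FACTORS}, "fatigue": 0.5})
            for d in range(1, 8)]
print(worsening_factors(entries))  # ['fatigue']
```

A clinician dashboard of the kind the article describes could surface flags like these alongside the raw trends, leaving interpretation to the care team.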

In pilot studies, 95 percent of patients recorded an audio diary at least once a week, and clinicians accessed the dashboard at least once a day. Those numbers are promising, although many questions remain about the individual components: the app, the feedback, the dashboard. The research is still ongoing, and further results have been published. CompanionMx is also planning to work with other health organizations and is in talks with the Department of Veterans Affairs.

Services like Voicesense, CallMiner, RankMiner, and Cogito also promise to apply voice analysis in business. Mostly that means improving service at call centers, but Voicesense has bigger dreams. "Today, we are able to create a complete personal profile," said CEO Yoav Degani. His ambitions go well beyond keeping customers happy: he is interested in everything from predicting loan defaults and insurance claims to revealing customers' investment styles, assessing job candidates, and judging whether employees are likely to quit. "We may not be 100 percent accurate, but our numbers are still impressive. We can make predictions about your health, work and entertainment."

In one case study, Voicesense tested its technology with a major European bank. The bank provided voice samples from several thousand borrowers whose repayment histories it already knew. Voicesense ran its algorithm on the samples and sorted the recordings into three risk tiers: low, medium, and high. In that analysis, only 6 percent of the low-risk group had defaulted, compared with 27 percent of the high-risk group. In another assessment, this time of whether employees would quit, only 13 percent of the low-risk group left, compared with 39 percent of the high-risk group.
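The bookkeeping behind an evaluation like that is simple: take each borrower's predicted risk tier and known outcome, then compute the default rate within each tier. The sketch below fabricates records to mirror the reported 6 percent and 27 percent figures; the actual study used several thousand real borrowers and included a medium tier as well.

```python
# Per-tier default rates from (predicted tier, defaulted?) records.
# Counts are invented to reproduce the article's reported percentages.
from collections import defaultdict

records = [("low", False)] * 94 + [("low", True)] * 6 \
        + [("high", False)] * 73 + [("high", True)] * 27

totals, defaults = defaultdict(int), defaultdict(int)
for tier, defaulted in records:
    totals[tier] += 1
    defaults[tier] += defaulted

for tier in ("low", "high"):
    print(f"{tier}-risk default rate: {defaults[tier] / totals[tier]:.0%}")
# low-risk default rate: 6%
# high-risk default rate: 27%
```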

What happens if the algorithm is wrong?

Ghosh, the MIT scientist, said: "There is nothing alarming here. But as with any prediction technology, errors will occur if the analysis is not done well. In general, until we see evidence that something has been validated on X number of people, it is impossible to fully trust the judgments. Voice characteristics can vary quite a lot unless you get enough samples, which is why we stay away from making absolute judgments."

For his part, Degani says that Voicesense's voice processing algorithm measures more than 200 parameters per second and can be applied to many different languages, including tonal languages such as Mandarin. The program is still in the testing phase, but the company has been in contact with major banks and investors, most of whom are impressed by the potential of such technology.

Customer service is one thing, but Robert D'Ovidio, a criminal justice professor at Drexel University, worries that some of the applications Voicesense proposes could be discriminatory. Imagine calling a mortgage company that analyzes your voice, concludes you are at high risk of heart disease, and sorts you into a "high-risk customer" group that gets treated differently. "We will need consumer protection laws to combat this kind of collection," he said.

Ryan Calo, a professor at the University of Washington School of Law, points out that some of these consumer protections already exist: voice is considered a biometric identifier, and some states, like Illinois, have biometric privacy laws. Concerns about sensitive attributes such as race or gender, Calo adds, are common across machine learning, whether the technique is applied to speech or to resumes. But people often feel especially uncomfortable when machine learning is used on faces or voices, perhaps in part because these characteristics are so personal. And while anti-discrimination laws exist, voice analysis raises many questions about what constitutes discrimination, a problem our society has not yet solved.

"I hope that as we move forward, we will realize that this is just data, no matter which form it is, like a series of numbers entered into a spreadsheet," D'Ovidio said. . At the very least, he added, we should have the right to know when our information is exploited. “And I also want to see changes in consumer protection laws. What happens when the algorithms are wrong? ”