Voice authentication is a biometric method of speaker recognition that measures the distinctive characteristics of individual voices to uniquely identify users.
Instead of a password, which can be forgotten or too weak to be secure, voice authentication lets people use their own voices as credentials. It can also be combined with other methods for multifactor authentication.
The technologies behind voice authentication evolved alongside advances in speech synthesis and speech recognition, two overlapping fields of study. Study of the structures used to produce speech revealed hundreds of measurable characteristics that differ from voice to voice. In combination, those metrics make up a unique voiceprint for each user that is harder to fake than a fingerprint.
There are two methods used in voice authentication: text-dependent (constrained mode) and text-independent (unconstrained mode). Text-dependent modes use scripted words, which may double as verbal passwords and can be changed. Text-independent modes can recognize individuals from whatever words are spoken, and thus can be used for surreptitious identification. In either method, the recorded audio waveforms are analysed to pick out hundreds of behavioral and physiological characteristics unique to the individual.
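As a loose illustration of that analysis step, the sketch below reduces a waveform to a crude "voiceprint" of average log energy in a few frequency bands and compares two prints by distance. This is a toy under stated assumptions, not any production system: real systems measure hundreds of features (and synthetic tones stand in for recorded speech), and all function names here are hypothetical.

```python
import math
import cmath

def voiceprint(samples, frame=128, n_bands=8):
    """Toy 'voiceprint': mean log energy in each of n_bands frequency
    bands, averaged over fixed-size frames. A real system would extract
    hundreds of behavioral and physiological characteristics."""
    band_width = (frame // 2) // n_bands
    totals = [0.0] * n_bands
    n_frames = 0
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        # Naive O(n^2) DFT of this frame -- fine for a toy example.
        spectrum = [abs(sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame)
                            for n in range(frame))) ** 2
                    for k in range(frame // 2)]
        for b in range(n_bands):
            energy = sum(spectrum[b * band_width:(b + 1) * band_width])
            totals[b] += math.log(energy + 1e-9)
        n_frames += 1
    return [t / n_frames for t in totals]

def distance(p, q):
    """Euclidean distance between voiceprints; a real system would
    accept the speaker when this falls below an enrolment threshold."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def synth_voice(pitch_hz, rate=8000, seconds=0.25):
    """Stand-in for recorded speech: a fundamental plus two harmonics."""
    return [math.sin(2 * math.pi * pitch_hz * t / rate)
            + 0.5 * math.sin(2 * math.pi * 2 * pitch_hz * t / rate)
            + 0.25 * math.sin(2 * math.pi * 3 * pitch_hz * t / rate)
            for t in range(int(rate * seconds))]

enrolled = voiceprint(synth_voice(110))                   # enrolment sample
same = distance(enrolled, voiceprint(synth_voice(112)))   # same speaker, slight drift
other = distance(enrolled, voiceprint(synth_voice(220)))  # different speaker
print(f"same speaker: {same:.2f}  different speaker: {other:.2f}")
```

The same-speaker distance comes out smaller than the different-speaker distance, which is the whole basis of the match decision; production systems use far richer features (e.g. cepstral coefficients) and statistical models rather than a single threshold on one distance.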
Gunnar Fant laid the foundations for voice authentication in the 1960s by modelling the physiological elements of speech production. His models were based on X-rays of individuals making specific phonetic sounds. Dr. Joseph Perkell built on Fant's work in 1970 with motion X-rays that also captured movements of the tongue and jaw. The first prototype voice authentication systems were built by Texas Instruments and used by the United States Air Force in 1976.