Speech Application Program Interface (SAPI)

Contributor(s): Joe Joseph

SAPI (Speech Application Program Interface) is an application program interface (API) provided with the Microsoft Windows operating system that lets programmers write programs with text-to-speech and speech recognition capabilities. Interfaces are provided for the C, C++, and Visual Basic programming languages. SAPI is built on Microsoft's COM (Component Object Model) architecture and is the most widely used speech application program interface today. Microsoft plans to embed speech technology using SAPI into its operating system.

SAPI has seven main components:

  • Voice Command: Voice Command is a high-level interface that provides command and control speech recognition for applications. It allows a developer to create a Voice Command menu containing voice commands, such as "new file" or "send mail to someone@anywhere.net," that a user speaks into a microphone or other audio device. The user can then control the computer without a keyboard or mouse.
  • Voice Dictation: Voice Dictation allows the user to dictate into any application that supports speech recognition. An invisible or virtual edit box receives the text the user dictates and displays it in an application window. Voice Dictation supports text formatting such as capitalization, translation of punctuation words into punctuation symbols, built-in glossary entries, and correction of the last word spoken or a selected word. Applications that use Voice Dictation classify speech by topic, because different topics use different language styles. Topics include e-mail speech, formal writing, and programming speech. Voice Dictation stores the information for each topic on the user's hard drive.
  • Voice Text: Voice Text converts text into speech that is played over computer speakers or sent over a telephone line. The speech played has several different modes, each with a different voice.
  • Voice Telephony: Voice Telephony uses telephony controls that are similar to Windows controls. Windows controls include buttons, list boxes, sliders and other objects that can be manipulated by a mouse or keyboard. Telephony controls are pieces of code that recognize spoken responses such as Yes or No, a phone number, the date, and the time. Telephony controls create a dialogue between the user and the computer. For example, a user calls a vendor to order an item and answers several questions by speaking into the telephone receiver. The telephony controls recognize these responses and send them to the application that processes them. Telephony controls also handle error conditions (which are common with spoken numbers or when the caller does not respond) and variations of answers such as "January 4th" or "tomorrow."
  • Direct Speech Recognition: This is a low-level interface similar to Voice Command. The main difference is that Direct Speech Recognition communicates directly with the speech engine, which gives the application more control and speed.
  • Direct Text To Speech: This is a low-level interface similar to Voice Text that also communicates directly with the speech engine.
  • Audio Objects: An Audio Object tells the speech engine where to get its audio.
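
The Voice Command menu described in the list above can be pictured as a table that maps spoken phrases to actions. The following is a minimal sketch of that idea in plain Python. It is not the SAPI interface: the class `VoiceCommandMenu` and its methods are hypothetical names invented for illustration, and the sketch assumes a speech engine has already turned the audio into text.

```python
# Illustrative sketch only: plain Python, not the actual SAPI interface.
# VoiceCommandMenu and its method names are hypothetical.
from typing import Callable, Dict, Optional


class VoiceCommandMenu:
    """A table of spoken phrases mapped to the actions they trigger."""

    def __init__(self) -> None:
        # Commands that must match the whole utterance, e.g. "new file".
        self._exact: Dict[str, Callable[[], str]] = {}
        # Commands followed by a spoken argument, e.g. "send mail to <address>".
        self._prefixed: Dict[str, Callable[[str], str]] = {}

    def add_command(self, phrase: str, action: Callable[[], str]) -> None:
        self._exact[phrase.lower()] = action

    def add_prefix_command(self, prefix: str, action: Callable[[str], str]) -> None:
        self._prefixed[prefix.lower()] = action

    def dispatch(self, recognized_text: str) -> Optional[str]:
        """Run the action for the recognized text, or return None if no command matches."""
        # Normalize case and whitespace before looking the phrase up.
        text = " ".join(recognized_text.lower().split())
        if text in self._exact:
            return self._exact[text]()
        for prefix, action in self._prefixed.items():
            if text.startswith(prefix + " "):
                return action(text[len(prefix) + 1:])
        return None


menu = VoiceCommandMenu()
menu.add_command("new file", lambda: "created new file")
menu.add_prefix_command("send mail to", lambda to: f"composing mail to {to}")

print(menu.dispatch("New File"))
print(menu.dispatch("send mail to someone@anywhere.net"))
print(menu.dispatch("open the pod bay doors"))  # not in the menu, so None
```

Restricting recognition to a fixed menu is what makes command and control simpler than dictation: the application only has to tell a small set of phrases apart, and anything outside the menu can be ignored.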

Future speech products are expected to let users do such things as surf the Internet by voice and ask a television what is showing tonight. Software developers are also working on applications that understand concepts. For example, if you tell your computer to print a certain document, the application will know whether to print it on your own printer or on the network printer. Speech technology is important for medical professionals, law enforcement personnel, and people with physical disabilities, as well as for many business and home users.

This was last updated in August 2005
