information extraction (IE)

Contributor(s): Matthew Haughn

Information extraction (IE) is the automated retrieval of specific information related to a selected topic from a body or bodies of text.

Information extraction tools make it possible to pull information from text documents, databases, websites or multiple sources. IE may extract information from unstructured, semi-structured or structured machine-readable text. Usually, however, IE is used in natural language processing (NLP) to extract structured information from unstructured text.
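
As a rough illustration of turning unstructured text into a structured record, the following minimal sketch pulls dates and dollar amounts out of a sentence using regular expressions. The patterns, the function name extract_record and the sample sentence are all hypothetical and chosen only for illustration; production IE systems typically rely on much richer rules or trained models.

```python
import re

# Hypothetical patterns for illustration only; they cover one date format
# (YYYY-MM-DD) and simple dollar amounts such as "$2.5 million".
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
MONEY_RE = re.compile(r"\$(\d+(?:\.\d+)?)\s*(million|billion)?")

SCALE = {"million": 1_000_000, "billion": 1_000_000_000}


def extract_record(text):
    """Turn one free-text sentence into a structured record (a dict)."""
    record = {"dates": [], "amounts_usd": []}
    for match in DATE_RE.finditer(text):
        record["dates"].append(match.group(0))
    for match in MONEY_RE.finditer(text):
        # group(2) is None when no scale word follows the number.
        value = float(match.group(1)) * SCALE.get(match.group(2), 1)
        record["amounts_usd"].append(value)
    return record


sample = "On 2018-01-15 the company announced a $2.5 million investment."
print(extract_record(sample))
```

The resulting dictionary is machine-readable, so it could be stored in a database or passed to further processing, which is the general goal the paragraph above describes.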

Information extraction depends on named entity recognition (NER), a subtask used to find the targeted information to extract. NER first classifies each entity into one of several categories, such as location (LOC), person (PER) or organization (ORG). Once the category is recognized, an information extraction utility extracts the named entity’s related information and constructs a machine-readable document from it, which algorithms can further process to extract meaning. IE finds meaning by way of other subtasks, including co-reference resolution, relationship extraction, language and vocabulary analysis and sometimes audio extraction.
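
The two steps described above, recognizing entities by category and then extracting related information, can be sketched in a few lines. This is a toy example under stated assumptions: a small hand-made lookup table (GAZETTEER) stands in for a real NER model, the relation rule looks only for the words "joined" or "works for", and all names, including Jane Doe and Acme Corp, are hypothetical.

```python
import re

# Hypothetical gazetteer: a tiny lookup table standing in for a trained NER model.
GAZETTEER = {
    "Jane Doe": "PER",
    "Acme Corp": "ORG",
    "London": "LOC",
}


def recognize_entities(text):
    """Return (name, category, start offset) for each gazetteer hit, in text order."""
    hits = []
    for name, label in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            hits.append((name, label, match.start()))
    return sorted(hits, key=lambda hit: hit[2])


def extract_relations(text):
    """Link each PER to the next ORG if 'joined' or 'works for' lies between them."""
    entities = recognize_entities(text)
    relations = []
    for i, (name, label, start) in enumerate(entities):
        if label != "PER":
            continue
        for other, other_label, other_start in entities[i + 1:]:
            if other_label == "ORG":
                between = text[start + len(name):other_start]
                if re.search(r"\b(joined|works for)\b", between):
                    relations.append(
                        {"person": name, "relation": "employed_by", "organization": other}
                    )
                break  # only consider the nearest following ORG
    return relations


text = "Jane Doe joined Acme Corp in London."
print(recognize_entities(text))
print(extract_relations(text))
```

The list of relation dictionaries plays the role of the machine-readable document mentioned above. A real system would also need co-reference resolution to connect a later "she" or "the company" back to these entities.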

IE dates back to the early days of natural language processing in the 1970s. An early example is JASPER, an IE system built for Reuters by the Carnegie Group. Current efforts in multimedia document processing, including automatic annotation and content recognition and extraction from images and video, can be seen as IE as well.

Because of the complexity of language, high-quality IE is a challenging task for artificial intelligence (AI) systems.

This was last updated in January 2018
