
information extraction (IE)

Information extraction (IE) is the automated retrieval of specific information related to a selected topic from a body or bodies of text.

Information extraction tools make it possible to pull information from text documents, databases, websites or multiple sources. IE may extract information from unstructured, semi-structured or structured machine-readable text. Usually, however, IE is used in natural language processing (NLP) to extract structured data from unstructured text.
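
As a rough illustration of turning unstructured text into structured data, the sketch below uses a hand-written pattern to pull acquisition facts out of a sentence. The pattern, the function name `extract_acquisitions`, the field names and the sample sentence are all hypothetical; real IE systems typically rely on far more robust techniques than a single regular expression.

```python
import re

# Hypothetical pattern for sentences shaped like:
#   "<Buyer> acquired <Target> for $<amount> million"
PATTERN = re.compile(
    r"(?P<buyer>[A-Z]\w+) acquired (?P<target>[A-Z]\w+) "
    r"for \$(?P<amount>\d+) million"
)


def extract_acquisitions(text):
    """Return one structured record (a dict) per match found in the text."""
    return [
        {
            "buyer": m["buyer"],
            "target": m["target"],
            "amount_musd": int(m["amount"]),
        }
        for m in PATTERN.finditer(text)
    ]


# Made-up sample sentence, used only for illustration.
text = "Yesterday Acme acquired Globex for $120 million. Analysts were surprised."
records = extract_acquisitions(text)
print(records)
```

Each record is a plain dictionary, so it could be written to a database row or a JSON document for further processing.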

Information extraction depends on named entity recognition (NER), a subtask that finds the targeted information to extract. NER first classifies each entity into one of several categories, such as location (LOC), person (PER) or organization (ORG). Once the category is recognized, an information extraction utility extracts the named entity’s related information and constructs a machine-readable document from it, which algorithms can process further to extract meaning. IE also finds meaning by way of other subtasks, including co-reference resolution, relationship extraction, language and vocabulary analysis and sometimes audio extraction.
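
The flow described above, recognizing entities by category and then building a machine-readable record, can be sketched with a simple lookup table. This is a minimal illustration only: the gazetteer entries, the function names and the sample sentence are assumptions, and production NER generally uses statistical or neural models rather than a fixed list of names.

```python
import re

# Hypothetical gazetteer mapping known names to entity categories.
GAZETTEER = {
    "Reuters": "ORG",
    "London": "LOC",
    "Ada Lovelace": "PER",
}


def recognize_entities(text):
    """Find gazetteer names in the text and tag each with its category."""
    entities = []
    for name, label in GAZETTEER.items():
        for m in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            entities.append({"text": name, "label": label, "start": m.start()})
    # Return entities in the order they appear in the text.
    return sorted(entities, key=lambda e: e["start"])


def extract_record(text):
    """Group the recognized entities by category into one structured record."""
    record = {"PER": [], "ORG": [], "LOC": []}
    for entity in recognize_entities(text):
        record[entity["label"]].append(entity["text"])
    return record


# Made-up sample sentence, used only for illustration.
sentence = "Ada Lovelace visited the Reuters office in London."
record = extract_record(sentence)
print(record)
```

Later subtasks such as co-reference resolution or relationship extraction would operate on records like this one rather than on the raw text.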

IE dates back to the early days of natural language processing in the 1970s. JASPER, an IE system built for Reuters by the Carnegie Group, is an early example. Current efforts in multimedia document processing, such as automatic annotation and content recognition and extraction from images and video, can be seen as IE as well.

Because of the complexity of language, high-quality IE is a challenging task for artificial intelligence (AI) systems.

This was last updated in January 2018
