automatic content classification

Contributor(s): Corinne Bernstein

Automatic content classification is a process for managing text and unstructured information by categorizing or clustering text. By labeling natural language texts with relevant categories from a predefined set, automatic document classification enables users to organize content quickly and efficiently.

While manual document classification may be highly detailed and accurate, it is time-consuming and subjective. Automatic document classification is faster, scalable and more objective. It provides organizations with a more systematic and consistent classification and can be useful in more complex, nuanced contexts, such as business-specific content. Machine learning and artificial intelligence can boost the speed and efficiency of automatic document classification.
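As a rough illustration of how machine learning can automate this kind of labeling, the sketch below trains a small multinomial naive Bayes classifier on a handful of labeled texts and then assigns a category to new text. Naive Bayes is only one possible technique and is chosen here for brevity; the class name, the training examples and the "spam"/"ham" labels are all hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesClassifier:
    """Minimal multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()               # documents seen per label
        self.word_counts = defaultdict(Counter)   # word frequencies per label
        self.vocab = set()                        # all words seen in training

    def train(self, labeled_docs):
        for text, label in labeled_docs:
            self.doc_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # Start from the log prior, then add the log likelihood of each word.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # ignore words never seen during training
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data, far too small for real use.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]

classifier = NaiveBayesClassifier()
classifier.train(TRAINING)
print(classifier.predict("claim your free prize"))
print(classifier.predict("project meeting on monday"))
```

A production system would need far more training data, and would typically rely on an established library rather than a hand-written model.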

The automated classification of texts into predefined categories has gained attention in the past 10 to 15 years due to the increased availability of documents in digital form and the need to get them organized. Today, text classification is applied in many contexts, including document filtering, email spam filtering, automated document metadata generation, word sense disambiguation and hierarchical catalogs of web resources.

Because automatic document classification software defines the requirements for organizing content at the outset, there needs to be a clear, objective configuration of the categories and classification rules before testing, customization and refinement can be performed. Key elements of text classification include the ability to analyze the intent, emotion and sentiment of textual data.
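A minimal sketch of what an up-front configuration of categories and rules might look like is shown below. It assumes a simple keyword-matching scheme; the category names, keywords and the `classify` function are hypothetical, and real classification products generally use richer rule languages.

```python
import re

# Hypothetical configuration: each predefined category maps to keywords
# that count as evidence for it. Categories and keywords are illustrative.
RULES = {
    "billing": {"invoice", "refund", "payment", "charge"},
    "support": {"error", "crash", "bug", "password"},
    "sales": {"pricing", "quote", "demo", "upgrade"},
}


def classify(text, rules=RULES, default="uncategorized"):
    """Return the category whose keywords match the most tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = {
        category: sum(token in keywords for token in tokens)
        for category, keywords in rules.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to the default label when no rule matched at all.
    return best if scores[best] > 0 else default


print(classify("I was charged twice and need a refund"))
print(classify("The app shows an error after I reset my password"))
print(classify("Hello there"))
```

Keeping the rules in a single data structure makes the refinement step concrete: testing the classifier against sample content shows which keywords are missing or misleading, and the configuration can be adjusted without touching the matching logic.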

Text classification helps companies understand customer behavior by categorizing conversations on social networks, comment sections and other web sources. Having an effective and consistent automatic content classification system can provide better customer relationship management (CRM), enhance findability for key audiences and improve an organization's ability to monetize customer-generated information.

This was last updated in June 2018
