semi-structured data

Contributor(s): Ivy Wigmore

Semi-structured data is data that has not been organized into a specialized repository, such as a database, but that nevertheless has associated information, such as metadata, that makes it more amenable to processing than raw data.

The difference between structured data, unstructured data and semi-structured data:
Unstructured data has not been organized into a format that makes it easy to access and process. In reality, very little data is completely unstructured: even items that are often considered unstructured, such as documents and images, are structured to some extent. Structured data is the opposite: it has been formatted and its elements organized into a data structure, so that those elements can be addressed, organized and accessed in various combinations to make better use of the information. Semi-structured data lies between the two. It lacks the complex organization that makes sophisticated access and analysis possible, but it may have associated information, such as metadata tags, that allows the elements it contains to be addressed.
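The distinction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample note, the metadata field names (`customer`, `keywords`) and the `orders` table are all hypothetical, and an in-memory SQLite database stands in for "a database" in general.

```python
import sqlite3

# Unstructured: free text. The only generic way in is to scan the whole string.
note = "Ana Ruiz ordered 3 widgets on 2014-11-02 and asked for express shipping."

# Semi-structured: the same text, plus metadata tags (hypothetical field names)
# that let some elements be addressed directly.
tagged_note = {
    "body": note,
    "metadata": {"customer": "Ana Ruiz", "keywords": ["order", "express"]},
}

# Structured: the elements are broken out into typed columns of a database table,
# so they can be combined and filtered in a query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer TEXT, item TEXT, quantity INTEGER, order_date TEXT)"
)
conn.execute(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    ("Ana Ruiz", "widget", 3, "2014-11-02"),
)

# Unstructured access: substring search over the raw text.
found_by_scan = "express" in note.lower()

# Semi-structured access: look up an element by its tag.
customer_from_tag = tagged_note["metadata"]["customer"]

# Structured access: combine conditions on several elements at once.
total_widgets = conn.execute(
    "SELECT SUM(quantity) FROM orders WHERE item = ? AND order_date >= ?",
    ("widget", "2014-11-01"),
).fetchone()[0]
```

The tagged note can answer "who is the customer?" without parsing the text, but only the table supports questions that combine item, quantity and date.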

Here's an example: A Word document is generally considered to be unstructured data. However, you can add metadata to it, such as keyword tags that represent the document's content, which makes the document easier to find when people search for those terms. The data is now semi-structured. Nevertheless, the document still lacks the complex organization of a database, so it falls short of being fully structured data.
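A minimal sketch of that idea follows. It is an assumption-laden illustration, not how Word or any search tool actually stores or indexes metadata: the `TaggedDocument` class, the `find_by_keyword` helper and the file names are hypothetical, and keywords are assumed to be stored in lowercase.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedDocument:
    """A document whose body stays free text but which carries keyword metadata."""
    filename: str
    body: str
    keywords: set[str] = field(default_factory=set)  # assumed stored in lowercase


def find_by_keyword(documents, term):
    """Return the filenames of documents tagged with the given keyword."""
    wanted = term.strip().lower()
    return [doc.filename for doc in documents if wanted in doc.keywords]


docs = [
    TaggedDocument("q3-report.docx", "Revenue grew in the third quarter.",
                   {"finance", "quarterly report"}),
    TaggedDocument("offsite-notes.docx", "Ideas from the team offsite.",
                   {"planning"}),
    TaggedDocument("untitled.docx", "Some text with no tags at all."),
]

# The tagged document is found by its keyword, without reading its body.
matches = find_by_keyword(docs, "Finance")

# A word that appears only in the body is not found: the body is still unstructured.
body_only = find_by_keyword(docs, "revenue")
```

The tags make the documents findable, but nothing here lets you query across the contents of the bodies, which is why the result still falls short of structured data.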

In reality, the boundaries between the three categories overlap considerably, and the categories are sometimes described collectively as the data continuum.

This was last updated in November 2014
