
semi-structured data

Contributor(s): Ivy Wigmore

Semi-structured data is data that has not been organized into a specialized repository, such as a database, but that nevertheless has associated information, such as metadata, that makes it more amenable to processing than raw data.

The difference between structured data, unstructured data and semi-structured data:
Unstructured data has not been organized into a format that makes it easier to access and process. In reality, very little data is completely unstructured. Even things that are often considered unstructured data, such as documents and images, are structured to some extent. Structured data is essentially the opposite: it has been formatted and its elements organized into a data structure so that they can be addressed, organized and accessed in various combinations to make better use of the information. Semi-structured data lies somewhere between the two. It is not organized in a way that makes sophisticated access and analysis possible; however, it may have information associated with it, such as metadata tags, that allows the elements it contains to be addressed.
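As a rough illustration of the three categories, the sketch below holds the same hypothetical customer note in each form. The names, field labels and values are invented for the example and do not come from any particular system; JSON and an in-memory SQLite table are simply convenient stand-ins for "tagged data" and "a database."

```python
import json
import sqlite3

# Unstructured: free text with no addressable elements.
unstructured = "Ana Ruiz called on 2014-11-03 about a late invoice."

# Semi-structured: the same text plus self-describing tags
# (the field names here are hypothetical).
semi_structured = json.loads("""
{
  "text": "Ana Ruiz called on 2014-11-03 about a late invoice.",
  "metadata": {"customer": "Ana Ruiz", "keywords": ["invoice", "late"]}
}
""")

# Structured: elements stored under a fixed schema inside a database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (customer TEXT, call_date TEXT, topic TEXT)")
conn.execute(
    "INSERT INTO calls VALUES (?, ?, ?)",
    ("Ana Ruiz", "2014-11-03", "late invoice"),
)

# Tagged elements can be addressed by name...
customer_from_tags = semi_structured["metadata"]["customer"]

# ...while a database supports queries across fields and rows.
rows = conn.execute(
    "SELECT customer FROM calls WHERE call_date >= '2014-11-01'"
).fetchall()
```

The tagged record lets a program pick out the customer directly, but only the database version supports arbitrary queries that combine fields, which is the kind of sophisticated access the definition reserves for structured data.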

Here's an example: A Word document is generally considered to be unstructured data. However, you can add metadata in the form of keywords and other tags that represent the document's content and make the document easier to find when people search for those terms. With those tags attached, the data is semi-structured. Nevertheless, the document still lacks the complex organization of a database, so it falls short of being fully structured data.
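The keyword-tagging idea can be sketched in a few lines. This is a minimal illustration under assumed names: the `Document` class and `find_by_keyword` function are hypothetical and are not part of Word or any document management product.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A document body plus optional keyword metadata (hypothetical model)."""
    body: str
    keywords: set = field(default_factory=set)


def find_by_keyword(docs, term):
    """Return documents whose keyword tags include the term, ignoring case."""
    term = term.lower()
    return [d for d in docs if term in {k.lower() for k in d.keywords}]


docs = [
    Document("Q3 revenue rose while costs held steady..."),
    Document("Minutes from the planning meeting..."),
]

# Adding tags makes the first document semi-structured:
# its content is unchanged, but it can now be found by keyword.
docs[0].keywords.update({"finance", "quarterly report"})

matches = find_by_keyword(docs, "Finance")
```

Here the search never inspects the document body; it relies entirely on the attached tags, so the untagged second document cannot be found this way.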

In practice, the three categories overlap considerably, and they are sometimes described collectively as the data continuum.

This was last updated in November 2014

