Browse Definitions :
Definition

semi-structured data

Semi-structured data is data that has not been organized into a specialized repository, such as a database, but that nevertheless has associated information, such as metadata, that makes it more amenable to processing than raw data.

The difference between structured data, unstructured data and semi-structured data:
Unstructured data has not been organized into a format that makes it easier to access and process. In reality, very little data is completely unstructured. Even things that are often considered unstructured data, such as documents and images, are structured to some extent. Structured data is basically the opposite of unstructured: It has been reformatted and its elements organized into a data structure so that elements can be addressed, organized and accessed in various combinations to make better use of the information. Semi-structured data lies somewhere between the two. It is not organized in a complex manner that makes sophisticated access and analysis possible; however, it may have information associated with it, such as metadata tagging, that allows elements contained to be addressed.

Here's an example: A Word document is generally considered to be unstructured data. However, you can add metadata tags in the form of keywords and other metadata that represent the document content and make it easier for that document to be found when people search for those terms -- the data is now semi-structured. Nevertheless, the document still lacks the complex organization of the database, so falls short of being fully structured data.

In reality, there is considerable overlap between the boundaries of the three categories, which are sometimes described collectively as the data continuum.

This was last updated in November 2014

Continue Reading About semi-structured data

SearchCompliance

  • information governance

    Information governance is a holistic approach to managing corporate information by implementing processes, roles, controls and ...

  • enterprise document management (EDM)

    Enterprise document management (EDM) is a strategy for overseeing an organization's paper and electronic documents so they can be...

  • risk assessment

    Risk assessment is the identification of hazards that could negatively impact an organization's ability to conduct business.

SearchSecurity

  • unified threat management (UTM)

    Unified threat management (UTM) describes an information security (infosec) system that provides a single point of protection ...

  • physical security

    Physical security is the protection of personnel, hardware, software, networks and data from physical actions and events that ...

  • attack vector

    An attack vector is a path or means by which an attacker or hacker can gain access to a computer or network server in order to ...

SearchHealthIT

SearchDisasterRecovery

  • risk mitigation

    Risk mitigation is a strategy to prepare for and lessen the effects of threats faced by a business.

  • call tree

    A call tree is a layered hierarchical communication model that is used to notify specific individuals of an event and coordinate ...

  • Disaster Recovery as a Service (DRaaS)

    Disaster recovery as a service (DRaaS) is the replication and hosting of physical or virtual servers by a third party to provide ...

SearchStorage

  • cloud storage

    Cloud storage is a service model in which data is transmitted and stored on remote storage systems, where it is maintained, ...

  • cloud testing

    Cloud testing is the process of using the cloud computing resources of a third-party service provider to test software ...

  • storage virtualization

    Storage virtualization is the pooling of physical storage from multiple storage devices into what appears to be a single storage ...

Close