Avro (Apache Avro)

Contributor(s): Matthew Haughn

Apache Avro is a row-oriented object container storage format for Hadoop, as well as a remote procedure call (RPC) and data serialization framework. Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. Avro is optimized for write operations and includes a wire format for communication between nodes.

Avro lets different nodes exchange data by pairing the serialized data with its data definition (the schema). Avro uses JavaScript Object Notation (JSON) to define data types and protocols, and the data itself is serialized in an efficient, compact binary format. An Avro container file consists of a header followed by one or more file data blocks.
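To make the split between a JSON schema and binary data concrete, here is a minimal sketch in Python using only the standard library. The `User` schema and the helper names (`encode_long`, `encode_string`, `encode_record`) are hypothetical and chosen for illustration. The sketch handles only the `string` and `long` types, and it is not the API of any official Avro library. It assumes the binary encoding rules of the Avro specification: a long is zig-zag encoded and written as a variable-length integer, a string is a length followed by UTF-8 bytes, and a record is its fields written in schema order.

```python
import json

# Hypothetical schema for illustration; Avro schemas are written in JSON.
SCHEMA = json.loads("""
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age",  "type": "long"}
  ]
}
""")


def encode_long(n: int) -> bytes:
    """Zig-zag encode a signed 64-bit integer, then emit it as a base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ...
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """A string is its UTF-8 byte length (as a long) followed by the bytes."""
    raw = s.encode("utf-8")
    return encode_long(len(raw)) + raw


def encode_record(schema: dict, record: dict) -> bytes:
    """Concatenate the field encodings in schema order (no field names, no tags)."""
    encoders = {"string": encode_string, "long": encode_long}
    out = bytearray()
    for field in schema["fields"]:
        out += encoders[field["type"]](record[field["name"]])
    return bytes(out)


encoded = encode_record(SCHEMA, {"name": "Ada", "age": 36})
print(encoded.hex())
```

Note that the encoded record carries no field names or type tags, which is why a reader needs the schema to interpret the bytes.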

The header is made up of:

  • 4 bytes: the ASCII characters “Obj” followed by the byte value 1
  • File metadata, including the schema definition
  • A sync marker: a randomly generated 16-byte value
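The header layout above can be sketched in Python with only the standard library. This is an illustrative sketch, not the official Avro implementation: the function names are hypothetical, and it assumes, per the Avro specification, that the file metadata is stored as an Avro map from strings to bytes under keys such as `avro.schema` and `avro.codec`, which the text above does not name.

```python
import io
import json
import os

MAGIC = b"Obj\x01"   # ASCII "Obj" followed by the byte value 1
SYNC_SIZE = 16       # length of the sync marker in bytes


def write_long(n: int) -> bytes:
    """Zig-zag + varint encoding used for Avro longs (counts and lengths)."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def read_long(stream) -> int:
    """Inverse of write_long: read a varint, then undo the zig-zag mapping."""
    shift, result = 0, 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("truncated long")
        result |= (raw[0] & 0x7F) << shift
        if not raw[0] & 0x80:
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)


def write_bytes(b: bytes) -> bytes:
    return write_long(len(b)) + b


def read_bytes(stream) -> bytes:
    return stream.read(read_long(stream))


def build_header(schema: dict, sync: bytes) -> bytes:
    """Magic bytes, then the metadata map, then the 16-byte sync marker."""
    meta = {
        "avro.schema": json.dumps(schema).encode("utf-8"),
        "avro.codec": b"null",  # assumed here: "null" means no compression
    }
    out = bytearray(MAGIC)
    out += write_long(len(meta))       # one map block holding every entry
    for key, value in meta.items():
        out += write_bytes(key.encode("utf-8"))
        out += write_bytes(value)
    out += write_long(0)               # a zero count terminates the map
    out += sync
    return bytes(out)


def parse_header(data: bytes):
    """Return (metadata dict, sync marker) or raise ValueError on a bad header."""
    stream = io.BytesIO(data)
    if stream.read(len(MAGIC)) != MAGIC:
        raise ValueError("not an Avro container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:
            break
        if count < 0:                  # negative count: a byte size follows
            count = -count
            read_long(stream)
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    sync = stream.read(SYNC_SIZE)
    if len(sync) != SYNC_SIZE:
        raise ValueError("truncated sync marker")
    return meta, sync


demo_schema = {"type": "record", "name": "User",
               "fields": [{"name": "name", "type": "string"}]}
header = build_header(demo_schema, os.urandom(SYNC_SIZE))
metadata, sync_marker = parse_header(header)
print(json.loads(metadata["avro.schema"])["name"], len(sync_marker))
```

Because the schema travels in the header, a reader can decode the data blocks that follow without any out-of-band agreement about the record layout.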

In addition to JSON, Avro provides its own interface description language (IDL), called Avro IDL, for defining data types and protocols. Avro IDL eases adoption by users who are accustomed to more traditional IDLs, whose syntax resembles C/C++.

Avro is a top-level project sponsored by the Apache Software Foundation (ASF).

This was last updated in January 2018
