Contributor(s): Matthew Haughn

Presto is a free and open source distributed SQL query engine designed for the demands of big data.

Presto can run analytic queries on data ranging from gigabytes to petabytes, which enables it to search huge data warehouses. Presto offers speeds close to those of commercial solutions without excessive hardware requirements.

Presto was built to run interactive analytic queries and return results about as quickly as a commercial data warehouse. It scales to very large deployments, including Facebook’s 300 PB data warehouse, and a single query can draw on multiple data sources. Presto queries data where it resides and supports Hive, Cassandra, relational databases and proprietary data stores.
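As a rough illustration of querying data where it resides, the sketch below builds a single SQL statement that joins a table in Hive with a table in Cassandra. Presto addresses tables as catalog.schema.table; the catalog, schema, table and column names used here (hive.web.orders, cassandra.app.users, user_id, country) and both helper functions are hypothetical and not taken from any real deployment. The code only assembles the query text and does not connect to a cluster.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified Presto table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def build_federated_query() -> str:
    """Build one SQL statement that joins tables from two different catalogs."""
    # Hypothetical catalogs, schemas, tables and columns, for illustration only.
    orders = qualified("hive", "web", "orders")
    users = qualified("cassandra", "app", "users")
    return (
        "SELECT u.country, count(*) AS order_count\n"
        f"FROM {orders} AS o\n"
        f"JOIN {users} AS u ON o.user_id = u.user_id\n"
        "GROUP BY u.country"
    )


if __name__ == "__main__":
    print(build_federated_query())
```

The resulting text could be submitted through whichever Presto client is in use. The point is that both sources appear in one statement, so no data has to be copied into a single store first.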

Data analysts use Presto for its fast response times, which range from less than a second to minutes. Facebook uses Presto itself: more than 1,000 Facebook employees use it daily to run more than 30,000 queries, which together scan over a petabyte of data per day on average.

This was last updated in December 2017


