Presto is a SQL-on-Hadoop engine that Facebook uses to query its data, most of which is unstructured. It is a distributed system that runs on a cluster of machines. Queries are submitted to a coordinator, which parses and analyses them and then distributes the work to the worker nodes.
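The coordinator and worker roles described above can be illustrated with a small sketch. This is not Presto's implementation: threads stand in for worker nodes, the parsing and analysis steps are skipped, and the names SPLITS, worker_count_greater_than and coordinator_count_greater_than are hypothetical, chosen only for this illustration. It shows the general shape of the idea: the coordinator hands each worker a piece of the data, and the partial results are merged into one answer.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "table", already divided into splits, loosely
# analogous to how a distributed file system stores a large file as blocks.
SPLITS = [
    [3, 8, 1],
    [7, 2],
    [9, 4, 6],
]


def worker_count_greater_than(split, threshold):
    """Worker task: scan one split and return a partial count."""
    return sum(1 for value in split if value > threshold)


def coordinator_count_greater_than(splits, threshold, num_workers=3):
    """Coordinator: give each split to a worker, then merge the partial counts."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(
            pool.map(lambda s: worker_count_greater_than(s, threshold), splits)
        )
    return sum(partials)


if __name__ == "__main__":
    # Roughly: SELECT COUNT(*) FROM t WHERE value > 5
    print(coordinator_count_greater_than(SPLITS, 5))
```

In a real cluster the workers would be separate machines reading their splits from HDFS, but the division of labour is similar: the coordinator plans and merges, and the workers scan.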
Basic requirements of Presto include:
Linux or Mac OS X
Java 7, 64-bit
Presto does not use MapReduce and thus requires only HDFS.
Facebook had previously developed a query engine called Hive for the same purpose and released it as open source. Hive is now used successfully by companies and organizations that deploy Hadoop.
Facebook recently announced that it is open-sourcing Presto, which is 10 times faster than Hive for most queries, according to Facebook software engineer Martin Traverso. Presto was up and running in early 2013 and is now actively used by more than 1,000 employees, who run more than 30,000 queries each day against a petabyte-scale database. Presto also supports a large subset of ANSI SQL, including sub-queries and joins.
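To give a sense of the kind of query this refers to, here is a minimal sketch of a statement that combines a join with a sub-query. It runs against Python's built-in sqlite3 module purely as a stand-in so that the example is self-contained. It does not connect to Presto, and the users and page_views tables and their contents are hypothetical.

```python
import sqlite3


def run_example():
    """Run a join plus sub-query against a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables and rows, created only for this illustration.
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE page_views (user_id INTEGER, views INTEGER);
        INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
        INSERT INTO page_views VALUES (1, 10), (1, 30), (2, 5), (3, 50);
    """)
    # Join users to their page views, and keep only users whose total
    # exceeds the average views per row (computed by the sub-query).
    query = """
        SELECT u.name, SUM(p.views) AS total_views
        FROM users u
        JOIN page_views p ON p.user_id = u.id
        GROUP BY u.name
        HAVING SUM(p.views) > (SELECT AVG(views) FROM page_views)
        ORDER BY total_views DESC
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, total in run_example():
        print(name, total)
```

The SQL text itself uses only standard constructs (JOIN, GROUP BY, HAVING, a scalar sub-query), so a statement of this shape is the sort of query the ANSI SQL support is meant to cover.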