Stockholm Hadoop User Group (SHUG) is a group of enthusiasts of Apache Hadoop technology and its related projects. We are interested in distributed computing and processing large amounts of data using open-source technologies.
The goal of SHUG is to popularize these technologies by organizing regular meetings where we can discuss and share our experience, thoughts and observations related to both practical and non-practical (but still fun!) use of the Apache Hadoop Ecosystem.
Title: Presto: SQL-on-Anything
Abstract: Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes, ranging from gigabytes to petabytes. Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook. One key feature in Presto is the ability to query data where it lives via a uniform ANSI SQL interface. Presto’s connector architecture creates an abstraction layer for anything that can be represented in a columnar or row-like format, such as HDFS, Amazon S3, Azure Storage, NoSQL stores, relational databases, Kafka streams and even proprietary data stores. Furthermore, a single Presto query can combine data from multiple sources, allowing for analytics across an entire organization.
This talk will be co-presented by Wojciech Biela and Piotr Findeisen from Starburst, the enterprise Presto company and the largest contributor to Presto outside of Facebook. The talk will be a gentle introduction to Presto and its ability to query virtually any data source via its connector interface. Wojciech and Piotr will present some of the use cases of Presto querying various data sources, discuss the existing connectors in Presto, and describe the backing architectural concepts.
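To give a feel for the connector idea described in the abstract, here is a minimal sketch in Python. It is not Presto's actual connector interface (Presto itself is written in Java); the names `Connector`, `InMemoryConnector` and `federated_join`, as well as the sample tables, are hypothetical and exist only for illustration. The sketch assumes each source can expose a table as a stream of rows, and shows how an engine could then join rows from two different sources without caring where they live.

```python
from typing import Dict, Iterable, List, Protocol

Row = Dict[str, object]


class Connector(Protocol):
    """Hypothetical minimal connector: exposes a table as an iterable of rows."""

    def scan(self, table: str) -> Iterable[Row]: ...


class InMemoryConnector:
    """Stand-in for any real source (HDFS, a relational database, Kafka, ...)."""

    def __init__(self, tables: Dict[str, List[Row]]) -> None:
        self._tables = tables

    def scan(self, table: str) -> Iterable[Row]:
        return iter(self._tables[table])


def federated_join(left: Connector, left_table: str,
                   right: Connector, right_table: str,
                   key: str) -> List[Row]:
    """Hash-join rows coming from two different connectors on a shared key."""
    # Build a lookup over the right-hand source.
    index: Dict[object, List[Row]] = {}
    for row in right.scan(right_table):
        index.setdefault(row[key], []).append(row)

    # Stream the left-hand source and emit one merged row per match.
    joined: List[Row] = []
    for row in left.scan(left_table):
        for match in index.get(row[key], []):
            joined.append({**match, **row})
    return joined


# Two hypothetical sources: an event store and a CRM database.
events = InMemoryConnector({"page_views": [
    {"customer_id": 1, "url": "/home"},
    {"customer_id": 2, "url": "/cart"},
    {"customer_id": 3, "url": "/faq"},
]})
crm = InMemoryConnector({"customers": [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Linus"},
]})

result = federated_join(events, "page_views", crm, "customers", "customer_id")
```

The join logic only talks to the `scan` method, so swapping one source for another would not change it. This is the general shape of the abstraction the abstract refers to, reduced to a toy.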
Wojciech Biela is a co-founder of Starburst and is responsible for product development. He has a background of over 13 years of building products and running engineering teams. Previously Wojciech was the Engineering Manager at the Teradata Center for Hadoop, running the Presto engineering operations in Warsaw, Poland. Prior to that, back in 2011, he built and ran the Polish engineering team, a subsidiary of Hadapt Inc., a pioneer in the SQL-on-Hadoop space. Hadapt was acquired by Teradata in 2014. Earlier, Wojciech built and led teams on multi-year projects, from custom big e-commerce & SCM platforms to PoS systems. Wojciech holds an M.S. in Computer Science from the Wroclaw University of Technology.
Piotr Findeisen is a Software Engineer and a founding member of the Starburst team. He contributes to the Presto code base and is also active in the community. Piotr has been involved in the design and development of significant features like the cost-based optimizer (still in development), spill to disk, correlated subqueries and a plethora of smaller enhancements. Before Starburst, Piotr worked at Teradata and was the top external Presto committer of the year. Prior to that, he was a Team Leader at Syncron (provider of cloud services for supply chain management), responsible for their product's technical foundation and performance. Piotr holds an M.S. in Computer Science (and a B.Sc. in Mathematics) from the University of Warsaw.
18:45 break with food and drink
19:15 Q and A
19:30 more drinks
More information about the second talk to come.