“Big data” is a buzzword in today’s world, and many businesses are looking into how to handle their own. A common solution is cloud-based data services. Redshift and Athena, both Amazon products, are tools that have helped make cloud-based data warehousing a more interactive, current, and analytical answer to big data problems. Both are great ways to analyze data, but each has its own advantages and disadvantages, so I wanted to discuss Redshift vs. Athena.
Redshift is a fully managed data warehouse that runs in the cloud. It’s based on PostgreSQL 8.0.2 and is designed to deliver fast query and I/O performance for datasets of any size. Athena is an interactive query service that lets you analyze data stored in Amazon Simple Storage Service (S3) using standard SQL. The biggest, most obvious difference (in my opinion) between the two is that Redshift requires the user to set up collections of servers called clusters, while Athena is serverless: you point it at data in S3 and run queries, with no infrastructure to provision. A further breakdown of the differences between Redshift and Athena can be found here.
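To make the cluster difference concrete, here is a minimal sketch of what each service asks for up front, assuming you use boto3 (the AWS SDK for Python). The helper names, database name, bucket, and node type are hypothetical placeholders I made up for illustration; only the keyword names follow boto3's `start_query_execution` (Athena) and `create_cluster` (Redshift) calls.

```python
def athena_query_request(sql, database, output_location):
    """Build keyword arguments for boto3's Athena start_query_execution call.

    Athena only needs the SQL, a database name, and an S3 location for results.
    """
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def redshift_cluster_request(cluster_id, node_type, number_of_nodes,
                             db_name, master_username, master_password):
    """Build keyword arguments for boto3's Redshift create_cluster call.

    Redshift needs a cluster to exist first, so you choose hardware up front.
    """
    request = {
        "ClusterIdentifier": cluster_id,
        "NodeType": node_type,
        "DBName": db_name,
        "MasterUsername": master_username,
        "MasterUserPassword": master_password,
    }
    if number_of_nodes > 1:
        # NumberOfNodes is only needed for multi-node clusters.
        request["ClusterType"] = "multi-node"
        request["NumberOfNodes"] = number_of_nodes
    else:
        request["ClusterType"] = "single-node"
    return request


def run_athena_query(sql, database, output_location):
    """Submit a query to Athena and return its execution ID.

    Assumes boto3 is installed and AWS credentials are configured.
    """
    import boto3  # imported here so the helpers above work without boto3

    client = boto3.client("athena")
    response = client.start_query_execution(
        **athena_query_request(sql, database, output_location)
    )
    return response["QueryExecutionId"]
```

Notice that the Athena request never mentions servers at all, while the Redshift request is entirely about them. A call such as `run_athena_query("SELECT 1", "demo_db", "s3://example-bucket/results/")` would be enough to start querying, whereas with Redshift you would first create the cluster and wait for it to become available before connecting.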