Has anyone gotten every part of this setup working together?

  • Spark Thrift Server (aka SparkSQL) running on the master node, with worker nodes registered to it

  • Parquet files on S3, with Hive tables pointing to the S3 locations

  • No HDFS

  • SparkSQL queries against the Hive tables, sent from beeline/JDBC to the Spark Thrift Server
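
For anyone trying to reproduce this, here is a rough sketch of how the pieces might be wired together, written as a small Python helper that only builds the two command lines (it does not start anything). The host name `spark-master.example.com` and the ports are placeholders, not values from my cluster, and it assumes a standalone Spark master and the stock `sbin/start-thriftserver.sh` and `beeline` scripts.

```python
import shlex


def thrift_server_command(master_host, master_port=7077, thrift_port=10000):
    """Build a start-thriftserver.sh command attached to a standalone master.

    master_host and both ports are placeholders; adjust them for your cluster.
    """
    return [
        "sbin/start-thriftserver.sh",
        # Attach to the standalone master so registered workers run the queries.
        "--master", f"spark://{master_host}:{master_port}",
        # Port that beeline/JDBC clients connect to.
        "--hiveconf", f"hive.server2.thrift.port={thrift_port}",
    ]


def beeline_command(thrift_host, thrift_port=10000):
    """Build a beeline command that connects to the Thrift Server over JDBC."""
    return ["beeline", "-u", f"jdbc:hive2://{thrift_host}:{thrift_port}"]


if __name__ == "__main__":
    host = "spark-master.example.com"  # hypothetical host name
    print(shlex.join(thrift_server_command(host)))
    print(shlex.join(beeline_command(host)))
```

Dropping the `--master` argument is roughly the "workers not registered" case, where the Thrift Server runs the queries locally.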

If I don’t register the workers with the Spark master, the queries work fine. Once I register the workers with the master, the queries fail and the workers log errors saying the query plan was not found.
