In my case, the data is aggregated using Spark and written to HDFS. Afterwards, it needs to be copied over to a Postgres database. What approach have you found works best?

Right now I’m using a Python script to copy the data from HDFS to a temp directory and then load it into Postgres with COPY. This seems highly inefficient, so I’m looking for alternatives. Sqoop is an option, but some people have mentioned JDBC connection issues.
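
For reference, here is a rough sketch of the current approach with the temp directory cut out: the output of `hdfs dfs -cat` is piped straight into COPY ... FROM STDIN. It assumes the Spark job wrote headerless CSV, that the `hdfs` CLI is on the PATH, and that `conn` is a psycopg2 connection. The function names, table name, and column names are made up for illustration.

```python
import subprocess


def quote_ident(name):
    """Double-quote a Postgres identifier, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_copy_sql(table, columns):
    """Build a COPY ... FROM STDIN statement for headerless CSV input."""
    cols = ", ".join(quote_ident(c) for c in columns)
    return f"COPY {quote_ident(table)} ({cols}) FROM STDIN WITH (FORMAT csv)"


def stream_hdfs_to_postgres(conn, hdfs_path, table, columns):
    """Pipe `hdfs dfs -cat` output into Postgres without a temp file.

    `conn` is assumed to be a psycopg2 connection; `hdfs_path` may be a
    glob such as a directory of part files (assumption about the layout).
    """
    proc = subprocess.Popen(
        ["hdfs", "dfs", "-cat", hdfs_path], stdout=subprocess.PIPE
    )
    try:
        with conn.cursor() as cur:
            # psycopg2 reads from the pipe in chunks and feeds them to COPY.
            cur.copy_expert(build_copy_sql(table, columns), proc.stdout)
        proc.stdout.close()
        if proc.wait() != 0:
            raise RuntimeError("hdfs dfs -cat exited with a non-zero status")
        # Commit only after both the COPY and the hdfs read succeeded.
        conn.commit()
    except Exception:
        proc.kill()
        proc.wait()
        conn.rollback()
        raise
    finally:
        proc.stdout.close()
```

Calling it would look something like `stream_hdfs_to_postgres(conn, "/output/agg/part-*", "agg", ["id", "total"])`, where the path and names are placeholders. The load stays single-threaded, so this only removes the intermediate disk write. It does not parallelize the transfer.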

Edit: title typo…RDBMS*


