Spark Performance Tuning – Part 4 | Data Exposed

This post has been republished via RSS; it originally appeared at: Channel 9.

This week's Data Exposed show welcomes back Maxim Lukiyanov to talk more about Spark performance tuning with Spark 2.x. Maxim is a Senior PM on the big data HDInsight team and is in the studio today to present the final part of his 4-part series.

Topics in today's video:

[00:45] - Intro

[02:15] - Advanced Partitioning and Bucketing

[10:30] - Advanced Joins: Joining Large Tables

[19:00] - Debugging and Recap

Spark 2.2 rc4 on Azure HDInsight: Script action
