For Assignment 1, I used generative AI (ChatGPT) to help me:
- Understand the difference between RDD and DataFrame approaches – I asked for explanations of how Spark reads data line by line with RDDs versus column-wise with DataFrames, and why this leads to different word counts.
- Interpret the Spark code – I got step-by-step explanations for code snippets like the RDD word-count pipeline (`flatMap`, `map`, `reduceByKey`, `sortBy`) and the DataFrame equivalent using `split`, `explode`, and `groupBy`.
- Explain performance metrics – I asked for a breakdown of the wall time, RSS, and peak memory metrics displayed by the `%%timemem` cell magic.
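The RDD pipeline mentioned above can be mirrored step by step in plain Python. This is a sketch of the same transformation sequence (flatMap → map → reduceByKey → sortBy) on ordinary lists, not actual Spark code; the sample `lines` data is invented for illustration:

```python
# Pure-Python mirror of the Spark RDD word-count pipeline (illustrative only).
lines = ["spark makes word counts easy", "word counts with spark"]

# flatMap: split each line into words and flatten into one list
words = [w for line in lines for w in line.split(" ")]

# map: pair each word with an initial count of 1
pairs = [(w, 1) for w in words]

# reduceByKey: sum the counts for each distinct word
counts = {}
for w, n in pairs:
    counts[w] = counts.get(w, 0) + n

# sortBy: order words by descending count
top = sorted(counts.items(), key=lambda kv: -kv[1])
print(top[:3])  # → [('spark', 2), ('word', 2), ('counts', 2)]
```

In Spark itself these steps would be lazy transformations distributed across partitions; the DataFrame version (`split`, `explode`, `groupBy().count()`) expresses the same logic declaratively so the optimizer can plan it column-wise.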
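`%%timemem` appears to be a course-provided magic, so its internals are not shown here. As a rough illustration of what "wall time" and "peak memory (RSS)" mean, this standard-library sketch measures both around a workload (Unix-only, since it uses `resource`; the list-building workload is an arbitrary example):

```python
import time
import resource

start = time.perf_counter()
# Arbitrary workload to measure: build a list of a million integers
data = list(range(1_000_000))
wall = time.perf_counter() - start  # elapsed real ("wall clock") time in seconds

# Peak resident set size (RSS) of this process so far:
# kilobytes on Linux, bytes on macOS
peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f"wall time: {wall:.3f}s, peak RSS: {peak}")
```

Wall time reflects total elapsed real time (including any I/O or scheduling delays), while RSS is the physical memory actually resident for the process; the peak value captures the high-water mark rather than the memory in use at the end of the cell.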