Generative AI Usage Disclosure

In accordance with academic and professional integrity guidelines, this document outlines the extent to which Generative AI tools (such as ChatGPT, Claude, or Gemini) were utilized during the completion of the DE2 Final Project (Track D — Aviation).

Generative AI was used strictly as a supportive assistant and conceptual guide. The core engineering, data modeling, and architectural decisions remain the original work of the project authors.

Specifically, AI tools were utilized for the following purposes:

  1. Code Assistance and Debugging: AI was used to help formulate and optimize specific PySpark queries (e.g., syntax for window functions, inverted index transformations, and PageRank loop structuring). It also assisted in troubleshooting execution errors and suggesting physical optimizations such as repartition and xxhash64.

  2. Spark Metrics and Monitoring Guidance: AI served as a technical guide to help identify which Spark metrics were needed to validate our Service Level Objectives (SLOs). It also provided instructions on navigating the Spark UI to extract the correct physical plans (e.g., identifying Exchange nodes before and after partitioning) and advised on which screenshots and JSON logs (such as lastProgress for streaming) to capture as evidence for the report.

  3. Report Drafting and Formatting: After the analytical work was completed, AI was used to help structure the final DE2_Project_Report.md. It assisted in translating raw notes and Spark execution times into clear, formatted Markdown tables, ensuring the document was professional, concise, and easy to read.

Note: All data generated, pipeline outputs, and recorded metrics are the direct result of our own execution of the code on our local machines. AI was not used to fabricate any results, metrics, or evaluation outcomes.