GENAI.md — Generative AI Usage Declaration
Assignment: DE2 - Assignment 3: Graphs or Clustering
Course: Data Engineering II - ESIEE Paris 2025-2026
Track: D - Aviation (OpenSky Network) | Path: A - Graph Processing
1. AI Tool Used
Tool: Claude / Gemini
2. Scope of AI assistance
- Code Generation: We asked the AI to help us complete the PySpark notebook templates, following the provided assignment instructions.
- Troubleshooting & Spark UI: We asked the AI to help us interpret our execution metrics (e.g., convergence in 4 iterations, performance jump from 24s to 0.5s) and to point out which specific Spark UI tabs to capture to prove our optimization.
- Formatting: We asked the AI to format our findings into a clean Markdown engineering report.
3. Manual verifications
We manually verified the core logic to ensure we fully understood the underlying platform mechanics:
- Graph Logic: Validated the self-join condition (`l.icao24 < r.icao24`) to ensure an undirected graph without duplicate edges or self-loops.
- Spark Mechanics: Ensured we understood why `.cache().count()` is needed at each iteration: it materializes the intermediate result so that Spark does not recompute the ever-growing DAG lineage, which prevents memory crashes.
- Domain Context: Confirmed that the 2°×2° grid and 5-minute window were realistic for European airspace.
- Execution & Evidence: Manually ran the cluster, extracted the execution plans (`.txt`), and captured all the required Spark UI screenshots.
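
The graph logic we verified can be illustrated with a minimal plain-Python sketch. It is not the notebook's PySpark code: the function names `bucket` and `build_edges`, the tuple layout of a position report, and the assumption that two aircraft are linked when they share a grid cell and time window are all hypothetical and used here only for illustration.

```python
from collections import defaultdict
from itertools import combinations

GRID_DEG = 2.0       # 2°×2° grid cell, as stated above
WINDOW_S = 5 * 60    # 5-minute window, in seconds


def bucket(lat, lon, ts):
    """Map a position report to a (lat cell, lon cell, time window) key."""
    return (int(lat // GRID_DEG), int(lon // GRID_DEG), int(ts // WINDOW_S))


def build_edges(reports):
    """Return undirected edges between aircraft that share a bucket.

    `reports` is an iterable of (icao24, lat, lon, unix_ts) tuples
    (a hypothetical layout, assumed for this sketch).
    """
    by_bucket = defaultdict(set)
    for icao24, lat, lon, ts in reports:
        by_bucket[bucket(lat, lon, ts)].add(icao24)

    edges = set()
    for aircraft in by_bucket.values():
        # Same effect as the join filter l.icao24 < r.icao24: sorting first
        # means combinations() only yields pairs with left < right, so there
        # are no self-loops and no (b, a) duplicate of an (a, b) edge.
        for left, right in combinations(sorted(aircraft), 2):
            edges.add((left, right))
    return edges
```

Because each pair is stored only in its `left < right` orientation, an aircraft seen several times in the same bucket still contributes a single edge per neighbour.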
4. Academic Integrity Statement
We confirm that every line of code in the submitted notebooks is understood. Generative AI was used to accelerate the implementation and format our findings, but the execution, evidence gathering, and platform understanding remain our own work.
Declared by Justine Guirauden and Volcy Desmazures - ESIEE Paris, April 2026