Associate-Developer-Apache-Spark-3.5 Exam Question 1
What is the benefit of using Pandas on Spark for data transformations?
Options:
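The answer options are not reproduced here, so the following is only an illustrative sketch of what the pandas API on Spark offers: pandas-style code that can run on a distributed DataFrame instead of being limited to one machine's memory. The function names and the `category` and `amount` columns are hypothetical, and `build_distributed_frame` assumes a PySpark installation with pandas and PyArrow available.

```python
def total_by_category(frame):
    """Sum 'amount' per 'category' using pandas-style calls.

    The same method chain is accepted by a plain pandas DataFrame and by a
    pyspark.pandas DataFrame; only the latter is executed by Spark.
    The column names are hypothetical.
    """
    return frame.groupby("category")["amount"].sum().sort_index()


def build_distributed_frame(rows):
    """Create a pandas-on-Spark DataFrame from a dict of columns.

    Assumption: pyspark (with pandas and PyArrow) is installed. The import is
    kept local so that the helper above can be used without Spark.
    """
    import pyspark.pandas as ps

    return ps.DataFrame(rows)
```

With Spark available, `total_by_category(build_distributed_frame({"category": ["a", "b", "a"], "amount": [1, 2, 3]}))` would run the aggregation through Spark while keeping the familiar pandas syntax.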
Associate-Developer-Apache-Spark-3.5 Exam Question 2
A Spark application suffers from too many small tasks due to excessive partitioning. How can this be fixed without a full shuffle?
Options:
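The options for this question are missing from the page. As a hedged illustration, `DataFrame.coalesce` is the usual way to lower the partition count without a full shuffle, whereas `repartition` always shuffles. The helper name `reduce_partitions` and its parameter are assumptions made for this sketch.

```python
def reduce_partitions(df, target_partitions):
    """Merge existing partitions down to `target_partitions`.

    `df` is expected to be a Spark DataFrame. coalesce() combines existing
    partitions and avoids a full shuffle; it can only lower the partition
    count, so a larger target leaves the DataFrame's partitioning unchanged.
    """
    if target_partitions < 1:
        raise ValueError("target_partitions must be at least 1")
    return df.coalesce(target_partitions)
```

A call such as `reduce_partitions(df, 8)` returns a new DataFrame; the original `df` is not modified.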
Associate-Developer-Apache-Spark-3.5 Exam Question 3
A data engineer is asked to build an ingestion pipeline for a set of Parquet files delivered by an upstream team on a nightly basis. The data is stored in a directory structure with a base path of "/path/events/data". The upstream team drops daily data into the underlying subdirectories following the convention year/month/day.
A few examples of the directory structure are:

Which of the following code snippets will read all the data within the directory structure?
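The candidate snippets are not included on this page. One approach that fits the described layout, sketched here under assumptions, is the `recursiveFileLookup` read option, which makes Spark load files from all nested subdirectories and disables partition inference. The function name `read_all_events` is hypothetical; the base path is the one given in the question.

```python
def read_all_events(spark, base_path="/path/events/data"):
    """Read every Parquet file below `base_path`, however deeply nested.

    `spark` is expected to be an active SparkSession. With
    recursiveFileLookup enabled, Spark walks the year/month/day
    subdirectories and does not try to infer partition columns from them.
    """
    return (
        spark.read
        .option("recursiveFileLookup", "true")
        .parquet(base_path)
    )
```

Because partition inference is disabled, the year, month, and day do not appear as columns in the result; they would have to come from the data itself if needed.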
Associate-Developer-Apache-Spark-3.5 Exam Question 4
Which UDF implementation calculates the length of strings in a Spark DataFrame?
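The candidate implementations are not shown on this page. A minimal sketch of a string-length UDF, assuming PySpark is installed, might look like the following; the names `str_length` and `add_length_column` and the default column name are hypothetical.

```python
def str_length(value):
    """Return the length of a string, or None for a null input."""
    return None if value is None else len(value)


def add_length_column(df, source_col, target_col="length"):
    """Add `target_col` holding the length of the strings in `source_col`.

    Assumption: pyspark is installed. The import is kept local so that
    str_length can be exercised without a Spark runtime.
    """
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import IntegerType

    # Declare the return type explicitly; udf() defaults to StringType.
    length_udf = udf(str_length, IntegerType())
    return df.withColumn(target_col, length_udf(col(source_col)))
```

Outside an exam setting, the built-in `pyspark.sql.functions.length` is usually preferable to a Python UDF, since it avoids serializing rows to a Python worker.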
Associate-Developer-Apache-Spark-3.5 Exam Question 5
Given:
spark.sparkContext.setLogLevel("<LOG_LEVEL>")
Which set contains the suitable configuration settings for Spark driver LOG_LEVELs?
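The answer sets are not reproduced on this page. For reference, the levels documented for `SparkContext.setLogLevel` are ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE, and WARN. The sketch below validates a level before applying it; the helper name `set_driver_log_level` is an assumption made for illustration.

```python
# Levels documented for SparkContext.setLogLevel.
VALID_LOG_LEVELS = frozenset(
    {"ALL", "DEBUG", "ERROR", "FATAL", "INFO", "OFF", "TRACE", "WARN"}
)


def set_driver_log_level(spark, level):
    """Validate `level`, apply it through the SparkContext, and return it.

    `spark` is expected to be an active SparkSession. The level is
    upper-cased first so that "warn" and "WARN" are treated the same.
    """
    normalized = level.upper()
    if normalized not in VALID_LOG_LEVELS:
        raise ValueError(f"Unsupported log level: {level!r}")
    spark.sparkContext.setLogLevel(normalized)
    return normalized
```

A call such as `set_driver_log_level(spark, "warn")` applies WARN, while a value like "VERBOSE" is rejected before it reaches Spark.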
