Associate-Developer-Apache-Spark-3.5 Exam Question 31
4 of 55.
A developer is working on a Spark application that processes a large dataset using SQL queries. Despite having a large cluster, the developer notices that the job is underutilizing the available resources. Executors remain idle most of the time, and the logs reveal that the number of tasks per stage is very low. The developer suspects that this is causing suboptimal cluster performance.
Which action should the developer take to improve cluster utilization?
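The link between a low task count and idle executors can be made concrete with a little arithmetic. The following is a minimal sketch in plain Python, not Spark code: the function names and the example numbers (8 tasks, 200 cores, 400 tasks) are hypothetical, and it assumes each task occupies one core, which is Spark's default (`spark.task.cpus = 1`). For shuffle stages in Spark SQL, the task count is governed by `spark.sql.shuffle.partitions` (default 200).

```python
import math


def slot_utilization(num_tasks: int, total_cores: int) -> float:
    """Fraction of cores kept busy during the first wave of a stage.

    Hypothetical helper: assumes one core per task.
    """
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    return min(num_tasks, total_cores) / total_cores


def task_waves(num_tasks: int, total_cores: int) -> int:
    """Number of scheduling waves needed to run every task in the stage."""
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    return math.ceil(num_tasks / total_cores)


# Illustrative numbers only: a stage with 8 tasks on a 200-core cluster
# leaves most cores without work.
low = slot_utilization(num_tasks=8, total_cores=200)

# With 400 tasks, every core has work and the stage runs in two waves.
high = slot_utilization(num_tasks=400, total_cores=200)
waves = task_waves(num_tasks=400, total_cores=200)
```

Under these assumptions, a stage can never keep more cores busy than it has tasks, so a stage with few partitions leaves the rest of the cluster idle no matter how many executors are available.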
Associate-Developer-Apache-Spark-3.5 Exam Question 32
15 of 55.
A data engineer is working on a Streaming DataFrame (streaming_df) with the following streaming data:
| id | name | count | timestamp |
| 1 | Delhi | 20 | 2024-09-19T10:11 |
| 1 | Delhi | 50 | 2024-09-19T10:12 |
| 2 | London | 50 | 2024-09-19T10:15 |
| 3 | Paris | 30 | 2024-09-19T10:18 |
| 3 | Paris | 20 | 2024-09-19T10:20 |
| 4 | Washington | 10 | 2024-09-19T10:22 |
Which operation is supported with streaming_df?
Associate-Developer-Apache-Spark-3.5 Exam Question 33
A developer is running Spark SQL queries and notices underutilization of resources. Executors are idle, and the number of tasks per stage is low.
What should the developer do to improve cluster utilization?
Associate-Developer-Apache-Spark-3.5 Exam Question 34
A developer wants to test Spark Connect with an existing Spark application.
What are the two alternative ways the developer can start a local Spark Connect server without changing their existing application code? (Choose 2 answers)
Associate-Developer-Apache-Spark-3.5 Exam Question 35
7 of 55.
A developer has been asked to debug an issue with a Spark application. The developer identified that the data being loaded from a CSV file is being read incorrectly into a DataFrame.
The CSV file has been read using the following Spark SQL statement:
CREATE TABLE locations
USING csv
OPTIONS (path '/data/locations.csv')
The first rows returned by SELECT * FROM locations look like this:
| city | lat | long |
| ALTI Sydney | -33... | ... |
Which parameter can the developer add to the OPTIONS clause in the CREATE TABLE statement to read the CSV data correctly again?
