Online Access Free Associate-Developer-Apache-Spark-3.5 Exam Questions

Exam Code:Associate-Developer-Apache-Spark-3.5
Exam Name:Databricks Certified Associate Developer for Apache Spark 3.5 - Python
Certification Provider:Databricks
Free Question Number:135
Posted:Jun 14, 2026
Rating
100%

Question 1

28 of 55.
A data analyst builds a Spark application to analyze finance data and performs the following operations:
filter, select, groupBy, and coalesce.
Which operation results in a shuffle?

Question 2

42 of 55.
A developer needs to write the output of a complex chain of Spark transformations to a Parquet table called events.liveLatest.
Consumers of this table query it frequently with filters on both year and month of the event_ts column (a timestamp).
The current code:
from pyspark.sql import functions as F
final = df.withColumn("event_year", F.year("event_ts")) \
.withColumn("event_month", F.month("event_ts")) \
.bucketBy(42, ["event_year", "event_month"]) \
.saveAsTable("events.liveLatest")
However, consumers report poor query performance.
Which change will enable efficient querying by year and month?

Question 3

49 of 55.
In the code block below, aggDF contains aggregations on a streaming DataFrame:
aggDF.writeStream \
.format("console") \
.outputMode("???") \
.start()
Which output mode at line 3 ensures that the entire result table is written to the console during each trigger execution?

Question 4

A data scientist at a financial services company is working with a Spark DataFrame containing transaction records. The DataFrame has millions of rows and includes columns for transaction_id, account_number, transaction_amount, and timestamp. Due to an issue with the source system, some transactions were accidentally recorded multiple times with identical information across all fields. The data scientist needs to remove rows with duplicates across all fields to ensure accurate financial reporting.
Which approach should the data scientist use to deduplicate the orders using PySpark?

Question 5

The following code fragment results in an error:
@F.udf(T.IntegerType())
def simple_udf(t: str) -> str:
return answer * 3.14159
Which code fragment should be used instead?

Add Comments

Your email address will not be published. Required fields are marked *

insert code
Type the characters from the picture.