Online Access Free Databricks.Associate-Developer-Apache-Spark.v2022-06-21.q62 Practice Test (Page 11)

Associate-Developer-Apache-Spark Exam Question 46

The code block displayed below contains an error. The code block should save DataFrame transactionsDf at path path as a parquet file, appending to any existing parquet file. Find the error.
Code block:

transactionsDf.format("parquet").option("mode", "append").save(path)

A.The code block is missing a reference to the DataFrameWriter.

B.save() is evaluated lazily and needs to be followed by an action.

C.The mode option should be omitted so that the command uses the default mode.

D.The code block is missing a bucketBy command that takes care of partitions.

E.Given that the DataFrame should be saved as parquet file, path is being passed to the wrong method.

Associate-Developer-Apache-Spark Exam Question 47

Which of the following code blocks reorders the values inside the arrays in column attributes of DataFrame itemsDf from last to first one in the alphabet?
+------+-----------------------------+-------------------+
|itemId|attributes                   |supplier           |
+------+-----------------------------+-------------------+
|1     |[blue, winter, cozy]         |Sports Company Inc.|
|2     |[red, summer, fresh, cooling]|YetiX              |
|3     |[green, summer, travel]      |Sports Company Inc.|
+------+-----------------------------+-------------------+

A.itemsDf.withColumn('attributes', sort_array(col('attributes').desc()))

B.itemsDf.withColumn('attributes', sort_array(desc('attributes')))

C.itemsDf.withColumn('attributes', sort(col('attributes'), asc=False))

D.itemsDf.withColumn("attributes", sort_array("attributes", asc=False))

E.itemsDf.select(sort_array("attributes"))

Associate-Developer-Apache-Spark Exam Question 48

Which of the following code blocks reads in the JSON file stored at filePath, enforcing the schema expressed in JSON format in variable json_schema, shown in the code block below?
Code block:
json_schema = """
{"type": "struct",
 "fields": [
  {
   "name": "itemId",
   "type": "integer",
   "nullable": true,
   "metadata": {}
  },
  {
   "name": "supplier",
   "type": "string",
   "nullable": true,
   "metadata": {}
  }
 ]
}
"""

A.spark.read.json(filePath, schema=json_schema)

B.spark.read.schema(json_schema).json(filePath)

C.schema = StructType.fromJson(json.loads(json_schema))
spark.read.json(filePath, schema=schema)

D.spark.read.json(filePath, schema=schema_of_json(json_schema))

E.spark.read.json(filePath, schema=spark.read.json(json_schema))

Correct Answer: C

Explanation
Spark provides a way to digest JSON-formatted strings as schema. However, it is not trivial to use. Although slightly above exam difficulty, this question is beneficial to your exam preparation, since it helps you to familiarize yourself with the concept of enforcing schemas on data you are reading in - a topic within the scope of the exam.
The first answer that jumps out here is the one that uses spark.read.schema instead of spark.read.json. Looking at the documentation of spark.read.schema (linked below), we notice that the operator expects types pyspark.sql.types.StructType or str as its first argument. While variable json_schema is a string, the documentation states that the str should be "a DDL-formatted string (For example col0 INT, col1 DOUBLE)". Variable json_schema does not contain a string in this type of format, so this answer option must be wrong.
With four potentially correct answers to go, we now look at the schema parameter of spark.read.json() (documentation linked below). Here, too, the schema parameter expects an input of type pyspark.sql.types.StructType or "a DDL-formatted string (For example col0 INT, col1 DOUBLE)". We already know that json_schema does not follow this format, so we should focus on how to transform json_schema into a pyspark.sql.types.StructType. This also eliminates the option that passes schema=json_schema.
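To make the difference between the two formats concrete, the following is a minimal sketch that uses only the Python standard library to turn a flat JSON schema like json_schema into a DDL-formatted string. The helper json_schema_to_ddl and the mapping JSON_TO_DDL are hypothetical names introduced for illustration; they are not part of PySpark, and the sketch assumes only non-nested primitive column types.

```python
import json

# Same schema definition as in the question, written compactly.
json_schema = """
{"type": "struct",
 "fields": [
  {"name": "itemId", "type": "integer", "nullable": true, "metadata": {}},
  {"name": "supplier", "type": "string", "nullable": true, "metadata": {}}
 ]
}
"""

# Hypothetical lookup table (an assumption for this sketch): JSON schema
# type names mapped to DDL type names. Only a few primitives are covered.
JSON_TO_DDL = {
    "integer": "INT",
    "long": "BIGINT",
    "double": "DOUBLE",
    "string": "STRING",
    "boolean": "BOOLEAN",
}


def json_schema_to_ddl(schema_json: str) -> str:
    """Build a DDL-style string such as 'col0 INT, col1 DOUBLE'.

    Illustration only: nested or unlisted types are not handled.
    """
    parsed = json.loads(schema_json)
    columns = [
        f'{field["name"]} {JSON_TO_DDL[field["type"]]}'
        for field in parsed["fields"]
    ]
    return ", ".join(columns)


ddl_schema = json_schema_to_ddl(json_schema)
print(ddl_schema)  # itemId INT, supplier STRING
```

The printed string is the kind of value the documentation calls "DDL-formatted", whereas json_schema itself is a JSON document and is therefore not accepted as-is.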
The option that includes schema=spark.read.json(json_schema) is also a wrong pick, since spark.read.json returns a DataFrame, and not a pyspark.sql.types.StructType type.
Ruling out the option which includes schema_of_json(json_schema) is rather difficult. The operator's documentation (linked below) states that it "[p]arses a JSON string and infers its schema in DDL format". This use case is slightly different from the case at hand: json_schema already is a schema definition, so it does not make sense to "infer" a schema from it. In the documentation you can see an example use case which helps you understand the difference better. Here, you pass string '{a: 1}' to schema_of_json() and the method infers a DDL-format schema STRUCT<a: BIGINT> from it.
In our case, we may end up with the output schema of schema_of_json() describing the schema of the JSON schema, instead of using the schema itself. This is not the right answer option.
Now you may consider looking at the StructType.fromJson() method. It returns an object of type StructType, which is exactly the type that the schema parameter of spark.read.json expects. Because fromJson() takes a Python dictionary rather than a string, json_schema is first parsed with json.loads().
Although we could have looked at the correct answer option earlier, this explanation is deliberately exhaustive to show you how to systematically eliminate wrong answer options.
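The steps of the correct option can be sketched as a small helper. This is an illustration under assumptions, not a definitive implementation: the function name read_json_with_schema is hypothetical, spark is assumed to be an existing SparkSession, and file_path is assumed to point to a readable JSON file. PySpark is imported inside the function so that the schema string can be inspected even where PySpark is not installed.

```python
import json

# Schema definition from the question, in Spark's JSON schema format.
json_schema = """
{"type": "struct",
 "fields": [
  {"name": "itemId", "type": "integer", "nullable": true, "metadata": {}},
  {"name": "supplier", "type": "string", "nullable": true, "metadata": {}}
 ]
}
"""


def read_json_with_schema(spark, file_path, schema_json):
    """Read a JSON file while enforcing a schema given as a JSON string.

    Hypothetical helper: `spark` is assumed to be a SparkSession.
    """
    # Imported lazily so this module loads without PySpark installed.
    from pyspark.sql.types import StructType

    # json.loads turns the string into a dict; fromJson builds a StructType.
    schema = StructType.fromJson(json.loads(schema_json))

    # The StructType is then passed to the schema parameter of the reader.
    return spark.read.json(file_path, schema=schema)


# The parsed dictionary is what StructType.fromJson receives.
parsed = json.loads(json_schema)
print([field["name"] for field in parsed["fields"]])  # ['itemId', 'supplier']
```

With a running session, the helper would be called as read_json_with_schema(spark, filePath, json_schema), which mirrors the two lines of the correct answer option.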
More info:
- pyspark.sql.DataFrameReader.schema - PySpark 3.1.2 documentation
- pyspark.sql.DataFrameReader.json - PySpark 3.1.2 documentation
- pyspark.sql.functions.schema_of_json - PySpark 3.1.2 documentation
Static notebook | Dynamic notebook: See test 3

Associate-Developer-Apache-Spark Exam Question 49

The code block displayed below contains an error. The code block should return all rows of DataFrame transactionsDf, but including only columns storeId and predError. Find the error.
Code block:
spark.collect(transactionsDf.select("storeId", "predError"))

A.Instead of select, DataFrame transactionsDf needs to be filtered using the filter operator.

B.Columns storeId and predError need to be represented as a Python list, so they need to be wrapped in brackets ([]).

C.The take method should be used instead of the collect method.

D.Instead of collect, collectAsRows needs to be called.

E.The collect method is not a method of the SparkSession object.

Associate-Developer-Apache-Spark Exam Question 50

Which of the following describes Spark's standalone deployment mode?

A.Standalone mode uses a single JVM to run Spark driver and executor processes.

B.Standalone mode means that the cluster does not contain the driver.

C.Standalone mode is how Spark runs on YARN and Mesos clusters.

D.Standalone mode uses only a single executor per worker per application.

E.Standalone mode is a viable solution for clusters that run multiple frameworks, not only Spark.
