CDP-3002 Exam Question 56
Consider the following code snippet:
# Sample DataFrame (assuming it exists)
df = spark.createDataFrame(...)
# Attempt to explode a nested array column (fix the error)
df_exploded = df.withColumn("items", F.explode(df["items"]))
df_exploded.show()
What is the error in this code, and how can it be fixed?
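As background for this question, here is a minimal plain-Python sketch of what an explode-style operation does to a nested array. It does not use Spark and is not the exam's answer key. The helper `explode_rows` and the sample rows are hypothetical, and rows are modeled as dictionaries. It mirrors two behaviors of Spark's `explode`: each array element becomes its own row, and null or empty arrays produce no rows. A single pass removes only one level of nesting.

```python
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]


def explode_rows(rows: List[Row], column: str) -> Iterator[Row]:
    """Yield one output row per element of row[column] (hypothetical helper)."""
    for row in rows:
        values = row.get(column)
        if not values:
            # Null or empty arrays yield no output rows.
            continue
        for value in values:
            # Copy the other columns and replace the array with one element.
            yield {**row, column: value}


# Hypothetical sample data: "items" is an array of arrays.
rows = [
    {"order_id": 1, "items": [["pen", "ink"], ["paper"]]},
    {"order_id": 2, "items": []},
]

once = list(explode_rows(rows, "items"))   # inner arrays are still arrays
twice = list(explode_rows(once, "items"))  # second pass reaches the scalars
```

After one pass the `items` values are still lists, so fully flattening a doubly nested array takes a second pass. In PySpark the equivalent calls would also need `from pyspark.sql import functions as F`, which the snippet in the question does not show.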
CDP-3002 Exam Question 57
In the context of Spark SQL, what does the Catalyst optimizer use to optimize queries?
CDP-3002 Exam Question 58
You're working with a Hive table containing sensitive data. How can you ensure data security while allowing authorized Spark applications to access the data?
A. Store the data in plain text format and restrict access to the HDFS directory
CDP-3002 Exam Question 59
You're working with a large dataset that needs to be partitioned and processed in chunks to improve efficiency. How can you achieve this using Airflow operators?
CDP-3002 Exam Question 60
How can you use Apache Airflow to ensure a data quality check stops the workflow if it fails, without failing subsequent tasks that are not dependent on the data quality check?
