CDP-3002 Exam Question 106
You want to schedule your Airflow DAG to run every hour, starting at midnight (00:00). How can you achieve this scheduling configuration?
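One possible way to express this schedule, sketched under assumptions: the cron expression `0 * * * *` fires at minute 0 of every hour, and pairing it with a start date pinned to midnight makes 00:00 the first slot. In Airflow these values would typically be passed as the DAG's `schedule_interval` argument (named `schedule` in newer releases) and its `start_date` argument. The pure-Python snippet below only previews the resulting slots; the date and the helper name `preview_hourly_slots` are illustrative and not part of the question.

```python
from datetime import datetime, timedelta

# Cron expression: minute 0 of every hour (same meaning as the "@hourly" preset).
HOURLY_CRON = "0 * * * *"

# Hypothetical start date pinned to midnight.
START_DATE = datetime(2024, 1, 1, 0, 0)


def preview_hourly_slots(start, count):
    """Return the first `count` hourly slots beginning at `start`."""
    return [start + timedelta(hours=i) for i in range(count)]


slots = preview_hourly_slots(START_DATE, 3)
for slot in slots:
    print(slot.isoformat())
```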
CDP-3002 Exam Question 107
You are working with a large, skewed dataset in Spark. How would you optimize processing to mitigate the impact of skew and improve performance?
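One commonly cited technique for skew is key salting: a random suffix is appended to each key so that a single hot key is spread across several partitions, the data is aggregated per salted key, and the partial results are then combined after the salt is stripped. The sketch below illustrates that two-stage idea in plain Python rather than Spark; the bucket count, the toy records, and the helper names are assumptions made for illustration. Other options that may apply include broadcast joins for small tables and Spark's adaptive query execution setting `spark.sql.adaptive.skewJoin.enabled`.

```python
import random
from collections import Counter

NUM_SALTS = 4  # hypothetical number of salt buckets


def salt_key(key, num_salts=NUM_SALTS, rng=random):
    """Append a random salt so one hot key maps to several sub-keys."""
    return f"{key}_{rng.randrange(num_salts)}"


def unsalt_key(salted_key):
    """Strip the salt suffix to recover the original key."""
    return salted_key.rsplit("_", 1)[0]


# Skewed toy data: the key "hot" dominates.
records = [("hot", 1)] * 1000 + [("cold", 1)] * 10

rng = random.Random(0)  # seeded so the run is repeatable

# Stage 1: aggregate on the salted key, so "hot" is split across buckets.
partial = Counter()
for key, value in records:
    partial[salt_key(key, rng=rng)] += value

# Stage 2: strip the salt and combine the partial results.
totals = Counter()
for salted, subtotal in partial.items():
    totals[unsalt_key(salted)] += subtotal

print(dict(totals))
```

In Spark the same pattern would usually be written as two aggregations, or as a join against a copy of the smaller side replicated once per salt value.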
CDP-3002 Exam Question 108
Your team is integrating PySpark with a MySQL database. You need to read data from a table named 'employees'. Which of the following PySpark code snippets correctly accomplishes this task?
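The answer choices are not reproduced here, but a read of this kind is generally done through Spark's JDBC data source. The sketch below builds the option map and passes it to `spark.read.format("jdbc")`; the host, port, database name, and credentials are placeholders, the helper names are hypothetical, and it assumes the MySQL Connector/J JAR is available on the Spark classpath.

```python
def mysql_jdbc_options(host, port, database, table, user, password):
    """Build the option map for Spark's JDBC data source."""
    return {
        "url": f"jdbc:mysql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "com.mysql.cj.jdbc.Driver",  # MySQL Connector/J 8.x driver class
    }


def read_table(spark, options):
    """Load a table as a DataFrame through an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()


# Placeholder connection details, not taken from the question.
options = mysql_jdbc_options(
    "localhost", 3306, "company", "employees", "app_user", "app_password"
)
print(options["url"])
```

With a live `SparkSession` named `spark`, calling `read_table(spark, options)` would return the `employees` table as a DataFrame.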
CDP-3002 Exam Question 109
Which Spark component is responsible for managing the execution of tasks on worker nodes?
CDP-3002 Exam Question 110
You encounter an error during a data quality check within your Airflow DAG. How can you access detailed information about the error to aid in troubleshooting?
