Databricks-Certified-Professional-Data-Engineer Exam Question 81

A Spark job is taking longer than expected. In the Spark UI, a data engineer notes that the Min, Median, and Max Durations for tasks in a particular stage show roughly the same minimum and median task time, but a maximum task duration roughly 100 times as long as the minimum.
Which situation is causing the increased duration of the overall job?
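
The pattern described in this question (minimum and median close together, maximum far larger) can be checked numerically. The sketch below is only an illustration in plain Python, not a Spark API: the function names, the sample durations, and the ratio threshold of 10 are all assumptions made for this example. A few straggler tasks of this kind are commonly associated with uneven data distribution across partitions, though the numbers alone do not prove the cause.

```python
from statistics import median


def summarize_task_durations(durations_ms):
    """Return (min, median, max) for a list of task durations in milliseconds."""
    if not durations_ms:
        raise ValueError("durations_ms must not be empty")
    return min(durations_ms), median(durations_ms), max(durations_ms)


def looks_skewed(durations_ms, ratio_threshold=10.0):
    """Flag a stage whose slowest task is far slower than the typical task.

    ratio_threshold is an arbitrary illustrative cutoff, not a Spark default.
    """
    _, med, longest = summarize_task_durations(durations_ms)
    if med <= 0:
        return False
    return longest / med >= ratio_threshold


# Hypothetical stage: 199 tasks take about 1 second, one task takes 100 seconds.
straggler_stage = [1000] * 199 + [100_000]
# Hypothetical stage where every task takes about the same time.
even_stage = [1000] * 200

print(summarize_task_durations(straggler_stage))
print(looks_skewed(straggler_stage), looks_skewed(even_stage))
```

Because a stage finishes only when its last task finishes, the single 100-second task in the hypothetical stage sets the stage duration even though the other 199 tasks complete in about a second each.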
  • Databricks-Certified-Professional-Data-Engineer Exam Question 82

    Which statement describes Delta Lake Auto Compaction?
  • Databricks-Certified-Professional-Data-Engineer Exam Question 83

    When defining external tables using the CSV, JSON, TEXT, or BINARY formats, any query on the external tables caches the data and location for performance reasons, so within a given Spark session any new files that may have arrived will not be available after the initial query. How can we address this limitation?
  • Databricks-Certified-Professional-Data-Engineer Exam Question 84

    A Delta Lake table representing metadata about content from users has the following schema:

    Based on the above schema, which column is a good candidate for partitioning the Delta Table?
  • Databricks-Certified-Professional-Data-Engineer Exam Question 85

    You are asked to set up two tasks in a Databricks job. The first task runs a notebook to download data from a remote system, and the second task is a DLT pipeline that processes this data. How do you plan to configure this in the Jobs UI?