DP-203 Exam Question 1

You need to implement versioned changes to the integration pipelines. The solution must meet the data integration requirements.
In which order should you perform the actions? To answer, move all actions from the list of actions to the answer area and arrange them in the correct order.

DP-203 Exam Question 2

You have an Azure Data Factory pipeline that performs an incremental load of source data to an Azure Data Lake Storage Gen2 account.
Data to be loaded is identified by a column named LastUpdatedDate in the source table.
You plan to execute the pipeline every four hours.
You need to ensure that the pipeline execution meets the following requirements:
* Automatically retries the execution when the pipeline run fails due to concurrency or throttling limits.
* Supports backfilling existing data in the table.
Which type of trigger should you use?
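As background for reasoning about this question, the sketch below shows how a tumbling window trigger definition in Data Factory can combine a retry policy with a start time in the past (which is what allows earlier windows to be backfilled). It is written as a Python dict that mirrors the trigger JSON; the trigger name, pipeline name, start time, and retry values are illustrative placeholders, not values taken from the question.

```python
# Illustrative sketch only: a tumbling window trigger definition expressed as a
# Python dict that mirrors the Data Factory trigger JSON. All names and values
# below are placeholders.
tumbling_window_trigger = {
    "name": "IncrementalLoadTrigger",              # hypothetical trigger name
    "properties": {
        "type": "TumblingWindowTrigger",
        "typeProperties": {
            "frequency": "Hour",
            "interval": 4,                         # fire every four hours
            "startTime": "2024-01-01T00:00:00Z",   # a past start time lets earlier windows be backfilled
            "maxConcurrency": 1,
            "retryPolicy": {
                "count": 3,                        # automatic retries when a window run fails
                "intervalInSeconds": 60
            }
        },
        "pipeline": {
            "pipelineReference": {
                "referenceName": "IncrementalLoadPipeline",  # hypothetical pipeline name
                "type": "PipelineReference"
            }
        }
    }
}
```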

DP-203 Exam Question 3

You are designing an application that will store petabytes of medical imaging data. When the data is first created, the data will be accessed frequently during the first week. After one month, the data must be accessible within 30 seconds, but files will be accessed infrequently. After one year, the data will be accessed infrequently but must be accessible within five minutes.
You need to select a storage strategy for the data. The solution must minimize costs.
Which storage tier should you use for each time frame? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
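
As a hedged illustration of the mechanics behind this question, the Python sketch below uses the azure-storage-blob SDK to change a blob's access tier; the connection string, container, and blob names are hypothetical placeholders, and the tier shown is only an example of the API call, not the keyed answer.

```python
# Minimal sketch, assuming the azure-storage-blob package and a valid
# connection string; the container and blob names are hypothetical.
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")
blob = service.get_blob_client(container="imaging", blob="scan-0001.dcm")

# Hot keeps frequently read data online at the highest storage cost; Cool keeps
# infrequently read data online at a lower storage cost; Archive is the cheapest
# to store but requires rehydration before reads, which typically takes hours.
blob.set_standard_blob_tier("Cool")
```

In practice the same tier transitions are usually automated with a lifecycle management policy rather than set blob by blob.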

DP-203 Exam Question 4

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:
* A workload for data engineers who will use Python and SQL.
* A workload for jobs that will run notebooks that use Python, Scala, and SQL.
* A workload that data scientists will use to perform ad hoc analysis in Scala and R.
The enterprise architecture team at your company identifies the following standards for Databricks environments:
* The data engineers must share a cluster.
* The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.
* All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.
You need to create the Databricks clusters for the workloads.
Solution: You create a High Concurrency cluster for each data scientist, a High Concurrency cluster for the data engineers, and a Standard cluster for the jobs.
Does this meet the goal?
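
For reference when weighing the auto-termination requirement in this series, here is a hedged sketch of a cluster specification for the Databricks Clusters REST API, written as a Python dict; the runtime version, node type, and worker count are placeholders rather than values from the scenario.

```python
# Illustrative sketch of a cluster spec for the Databricks Clusters API
# (clusters/create). All field values below are placeholders.
data_scientist_cluster = {
    "cluster_name": "ds-adhoc-cluster",     # the scenario calls for one cluster per data scientist
    "spark_version": "<databricks-runtime-version>",
    "node_type_id": "<vm-node-type>",
    "num_workers": 2,
    "autotermination_minutes": 120,         # terminate automatically after 120 minutes of inactivity
}
```

Cluster mode also matters when evaluating the proposed solution, because Standard and High Concurrency clusters differ in which notebook languages they support.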

DP-203 Exam Question 5

You are batch loading a table in an Azure Synapse Analytics dedicated SQL pool.
You need to load data from a staging table to the target table. The solution must ensure that if an error occurs while loading the data to the target table, all the inserts in that batch are undone.
How should you complete the Transact-SQL code? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
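
As a hedged illustration of the all-or-nothing behavior this question asks for, the Python sketch below drives the load from pyodbc inside an explicit transaction, so that a failure while inserting into the target table rolls back every row from that batch; the connection string, column list, and table names are hypothetical placeholders, and this is not the keyed drag-and-drop answer.

```python
# Minimal sketch, assuming pyodbc and a reachable dedicated SQL pool endpoint;
# the connection string, column list, and table names are hypothetical.
import pyodbc

conn = pyodbc.connect("<dedicated-sql-pool-connection-string>", autocommit=False)
try:
    cursor = conn.cursor()
    # Load the batch from the staging table into the target table.
    cursor.execute(
        "INSERT INTO dbo.TargetTable (Col1, Col2) "
        "SELECT Col1, Col2 FROM dbo.StagingTable;"
    )
    conn.commit()       # the batch becomes visible only if every insert succeeded
except pyodbc.Error:
    conn.rollback()     # any error undoes all inserts from this batch
    raise
finally:
    conn.close()
```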