DP-203 Exam Question 36

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:
A workload for data engineers who will use Python and SQL.
A workload for jobs that will run notebooks that use Python, Scala, and SQL.
A workload that data scientists will use to perform ad hoc analysis in Scala and R.
The enterprise architecture team at your company identifies the following standards for Databricks environments:
The data engineers must share a cluster.
The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.
All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.
You need to create the Databricks clusters for the workloads.
Solution: You create a Standard cluster for each data scientist, a Standard cluster for the data engineers, and a High Concurrency cluster for the jobs.
Does this meet the goal?
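The auto-termination standard in the scenario corresponds to the `autotermination_minutes` field of a cluster definition in the Databricks Clusters API. The following is a minimal sketch of per-user cluster definitions, assuming placeholder values for the runtime version, node type, worker count, and user names (none of these appear in the scenario). It only builds the request payloads and does not call any API.

```python
import json

# Hypothetical placeholders: substitute values valid for your workspace.
SPARK_VERSION = "<runtime-version>"
NODE_TYPE = "<node-type-id>"


def data_scientist_cluster(owner: str, idle_minutes: int = 120) -> dict:
    """Build a cluster definition for one data scientist."""
    return {
        "cluster_name": f"ds-{owner}",
        "spark_version": SPARK_VERSION,
        "node_type_id": NODE_TYPE,
        "num_workers": 2,  # illustrative size, not taken from the scenario
        # Terminate the cluster after this many minutes of inactivity.
        "autotermination_minutes": idle_minutes,
    }


# The scenario mentions three data scientists; these names are made up.
specs = [data_scientist_cluster(name) for name in ("alice", "bob", "carol")]
print(json.dumps(specs, indent=2))
```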
  • DP-203 Exam Question 37

    You plan to create a real-time monitoring app that alerts users when a device travels more than 200 meters away from a designated location.
    You need to design an Azure Stream Analytics job to process the data for the planned app. The solution must minimize the amount of code developed and the number of technologies used.
    What should you include in the Stream Analytics job? To answer, select the appropriate options in the answer area.
    NOTE: Each correct selection is worth one point.
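The alert condition is a distance check between a device position and a fixed reference point. Stream Analytics ships built-in geospatial functions (for example `CreatePoint` and `ST_DISTANCE`, which returns meters), so a job would normally not hand-code this. Purely to illustrate the underlying computation, here is a sketch in Python that assumes a spherical Earth and hypothetical function names:

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def should_alert(device: tuple, designated: tuple, limit_m: float = 200.0) -> bool:
    """Return True when the device is farther than limit_m from the designated point."""
    return haversine_m(*device, *designated) > limit_m
```

Both points are `(latitude, longitude)` tuples in degrees; the 200-meter default mirrors the threshold stated in the question.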

  • DP-203 Exam Question 38

    You are designing a financial transactions table in an Azure Synapse Analytics dedicated SQL pool. The table will have a clustered columnstore index and will include the following columns:
    TransactionType: 40 million rows per transaction type
    CustomerSegment: 4 million rows per customer segment
    TransactionMonth: 65 million rows per month
    AccountType: 500 million rows per account type
    You have the following query requirements:
    Analysts will most commonly analyze transactions for a given month.
    Transaction analysis will typically summarize transactions by transaction type, customer segment, and/or account type.
    You need to recommend a partition strategy for the table to minimize query times.
    On which column should you recommend partitioning the table?
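When weighing the candidate columns, it can help to work out how many rows each partition would place in every distribution. A dedicated SQL pool spreads a table across 60 distributions, and the commonly cited guidance for clustered columnstore tables is on the order of 1 million rows per distribution per partition. The sketch below does that arithmetic, assuming rows spread evenly across distributions (an assumption, not something the question states):

```python
DISTRIBUTIONS = 60       # fixed number of distributions in a dedicated SQL pool
TARGET_ROWS = 1_000_000  # rough per-distribution, per-partition guidance for columnstore

# Row counts per distinct value, as given in the question.
rows_per_value = {
    "TransactionType": 40_000_000,
    "CustomerSegment": 4_000_000,
    "TransactionMonth": 65_000_000,
    "AccountType": 500_000_000,
}


def rows_per_distribution(rows_in_partition: int) -> int:
    """Rows that land in each distribution for one partition, assuming an even spread."""
    return rows_in_partition // DISTRIBUTIONS


for column, rows in rows_per_value.items():
    per_dist = rows_per_distribution(rows)
    print(f"{column}: {per_dist:,} rows per distribution, meets target: {per_dist >= TARGET_ROWS}")
```

These figures are only one input. The stated query pattern (analysts most commonly filter on a given month) determines which column allows partition elimination.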
  • DP-203 Exam Question 39

    You have an Azure Databricks workspace and an Azure Data Lake Storage Gen2 account named storage1.
    New files are uploaded daily to storage1.
    You need to recommend a solution that configures storage1 as a structured streaming source. The solution must meet the following requirements:
    * Incrementally process new files as they are uploaded to storage1.
    * Minimize implementation and maintenance effort.
    * Minimize the cost of processing millions of files.
    * Support schema inference and schema drift.
    Which should you include in the recommendation?
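One Databricks feature aimed at this kind of requirement list is Auto Loader, which exposes a `cloudFiles` structured streaming source with schema inference and schema evolution options. The following is a minimal sketch rather than a definitive setup: the helper names, the JSON file format, and any paths passed in are assumptions, and `read_new_files` expects a Spark session from a Databricks runtime.

```python
def autoloader_options(schema_location: str, file_format: str = "json") -> dict:
    """Assemble Auto Loader options for incremental ingestion with schema tracking."""
    return {
        "cloudFiles.format": file_format,               # format of the incoming files (assumed)
        "cloudFiles.schemaLocation": schema_location,   # where the inferred schema is persisted
        "cloudFiles.inferColumnTypes": "true",          # infer column types instead of all strings
        "cloudFiles.schemaEvolutionMode": "addNewColumns",  # pick up new columns as they appear
    }


def read_new_files(spark, source_path: str, schema_location: str):
    """Return a streaming DataFrame over files arriving under source_path."""
    reader = spark.readStream.format("cloudFiles")
    for key, value in autoloader_options(schema_location).items():
        reader = reader.option(key, value)
    return reader.load(source_path)
```

A call might look like `read_new_files(spark, "abfss://<container>@storage1.dfs.core.windows.net/<path>", "<schema-path>")`, where the container and paths are placeholders. For very large directories, file notification mode (`cloudFiles.useNotifications`) is an alternative to directory listing, but it needs additional Azure permissions and settings that are not shown here.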
  • DP-203 Exam Question 40

    You use Azure Data Lake Storage Gen2.
    You need to ensure that workloads can use filter predicates and column projections to filter data at the time the data is read from disk.
    Which two actions should you perform? Each correct answer presents part of the solution.
    NOTE: Each correct selection is worth one point.