DP-203 Exam Question 31

You have an Azure Synapse workspace named MyWorkspace that contains an Apache Spark database named mytestdb.
You run the following command in an Azure Synapse Analytics Spark pool in MyWorkspace.
CREATE TABLE mytestdb.myParquetTable(
EmployeeID int,
EmployeeName string,
EmployeeStartDate date)
USING Parquet
You then use Spark to insert a row into mytestdb.myParquetTable. The row contains the following data.

One minute later, you execute the following query from a serverless SQL pool in MyWorkspace.
SELECT EmployeeID
FROM mytestdb.dbo.myParquetTable
WHERE name = 'Alice';
What will be returned by the query?
  • DP-203 Exam Question 32

    You have the following Azure Data Factory pipelines:
    * Ingest Data from System1
    * Ingest Data from System2
    * Populate Dimensions
    * Populate Facts
    Ingest Data from System1 and Ingest Data from System2 have no dependencies. Populate Dimensions must execute after Ingest Data from System1 and Ingest Data from System2. Populate Facts must execute after the Populate Dimensions pipeline. All the pipelines must execute every eight hours.
    What should you do to schedule the pipelines for execution?
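The ordering constraints in this question form a small dependency graph. As a minimal sketch (plain Python, not Azure Data Factory configuration, and not a statement of the expected answer), the standard-library `graphlib` module can derive one valid execution order. The names `dependencies` and `execution_order` are hypothetical and introduced only for illustration.

```python
from graphlib import TopologicalSorter

# Each pipeline maps to the set of pipelines that must finish before it starts.
# The pipeline names come from the question; the data structure is an assumption.
dependencies = {
    "Ingest Data from System1": set(),
    "Ingest Data from System2": set(),
    "Populate Dimensions": {"Ingest Data from System1", "Ingest Data from System2"},
    "Populate Facts": {"Populate Dimensions"},
}


def execution_order(deps):
    """Return the pipelines in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

The two ingest pipelines may appear in either order because neither depends on the other; only their position before Populate Dimensions is fixed.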
  • DP-203 Exam Question 33

    You are designing an inventory updates table in an Azure Synapse Analytics dedicated SQL pool. The table will have a clustered columnstore index and will include the following columns:

    You identify the following usage patterns:
    Analysts will most commonly analyze transactions for a warehouse.
    Queries will summarize by product category type, date, and/or inventory event type.
    You need to recommend a partition strategy for the table to minimize query times.
    On which column should you partition the table?
  • DP-203 Exam Question 34

    You are building an Azure Stream Analytics job to identify how much time a user spends interacting with a feature on a webpage.
    The job receives events based on user actions on the webpage. Each row of data represents an event. Each event has a type of either 'start' or 'end'.
    You need to calculate the duration between start and end events.
    How should you complete the query? To answer, select the appropriate options in the answer area.
    NOTE: Each correct selection is worth one point.
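To make the requirement concrete, here is a minimal Python sketch of the underlying computation: each 'end' event is paired with the most recent preceding 'start' event for the same user and feature. It is not Stream Analytics query syntax and does not fill in the answer area. The field names `user`, `feature`, `type`, and `time` are assumptions, since the question does not show the event schema.

```python
from datetime import datetime


def durations(events):
    """Return (user, feature, seconds) for each end event that follows a start event."""
    last_start = {}  # (user, feature) -> time of the most recent unmatched start
    results = []
    for e in sorted(events, key=lambda e: e["time"]):
        key = (e["user"], e["feature"])
        if e["type"] == "start":
            last_start[key] = e["time"]
        elif e["type"] == "end" and key in last_start:
            started = last_start.pop(key)
            results.append((e["user"], e["feature"], (e["time"] - started).total_seconds()))
    return results


# Hypothetical sample events, invented for illustration only.
events = [
    {"user": "u1", "feature": "search", "type": "start", "time": datetime(2024, 1, 1, 12, 0, 0)},
    {"user": "u2", "feature": "search", "type": "start", "time": datetime(2024, 1, 1, 12, 0, 5)},
    {"user": "u1", "feature": "search", "type": "end", "time": datetime(2024, 1, 1, 12, 0, 30)},
    {"user": "u2", "feature": "search", "type": "end", "time": datetime(2024, 1, 1, 12, 1, 5)},
]
result = durations(events)
print(result)
```

An 'end' event with no earlier unmatched 'start' for the same user and feature is ignored in this sketch; a real job would need an explicit rule for such cases.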

  • DP-203 Exam Question 35

    You are developing a solution using a Lambda architecture on Microsoft Azure.
    The data at rest layer must meet the following requirements:
    Data storage:
    * Serve as a repository for high volumes of large files in various formats.
    * Implement optimized storage for big data analytics workloads.
    * Ensure that data can be organized using a hierarchical structure.
    Batch processing:
    * Use a managed solution for in-memory computation processing.
    * Natively support Scala, Python, and R programming languages.
    * Provide the ability to resize and terminate the cluster automatically.
    Analytical data store:
    * Support parallel processing.
    * Use columnar storage.
    * Support SQL-based languages.
    You need to identify the correct technologies to build the Lambda architecture.
    Which technologies should you use? To answer, select the appropriate options in the answer area.
    NOTE: Each correct selection is worth one point.