To advance the offset of a stream to the current table version without consuming the change data in a DML operation, which of the following operations can a Data Engineer perform? [Select 2]
Correct Answer: A,B
Explanation When created, a stream logically takes an initial snapshot of every row in the source object (e.g. table, external table, or the underlying tables for a view) by initializing a point in time (called an offset) as the current transactional version of the object. The change tracking system utilized by the stream then records information about the DML changes after this snapshot was taken. Change records provide the state of a row before and after the change. Change information mirrors the column structure of the tracked source object and includes additional metadata columns that describe each change event.
Note that a stream itself does not contain any table data. A stream only stores an offset for the source object and returns CDC records by leveraging the versioning history for the source object.
A new table version is created whenever a transaction that includes one or more DML statements is committed to the table. In the transaction history for a table, a stream offset is located between two table versions. Querying a stream returns the changes caused by transactions committed after the offset and at or before the current time.
Multiple queries can independently consume the same change data from a stream without changing the offset. A stream advances the offset only when it is used in a DML transaction. This behavior applies to both explicit and autocommit transactions. (By default, when a DML statement is executed, an autocommit transaction is implicitly started and the transaction is committed at the completion of the statement. This behavior is controlled with the AUTOCOMMIT parameter.) Querying a stream alone does not advance its offset, even within an explicit transaction; the stream contents must be consumed in a DML statement.
To advance the offset of a stream to the current table version without consuming the change data in a DML operation, complete either of the following actions:
- Recreate the stream (using the CREATE OR REPLACE STREAM syntax).
- Insert the current change data into a temporary table. In the INSERT statement, query the stream but include a WHERE clause that filters out all of the change data (e.g. WHERE 0 = 1).
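The offset behavior described above can be illustrated with a toy model. This is a minimal Python sketch, not Snowflake code: the Table and Stream classes, their method names, and the sample rows are all hypothetical, and the model assumes one table version per committed transaction as the explanation states.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy table: each committed DML transaction appends one version."""
    versions: list = field(default_factory=list)

    def commit(self, changed_rows):
        self.versions.append(list(changed_rows))

    @property
    def current_version(self):
        return len(self.versions)


@dataclass
class Stream:
    """Toy stream: stores only an offset, never table data."""
    table: Table
    offset: int = 0

    def __post_init__(self):
        # A newly created stream starts at the table's current version.
        self.offset = self.table.current_version

    def query(self):
        """Plain SELECT: return pending changes, leave the offset alone."""
        return [row for v in self.table.versions[self.offset:] for row in v]

    def consume_in_dml(self, predicate=lambda row: True):
        """Stream used in a committed DML statement: the offset advances
        even when the predicate (the WHERE clause) keeps no rows."""
        kept = [row for row in self.query() if predicate(row)]
        self.offset = self.table.current_version
        return kept


orders = Table()
stream = Stream(orders)
orders.commit(["insert id=1", "insert id=2"])

pending_before = stream.query()            # reading does not move the offset
offset_after_select = stream.offset

inserted = stream.consume_in_dml(lambda row: False)  # analogue of WHERE 0 = 1
pending_after = stream.query()

orders.commit(["update id=1"])
recreated = Stream(orders)                 # analogue of CREATE OR REPLACE STREAM
```

In this model, both the filtered-out DML and the recreated stream leave no pending changes, while the plain query leaves the offset untouched.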
DEA-C01 Exam Question 37
A Data Engineer is using an existing pipe that automates data loads using event notifications and later finds that the pipe properties need to be modified. As a best practice, he decides to recreate the pipe and follows the steps below. 1. Query the SYSTEM$PIPE_STATUS function and verify that the pipe execution state is RUNNING. 2. Recreate the pipe (using CREATE OR REPLACE PIPE). 3. Query the SYSTEM$PIPE_STATUS function and verify that the pipe execution state is RUNNING. Which recommended steps are missing when recreating pipes for automated data loads?
Correct Answer: C
Explanation Recreating a pipe (using a CREATE OR REPLACE PIPE statement) is necessary to modify most pipe properties.
Recreating Pipes for Automated Data Loads: When recreating a pipe that automates data loads using event notifications, it is recommended that the Data Engineer complete the following steps:
1. Pause the pipe (using ALTER PIPE ... SET PIPE_EXECUTION_PAUSED = true).
2. Query the SYSTEM$PIPE_STATUS function and verify that the pipe execution state is PAUSED.
3. Recreate the pipe (using CREATE OR REPLACE PIPE).
4. Pause the pipe again.
5. Review the configuration steps for your cloud messaging service to ensure the settings are still accurate.
6. Resume the pipe (using ALTER PIPE ... SET PIPE_EXECUTION_PAUSED = false).
7. Query the SYSTEM$PIPE_STATUS function again and verify that the pipe execution state is RUNNING.
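The recommended sequence can be written down as an ordered checklist. The Python sketch below only builds the statements and does not connect to Snowflake. The function names and the pipe, table, and stage names are hypothetical. It includes an explicit resume step (ALTER PIPE ... SET PIPE_EXECUTION_PAUSED = false) before the final status check, since a paused pipe must be resumed before it can report RUNNING.

```python
import json


def recreate_pipe_steps(pipe_name, create_pipe_sql):
    """Return the sequence as (action, expected_state) pairs.

    expected_state is the executionState to look for in the status output,
    or None when the step is not a status check.
    """
    pause = f"ALTER PIPE {pipe_name} SET PIPE_EXECUTION_PAUSED = true;"
    resume = f"ALTER PIPE {pipe_name} SET PIPE_EXECUTION_PAUSED = false;"
    status = f"SELECT SYSTEM$PIPE_STATUS('{pipe_name}');"
    return [
        (pause, None),
        (status, "PAUSED"),
        (create_pipe_sql, None),
        (pause, None),
        ("-- review cloud messaging (event notification) settings", None),
        (resume, None),
        (status, "RUNNING"),
    ]


def state_matches(status_json, expected_state):
    """Check the executionState field of a SYSTEM$PIPE_STATUS result."""
    return json.loads(status_json).get("executionState") == expected_state


# Hypothetical pipe definition, used only for illustration.
steps = recreate_pipe_steps(
    "my_pipe",
    "CREATE OR REPLACE PIPE my_pipe AUTO_INGEST = TRUE AS "
    "COPY INTO my_table FROM @my_stage;",
)
for number, (action, expected) in enumerate(steps, start=1):
    suffix = f"  -- expect {expected}" if expected else ""
    print(f"{number}. {action}{suffix}")
```

Pausing before and after the recreate brackets the window in which the pipe definition changes, and the two status checks confirm the state at each end.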
DEA-C01 Exam Question 38
While trying to recover dropped child tables within a schema named SCV_SCHEMA, a Data Engineer finds that the DATA_RETENTION_TIME_IN_DAYS parameter is set to 45 days at the schema level and that the data retention period for the child tables is explicitly set to 85 days. What will happen when she runs the UNDROP TABLE command on the child tables to recover them on the 50th day, assuming SCV_SCHEMA was already dropped on the 45th day?
Correct Answer: B
Explanation Dropped Containers and Object Retention Inheritance: Currently, when a database is dropped, the data retention period for child schemas or tables, if explicitly set to be different from the retention of the database, is not honored. The child schemas or tables are retained for the same period of time as the database. Similarly, when a schema is dropped, the data retention period for child tables, if explicitly set to be different from the retention of the schema, is not honored. The child tables are retained for the same period of time as the schema. To honor the data retention period for these child objects (schemas or tables), drop them explicitly before you drop the database or schema.
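The inheritance rule can be reduced to a single decision. The Python function below is a hypothetical illustration of that rule only, using the 45-day and 85-day values from the question. It is not a Snowflake API.

```python
def effective_retention_days(child_retention, container_retention,
                             dropped_with_container):
    """Retention that applies to a dropped child object.

    If the child was dropped as part of its container (database or schema),
    the container's retention wins. If the child was dropped explicitly
    first, its own retention is honored.
    """
    if dropped_with_container:
        return container_retention
    return child_retention


# Child tables set to 85 days inside a schema set to 45 days.
via_schema_drop = effective_retention_days(85, 45, dropped_with_container=True)
via_explicit_drop = effective_retention_days(85, 45, dropped_with_container=False)
```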
DEA-C01 Exam Question 39
Which common query problems can a Data Engineer identify using Query Profiler?
Correct Answer: A,B,C
Explanation "Exploding" Joins: One of the common mistakes SQL users make is joining tables without providing a join condition (resulting in a "Cartesian product"), or providing a condition where records from one table match multiple records from another table. For such queries, the Join operator produces significantly (often by orders of magnitude) more tuples than it consumes. This can be observed by looking at the number of records produced by a Join operator in the profile interface, and is typically also reflected in the Join operator consuming a lot of time.
Queries Too Large to Fit in Memory: For some operations (e.g. duplicate elimination for a huge data set), the amount of memory available for the compute resources used to execute the operation might not be sufficient to hold intermediate results. As a result, the query processing engine will start spilling the data to local disk. If the local disk space is not sufficient, the spilled data is then saved to remote disks. This spilling can have a profound effect on query performance (especially if remote disk is used for spilling). Spilling statistics can be checked in the Query Profile interface.
Inefficient Pruning: Snowflake collects rich statistics on data, allowing it not to read unnecessary parts of a table based on the query filters. However, for this to have an effect, the data storage order needs to be correlated with the query filter attributes. The efficiency of pruning can be observed by comparing the Partitions scanned and Partitions total statistics in the TableScan operators. If the former is a small fraction of the latter, pruning is efficient. If not, the pruning did not have an effect. Of course, pruning can only help for queries that actually filter out a significant amount of data. If the pruning statistics do not show data reduction, but there is a Filter operator above TableScan which filters out a number of records, this might signal that a different data organization might be beneficial for this query.
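Two of these checks come down to simple ratios over numbers read from the query profile. The Python sketch below assumes those numbers have been copied out by hand. The function names and sample values are hypothetical, and what counts as a "small fraction" or an "exploding" join is left to the reader.

```python
def pruning_ratio(partitions_scanned, partitions_total):
    """Fraction of partitions a TableScan read; lower means better pruning."""
    if partitions_total <= 0:
        raise ValueError("partitions_total must be positive")
    return partitions_scanned / partitions_total


def join_fanout(rows_produced, rows_consumed):
    """Rows a Join produced per row it consumed; large values suggest an
    exploding join (missing or overly loose join condition)."""
    if rows_consumed <= 0:
        raise ValueError("rows_consumed must be positive")
    return rows_produced / rows_consumed


# Hypothetical numbers from two operators in a query profile.
scan_ratio = pruning_ratio(partitions_scanned=250, partitions_total=1000)
fanout = join_fanout(rows_produced=1_000_000, rows_consumed=1_000)
```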
DEA-C01 Exam Question 40
For the most efficient and cost-effective data load experience, which of the following considerations should a Data Engineer disregard?
Correct Answer: A
Explanation Split larger files into a greater number of smaller files to distribute the load among the compute resources in an active warehouse. This minimizes the processing overhead rather than maximizing it. The remaining options are recommended data loading considerations.
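Splitting a large input before loading can be sketched as plain chunking. The Python example below is a minimal illustration that splits by line count. That is an assumption made for brevity, since real sizing would normally go by file size, and the function name and sample data are hypothetical.

```python
def split_into_chunks(lines, max_lines_per_file):
    """Split a list of records into consecutive chunks of bounded length,
    so each chunk can be written to its own file and loaded in parallel."""
    if max_lines_per_file < 1:
        raise ValueError("max_lines_per_file must be at least 1")
    return [
        lines[start:start + max_lines_per_file]
        for start in range(0, len(lines), max_lines_per_file)
    ]


records = [f"row-{i}" for i in range(10)]
chunks = split_into_chunks(records, 4)
```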