Associate-Developer-Apache-Spark Exam Question 36
Which of the following code blocks uses a schema fileSchema to read a parquet file at location filePath into a DataFrame?
Associate-Developer-Apache-Spark Exam Question 37
In which order should the code blocks shown below be run in order to return the number of records that are not empty in column value in the DataFrame resulting from an inner join of DataFrame transactionsDf and itemsDf on columns productId and itemId, respectively?
1. .filter(~isnull(col('value')))
2. .count()
3. transactionsDf.join(itemsDf, col("transactionsDf.productId")==col("itemsDf.itemId"))
4. transactionsDf.join(itemsDf, transactionsDf.productId==itemsDf.itemId, how='inner')
5. .filter(col('value').isnotnull())
6. .sum(col('value'))
1. .filter(~isnull(col('value')))
2. .count()
3. transactionsDf.join(itemsDf, col("transactionsDf.productId")==col("itemsDf.itemId"))
4. transactionsDf.join(itemsDf, transactionsDf.productId==itemsDf.itemId, how='inner')
5. .filter(col('value').isnotnull())
6. .sum(col('value'))
Associate-Developer-Apache-Spark Exam Question 38
Which of the following code blocks returns a DataFrame showing the mean value of column "value" of DataFrame transactionsDf, grouped by its column storeId?
Associate-Developer-Apache-Spark Exam Question 39
Which of the following is one of the big performance advantages that Spark has over Hadoop?
Associate-Developer-Apache-Spark Exam Question 40
Which of the following code blocks returns a 2-column DataFrame that shows the distinct values in column productId and the number of rows with that productId in DataFrame transactionsDf?
