Online Access Free Google.Professional-Machine-Learning-Engineer.v2025-07-22.q137 Practice Test (Page 25)

Professional-Machine-Learning-Engineer Exam Question 116

You work on the data science team at a manufacturing company. You are reviewing the company's historical sales data, which has hundreds of millions of records. For your exploratory data analysis, you need to calculate descriptive statistics such as mean, median, and mode; conduct complex statistical tests for hypothesis testing; and plot variations of the features over time. You want to use as much of the sales data as possible in your analyses while minimizing computational resources. What should you do?

A.Spin up a Vertex AI Workbench user-managed notebooks instance and import the dataset. Use this data to create statistical and visual analyses.

B.Visualize the time plots in Google Data Studio. Import the dataset into Vertex AI Workbench user-managed notebooks. Use this data to calculate the descriptive statistics and run the statistical analyses.

C.Use BigQuery to calculate the descriptive statistics. Use Vertex AI Workbench user-managed notebooks to visualize the time plots and run the statistical analyses.

D.Use BigQuery to calculate the descriptive statistics, and use Google Data Studio to visualize the time plots. Use Vertex AI Workbench user-managed notebooks to run the statistical analyses.
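
Two of the options propose calculating the descriptive statistics in BigQuery. As a rough illustration of what that could look like, the sketch below builds a BigQuery Standard SQL query for the mean, an approximate median, and an approximate mode of one column. The table name `my-project.sales.transactions`, the column name `sale_amount`, and the helper `descriptive_stats_query` are hypothetical and not part of the question. The sketch only builds the query text; submitting it to BigQuery is left out.

```python
def descriptive_stats_query(table, column):
    """Build a BigQuery Standard SQL query for mean, approximate median, and mode."""
    return f"""
SELECT
  AVG({column}) AS mean_value,
  -- APPROX_QUANTILES(x, 2) returns [min, median, max]; OFFSET(1) picks the median.
  APPROX_QUANTILES({column}, 2)[OFFSET(1)] AS approx_median,
  -- APPROX_TOP_COUNT(x, 1) returns the most frequent value with its count.
  APPROX_TOP_COUNT({column}, 1)[OFFSET(0)].value AS approx_mode
FROM `{table}`
""".strip()


# Hypothetical table and column names, used for illustration only.
query = descriptive_stats_query("my-project.sales.transactions", "sale_amount")
print(query)
```

The approximate aggregate functions trade exactness for lower resource use, which is one reason they suit tables with hundreds of millions of rows.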

Professional-Machine-Learning-Engineer Exam Question 117

You work for a retail company. You have a managed tabular dataset in Vertex AI that contains sales data from three different stores. The dataset includes several features such as store name and sale timestamp. You want to use the data to train a model that makes sales predictions for a new store that will open soon. You need to split the data between the training, validation, and test sets. What approach should you use to split the data?

A.Use Vertex AI manual split, using the store name feature to assign one store for each set.

B.Use Vertex AI default data split.

C.Use Vertex AI chronological split and specify the sales timestamp feature as the time variable.

D.Use Vertex AI random split, assigning 70% of the rows to the training set, 10% to the validation set, and 20% to the test set.

Correct Answer: B

The best option for splitting this managed tabular dataset into training, validation, and test sets is the Vertex AI default data split. Vertex AI is Google Cloud's unified platform for building and deploying machine learning solutions, and its default split requires no user input or configuration: Vertex AI randomly assigns rows to the three sets by fixed percentage. For tabular data, the default allocation is 80% of the rows for training, 10% for validation, and 10% for test [1].
The three sets play different roles. The training set is used to fit the model parameters, so that the model learns the relationship between the input features and the target variable. The validation set is used to tune the hyperparameters and to detect overfitting or underfitting during training. The test set is held out until the end and provides the final evaluation metrics, which measure how well the model generalizes to new data. Because the default split samples rows at random, each set tends to reflect the distribution of the whole dataset, which is why it simplifies the split process and works well in most cases.
The other options are not as good as option B, for the following reasons:
* Option A: A manual split lets you control which rows go to each set instead of leaving the assignment to random sampling (for unstructured data this is done with the ml_use label or a data filter expression [2]). It is useful when you need custom split logic. The store name feature identifies the store where each sale was recorded, so it can be used to group the data by store. However, assigning one store to each set would not produce representative and balanced sets: the model would be trained on one store, tuned on another, and tested on the third. The data in each set would not share the distribution and characteristics of the whole dataset, which could prevent the model from learning the general pattern of the data and could cause bias or variance in the model. It also requires extra configuration to define the assignment.
* Option C: A chronological split assigns rows to the sets in time order, based on a time variable. It preserves the temporal sequence of the data and avoids leaking future information into training. The sales timestamp feature records when each sale took place, so it captures trends, seasonality, and cyclicality. However, a chronological split on this feature would not produce representative and balanced sets, because the validation and test sets would contain only the most recent sales. The data in each set would not share the distribution and characteristics of the whole dataset, which could cause bias or variance in the model. It also requires you to configure the time variable [3].
* Option D: A random split with custom percentages uses the same random sampling as the default split, so it also produces representative sets and avoids data leakage. However, assigning 70% of the rows to the training set, 10% to the validation set, and 20% to the test set means choosing and configuring the percentages yourself, which adds complexity to the split process without a stated need. The default data split gives the same kind of split with no configuration and works well in most cases [1].
References:
* [1] About data splits for AutoML models | Vertex AI | Google Cloud
* [2] Manual split for unstructured data
* [3] Mathematical split
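
The difference between the split strategies discussed for Question 117 can be seen on toy data. The following is a minimal sketch in plain Python, not the Vertex AI implementation: the rows, the store names `north`, `south`, and `east`, and the helper functions are all hypothetical. It splits 300 synthetic sales rows with a random 80/10/10 split, a chronological split, and a manual one-store-per-set split, then prints which stores end up in each test set.

```python
import random
from collections import Counter


def cut(rows, percents=(80, 10, 10)):
    """Cut an ordered list of rows into (training, validation, test) by percentage."""
    n_train = len(rows) * percents[0] // 100
    n_val = len(rows) * percents[1] // 100
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


def random_split(rows, percents=(80, 10, 10), seed=0):
    """Shuffle the rows, then cut them, so every set is sampled from all rows."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return cut(shuffled, percents)


def chronological_split(rows, percents=(80, 10, 10)):
    """Sort by timestamp, then cut, so the test set holds only the latest rows."""
    return cut(sorted(rows, key=lambda r: r["timestamp"]), percents)


def split_by_store(rows, assignment):
    """Manual split: send each row to the set assigned to its store."""
    sets = {"training": [], "validation": [], "test": []}
    for row in rows:
        sets[assignment[row["store"]]].append(row)
    return sets["training"], sets["validation"], sets["test"]


# Hypothetical data: three stores, 100 sales each, one unique timestamp per row.
rows = [
    {"store": store, "timestamp": 3 * day + offset, "sales": 100 + day}
    for offset, store in enumerate(["north", "south", "east"])
    for day in range(100)
]

splits = {
    "random": random_split(rows),
    "chronological": chronological_split(rows),
    "by store": split_by_store(
        rows, {"north": "training", "south": "validation", "east": "test"}
    ),
}
for name, (train, val, test) in splits.items():
    stores_in_test = dict(Counter(r["store"] for r in test))
    print(name, len(train), len(val), len(test), stores_in_test)
```

In this toy setup the by-store split trains on a single store, and the chronological split tests only on the latest timestamps, which mirrors the representativeness concern raised in the explanation above.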

Professional-Machine-Learning-Engineer Exam Question 118

You are creating a model training pipeline to predict sentiment scores from text-based product reviews. You want to have control over how the model parameters are tuned, and you will deploy the model to an endpoint after it has been trained. You will use Vertex AI Pipelines to run the pipeline. You need to decide which Google Cloud pipeline components to use. What components should you choose?

Professional-Machine-Learning-Engineer Exam Question 119

You work for a hotel and have a dataset that contains customers' written comments scanned from paper-based customer feedback forms, which are stored as PDF files. Every form has the same layout. You need to quickly predict an overall satisfaction score from the customer comments on each form. How should you accomplish this task?

A.Use the Vision API to parse the text from each PDF file. Use the Natural Language API analyzeSentiment feature to infer overall satisfaction scores.

B.Use the Vision API to parse the text from each PDF file. Use the Natural Language API analyzeEntitySentiment feature to infer overall satisfaction scores.

C.Uptrain a Document AI custom extractor to parse the text in the comments section of each PDF file. Use the Natural Language API analyzeSentiment feature to infer overall satisfaction scores.

D.Uptrain a Document AI custom extractor to parse the text in the comments section of each PDF file. Use the Natural Language API analyzeEntitySentiment feature to infer overall satisfaction scores.
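
For context on the analyzeSentiment feature named in the options, the sketch below shows the shape of a request body for the Natural Language API `documents:analyzeSentiment` REST method and how the document-level sentiment score could be read from a response. The mapping from the score range of -1.0 to 1.0 onto a 1 to 5 satisfaction scale, the helper names, and the sample response are assumptions made for illustration; the question does not define a scale, and no API call is made here.

```python
def build_analyze_sentiment_request(text):
    """Build the JSON body for a documents:analyzeSentiment request."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }


def satisfaction_from_response(response, low=1.0, high=5.0):
    """Map the document-level sentiment score to a satisfaction scale.

    The linear mapping onto [low, high] is a hypothetical choice, not part of the API.
    """
    score = response["documentSentiment"]["score"]  # documented range: -1.0 to 1.0
    return low + (score + 1.0) / 2.0 * (high - low)


# Hypothetical comment text and a hand-written response with the documented fields.
request_body = build_analyze_sentiment_request("The room was clean and the staff were kind.")
sample_response = {"documentSentiment": {"score": 0.5, "magnitude": 1.2}, "language": "en"}
print(request_body["document"]["type"], satisfaction_from_response(sample_response))
```

The document-level score summarizes the whole comment, whereas entity sentiment reports a separate score for each entity mentioned in the text.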

Professional-Machine-Learning-Engineer Exam Question 120

You work for an online travel agency that also sells advertising placements on its website to other companies. You have been asked to predict the most relevant web banner that a user should see next. Security is important to your company. The model latency requirements are 300ms@p99, the inventory is thousands of web banners, and your exploratory analysis has shown that navigation context is a good predictor. You want to implement the simplest solution. How should you configure the prediction pipeline?

A.Embed the client on the website, and then deploy the model on AI Platform Prediction.

B.Embed the client on the website, deploy the gateway on App Engine, and then deploy the model on AI Platform Prediction.

C.Embed the client on the website, deploy the gateway on App Engine, deploy the database on Cloud Bigtable for writing and for reading the user's navigation context, and then deploy the model on AI Platform Prediction.

D.Embed the client on the website, deploy the gateway on App Engine, deploy the database on Memorystore for writing and for reading the user's navigation context, and then deploy the model on Google Kubernetes Engine.

Correct Answer: A

In this scenario, the goal is to predict the most relevant web banner that a user should see next on an online travel agency's website. The model must meet a latency requirement of 300 ms at the 99th percentile (300ms@p99), and there are thousands of web banners to choose from. The exploratory analysis has shown that the navigation context is a good predictor. Security is also important to the company. Given these requirements, and the goal of implementing the simplest solution, the best configuration for the prediction pipeline is to embed the client on the website and deploy the model on AI Platform Prediction. Option A is the correct answer.
Option A: Embed the client on the website, and then deploy the model on AI Platform Prediction. This option is the simplest solution that meets the requirements. The client can collect the user's navigation context and send it to the model deployed on AI Platform Prediction for prediction. AI Platform Prediction is a managed service that can handle large-scale prediction requests and serve them with low latency. This option does not require any additional infrastructure or services.
Option B: Embed the client on the website, deploy the gateway on App Engine, and then deploy the model on AI Platform Prediction. This option adds a layer of infrastructure by deploying the gateway on App Engine. While App Engine can handle large-scale requests, it adds complexity to the pipeline and may not be necessary for this use case.
Option C: Embed the client on the website, deploy the gateway on App Engine, deploy the database on Cloud Bigtable for writing and for reading the user's navigation context, and then deploy the model on AI Platform Prediction. This option adds even more complexity to the pipeline by deploying the database on Cloud Bigtable. While Cloud Bigtable can provide fast and scalable access to the user's navigation context, it may not be needed for this use case, and the extra read and write steps may add latency and cost to the pipeline.
Option D: Embed the client on the website, deploy the gateway on App Engine, deploy the database on Memorystore for writing and for reading the user's navigation context, and then deploy the model on Google Kubernetes Engine. This option is the most complex and costly of the four. Deploying the model on Google Kubernetes Engine requires more management and configuration than AI Platform Prediction, and a self-managed deployment may not meet the 300ms@p99 latency requirement without additional tuning. Deploying the database on Memorystore also adds unnecessary overhead and cost to the pipeline.
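
The requirement 300ms@p99 means that 99% of prediction requests must complete within 300 ms. As a minimal sketch of how such a target could be checked against measured latencies, the code below computes the 99th percentile with the nearest-rank method. The helper names and the sample latencies are hypothetical, and monitoring tools may use a different percentile definition.

```python
def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile (integer pct from 1 to 100), nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer form of ceil(pct / 100 * n), which avoids floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_latency_target(latencies_ms, limit_ms=300, pct=99):
    """Check whether the pct-th percentile latency is within the limit."""
    return percentile_nearest_rank(latencies_ms, pct) <= limit_ms


# Hypothetical measurements in milliseconds: mostly fast requests with a slow tail.
mostly_fast = [120] * 995 + [450] * 5    # 0.5% of requests are slow
heavy_tail = [120] * 985 + [450] * 15    # 1.5% of requests are slow

print(meets_latency_target(mostly_fast))
print(meets_latency_target(heavy_tail))
```

The first sample passes because fewer than 1% of its requests exceed 300 ms, while the second fails even though its average latency is well under the limit.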
