NCP-AII Exam Question 11

A user reports that their GPU-accelerated application is crashing with a CUDA 'out of memory' error. You have confirmed that the GPU has sufficient physical memory. What are the likely causes and troubleshooting steps?
  • NCP-AII Exam Question 12

    An NVIDIA DGX server with 8 GPUs is experiencing performance issues during a distributed deep learning training run. You suspect a problem with the GPU interconnects. You have already confirmed that NVLink is active. What is the most thorough approach to diagnose potential bandwidth or latency bottlenecks in the GPU-to-GPU communication paths?
  • NCP-AII Exam Question 13

    You have a server equipped with multiple NVIDIA GPUs connected via NVLink. You want to monitor NVLink bandwidth utilization in real time. Which tool or method is the most appropriate and accurate for this?
  • NCP-AII Exam Question 14

    After configuring MIG on an NVIDIA A100 GPU, you run 'nvidia-smi' and observe that all MIG instances are in the 'Disabled' state.
    Which of the following are potential reasons for this issue? (Select all that apply)
  • NCP-AII Exam Question 15

    You are building a cloud-native application that uses both CPU and GPU resources. You want to optimize resource utilization and cost by scheduling CPU-intensive tasks on nodes without GPUs and GPU-intensive tasks on nodes with GPUs. How would you achieve this node selection and workload placement in Kubernetes?
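
    For Question 15, one common way to express this kind of placement is to label nodes and have each Pod carry a matching nodeSelector, with GPU nodes additionally tainted so that only Pods with a toleration land on them. The sketch below is illustrative only and is not presented as the exam's answer key. The label key `accelerator`, its values, the taint key, and the image names are hypothetical; `nodeSelector`, `tolerations`, and the `nvidia.com/gpu` resource name (exposed by the NVIDIA device plugin) are standard Kubernetes fields. It builds the Pod manifests as plain dictionaries and prints them as JSON, which `kubectl apply -f` accepts.

```python
import json

# Hypothetical node labels; substitute whatever labels your cluster uses.
GPU_NODE_LABEL = {"accelerator": "nvidia-gpu"}
CPU_NODE_LABEL = {"accelerator": "none"}
# Assumed taint key placed on GPU nodes with effect NoSchedule.
GPU_TAINT_KEY = "nvidia.com/gpu"


def pod_manifest(name, image, gpu_count=0):
    """Build a Pod manifest that targets GPU or CPU-only nodes."""
    container = {"name": name, "image": image}
    spec = {"containers": [container]}
    if gpu_count > 0:
        # Request GPUs through the device-plugin resource name.
        container["resources"] = {"limits": {"nvidia.com/gpu": gpu_count}}
        # Steer the Pod to labeled GPU nodes and tolerate their taint.
        spec["nodeSelector"] = dict(GPU_NODE_LABEL)
        spec["tolerations"] = [
            {"key": GPU_TAINT_KEY, "operator": "Exists", "effect": "NoSchedule"}
        ]
    else:
        # CPU-bound work is pinned to nodes labeled as having no GPU.
        spec["nodeSelector"] = dict(CPU_NODE_LABEL)
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": spec,
    }


if __name__ == "__main__":
    # Hypothetical image names, used only for illustration.
    for pod in (
        pod_manifest("trainer", "example.com/trainer:latest", gpu_count=1),
        pod_manifest("preprocess", "example.com/preprocess:latest"),
    ):
        print(json.dumps(pod, indent=2))
```

    Node affinity rules could replace the nodeSelector when softer or more expressive matching is needed; the taint on GPU nodes is what keeps CPU-only Pods that lack a selector from occupying them.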