AWS Page

This guide walks through setting up dedicated GPU resources in your AWS account and using those GPUs for the workloads that need them.

In AWS, certain instance types come with GPU resources attached. EC2 P2 and P3 instances are good options for most use cases. In the example below, we use p2.xlarge (4 vCPUs, 61 GiB of memory, 1 GPU).
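
If you want to compare GPU-capable instance types before picking one, their specs can be queried programmatically. The sketch below uses boto3 and assumes it is installed and AWS credentials are configured; the region and instance types are only illustrations.

    # Sketch: compare GPU-capable EC2 instance types with boto3.
    # Assumes boto3 is installed and AWS credentials are configured.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # example region

    resp = ec2.describe_instance_types(InstanceTypes=["p2.xlarge", "p3.2xlarge"])

    for it in resp["InstanceTypes"]:
        gpus = sum(g["Count"] for g in it.get("GpuInfo", {}).get("Gpus", []))
        mem_gib = it["MemoryInfo"]["SizeInMiB"] // 1024
        print(f'{it["InstanceType"]}: {it["VCpuInfo"]["DefaultVCpus"]} vCPUs, '
              f'{mem_gib} GiB memory, {gpus} GPU(s)')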

Prerequisites

  • Argonaut account
  • AWS account and VCS connected
  • An app in your Git repository that can take advantage of the GPU
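
"An app that can take advantage of the GPU" simply means a workload that looks for a GPU at runtime and uses it when one is present. As a minimal, hypothetical sketch (assuming a PyTorch-based app; any CUDA-capable framework follows the same pattern), the snippet below falls back to the CPU when no GPU is exposed to the container.

    # Minimal sketch of a GPU-aware workload (assumes PyTorch is in the image).
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Running on {device}; visible GPUs: {torch.cuda.device_count()}")

    # Place an example computation on whichever device was found.
    x = torch.randn(1024, 1024, device=device)
    y = x @ x
    print(y.sum().item())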

Provisioning a cluster

  1. From the sidebar, go to Environments and click Environment + to provision a new environment in your chosen region.
  2. Once the environment is created, click on the Infra tab and click Resource +.
  3. Choose EKS (Kubernetes cluster).
  4. Set up separate node groups: one with CPU-only resources and one with GPUs. (If you only need GPU-enabled nodes, just create one node group with GPUs enabled.)
  5. Choose a compute instance with a GPU and set the other parameters. Here, the GPUnodes group uses p2.xlarge instances, which have both CPUs and GPUs, while the CPUnodes group uses t3.medium instances with CPU resources only.
    Creating Nodegroups for GPU and CPU instances
  6. Click Create EKS, and your node groups will be deployed in your cluster. Once the nodes are up, you can confirm that Kubernetes sees their GPUs, as sketched below.
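
Once the NVIDIA device plugin is running on the GPU nodes, each of them advertises an nvidia.com/gpu resource to Kubernetes. A rough way to confirm this is sketched here with the official Kubernetes Python client (assumed installed, with a kubeconfig pointing at the new EKS cluster).

    # Sketch: list each node's allocatable GPUs via the Kubernetes Python client.
    # Assumes `pip install kubernetes` and a kubeconfig for the new EKS cluster.
    from kubernetes import client, config

    config.load_kube_config()          # uses your local kubeconfig
    v1 = client.CoreV1Api()

    for node in v1.list_node().items:
        allocatable = node.status.allocatable or {}
        labels = node.metadata.labels or {}
        gpus = allocatable.get("nvidia.com/gpu", "0")
        itype = labels.get("node.kubernetes.io/instance-type", "unknown")
        print(f"{node.metadata.name}: {itype}, allocatable GPUs = {gpus}")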

Now that your cluster is ready, it’s time to deploy your application and schedule it onto the nodes that have GPU resources.

Deploying an app

Deploying your app involves two steps: build and deploy. Your Git provider handles the build step through GitLab CI or GitHub Actions, and ArgoCD handles the deployment step.

  1. Go to the applications tab on the left.
  2. Click on Application +. You can now set up multiple pipelines for your application.
  3. Click Create Pipeline + to set up a new pipeline.
  4. In the Build step, choose your repo and configure the build details.
    Build step configs
  5. Select the Deploy step. Choose which environment and cluster it gets deployed to.
  6. Add any runtime variables, secret files, etc.
    Deploy step configs
  7. You can then update network services, autoscaling, and storage.
  8. As a last step, set the CPU and memory requests and limits, and the number of GPUs to attach. (The sketch after this list shows how this maps to Kubernetes resources.)

    💡 Note: The number of GPUs you attach must be less than or equal to the number of GPUs available in your instance (refer to the EKS resource in your environment to see how many GPUs you have provisioned).

  9. Click on Create Pipeline, and your app is deployed.
    Attaching GPU resources
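
Under the hood, attaching GPUs to a workload on Kubernetes means the container's resource limits include the nvidia.com/gpu resource alongside CPU and memory, which is why the count cannot exceed what a single node offers. The snippet below is only an illustration of such a container spec built with the Kubernetes Python client, not the exact manifest Argonaut generates; the names and values are placeholders.

    # Illustrative sketch (not Argonaut's generated manifest): a container spec
    # that requests CPU/memory and one GPU via the nvidia.com/gpu resource.
    from kubernetes import client

    container = client.V1Container(
        name="gpu-app",                         # hypothetical name
        image="<your-registry>/gpu-app:latest",  # placeholder image
        resources=client.V1ResourceRequirements(
            requests={"cpu": "2", "memory": "8Gi"},
            limits={"cpu": "4", "memory": "16Gi", "nvidia.com/gpu": "1"},
        ),
    )

    # Kubernetes schedules this pod only onto nodes whose allocatable
    # nvidia.com/gpu count can satisfy the limit (e.g. the p2.xlarge nodes).
    pod_spec = client.V1PodSpec(containers=[container])
    print(container.resources.limits)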