RHEL8 and Fedora Development Environments for GPU Enabled x86 Instances

2024-5-13

Overview

This write up provides steps in setting up a RHEL8 or Fedora development environment for GPU enabled x86 instances in AWS. These steps were used when benchmarking machine learning models developed at University of Texas at Austin and Oregon State University for use in ATL24.

Steps

Logging in the first time

RHEL8

ssh -i .ssh/<mykey>.pem ec2-user@<ip address>
sudo subscription-manager register
sudo dnf upgrade --refresh

Fedora39

ssh -i .ssh/<mykey>.pem fedora@<ip address>
sudo dnf upgrade --refresh
sudo hostnamectl set-hostname --static <new hostname>

Install Large File System for Git

sudo dnf install wget
wget --content-disposition "https://packagecloud.io/github/git-lfs/packages/el/8/git-lfs-3.5.1-1.el8.x86_64.rpm/download.rpm?distro_version_id=205"
sudo rpm -i git-lfs-3.5.1-1.el8.x86_64.rpm
git lfs install

Configure the development repositories for the package manager:

sudo subscription-manager config --rhsm.manage_repos=1
sudo subscription-manager repos --enable=codeready-builder-for-rhel-8-x86_64-rpms
sudo dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
sudo dnf install -y epel-release

Install the build requirements:

sudo dnf groupinstall "Development Tools"
sudo dnf install \
    cmake \
    cppcheck \
    opencv-devel \
    python3.9  \
    parallel \
    gmp-devel \
    mlpack-devel \
    mlpack-bin \
    gdal-devel \
    armadillo-devel \
    gcc-toolset-12

Install the NVIDIA drivers:

dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
dnf -y install cuda libcudnn8 libcudnn8-devel

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
sudo yum install -y nvidia-container-toolkit
sudo systemctl restart docker

Setup the Python environment for the build

python3.9 -m venv venv
source ./venv/bin/activate
python -m pip install --upgrade pip
pip install torch numpy pandas

Configure Environment for Build

source ~/venv/bin/activate
source /opt/rh/gcc-toolset-12/enable
CUDACXX=/usr/local/cuda/bin/nvcc
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}