Posts made by travisdh1
-
RE: Docker with Nvidia card access
Next up from the Docker site, how to actually enable the GPU in a Docker Compose file.
Example of a Compose file for running a service with access to 1 GPU device:
services:
  test:
    image: nvidia/cuda:12.3.1-base-ubuntu20.04
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
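To try the Compose file above, something like the following should work. This is only a sketch: the file name compose.yaml and the helper name run_gpu_compose are my own assumptions, not anything from the Docker docs. It validates the file first with `docker compose config --quiet`, then starts the service, which should print the nvidia-smi table and exit.

```shell
#!/usr/bin/env bash
# Sketch: validate and run the GPU Compose example.
# Assumes the YAML was saved as compose.yaml (hypothetical file name).
# Prefix the docker commands with sudo if your user is not in the docker group.
run_gpu_compose() {
  local file="${1:-compose.yaml}"
  if [ ! -f "$file" ]; then
    echo "missing $file" >&2
    return 1
  fi
  # Check the file parses before trying to start anything.
  docker compose -f "$file" config --quiet &&
    docker compose -f "$file" up
}
```

Usage: `run_gpu_compose` in the directory holding the file, or `run_gpu_compose /path/to/compose.yaml`.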
-
Docker with Nvidia card access
I've been working on a bunch of stuff lately, and I'm going to be recording some of the annoyances and fixes in the near future.
For now, it's Docker and getting access to Nvidia cards in containers.
For starters, just use Ubuntu Desktop. I know it sucks running a desktop for servers, but in this case the Desktop installer allows you to install the Nvidia proprietary drivers. I had no luck getting the proprietary drivers working on Ubuntu Server.
Step 1: Install Ubuntu Desktop and make sure to select the proprietary Nvidia driver.
Step 2: Verify nvidia-smi has the correct card(s) listed.
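A quick way to do the Step 2 check from a terminal is sketched below. `nvidia-smi -L` prints one line per GPU; the count_gpus helper is a name I made up for illustration, and it just counts those lines so you can compare against the number of cards physically installed.

```shell
#!/usr/bin/env bash
# Sketch: count the GPUs the driver can see.
# count_gpus is a hypothetical helper, not part of any Nvidia tooling.
count_gpus() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    # Driver tools are not installed (or not on PATH).
    echo 0
    return 1
  fi
  # `nvidia-smi -L` lists one "GPU <n>: ..." line per detected card.
  nvidia-smi -L | grep -c '^GPU '
}

gpu_count=$(count_gpus) || true
echo "GPUs detected: ${gpu_count}"
```

If the count is 0 or lower than expected, the proprietary driver is probably not loaded, and nothing in the later steps will work until that is fixed.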
Step 3: Run the script below, sourced from the Docker and Nvidia sites.
Edit: added nvidia-docker2 to the installed Nvidia software.
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit nvidia-docker2
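Once the script finishes, a smoke test along these lines should confirm that containers can actually see the card. This is a sketch under a couple of assumptions: the image tag is borrowed from the Compose example in this thread, and gpu_smoke_test is just a helper name I picked. If the run fails with a runtime error, Nvidia's docs describe running `sudo nvidia-ctk runtime configure --runtime=docker` and restarting Docker, which may be needed on some setups.

```shell
#!/usr/bin/env bash
# Sketch: run nvidia-smi inside a container to confirm GPU passthrough.
# The image tag is an assumption, taken from the Compose example in this thread.
# Prefix the docker command with sudo if your user is not in the docker group.
IMAGE="nvidia/cuda:12.3.1-base-ubuntu20.04"

gpu_smoke_test() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not found; run the install script first" >&2
    return 1
  fi
  # --gpus all asks Docker to expose every GPU to the container.
  docker run --rm --gpus all "$IMAGE" nvidia-smi
}
```

Call `gpu_smoke_test` after the install; the output should match what nvidia-smi shows on the host.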
-
RE: Dell r720 and Hitachi Drive
@gjacobse Now that I think about it, you might want to try a bios reset on the iDRAC. I don't know if that will make a difference, but it's corrected issues with them for me in the past.
-
RE: Dell r720 and Hitachi Drive
@gjacobse said in Dell r720 and Hitachi Drive:
iDRAC -
Ended up being short lived excitement when I came in last night to see that I could access the LifeCycle controller and was able to get the iDRAC connected.
Used the reboot option and that was it I suppose... Back to LifeCycle controller being disabled and the iDRAC being inaccessible.
Odd also is that it seems to only allow one of the four main NICs over the iDRAC NIC - Has it been so long since I worked on something at this level that I've misconstrued some information on how the iDRAC NIC is supposed to work?
iDRAC normally has a NIC separate from all the others. Some utilize the same NIC as the main system, but those are the jank version. My R620 iDRAC is a physically separate network port, and all the servers at work are as well.
-
RE: Dell r720 and Hitachi Drive
@gjacobse said in Dell r720 and Hitachi Drive:
Specs:
Dell r720 surplused
Hitachi HUC106060css600 512 drives
I admit - I've realized that the last server I had to physically touch was likely back in 2009. While I have worked with systems, I've not had to build or rebuild an array in a very long time. So I am a tad rusty.
System does see these drives - I have a full complement of sixteen to fully populate the two bays. However I have run into a slight issue with these drives I hadn't had to work with previously as technology has changed.
These drives are per the details above 512/512e and the r720 doesn't want to 'accept' them. While the PERC does see them, they are all marked as BLOCKED.
Now, I understand that sector size is set and cannot be changed - I rather expected to find the correct firmware needed to update the PERC to allow these drives to be used...
I do see that Dell has the driver for this drive - but reading the details about it seems to point at an OS side issue and not the PERC.
I fully expect that I am simply overlooking the solution because I am looking at or for the wrong information.
Are you looking at the PERC BIOS screen? If it's in standard RAID mode, that's where you need to set up the array.
Hotkeys are always different. If you have access to the iDRAC, you should be able to boot directly into the RAID BIOS from that. I have an even older R620 still in use as my home lab box, and that's normally how I access the PERC BIOS when needed.
You're reminding me that I really should see about upgrading my home lab box.
-
RingCentral/AT&T Office at Hand Outage
If any of you still have customers on AT&T Office At Hand/RingCentral, reboot your phones.
They let an SSL cert expire. Now that they've corrected that, phones are able to register again.
-
RE: What Are You Doing Right Now
Well, that was a fun afternoon.
Client asked for the /logs mount point on a mysql server to be doubled in size. What they didn't tell anyone is that they had already changed the settings in mysql for the logs.
-
RE: What Are You Doing Right Now
@EddieJennings said in What Are You Doing Right Now:
Practicing installation of Red Hat Identity Manager.
Hrm, I think it's about time I poked at FreeIPA again.
-
RE: Outage 7/19
As one of my co-workers quipped, "They're not getting any malware now!"
-
RE: Random Thread - Anything Goes
@garak0410 said in Random Thread - Anything Goes:
How's everyone doing out there in Mango Land?
Sitting at work bored out of my mind. They dangled that 5% of billable hours logged when I was considering the job. Found out after I started that they really don't have the business to make any billable hours.
-
RE: Going entirely wireless instead of wired
I've had to deal with the same sort of buildings before. The key will be to have a good site survey done ahead of time. If you end up needing an AP in every room, it's probably not worth it.
Just an educated guess here, but 2.4GHz will probably get halfway acceptable coverage without acceptable speed for the end users, while 5GHz will likely not cover enough area to make it worth moving to entirely.
-
RE: What Are You Doing Right Now
Just got done removing a cage section in the datacenter for an equipment move this morning.
Next up, new circuit turnups for customers.