Building a GPU Home Server for AI
Want to build a GPU home server for running quantized models? Here are some tips and tricks for setting up the server.
Components Overview
GPUs
- RTX 3090: Two RTX 3090s with NVLink are a common choice for running large AI models. NVLink can provide improved communication between GPUs, though for many AI tasks, traditional PCIe bandwidth is sufficient.
- VRAM: With AI models, memory is king. Llama 3 70B needs about 160 GB of RAM unquantized (roughly 140 GB for the 16-bit weights alone), while its quantized variants are able to squeeze into 48 GB of VRAM. Hence why 2x 3090 and 2x 4090 setups, with 24 GB per card, are so popular for home systems.
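The memory figures above follow from simple arithmetic: parameter count times bits per weight, divided by 8, gives bytes. A minimal sketch, assuming weights only (KV cache and runtime overhead come on top); the function name weights_gb is made up for illustration:

```shell
# Rough weight-only memory estimate in gigabytes:
# parameters (in billions) x bits per weight / 8.
# Integer math, so fractional results are truncated.
weights_gb() {
  params_billion=$1
  bits_per_weight=$2
  echo $(( params_billion * bits_per_weight / 8 ))
}

weights_gb 70 16   # 16-bit weights: prints 140
weights_gb 70 4    # 4-bit quantization: prints 35
```

At 4 bits, a 70B model leaves some of the 48 GB on two 24 GB cards free for context, which is why that pairing works.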
CPU
- AMD vs. Intel: Modern Intel CPUs are generally better at power management and clocking down when idle. However, high-end AMD CPUs like the 7800X3D are also a good choice.
- Recommendation: Consider the AMD Ryzen 7800X3D or an Intel i5/i7, depending on your power management preference and budget. The AMD Ryzen 7800X3D and 7900X3D have very large L3 caches, making them highly performant on unoptimised single-threaded applications (looking at you, RimWorld).
Motherboard
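- PCIe Lanes: Ensure the motherboard supports 8x/8x PCIe bifurcation if running dual GPUs. Models like the Asus Creator, ASRock Taichi series (AMD), or a Z790 board (Intel) that supports an 8x/8x split are good choices. Check the manual, since not every board wires its second slot to the CPU.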
- Integrated NIC: For high-speed networking, consider boards with a 10-gig NIC.
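Once the cards are installed, it is worth confirming that both actually negotiated an x8 (or wider) link. A small sketch, assuming the NVIDIA driver is installed; check_link_width is a hypothetical helper that reads the CSV rows nvidia-smi prints:

```shell
# Reads "index, gen, width" CSV rows on stdin and reports any GPU whose
# negotiated PCIe link is narrower than the wanted width (default 8).
# Exits non-zero if at least one GPU is below the wanted width.
check_link_width() {
  awk -F', *' -v want="${1:-8}" '
    $3 + 0 < want { printf "GPU %s: x%s (below x%s)\n", $1, $3, want; bad = 1 }
    END { exit bad + 0 }
  '
}

# On the server itself:
# nvidia-smi --query-gpu=index,pcie.link.gen.current,pcie.link.width.current \
#   --format=csv,noheader | check_link_width 8
```

The reported link generation can drop while the GPU is idle, so check it under load. Running nvidia-smi topo -m additionally shows how the two GPUs are connected to each other.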
Memory
- RAM: 32GB or 64GB DDR4/DDR5 depending on your workload. Dual-channel configurations are generally sufficient.
- Storage: A 2TB PCIe 5.0 NVMe SSD ensures fast read/write speeds.
Power Supply Unit (PSU)
- Capacity: A 1200W Platinum or Titanium PSU is recommended. These offer higher efficiency, especially at lower loads, which is critical for reducing idle power consumption.
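- Connections: Ensure the PSU has enough power connectors (6 total: 2 EPS connectors for the CPU and 4 PCIe 8-pin connectors for the GPUs). Check your specific cards, since some RTX 3090 models need three 8-pin connectors each.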
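To sanity-check the PSU capacity, add up the sustained draw and leave headroom for transient spikes. A rough sketch; every wattage and the 30% headroom are illustrative assumptions, not measurements, and psu_min_watts is a made-up name:

```shell
# Suggest a minimum PSU rating: sum the sustained draw, then add headroom.
# Arguments: GPU count, watts per GPU, CPU watts, everything else, headroom in percent.
psu_min_watts() {
  gpu_count=$1; gpu_watts=$2; cpu_watts=$3; other_watts=$4; headroom_pct=$5
  load=$(( gpu_count * gpu_watts + cpu_watts + other_watts ))
  echo $(( load * (100 + headroom_pct) / 100 ))
}

psu_min_watts 2 350 120 80 30   # two GPUs at a stock 350 W: prints 1170
psu_min_watts 2 200 120 80 30   # same cards limited to 200 W each: prints 780
```

With stock limits the estimate lands close to the 1200W recommendation; power-limiting the GPUs leaves much more margin.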
Cooling
- Airflow: Ensure adequate spacing and airflow for cooling. Adding dedicated fans or using water cooling can help manage temperatures and improve efficiency.
Additional Components
- Networking: A high-speed NIC like Intel X710-DA4 can be beneficial for data transfer.
- UPS: Consider an Uninterruptible Power Supply (UPS) to protect against power outages.
Power Management
GPU Power Limiting
- Persistence Mode: Enable persistence mode to keep the NVIDIA driver loaded between jobs, which helps reduce power usage when GPUs are idle.
sudo nvidia-smi -pm 1
- Power Limit: Set power limits to balance performance and efficiency.
sudo nvidia-smi -pl 200 -i 0 # Set power limit to 200W for GPU 0
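With two GPUs it is convenient to apply the same limit to every card in one go. The sketch below assumes the driver reports numeric values for power.min_limit and power.max_limit; clamp_watts and apply_power_limit are hypothetical helper names:

```shell
# Clamp a requested wattage into the [min, max] range a card reports.
clamp_watts() {
  want=$1; min=$2; max=$3
  if [ "$want" -lt "$min" ]; then echo "$min"
  elif [ "$want" -gt "$max" ]; then echo "$max"
  else echo "$want"
  fi
}

# Enable persistence mode, then apply the same (clamped) limit to every GPU.
apply_power_limit() {
  sudo nvidia-smi -pm 1
  nvidia-smi --query-gpu=index,power.min_limit,power.max_limit \
      --format=csv,noheader,nounits |
  while IFS=', ' read -r idx min max; do
    # Values arrive as decimals such as 100.00; drop the fraction for integer math.
    watts=$(clamp_watts "$1" "${min%.*}" "${max%.*}")
    sudo nvidia-smi -i "$idx" -pl "$watts" </dev/null
  done
}

# Usage on the server: apply_power_limit 200
```

Power limits set this way do not survive a reboot, so the call would need to run again at startup, for example from a boot script.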
CPU and System Power Management
- BIOS Settings: Enable power-saving features in the BIOS. Disable unnecessary components.
- Operating System: Use Linux with power management tools to monitor and control power usage. For instance, power_now reports the current power draw in microwatts on systems with a battery, such as laptops; a desktop server usually has no BAT0 entry.
cat /sys/class/power_supply/BAT0/power_now
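The kernel reports power_now in microwatts, so the raw number is hard to read at a glance. A small sketch that converts it to watts and fails cleanly when the file is missing; read_watts is a made-up helper name, and the BAT0 default is taken from the command above:

```shell
# Print the reading from a sysfs power_now file in watts.
# Returns non-zero if the file does not exist or is unreadable.
read_watts() {
  file=${1:-/sys/class/power_supply/BAT0/power_now}
  if [ -r "$file" ]; then
    awk '{ printf "%.1f\n", $1 / 1000000 }' "$file"
  else
    echo "no reading at $file" >&2
    return 1
  fi
}

# Usage: read_watts
#        read_watts /sys/class/power_supply/BAT1/power_now
```

On a machine without a battery this prints the "no reading" message, in which case a wall-socket meter is the simpler way to get whole-system draw.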
Example Builds
Build 1
- CPU: AMD 7900X3D
- Motherboard: Asus X670E Hero
- RAM: 64GB DDR5
- GPUs: 2x RTX 3090 with NVLink
- PSU: Corsair RM1200e
- Cooling: Custom water cooling for GPUs, air cooling for CPU
- Storage: 2TB PCIe 5.0 NVMe SSD
- Networking: Integrated 10-gig NIC
Build 2
- CPU: Intel i5
- Motherboard: Z790 board with dual PCIe 4.0 x16 slots
- RAM: 32GB DDR4
- GPUs: 2x RTX 3090 with NVLink
- PSU: Be Quiet 1000W Platinum
- Cooling: Air cooling with additional fans
- Storage: 1TB PCIe 4.0 NVMe SSD
- Networking: 1-gig NIC (optional 10-gig upgrade)
Power Consumption
- Idle Power: Aim for around 50-90W. Efficient components and power management settings are crucial.
- Load Power: Expect around 700-800W under full load with power-limited GPUs. Ensure your PSU can handle peak loads.
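Because a home server sits idle most of the day, idle draw often matters more for the electricity bill than load draw. A rough estimate, assuming a fixed number of hours under load per day and a flat price per kWh (both values in the example are placeholders):

```shell
# Estimate yearly energy use and cost.
# Arguments: idle watts, load watts, hours under load per day, price per kWh.
yearly_cost() {
  awk -v idle="$1" -v load="$2" -v load_hours="$3" -v price="$4" 'BEGIN {
    kwh = (load * load_hours + idle * (24 - load_hours)) * 365 / 1000
    printf "%.0f kWh, %.2f per year\n", kwh, kwh * price
  }'
}

yearly_cost 70 750 2 0.30   # prints: 1110 kWh, 332.88 per year
```

In this example the 22 idle hours account for slightly more energy than the 2 hours at full load, which is why the idle target is worth chasing.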
Miscellaneous Tips
- Energy Efficiency: Invest in energy-efficient components and consider renewable energy options like solar panels to offset electricity costs.
- Monitoring Tools: Use power metering tools to monitor and manage power usage effectively, for example a plug-in electricity usage monitor at the wall socket.
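A wall meter shows whole-system draw; for the GPU share, nvidia-smi can log power draw over time. A minimal sketch, assuming the log holds "timestamp, watts" rows; avg_watts and the gpu_power.csv file name are made up for illustration:

```shell
# Average the power column from CSV rows of "timestamp, watts".
# With several GPUs the rows are interleaved, so this is the mean per GPU sample.
avg_watts() {
  awk -F', *' '{ sum += $2; n++ } END { if (n) printf "%.1f\n", sum / n }'
}

# On the server: sample every 5 seconds into a log, then summarize it.
# nvidia-smi --query-gpu=timestamp,power.draw --format=csv,noheader,nounits -l 5 >> gpu_power.csv
# avg_watts < gpu_power.csv
```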