Local LLM and ML platform with RTX 5090 GPU

I built a local AI workstation around an RTX 5090 (32 GB) for an uninterrupted, offline coding workflow.

OS: Debian 12 with a pinned NVIDIA .run driver (frozen for kernel stability).
LLMs: each in its own Python venv to keep the global stack clean.
Tools in a default “example-venv”: PyTorch, SciPy, NumPy, pandas, Matplotlib, scikit-learn.
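
A minimal sketch of the one-venv-per-model idea, using only Python's standard `venv` module. The helper name `create_model_env` and the `<model>-venv` directory naming are hypothetical choices for illustration, not the exact layout used on this workstation.

```python
import venv
from pathlib import Path


def create_model_env(base_dir, model_name, with_pip=True):
    """Create an isolated virtual environment for one model.

    base_dir   -- parent directory that holds all environments (assumed layout)
    model_name -- short label used to name the environment directory
    with_pip   -- bootstrap pip into the new environment (needs ensurepip)
    """
    env_dir = Path(base_dir) / f"{model_name}-venv"
    # EnvBuilder creates the directory tree and pyvenv.cfg; clear=False
    # refuses to wipe an existing environment with the same name.
    venv.EnvBuilder(with_pip=with_pip, clear=False).create(env_dir)
    return env_dir
```

Calling `create_model_env("/path/to/envs", "example")` would produce `/path/to/envs/example-venv`; packages such as PyTorch or NumPy are then installed with that environment's own `pip`, so nothing leaks into the system Python.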

Short demo + full setup notes:
https://localprompt.ai/demo.mp4
https://localprompt.ai

Current favorite: DeepSeek-Coder-V2-Lite-Instruct (GGUF, Q8_0) for offline code help; I run it locally and use the venv to execute and validate the code it generates.

I’d love feedback on two points:

  1. With a 32 GB GPU, which models are you finding best in practice as a coding assistant?
  2. For longer tasks, do you prefer a slightly smaller model with a bigger context window, or a stronger model at the risk of it forgetting part of the chat history?


If you’re looking for a coding-specialized model that can be quantized to 32 GB or less (preferably 16 GB or less, to leave memory for context), the Qwen Coder series would be a safe bet. Devstral and NextCoder also seem promising.
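
To make the "weights plus context" budget concrete, here is a rough back-of-the-envelope sketch. It assumes the usual approximations: weight size is parameters times bits per weight, and the KV cache stores one key and one value vector per layer, per KV head, per token. The model shape in the example (32B parameters, 64 layers, 8 KV heads, head dimension 128) and the 1.5 GiB overhead are hypothetical placeholders, not figures for any specific model; read the real values from the model's config.

```python
def weight_gib(n_params_billion, bits_per_weight):
    """Approximate size of the quantized weights in GiB."""
    return n_params_billion * 1e9 * bits_per_weight / 8 / 2**30


def kv_cache_gib(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Approximate KV cache size in GiB (keys + values, fp16 by default)."""
    elems = 2 * n_layers * n_kv_heads * head_dim * context_len
    return elems * bytes_per_elem / 2**30


def fits(vram_gib, weights_gib, kv_gib, overhead_gib=1.5):
    """True if weights, KV cache and an assumed runtime overhead fit in VRAM."""
    return weights_gib + kv_gib + overhead_gib <= vram_gib


if __name__ == "__main__":
    # Hypothetical 32B model at ~4.5 bits per weight with a 32k-token context.
    w = weight_gib(32, 4.5)
    kv = kv_cache_gib(n_layers=64, n_kv_heads=8, head_dim=128, context_len=32768)
    print(f"weights ~{w:.1f} GiB, KV cache ~{kv:.1f} GiB, fits in 32 GiB: {fits(32, w, kv)}")
```

With these placeholder numbers the same model at 8 bits per weight would no longer fit next to a 32k context, which is why a quantization that lands well under the card's capacity is the safer target.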

Thanks a lot, I’ll try it sometime next week and post my findings here.

How’s your experience been with driver support for native fan control with your Inno3D 5090?

I was looking at some similar RTX 5090 builds for local AI on llamabuilds.ai, and it looked like most builds there prefer the reference NVIDIA RTX 5090 or MSI models?

How did you resolve the issue with sm_120 support? I tried and couldn’t get it to work, and without it you won’t be able to run any inference on the card.

sm_120 is only supported in the torch nightly builds. I’m running dual RTX 5070s.
Make sure you are running CUDA 13.0.1 (CUDA 13.0 Update 1, driver >= 580),
then install the torch nightly build from:

https://download.pytorch.org/whl/nightly/cu130

pip show torch
Name: torch
Version: 2.10.0.dev20250910+cu130

nvidia-smi.exe
+-------------------------------------------------------------------------------------+
| NVIDIA-SMI 581.29            Driver Version: 581.29            CUDA Version: 13.0   |
|=====================================+========================+======================|
|   0  NVIDIA GeForce RTX 5070   WDDM | 00000000:01:00.0    On |                  N/A |
|  0%   37C    P1      34W /  250W    |  11198MiB /  12227MiB  |     10%      Default |
+-------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 5070   WDDM | 00000000:14:00.0   Off |                  N/A |
|  0%   46C    P1      24W /  250W    |  11065MiB /  12227MiB  |     10%      Default |
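
A quick way to confirm that an installed torch build actually ships kernels for the card is to compare the device's compute capability with the build's architecture list. The sketch below uses `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list`; the helper names `supports_arch` and `report` are made up for illustration. It only checks for an exact native match such as `sm_120` and does not account for PTX fallback, so treat a negative result as a hint rather than proof.

```python
def supports_arch(arch_list, major, minor):
    """True if the build lists native kernels for compute capability major.minor."""
    return f"sm_{major}{minor}" in arch_list


def report():
    """Describe whether the local torch build targets GPU 0's architecture."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    if not torch.cuda.is_available():
        return "torch is installed, but CUDA is not available"
    major, minor = torch.cuda.get_device_capability(0)
    archs = torch.cuda.get_arch_list()
    ok = supports_arch(archs, major, minor)
    return f"device sm_{major}{minor}; build targets {archs}; native kernels: {ok}"


if __name__ == "__main__":
    print(report())
```

On an RTX 50-series card the device should report compute capability 12.0, so the build needs `sm_120` in its list for the check to pass.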