Local LLM and ML platform with RTX 5090 GPU

mateomayer · August 11, 2025, 10:02am

I built a local AI workstation around an RTX 5090 (32 GB) for an uninterrupted, offline coding workflow.

OS: Debian 12 with a pinned NVIDIA .run driver (frozen for kernel stability).
LLMs: each in its own Python venv to keep the global stack clean.
Tools in a default “example-venv”: PyTorch, SciPy, NumPy, pandas, Matplotlib, scikit-learn.
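
For readers who want to reproduce the one-venv-per-model layout, here is a minimal sketch using only Python's standard `venv` module. The helper name `create_model_env` and the directory names are hypothetical and not taken from the original setup notes.

```python
import venv
from pathlib import Path


def create_model_env(base_dir, name, with_pip=True):
    """Create an isolated virtual environment at base_dir/name and return its path.

    base_dir and name are illustrative; pick whatever layout suits your machine.
    """
    env_dir = Path(base_dir) / name
    # EnvBuilder is the stdlib API behind `python -m venv`.
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    return env_dir


if __name__ == "__main__":
    # Hypothetical layout: one environment per model plus a shared tools environment.
    for env_name in ("example-venv", "deepseek-coder-venv"):
        print(create_model_env(Path.home() / "llm-envs", env_name))
```

Each environment then gets its own packages through its own `pip`, so upgrading one model's dependencies does not touch the others.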

Short demo + full setup notes:
→ https://localprompt.ai/demo.mp4
→ https://localprompt.ai
→ System Specifications – LocalPrompt.ai

Current favorite: DeepSeek-Coder-V2-Lite-Instruct (GGUF, Q8_0) for offline code help; I run it locally and use the venv to execute/validate.

I’d love feedback on two points:

1. With a 32 GB GPU, which models are you finding best in practice as a coding assistant?
2. For longer tasks, do you prefer a slightly smaller model with a bigger context, or a stronger model at the risk of it forgetting part of the chat history?

John6666 · August 11, 2025, 11:30am

If you’re looking for a coding-specialized model that can be quantized to 32 GB or less (preferably 16 GB or less once you account for the memory the context needs), the Qwen Coder series would be a safe bet. Devstral and NextCoder also seem promising.
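
To make the "weights plus context" budget concrete, here is a rough back-of-the-envelope sketch. It assumes a plain transformer with grouped-query attention and a 16-bit KV cache; models with other attention schemes will differ. The layer counts, head sizes, and the 1.5 GiB overhead are hypothetical placeholders, not the specs of any model named in this thread.

```python
GIB = 1024 ** 3


def weight_bytes(n_params, bits_per_weight):
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Approximate KV cache size: one K and one V vector per layer, KV head and token."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def fits(vram_gib, n_params, bits_per_weight, n_layers, n_kv_heads, head_dim,
         context_len, overhead_gib=1.5):
    """Return True if weights + KV cache + an assumed fixed overhead fit in VRAM."""
    total = (weight_bytes(n_params, bits_per_weight)
             + kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len)
             + overhead_gib * GIB)
    return total <= vram_gib * GIB


if __name__ == "__main__":
    # Hypothetical 32B-parameter model at about 4.5 bits per weight.
    model = dict(n_params=32_000_000_000, bits_per_weight=4.5,
                 n_layers=64, n_kv_heads=8, head_dim=128)
    for ctx in (32_768, 131_072):
        print(ctx, fits(32, context_len=ctx, **model))
```

With these placeholder numbers the weights come to 18 GB and a 32K-token cache to 8 GiB, which is why a quantization well under the card's capacity leaves more usable room for long contexts.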

mateomayer · August 11, 2025, 11:47am

Thanks a lot, I’ll try it sometime next week and post my findings here.

aiflux · September 10, 2025, 2:27pm

How’s your experience been with driver support for native fan control with your Inno3D 5090?

I was looking at some similar RTX 5090 builds for local AI on llamabuilds.ai, and it looked like most builds there prefer the reference NVIDIA RTX 5090 or MSI models?

Pimpcat-AU · September 10, 2025, 10:14pm

How did you resolve the issue with sm_120 support? I tried and couldn’t get it to work. Without it, you’re not going to be able to run any inference on the card.

Deliriousintent · September 19, 2025, 4:44am

sm_120 is only supported in the torch nightly builds. I’m running on dual RTX 5070s.
Make sure you are running CUDA 13.0.1 (CUDA 13.0 Update 1, driver >= 580)
and install the torch nightly build from:

https://download.pytorch.org/whl/nightly/cu130

pip show torch
Name: torch
Version: 2.10.0.dev20250910+cu130
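
A quick way to confirm whether an installed torch build actually targets the card is to compare the device's compute capability with the architectures the build was compiled for. The sketch below uses `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list`; the helper names `arch_supported` and `report` are hypothetical. It only checks for a native `sm_XY` entry and ignores PTX fallback, so treat it as a first diagnostic rather than a definitive test.

```python
def arch_supported(arch_list, capability):
    """Return True if arch_list has a native entry (e.g. 'sm_120') for capability (12, 0)."""
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


def report():
    """Describe whether the installed torch build targets GPU 0, without raising."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    if not torch.cuda.is_available():
        return f"torch {torch.__version__}: no usable CUDA device"
    cap = torch.cuda.get_device_capability(0)   # e.g. (12, 0)
    archs = torch.cuda.get_arch_list()          # e.g. ['sm_90', 'sm_120', ...]
    status = "is in this build" if arch_supported(archs, cap) else "is NOT in this build"
    return f"torch {torch.__version__}: sm_{cap[0]}{cap[1]} {status} (built for {archs})"


if __name__ == "__main__":
    print(report())
```

Run it inside the same venv that the model uses, since each environment can carry a different torch build.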
