Prompt Your Way to Linux: A 4-Part Series
Part 1: Picking Your Distro → Part 2: Storage and Encryption → Part 3: Manual Install → Part 4: Services and GPU
I've installed Linux more times than I can count. Every single time, the distro choice came down to gut feeling or whatever I used last. Here's the thing: if you're building something specific, gut feeling won't cut it. A security brain node that needs Docker, PostgreSQL, local AI models, and packet capture across three encrypted disks? That's an architecture problem, not a vibes problem.
This chapter shows you how to frame the conversation so AI gives you actual architecture, not a Wikipedia summary. You'll feed it your real workload, get back structured reasoning, map your hardware to a storage plan, and catch firmware gotchas before they cost you hours. By the end, you'll have a repeatable method that works for any build, any hardware, any distro.
The First Prompt Matters Most
Here's what most people type into ChatGPT when they're thinking about Linux:
"Which Linux distro should I install?"
And they get back exactly what you'd expect: a grocery list of distributions with surface-level descriptions. Ubuntu is beginner-friendly. Fedora has newer packages. Arch is for people who enjoy suffering. Thanks, incredibly unhelpful.
The problem isn't the AI. It's the prompt. You're asking a generic question and getting a generic answer. The moment you shift from "what should I install" to "here's what this machine needs to do," the conversation changes completely.
Here's the prompt pattern that actually works:
```
I'm setting up a Linux machine for [specific role].
My requirements are: [list workloads].
The machine needs to support: [list services].
What distribution should I consider and why?
```

Simple structure. Massive difference in output quality. You're giving the AI enough context to reason, not just recite.
What Your Build Actually Needs
Before you fire off that prompt, get specific about the role. Don't say "a Linux server." Say what it does. For a security brain node (the build this series walks through), the workload list looks like this:
- Docker containers running multiple security tools and services
- PostgreSQL and Redis for structured data and caching
- Ollama for running local AI models (no cloud dependency)
- Future NVIDIA GPU passthrough for accelerated inference
- Attack artifact storage with proper chain-of-custody separation
- Encrypted multi-disk storage across NVMe, SSD, and HDD
- Long-term stability without constant breakage from bleeding-edge updates
That's not a "which distro" question anymore. That's an architecture question. And when you frame it that way, AI starts thinking about package ecosystems, kernel support timelines, driver availability, and community tooling. Your job is to provide the constraints. AI's job is to reason through them.
How the Reasoning Breaks Down
Feed AI a workload list like that and you won't just get an answer. You'll get the reasoning behind each option. Here's what to expect:
Kali Linux gets the strongest recommendation. It's Debian-based, which means rock-solid package management and wide compatibility. The security tooling comes pre-installed or is one apt install away. Rolling releases keep tools current without the instability of something like Arch. And the community is specifically focused on the kind of work this machine does.
Ubuntu Server lands as the runner-up. Excellent Docker support, huge community, LTS releases. But for a security-focused build, you'd spend your first two days installing tools that Kali ships by default. It's a general-purpose foundation when you need a purpose-built one.
Debian Stable is too conservative. The package versions lag behind what Ollama and newer NVIDIA drivers need. You'd be fighting backports constantly.
Arch Linux has the freshest packages, but rolling release instability on a machine running production databases and Docker services is asking for trouble. One bad pacman -Syu and your PostgreSQL instance is down.
Fedora Server is interesting but introduces RPM-based tooling that doesn't align with the broader Kali/Debian ecosystem most security tools target.
The Key Insight
You won't just get a distro name dropped in your lap. You'll get the reasoning for each option, where each distribution would struggle with your specific workloads, and enough context to make an informed call yourself. That's the difference between asking "which distro" and describing what you're building.
For this build, the choice is Kali. Not because it's trendy or because some YouTube video said so, but because the workload profile maps directly to what Kali is designed for, and the Debian foundation provides the stability Docker and PostgreSQL demand. Your build might point somewhere different. That's the whole point of letting AI reason through it instead of guessing.
The Hardware Discovery Conversation
With the distribution decided, don't reach for the ISO yet. First, find out exactly what hardware you're working with. This is where the AI conversation gets genuinely powerful.
Try this prompt:
```
You are a security expert and are setting up a clean Kali Linux
install. I'm sitting in front of the terminal. What commands do
you want me to run to get you all the specs and hard drives
available to plan this?
```

Read that again. You're not asking "how do I check my CPU." You're telling the AI to design the entire discovery sequence based on what it needs to know for planning. That's a fundamentally different interaction. You're putting AI in the driver's seat for the investigation while you execute the commands and report back.
What AI Asks For (and Why Each Category Matters)
You'll get back a structured list of commands, grouped by category. Not random commands. A deliberate discovery protocol. Here's what to expect:
Firmware mode (ls /sys/firmware/efi or checking BIOS settings): This determines your entire boot strategy. UEFI means GPT partition tables and ESP partitions. Legacy BIOS means MBR and different bootloader configurations. Get this wrong and you're reinstalling.
CPU and RAM (lscpu, free -h): Core count and memory size determine how many Docker containers you can run simultaneously, whether local AI models will fit in memory, and how aggressive your PostgreSQL configuration can be.
Disk inventory (lsblk -o NAME,SIZE,TYPE,ROTA,TRAN,MODEL, fdisk -l): Every disk's size, interface type (NVMe, SATA), and rotational status. This is the foundation of the storage architecture.
Controller modes (BIOS settings, dmesg | grep -i ahci): AHCI vs RAID vs IDE mode affects disk performance and Linux compatibility. Some controllers in RAID mode hide individual disks from the installer.
GPU hardware (lspci | grep -i vga, lspci -nn | grep -i nvidia): Critical for two reasons. First, NVIDIA GPUs can cause installer crashes if not handled properly. Second, knowing the exact GPU model determines which driver version you'll need later.
Network devices (ip link, lspci | grep -i net): You need network access during installation for package downloads. Knowing whether you have Intel, Realtek, or Broadcom networking determines if you'll need firmware packages.
Existing partition tables (fdisk -l, blkid): Any existing data or partition schemes need to be understood before you wipe anything.
SMART health (smartctl -a /dev/sdX): Disk health data tells you which drives are trustworthy for long-term storage and which ones might fail under heavy workloads.
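The categories above can be bundled into a single pass you run once and paste back whole. Here's a minimal sketch; the command selection is one reasonable set, not the exact list AI will produce, and you'll want root for complete SMART output:

```shell
#!/bin/sh
# One-pass hardware discovery: run, then paste the full output into the chat.
# Read-only; nothing here modifies the system.

echo "== Firmware mode =="
[ -d /sys/firmware/efi ] && echo "UEFI" || echo "Legacy BIOS"

echo "== CPU and RAM =="
lscpu 2>/dev/null | grep -E 'Model name|^CPU\(s\)|Thread'
free -h

echo "== Disk inventory =="
lsblk -o NAME,SIZE,TYPE,ROTA,TRAN,MODEL

echo "== GPU and network =="
lspci 2>/dev/null | grep -iE 'vga|3d|nvidia|net'
ip -br link

echo "== Existing partition tables =="
blkid 2>/dev/null

echo "== SMART health (skips missing devices) =="
for d in /dev/sd? /dev/nvme?n1; do
  [ -b "$d" ] && smartctl -H "$d" 2>/dev/null
done
```

The `2>/dev/null` redirects keep the output pasteable even on systems missing a tool or two, so one absent command doesn't derail the whole pass.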
The Workflow Pattern
Here's how this plays out in practice:
- AI gives you a batch of commands
- You run them in the terminal
- You paste the output back into the chat
- AI interprets the results and asks follow-up questions
- Repeat until the full picture emerges
You don't need to understand what every line of lspci output means. AI reads it, flags what's relevant, and tells you what it implies for your build. Think of it as collaborative troubleshooting: you're the hands, AI is the analyst. That division of labor works because AI can process dense technical output faster than most humans can read it.
AI Interprets Your Hardware
Once you've run through the discovery commands, here's the kind of inventory you'll be working with (this is the exact hardware for this series' build):
- NVMe SSD, 1TB: Samsung 970 EVO Plus, ~3,500 MB/s sequential read
- SATA HDD, 1TB: Western Digital Blue, 5400 RPM, mechanical
- SATA SSD, 128GB: Older Kingston, decent but small, SMART showing elevated temperature events
- GPU: Intel integrated + NVIDIA GeForce (hybrid/Optimus configuration)
- RAM: 32GB DDR4
- CPU: Intel i7, 8 cores / 16 threads
Raw specs are just numbers. What matters is how AI maps them to your workload. Paste your hardware output back into the chat and watch it derive a storage architecture based on how each disk's characteristics match the requirements you described earlier.
Three-Tier Storage Architecture
For a multi-disk build, AI will propose a tiered system where each disk serves the workloads it's best suited for:
Tier 1: NVMe (the workhorse) for everything that needs speed. The operating system, Docker container storage, PostgreSQL databases, Redis data, and Ollama AI models. These workloads generate heavy random I/O, and NVMe handles that without breaking a sweat. This disk gets LUKS encryption with LVM for flexible partition management.
Tier 2: SATA SSD (the hot workspace) for active analysis. When you're working a case, you need fast access to extracted samples, temporary tooling output, and in-progress data. The 128GB SSD provides SSD-speed access without polluting the primary NVMe with transient files. Also LUKS-encrypted, mounted as a dedicated workspace.
Tier 3: SATA HDD (the cold archive) for long-term retention. Packet captures, forensic exports, evidence archives, and anything that needs to exist but doesn't need fast access. The mechanical drive is perfect here: big, cheap, and reliable for sequential writes. LUKS-encrypted with a separate key.
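As a preview of what Part 2 formalizes, the three tiers might be sketched like this with `cryptsetup` and LVM. The device names (`/dev/nvme0n1p3`, `/dev/sda1`, `/dev/sdb1`) are placeholder assumptions; substitute your own from `lsblk`. Treat this as an illustration, not a script to paste — `luksFormat` destroys whatever is on the target partition:

```shell
# DESTRUCTIVE if run against real devices -- illustration only.
# Device paths are placeholders; take the real ones from lsblk.

# Tier 1: NVMe root -- LUKS container with LVM inside for flexible volumes
cryptsetup luksFormat /dev/nvme0n1p3
cryptsetup open /dev/nvme0n1p3 cryptroot
pvcreate /dev/mapper/cryptroot
vgcreate vg0 /dev/mapper/cryptroot

# Tier 2: SATA SSD hot workspace -- its own independent LUKS container
cryptsetup luksFormat /dev/sda1
cryptsetup open /dev/sda1 workspace
mkfs.ext4 /dev/mapper/workspace

# Tier 3: HDD cold archive -- separate passphrase/key so tiers unlock independently
cryptsetup luksFormat /dev/sdb1
cryptsetup open /dev/sdb1 archive
mkfs.ext4 /dev/mapper/archive
```

The design point is independence: each tier is its own LUKS container with its own key, so a compromise or failure of one disk doesn't expose or block the others.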
"Don't put your databases on spinning rust, and don't waste NVMe bandwidth on files you open twice a year. Match the storage tier to the access pattern."
Why the Small SSD Gets Rejected as Root Disk
Here's a trap you might walk into: using a smaller SSD as the root disk to keep the NVMe "free" for data. Sounds logical. AI will push back hard on this, and here's why.
Docker images and container volumes alone can consume 40-60GB on a security workstation. Add PostgreSQL data directories, Ollama model files (which can be 4-8GB each), and system packages, and you're looking at 80-100GB minimum for a comfortable root partition. On a 128GB disk, that leaves almost no headroom for growth. One large Docker pull and you're at 95% capacity.
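If you want to sanity-check those numbers on a machine you already run, a few read-only commands show where root-partition space actually goes. The paths assume default Docker, PostgreSQL, and Ollama locations; adjust if you've relocated data directories:

```shell
# Where does root-partition space actually go on a container-heavy box?
# All read-only; missing tools/paths are skipped quietly.
docker system df 2>/dev/null            # image, volume, and container footprint
du -sh /var/lib/docker 2>/dev/null      # raw on-disk Docker usage
du -sh /var/lib/postgresql 2>/dev/null  # database data directories
du -sh "$HOME/.ollama/models" 2>/dev/null  # local AI models, often 4-8GB each
df -h /                                 # overall root headroom
```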
SMART data adds another concern. If the SSD has logged thermal throttling events, it's not failing, but it's not the disk you want as your system root either.
The NVMe is the obvious primary. Faster, larger, healthier, and built for exactly the kind of mixed random/sequential workload a root partition with Docker and databases generates. Don't overthink it.
The Firmware Check That Saves Hours
This is where the discovery process pays for itself before a single byte hits disk.
One of the early discovery commands might reveal that your machine is running in Legacy BIOS mode. The machine works fine in legacy mode. But for an encrypted multi-disk build, it's wrong. This is exactly the kind of thing AI catches that you might not think to check. Here's why it matters:
Legacy BIOS + MBR limits you to four primary partitions per disk. For a three-disk encrypted build with LVM, that's a real constraint. You end up using extended partitions and logical volumes in ways that add unnecessary complexity.
UEFI + GPT removes the partition count limitation, supports larger disk sizes natively, and provides a cleaner boot process. For an encrypted multi-volume build, GPT is simply the right foundation.
Your next move if AI flags this: three firmware changes before installation.
- Switch to UEFI mode in BIOS settings
- Disable Secure Boot (Kali's installer can handle Secure Boot, but it adds friction during initial setup and driver installation, especially with NVIDIA)
- Verify AHCI mode for SATA controllers (might already be set, but confirm it)
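After making those changes and rebooting into a live environment, a quick read-only check confirms they stuck. The AHCI grep is one heuristic among several, not definitive on every controller:

```shell
# Confirm firmware changes after reboot -- read-only checks.
if [ -d /sys/firmware/efi ]; then
  echo "Boot mode: UEFI"
else
  echo "Boot mode: Legacy BIOS -- revisit firmware settings"
fi

# Look for AHCI in kernel messages or driver bindings as a sanity check.
dmesg 2>/dev/null | grep -qi ahci && echo "AHCI: detected in dmesg" \
  || lspci -k 2>/dev/null | grep -qi ahci && echo "AHCI: detected via lspci" \
  || echo "AHCI: no evidence found; double-check BIOS settings"
```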
This is the kind of problem you don't know to look for until it bites you. Catching it during discovery, before the USB drive is even flashed, saves real time and frustration. The firmware switch takes five minutes in BIOS settings. Discovering the issue mid-installation means starting over from scratch.
Your Repeatable Method
Here's what this entire process comes down to: a methodology you can use for any build, any hardware, any distribution. Pin this.
Step 1: Describe your goals with specificity. Not "I want Linux" but "I need a system that runs these services, handles these workloads, and stores this type of data." The more specific you are, the better AI can reason about your options.
Step 2: Let AI design the discovery. Don't Google individual commands. Tell AI what role it's playing and ask it to design the investigation. It'll ask for things you wouldn't have thought to check.
Step 3: Execute and report back. Run the commands, paste the output. You're the hands; AI is the analyst.
Step 4: Let AI interpret against your goals. Raw specs are meaningless without context. A 128GB SSD is fine for a media server root partition. It's dangerously small for a Docker-heavy security workstation. AI maps hardware to workload requirements and flags mismatches you'd miss.
Step 5: Iterate until the picture is complete. Discovery is rarely one round. AI will ask follow-up questions. "What does smartctl show for that older SSD?" or "Is the NVIDIA GPU the primary display adapter?" Each round refines the plan.
"You don't need to memorize Linux commands. You need to know what you're building. AI handles the translation between goals and implementation."
This methodology works whether you're setting up a home media server, a development workstation, a network appliance, or a penetration testing rig. The commands change. The pattern stays the same.
Now you've got a distro locked in, a full hardware inventory, a tiered storage plan, and clean firmware settings. Everything from here forward is execution. Part 2 takes all of this and turns it into an actual partition scheme: encrypted volumes, LVM layout, mount points, and the prep script that validates everything before the installer touches a disk. That's where the build gets real.
Next up: Part 2: Storage and Encryption covers encrypted volume design, LVM layout, and the safety-check script that validates your hardware before installation begins. Bring your terminal.