I am a DevOps engineer. I know Linux, Kubernetes and troubleshooting, but I do not have experience installing or using AI tools. Where should I start?
Start by treating AI as another engineering tool, not as magic. Your first goal is not to build a foundation model. Your first goal is to run a local AI assistant, understand how prompts work, feed it safe troubleshooting data, validate its output, and connect it to real DevOps scenarios like logs, Kubernetes events and alerts.
The project that drives the whole roadmap
Project: Local AI DevOps Troubleshooting Assistant
You will build a local AI setup that can help summarize Linux logs, Kubernetes events, Prometheus alerts and incident notes. It will not make production changes. It will explain, summarize and suggest validation steps.
Stage 1: Understand the core theory
Model
The model is the AI engine that generates answers. For a DevOps learner, the exact model is less important than learning how to give it clean context and validate the output.
Runtime
The runtime runs the model on your machine or server. Ollama is a simple way to run local models and expose them through a local API.
Prompt
The prompt is your instruction. A weak prompt asks “fix this.” A strong prompt asks for timeline, evidence, hypotheses, validation commands and unsafe actions to avoid.
Context
Context is the data you provide: logs, events, command outputs, alert labels and change history. Better context usually gives better answers.
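To make these ideas concrete, here is a weak prompt next to a strong prompt for the same incident. The wording is illustrative, not a fixed template:

```
Weak:   Fix this. [paste of raw logs]

Strong: Here are the last 50 nginx error log lines and the output of
        systemctl status nginx. Build a timeline of the failure, list
        hypotheses with supporting evidence, suggest read-only validation
        commands, and list unsafe actions to avoid.
```

The strong prompt works better because it pairs a clear instruction with scoped context, which is exactly the skill the rest of this roadmap practices.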
Stage 2: Install local AI with Ollama
Why local AI first? Why not start with an online AI tool?
Local AI gives you a safe learning environment. You can experiment without sending internal logs to an external service. In real companies, policy may still decide what is allowed, but learning locally helps you understand model behavior, prompts, privacy and limitations.
Basic Linux setup
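A minimal sequence, assuming the official install script location and llama3 as an example model (any small local model works):

```bash
# Download and run the official Ollama install script
curl -fsSL https://ollama.com/install.sh | sh

# Download a model to run locally (llama3 is one example)
ollama pull llama3

# Start an interactive chat session with that model
ollama run llama3
```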
What these commands mean
- install.sh installs Ollama on Linux.
- ollama pull downloads a local model.
- ollama run starts an interactive prompt with that model.
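Ollama also serves a local HTTP API, by default on port 11434, which is what tools like Open WebUI connect to. A quick way to confirm it is up, assuming the llama3 model pulled above:

```bash
# Send a one-off prompt to the local Ollama API
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Summarize what a CrashLoopBackOff means in one sentence.",
  "stream": false
}'
```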
Stage 3: Add Open WebUI
A web UI makes the lab easier for learners because they can save conversations, test prompts and explain scenarios visually.
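One common way to run it is as a container next to Ollama. This sketch assumes Docker is installed and Ollama is listening on its default port; check the Open WebUI quick start for the current image and flags:

```bash
# Run Open WebUI and connect it to the local Ollama instance
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```

Then open http://localhost:3000 in a browser and point the UI at your local models.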
Stage 4: Learn the master DevOps troubleshooting prompt
This is the type of prompt that should appear throughout SkillUpWorks content because it teaches safe thinking.
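Here is a sketch of such a prompt; adapt the wording to your environment:

```
You are assisting with a production incident. Using only the data I provide:
1. Build a timeline of events.
2. List the evidence behind each observation.
3. Propose ranked hypotheses for the root cause.
4. Suggest read-only validation commands for each hypothesis.
5. List unsafe actions I should avoid until the cause is confirmed.
If data is missing, tell me what to collect next instead of guessing.
```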
Why do we ask AI to mention unsafe actions?
Because during incidents people are under pressure. A good AI workflow should slow down dangerous assumptions. It should remind the engineer not to delete resources, rotate secrets, restart critical systems or apply manifests unless there is evidence and approval.
Stage 5: Apply the roadmap to Linux troubleshooting
Scenario
Nginx is failing on a Linux server. You collect data first.
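Evidence collection might look like this; every command here is read-only:

```bash
systemctl status nginx                     # current state and last exit code
journalctl -u nginx --since "1 hour ago"   # recent service logs
nginx -t                                   # validate the configuration syntax
ss -tlnp | grep -E ':80|:443'              # is anything listening on the web ports?
tail -n 100 /var/log/nginx/error.log       # recent application-level errors
```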
What AI should help with
- Summarize repeated errors.
- Identify the first visible failure.
- Suggest validation commands.
- Explain possible causes in simple language.
- Draft an incident note.
What AI should not do
- Blindly restart the service without understanding impact.
- Disable firewall rules.
- Delete logs or files.
- Change production configuration without review.
Stage 6: Apply the roadmap to Kubernetes troubleshooting
Scenario
A payment API Pod is stuck in CrashLoopBackOff. You collect evidence before asking AI.
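For example (payments-api-xxxx and payments are hypothetical names; substitute your own Pod and namespace):

```bash
kubectl get pods -n payments                             # overall Pod state and restart counts
kubectl describe pod payments-api-xxxx -n payments       # events, probes and last container state
kubectl logs payments-api-xxxx -n payments --previous    # logs from the crashed container
kubectl get events -n payments --sort-by=.lastTimestamp  # recent events in the namespace
```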
AI prompt
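One possible version, following the same master pattern:

```
A payment API Pod is stuck in CrashLoopBackOff. Below are the describe
output, the previous container logs and recent namespace events. Build a
timeline, list ranked hypotheses with evidence, suggest read-only kubectl
validation commands, and list unsafe actions to avoid. Do not suggest
deleting the Pod or changing manifests until the cause is confirmed.
```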
Stage 7: Understand AIOps, MLOps and LLMOps in your roadmap
| Area | What it means for a DevOps engineer | What to learn |
|---|---|---|
| AIOps | AI-assisted operations: alert correlation, anomaly detection, log analysis and incident support. | Observability, incident workflows, alert quality, safe automation. |
| MLOps | Deploying and managing the machine learning model lifecycle. | Pipelines, model registry, deployment, monitoring, data/version control basics. |
| LLMOps | Operating LLM applications and systems. | Prompts, RAG, vector databases, evaluation, model serving, safety. |
| AI in DevOps | Practical use of AI inside DevOps, SRE and platform work. | Local AI, troubleshooting prompts, Kubernetes AI workloads, safety and interview framing. |
90-day learning roadmap
Interview framing
How would you start learning AI in DevOps as an infrastructure engineer?
I would start with local AI setup using tools such as Ollama and a self-hosted UI like Open WebUI, so I can learn safely without exposing sensitive data. Then I would apply AI to practical DevOps tasks: summarizing Linux logs, explaining Kubernetes events, understanding Prometheus alerts and drafting incident notes. I would treat AI output as a hypothesis, not truth. Every recommendation must be validated with logs, metrics, events, runbooks and human approval before production action.
Continue the SkillUpWorks AI in DevOps path
Use this roadmap with the main AI in DevOps hub, AIOps questions and project-based learning.
Official references
- Ollama Linux documentation
- Open WebUI documentation
- Open WebUI quick start
- Kubernetes GPU scheduling documentation
References are included so learners can verify the local AI and Kubernetes concepts from official or primary project documentation.