What are DevOps practices?

DevOps practices are engineering habits that connect development, operations, security and reliability. They include version control, CI/CD, infrastructure as code, automated configuration, containerization, Kubernetes operations, observability, incident response, security controls and continuous improvement.

Which DevOps skill should a beginner learn first?

A beginner should start with Linux fundamentals, shell commands, networking basics, Git and basic scripting. These skills make Kubernetes, cloud, Terraform, CI/CD and troubleshooting easier to understand later.

How do I practice DevOps in a real way?

Practice by building small end-to-end projects: deploy an app, store code in Git, build a container image, run it locally, automate infrastructure with Terraform, configure it with Ansible, deploy to Kubernetes or OpenShift, monitor it, break it intentionally and troubleshoot it.

Is DevOps only CI/CD?

No. CI/CD is one part of DevOps. Real DevOps also includes Linux operations, cloud architecture, containers, Kubernetes, Infrastructure as Code, configuration management, observability, reliability, incident response, security, cost awareness and production troubleshooting.

DevOps Practices: Tutor-Style Complete Guide for Engineers

Before we start: what is DevOps really?

Student

I keep hearing DevOps everywhere. Some people say it is Jenkins. Some say Kubernetes. Some say cloud. What is DevOps in real work?

Teacher

DevOps is not one tool. DevOps is a way of building, releasing, operating and improving software with fewer handover gaps between development, operations, security and reliability teams. Tools help, but the real practice is about repeatability, automation, visibility, safety and fast recovery.

A good DevOps engineer should understand how an application moves from source code to production, how infrastructure is created, how deployments are automated, how systems are monitored, and how incidents are handled when something breaks.

Simple definition: DevOps practices help teams deliver changes safely, operate systems reliably, and recover quickly when failures happen.

1. Linux and networking foundation
2. Git and collaboration
3. Containers and image practices
4. CI/CD practices
5. Infrastructure as Code
6. Configuration management
7. Kubernetes and OpenShift
8. GitOps and Argo CD
9. Cloud practices
10. Observability and SRE
11. DevSecOps practices
12. AIOps and AI in DevOps

The end-to-end DevOps flow

If you want to understand DevOps clearly, imagine one small application moving from a developer laptop to production.

Code is written and stored in Git.
Branches, pull requests, reviews and commit history create collaboration and traceability.

CI pipeline builds and tests the application.
Unit tests, lint checks, security scans and image builds run automatically.

Infrastructure is created using code.
Terraform provisions cloud resources such as networks, compute, managed databases or Kubernetes clusters.

Configuration is automated.
Ansible or platform automation configures servers, packages, files, users and application settings.

Application is packaged into a container.
Docker or Podman creates a repeatable image with the application and its runtime dependencies.

Application is deployed to Kubernetes or OpenShift.
Deployments, Services, Routes/Ingress, Secrets, ConfigMaps, probes and autoscaling support runtime operations.

GitOps keeps desired state under control.
Argo CD can sync Kubernetes manifests from Git and detect drift.

Observability watches the system.
Metrics, logs, traces, dashboards and alerts help engineers understand behavior.

Incidents are handled with SRE thinking.
Engineers triage, mitigate, communicate, write postmortems and improve reliability.

1. Linux and networking practices

Student

Why do you always say Linux is the base? Can I directly learn Kubernetes and cloud?

Teacher

You can start Kubernetes directly, but production troubleshooting will become difficult. Containers run on Linux. Kubernetes nodes are Linux machines in most environments. Logs, processes, ports, DNS, filesystems, permissions and system services all come back to Linux fundamentals.

Process practice

Understand how to inspect running processes, CPU usage, memory usage and service status. In production, you often start with questions like: is the process running, is it stuck, is it consuming too many resources?

    ps aux | head
    ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%cpu | head
    systemctl status nginx --no-pager
    journalctl -u nginx --since "30 minutes ago" --no-pager

Networking practice

DevOps troubleshooting often starts with connectivity. You should know how to test DNS, ports, routes, listening services and TLS behavior.

    ss -tulpn
    curl -vk https://example.com
    dig example.com
    ip route
    traceroute example.com
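A common first connectivity question is whether anything is listening on the expected port at all. The sketch below answers that from `ss -tln` output. The helper name `is_listening` is hypothetical, and the function assumes the local address sits in the fourth column, as it does in typical `ss -tln` output.

```shell
# is_listening PORT: reads `ss -tln` output on stdin and succeeds
# if any LISTEN socket is bound to PORT. (Hypothetical helper name.)
is_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            n = split($4, parts, ":")   # works for 0.0.0.0:80 and [::]:80
            if (parts[n] == port) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Usage could look like `ss -tln | is_listening 8080 && echo "port 8080 is bound"`. If the port is not bound, the problem is on the host itself (service down, wrong bind address) rather than in DNS, routing or a firewall.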

Storage practice

Disk full issues are still common. Learn filesystem usage, inode usage, mount points and log growth.

    df -h
    df -i
    du -sh /var/log/* | sort -h
    lsblk
    mount | column -t
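The same checks can be turned into a small automated guard. This is a minimal sketch, assuming POSIX-format `df -P` output (capacity in column 5, mount point in column 6); the function name `full_filesystems` is hypothetical, and mount points containing spaces are not handled.

```shell
# full_filesystems THRESHOLD: reads `df -P` output on stdin and prints
# "mountpoint percent" for every filesystem at or above THRESHOLD percent.
full_filesystems() {
    awk -v limit="$1" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)           # "91%" -> "91"
            if (use + 0 >= limit + 0) print $6, use
        }
    '
}
```

For example, `df -P | full_filesystems 80` lists filesystems that are at least 80% full, and `df -Pi | full_filesystems 80` applies the same check to inode usage.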

Security practice

Linux permissions, users, groups, SSH keys and sudo access matter in every DevOps environment.

    id
    ls -l /var/www
    sudo -l
    chmod 640 app.conf
    chown appuser:appgroup app.conf

Interview framing: If an application is down, do not say “I will restart it first.” A stronger answer is: “I will check service status, logs, port binding, recent changes, resource pressure, dependencies and then decide the safest mitigation.”


2. Git and collaboration practices

Student

Git looks simple: add, commit, push. What are real DevOps Git practices?

Teacher

In DevOps, Git is not only source control. Git becomes the audit trail for code, infrastructure, Kubernetes manifests, pipeline definitions and sometimes runbooks. A good Git practice makes changes reviewable and recoverable.

Practice	Why it matters	Example
Small commits	Easier review and rollback	One change per commit instead of mixing app, infra and pipeline changes
Pull requests	Peer review and discussion	Terraform change reviewed before apply
Branch protection	Prevents accidental direct production changes	Main branch requires approvals and checks
Meaningful commit messages	Useful during incident review	“fix: increase readiness probe delay for model API”
Tagging releases	Trace deployment versions	v1.4.2 deployed to production

    git checkout -b fix/readiness-probe
    git status
    git diff
    git add deployment.yaml
    git commit -m "fix: tune readiness probe for slow startup"
    git push origin fix/readiness-probe
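Meaningful commit messages are easier to keep up when a hook or CI step checks them. The sketch below validates the `type: summary` style used in the examples above. The function name and the list of accepted types are assumptions for illustration; adjust them to whatever convention your team follows.

```shell
# is_conventional_subject "SUBJECT": succeeds when the subject line looks like
# "type: summary" or "type(scope): summary". The accepted types are an
# assumption for this sketch.
is_conventional_subject() {
    printf '%s\n' "$1" |
        grep -Eq '^(feat|fix|docs|chore|refactor|test|ci)(\([a-z0-9-]+\))?: .+'
}
```

A `commit-msg` hook could call it with the first line of the message file and reject the commit when it returns non-zero.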

Real practice: If infrastructure and application changes are in Git, an incident responder can check what changed recently instead of guessing blindly.

3. Container and image practices

Student

Dockerfile works on my laptop. Is that enough?

Teacher

For learning, maybe. For production, no. A good container image should be small, secure, repeatable, non-root where possible, and should separate build-time dependencies from runtime dependencies.

Good image practice

Use a clear base image.
Pin important versions where needed.
Do not store secrets in images.
Run as non-root where possible.
Use health checks or Kubernetes probes.
Keep image layers clean and small.

Bad image practice

Copying the entire local directory blindly.
Running SSH inside every container.
Using the latest tag everywhere without control.
Embedding passwords or tokens.
Installing unnecessary debugging tools in production images.

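    FROM python:3.12-slim
    WORKDIR /app
    # Install dependencies first so this layer stays cached until requirements.txt changes
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    # Copy the application code; use a .dockerignore to keep secrets and local files out
    COPY . .
    # Create and switch to a non-root user
    RUN useradd -r appuser
    USER appuser
    EXPOSE 8080
    CMD ["python", "app.py"]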
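Some of the good and bad practices above can be checked mechanically before an image is ever built. This is a rough sketch, not a replacement for a real linter or image scanner: the function name `lint_dockerfile` is hypothetical and the three checks are deliberately simple.

```shell
# lint_dockerfile FILE: prints one warning per finding and returns non-zero
# if anything was flagged. The checks are illustrative, not exhaustive.
lint_dockerfile() {
    file="$1"
    status=0
    if ! grep -Eq '^USER[[:space:]]+' "$file"; then
        echo "warn: no USER instruction, image may run as root"
        status=1
    fi
    if grep -Eq '^FROM[[:space:]]+[^[:space:]]+:latest' "$file"; then
        echo "warn: base image uses the latest tag"
        status=1
    fi
    if grep -Eiq '^(ENV|ARG)[[:space:]]+[A-Z_]*(PASSWORD|TOKEN|SECRET)' "$file"; then
        echo "warn: possible secret in ENV or ARG"
        status=1
    fi
    return "$status"
}
```

Running `lint_dockerfile Dockerfile` as an early CI step gives fast feedback, while a proper image scan later in the pipeline covers what simple pattern matching cannot.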


4. CI/CD practices

Student

Is CI/CD just Jenkins pipeline?

Teacher

No. Jenkins is one tool. GitHub Actions, GitLab CI and other systems can also do CI/CD. The practice is to automate build, test, security checks, packaging and deployment in a controlled way.

Continuous Integration: every code change is built and tested early.

Artifact creation: create a container image or package with version tags.

Quality gates: run tests, linting, dependency checks and image scans.

Deployment automation: push changes to each environment, with approval gates where required.

Rollback planning: every deployment should have a recovery path.

Example pipeline thinking

    stages:
      - checkout
      - unit_test
      - build_image
      - scan_image
      - push_image
      - deploy_to_dev
      - approval
      - deploy_to_prod
      - smoke_test
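The key property of a staged pipeline is that a failure stops everything after it. Here is a minimal shell sketch of that behavior; it is not how Jenkins, GitHub Actions or GitLab CI are implemented, and the stage functions are placeholders.

```shell
# run_stages STAGE...: runs each stage function in order and stops at the
# first failure, so later stages (such as deploy) never run on a bad build.
run_stages() {
    for stage in "$@"; do
        echo "==> $stage"
        if ! "$stage"; then
            echo "stage failed: $stage" >&2
            return 1
        fi
    done
    echo "pipeline succeeded"
}

# Placeholder stages for illustration; real ones would call your build tools.
checkout()    { echo "fetching source"; }
unit_test()   { echo "running tests"; }
build_image() { echo "building image"; }
```

Calling `run_stages checkout unit_test build_image` runs the three placeholders in order. If `unit_test` returned non-zero, `build_image` would never start, which is exactly the gate a real CI system enforces between stages.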

Production trap: A pipeline that deploys fast but has no rollback, no smoke test, no approval and no observability is not mature automation. Speed without safety creates incidents.


5. Infrastructure as Code with Terraform

Student

Why should infrastructure be written as code? Cloud console is faster sometimes.

Teacher

Manual cloud console work may be fast once, but it is hard to repeat, review, audit and recover. Terraform turns infrastructure into version-controlled code, so every change can be reviewed as a plan before it is applied, environments can be kept consistent, and a bad change can be recovered by reverting the code and applying again.

Terraform practices

Use a remote backend with state locking for shared state.
Protect state files because they may contain sensitive values.
Review terraform plan before apply.
Use modules for repeatable infrastructure.
Separate environments carefully.
Avoid manual drift from console changes.

Typical resources

VPC/VNet and subnets
Security groups / NSGs
Compute instances
Kubernetes clusters
Load balancers
Managed databases
IAM roles and policies

    terraform init
    terraform fmt
    terraform validate
    terraform plan -out=tfplan
    terraform apply tfplan
    terraform state list
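Plan review can be partly automated. The sketch below reads plain-text plan output and blocks when anything would be destroyed. It assumes the usual `Plan: X to add, Y to change, Z to destroy.` summary line and uncolored output (for example from `terraform plan -no-color`); the function names and the "no destroys without approval" policy are hypothetical.

```shell
# destroy_count: reads plain-text `terraform plan` output on stdin and prints
# the number from "N to destroy" in the "Plan:" summary line (0 if absent).
destroy_count() {
    n=$(sed -n 's/.*Plan:.* \([0-9][0-9]*\) to destroy.*/\1/p' | head -n 1)
    echo "${n:-0}"
}

# guard_plan: fails when the plan would destroy anything, so a pipeline can
# require manual approval before apply. (Hypothetical policy for this sketch.)
guard_plan() {
    count=$(destroy_count)
    if [ "$count" -gt 0 ]; then
        echo "plan destroys $count resource(s); manual approval required" >&2
        return 1
    fi
}
```

In a pipeline this might run as `terraform plan -no-color | guard_plan`. Parsing human-readable output is fragile, so for anything serious the machine-readable plan from `terraform show -json tfplan` is likely the safer input.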

Interview framing: A strong answer should mention state, remote backend, locking, drift, modules, plan review, secrets handling and CI/CD integration.


6. Configuration management with Ansible

Student

If Terraform creates servers, why do we need Ansible?

Teacher

Terraform is mainly for provisioning infrastructure resources. Ansible is often used to configure operating systems, packages, services, files, users and application settings. In simple words: Terraform creates the machine; Ansible prepares the machine.

    ---
    - name: Configure web server
      hosts: web
      become: yes
      tasks:
        - name: Install nginx
          package:
            name: nginx
            state: present
        - name: Ensure nginx is running
          service:
            name: nginx
            state: started
            enabled: yes

Good Ansible practice

Use idempotent tasks, roles, inventories, variables and clear handlers. Do not write playbooks that blindly run shell commands for everything.
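Idempotent means that running the same task twice leaves the system in the same state as running it once. The shell sketch below illustrates only the idea; it is not how Ansible modules are implemented, and the function name `ensure_line` is hypothetical. It reports `changed` or `ok` in the way Ansible task results do.

```shell
# ensure_line FILE LINE: appends LINE to FILE only if it is not already
# present as a whole line, and reports whether anything changed.
ensure_line() {
    if [ -f "$1" ] && grep -Fxq -- "$2" "$1"; then
        echo "ok"
    else
        printf '%s\n' "$2" >> "$1"
        echo "changed"
    fi
}
```

Compare this with a bare `echo "line" >> file`, which adds a duplicate on every run. That difference is why playbooks built from raw shell commands tend to drift, while module-based tasks can be re-run safely.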

Production caution

Test changes in a lower environment. Use check mode when possible. Be careful with service restarts and configuration templates.


7. Kubernetes and OpenShift practices

Student

Kubernetes has too many objects. Which practices matter most?

Teacher

Start with the runtime path: Pod, Deployment, Service, Ingress or Route, ConfigMap, Secret, PVC, probes, resources, logs and events. Then learn scheduling, RBAC, network policies, autoscaling and troubleshooting.

Area	Kubernetes practice	OpenShift angle
Deployment	Use Deployments, StatefulSets, probes and controlled rollouts	Prefer Deployment; DeploymentConfig is deprecated in recent OpenShift releases but still appears in older environments
Networking	Use Services and Ingress carefully	Routes are commonly used for external access
Security	RBAC, Secrets, network policies, non-root containers	SCCs and OpenShift security defaults matter
Storage	PVCs, StorageClasses, backup planning	Cluster storage integration and permissions are important
Troubleshooting	Describe, logs, events, rollout history, endpoints	Also check Routes, SCC, BuildConfig/ImageStream if used

CrashLoopBackOff troubleshooting example

    kubectl get pods -n app
    kubectl describe pod payment-api-xxxxx -n app
    kubectl logs payment-api-xxxxx -n app --previous
    kubectl get events -n app --sort-by=.lastTimestamp
    kubectl rollout history deployment/payment-api -n app

Production rule: Do not delete Pods repeatedly without reading previous logs and events. The restart itself may hide evidence.
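Before reading logs, it helps to narrow down which Pods actually need attention. This sketch filters default `kubectl get pods` output, assuming the usual NAME, READY, STATUS, RESTARTS, AGE column order; the function name `unhealthy_pods` is hypothetical.

```shell
# unhealthy_pods MAX_RESTARTS: reads `kubectl get pods` output on stdin and
# prints "name status restarts" for pods that are in CrashLoopBackOff or have
# restarted more than MAX_RESTARTS times.
unhealthy_pods() {
    awk -v max="$1" '
        NR > 1 && ($3 == "CrashLoopBackOff" || $4 + 0 > max + 0) {
            print $1, $3, $4
        }
    '
}
```

For example, `kubectl get pods -n app | unhealthy_pods 5` produces a short list of candidates for `kubectl describe` and `kubectl logs --previous`.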


8. GitOps and Argo CD practices

Student

What problem does GitOps solve if I already have a pipeline?

Teacher

Traditional pipelines often push changes into clusters. GitOps changes the model: Git becomes the desired state, and tools like Argo CD continuously compare the cluster state with Git state. This helps with drift detection, auditability and controlled sync.

GitOps benefits

Git is the source of truth.
Drift is visible.
Rollback can be Git-based.
Cluster changes are reviewable.
Environment differences can be tracked.

GitOps cautions

Do not store plain secrets in Git.
Be careful with auto-sync in production.
Separate app and platform responsibilities.
Understand prune and self-heal behavior.
Review changes before production sync.
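Drift detection is, at its core, a comparison between what Git says and what is actually running. The sketch below shows that idea with two files. It is a conceptual illustration only, not how Argo CD works internally, and the function name `detect_drift` is hypothetical.

```shell
# detect_drift DESIRED LIVE: compares a manifest from Git with an export of
# the live object. Prints a unified diff and returns 1 when they differ.
detect_drift() {
    if diff -u "$1" "$2"; then
        echo "in sync"
    else
        echo "drift detected" >&2
        return 1
    fi
}
```

Against a real cluster you would more likely use `kubectl diff -f deployment.yaml` or `argocd app diff`, which account for server-side defaults that a plain file comparison would report as noise.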


9. Cloud DevOps practices: AWS, Azure and GCP

Student

Should I learn one cloud or all clouds?

Teacher

Start with one cloud deeply. But understand cloud patterns that repeat everywhere: IAM, networking, compute, storage, managed Kubernetes, load balancing, monitoring, automation, backups and cost control. Once fundamentals are clear, moving between AWS, Azure and GCP becomes easier.

AWS practice areas

IAM roles and least privilege
VPC, subnets, route tables
EC2, ALB, S3, RDS
EKS and CloudWatch
Terraform automation


Azure practice areas

Resource groups
VNets and NSGs
VMs, Storage, Azure SQL
AKS and Monitor
Identity and RBAC


GCP practice areas

Projects and IAM
VPC networks
Compute Engine and GCS
GKE and Cloud Logging
Service accounts


Cloud practice: Always connect cloud learning to real systems: DNS, TLS, load balancers, firewall rules, autoscaling, logging, backups, IAM and cost.

10. Observability and SRE practices

Student

Monitoring means dashboards, right?

Teacher

Dashboards are useful, but observability is deeper. You should be able to understand the internal state of the system using metrics, logs, traces, events and user impact signals. SRE adds reliability thinking: SLIs, SLOs, error budgets, incident response and postmortems.

Signal	What it tells you	Example
Metrics	Numeric time-series behavior	CPU usage, request rate, error rate, latency
Logs	Detailed event records	Application errors, stack traces, auth failures
Traces	Request path across services	Which service made a request slow
Events	System lifecycle changes	Kubernetes scheduling, pod restarts, image pull errors
SLIs/SLOs	Reliability indicators and targets	99.9% of requests succeed in under 300 ms
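An SLO becomes practical once it is turned into an error budget. The arithmetic is simple: the budget is the fraction `(100 - SLO) / 100` applied to the total. The sketch below shows both a request-based and a time-based version; the function names, the 30-day window and the traffic figure are assumptions for illustration.

```shell
# allowed_failures SLO_PERCENT TOTAL_REQUESTS: prints how many requests may
# fail in the window before the SLO is breached (rounded to a whole number).
allowed_failures() {
    awk -v slo="$1" -v total="$2" 'BEGIN {
        printf "%.0f\n", (100 - slo) / 100 * total
    }'
}

# error_budget_minutes SLO_PERCENT WINDOW_DAYS: prints the allowed minutes of
# full unavailability in the window, to one decimal place.
error_budget_minutes() {
    awk -v slo="$1" -v days="$2" 'BEGIN {
        printf "%.1f\n", (100 - slo) / 100 * days * 24 * 60
    }'
}
```

With a 99.9% target, `allowed_failures 99.9 1000000` prints 1000 and `error_budget_minutes 99.9 30` prints 43.2. When most of that budget is already spent, risky releases should slow down until reliability recovers.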

Incident response practice

Confirm customer impact and severity.

Check dashboards, alerts, recent deployments and infrastructure changes.

Mitigate safely before deep root cause analysis if impact is high.

Communicate status clearly.

Write post-incident notes and prevent recurrence.


11. DevSecOps and production safety practices

Student

Security is a separate team, right? Why should DevOps learn it?

Teacher

Security teams define standards, but DevOps engineers implement many controls: IAM, secrets, network rules, image scanning, pipeline permissions, Kubernetes RBAC, TLS, audit logs and deployment approvals. If DevOps ignores security, automation can spread mistakes very fast.

Security practices to learn

Least privilege IAM and RBAC
Secrets management
Container image scanning
Dependency scanning
Network policies and firewall rules
TLS certificate handling
Audit logging

Unsafe practices to avoid

Hardcoding secrets in Git
Using admin credentials in pipelines
Running all containers as root
Opening broad network rules
Skipping approvals for production
Disabling security checks to deploy faster

Production warning: A fast pipeline with powerful credentials is dangerous. Treat pipeline identities like production users and restrict what they can do.
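Hardcoded secrets are easiest to stop before they reach Git. This is a rough sketch of a pre-commit style check. The function name `scan_for_secrets` is hypothetical and the patterns (an AWS-style access key ID, simple `password =` style assignments, and private key headers) are illustrative only; dedicated secret scanners cover far more cases with fewer false positives.

```shell
# scan_for_secrets FILE...: prints file:line:match for suspicious lines and
# returns 1 if any were found. Patterns are illustrative, not exhaustive.
scan_for_secrets() {
    if grep -EinH \
        -e 'AKIA[0-9A-Z]{16}' \
        -e '(password|passwd|secret|token)[[:space:]]*[:=][[:space:]]*[^[:space:]]+' \
        -e '-----BEGIN [A-Z ]*PRIVATE KEY-----' \
        -- "$@"; then
        return 1
    fi
}
```

A pipeline step could run it over changed files and fail the build on a non-zero exit. The same step should itself run with a restricted identity, since it reads every file in the repository.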

12. AIOps and AI in DevOps practices

Student

Where does AI fit in DevOps? Is it replacing engineers?

Teacher

No serious team should treat AI as a replacement for engineering judgment. AI is useful as an assistant for summarizing logs, explaining alerts, drafting incident timelines, searching runbooks and generating hypotheses. The engineer still validates everything using real signals.

Safe AI use cases

Summarize Linux logs
Explain Kubernetes events
Draft incident updates
Suggest validation commands
Compare alert context
Generate interview practice scenarios

Unsafe AI use cases without approval

Deleting resources
Changing firewall rules
Applying Terraform changes
Rotating secrets
Restarting production services blindly
Auto-remediating without guardrails

Example AI troubleshooting prompt

You are assisting with a production DevOps incident. Analyze the following logs, metrics and deployment history.

Return:
1. Timeline
2. First visible failure
3. Repeated errors
4. Possible causes
5. Validation commands
6. Unsafe actions to avoid

Do not assume root cause unless evidence is clear.
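One way to enforce the safe and unsafe split above is to let an assistant run only commands from a read-only allowlist and send everything else to a human. The sketch below is a deliberately small example of such a guardrail. The function name `is_read_only` and the allowlist are assumptions, and a real guardrail would need much stricter parsing than string patterns.

```shell
# is_read_only "COMMAND": succeeds only for commands on a small read-only
# allowlist. Anything else must go to a human for approval.
is_read_only() {
    # Reject chaining, substitution and redirection outright.
    case "$1" in
        *";"* | *"&"* | *"|"* | *'$('* | *'`'* | *">"*) return 1 ;;
    esac
    case "$1" in
        "kubectl get "* | "kubectl describe "* | "kubectl logs "*) return 0 ;;
        "terraform plan"* | "terraform show"* | "terraform validate"*) return 0 ;;
        "journalctl "* | "df "* | "ss "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With this in place, `is_read_only "kubectl get pods -n app"` succeeds, while `is_read_only "kubectl delete pod payment-api -n app"` and `is_read_only "terraform apply tfplan"` fail and would be queued for approval instead of being executed.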


How all SkillUpWorks topics connect together

DevOps is not learned as isolated tools. Each topic supports the others.

SkillUpWorks topic	Real production purpose	Practice link
Linux	Operating system, process, logs, files, services	Linux questions
Linux Networking	DNS, ports, routing, connectivity, TLS troubleshooting	Networking questions
Bash scripting	Automation, checks, small operational tools	Bash questions
Docker	Container images and local runtime behavior	Docker questions
Kubernetes	Container orchestration and production application runtime	Kubernetes questions
OpenShift	Enterprise Kubernetes platform with Routes, SCC, Operators	OpenShift questions
Terraform	Infrastructure as Code and cloud provisioning	Terraform questions
Ansible	Configuration management and automation	Ansible questions
Jenkins	CI/CD pipelines and release automation	Jenkins questions
GitOps/Argo CD	Git-based Kubernetes deployment and drift control	Argo CD questions
AWS/Azure/GCP	Cloud infrastructure, IAM, networking, compute, managed services	AWS / Azure / GCP
Observability	Metrics, logs, traces, dashboards and alerts	Observability questions
SRE	Reliability, incidents, SLOs, error budgets	SRE questions
AIOps	AI-assisted operations, alerting and troubleshooting	AIOps questions

A practical DevOps project every learner should build

Student

I understand the topics separately. How do I practice them together?

Teacher

Build one end-to-end project. Do not only read. Create a small app and move it through the full DevOps lifecycle.

Create a simple web application and push it to Git.

Write a Dockerfile and run the app locally.

Create a CI pipeline to test and build the image.

Provision cloud infrastructure using Terraform.

Use Ansible for any VM or server configuration.

Deploy the app to Kubernetes or OpenShift.

Expose it using Service and Ingress/Route.

Add ConfigMaps, Secrets, resource limits and probes.

Set up logs, metrics, alerts and dashboards.

Break the app intentionally and troubleshoot it.

Write incident notes and improve the design.

Practice explaining the project in an interview.

Practice DevOps the SkillUpWorks way

SkillUpWorks is built for engineers who want practical interview preparation, deep technical answers, real troubleshooting thinking, AI-assisted learning and project-based DevOps practice.


Free pages help you learn the concept. Full access helps you practice more questions, deeper answers, projects and interview scenarios.

Common DevOps mistakes beginners should avoid

Tool-first learning

Learning commands without understanding systems creates shallow knowledge. Learn why a tool exists, what problem it solves and how it fails.

Ignoring Linux basics

Kubernetes, containers and cloud still depend on operating system fundamentals. Do not skip logs, processes, DNS, ports and filesystems.

No troubleshooting practice

Only deploying happy-path labs is not enough. Break things and learn how to recover.

No security thinking

Secrets, permissions, IAM and network exposure are part of DevOps. Security cannot be an afterthought.

No observability

If you cannot see system behavior, you cannot operate it confidently.

Blind automation

Automation should have review, guardrails, rollback and logging. Bad automation can damage production faster than manual mistakes.

Interview framing: how to answer “What DevOps practices do you follow?”

Interviewer

What DevOps practices do you follow in a production environment?

Strong candidate answer

I follow practices that make delivery repeatable, visible and safe. Code and infrastructure changes should be version controlled in Git. CI pipelines should build, test, scan and package artifacts. Infrastructure should be managed using Terraform with remote state and plan review. Configuration should be automated with tools like Ansible where needed. Applications should run in containers and be deployed to Kubernetes or OpenShift with proper probes, resource limits, ConfigMaps, Secrets and rollout strategy. GitOps tools like Argo CD can maintain desired state and detect drift. Observability should include metrics, logs, traces and alerts connected to SLOs where possible. For incidents, I focus on impact, mitigation, communication, root cause analysis and post-incident improvement. I also consider security controls such as least privilege, secrets management, image scanning and approval gates. The goal is not only faster deployment, but safer and more reliable production operations.

Why this answer is strong: It connects tools to outcomes: repeatability, safety, visibility, reliability and recovery.

Suggested learning path on SkillUpWorks

Start with Linux and Linux networking.
Linux questions and networking questions.

Learn scripting and Git habits.
Bash scripting and Git/GitHub practice.

Move into Docker and Kubernetes.
Docker, Kubernetes and OpenShift.

Add automation and infrastructure.
Terraform, Ansible and cloud platforms.

Learn CI/CD and GitOps.
Jenkins and Argo CD/GitOps.

Build reliability thinking.
Observability, SRE and production troubleshooting.

Explore AI-assisted DevOps.
AI in DevOps and AIOps practice.

Ready to practice like an engineer?

Read this guide for free. Then use SkillUpWorks to practice real DevOps, Cloud, SRE and Linux interview questions with deeper answers, project flow and production troubleshooting thinking.
