# Docker + Kubernetes Containerized Operations in Practice: From Zero to a Production-Grade Cluster (2026 Edition)

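## Abstract

This article is a hands-on guide to containerized operations with Docker and Kubernetes, aimed at beginners. It covers installation on Linux and Windows, core commands, multi-container orchestration, building a Kubernetes cluster, an end-to-end application deployment, and production best practices with troubleshooting tips. Every step comes with the complete commands and practical advice.

## Main Text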

### 1. Installing and Configuring Docker

#### 1.1 Linux (Ubuntu 26.04): standard six-step installation

```bash
# Step 1: Remove old versions, if any
sudo apt-get remove docker docker-engine docker.io containerd runc

# Step 2: Install prerequisites
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg

# Step 3: Add Docker's official GPG key (kept ASCII-armored to match the .asc extension)
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Step 4: Add the Docker apt repository
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

# Step 5: Install the Docker components
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# Step 6: Start and verify
sudo systemctl start docker
sudo systemctl enable docker
sudo docker run hello-world
```

**Required step for users in mainland China: configure a registry mirror**

```bash
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://mirror.ccs.tencentyun.com"]
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
```
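
Because `tee` overwrites whatever is already in `daemon.json`, it can be worth keeping a copy of the existing file first. The following is a minimal sketch under that assumption; `write_mirror_config` is a hypothetical helper, and the demo runs against a throwaway directory rather than `/etc/docker`.

```bash
#!/usr/bin/env bash
# Sketch: write a registry-mirror config without silently losing an existing file.
# write_mirror_config is a hypothetical helper; on a real host the directory
# would be /etc/docker and the commands would need sudo.
write_mirror_config() {
  config_dir="$1"
  mirror_url="$2"
  config_file="$config_dir/daemon.json"

  mkdir -p "$config_dir"
  # Keep a copy of any existing configuration before overwriting it.
  if [ -f "$config_file" ]; then
    cp "$config_file" "$config_file.bak"
  fi
  printf '{\n  "registry-mirrors": ["%s"]\n}\n' "$mirror_url" > "$config_file"
}

# Demo against a temporary directory that already holds a (made-up) config.
demo_dir="$(mktemp -d)"
printf '{"log-driver": "json-file"}\n' > "$demo_dir/daemon.json"
write_mirror_config "$demo_dir" "https://mirror.ccs.tencentyun.com"
cat "$demo_dir/daemon.json"
```

The helper replaces the file wholesale, so any settings in the backup still have to be merged back by hand.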

#### 1.2 Windows: recommended approach, Docker Engine inside WSL2 (avoids Docker Desktop's paid-licensing restrictions)

```powershell
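# 1. Install WSL2 (run PowerShell as Administrator)
wsl --install

# After the reboot, Ubuntu is installed automatically

# 2. Inside the WSL2 Ubuntu shell, install Docker (same steps as the Linux installation)

# 3. WSL2-specific fix for iptables compatibility (run inside Ubuntu)
sudo update-alternatives --config iptables
# Choose iptables-legacy

# 4. Allow your user to run Docker without sudo (run inside Ubuntu)
sudo usermod -aG docker $USER
# Reopen the terminal for the change to take effect

# 5. Start Docker at boot (Windows Task Scheduler)
# Trigger: At system startup
# Program/script: C:\Windows\System32\wsl.exe
# Arguments: -d Ubuntu -u root -- service docker start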
```

### 2. Core Docker Commands

#### 2.1 Image management

```bash
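docker pull nginx:latest      # Pull the latest nginx image
docker pull ubuntu:22.04      # Pull a specific Ubuntu version
docker images                 # List local images
docker rmi nginx:latest       # Remove the given image
docker system prune -a -f     # Remove all unused images, stopped containers, unused networks, and build cache without prompting (use with care)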
```

#### 2.2 Container management

```bash
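# Run an nginx container: detached, named my-nginx, host port 8080 mapped to container port 80
docker run -d --name my-nginx -p 8080:80 nginx:latest

docker ps                            # List running containers
docker ps -a                         # List all containers, including stopped ones
docker exec -it my-nginx /bin/bash   # Open an interactive shell in the container
docker logs my-nginx                 # Show the container's logs
docker stop my-nginx                 # Stop the container
docker start my-nginx                # Start a stopped container
docker rm my-nginx                   # Remove the container (stop it first)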
```

#### 2.3 Building a custom image (Node.js example)

`Dockerfile`:
```dockerfile
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["npm", "start"]
```

Build command:
```bash
docker build -t my-app:v1 .
```

#### 2.4 Managing multiple containers (Docker Compose)

Example `docker-compose.yml` (Nginx + MySQL):
```yaml
version: '3'
services:
  web:
    image: nginx:latest
    ports:
      - "80:80"
  db:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: root123
    volumes:
      - db-data:/var/lib/mysql
volumes:
  db-data:
```

Commands:
```bash
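# The docker-compose-plugin package installed in section 1.1 provides the "docker compose" subcommand
docker compose up -d   # Start all services in the background
docker compose down    # Stop and remove the services' containers and networks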
```

### 3. Building a Production-Grade Kubernetes Cluster

#### 3.1 Preparing the environment (Ubuntu 26.04)

```bash
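# 1. Disable swap (the kubelet's default configuration requires it)
sudo swapoff -a
# Persist across reboots: comment out active fstab entries whose type field is "swap"
sudo sed -i -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' /etc/fstab

# 2. Load the kernel modules (the k8s.conf file name follows the usual Kubernetes convention)
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter

# 3. Configure network parameters (again using the conventional k8s.conf file name)
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system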
```
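
Before editing `/etc/fstab` in place, it can help to dry-run a swap-commenting pattern against a sample file. The sketch below uses a pattern that only touches uncommented lines whose filesystem type field is `swap`; the sample entries and the `comment_out_swap` helper are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch: dry-run a swap-commenting pattern on a sample file, not on /etc/fstab.
# The entries below are invented for illustration.
sample_fstab="$(mktemp)"
cat > "$sample_fstab" <<'EOF'
UUID=1111-2222 /          ext4 defaults 0 1
/swap.img      none       swap sw       0 0
#/old-swap     none       swap sw       0 0
/dev/sdb1      /data/swap ext4 defaults 0 2
EOF

# Comment out only active entries whose type field is "swap";
# already-commented lines and paths that merely contain "swap" are left alone.
comment_out_swap() {
  sed -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}

# Prints the would-be result to stdout without modifying the file.
comment_out_swap "$sample_fstab"
```

Only the `/swap.img` line should gain a leading `#`; once the output looks right, the same expression can be applied to the real file with `sudo sed -i -E`.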

#### 3.2 Installing the container runtime (containerd)

```bash
sudo apt-get install containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml

# Enable SystemdCgroup (a Kubernetes compatibility requirement: it must match the kubelet's systemd cgroup driver)
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml

sudo systemctl restart containerd
sudo systemctl enable containerd
```
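
If the generated config contains no `SystemdCgroup = false` line, the `sed` substitution succeeds without changing anything, so an explicit check is worthwhile. A minimal sketch, assuming a made-up config excerpt and a hypothetical helper named `cgroup_driver_is_systemd`:

```bash
#!/usr/bin/env bash
# Sketch: verify that the SystemdCgroup substitution actually took effect.
# cgroup_driver_is_systemd is a hypothetical helper, not part of containerd.
cgroup_driver_is_systemd() {
  grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$1"
}

# Demo on an invented excerpt rather than /etc/containerd/config.toml.
sample_config="$(mktemp)"
cat > "$sample_config" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false
EOF

# Same substitution as above, written via a temp file for portability.
sed 's/SystemdCgroup = false/SystemdCgroup = true/g' "$sample_config" > "$sample_config.new"
mv "$sample_config.new" "$sample_config"

if cgroup_driver_is_systemd "$sample_config"; then
  echo "SystemdCgroup is enabled"
else
  echo "SystemdCgroup is NOT enabled; edit the file by hand"
fi
```

On a real host the check would be `cgroup_driver_is_systemd /etc/containerd/config.toml`, run before restarting containerd.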

#### 3.3 Installing the Kubernetes components (kubeadm/kubelet/kubectl)

```bash
# Add the Kubernetes v1.31 apt repository
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.31/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.31/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl

# Pin the versions to prevent accidental upgrades
sudo apt-mark hold kubelet kubeadm kubectl
```

#### 3.4 Initializing a single-node Kubernetes cluster

```bash
# Initialize the control plane with Flannel's default Pod CIDR
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

# Configure kubectl for a regular user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Allow the control-plane node to run ordinary Pods (needed on a single-node cluster)
kubectl taint nodes --all node-role.kubernetes.io/control-plane-

# Install the Flannel network plugin
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

# Verify the cluster
kubectl get nodes                 # The node should report Ready
kubectl get pods -n kube-system   # All core components should be Running
```

### 4. Core Kubernetes Operations

#### 4.1 Managing Pods

```bash
kubectl run nginx --image=nginx:latest   # Create a single nginx Pod
kubectl get pods                         # List Pods
kubectl describe pod nginx               # Show Pod details and events
kubectl logs nginx                       # Show the Pod's logs
kubectl exec -it nginx -- /bin/bash      # Open an interactive shell in the Pod
```

#### 4.2 Managing Deployments (the recommended approach)

`nginx-deployment.yaml`:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
```

Commands:
```bash
kubectl apply -f nginx-deployment.yaml                   # Create the Deployment
kubectl get deployments                                  # List Deployments
kubectl scale deployment nginx-deployment --replicas=5   # Scale to 5 replicas
```

#### 4.3 Exposing a service (NodePort example)

`nginx-service.yaml`:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: NodePort
```

Commands:
```bash
kubectl apply -f nginx-service.yaml
kubectl get services   # Shows the assigned NodePort (for example 80:31234/TCP)
# Access the service at any node's IP address plus the NodePort
```

#### 4.4 Horizontal Pod autoscaling (HPA)

```bash
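# Targets the springboot-demo Deployment from section 5; needs metrics-server and CPU requests on the Pods
kubectl autoscale deployment springboot-demo --cpu-percent=50 --min=1 --max=10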
```
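
To get a feel for what `--cpu-percent=50 --min=1 --max=10` does, the replica calculation described in the Kubernetes HPA documentation can be sketched as follows. This is an approximation under stated assumptions: it ignores the controller's tolerance band and stabilization windows, and `hpa_desired_replicas` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch of the documented HPA scaling rule:
#   desired = ceil(current_replicas * current_utilization / target_utilization)
# clamped to [min, max]. Tolerance and stabilization are deliberately ignored.
hpa_desired_replicas() {
  current_replicas="$1"
  current_util="$2"
  target_util="$3"
  min_replicas="$4"
  max_replicas="$5"

  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current_replicas * current_util + target_util - 1) / target_util ))

  if [ "$desired" -lt "$min_replicas" ]; then desired="$min_replicas"; fi
  if [ "$desired" -gt "$max_replicas" ]; then desired="$max_replicas"; fi
  echo "$desired"
}

# Arguments: current replicas, average CPU %, target CPU %, min, max
hpa_desired_replicas 3 90 50 1 10   # 3 Pods averaging 90% CPU -> scale up
hpa_desired_replicas 4 10 50 1 10   # 4 Pods averaging 10% CPU -> scale down
```

Utilization here is measured relative to each container's CPU request, which is why the Deployment needs `resources.requests` set.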

### 5. End-to-End Containerized Deployment Workflow (Spring Boot example)

**Step 1: Containerize the Spring Boot application**

`Dockerfile`:
```dockerfile
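# The official openjdk image is deprecated; eclipse-temurin is used here as a maintained substitute
FROM eclipse-temurin:17-jre-alpine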
LABEL maintainer="yourname@example.com"
WORKDIR /app
COPY app.jar app.jar
EXPOSE 8080
CMD ["java", "-jar", "app.jar"]
```

**Step 2: Build and push the image**

```bash
docker build -t yourusername/springboot-demo:v1 .
docker login   # Log in to Docker Hub
docker push yourusername/springboot-demo:v1
```

**Step 3: Write the Kubernetes Deployment and Service YAML**

`springboot-deploy.yaml`:
```yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: springboot-demo
spec:
  replicas: 3
  selector:
    matchLabels:
      app: springboot-demo
  template:
    metadata:
      labels:
        app: springboot-demo
    spec:
      containers:
        - name: springboot-app
          image: yourusername/springboot-demo:v1
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /actuator/health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /actuator/health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: springboot-service
spec:
  selector:
    app: springboot-demo
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: LoadBalancer
```

**Step 4: Deploy to Kubernetes**

```bash
kubectl apply -f springboot-deploy.yaml
kubectl get deployments
kubectl get pods -l app=springboot-demo
kubectl get services springboot-service   # EXTERNAL-IP stays <pending> unless the cluster has a load-balancer implementation
```

**Step 5: Rolling update**

```bash
docker build -t yourusername/springboot-demo:v2 .
docker push yourusername/springboot-demo:v2
kubectl set image deployment/springboot-demo springboot-app=yourusername/springboot-demo:v2
kubectl rollout status deployment/springboot-demo
```

**Step 6: Roll back**

```bash
kubectl rollout history deployment/springboot-demo              # Show the revision history
kubectl rollout undo deployment/springboot-demo                 # Roll back to the previous revision
kubectl rollout undo deployment/springboot-demo --to-revision=1 # Roll back to a specific revision
```

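### 6. Common Troubleshooting and Best Practices

#### 6.1 Common Docker problems

| Problem | Solution |
|------|---------|
| Container exits immediately after starting | Make sure the application runs as a foreground process; if necessary, use `tail -f /dev/null` to keep the container running |
| Disk space exhausted | Clean up regularly: `docker system prune -a -f` (removes unused images and stopped containers), `docker volume prune -f` (removes unused volumes) |
| Slow image pulls | Use the configured registry mirror, or run a private registry |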

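#### 6.2 Common Kubernetes problems

| Problem | Solution |
|------|---------|
| Pod stuck in Pending | Check the events in `kubectl describe pod <pod-name>`, review node resources with `kubectl describe node <node-name>`, and check node taints |
| ImagePullBackOff | Verify the image reference in the YAML, configure `imagePullSecrets` for private registries, and check disk space on the node |
| Service unreachable | Verify that the Service selector matches the Pod labels (`kubectl get pods --show-labels`) and check the Service endpoints (`kubectl get endpoints <service-name>`) |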

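#### 6.3 Production operations best practices

1. **Resource limits**: set `resources.requests` and `resources.limits` on every container to prevent resource exhaustion
2. **Health checks**: configure `livenessProbe` and `readinessProbe` so that failed containers are restarted automatically and traffic is routed only to ready Pods
3. **Log management**: use centralized logging (EFK: Elasticsearch + Fluentd + Kibana, or Loki)
4. **Monitoring and alerting**: deploy Prometheus + Grafana for cluster monitoring and alerting
5. **Image optimization**: use Alpine base images and multi-stage builds, and combine `RUN` instructions to reduce the number of layers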
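
Items 1 and 2 are easy to forget when writing manifests by hand. As a rough aid, the sketch below greps a manifest for the relevant key names; `check_manifest` is a hypothetical helper that does not parse YAML, so a pass is a hint rather than proof.

```bash
#!/usr/bin/env bash
# Sketch: crude text check that a manifest mentions resource settings and probes.
# It only looks for key names at the start of a line (after indentation).
check_manifest() {
  manifest="$1"
  missing=""
  for key in requests limits livenessProbe readinessProbe; do
    if ! grep -Eq "^[[:space:]]*${key}:" "$manifest"; then
      missing="$missing $key"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}

# Demo on a throwaway fragment that sets resources but no probes.
demo_manifest="$(mktemp)"
cat > "$demo_manifest" <<'EOF'
          resources:
            requests:
              cpu: "250m"
            limits:
              cpu: "500m"
EOF
check_manifest "$demo_manifest" || true
```

For anything beyond a quick sanity check, a policy tool or admission controller that understands the Kubernetes schema is the more reliable choice.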

---
*This article was automatically generated by the 北科信息日 collection system*
*Collected at: 2026-05-01 11:00:00*
*Unique ID: a1b2c3d4*