Nginx Ingress Controller has been successfully deployed and bound to a public-facing SLB.
Note: Kubernetes clusters created via the Alibaba Cloud Container Service console automatically deploy an Nginx Ingress Controller during initialization, which is default-mounted to a public SLB instance.
II. Configuration
1. Create an Internal SLB
In the Alibaba Cloud console, create an internal SLB and bind it to your VPC.
```yaml
# my-nginx-ingress-slb-intranet.yaml
# intranet nginx ingress slb service
apiVersion: v1
kind: Service
metadata:
  # Name the service as nginx-ingress-lb-intranet.
  name: nginx-ingress-lb-intranet
  namespace: kube-system
  labels:
    app: nginx-ingress-lb-intranet
  annotations:
    # Specify the SLB instance type as internal.
    service.beta.kubernetes.io/alicloud-loadbalancer-address-type: intranet
    # Replace with your internal SLB instance ID.
    service.beta.kubernetes.io/alicloud-loadbalancer-id: <YOUR_INTRANET_SLB_ID>
    # Whether to automatically create SLB port listeners (overrides existing ones); can also be configured manually.
    #service.beta.kubernetes.io/alicloud-loadbalancer-force-override-listeners: 'false'
spec:
  type: LoadBalancer
  # Route traffic to other nodes
  externalTrafficPolicy: "Cluster"
  ports:
  - port: 80
    name: http
    targetPort: 80
  - port: 443
    name: https
    targetPort: 443
  selector:
    # Select pods with app=ingress-nginx
    app: ingress-nginx
```
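With the manifest saved under the filename shown in its header comment, the Service can be applied and checked as follows (a sketch; output columns depend on your kubectl version):

```shell
# Create the intranet Service and watch for the SLB address to be assigned.
kubectl apply -f my-nginx-ingress-slb-intranet.yaml
kubectl -n kube-system get svc nginx-ingress-lb-intranet
# The EXTERNAL-IP column should show the private address of the bound SLB instance.
```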
```shell
git clone https://github.com/feiyu563/PrometheusAlert.git
cd PrometheusAlert/example/helm/prometheusalert
# Update config/app.conf to set login user info and database configuration
helm install -n monitoring .
```
Create a WeChat Work Group Robot
After creating a WeChat Work group, right-click the group → “Add Group Robot”. This will generate a webhook URL for the robot. Record this URL for later use.
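You can verify the webhook works before wiring it into PrometheusAlert. The sketch below uses the standard WeChat Work robot message format; `<YOUR_KEY>` is the key from the webhook URL you recorded:

```shell
# Send a test text message to the group robot (replace <YOUR_KEY>).
WEBHOOK="https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=<YOUR_KEY>"
curl -s -H 'Content-Type: application/json' \
  -d '{"msgtype": "text", "text": {"content": "PrometheusAlert webhook test"}}' \
  "$WEBHOOK"
```

A successful call returns a JSON body with `"errcode": 0`.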
Every internet company's technical team has to handle backups, and we are no exception. Today, I'll share my own strategies for backing up production Kubernetes clusters.
My primary goals for Kubernetes backups are to prevent:
Accidental deletion of a namespace within the cluster
Accidental deletion of partial resources in the cluster
Loss of etcd data
Backing Up etcd
Backing up etcd prevents catastrophic failures at the cluster level or loss of etcd data, which could render the entire cluster unusable. In such cases, only full cluster recovery can restore services.
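A snapshot can be taken with `etcdctl` on a master node. The sketch below assumes a kubeadm-style static-pod etcd; the certificate paths and backup directory are assumptions to adjust for your cluster:

```shell
# Take a timestamped etcd snapshot (run on a master node, as root).
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /var/backups/etcd-$(date +%Y%m%d-%H%M%S).db
```

Restoring from such a snapshot is done with `etcdctl snapshot restore`, which rebuilds the data directory for the cluster.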
Recently, our internal project has been supporting a big data initiative, requiring the simulation of customer scenarios using Greenplum (older version 4.2.2.4). Below is a record of the Greenplum cluster setup process—note that the procedure for higher versions of GP remains largely identical.
Before deployment, ensure that nvidia-driver and nvidia-docker are installed on your Kubernetes nodes, and Docker’s default runtime has been set to nvidia.
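Setting Docker's default runtime is typically done in `/etc/docker/daemon.json`; a sketch (the runtime binary path may differ on your distribution, and Docker must be restarted afterwards):

```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```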
To support later verification of a private deployment, I needed to quickly set up a Kubernetes cluster on the internal network. Previously, for larger clusters, I typically used Kubeasz or Kubespray; for this small-scale cluster, kubeadm is more efficient.
Below is the recorded process for deploying with kubeadm:
Kubernetes 1.8+ requires disabling swap. If not disabled, kubelet will fail to start by default.
Option 1: Use --fail-swap-on=false in kubelet startup args.
Option 2: Disable system swap.
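Option 2 can be done as follows (requires root; the `sed` pattern comments out swap entries and is a sketch — review your `/etc/fstab` before running):

```shell
# Turn swap off immediately.
swapoff -a
# Comment out swap entries in /etc/fstab so swap stays off after reboot.
sed -ri 's/^([^#].*[[:space:]]swap[[:space:]])/#\1/' /etc/fstab
```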
The Alibaba Cloud cluster must be able to resolve internal domain names
The office network must resolve internal domain names as well as public internet domains
Solution:
For the first requirement, directly use Alibaba Cloud PrivateZone for resolution.
For the second requirement, configure internal domain zones in PrivateZone, then synchronize them to the office network’s bind9 server using Alibaba Cloud’s synchronization tool. Use Dnsmasq as the DNS entry point for the office network: forward public queries to public DNS servers, and forward internal domain queries to the bind9 server.
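The Dnsmasq split described above boils down to a few `server=` lines. A sketch of `/etc/dnsmasq.conf` — the internal zone name and server IPs below are placeholders, not values from this setup:

```
# /etc/dnsmasq.conf (sketch)
no-resolv
# Internal zone: forward to the office bind9 server
server=/internal.example.com/192.168.1.10
# Everything else: forward to public DNS
server=223.5.5.5
server=114.114.114.114
```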
Some may wonder: Why not use bind9 alone to handle all internal resolutions?
The main reason is that in practice, bind9 exhibits performance issues when forwarding to multiple DNS servers simultaneously—occasional timeouts occur. In contrast, Dnsmasq handles this scenario significantly better.
Previously, we introduced how to install Argo Workflow and trigger tasks. In this article, we focus on a new tool:
What is Argo Events?
Argo Events is an event-driven workflow automation framework for Kubernetes. It supports more than 20 event sources, such as webhooks, S3 uploads, cron schedules, and message queues (Kafka, GCP Pub/Sub, SNS, SQS).
EventSource (similar to a gateway; sends messages to the event bus)
EventBus (the event message queue, implemented with the high-performance distributed messaging system NATS — note that the default NATS Streaming backend was deprecated in June 2023, so architectural changes can be expected in the future)
EventSensor (subscribes to the message queue, parameterizes events, and filters them)
Grant operate-workflow-sa permission to create Argo Workflows within the argo-events namespace — required for EventSensor to automatically create workflows later.
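A minimal Role/RoleBinding sketch for this grant — the Role name `workflow-creator` is an assumption; `operate-workflow-sa` and the `argo-events` namespace come from the text:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: workflow-creator
  namespace: argo-events
rules:
- apiGroups: ["argoproj.io"]
  resources: ["workflows"]
  verbs: ["create", "get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: operate-workflow-sa-binding
  namespace: argo-events
subjects:
- kind: ServiceAccount
  name: operate-workflow-sa
  namespace: argo-events
roleRef:
  kind: Role
  name: workflow-creator
  apiGroup: rbac.authorization.k8s.io
```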