As the business has grown, we now operate over a dozen independent overseas private network environments (ZStack/TStack), for example:
Overseas Private Network Region A
Overseas Private Network Region B
Overseas Private Network Region C
Each network is an independent island that requires its own VPN client to dial in for resource access, which creates significant challenges for daily O&M management:
O&M personnel need to frequently switch between multiple VPN clients.
Lack of unified permission management and access control.
Complex network configuration and difficult troubleshooting.
Inability to achieve cross-network automated O&M.
Solution
Adopting the Headscale (Server) + Tailscale (Client) architecture to build an enterprise-grade Zero Trust VPN network:
Deploy the Headscale control server on the Alibaba Cloud Singapore node.
Install Tailscale clients on private network nodes and register them with Headscale.
Achieve one-time connection to access all private network resources.
Support ACL access control and SSO.
Tailscale has excellent multi-platform support: iOS, macOS, Android, Windows, Linux, and supports IPv6.
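As a sketch of the registration flow (the user name `ops` and key lifetime below are assumptions; adjust them to your environment), the typical commands look like this:

```shell
# On the Headscale server: create a user and a pre-auth key
headscale users create ops
headscale preauthkeys create --user ops --expiration 24h

# On each private-network node: register the Tailscale client against Headscale
sudo tailscale up --login-server https://headscale.wnote.com --auth-key <PREAUTH_KEY>
```

With a pre-auth key, nodes join the network non-interactively, which is convenient when enrolling many private-network gateways at once.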
Headscale Architecture Overview
Core Components
Headscale is an open-source implementation of the Tailscale control server, fully compatible with the Tailscale protocol:
Control Server (Headscale): Manages node registration, key distribution, address allocation, and ACL policies.
DERP Server: Relay server for traffic forwarding when NAT traversal fails.
Client (Tailscale): Deployed on nodes, establishes encrypted tunnels based on WireGuard.
Technical Advantages
Compared to traditional VPN solutions:
Zero Trust Architecture: Deny all access by default, explicitly authorize via ACL.
NAT Traversal: Supports STUN protocol for automatic hole punching, no public IP required.
Mesh Network: Direct communication between nodes without passing through a central server.
Auto Reconnection: Automatically re-establishes connections upon network changes.
The Headscale server configuration (/etc/headscale/config.yaml):

```yaml
# Server URL
server_url: https://headscale.wnote.com:443

# Listen address
listen_addr: 0.0.0.0:8080

# Private key (auto-generated or specified)
private_key_path: /var/lib/headscale/private.key

# DERP relay server configuration
derp:
  # Enable the embedded DERP server
  server:
    enabled: true
    region_id: 999                    # Use an unused ID to avoid conflicts
    region_code: "self-derp"          # Short identifier
    region_name: "Self-hosted Co-located DERP"  # Descriptive name
    stun_listen_addr: "0.0.0.0:3478"  # STUN listener; UDP 3478 must be open
    # [Important] Enter your server's actual public IPv4 address
    ipv4: "YOUR_SERVER_PUBLIC_IP"
    # If you have public IPv6, you can also fill it in
    # ipv6: "YOUR_SERVER_PUBLIC_IPV6"
    # Automatically add this region to the DERP map
    automatically_add_embedded_derp_region: true
    # [Security] Strongly recommended; only allow your own Tailnet clients to use this relay
    verify_clients: true
  # You can keep the official DERP as a backup, or disable it completely (clear urls)
  urls:
    - https://controlplane.tailscale.com/derpmap/default  # Keep as backup (recommended)
  # If you have additional self-hosted DERP nodes (e.g., a Singapore node), keep paths
  paths:
    - /etc/headscale/derp.yaml
  auto_update_enabled: true
  update_frequency: 24h

# Database (SQLite by default; PostgreSQL recommended for production)
database:
  type: sqlite
  sqlite:
    path: /var/lib/headscale/db.sqlite

# ACL policy file
acl_policy_path: /etc/headscale/acl.yaml

# DNS configuration
dns:
  nameservers:
    - 1.1.1.1
    - 8.8.8.8
  magic_dns: true  # Enable MagicDNS (node.namespace.vpn)

# OIDC configuration (optional)
# oidc:
#   issuer: "https://your-oidc-provider.com"
#   client_id: "your-client-id"
#   client_secret: "your-client-secret"

# Log configuration
log:
  format: text
  level: info
```
Configuring the DERP Relay Server
First, let’s understand the DERP concept:
The DERP server is a reliable fallback relay that ensures devices can always connect when poor network conditions (e.g., symmetric NAT, strict firewalls) prevent direct connections.
Direct connection (P2P): Headscale's goal is for clients to establish peer-to-peer connections via STUN "hole punching", which offers the best performance.
Relay (DERP): When hole punching fails (e.g., one party is behind strict symmetric NAT), traffic is forwarded via the DERP server. Latency is higher than a direct connection, but connectivity is guaranteed.
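To see how a client is faring against this direct-vs-relay decision, Tailscale ships a built-in diagnostic:

```shell
# Reports UDP reachability, NAT mapping behavior, and latency to each DERP region
tailscale netcheck
```

The "Nearest DERP" line in its output tells you which relay region the client will fall back to when hole punching fails.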
If you want to self-host DERP and finely control multiple DERP nodes, you need to create a configuration file to define your private DERP network regions and node information.
The benefit of self-hosting DERP is that you can avoid the latency issues of official DERP servers overseas and keep relay traffic on your own servers, improving security and stability.
We don't need this for now, but here is an example. Create /etc/headscale/derp.yaml:
```yaml
# /etc/headscale/derp.yaml
regions:
  # First self-hosted region, e.g., East China
  901:
    regionid: 901
    regioncode: "cn-east"      # Short region code
    regionname: "East China Self-hosted Node"
    nodes:
      - name: 901a
        regionid: 901
        # Enter your independent DERP server domain name here
        hostname: derp-shanghai.yourdomain.com
        derpport: 12345        # Your DERP server port, usually 443 or custom
        stunport: 3478         # STUN port, usually 3478
        stunonly: false
  # Second self-hosted region, e.g., North China
  902:
    regionid: 902
    regioncode: "cn-north"
    regionname: "North China Self-hosted Node"
    nodes:
      - name: 902a
        regionid: 902
        hostname: derp-beijing.yourdomain.com
        derpport: 443          # This node uses the standard port 443
        stunport: 3478
        stunonly: false
```
After creating the configuration, you can view the DERP map and connectivity on any Tailscale client:
```shell
tailscale debug derp-map        # View the full DERP map
tailscale debug derp self-derp  # Test connectivity to a specific DERP region, by region code
```
To verify that a client is actually using the self-hosted relay, check the connection status with `tailscale status`. If a peer line shows `relay "self-derp"` (or your custom region code), that traffic is going through your self-hosted relay.
```yaml
# User/group definitions
groups:
  group:admin:
    - admin@example.com
  group:devops:
    - devops1@example.com
    - devops2@example.com
  group:developer:
    - dev1@example.com
    - dev2@example.com

# Host definitions
hosts:
  sg-cloud: "10.100.0.1/32"
  network-a: "10.100.1.0/24"
  network-b: "10.100.2.0/24"
  network-c: "10.100.3.0/24"

# Access control rules
acls:
  # Admins can access all resources
  - action: accept
    src:
      - group:admin
    dst:
      - "*:*"
  # Ops team can access all servers via SSH
  - action: accept
    src:
      - group:devops
    dst:
      - "sg-cloud:22"
      - "network-a:22"
      - "network-b:22"
      - "network-c:22"
  # Dev team can access application ports
  - action: accept
    src:
      - group:developer
    dst:
      - "sg-cloud:80,443,8080"
      - "network-a:3306,6379"
      - "network-b:9200,9300"
  # Deny all other access by default
  - action: reject
    src:
      - "*"
    dst:
      - "*:*"

# Tag ownership rules (optional)
tagOwners:
  tag:prod:
    - group:admin
    - group:devops
  tag:database:
    - group:admin
```
Since we previously issued a wildcard domain certificate via acme, we can point Headscale directly at the SSL certificate path. Adjust this to match the actual setup in your enterprise:
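For example, Headscale can serve HTTPS itself via the `tls_cert_path` and `tls_key_path` settings in config.yaml (the paths below are placeholders; use wherever acme deploys your certificate):

```yaml
# TLS settings in /etc/headscale/config.yaml
tls_cert_path: /etc/headscale/certs/wnote.com.fullchain.pem  # placeholder path
tls_key_path: /etc/headscale/certs/wnote.com.key             # placeholder path
```

If a reverse proxy (e.g., nginx) terminates TLS instead, leave these unset and configure the certificate on the proxy.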
Ensure the server firewall allows the following traffic:
| Port | Protocol | Purpose | Description |
|------|----------|---------|-------------|
| 443 | TCP | HTTPS (control plane + DERP relay) | Exposed via the reverse proxy |
| 3478 | UDP | STUN service | Must be open for NAT traversal detection |
If using cloud service providers (like AWS, Alibaba Cloud), you also need to add inbound rules in the security group.
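On the host itself, the equivalent rules with `ufw` would look like this (assuming ufw is your firewall; use the firewalld or iptables equivalents otherwise):

```shell
sudo ufw allow 443/tcp    # HTTPS: control plane + DERP relay
sudo ufw allow 3478/udp   # STUN for NAT traversal
sudo ufw reload
```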
```shell
# Enable IP forwarding
echo "net.ipv4.ip_forward = 1" | sudo tee -a /etc/sysctl.conf
echo "net.ipv6.conf.all.forwarding = 1" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p

# Advertise subnet routes (run on the gateway node)
sudo tailscale up --advertise-routes=10.100.1.0/24,10.100.2.0/24

# Approve the routes on the server
sudo headscale routes list
sudo headscale routes enable -r <RouteID>

# Verify routing
sudo tailscale status
ping 10.100.1.10   # Test cross-network access
```
Exit Node Configuration (Optional)
Configure an exit node to access the internet via a specific node (designated internet gateway):
```shell
# Enable on the exit node
sudo tailscale up --advertise-exit-node

# Approve on the server (an exit node shows up as the 0.0.0.0/0 and ::/0 routes)
sudo headscale nodes list
sudo headscale routes list
sudo headscale routes enable -r <RouteID>

# Use the exit node on a client
sudo tailscale up --exit-node=<ExitNodeIP> --exit-node-allow-lan-access
```
The `--exit-node-allow-lan-access` flag forwards all internet traffic through the exit node while keeping direct access to the local LAN. Without it, local LAN destinations (printers, the router's admin page, etc.) are also routed through the exit node and may become unreachable.
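On recent Tailscale versions, a client can list the exit nodes the tailnet currently offers before picking one:

```shell
# Shows the advertised exit nodes and their status
tailscale exit-node list
```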
```yaml
# Use PostgreSQL as the database backend
database:
  type: postgres
  postgres:
    host: pg-cluster.example.com
    port: 5432
    user: headscale
    password: "your-password"
    database: headscale
    sslmode: require
```
Tailscale clients maintain tunnel keepalives automatically; to check a specific peer's connectivity from a client:

```shell
# Reports whether the path to the peer is direct or relayed via DERP
tailscale ping 10.100.1.10
```
Single Point of Failure Risk
If official DERP is completely disabled (urls: []) and there is only one self-hosted DERP server, all clients requiring relay will be unable to communicate if that server goes down or has network issues. For production, it is recommended to keep at least the official DERP as a backup or deploy multiple self-hosted DERP nodes.
Certificate Requirements
Embedded DERP relies on Headscale’s HTTPS service, so your domain headscale.wnote.com must have a valid SSL certificate.
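If the wildcard certificate was issued with acme.sh, a typical issue-and-install flow looks like this (the Alibaba Cloud DNS plugin `dns_ali` is an assumption; substitute your DNS provider's plugin):

```shell
# Issue a wildcard certificate via the DNS-01 challenge
acme.sh --issue --dns dns_ali -d "wnote.com" -d "*.wnote.com"

# Install the certificate where Headscale (or the reverse proxy) expects it,
# and restart the service on renewal
acme.sh --install-cert -d "wnote.com" \
  --key-file /etc/headscale/certs/wnote.com.key \
  --fullchain-file /etc/headscale/certs/wnote.com.fullchain.pem \
  --reloadcmd "systemctl restart headscale"
```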
```shell
# Check whether IP forwarding is enabled
sysctl net.ipv4.ip_forward

# Check whether routes are approved
sudo headscale routes list

# Packet capture analysis
sudo tcpdump -i tailscale0 -n host 10.100.x.x
```