Setting up private GKE automation using Terraform

Introduction

One of the most important decisions when creating a GKE cluster is whether it will be public or private. Public clusters are assigned both public and private IP addresses and can be accessed from anywhere on the internet. Private clusters, on the other hand, assign only private IP addresses to their nodes, so the nodes are isolated from inbound and outbound internet traffic and cannot be reached directly by clients. The nodes can only connect to the internet through Cloud NAT, which provides better security.

Secure GKE Private Cluster

Today, we are going to learn about and deploy GKE private clusters. In GKE, private clusters are clusters whose nodes are isolated from inbound and outbound internet traffic because they are assigned internal IP addresses only. Private clusters in GKE can expose the control plane endpoint either as a publicly accessible address or as a private address.

Please keep in mind that nodes in a private cluster are assigned private IPs, which means they are isolated from inbound and outbound communication until you configure Cloud NAT. The NAT service allows nodes in the private network to access the internet, enabling them to download required images from Docker Hub or another public registry; if you restrict both incoming and outgoing traffic, use a private registry instead. One more point to remember is that private clusters can have private endpoints as well as public endpoints.

Kubernetes (K8s) is one of the best container orchestration platforms today, but it is also complex. Because Kubernetes aims to be a general solution for specific problems, it has many moving parts. Each of these parts must be understood and configured to meet your unique requirements.

In this demo, we will create the following resources:

  • A network named vpc-h-cluster-1.
  • A subnetwork named subnet1.
  • A private cluster named gke-h-cluster-1 with private nodes and client access to the public endpoint.
  • A managed node pool whose node count depends on the environment.
  • A Cloud NAT gateway named nat-h-cluster-1.

To provide outbound internet access for your private nodes, for example to pull images from an external registry, create and configure a Cloud Router and use it with Cloud NAT. Cloud NAT lets private clusters establish outbound connections over the internet to send and receive packets.

Pre-requisites

1. A GCP account with one project.

2. A service account with the appropriate permissions to provision the resources.

3. The gcloud CLI configured with that service account for the Terraform deployment (see the sketch after this list).

4. Terraform
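As a quick reference, configuring gcloud and Terraform to use the service account might look like the following sketch; the key file path and project ID are placeholders for your own values.

# Authenticate gcloud with the service account key (placeholder path)
gcloud auth activate-service-account --key-file=/path/to/service-account-key.json

# Point gcloud at the target project
gcloud config set project <project-id>

# Let the Terraform google provider pick up the same credentials
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json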

Now we’re ready to get started:

Here is the Terraform configuration code that will be used to deploy our entire setup.  

Following is the directory hierarchy:  

Devops-toolchain
├── GKE_cluster_commission.sh
├── README.md
├── gke.tf
├── keys
│   └── env.key
├── nat.tf
├── outputs.tf
├── route.tf
├── terraform.tfstate
├── terraform.tfvars
├── variables.tf
└── vpc.tf

Step 1: The Terraform source files are placed in the GitLab repository below. Please check it out for a better understanding.

https://gitlab.com/tweeny-dev/devops-toolchain/-/tree/GKE-Cluster/AWS_to_GCP_migration_config

Step 2: The following code provisions a Virtual Private Cloud (VPC) with a custom subnet.

vpc.tf

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "4.27.0"
    }
  }

  required_version = ">= 0.14"
}

provider "google" {
  project = var.project_id
  region  = var.region
  # credentials = file(var.auth_file)
}

# VPC
resource "google_compute_network" "vpc" {
  name                    = "vpc-${var.cluster_name}"
  # Creating a custom subnet for the VPC
  auto_create_subnetworks = "false"
}

resource "google_compute_subnetwork" "subnet" {
  name          = "subnet1"
  # We are creating a custom subnet for the us-central1 region only
  region        = "us-central1"
  network       = google_compute_network.vpc.name
  ip_cidr_range = "10.0.0.0/24"
}

Step 3: The following code creates the cluster and the node pool.

gke.tf

1resource "google_container_cluster" "primary" { 
2  name     = var.cluster_name 
3  location = var.region 
4
5# We can't create a cluster with no node pool defined, but we want to only use 
6# separately managed node pools. So we create the smallest possible default 
7# node pool and immediately delete it. 
8remove_default_node_pool = true 
9initial_node_count       = 1 
10
11network    = google_compute_network.vpc.name 
12subnetwork = google_compute_subnetwork.subnet.name 
13private_cluster_config { 
14# Disabled private endpoint and public endpoint is enabled  
15enable_private_endpoint = "false" 
16# Making the nodes as private by which they won’t have public ip allocated 
17enable_private_nodes    = "true" 
18master_ipv4_cidr_block  = var.master_ipv4_cidr_block 
19
20} 
21
22ip_allocation_policy { 
23} 
24dynamic "master_authorized_networks_config" { 
25
26    for_each = var.master_authorized_networks_config 
27
28    content { 
29
30      dynamic "cidr_blocks" { 
31
32        for_each = lookup(master_authorized_networks_config.value, "cidr_blocks", []) 
33
34        content { 
35
36          cidr_block   = cidr_blocks.value.cidr_block 
37
38          display_name = lookup(cidr_blocks.value, "display_name", null) 
39
40        } 
41
42      } 
43
44    } 
45
46  } 
47
48} 
49
50 
51
52#Separately Managed Node Pool 
53
54resource "google_container_node_pool" "primary_nodes" { 
55
56  name       = google_container_cluster.primary.name 
57
58  location   = var.region 
59
60  node_locations = var.node_locations 
61
62  cluster    = google_container_cluster.primary.name 
63
64  node_count = var.gke_num_nodes 
65
66# we have the autoscaling enabled to scale the application when needed 
67
68  autoscaling { 
69
70    min_node_count = var.min_node 
71
72    max_node_count = var.max_node 
73
74  } 
75
76 
77
78  management { 
79
80    auto_repair  = "true" 
81
82    auto_upgrade = "true" 
83
84  } 
85
86  node_config { 
87
88 
89
90    labels = { 
91
92      env = var.project_id 
93
94    } 
95
96 
97
98    # preemptible  = true 
99
100    image_type   = var.image_type 
101
102    machine_type = var.machine_type 
103
104    tags         = ["gke-node", "${var.project_id}-gke"] 
105
106    metadata = { 
107
108      disable-legacy-endpoints = "true" 
109
110    } 
111
112        service_account = var.service_account 
113
114    oauth_scopes = [ 
115
116      "https://www.googleapis.com/auth/cloud-platform" 
117
118    ] 
119
120  } 
121
122} 

Step 4: Now it's time to create the NAT gateway for our private cluster. Here we use a module that creates the NAT gateway for us.

nat.tf

1module "cloud-nat" { 
2
3  source                             = "terraform-google-modules/cloud-nat/google" 
4
5  version                            = "~> 2.0" 
6
7  project_id                         = var.project_id 
8
9  region                             = var.region 
10
11  router                             = google_compute_router.router.name 
12
13  name                               = "nat-${var.cluster_name}" 
14
15  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES" 
16
17} 

Step 5: The NAT gateway needs a Cloud Router to communicate with external networks, so let's define the router configuration.

route.tf

1resource "google_compute_router" "router" { 
2
3  project = var.project_id 
4
5  name    = "router-${var.cluster_name}" 
6
7  network = google_compute_network.vpc.name 
8
9  region  = var.region 
10
11} 

Step 6: We have used variables throughout the Terraform code, so now we need to declare them.

variables.tf

1variable "project_id" { 
2
3  description = "project id" 
4
5} 
6
7 
8
9variable "region" { 
10
11  description = "region" 
12
13} 
14
15 
16
17 
18
19variable "gke_num_nodes" { 
20
21  description = "number of gke nodes" 
22
23} 
24
25 
26
27variable "master_ipv4_cidr_block" { 
28
29  description = "The IP range in CIDR notation (size must be /28) to use for the hosted master network" 
30
31  type        = string 
32
33  default     = "10.13.0.0/28" 
34
35 
36
37} 
38
39variable "master_authorized_networks_config" { 
40
41  description = <<EOF 
42
43  The desired configuration options for master authorized networks. Omit the nested cidr_blocks attribute to disallow external access (except the cluster node IPs, which GKE automatically whitelists) 
44
45  ### example format ### 
46
47  master_authorized_networks_config = [{ 
48
49    cidr_blocks = [{ 
50
51      # We are not restricting the access to control pane. If needed then we need define CIDR range to allow the access. 
52
53      cidr_block   = "0.0.0.0/0" 
54
55      display_name = "example_network" 
56
57    }], 
58
59  }] 
60
61EOF 
62
63  type        = list(any) 
64
65  default     = [] 
66
67} 
68
69variable "cluster_name" { 
70
71  description = "cluster name" 
72
73  type        = string 
74
75 
76
77} 
78
79 
80
81variable "image_type" { 
82
83  description = "container image type" 
84
85  type        = string 
86
87  default     = "cos_containerd" 
88
89 
90
91} 
92
93 
94
95variable "machine_type" { 
96
97  description = "node image type" 
98
99  type        = string 
100
101  default     = "e2-standard-2" 
102
103} 
104
105 
106
107variable "node_locations" { 
108
109  description = "Zone names on which nodes will be provisioned" 
110
111  type = list(string) 
112
113} 
114
115 
116
117variable "min_node" { 
118
119  description = "minimum no of nodes for autoscaling" 
120
121  type        = number 
122
123} 
124
125 
126
127variable "max_node" { 
128
129  description = "maximum no of nodes for autoscaling" 
130
131  type        = number 
132
133} 
134
135 
136
137variable "service_account" { 
138
139  description = "service account name" 
140
141  type = string 
142
143} 

Step 7: We have only declared some of the variables without defining their values. Terraform lets us define these values (including secrets) in a terraform.tfvars file. We have the following variables defined there.

project_id = "<project-id>" 

region     = "us-central1" 

service_account = "<service-account-name>" 

Step 8: Finally, we want to print basic cluster information as output after successful provisioning.

outputs.tf

1output "region" { 
2
3  value       = var.region 
4
5  description = "GCloud Region" 
6
7} 
8
9 
10
11output "project_id" { 
12
13  value       = var.project_id 
14
15  description = "GCloud Project ID" 
16
17} 
18
19 
20
21output "kubernetes_cluster_name" { 
22
23  value       = google_container_cluster.primary.name 
24
25  description = "GKE Cluster Name" 
26
27} 
28
29 
30
31output "kubernetes_cluster_host" { 
32
33  value       = google_container_cluster.primary.endpoint 
34
35  description = "GKE Cluster Host" 
36
37} 
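After a successful apply, these values can be read back at any time from the state, for example:

terraform output kubernetes_cluster_name
terraform output kubernetes_cluster_host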

Step 9: We have prepared a parameterized shell script that runs the Terraform code to provision the private GKE cluster.

GKE_cluster_commission.sh

#!/bin/bash

# This script takes 2 parameters: CLUSTER_NAME and ENVIRONMENT (dev or prod)
cd $BASE_PROJECT_PATH
ENV=$2
ROOT_DIR="$(pwd)"
rm -rf .terraform

if [[ $2 == "dev" ]]
then
        # DEV cluster deployment with 2 AZs and 2 nodes
        terraform init
        terraform validate
        terraform plan -var "cluster_name=$1" -var-file "$ROOT_DIR/vars/$ENV.tfvars" -out=".$ENV.plan"
        terraform apply ".$ENV.plan"
elif [[ $2 == "prod" ]]
then
        # PROD cluster deployment with 3 AZs and 3 nodes
        terraform init
        terraform validate
        terraform plan -var "cluster_name=$1" -var-file "$ROOT_DIR/vars/$ENV.tfvars" -out=".$ENV.plan"
        terraform apply ".$ENV.plan"
else
        echo "Please enter dev or prod as the 2nd parameter"
fi
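The script reads an environment-specific variable file from a vars/ directory (vars/dev.tfvars or vars/prod.tfvars). Those files are not reproduced in this post; based on the variables declared in variables.tf, a dev file could look roughly like this (all values are illustrative):

# Hypothetical vars/dev.tfvars -- one node per zone across two zones
gke_num_nodes  = 1
min_node       = 1
max_node       = 3
node_locations = ["us-central1-a", "us-central1-b"]
image_type     = "cos_containerd"
machine_type   = "e2-standard-2"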

Execute the following command to run the script. The source code directory is set in a separate file (keys/env.key), so we source that file before running the script.

source ./keys/env.key && bash GKE_cluster_commission.sh <cluster-name> <environment>
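The contents of keys/env.key are not shown in this post; assuming it only exports the variables the script and provider rely on, it might look like this:

# Hypothetical keys/env.key -- adjust paths to your environment
export BASE_PROJECT_PATH=/path/to/devops-toolchain/AWS_to_GCP_migration_config
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json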

For this demo, we are using a local Terraform state file, which is limited to this machine. However, we can keep the state file in a cloud location (AWS S3, GCS, etc.) by defining a backend configuration, which lets us version the state file and share it among multiple people.
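For example, a remote backend on Google Cloud Storage could be declared like this; the bucket name is a placeholder and the bucket must already exist:

terraform {
  backend "gcs" {
    # Placeholder bucket; create it beforehand and enable object versioning
    bucket = "my-terraform-state-bucket"
    prefix = "gke/private-cluster"
  }
}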

Step 10: Terraform will take up to 20 minutes to provision the cluster, and if everything goes well, we will see that the following resources have been created.

Command: terraform state list
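If the apply succeeded, the listing should include roughly the following addresses (the resource names inside the cloud-nat module may differ between module versions):

google_compute_network.vpc
google_compute_router.router
google_compute_subnetwork.subnet
google_container_cluster.primary
google_container_node_pool.primary_nodes
module.cloud-nat.google_compute_router_nat.main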

Google Cloud Console screenshots:

  • Private cluster status is green and ready to use
  • Cluster is regional and has the public endpoint enabled
  • Private cluster with custom VPC/subnet
  • Custom subnet with primary and secondary IP ranges
  • Cloud NAT

Now the question is: how do you access and connect to your private cluster?

Since the public endpoint is enabled on this private cluster, we can connect to the cluster from anywhere outside the GCP network. We can restrict access to the cluster by defining a CIDR range in the authorized networks. In this demo, we have allowed all networks to connect to the cluster by authorizing 0.0.0.0/0.

It is possible to restrict access to only the specific instances that need to reach the cluster by listing their IP addresses in the authorized networks.

Ex:   authorized_network = ["10.20.30.40/32", "36.78.134.80/32"]
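In this configuration, that restriction maps to the master_authorized_networks_config variable declared in variables.tf; a terraform.tfvars entry allowing only those two addresses might look like the following (the display names are illustrative):

master_authorized_networks_config = [{
  cidr_blocks = [{
    cidr_block   = "10.20.30.40/32"
    display_name = "office-vm"
  },
  {
    cidr_block   = "36.78.134.80/32"
    display_name = "bastion-host"
  }],
}]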

As our cluster has the public endpoint enabled and allows connectivity from any network, we can connect to the cluster from any machine with the following command.

gcloud container clusters get-credentials cluster-1 --region asia-south1 --project oceanic-granite-383005
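Once the credentials are fetched, a quick sanity check is to list the nodes; because the nodes are private, the EXTERNAL-IP column should show <none>:

kubectl get nodes -o wide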

Deploy a sample application to the Kubernetes cluster

As the cluster has been created and accessed successfully, it's now time to deploy a sample application. We are deploying an nginx application to test the cluster.

Run the following command to create the deployment:

kubectl create deployment nginx --image=nginx

The following command exposes nginx through a LoadBalancer service:

kubectl expose deployment nginx --type=LoadBalancer --port=80

The nginx deployment has been created and exposed through an external load balancer, as shown in the following screenshot.
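You can also check the service from the command line; once the load balancer is provisioned, its public address appears in the EXTERNAL-IP column:

kubectl get service nginx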

Now go ahead, copy the public IP of the load balancer, and open it in a browser. You will see the nginx welcome page.
