Use terraform to configure the GCP proxy setup

With terraform (https://terraform.io) we can express most of the infrastructure setup as code. This simplifies setting up a new proxy, makes the setup reproducible, and eliminates human error as much as possible.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=190634711
This commit is contained in:
jianglai 2018-03-27 10:29:44 -07:00
parent 2bbde9d9a9
commit 6dec95b980
15 changed files with 641 additions and 30 deletions


@ -28,6 +28,230 @@ service like [Spinnaker](https://www.spinnaker.io/) for release management.
## Detailed Instructions
We use [`gcloud`](https://cloud.google.com/sdk/gcloud/) and
[`terraform`](https://terraform.io) to configure the proxy project on GCP. We
use [`kubectl`](https://kubernetes.io/docs/tasks/tools/install-kubectl/) to
deploy the proxy to the project. Additionally,
[`gsutil`](https://cloud.google.com/storage/docs/gsutil) is used to create a GCS
bucket for storing the terraform state file. These instructions assume that all
four tools are installed.
### Setup GCP project
There are three projects involved:
- Nomulus project: the project that hosts Nomulus.
- Proxy project: the project that hosts this proxy.
- GCR ([Google Container
Registry](https://cloud.google.com/container-registry/)) project: the
project from which the proxy pulls its Docker image.
We recommend using the same project for Nomulus and the proxy, so that logs for
both are collected in the same place and easily accessible. If there are
multiple Nomulus projects (environments), such as production, sandbox, alpha,
etc., we recommend using just one of them as the GCR project. This way the same
proxy images are deployed to each environment, and the image running in
production is the same one that was previously tested in sandbox.
The following document outlines the procedure to set up the proxy for one
environment.
In the proxy project, create a GCS bucket to store the terraform state file:
```bash
$ gsutil config
$ gsutil mb -p <proxy-project> gs://<bucket-name>/
```
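Terraform overwrites the state file on every run. As an optional hardening step (not required by the rest of this setup), you can enable object versioning on the bucket so earlier state revisions remain recoverable:
```bash
# Optional: keep prior revisions of terraform.tfstate recoverable.
$ gsutil versioning set on gs://<bucket-name>/
```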
### Obtain a domain and SSL certificate
The proxy exposes two endpoints, whois.\<yourdomain.tld\> and
epp.\<yourdomain.tld\>. The base domain \<yourdomain.tld\> needs to be obtained
from a registrar ([Google Domains](https://domains.google) for example). Nomulus
operators can also self-allocate a domain in the TLDs under management.
[EPP protocol over TCP](https://tools.ietf.org/html/rfc5734) requires a
client-authenticated SSL connection. The operator of the proxy needs to obtain
an SSL certificate for domain epp.\<yourdomain.tld\>. [Let's
Encrypt](https://letsencrypt.org) offers SSL certificates free of charge, but
any other CA can fill the role.
Concatenate the certificate and its private key into one file:
```bash
$ cat <certificate.pem> <private.key> > <combined_secret.pem>
```
The order of the certificate and the private key inside the combined file does
not matter. However, if the certificate file is chained, i.e., it contains not
only the certificate for your domain but also certificates from intermediate
CAs, the certificates must appear in order: each certificate's issuer must be
the subject of the certificate that follows it.
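To double-check the order, you can print the subject and issuer of every certificate in the combined file. This is a sketch using `openssl`; the filename is the placeholder from the command above:
```bash
# Each "issuer" line should match the "subject" line of the certificate after it.
$ openssl crl2pkcs7 -nocrl -certfile <combined_secret.pem> \
    | openssl pkcs7 -print_certs -noout
```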
### Setup proxy project
First setup the [Application Default
Credential](https://cloud.google.com/docs/authentication/production) locally:
```bash
$ gcloud auth application-default login
```
Log in with an account that has the "Project Owner" role on all three projects
mentioned above.
Navigate to `java/google/registry/proxy/terraform`, create a folder called
`envs`, and inside it create a folder for the environment the proxy is deployed
to ("alpha" for example). Copy `example_config.tf` to the environment folder.
```bash
$ cd java/google/registry/proxy/terraform
$ mkdir -p envs/alpha
$ cp example_config.tf envs/alpha/config.tf
```
Now go to the environment folder, edit the `config.tf` file, and replace the
placeholders with actual project and domain names.
Run terraform:
```bash
$ cd envs/alpha
(edit config.tf)
$ terraform init -upgrade
$ terraform apply
```
Go over the proposed changes and answer "yes". Terraform will start configuring
the projects, including setting up clusters, keyrings, the load balancer, etc.
This takes a couple of minutes.
### Setup Nomulus
After terraform completes, it outputs some information, among which is the
client id of the service account created for the proxy. This needs to be added
to the Nomulus configuration file so that Nomulus accepts traffic from the
proxy. Edit the following section in
`java/google/registry/config/files/nomulus-config-<env>.yaml` and redeploy
Nomulus:
```yaml
oAuth:
  allowedOauthClientIds:
    - <client_id>
```
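If you need the client id again later, it can be re-read from the terraform state at any time by running, from the environment folder:
```bash
$ terraform output proxy_service_account_client_id
```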
### Setup nameservers
The terraform output (run `terraform output` in the environment folder to show
it again) also shows the nameservers of the proxy domain (\<yourdomain.tld\>).
Delegate this domain to these nameservers (through your registrar). If the
domain is self-allocated by Nomulus, run:
```bash
$ nomulus -e production update_domain <yourdomain.tld> \
    -c <registrar_client_name> -n <nameserver1>,<nameserver2>,...
```
### Setup named ports
Unfortunately, terraform currently cannot add named ports on the instance groups
of the GKE clusters it manages. [Named
ports](https://cloud.google.com/compute/docs/load-balancing/http/backend-service#named_ports)
are needed for the load balancer it sets up to route traffic to the proxy. To
set named ports, in the environment folder, do:
```bash
$ bash ../../update_named_ports.sh
```
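The script extracts the project, zone, and instance group name from each instance group URL in the terraform output, then calls `gcloud compute instance-groups set-named-ports` on it. The URL parsing can be sketched with a hypothetical instance group URL (the project, zone, and group name below are illustrative):

```bash
# Hypothetical instance group URL, in the shape printed by `terraform output`.
url="https://www.googleapis.com/compute/v1/projects/my-proxy-project/zones/us-east4-a/instanceGroups/my-instance-group"
# Splitting on "/" puts the project in field 7, the zone in field 9,
# and the instance group name in field 11.
echo "$url" | awk -F '/' '{print "--project", $7, "--zone", $9, $11}'
# prints: --project my-proxy-project --zone us-east4-a my-instance-group
```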
### Encrypt the certificate to Cloud KMS
With the newly set up Cloud KMS key, encrypt the certificate/key combo file
created earlier:
```bash
$ gcloud kms encrypt --plaintext-file <combined_secret.pem> \
    --ciphertext-file <combined_secret.pem.enc> \
    --key <key-name> --keyring <keyring-name> --location global
```
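Before deleting the plaintext, you can confirm the encryption round-trip by decrypting the ciphertext and comparing it with the original. A sketch, using `-` to make `gcloud` write the plaintext to stdout:
```bash
# No output from diff means the round-trip succeeded.
$ gcloud kms decrypt --ciphertext-file <combined_secret.pem.enc> \
    --plaintext-file - --key <key-name> --keyring <keyring-name> \
    --location global | diff - <combined_secret.pem>
```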
Place the encrypted file `<combined_secret.pem.enc>` in
`java/google/registry/proxy/resources`.
### Edit proxy config file
Proxy configuration files are at `java/google/registry/proxy/config/`. There is
a default config that provides most values needed to run the proxy, and several
environment-specific configs for proxy instances that communicate with
different Nomulus environments. The values specified in the environment-specific
override those in the default file.
The values that need to be changed include the project name, the Nomulus
endpoint, encrypted certificate/key combo filename, Cloud KMS keyring and key
names, etc. Refer to the default file for detailed descriptions on each field.
### Upload proxy docker image to GCR
Edit the `proxy_push` rule in `java/google/registry/proxy/BUILD`, filling in the
GCR project name and the image name to push to. Note that as currently set up,
all images pushed to GCR are tagged `bazel`, and the GKE deployment object loads
the image tagged `bazel`. This is fine for testing, but for production you
should give images unique tags (also configured in the `proxy_push` rule).
To push to GCR, run:
```bash
$ bazel run java/google/registry/proxy:proxy_push
```
### Deploy proxy
Terraform by default creates three clusters, in the Americas, EMEA, and APAC,
respectively. We will have to deploy to each cluster separately. The cluster
information is shown by `terraform output` as well.
Deployment is defined in two files, `proxy-deployment-<env>.yaml` and
`proxy-service.yaml`. Edit `proxy-deployment-<env>.yaml` for your environment,
filling in the GCR project name and image name. You can also change the
arguments in the file, for example to turn on logging. To deploy to a cluster:
```bash
# Get credentials to deploy to a cluster.
$ gcloud container clusters get-credentials --project <proxy-project> \
    --zone <cluster-zone> <cluster-name>
# Deploy environment-specific kubernetes objects.
$ kubectl create -f \
    java/google/registry/proxy/kubernetes/proxy-deployment-<env>.yaml
# Deploy shared kubernetes objects.
$ kubectl create -f \
    java/google/registry/proxy/kubernetes/proxy-service.yaml
```
Repeat this for all three clusters.
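After creating the objects in a cluster, you can check that the proxy pods came up (the `app: proxy` label comes from the deployment yaml):
```bash
$ kubectl get pods -l app=proxy
```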
### Afterwork
Remember to turn on [Stackdriver
Monitoring](https://cloud.google.com/monitoring/docs/) for the proxy project as
we use it to collect metrics from the proxy.
You are done! The proxy should be running now. Store the private key safely, or
delete it, as you now have the encrypted file shipped with the proxy. See
"Additional Steps" in the appendix for other things to check.
## Appendix
Here we give detailed instructions on how to configure a GCP project to host the
proxy manually. We strongly recommend against doing so because it is tedious and
error-prone. Using Terraform is much easier. The following instructions are
provided for educational purposes, so that readers understand why we set up the
infrastructure this way. The Terraform config is essentially a translation of
the following procedure.
### Set default project
The proxy can run on its own GCP project, or use the existing project that also
@ -93,6 +317,15 @@ $ gcloud projects add-iam-policy-binding <project-id> \
    --member serviceAccount:<service-account-email> --role roles/viewer
```
Also bind the "Logs Writer" role to the proxy service account so that it can
write logs to [Stackdriver Logging](https://cloud.google.com/logging/).
```bash
$ gcloud projects add-iam-policy-binding <project-id> \
    --member serviceAccount:<service-account-email> \
    --role roles/logging.logWriter
```
### Obtain a domain and SSL certificate
A domain is needed (if you do not want to rely on IP addresses) for clients to
@ -202,9 +435,17 @@ immediately after.
```bash
$ gcloud container clusters create proxy-americas-cluster --enable-autorepair \
    --enable-autoupgrade --enable-autoscaling --max-nodes=3 --min-nodes=1 \
    --zone=us-east1-c --cluster-version=1.9.4-gke.1 --tags=proxy-cluster \
    --service-account=<service-account-email>
```
We give the GCE instances inside the cluster the same credential as the proxy
service account, which makes it easier to limit permissions granted to service
accounts. If we use the default GCE service account, we'd have to grant the
default GCE service account permission to read from GCR in order to download
images of the proxy to create pods, which gives *any* GCE instance with the
default service account that permission.
Note the `--tags` flag: it will apply the tag to all GCE instances running in
the cluster, making it easier to set up firewall rules later on. Use the same
tag for all clusters.
@ -239,29 +480,13 @@ To push to GCR, run:
$ bazel run java/google/registry/proxy:proxy_push
```
If the GCP project that hosts the images (GCR project) is different from the
project that the proxy runs in (proxy project), give the proxy service account
the "Storage Object Viewer" role on the GCR project.
```bash
$ gcloud projects add-iam-policy-binding <image-project> \
    --member serviceAccount:<service-account-email> \
    --role roles/storage.objectViewer
```


@ -14,10 +14,6 @@ spec:
      labels:
        app: proxy
    spec:
      containers:
      - name: proxy
        image: gcr.io/GCP_PROJECT/IMAGE_NAME:bazel
@ -38,14 +34,9 @@ spec:
            port: health-check
          initialDelaySeconds: 15
          periodSeconds: 20
        imagePullPolicy: Always
        args: ["--env", "alpha", "--log"]
        env:
        - name: POD_ID
          valueFrom:
            fieldRef:


@ -0,0 +1,35 @@
terraform {
  backend "gcs" {
    # The name of the GCS bucket that stores the terraform.tfstate file.
    bucket = "YOUR_GCS_BUCKET"
    prefix = "terraform/state"
  }
}

module "proxy" {
  source               = "../../modules"
  proxy_project_name   = "YOUR_PROXY_PROJECT"
  nomulus_project_name = "YOUR_NOMULUS_PROJECT"
  gcr_project_name     = "YOUR_GCR_PROJECT"
  proxy_domain_name    = "YOUR_PROXY_DOMAIN"
}

output "proxy_service_account_client_id" {
  value = "${module.proxy.proxy_service_account_client_id}"
}

output "proxy_name_servers" {
  value = "${module.proxy.proxy_name_servers}"
}

output "proxy_instance_groups" {
  value = "${module.proxy.proxy_instance_groups}"
}

output "proxy_ip_addresses" {
  value = {
    ipv4 = "${module.proxy.proxy_ipv4_address}",
    ipv6 = "${module.proxy.proxy_ipv6_address}"
  }
}


@ -0,0 +1,3 @@
provider "google" {
  project = "${var.proxy_project_name}"
}


@ -0,0 +1,36 @@
resource "google_dns_managed_zone" "proxy_domain" {
  name     = "proxy-domain"
  dns_name = "${var.proxy_domain_name}."
}

resource "google_dns_record_set" "proxy_epp_a_record" {
  name         = "epp.${google_dns_managed_zone.proxy_domain.dns_name}"
  type         = "A"
  ttl          = 300
  managed_zone = "${google_dns_managed_zone.proxy_domain.name}"
  rrdatas      = ["${google_compute_global_address.proxy_ipv4_address.address}"]
}

resource "google_dns_record_set" "proxy_epp_aaaa_record" {
  name         = "epp.${google_dns_managed_zone.proxy_domain.dns_name}"
  type         = "AAAA"
  ttl          = 300
  managed_zone = "${google_dns_managed_zone.proxy_domain.name}"
  rrdatas      = ["${google_compute_global_address.proxy_ipv6_address.address}"]
}

resource "google_dns_record_set" "proxy_whois_a_record" {
  name         = "whois.${google_dns_managed_zone.proxy_domain.dns_name}"
  type         = "A"
  ttl          = 300
  managed_zone = "${google_dns_managed_zone.proxy_domain.name}"
  rrdatas      = ["${google_compute_global_address.proxy_ipv4_address.address}"]
}

resource "google_dns_record_set" "proxy_whois_aaaa_record" {
  name         = "whois.${google_dns_managed_zone.proxy_domain.dns_name}"
  type         = "AAAA"
  ttl          = 300
  managed_zone = "${google_dns_managed_zone.proxy_domain.name}"
  rrdatas      = ["${google_compute_global_address.proxy_ipv6_address.address}"]
}


@ -0,0 +1,32 @@
module "proxy_gke_americas" {
  source                      = "./gke"
  proxy_cluster_region        = "americas"
  proxy_service_account_email = "${google_service_account.proxy_service_account.email}"
  proxy_ports                 = "${var.proxy_ports}"
}

module "proxy_gke_emea" {
  source                      = "./gke"
  proxy_cluster_region        = "emea"
  proxy_service_account_email = "${google_service_account.proxy_service_account.email}"
  proxy_ports                 = "${var.proxy_ports}"
}

module "proxy_gke_apac" {
  source                      = "./gke"
  proxy_cluster_region        = "apac"
  proxy_service_account_email = "${google_service_account.proxy_service_account.email}"
  proxy_ports                 = "${var.proxy_ports}"
}

locals {
  "proxy_instance_groups" = {
    americas = "${module.proxy_gke_americas.proxy_instance_group}",
    emea     = "${module.proxy_gke_emea.proxy_instance_group}",
    apac     = "${module.proxy_gke_apac.proxy_instance_group}",
  }
}

output "proxy_instance_groups" {
  value = "${local.proxy_instance_groups}"
}


@ -0,0 +1,37 @@
locals {
  proxy_cluster_zone = "${lookup(var.proxy_cluster_zones, var.proxy_cluster_region)}"
}

data "google_container_engine_versions" "gke_version" {
  zone = "${local.proxy_cluster_zone}"
}

resource "google_container_cluster" "proxy_cluster" {
  name               = "proxy-cluster-${var.proxy_cluster_region}"
  zone               = "${local.proxy_cluster_zone}"
  node_version       = "${data.google_container_engine_versions.gke_version.latest_node_version}"
  min_master_version = "${data.google_container_engine_versions.gke_version.latest_master_version}"

  node_pool {
    name               = "proxy-node-pool"
    initial_node_count = 1

    node_config {
      tags = [
        "proxy-cluster",
      ]

      service_account = "${var.proxy_service_account_email}"

      oauth_scopes = [
        "https://www.googleapis.com/auth/cloud-platform",
        "https://www.googleapis.com/auth/userinfo.email",
      ]
    }

    autoscaling {
      max_node_count = 5
      min_node_count = 1
    }

    management {
      auto_repair  = true
      auto_upgrade = true
    }
  }
}


@ -0,0 +1,16 @@
variable "proxy_service_account_email" {}

variable "proxy_cluster_region" {}

variable "proxy_cluster_zones" {
  type = "map"

  default = {
    americas = "us-east4-a"
    emea     = "europe-west4-b"
    apac     = "asia-northeast1-c"
  }
}

variable "proxy_ports" {
  type = "map"
}


@ -0,0 +1,3 @@
output "proxy_instance_group" {
  value = "${google_container_cluster.proxy_cluster.instance_group_urls[0]}"
}


@ -0,0 +1,26 @@
resource "google_service_account" "proxy_service_account" {
  account_id   = "proxy-service-account"
  display_name = "Nomulus proxy service account"
}

resource "google_project_iam_member" "nomulus_project_viewer" {
  project = "${var.nomulus_project_name}"
  role    = "roles/viewer"
  member  = "serviceAccount:${google_service_account.proxy_service_account.email}"
}

resource "google_project_iam_member" "gcr_storage_viewer" {
  project = "${var.gcr_project_name}"
  role    = "roles/storage.objectViewer"
  member  = "serviceAccount:${google_service_account.proxy_service_account.email}"
}

resource "google_project_iam_member" "metric_writer" {
  role   = "roles/monitoring.metricWriter"
  member = "serviceAccount:${google_service_account.proxy_service_account.email}"
}

resource "google_project_iam_member" "log_writer" {
  role   = "roles/logging.logWriter"
  member = "serviceAccount:${google_service_account.proxy_service_account.email}"
}


@ -0,0 +1,31 @@
# GCP project in which the proxy runs.
variable "proxy_project_name" {}

# GCP project in which Nomulus runs.
variable "nomulus_project_name" {}

# GCP project from which the proxy image is pulled.
variable "gcr_project_name" {}

# The base domain name of the proxy, without the whois. or epp. part.
variable "proxy_domain_name" {}

# Cloud KMS keyring name.
variable "proxy_key_ring" {
  default = "proxy-key-ring"
}

# Cloud KMS key name.
variable "proxy_key" {
  default = "proxy-key"
}

# Node ports exposed by the proxy.
variable "proxy_ports" {
  type = "map"

  default = {
    health_check = 30000
    whois        = 30001
    epp          = 30002
  }
}


@ -0,0 +1,16 @@
resource "google_kms_key_ring" "proxy_key_ring" {
  name     = "${var.proxy_key_ring}"
  location = "global"
}

resource "google_kms_crypto_key" "proxy_key" {
  name     = "${var.proxy_key}"
  key_ring = "${google_kms_key_ring.proxy_key_ring.id}"
}

resource "google_kms_crypto_key_iam_member" "ssl_key_decrypter" {
  crypto_key_id = "${google_kms_crypto_key.proxy_key.id}"
  role          = "roles/cloudkms.cryptoKeyDecrypter"
  member        = "serviceAccount:${google_service_account.proxy_service_account.email}"
}


@ -0,0 +1,116 @@
resource "google_compute_global_address" "proxy_ipv4_address" {
  name       = "proxy-ipv4-address"
  ip_version = "IPV4"
}

resource "google_compute_global_address" "proxy_ipv6_address" {
  name       = "proxy-ipv6-address"
  ip_version = "IPV6"
}

resource "google_compute_firewall" "proxy_firewall" {
  name    = "proxy-firewall"
  network = "default"

  allow {
    protocol = "tcp"

    ports = [
      "${var.proxy_ports["epp"]}",
      "${var.proxy_ports["whois"]}",
      "${var.proxy_ports["health_check"]}",
    ]
  }

  source_ranges = [
    "130.211.0.0/22",
    "35.191.0.0/16",
  ]

  target_tags = [
    "proxy-cluster",
  ]
}

resource "google_compute_health_check" "proxy_health_check" {
  name = "proxy-health-check"

  tcp_health_check {
    port     = "${var.proxy_ports["health_check"]}"
    request  = "HEALTH_CHECK_REQUEST"
    response = "HEALTH_CHECK_RESPONSE"
  }
}

resource "google_compute_backend_service" "epp_backend_service" {
  name        = "epp-backend-service"
  protocol    = "TCP"
  timeout_sec = 3600
  port_name   = "epp"

  backend {
    group = "${local.proxy_instance_groups["americas"]}"
  }

  backend {
    group = "${local.proxy_instance_groups["emea"]}"
  }

  backend {
    group = "${local.proxy_instance_groups["apac"]}"
  }

  health_checks = [
    "${google_compute_health_check.proxy_health_check.self_link}",
  ]
}

resource "google_compute_backend_service" "whois_backend_service" {
  name        = "whois-backend-service"
  protocol    = "TCP"
  timeout_sec = 60
  port_name   = "whois"

  backend {
    group = "${local.proxy_instance_groups["americas"]}"
  }

  backend {
    group = "${local.proxy_instance_groups["emea"]}"
  }

  backend {
    group = "${local.proxy_instance_groups["apac"]}"
  }

  health_checks = [
    "${google_compute_health_check.proxy_health_check.self_link}",
  ]
}

resource "google_compute_target_tcp_proxy" "epp_tcp_proxy" {
  name            = "epp-tcp-proxy"
  proxy_header    = "PROXY_V1"
  backend_service = "${google_compute_backend_service.epp_backend_service.self_link}"
}

resource "google_compute_target_tcp_proxy" "whois_tcp_proxy" {
  name            = "whois-tcp-proxy"
  proxy_header    = "PROXY_V1"
  backend_service = "${google_compute_backend_service.whois_backend_service.self_link}"
}

resource "google_compute_global_forwarding_rule" "epp_ipv4_forwarding_rule" {
  name       = "epp-ipv4-forwarding-rule"
  ip_address = "${google_compute_global_address.proxy_ipv4_address.address}"
  target     = "${google_compute_target_tcp_proxy.epp_tcp_proxy.self_link}"
  port_range = "700"
}

resource "google_compute_global_forwarding_rule" "epp_ipv6_forwarding_rule" {
  name       = "epp-ipv6-forwarding-rule"
  ip_address = "${google_compute_global_address.proxy_ipv6_address.address}"
  target     = "${google_compute_target_tcp_proxy.epp_tcp_proxy.self_link}"
  port_range = "700"
}

resource "google_compute_global_forwarding_rule" "whois_ipv4_forwarding_rule" {
  name       = "whois-ipv4-forwarding-rule"
  ip_address = "${google_compute_global_address.proxy_ipv4_address.address}"
  target     = "${google_compute_target_tcp_proxy.whois_tcp_proxy.self_link}"
  port_range = "43"
}

resource "google_compute_global_forwarding_rule" "whois_ipv6_forwarding_rule" {
  name       = "whois-ipv6-forwarding-rule"
  ip_address = "${google_compute_global_address.proxy_ipv6_address.address}"
  target     = "${google_compute_target_tcp_proxy.whois_tcp_proxy.self_link}"
  port_range = "43"
}


@ -0,0 +1,15 @@
output "proxy_name_servers" {
  value = "${google_dns_managed_zone.proxy_domain.name_servers}"
}

output "proxy_service_account_client_id" {
  value = "${google_service_account.proxy_service_account.unique_id}"
}

output "proxy_ipv4_address" {
  value = "${google_compute_global_address.proxy_ipv4_address.address}"
}

output "proxy_ipv6_address" {
  value = "${google_compute_global_address.proxy_ipv6_address.address}"
}


@ -0,0 +1,29 @@
#!/bin/bash
# Copyright 2018 The Nomulus Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Terraform currently cannot set named ports on the instance groups underlying
# the GKE clusters it creates. Here we read the instance group URLs from the
# terraform output, extract the project, zone, and instance group names, and
# then call gcloud to add the named ports.
terraform output proxy_instance_groups | awk '{print $3}' | \
awk -F '/' '{print "--project", $7, "--zone", $9, $11}' |
{
  while read line
  do
    gcloud compute instance-groups set-named-ports \
      --named-ports whois:30001,epp:30002 $line
  done
}