Commit graph

40 commits

Author SHA1 Message Date
jianglai
db60f0fd12 Create canary records in proxy zones
This allows for the creation of records like epp-canary.registr.google.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=199850436
2018-06-18 17:50:15 -04:00
jianglai
61f6e666b1 Enforce no logging in production environment
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=199156367
2018-06-06 15:10:15 -04:00
jianglai
3960207502 Log source IP when logging is enabled
We will only enable logging in non-production environments, so enabling this should not raise any privacy concerns.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=198744739
2018-06-06 15:02:31 -04:00
jianglai
af8b050446 Tweak log message a bit
SERVER and CLIENT are a bit hard to understand.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=198721870
2018-06-06 15:01:00 -04:00
jianglai
65ac28fae5 Increase GKE cluster upgrade timeout to 30m
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=198322158
2018-05-30 12:18:54 -04:00
jianglai
a5abb05761 Migrating to fluent logging (red)
This is a 'red' Flogger migration CL. Red CLs contain changes which are
likely not to work without manual intervention.

Note that it may not even be possible to directly migrate the logger
usage in this CL to the Flogger API and some additional refactoring may
be required. If this is the case, please note that it should be safe to
submit any outstanding 'green' and 'yellow' CLs prior to tackling this.

If you feel that your use case is not covered by the existing Flogger API
please raise a feature request at [] and
revert this CL.

For more information, see []
Base CL: 197331037

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=197503952
2018-05-30 12:18:54 -04:00
jianglai
05f166918f Migrating to fluent logging (green)
This is a 'green' Flogger migration CL. Green CLs are intended to be as
safe as possible and should be easy to review and submit.

No changes should be necessary to the code itself prior to submission,
but small changes to BUILD files may be required.

Changes within files are completely independent of each other, so this CL
can be safely split up for review using tools such as Rosie.

For more information, see []
Base CL: 197331037

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=197466715
2018-05-30 12:18:54 -04:00
jianglai
0cb303ed7f Fix proxy metrics instrumentation bug
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=197209531
2018-05-30 12:18:54 -04:00
jianglai
68b24f0a54 Migrate to internal FormattingLogger in GCP proxy in preparation of migration to Flogger
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=197199265
2018-05-30 12:18:54 -04:00
jianglai
053c52e0bd Add Flogger to GCP proxy
This adds a dummy flogger logging statement in the GCP proxy to ensure that it
works.

TESTED=Deployed to alpha and verified that flogger works. Also passed FOSS
tests.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=196899036
2018-05-30 12:18:54 -04:00
jianglai
1248a7722b Enable logging in sandbox GCP proxies
This makes it easier to debug issues. There are no privacy concerns in sandbox.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=197045576
2018-05-17 21:52:35 -04:00
jianglai
0fb845e81a Remove no quota leased warning from quota handler inactive callback
When EPP SSL handshake is unsuccessful, #channelInactive is called but there are no quotas to return, because quotas are only leased upon the first #channelRead. There is no need to log a warning and throw an exception in this case because the handshake exception would have been thrown already. Throwing a second exception just crowds the log.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=197016756
2018-05-17 21:52:35 -04:00
jianglai
e5f4b5a17b Add Flogger to GCP proxy
This adds a dummy flogger logging statement in the GCP proxy to ensure that it
works.

TESTED=Deployed to alpha and verified that flogger works. Also passed FOSS
tests.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=196899036
2018-05-17 21:52:35 -04:00
guyben
7bf8c02264 Replace uses of X.to(Upper|Lower)Case() with Ascii.to(Upper|Lower)Case(X)
Locales are weird. Even if all our characters are individually just 0-9a-z_,
different locales might still convert them to upper/lower case differently...
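
A minimal sketch of the idea, written in Python rather than the proxy's Java. The function names are hypothetical. Only the ASCII letters are converted and every other character is left untouched, so the result cannot depend on a locale or on Unicode case-mapping rules:

```python
def ascii_upper(text: str) -> str:
    """Upper-case only the ASCII letters a-z; leave everything else as is."""
    return "".join(
        chr(ord(ch) - 32) if "a" <= ch <= "z" else ch
        for ch in text
    )


def ascii_lower(text: str) -> str:
    """Lower-case only the ASCII letters A-Z; leave everything else as is."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch
        for ch in text
    )
```

For identifiers made only of 0-9a-z_ both approaches agree; they differ once a non-ASCII character appears, which is where general-purpose case mapping starts applying language-specific rules.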

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=193512312
2018-04-23 15:02:31 -04:00
jianglai
f289259101 Change UserPolicy to PUBLIC on WHOIS and EPP endpoints
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=193407195
2018-04-23 14:59:24 -04:00
jianglai
77bfa5f4b8 Move autoscale object to service yaml file
The autoscaling manifest doesn't really change much from environment to environment. It makes sense to move it to the service yaml file, which is not environment dependent.

Also enhanced the bashrc function to update the deployment manifest when deploying the proxy to alpha.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=193407184
2018-04-23 14:57:52 -04:00
jianglai
c6a4264606 Setup sandbox for GCP proxy
1) Clean up alpha config to only allow alpha proxy, removing test proxy client id.
2) Add sandbox service account client id to sandbox config.
3) Add sandbox config to nomulus and proxy, remove TEST environment, which is not being used anymore. (Test now uses LOCAL.)
4) Add sandbox kubernetes config

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=193400909
2018-04-23 14:51:35 -04:00
jianglai
744727a58f Update domain registry proxy terraform annealing config
1) Change annealing target to watch for sandbox terraform config instead of test.
2) Delete terraform config for test project, as this project will be turned down.
3) Do not ask annealing to watch for alpha project terraform config, as we intend to change alpha regularly and manually.
4) Make terraform output display both service account email and client id.
5) Change canary node ports to 3100X, as 4000X is out of range for kubernetes.
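
The port change in item 5 follows from the default NodePort range of Kubernetes, 30000-32767. A small check, as a sketch in Python; the function name and the sample ports are illustrative, not taken from the terraform config:

```python
# Default NodePort range of a Kubernetes cluster. A cluster admin can change
# it, so treat these bounds as an assumption about an unmodified cluster.
NODE_PORT_MIN = 30000
NODE_PORT_MAX = 32767


def is_valid_node_port(port: int) -> bool:
    """Return True if the port lies inside the default NodePort range."""
    return NODE_PORT_MIN <= port <= NODE_PORT_MAX
```

A port such as 31001 passes this check, while 40001 does not.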

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=193383457
2018-04-23 14:48:29 -04:00
jianglai
eab6fcc8e6 Add networking settings for canary proxies
Canary proxies are not receiving real traffic but can be useful when testing Nomulus deployment (probers will probe canary proxy and compare metrics with production proxy). This CL added a separate load balancer for a canary proxy, running on the same clusters as production proxy.

The canary proxies have their own IP addresses, but are not assigned domain names. Probers will directly connect to these endpoints by IP.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=193234937
2018-04-23 14:46:56 -04:00
jianglai
23c9cf926c Set namespace as default
This gets around a bug in Spinnaker where the namespace, if missing in the manifest, is set to "spinnaker".

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=192825895
2018-04-23 14:39:09 -04:00
jianglai
983bd27ee0 Read GCP proxy EPP SSL secret from GCS
This allows us to not ship the proxy with certificates/private keys. The secret is still encrypted by KMS. Reading the secret only happens once when the first EPP request comes in, which should not incur any tangible performance penalty.
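
The "read once, on first use" behavior can be sketched as a lazily initialized value. This is a Python illustration only; the class name and the loader callable are hypothetical, and the real loader would fetch the encrypted secret from GCS and decrypt it with KMS:

```python
import threading
from typing import Callable, Optional


class LazySecret:
    """Loads a secret on the first get() and caches it for later calls."""

    def __init__(self, loader: Callable[[], bytes]) -> None:
        self._loader = loader  # hypothetical: downloads and decrypts the secret
        self._lock = threading.Lock()
        self._value: Optional[bytes] = None
        self._loaded = False

    def get(self) -> bytes:
        if not self._loaded:
            with self._lock:
                # Re-check under the lock so concurrent first requests
                # trigger only one load.
                if not self._loaded:
                    self._value = self._loader()
                    self._loaded = True
        return self._value
```

Only the first request pays the cost of the load; every later call returns the cached bytes.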

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=191771680
2018-04-10 16:38:31 -04:00
jianglai
18a145eef1 Use self signed certificate when running the proxy locally
This allows us to not obtain a certificate and encrypt it with KMS when running the proxy locally during development.

Also updated FOSS build dagger version.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=191746309
2018-04-10 16:36:56 -04:00
jianglai
4c06b36118 Format terraform files
For some reason the auto-formatting didn't happen when these files were first checked in.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=191589487
2018-04-10 16:27:23 -04:00
jianglai
e7f033201b Use process substitution in terraform config script
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=191584425
2018-04-10 16:25:36 -04:00
jianglai
6dec95b980 Use terraform to config GCP proxy setup
With terraform (https://terraform.io) we can convert most of the infrastructure setup into code. This simplifies setting up a new proxy as well as providing reproducibility in the setup, eliminating human errors as much as possible.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=190634711
2018-04-02 16:46:01 -04:00
jianglai
c72e01f75e Clean up some code quality issues in GCP proxy
All changes are suggested by IntelliJ code inspection.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=189586104
2018-03-19 18:44:12 -04:00
jianglai
22b575b17d Update proxy k8s configs
Some changes are made to the configs so that they agree with the setup guide in []

Combined deployment and autoscale manifests together because they work together.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=189403435
2018-03-19 18:40:42 -04:00
jianglai
33ec789a44 Use GKE-specific metrics in the proxy
Associate the custom metrics with the correct monitored resource type. The labels of the monitored resource are either obtained from environment variables for the container, configured in the GKE deployment file, or queried from the GCE metadata server. Using the correct monitored resource can help performance and reduce out-of-order metric writes.

Also changed the metrics display name to be more descriptive.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=189184411
2018-03-19 18:29:39 -04:00
jianglai
00bf8a999f Handle malformed proxy protocol header
If the proxy protocol header contains a malformatted string, such as "PROXY UNKNOWN", instead of throwing and killing the connection, use the TCP source IP as the remote IP.

Also changed how the header is read from the buffer, to avoid a potential Netty resource leak. Originally the header was read into another ByteBuf, which needs to be explicitly released in order for Netty to reclaim its memory (http://netty.io/wiki/reference-counted-objects.html). Now we just read it into a byte array and let the JVM GC it.
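
The fallback described above can be sketched as follows. This is a Python illustration for a version 1 (text) proxy protocol header, not the proxy's Netty handler; the function name is hypothetical:

```python
import ipaddress


def remote_ip_from_header(header: bytes, tcp_source_ip: str) -> str:
    """Return the source IP from a v1 proxy protocol header.

    Falls back to the TCP source IP instead of raising when the header is
    malformed or carries no address (for example "PROXY UNKNOWN").
    """
    try:
        fields = header.decode("ascii").strip().split(" ")
    except UnicodeDecodeError:
        return tcp_source_ip
    # Expected shape: PROXY TCP4|TCP6 <src ip> <dst ip> <src port> <dst port>
    if len(fields) != 6 or fields[0] != "PROXY" or fields[1] not in ("TCP4", "TCP6"):
        return tcp_source_ip
    try:
        return str(ipaddress.ip_address(fields[2]))
    except ValueError:
        return tcp_source_ip
```

The header is handled as a plain byte string here, which mirrors the choice of reading it into a byte array rather than a reference-counted buffer.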

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=188047084
2018-03-06 19:26:31 -05:00
jianglai
84eab90000 Make GCP proxy log in a Stackdriver logging compliant format
When not running locally, the logging formatter is set to convert the log record to a single-line JSON string that Stackdriver logging agent running in GKE will pick up and parse correctly.
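
A rough sketch of such a formatter, using Python's logging module for illustration (the proxy uses a Java logging formatter). The class name and the exact set of JSON fields are assumptions; "severity" and "message" are assumed here to be the fields the logging agent looks for:

```python
import json
import logging


class SingleLineJsonFormatter(logging.Formatter):
    """Formats each log record as one line of JSON."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "severity": record.levelname,    # assumed field name
            "logger": record.name,           # illustrative extra field
            "message": record.getMessage(),  # assumed field name
        }
        # json.dumps escapes embedded newlines, so the output stays on one line.
        return json.dumps(payload)
```

It would be attached to a handler with `handler.setFormatter(SingleLineJsonFormatter())` in every environment except local runs.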

Also removed the redundant logging handlers in the proxy frontend connection. They had two problems: 1) Logging all frontend traffic can leak PII such as client IPs, although this is less of a concern because the GCP TCP proxy load balancer masquerades source IPs. 2) They only logged the HTTP requests/responses that the frontend connection sends to/receives from the backend connection, but the backend already has its own logging handler that logs the same messages it gets from/sends to the GAE app, so logging in the frontend connection does not really give extra information.
Logging of some potential PII, such as the source IP of a proxied connection, is also removed.

Thirdly, added a k8s autoscaling object that scales the containers based on CPU load. The default target load is 80%. This, in connection with GKE cluster VM autoscaling, means that when traffic is low, we'll only have one VM running one container of the proxy.

Fixes a bug where the MetricsComponent generates a separate ProxyConfig that does not call the parse method on the command line args passed, resulting in the default Environment always being used when constructing the metric reporter.

Lastly, a little bit of cleanup of the MOE config script: no newlines are necessary, as the BUILD files are formatted after string substitution.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=188029019
2018-03-06 19:23:23 -05:00
jianglai
753a269357 Use bazel rules to build docker image and push to GCR
Using bazel to build and push the image results in reproducible builds.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=187252645
2018-03-06 19:08:24 -05:00
jianglai
6a994f320f Add GKE deployment config files for GCP proxy
This CL sets up the kubernetes configuration files necessary to deploy the proxy service to k8s (GKE to be specific). Because a kubernetes service can only expose node ports higher than 30000, the default ports that the containers expose are also changed to >30000 so that they are consistent. This is *not* necessary, but makes it easier to remember which ports are for what purpose.

Note that we are not setting up a load balancing service. The way it is set up now, the services are only visible within the clusters, on each node at the specified node ports. The load balancer k8s sets up uses the GCP L4 load balancer, which does not support IPv6 (because it does not do TCP termination at the LB but rather just routes packets to cluster nodes, and GCE VMs do not support IPv6 yet). The L4 load balancer also only provides regional IPs on the frontend, which means proxies running in different regions (Americas, EMEA, APAC) would all have different IPs, which in turn offloads regional routing determination to the DNS system, adding complexity.

A user of the proxy instead should set up TCP proxy load balancing in GCP separately and point traffic to the VM group(s) backing the k8s cluster. This allows for a single global anycast IP (IPv4 and IPv6) to be allocated at the load balancer frontend.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=187046521
2018-03-06 18:57:43 -05:00
jianglai
f96a0b7da9 Reduce OAuth token cache time to 30min
It seems that even though the token is supposed to be valid for 60min, in
practice it expires before that. Reducing caching time to 30min solves the
problem (at least as far as I can tell). This should not increase too much load
as we are only calling the API twice an hour instead of once.
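
The caching behavior can be sketched like this. It is a Python illustration under stated assumptions: the class name, the fetch callable and the injectable clock are hypothetical, and only the 30-minute lifetime comes from the message above:

```python
import time
from typing import Callable, Optional


class CachedToken:
    """Caches a token and refetches it once it is older than the TTL."""

    def __init__(
        self,
        fetch: Callable[[], str],             # hypothetical: calls the OAuth API
        ttl_seconds: float = 30 * 60,         # 30 minutes, per the commit
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now - self._fetched_at >= self._ttl:
            self._token = self._fetch()
            self._fetched_at = now
        return self._token
```

With a 30-minute lifetime the fetch function runs at most twice an hour, which matches the load estimate in the message.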

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=186830395
2018-03-06 18:54:20 -05:00
jianglai
edc50bbe59 Containerize GCP proxy
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=186002010
2018-02-20 15:56:13 -05:00
jianglai
ce5baafc4a Register quota metrics in GCP proxy
When a quota request is rejected, increment the metric counter by one.

Also makes both frontend and backend metrics singletons because all their fields are static.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185146804
2018-02-20 15:39:15 -05:00
jianglai
6ca523386a Add QuotaHandler to GCP proxy
The quota handler terminates connections when quota is exceeded.

The next CL will add instrumentation for quota related metrics.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185042675
2018-02-20 15:36:23 -05:00
jianglai
b6d2790a13 Add TokenStore and QuotaManager to manage proxy quota requests
The TokenStore is configured by a QuotaConfig for a protocol (EPP/WHOIS). It accepts concurrent take, put and refresh requests to grant tokens to, or accept tokens from, the caller.

The QuotaManager contains a TokenStore and provides abstractions that are appropriate for a quota leasing entity to use. Quota return calls are executed asynchronously by the QuotaManager, and quota refresh tasks are scheduled by the QuotaManager to run periodically.
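
To make the take/put/refresh vocabulary concrete, here is a deliberately simplified token store in Python. It is not the proxy's implementation: the method signatures, the single per-user limit and the "refresh restores every user to the full limit" rule are all assumptions made for illustration:

```python
import threading
from typing import Dict


class TokenStore:
    """Thread-safe per-user token counter (illustrative semantics only)."""

    def __init__(self, tokens_per_user: int) -> None:
        self._limit = tokens_per_user
        self._tokens: Dict[str, int] = {}
        self._lock = threading.Lock()

    def take(self, user_id: str) -> bool:
        """Grant one token to the caller, or return False if none are left."""
        with self._lock:
            remaining = self._tokens.get(user_id, self._limit)
            if remaining <= 0:
                return False
            self._tokens[user_id] = remaining - 1
            return True

    def put(self, user_id: str) -> None:
        """Accept one token back from the caller, never exceeding the limit."""
        with self._lock:
            remaining = self._tokens.get(user_id, self._limit)
            self._tokens[user_id] = min(self._limit, remaining + 1)

    def refresh(self) -> None:
        """Restore every user to the full limit (assumed refresh rule)."""
        with self._lock:
            self._tokens.clear()
```

In this picture a manager object would call take when a connection needs quota, call put when the quota is returned, and schedule refresh to run periodically.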

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182109341
2018-01-19 14:46:44 -05:00
jianglai
07622725bf Move metrics dependencies to artifacts under Maven groupId com.google.monitoring-client
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=180580386
2018-01-04 17:12:35 -05:00
jianglai
c5515ab4e6 Add ability to configure proxy quotas
The quotas can be configured in the yaml configuration file. Default quota will be applied to any userId that is not matched in the custom quota list.
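
The lookup rule reads as "custom quota if the userId is listed, default quota otherwise". A minimal Python sketch, in which the class and field names are hypothetical and do not reflect the actual yaml keys:

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class QuotaConfig:
    """Default quota plus per-user overrides (illustrative field names)."""

    default_token_amount: int
    custom_token_amounts: Dict[str, int] = field(default_factory=dict)

    def token_amount(self, user_id: str) -> int:
        # Users without a custom entry fall back to the default quota.
        return self.custom_token_amounts.get(user_id, self.default_token_amount)
```

For example, with a default of 100 and a custom entry of 500 for one user, that user gets 500 and everyone else gets 100.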

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=178804649
2017-12-13 12:43:45 -05:00
jianglai
7e42ee48a4 Open source GCP proxy
Dagger updated to 2.13, along with all its dependencies.

Also allows us to have multiple config files for different environments (prod, sandbox, alpha, local, etc) and specify which one to use on the command line with a --env flag. Therefore the same binary can be used in all environments.
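
The flag-to-config mapping can be sketched as follows. This is a Python illustration only; the environment list is taken from the message above, while the default value and the file naming scheme are assumptions:

```python
import argparse
from typing import List

ENVIRONMENTS = ("prod", "sandbox", "alpha", "local")


def config_file_for(argv: List[str]) -> str:
    """Pick the config file that matches the --env flag."""
    parser = argparse.ArgumentParser(description="Proxy launcher (sketch)")
    # Defaulting to "local" is an assumption made for this example.
    parser.add_argument("--env", choices=ENVIRONMENTS, default="local")
    args = parser.parse_args(argv)
    # Hypothetical naming scheme: one yaml file per environment.
    return f"config/proxy-config-{args.env}.yaml"
```

Calling `config_file_for(["--env", "sandbox"])` selects the sandbox file, so the binary itself does not change between environments.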

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=176551289
2017-11-21 19:19:03 -05:00