[1] All logs should contain a reference to the channel so that it is easy to search for logs about a specific channel.
[2] EPP SSL handshake failures should be logged at warning. It is mostly the client that fails to complete the handshake, for example by sending a bad cert, not sending a cert, or not using the correct SSL version. We should not log these at error and spam the log.
[3] When the EPP response is not 200, we should not log at error, because it still means that the GAE app responded successfully. For example, when datastore contention occurs, App Engine responds with a non-200 status and logs at warning. The proxy should not log at a higher level than App Engine itself.
[4] Timeout is a non-fatal error that should be logged at warning.
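The four rules above can be sketched roughly as follows. This is a minimal illustration, not the code in this CL: ProxyLogLevels and its methods are hypothetical names, java.util.logging levels and JDK exception types stand in for whatever the proxy actually uses (Netty reports handshake failures and timeouts through its own exception types), and logging a 200 response at INFO is an assumption.

```java
import java.util.concurrent.TimeoutException;
import java.util.logging.Level;
import javax.net.ssl.SSLHandshakeException;

/** Hypothetical helper that picks a log level for proxy events. */
final class ProxyLogLevels {
  private ProxyLogLevels() {}

  /** Handshake failures and timeouts are warnings; anything else stays severe. */
  static Level forException(Throwable cause) {
    if (cause instanceof SSLHandshakeException || cause instanceof TimeoutException) {
      return Level.WARNING;
    }
    return Level.SEVERE;
  }

  /** A non-200 status still means the app answered, so it is never logged above warning. */
  static Level forResponseStatus(int status) {
    // INFO for 200 is an assumption made for this sketch.
    return status == 200 ? Level.INFO : Level.WARNING;
  }

  /** Prefixes a message with the channel so that logs can be searched per channel. */
  static String withChannel(String channelId, String message) {
    return "[channel " + channelId + "] " + message;
  }
}
```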
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=207562299
The connection to GAE is not persistent and can drop. Reconnect when that happens, as long as the connection from the client is still active.
We need to consider the fact that while a reconnection is happening, the client may be sending requests that were relayed to the old connection and are not going through. In that case these requests are queued and will be retried when the new connection is available.
Since we are no longer tying the lifecycles of the two connections together, we cannot automatically terminate one when the other is terminated. We also need to explicitly control how the WHOIS connection is terminated, rather than depending on the HTTP connection header.
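The queue-and-retry behavior described above could look roughly like the sketch below. BufferingRelay and all of its members are hypothetical names, requests are modeled as strings, and the "sent" list stands in for writes to the backend channel. The sketch is single-threaded and ignores failures during the retry itself.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical relay that buffers requests while the backend connection is down. */
final class BufferingRelay {
  private final Queue<String> pending = new ArrayDeque<>();
  private final List<String> sent = new ArrayList<>(); // stands in for backend writes
  private boolean backendConnected = false;

  /** Relays a request, or queues it if no backend connection is available right now. */
  void relay(String request) {
    if (backendConnected) {
      sent.add(request);
    } else {
      pending.add(request);
    }
  }

  /** Called when the backend connection drops; later requests are queued. */
  void onBackendDisconnected() {
    backendConnected = false;
  }

  /** Called once the new connection is up; queued requests are retried in arrival order. */
  void onBackendReconnected() {
    backendConnected = true;
    while (!pending.isEmpty()) {
      sent.add(pending.remove());
    }
  }

  List<String> sent() {
    return sent;
  }

  int pendingCount() {
    return pending.size();
  }
}
```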
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=207335498
There's not much we can do when the user sends incorrect HTTP requests or cannot finish SSL handshake (the problematic requests are likely from bots anyway). Reducing the log level to warning in order to reduce spamming.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=207159118
We are seeing some web WHOIS HTTP(S) requests made to our endpoints without the Host header specified. This is an error according to the HTTP/1.1 spec. However we do not want to spam our logs with errors that are outside of our control. Do not throw and return a 400 response instead.
Also reworked the logic a bit to only return HSTS headers if we send a redirect response, not with any other error response. The tests are rearranged to correspond with the logical flow in the code.
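A framework-free sketch of the decision flow described above follows. It is not the handler in this CL: WebWhoisRedirectSketch, the lowercase header keys, the 301 status, the "/" path, and the max-age value are all assumptions for illustration. Sending HSTS only on the HTTPS redirect is also an assumption, chosen because HSTS is only meaningful over a secure transport.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of the web WHOIS redirect decision. */
final class WebWhoisRedirectSketch {
  private WebWhoisRedirectSketch() {}

  /** Minimal response: a status code plus headers. */
  static final class Response {
    final int status;
    final Map<String, String> headers = new LinkedHashMap<>();

    Response(int status) {
      this.status = status;
    }
  }

  /** Returns 400 when Host is missing; otherwise returns a redirect. */
  static Response handle(Map<String, String> requestHeaders, boolean isHttps, String whoisHost) {
    String host = requestHeaders.get("host");
    if (host == null || host.isEmpty()) {
      // Bad request: respond instead of throwing, and send no HSTS header.
      return new Response(400);
    }
    Response response = new Response(301);
    // HTTP goes to the same host over HTTPS; HTTPS goes to the web WHOIS site.
    response.headers.put("location", "https://" + (isHttps ? whoisHost : host) + "/");
    if (isHttps) {
      // Assumed max-age of one year.
      response.headers.put("strict-transport-security", "max-age=31536000");
    }
    return response;
  }
}
```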
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=207143230
The server certificates and corresponding keys are encrypted by KMS and stored on GCS. This allows us to easily replace expiring certs without having to roll out a new proxy release. However, the certificate is currently obtained as a singleton and used in all connections served by a proxy instance. This means that if we were to upload a new cert, no existing instance would pick it up.
This CL makes it so that we only cache the certificate for 30 min, after which a new cert is fetched and decrypted. Local certificates used for testing are still singletons.
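The time-limited caching can be illustrated with the sketch below. ExpiringSupplier is a hypothetical class written for this example and is not necessarily how the CL implements it; the injected nanosecond clock exists only to make the expiry testable. Guava's Suppliers.memoizeWithExpiration provides similar behavior off the shelf.

```java
import java.time.Duration;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical supplier that re-fetches its value once the cached copy is too old. */
final class ExpiringSupplier<T> implements Supplier<T> {
  private final Supplier<T> delegate;
  private final long ttlNanos;
  private final LongSupplier nanoClock;
  private T value;
  private long fetchedAtNanos;
  private boolean loaded = false;

  ExpiringSupplier(Supplier<T> delegate, Duration ttl, LongSupplier nanoClock) {
    this.delegate = delegate;
    this.ttlNanos = ttl.toNanos();
    this.nanoClock = nanoClock;
  }

  /** Returns the cached value, fetching a fresh one first if the TTL has elapsed. */
  @Override
  public synchronized T get() {
    long now = nanoClock.getAsLong();
    if (!loaded || now - fetchedAtNanos >= ttlNanos) {
      value = delegate.get(); // e.g. fetch from GCS and decrypt with KMS
      fetchedAtNanos = now;
      loaded = true;
    }
    return value;
  }
}
```

In production the clock argument would be System::nanoTime and the TTL Duration.ofMinutes(30); a local test certificate can simply bypass the wrapper and stay a singleton.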
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=206976318
Opened two ports (30010 and 30011 by default) that handle HTTP(S) GET requests. The HTTP request is redirected to the corresponding HTTPS site, whereas the HTTPS request is redirected to a site that supports web WHOIS.
The GCLB currently exposes port 80, but not port 443 on its TCP proxy load balancer (see https://cloud.google.com/load-balancing/docs/choosing-load-balancer). As a result, the HTTP traffic has to be routed by the HTTP load balancer, which requires a separate HTTP health check (as opposed to the TCP health check that the TCP proxy LB uses). This CL also added support for HTTP health check.
There is not a strong case for adding an end-to-end test for WebWhoisProtocolsModule (like those for EppProtocolModule, etc) as it just assembles standard HTTP codecs used for an HTTP server, plus the WebWhoisRedirectHandler, which is tested. The end-to-end test would just be testing if the Netty provided HTTP handlers correctly parse raw HTTP messages.
Several other small improvements are also included:
[1] Use setInt rather than set when setting content length in HTTP headers. I don't think it is necessary, but it is nevertheless a better practice to use a more specialized setter.
[2] Do not write metrics when running locally.
[3] Rename the qualifier @EppCertificates to @ServerCertificates as it now provides the certificate used in HTTPS traffic as well.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=206944843
This is a 'green' Flogger migration CL. Green CLs are intended to be as
safe as possible and should be easy to review and submit.
No changes should be necessary to the code itself prior to submission,
but small changes to BUILD files may be required.
Changes within files are completely independent of each other, so this CL
can be safely split up for review using tools such as Rosie.
For more information, see []
Base CL: 197331037
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=197466715
When EPP SSL handshake is unsuccessful, #channelInactive is called but there are no quotas to return, because quotas are only leased upon the first #channelRead. There is no need to log a warning and throw an exception in this case because the handshake exception would have been thrown already. Throwing a second exception just crowds the log.
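The lifecycle described above can be modeled with a small state holder. This is a sketch under assumptions: ConnectionQuota and its methods are hypothetical names, and the real handler works with Netty channel callbacks and a quota manager rather than a boolean.

```java
/** Hypothetical per-connection quota bookkeeping. */
final class ConnectionQuota {
  private boolean leased = false;
  private int returnedCount = 0;

  /** Mirrors the first #channelRead: this is the only place a quota is leased. */
  void onFirstRead() {
    leased = true;
  }

  /**
   * Mirrors #channelInactive: returns the quota if one was leased. After a failed
   * handshake nothing was leased, so this quietly does nothing instead of throwing.
   */
  boolean onInactive() {
    if (!leased) {
      return false;
    }
    leased = false;
    returnedCount++;
    return true;
  }

  int returnedCount() {
    return returnedCount;
  }
}
```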
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=197016756
This allows us to not obtain a certificate and encrypt it with KMS when running the proxy locally during development.
Also updated FOSS build dagger version.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=191746309
If the proxy protocol header contains a malformatted string, such as "PROXY UNKNOWN", instead of throwing and killing the connection, use the TCP source IP as the remote IP.
Also changed how the header is read from the buffer, to avoid a potential Netty resource leak. Originally the header was read into another ByteBuf, which needs to be explicitly released in order for Netty to reclaim its memory (http://netty.io/wiki/reference-counted-objects.html). Now we just read it into a byte array and let the JVM GC it.
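The fallback can be sketched as below, working on a plain byte array as the paragraph above describes. ProxyHeaderSketch and remoteIp are hypothetical names, the line layout follows version 1 of the PROXY protocol text format, and the sketch does not validate the addresses or ports themselves.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical parser for a PROXY protocol v1 header line. */
final class ProxyHeaderSketch {
  private ProxyHeaderSketch() {}

  /** Returns the source IP from the header, or the TCP source IP if the header is malformed. */
  static String remoteIp(byte[] headerBytes, String tcpSourceIp) {
    // The header is plain ASCII; trim() drops the trailing CRLF.
    String header = new String(headerBytes, StandardCharsets.US_ASCII).trim();
    String[] parts = header.split(" ");
    // A well-formed line is: PROXY TCP4|TCP6 <src> <dst> <srcPort> <dstPort>
    boolean wellFormed =
        parts.length == 6
            && parts[0].equals("PROXY")
            && (parts[1].equals("TCP4") || parts[1].equals("TCP6"));
    return wellFormed ? parts[2] : tcpSourceIp;
  }
}
```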
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=188047084
When not running locally, the logging formatter is set to convert the log record to a single-line JSON string that Stackdriver logging agent running in GKE will pick up and parse correctly.
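A single-line JSON formatter along these lines might look like the sketch below. It is an illustration built on java.util.logging only: SingleLineJsonFormatter is a hypothetical name, the "severity" and "message" field names are assumptions about what the logging agent expects, and the real formatter may emit more fields or use a JSON library.

```java
import java.util.logging.Formatter;
import java.util.logging.Level;
import java.util.logging.LogRecord;

/** Hypothetical formatter that writes each log record as one line of JSON. */
final class SingleLineJsonFormatter extends Formatter {
  @Override
  public String format(LogRecord record) {
    return "{\"severity\":\"" + escape(record.getLevel().getName())
        + "\",\"message\":\"" + escape(formatMessage(record)) + "\"}\n";
  }

  /** Escapes quotes, backslashes, and control characters so the output stays on one line. */
  private static String escape(String s) {
    StringBuilder out = new StringBuilder();
    for (char c : s.toCharArray()) {
      switch (c) {
        case '"': out.append("\\\""); break;
        case '\\': out.append("\\\\"); break;
        case '\n': out.append("\\n"); break;
        case '\r': out.append("\\r"); break;
        case '\t': out.append("\\t"); break;
        default:
          if (c < 0x20) {
            out.append(String.format("\\u%04x", (int) c));
          } else {
            out.append(c);
          }
      }
    }
    return out.toString();
  }
}
```

Such a formatter would be attached with Handler#setFormatter on whichever handler writes to the container's output.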
Also removed the redundant logging handler in the proxy frontend connection. It has two problems: 1) it is possible to leak PII, such as client IPs, when all frontend traffic is logged, even though this is less of a concern because the GCP TCP proxy load balancer masquerades source IPs. 2) We are only logging the HTTP request/response that the frontend connection is sending to/receiving from the backend connection, but the backend already has its own logging handler to log the same message that it gets from/sends to the GAE app, so the logging in the frontend connection does not really give extra information.
Logging of some potential PII, such as the source IP of a proxied connection, is also removed.
Thirdly, added a k8s autoscaling object that scales the containers based on CPU load. The default target load is 80%. This, in connection with GKE cluster VM autoscaling, means that when traffic is low, we'll only have one VM running one container of the proxy.
Fixes a bug where the MetricsComponent generates a separate ProxyConfig that does not call the parse method on the command line args passed, resulting in the default Environment always being used in constructing the metric reporter.
Lastly, a little bit of cleaning of the MOE config script: no newlines are necessary as the BUILD files are formatted after string substitution.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=188029019
When a quota request is rejected, increment the metric counter by one.
Also makes both frontend and backend metrics singletons because all the fields they have are static.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185146804
The quota handler terminates connections when quota is exceeded.
The next CL will add instrumentation for quota related metrics.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185042675
Dagger updated to 2.13, along with all its dependencies.
Also allows us to have multiple config files for different environments (prod, sandbox, alpha, local, etc.) and specify which one to use on the command line with a --env flag. Therefore the same binary can be used in all environments.
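How a single binary might pick its config from the --env flag is sketched below. Everything here is hypothetical: EnvFlagSketch, the hand-rolled flag parsing, the LOCAL default, and the config file naming scheme are assumptions for illustration, and the proxy may well use a flag-parsing library and a different layout.

```java
import java.util.Locale;

/** Hypothetical --env flag handling that maps an environment to a config file. */
final class EnvFlagSketch {
  private EnvFlagSketch() {}

  enum Environment { PROD, SANDBOX, ALPHA, LOCAL }

  /** Reads "--env <name>" or "--env=<name>" from args; defaults to LOCAL (an assumption). */
  static Environment parse(String[] args) {
    for (int i = 0; i < args.length; i++) {
      if (args[i].startsWith("--env=")) {
        return toEnvironment(args[i].substring("--env=".length()));
      }
      if (args[i].equals("--env") && i + 1 < args.length) {
        return toEnvironment(args[i + 1]);
      }
    }
    return Environment.LOCAL;
  }

  /** Throws IllegalArgumentException for an unknown environment name. */
  private static Environment toEnvironment(String name) {
    return Environment.valueOf(name.toUpperCase(Locale.ROOT));
  }

  /** Hypothetical naming scheme: one config file per environment. */
  static String configFileFor(Environment env) {
    return "config/" + env.name().toLowerCase(Locale.ROOT) + ".yaml";
  }
}
```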
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=176551289