Commit graph

104 commits

Author SHA1 Message Date
Ben McIlwain
c61f36502e Add a new check API that does not wrap the domain check EPP flow
Copied class and test from CheckApiAction. All unit tests passing.

Remaining work: add metrics

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=198916177
2018-06-06 15:05:30 -04:00
guyben
9d2b1e7572 Consolidate all Set parameter parsing
Currently, we have two different ways to parse a "set" parameter:
key=value1&key=value2&key=value3...
and
keys=value1,value2,value3

This is error prone for several reasons:
- different parts of the code must be "synchronized" to use the same style (the
  place that creates the request, and the place that parses the request)
- for the key=value1&key=value2, we often use the same key name for the single
  value and the set value. This can result in subtle bugs where part of the
  code will successfully read the key assuming there's only one key (and will
  get the first key=value1, ignoring the rest)

Here we transition everything to the keys=value1,value2,value3 method. This one
was chosen because:
- it's shorter
- it's more intuitive for users
- the key name is plural, differentiating it from the singular key=value that
  other requests might need

-----------------------------------

To make sure there are not "transition issues", we will continue to support
(with warnings) the key=value1&key=value2 parameter parsing until we're sure we
haven't forgotten to update any part of the code.
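
As an illustration only (not the code from this change), the transition
behaviour described above could be sketched as follows. It is written in
Python for brevity; the function name `parse_set_parameter` and the rule that
the plural key is the singular key plus "s" are assumptions made for this
example.

```python
import logging
from urllib.parse import parse_qs


def parse_set_parameter(query, name):
    """Parse a set-valued parameter from a URL query string.

    `name` is the singular key (e.g. "tld"). The plural key is assumed to be
    `name + "s"`; that naming rule is hypothetical and used only here.
    """
    params = parse_qs(query, keep_blank_values=True)
    plural = name + "s"

    if plural in params:
        # Preferred style: keys=value1,value2,value3
        values = set()
        for raw in params[plural]:
            values.update(v for v in raw.split(",") if v)
        return values

    if name in params:
        # Legacy style: key=value1&key=value2, still accepted with a warning
        # during the transition period.
        logging.warning(
            "Deprecated repeated parameter '%s'; use '%s=a,b,c' instead",
            name,
            plural,
        )
        return {v for v in params[name] if v}

    return set()
```

Checking the plural key first means a caller that has already migrated never
triggers the legacy path, so the warnings point only at code that still needs
updating.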

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=198810681
2018-06-06 15:04:02 -04:00
jianglai
c0a7bde95e Remove deprecated PublishDetailReportAction
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=198624767
2018-06-06 14:59:30 -04:00
mcilwain
ac500652ac Add "pubapi" App Engine service for check API, WHOIS, and RDAP
The migration plan is as follows:
1. This CL, which adds the new "pubapi" service that serves the check API, WHOIS, and RDAP.
2a. Update our public facing sites to switch over to use the new service.
2b. (either order) Rewrite the check API to remove dependencies on flows.
3. ... eventually, once the frontend service is no longer being hit by this traffic, remove its handling of these public endpoints.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=197716346
2018-05-30 12:18:54 -04:00
mcilwain
807ab2b27b Reduce prod/sandbox frontend manual instance count from 100 to 30
100 is way overkill with manual scaling.  30 is most likely still overkill too,
but we want to tune incrementally rather than all at once.  Note that at 30
instances we're expecting around 3 QPS per instance, which is still an order
of magnitude less than each instance can actually handle.

This also fixes the instance type on sandbox to be the same as on prod.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=196875876
2018-05-17 21:52:35 -04:00
mcilwain
9c0d3b6db3 Add limit to list_domains command
This allows list_domains to continue working for large TLDs.

TESTED=Deployed to alpha, where it lists the most recently created domains even
on a TLD with a huge number of domains on it (many more than .app has currently).

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=196717389
2018-05-17 21:52:35 -04:00
larryruili
c007458e1a Switch default service to manual scaling at 100 instances
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=196129129
2018-05-17 21:52:35 -04:00
mcilwain
ec782367c0 Increase tools instance timeout duration to 60 minutes
This should decrease the average wait time when running the nomulus tool.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=195465469
2018-05-05 23:50:52 -04:00
mmuller
d3bb808c5f Increase the number of instances on alpha
Increase the instances on alpha to achieve parity with sandbox.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=194980588
2018-05-05 23:40:15 -04:00
mcilwain
613b19799a Increase commit log bucket count in production to match other envs
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=192614234
2018-04-23 14:29:59 -04:00
mcilwain
168a23206d Increase export-snapshot queue rate from 5/m to 1/s
Five per minute just isn't working well enough on environments with lots of
entities (e.g. alpha and sandbox right now), and there doesn't seem to be a
real need to enforce such a low throttle.  The mapreduce queue, for instance,
has 500/s (effectively no throttle).

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=192474962
2018-04-23 14:26:55 -04:00
guyben
8a9453f476 Replace registrar-premium-price-ack with registrar-settings
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=192355664
2018-04-23 14:22:18 -04:00
mcilwain
e0c32337fd Add mapreduce to delete load test data
This hard-deletes all contacts and hosts owned by a specific set of registrar
client IDs, currently just "proxy".

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=192325211
2018-04-10 17:07:15 -04:00
mcilwain
8f1848e32e Disable verify entity integrity mapreduce on sandbox
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=192289233
2018-04-10 16:59:28 -04:00
mcilwain
a8b6195ce2 Make RDE run less frequently on sandbox/alpha
This also removes RDE tasks that shouldn't/can't run on non-production environments, like upload/reporting.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=192177779
2018-04-10 16:56:22 -04:00
mcilwain
e816913c61 Increase # of commit log buckets ~4X for all non-prod environments
This also reduces the frequency of the commitLogCheckpoint cron job to once
every three minutes, as this job needs to load all commit log bucket entities.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=191613858
2018-04-10 16:33:47 -04:00
mcilwain
377fe5f573 Allow number of commit log buckets to be increased
Also increases the number of commit log buckets on alpha to 397 and correspondingly
reduces the frequency of commit log diff exporting to once every 3 minutes.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=191440586
2018-04-10 16:16:08 -04:00
mmuller
785225fc28 Implement "premium price ack required" checkbox
Implement a checkbox in the "Resources" tab to allow registrars to toggle
their "premium price ack required" flag.

Tested:
  Verified the console functionality by hand.  I've started work on an
  automated test, but we can't actually test those from blaze and the
  kokoro tests are way too time-consuming to be practical for development, so
  we're going to have to either find a way to run those locally outside of
  the normal process or make do without a test.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=190212177
2018-04-02 16:33:51 -04:00
guyben
63785e5149 Remove empty TLD parameter when fanning out without TLDs
TldFanoutAction fans out a given endpoint to all TLDs (either TEST, REAL, or
both).

However, it is also used to delegate a single endpoint request that we want set
in a specific queue (so we can control retries). We do that by setting the TLD
list to "runInEmpty" rather than "forEachRealTld" or "forEachTestTld".

Currently, using "runInEmpty" would still specify a TLD - but that TLD would be
the empty string. This is a bug: it sets the TLD parameter to a bad value. It
worked only because none of the endpoints called with "runInEmpty" were using
the TLD parameter.

However, this will (and does) break if either (a) the endpoint accepts an
optional TLD parameter (like deleteProberData does), or (b) the given endpoint
already has a TLD parameter in it (we want to run the endpoint with a single
TLD, but still use the "fanout" to set the right queue).

This CL fixes several things:

- if runInEmpty is given, no TLD parameter is added
- 'runInEmpty' is now mutually exclusive with 'forEach*Tld' and 'excludes'
- we added some sanity checks and logging
- removed the buggy and unused "':tld' in path is replaced by TLD"
- in the cron.xml, removed documentation for :tld and the broken :registrar

Note that none of the endpoints that were used with runInEmpty fanout had the TLD parameter prior to deleteProberData.
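
The fixed behaviour can be sketched roughly as below. This is a Python
illustration under stated assumptions, not the actual TldFanoutAction code:
the function `fanout_task_params` and its argument names are hypothetical,
and each returned dict stands for the parameters of one enqueued task.

```python
def fanout_task_params(real_tlds, test_tlds, *, run_in_empty=False,
                       for_each_real_tld=False, for_each_test_tld=False,
                       excludes=()):
    """Return one parameter dict per task that the fanout would enqueue."""
    if run_in_empty:
        # runInEmpty is mutually exclusive with forEach*Tld and excludes.
        if for_each_real_tld or for_each_test_tld or excludes:
            raise ValueError(
                "runInEmpty cannot be combined with forEach*Tld or excludes")
        # A single task with no "tld" parameter at all, rather than tld="".
        return [{}]

    if not (for_each_real_tld or for_each_test_tld):
        raise ValueError("Specify runInEmpty or at least one forEach*Tld")

    tlds = []
    if for_each_real_tld:
        tlds.extend(real_tlds)
    if for_each_test_tld:
        tlds.extend(test_tlds)

    excluded = set(excludes)
    return [{"tld": tld} for tld in tlds if tld not in excluded]
```

Returning an empty parameter dict for runInEmpty leaves any TLD parameter
already present on the endpoint untouched, which is the case the old
empty-string value would have broken.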

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=189954585
2018-04-02 16:24:27 -04:00
guyben
f5cb227f0e Reorder items in cron.xml for crash to make it similar to alpha
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=189642125
2018-03-19 18:49:03 -04:00
guyben
27894df45f Turn off deleteProberData on alpha for duration of loadtesting
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=187484858
2018-03-06 19:09:52 -05:00
larryruili
a365b82d42 Update publish queue with practical retry params
Unlimited exponential backoff makes cascading failure a serious problem when
the queue encounters burst DNS load. Originally, retries used exponential
backoff with a minimum of 1 second and a maximum of 1 hour.

This changes the backoff to scale linearly from 30 seconds to 10 minutes. The
30-second minimum avoids over-retrying due to lock contention, and the
10-minute maximum allows for more retries within our 1-hour SLA. Finally, we're
switching to linear scaling to increase the number of 'quick' retries at low
backoff times, before ultimately settling on the upper bound of 10 minutes (if
a task ever gets to that point, it's probably misconfigured).
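
To make the trade-off concrete, here is a small Python sketch comparing the
two schedules. It is an approximation under stated assumptions: the
exponential schedule doubles on every retry, the linear schedule grows by one
minimum-backoff step per retry, and task execution time is ignored. The
function names are hypothetical and the real queue's retry semantics may
differ in detail.

```python
def exponential_backoff(attempt, min_s=1, max_s=3600):
    """Delay before retry number `attempt` (0-based), doubling each time."""
    return min(min_s * 2 ** attempt, max_s)


def linear_backoff(attempt, min_s=30, max_s=600):
    """Delay before retry number `attempt` (0-based), growing by min_s."""
    return min(min_s * (attempt + 1), max_s)


def retries_within(backoff, budget_s):
    """Count how many retries fit inside `budget_s` seconds of waiting."""
    elapsed = 0
    attempt = 0
    while elapsed + backoff(attempt) <= budget_s:
        elapsed += backoff(attempt)
        attempt += 1
    return attempt


if __name__ == "__main__":
    one_hour = 3600
    print("exponential:", retries_within(exponential_backoff, one_hour))
    print("linear:     ", retries_within(linear_backoff, one_hour))
```

Under these assumptions the linear schedule fits more retries into one hour
than the exponential one, and it never waits less than 30 seconds between
attempts.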

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=186041553
2018-02-20 16:00:33 -05:00
mcilwain
001ce9cd52 Increase number of frontend/backend instances on prod/sandbox to 100
The higher the number the better for serious launches. These used to be 100
but had been detuned because instances weren't dying correctly when no longer
needed, thus contributing to higher costs than necessary. That problem was
fixed when we migrated to the Java 8 runtime, however, so there's no reason
not to use the higher number.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=184742738
2018-02-20 15:18:54 -05:00
guyben
8beb10c2a3 Update sandbox / alpha cron.xml to be in line with production
There are 2 types of changes made here:
- reorder the existing cron jobs to be in the same order as production (for
  easier diffing)
- add missing cron-jobs to either alpha or sandbox

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=183232936
2018-02-01 21:57:39 -05:00
larryruili
fbdb148540 Add billing pipeline to cron
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=181793243
2018-01-19 14:38:38 -05:00
larryruili
ab5e16ab67 Add publish functionality to billing pipeline
This closes the end-to-end billing pipeline, allowing us to share generated detail reports with registrars via Drive and e-mail the invoicing team a link to the generated invoice.

This also factors out the email configs from ICANN reporting into the common 'misc' config, since we'll likely need alert e-mails for future periodic tasks.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=180805972
2018-01-04 17:17:59 -05:00
larryruili
552ab12314 Prepare billing pipeline for production
This makes a few cosmetic changes that prepare the pipeline for production.

Namely:
- Converts file names to include the input yearMonth, mostly mirroring the original invoicing pipeline.
- Factors out the yearMonth logic from the reporting module to the more common backend module. We will likely use the default yearMonth logic in other backend tasks (such as spec11 reporting).
- Adds the "withTemplateCompatibility" flag to the BigQuery read, which allows multiple uses of the same template.
- Adds the 'billing' task queue, which retries up to 5 times every 3 minutes, which is about the rate we desire for checking if the pipeline is complete.
- Adds a shell 'invoicing upload' class, which tests the retry semantics we want for post-generation work (e-mailing the invoice to crr-tech, and publishing detail reports)

While this cl may look big, it's mostly just a refactor and setting up boilerplate needed to frame the upload logic.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=179849586
2017-12-27 11:39:21 -05:00
mcilwain
46aa638b74 Rationalize prod/sandbox instance numbers to 50/5/50
That's 50 each for frontend and backend and 5 for tools. Since the
MetricExporter bug has been fixed for a while now, we aren't gaining anything by
artificially keeping the instance number low, whereas we might benefit from
higher instance counts, e.g. for load-testing.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=179432038
2017-12-27 11:13:42 -05:00
guyben
0e3d050dae Temporarily disable deleteProberData cron job in sandbox for load-testing
Loadtesting data is identified as "prober data" by this job (it removes
anything under ".test", not only prober data)

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=177309096
2017-12-01 22:14:06 -05:00
mmuller
0ffd3553c3 Increase max number of sandbox frontend instances to 8
This mirrors production in hopes of triggering b/67508570 to test the fix.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=175295742
2017-11-21 18:24:32 -05:00
larryruili
8dcc2d6833 Chain ICANN report upload after staging
This converts the upload task from a cron job to a task chained after staging.
This ensures the upload job only occurs when its dependencies are met, and
provides a faster turnaround time to verify both the staging and upload jobs
are complete.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=175045489
2017-11-21 18:16:08 -05:00
larryruili
eff2266e35 Add apache beam to registry and open source
This is the initial commit of the new billing system, rewritten as an Apache
Beam pipeline. This contains a basic end-to-end pipeline as proof of concept,
and boilerplate for GenerateInvoicesAction, which will eventually be our
automated invoice generation endpoint.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=174184171
2017-11-07 17:36:07 -05:00
larryruili
486c348a00 Add reporting cron jobs to production
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=173569808
2017-11-07 17:25:46 -05:00
mountford
74873f90c8 Order RDAP domain searches by TLD in domain name order
I am not happy that another index is required, but the Pantheon console shows that domain indexes are much smaller than the other indexes (because there are fewer domains), so it's not adding an appreciable amount of storage space.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=173561771
2017-11-07 17:21:26 -05:00
mcilwain
c24d5b8a88 Increase the frontend service idle timeout from 10 to 30 minutes
This should help reduce the occurrence of requests taking a long time
to process because a new instance is being spun up. We might consider
increasing this further to 60 minutes in the future if necessary.

This also increases the number of frontend instances on production to 8
from 6, since it appears that the issue we were attempting to mitigate
with that change is now fixed.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=173440059
2017-11-07 17:07:10 -05:00
mcilwain
62dcf2f1a7 Temporarily tune down production frontend instances to 6
We'll revert this once the stuck instance issue in Java 8 is fixed.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=173183426
2017-10-24 16:53:47 -04:00
mountford
03087ddc85 Add RDAP support for deleted domains and filtering by registrar
This CL adds the functionality for domain searches. Entities and nameservers have already been handled by previous CLs.

Deleted items can only be seen by admins, and by registrars viewing their own deleted items.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=172097922
2017-10-24 16:53:47 -04:00
mountford
048ae4b4ba Add term to contact index
RDAP searches for contacts with a specific desired registrar need an additional
index term. The tests were not extensive enough to catch this particular case.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=172013843
2017-10-24 16:53:47 -04:00
bbilbo
24d58bf505 Add cron/commitLogCheckpoint to cron.xml for the crash environment
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=171297448
2017-10-10 12:09:41 -04:00
mcilwain
c69b409b30 Add java8 runtime option to production appengine-web.xml files
It occurs to me that we can't have this setting different between sandbox
and production, otherwise we can end up with a situation where we push code
that works on sandbox but then fails only when it is pushed to production.
Sandbox and production need to always be set up similarly for this reason.

We'll just have to pay a greater amount of attention to the release process
next week than normal, and continue playing around in alpha in the
meantime with a fully Java 8 build.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=170197703
2017-10-04 16:16:45 -04:00
mcilwain
8908686f23 Add java8 runtime option to all non-production appengine-web*.xml files
Java 8 is go!

https://cloudplatform.googleblog.com/2017/09/Java-8-on-App-Engine-Standard-environment-is-now-generally-available.html

We will add this option to the production files next week.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=170101056
2017-10-04 16:16:45 -04:00
mcilwain
7dc1940cdb Move ResaveAllEppResources mapreduce from tools service to backend
It makes sense for all mapreduces to run in backend, especially ones
that are scheduled regularly to run in cron like this one now. We don't
have many instances configured for the tools service anymore on some
of our environments, so backend is the friendliest place for a mapreduce
to run.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=168882122
2017-09-20 10:27:17 -04:00
guyben
d7214b58fc Re-enable DeleteOldCommitLogs cron job
Also adds a "resave all epp" cron job that's needed for the delete to work correctly.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=168879965
2017-09-20 10:27:17 -04:00
larryruili
efd7010f9d Add resave command for all HistoryEntries
This pattern will mainly be used for data migrations, i.e. updating all
HistoryEntries' DomainTransactionRecords to the new schema.

TESTED=Ran in alpha, the underlying data dropped non-Objectify fields as
expected.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=168684356
2017-09-20 10:27:17 -04:00
dxy
d8c1501213 Add PollMapreduceAction
This is the first in a series of CLs containing code from an old CL of Dai's that had never been completed, which compares zone data between Datastore and DNS. I had written a script to do this by calling two nomulus commands, but maybe it can be done directly in Java, which would be convenient.

This CL is just the plumbing to check on the status of a Mapreduce. We will need this to know that we can proceed with the next step of comparing the output to the DNS data.

Cloned from CL 134295050 by 'g4 patch'.
Original change by dxy@dxy:zoneman-reader:1939:citc on 2016/09/26 10:34:22.

Add a command for comparing zone data between DNS and datastore

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=167188979
2017-09-12 15:51:44 -04:00
larryruili
477617eec9 Add activity report generation code
This adds BigQuery API client code to generate the activity reports from our
now standard SQL queries. The naming mirrors that of RDE (Staging generates the
reports and uploads them to GCS).

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=164656344
2017-08-29 15:53:33 -04:00
larryruili
4130a8a75e Create ICANN report upload action
This is the first step in moving the current []cron-Python reporting scripts
into App Engine, as an official part of the Nomulus package. This copies the
structure of RDE uploads, with a few changes specific to monthly reporting.

I've left some TODOs related to actually testing it on the ICANN endpoint, as we're still not sure how files to be uploaded will be staged, and whether we can actually ping their endpoint on valid ports (80 or 443).

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=160408703
2017-07-10 11:27:58 -04:00
mcilwain
30d5d05fdf Refactor/rename refresh all DNS action
I'm moving it out of the scrap folder too because there's nothing else
in there and we do want to retain this indefinitely because it's a useful
tool for performing DNS writer migrations.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=160168902
2017-07-10 11:18:41 -04:00
mountford
02a5e3d20f Remove /_dr security constraint from web.xml files
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=159610409
2017-06-21 10:08:40 -04:00
mmuller
e6af34301d Move restore from backend to tools
Move the "restoreCommitLogs" command from the backend module to the tools
module so it's easier to access with nomulus.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=156768389
2017-05-23 17:22:49 -04:00