zydronium/google-nomulus

mirror of https://github.com/google/nomulus.git synced 2025-07-19 01:06:00 +02:00

Author	SHA1	Message	Date
mcilwain	218c4517eb	Stop exporting EPP flow metrics to BigQuery These are simply too costly in their current form now that we are handling double-digit QPS, so at a minimum we'd want to refactor these for batched exports using a background thread (like how Stackdriver metrics work). However, upon further review, that work isn't worth doing if this BigQuery table isn't actually being used for anything, and it seems that we aren't using it anymore given that ICANN transaction reporting no longer requires it. So the simplest thing to do is simply to get rid of this entirely, and just use a combination of Stackdriver metrics and App Engine logs. The eppMetrics BigQuery table is ~1.2 billion rows and takes up 223 GB, so that's not an insignificant GCP billings saving if we can delete it. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=215905466	2018-10-08 16:59:29 -04:00
mcilwain	1d134cdd3f	Delete the verify entity integrity mapreduce We never really used it and it'll be obsolete come Registry 3.0 anyway. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=213274520	2018-09-20 11:19:36 -04:00
larryruili	f36c72cca0	Deploy spec11 reporting to production This turns on spec11 reporting in production by adding it to the cron.xml, generating the report and sending an e-mail with a list of all problematic registrations to the associated registrar on the 2nd of each month at 15:00Z (11am EST) This also tweaks the e-mail template a bit according to suggestions from Bruno. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=213031440	2018-09-14 21:31:34 -04:00
mcilwain	a9944b8ce0	Remove deprecated DNS subsystem ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=212987308	2018-09-14 12:01:08 -04:00
weiminyu	e72f5c09a2	Enable Premium terms export in production Defines cron job in crash, sandbox and production environments. Job already exists in alpha. Job is not added to qa environment. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=212878436	2018-09-14 11:56:42 -04:00
mcilwain	8de36732cb	Delete mapreduce entity cleanup util This is obsoleted by the upcoming Registry 3.0 migration, after which we will be using neither the App Engine Mapreduce library nor Cloud Datastore. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=212864845	2018-09-14 11:55:12 -04:00
weiminyu	80b0e6297b	Export Premium names to Drive ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=212509587	2018-09-14 11:47:38 -04:00
larryruili	c5e6eae555	Add Spec11 registrar emailing mechanism This adds the terminal step of the Spec11 pipeline- processing the output of the Beam pipeline to send an e-mail to each registrar informing them of identified 'bad urls.' This also factors out methods common between invoicing (which uses similar beam pipeline tools) and spec11 to the common superpackage ReportingModule + ReportingUtils classes. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=210932496	2018-09-08 00:06:53 -04:00
mmuller	e4bb1c281c	Uncomment crontab entry for deleteProberData This was only supposed to stay commented out until load-testing was complete. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=210087917	2018-09-08 00:04:26 -04:00
larryruili	33ee7de457	Add GenerateSpec11Action and SafeBrowsing evaluation This adds actual subdomain verification via the SafeBrowsing API to the Spec11 pipeline, as well as on-the-fly KMS decryption via the GenerateSpec11Action to securely store our API key in source code. Testing the interaction becomes difficult due to serialization requirements, and will be significantly expanded in the next cl. For now, it verifies basic end-to-end pipeline behavior. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=208092942	2018-08-10 13:46:48 -04:00
mcilwain	e665a34810	Automated g4 rollback of changelist 204783809. * Reason for rollback * It's still having the same issues from b/79463634 in sandbox, so we don't want to deploy it to prod. * Original change description * Switch pubapi/default service to basic scaling in prod/sandbox Also goes back up to 100 max instances. Hopefully this'll work better this time. *** ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=206975159	2018-08-10 13:46:48 -04:00
mcilwain	ccbdfd0e41	Switch pubapi/default service to basic scaling in prod/sandbox Also goes back up to 100 max instances. Hopefully this'll work better this time. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=204783809	2018-07-17 22:01:08 -04:00
mcilwain	32b3563126	Delete all Braintree code We never launched this, don't plan on launching it now, and it's rotted over the past two years anyway. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=202993577	2018-07-14 01:37:03 -04:00
mcilwain	fb9d876bff	Remove most HTML/CSS/JS assets from the backend service WAR It only needs the error page HTML files; everything else isn't used by endpoints served by the backend service and only serves to increase build times (especially compiling all that JS). ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=202229514	2018-06-27 15:28:53 -04:00
guyben	11c5d11a29	Re-enable the RdeUpload cronjob RdeUpload works on alpha and sandbox by sending the data to a google internal server. It is still running (and succeeding) after each RdeStaging, but the lack of the cron job means it is currently lagging behind. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=202181148	2018-06-27 15:28:53 -04:00
Ben McIlwain	0422205d84	Start using non-EPP-flow-wrapping implementation in CheckAPI ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=201620090	2018-06-27 15:28:52 -04:00
mcilwain	6f2e663b72	Add asynchronous scheduled actions to re-save entities This is used in the domain transfer and delete flows, both of which are asynchronous flows that have implicit default actions that will be taken at some point in the future. This CL adds scheduled re-saves to take place soon after those default actions would become effective, so that they can be re-saved quickly if so. Unfortunately the redemption grace period on our TLDs is 35 days, which exceeds the 30 day maximum task ETA in App Engine, so these won't actually fire. That's fine though; the deletion is actually effective as of 5 days, and this is just removing the grace period. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=201345274	2018-06-27 15:28:52 -04:00
mcilwain	87d1a1c2a3	Further increase the rde-upload queue processing rate We're still limiting to a maximum of 5 concurrent uploads, but when we get backed up (i.e. because we broke RDE like we did recently), it makes sense to burn through the backlog faster once tasks are succeeding again. As I'm going through the backlog now, 5/m isn't fast enough; 10/m seems right. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=201284990	2018-06-27 15:28:52 -04:00
mcilwain	5689234fd2	Allow more RDE upload tasks to run simultaneously We're currently facing a large backlog of RDE upload tasks, most of which won't have anything to do when they execute (because the RDE deposit in question has been successfully uploaded). And we're also facing the occasional >30 minute timeout even though most uploads are succeeding in around a minute. So this CL just lets more run simultaneously so that the backlog can be cleared out faster. Note that we still enforce locking on a per-TLD basis, so it won't be possible for uploads to stomp over each other. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=201257679	2018-06-27 15:28:52 -04:00
mcilwain	9097a32cc8	Remove web & protocol WHOIS, check API, and RDAP from frontend These are now handled by the pubapi service and all publicly facing sites that were using these APIs have already been migrated over. For documentation on the newly added dispatch.xml file, see: https://cloud.google.com/appengine/docs/standard/java/config/dispatchref Note that the --auto_update_dispatch parameter needs to be passed to the `appcfg update` command in order to apply this new XML file. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=200441580	2018-06-18 18:07:53 -04:00
mcilwain	5fdd7a15ca	Delete unused queue delete-commits ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=200062584	2018-06-18 17:57:41 -04:00
Ben McIlwain	c61f36502e	Add a new check API that does not wrap the domain check EPP flow Copied class and test from CheckApiAction. All unit tests passing. Remaining work: add metrics ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=198916177	2018-06-06 15:05:30 -04:00
guyben	9d2b1e7572	Consolidate all Set parameter parsing Currently, we have two different ways to parse a "set" parameter: key=value1&key=value2&key=value3... and keys=value1,value2,value3 This is error prone for several reasons: - different parts of the code must be "synchronized" to use the same style (the place that creates the request, and the place that parses the request) - for the key=value1&key=value2, we often use the same key name for the single value and the set value. This can result in subtle bugs where part of the code will successfully read the key assuming there's only one key (and will get the first key=value1, ignoring the rest) Here we transition everything to the keys=value1,value2,value3 method. This one was chosen because: - it's shorter - it's more intuitive for users - the key name is plural, differentiating it from the singular key=value that other requests might need ----------------------------------- To make sure there are not "transition issues", we will continue to support (with warnings) the key=value1&key=value2 parameter parsing until we're sure we haven't forgotten to update any part of the code. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=198810681	2018-06-06 15:04:02 -04:00
jianglai	c0a7bde95e	Remove deprecated PublishDetailReportAction ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=198624767	2018-06-06 14:59:30 -04:00
mcilwain	ac500652ac	Add "pubapi" App Engine service for check API, WHOIS, and RDAP The migration plan is as follows: 1. This CL, which adds the new "pubapi" service that serves the check API, WHOIS, and RDAP. 2a. Update our public facing sites to switch over to use the new service. 2b. (either order) Rewrite the check API to remove dependencies on flows. 3. ... eventually, once the frontend service is no longer being hit by this traffic, remove its handling of these public endpoints. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=197716346	2018-05-30 12:18:54 -04:00
mcilwain	807ab2b27b	Reduce prod/sandbox frontend manual instance count from 100 to 30 100 is way overkill with manual scaling. 30 is most likely still overkill too, but we want to tune incrementally rather than all at once. Note that at 30 instances we're expecting around 3 QPS per instance, which is still an order of magnitude less than each instance can actually handle. This also fixes the instance type on sandbox to be the same as on prod. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=196875876	2018-05-17 21:52:35 -04:00
mcilwain	9c0d3b6db3	Add limit to list_domains command This allows list_domains to continue working for large TLDs. TESTED=Deploys to alpha and it works to list the most recently created domains even on a TLD with a huge number of domains on it (much more than .app has currently). ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=196717389	2018-05-17 21:52:35 -04:00
larryruili	c007458e1a	Switch default service to manual scaling at 100 instances ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=196129129	2018-05-17 21:52:35 -04:00
mcilwain	ec782367c0	Increase tools instance timeout duration to 60 minutes This should decrease the average wait time when running nomulus tool. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=195465469	2018-05-05 23:50:52 -04:00
mmuller	d3bb808c5f	Increase the number of instances on alpha Increase the instances on alpha to achieve parity with sandbox. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=194980588	2018-05-05 23:40:15 -04:00
mcilwain	613b19799a	Increase commit log bucket count in production to match other envs ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=192614234	2018-04-23 14:29:59 -04:00
mcilwain	168a23206d	Increase export-snapshot queue rate from 5/m to 1/s Five per minute just isn't working well enough on environments with lots of entities (e.g. alpha and sandbox right now), and there doesn't seem to be a real need to enforce such a low throttle. The mapreduce queue, for instance, has 500/s (effectively no throttle). ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=192474962	2018-04-23 14:26:55 -04:00
guyben	8a9453f476	Replace registrar-premium-price-ack with registrar-settings ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=192355664	2018-04-23 14:22:18 -04:00
mcilwain	e0c32337fd	Add mapreduce to delete load test data This hard-deletes all contacts and hosts owned by a specific set of registrar client IDs, currently just "proxy". ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=192325211	2018-04-10 17:07:15 -04:00
mcilwain	8f1848e32e	Disable verify entity integrity mapreduce on sandbox ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=192289233	2018-04-10 16:59:28 -04:00
mcilwain	a8b6195ce2	Make RDE run less frequently on sandbox/alpha This also removes RDE tasks that shouldn't/can't run on non-production environments, like upload/reporting. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=192177779	2018-04-10 16:56:22 -04:00
mcilwain	e816913c61	Increase # of commit log buckets ~4X for all non-prod environments This also reduces the interval of the commitLogCheckpoint cron job to once every three minutes, as this job needs to load all commit log bucket entities. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=191613858	2018-04-10 16:33:47 -04:00
mcilwain	377fe5f573	Allow number of commit log buckets to be increased Also increases the number of commit log buckets on alpha to 397 and correspondingly reduces the frequency of commit log diff exporting to once every 3 minutes. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=191440586	2018-04-10 16:16:08 -04:00
mmuller	785225fc28	Implement "premium price ack required" checkbox Implement a checkbox in the "Resources" tab to allow registrars to toggle their "premium price ack required" flag. Tested: Verified the console functionality by hand. I've started work on an automated test, but we can't actually test those from blaze and the kokoro tests are way too time-consuming to be practical for development, so we're going to have to either find a way to run those locally outside of the normal process or make do without a test. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=190212177	2018-04-02 16:33:51 -04:00
guyben	63785e5149	Remove empty TLD parameter when fanning out without TLDs TldFanoutAction fans out a given endpoint to all TLDs (either TEST, REAL, or both). However, it is also used to delegate a single endpoint request that we want set in a specific queue (so we can control retries). We do that by setting the TLD list to "runInEmpty" rather than "forEachRealTld" or "forEachTestTld". Currently, using "runInEmpty" would still specify a TLD - but that TLD would be the empty string. This is a bug: it sets the TLD parameter to a bad value. It worked only because none of the endpoints called with "runInEmpty" were using the TLD parameter. However, this will (and does) break if either (a) the endpoint accepts an optional TLD parameter (like deleteProberData does), or (b) the given endpoint already has a TLD parameter in it (we want to run the endpoint with a single TLD, but still use the "fanout" to set the right queue). This CL fixes several things: - if runInEmpty is given, no TLD parameter is added - 'runInEmpty' is now mutually exclusive with 'forEach*Tld' and 'excludes' - we do some sanity checks and added logging - removed the buggy and unused "':tld' in path is replaced by TLD" - in the cron.xml, removed documentation for :tld and the broken :registrar Note that none of the endpoints that were used with runInEmpty fanout had the TLD parameter prior to deleteProberData ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=189954585	2018-04-02 16:24:27 -04:00
guyben	f5cb227f0e	Reorder items in cron.xml for crash to make it similar to alpha ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=189642125	2018-03-19 18:49:03 -04:00
guyben	27894df45f	Turn off deleteProberData on alpha for duration of loadtesting ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=187484858	2018-03-06 19:09:52 -05:00
larryruili	a365b82d42	Update publish queue with practical retry params The unlimited exponential backoff makes cascading failure a serious problem, when encountering burst DNS load. Originally, it was exponential backoff, with min 1 sec max 1 hour. This changes it to be linearly scaling from 30 seconds to 10 minutes. Min 30 seconds is used to avoid over-retrying due to lock contention. Max 10 minutes allows for more retries within our 1 hour SLA. Finally, we're switching to linear scaling to increase the number of 'quick' retries for low backoff time, before ultimately settling on the upper bound of 10 minutes (if a task ever gets to that point, it's probably misconfigured.) ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=186041553	2018-02-20 16:00:33 -05:00
mcilwain	001ce9cd52	Increase number of frontend/backend instances on prod/sandbox to 100 The higher the number the better for serious launches. These used to be 100 but had been detuned because instances weren't dying correctly when no longer needed, thus contributing to higher costs than necessary. That problem was fixed when we migrated to the Java 8 runtime, however, so there's no reason not to use the higher number. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=184742738	2018-02-20 15:18:54 -05:00
guyben	8beb10c2a3	Update sandbox / alpha cron.xml to be in line with production There are 2 types of changes made here: - reorder the existing cron jobs to be in the same order as production (for easier diffing) - add missing cron-jobs to either alpha or sandbox ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=183232936	2018-02-01 21:57:39 -05:00
larryruili	fbdb148540	Add billing pipeline to cron ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=181793243	2018-01-19 14:38:38 -05:00
larryruili	ab5e16ab67	Add publish functionality to billing pipeline This closes the end-to-end billing pipeline, allowing us to share generated detail reports with registrars via Drive and e-mail the invoicing team a link to the generated invoice. This also factors out the email configs from ICANN reporting into the common 'misc' config, since we'll likely need alert e-mails for future periodic tasks. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=180805972	2018-01-04 17:17:59 -05:00
larryruili	552ab12314	Prepare billing pipeline for production This makes a few cosmetic changes that prepare the pipeline for production. Namely: - Converts file names to include the input yearMonth, mostly mirroring the original invoicing pipeline. - Factors out the yearMonth logic from the reporting module to the more common backend module. We will likely use the default yearMonth logic in other backend tasks (such as spec11 reporting). - Adds the "withTemplateCompatability" flag to the Bigquery read, which allows multiple uses of the same template. - Adds the 'billing' task queue, which retries up to 5 times every 3 minutes, which is about the rate we desire for checking if the pipeline is complete. - Adds a shell 'invoicing upload' class, which tests the retry semantics we want for post-generation work (e-mailing the invoice to crr-tech, and publishing detail reports) While this cl may look big, it's mostly just a refactor and setting up boilerplate needed to frame the upload logic. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=179849586	2017-12-27 11:39:21 -05:00
mcilwain	46aa638b74	Rationalize prod/sandbox instance numbers to 50/5/50 That's 50 each for frontend and backend and 5 for tools. Since the MetricExporter bug has been fixed for awhile now, we aren't gaining anything by artificially keeping the instance number low, whereas we might benefit from higher instance counts, e.g. for load-testing. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=179432038	2017-12-27 11:13:42 -05:00
guyben	0e3d050dae	Temporarily disable deleteProberData cron job in sandbox for load-testing Loadtesting data is identified as "prober data" by this job (it removes anything under ".test", not only prober data) ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=177309096	2017-12-01 22:14:06 -05:00
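
Commit 9d2b1e7572 above describes moving all "set" parameters to the keys=value1,value2,value3 style while still accepting key=value1&key=value2 with a warning during the transition. A minimal sketch of that idea follows. The class name, method name, and parameter names are hypothetical and are not taken from the Nomulus source; the real implementation may differ in how it reads requests and logs warnings.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration only; not the actual Nomulus code. */
final class SetParameterSketch {

  /**
   * Reads a set-valued request parameter.
   *
   * <p>Prefers the plural, comma-separated form (keys=value1,value2,value3).
   * If the plural key is absent, falls back to the repeated singular form
   * (key=value1&key=value2) and prints a warning, mirroring the transition
   * period described in the commit message.
   */
  static Set<String> extractSet(
      Map<String, List<String>> params, String pluralKey, String legacySingularKey) {
    Set<String> result = new LinkedHashSet<>();
    List<String> plural = params.get(pluralKey);
    if (plural != null) {
      for (String raw : plural) {
        for (String value : raw.split(",")) {
          if (!value.isEmpty()) {
            result.add(value);
          }
        }
      }
      return result;
    }
    List<String> legacy = params.get(legacySingularKey);
    if (legacy != null && !legacy.isEmpty()) {
      // Assumption: a plain stderr message stands in for real logging.
      System.err.println(
          "WARNING: repeated '" + legacySingularKey + "' is deprecated; use '"
              + pluralKey + "=a,b,c' instead");
      result.addAll(legacy);
    }
    return result;
  }
}
```

With this sketch, a request carrying tlds=app,dev and one carrying tld=app&tld=dev both yield the set {app, dev}, but only the second prints a deprecation warning.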


125 commits