google-nomulus/java/google/registry/env/sandbox/default/WEB-INF/cron.xml
guyben 63785e5149 Remove empty TLD parameter when fanning out without TLDs
TldFanoutAction fans out a given endpoint to all TLDs (either TEST, REAL, or
both).

However, it is also used to route a single endpoint request through a specific
queue (so we can control retries). We do that by setting the TLD selection to
"runInEmpty" rather than "forEachRealTld" or "forEachTestTld".

Currently, using "runInEmpty" still specifies a TLD, but that TLD is the empty
string. This is a bug: it sets the TLD parameter to a bad value. It has only
worked so far because none of the endpoints called with "runInEmpty" used the
TLD parameter.

However, this will (and does) break if either (a) the endpoint accepts an
optional TLD parameter (like deleteProberData does), or (b) the given endpoint
already has a TLD parameter in it (we want to run the endpoint with a single
TLD, but still use the "fanout" to set the right queue).

This CL fixes several things:

- if runInEmpty is given, no TLD parameter is added
- 'runInEmpty' is now mutually exclusive with 'forEach*Tld' and 'excludes'
- added some sanity checks and logging
- removed the buggy and unused "':tld' in path is replaced by TLD"
- in the cron.xml, removed documentation for :tld and the broken :registrar

Note that prior to deleteProberData, none of the endpoints used with the
runInEmpty fanout accepted the TLD parameter.
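
A minimal sketch of what the fixed fanout logic amounts to, assuming Guava is
on the classpath; the class and the "enqueue" helper are illustrative
stand-ins, not the actual TldFanoutAction code:

  import static com.google.common.base.Preconditions.checkArgument;

  import com.google.common.collect.ImmutableSet;
  import com.google.common.collect.Sets;

  public class FanoutSketch {
    static void fanout(
        String queue,
        String endpoint,
        boolean runInEmpty,
        boolean forEachRealTld,
        boolean forEachTestTld,
        ImmutableSet<String> excludes,
        ImmutableSet<String> realTlds,
        ImmutableSet<String> testTlds) {
      if (runInEmpty) {
        // runInEmpty is now mutually exclusive with forEach*Tld and exclude.
        checkArgument(
            !forEachRealTld && !forEachTestTld && excludes.isEmpty(),
            "runInEmpty can't be combined with forEach*Tld or exclude");
        // Enqueue the endpoint exactly once, with NO tld parameter (the buggy
        // code used to add a tld parameter with an empty-string value here).
        enqueue(queue, endpoint, null);
        return;
      }
      checkArgument(
          forEachRealTld || forEachTestTld,
          "must specify runInEmpty or at least one forEach*Tld");
      ImmutableSet.Builder<String> tlds = new ImmutableSet.Builder<>();
      if (forEachRealTld) {
        tlds.addAll(realTlds);
      }
      if (forEachTestTld) {
        tlds.addAll(testTlds);
      }
      for (String tld : Sets.difference(tlds.build(), excludes)) {
        enqueue(queue, endpoint, tld);
      }
    }

    // Hypothetical helper: adds a task for the endpoint to the named queue,
    // appending ?tld=<tld> only when tld is non-null.
    static void enqueue(String queue, String endpoint, String tld) {
      System.out.printf(
          "queue=%s url=%s%s%n", queue, endpoint, tld == null ? "" : "?tld=" + tld);
    }
  }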

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=189954585
2018-04-02 16:24:27 -04:00

<?xml version="1.0" encoding="UTF-8"?>
<cronentries>
<!--
/cron/fanout params:
queue=<QUEUE_NAME>
endpoint=<ENDPOINT_NAME> // URL path of the servlet to launch
runInEmpty // Run once, with no tld parameter (mutually exclusive with the options below)
forEachRealTld // Run for tlds with getTldType() == TldType.REAL
forEachTestTld // Run for tlds with getTldType() == TldType.TEST
exclude=TLD1[&exclude=TLD2] // exclude something otherwise included
-->
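<!--
To illustrate the difference (queue "foo" and endpoint /_dr/task/bar are
hypothetical):
/_dr/cron/fanout?queue=foo&endpoint=/_dr/task/bar&forEachRealTld
enqueues /_dr/task/bar?tld=<tld> on queue "foo" once per REAL TLD, whereas
/_dr/cron/fanout?queue=foo&endpoint=/_dr/task/bar&runInEmpty
enqueues /_dr/task/bar exactly once, with no tld parameter.
-->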
<cron>
<url>/_dr/task/rdeStaging</url>
<description>
This job generates a full RDE escrow deposit as a single gigantic XML document
and streams it to cloud storage. When this job has finished successfully, it'll
launch a separate task that uploads the deposit file to Iron Mountain via SFTP.
</description>
<!--
This only needs to run once per day, but we launch additional jobs in case the
cursor is lagging behind, so it'll catch up to the current date as quickly as
possible. The only job that'll run under normal circumstances is the one that's
close to midnight, since if the cursor is up-to-date, the task is a no-op.
We want it to be close to midnight because that reduces the chance that the
point-in-time code will have to go to the extra trouble of fetching old
versions of objects from Datastore. However, we don't want it to run too
close to midnight, because there's always a chance that a change which was
timestamped before midnight hasn't fully been committed to Datastore. So
we add a 4+ minute grace period to ensure the transactions cool down, since
our queries are not transactional.
-->
<schedule>every 4 hours from 00:07 to 20:00</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/cron/fanout?queue=rde-upload&endpoint=/_dr/task/rdeUpload&forEachRealTld]]></url>
<description>
This job is a no-op unless RdeUploadCursor falls behind for some reason.
</description>
<schedule>every 4 hours synchronized</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/cron/fanout?queue=rde-report&endpoint=/_dr/task/rdeReport&forEachRealTld]]></url>
<description>
This job is a no-op unless RdeReportCursor falls behind for some reason.
</description>
<schedule>every 4 hours synchronized</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/cron/fanout?queue=marksdb&endpoint=/_dr/task/tmchDnl&runInEmpty]]></url>
<description>
This job downloads the latest DNL from MarksDB and inserts it into the database.
(See: TmchDnlServlet, ClaimsList)
</description>
<schedule>every 12 hours synchronized</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/cron/fanout?queue=marksdb&endpoint=/_dr/task/tmchSmdrl&runInEmpty]]></url>
<description>
This job downloads the latest SMDRL from MarksDB and inserts it into the database.
(See: TmchSmdrlServlet, SignedMarkRevocationList)
</description>
<schedule>every 12 hours from 00:15 to 12:15</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/cron/fanout?queue=marksdb&endpoint=/_dr/task/tmchCrl&runInEmpty]]></url>
<description>
This job downloads the latest CRL from MarksDB and inserts it into the database.
(See: TmchCrlServlet)
</description>
<schedule>every 12 hours synchronized</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/cron/fanout?queue=retryable-cron-tasks&endpoint=/_dr/task/syncGroupMembers&runInEmpty]]></url>
<description>
Syncs RegistrarContact changes in the past hour to Google Groups.
</description>
<schedule>every 1 hours synchronized</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/cron/fanout?queue=sheet&endpoint=/_dr/task/syncRegistrarsSheet&runInEmpty]]></url>
<description>
Synchronize Registrar entities to Google Spreadsheets.
</description>
<schedule>every 1 hours synchronized</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/task/resaveAllEppResources]]></url>
<description>
This job resaves all our resources, projected in time to "now".
It is needed for "deleteOldCommitLogs" to work correctly.
</description>
<schedule>1st monday of month 09:00</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/task/deleteOldCommitLogs]]></url>
<description>
This job deletes unreferenced commit logs from Datastore that are older than thirty days.
Since references are only updated on save, if we want to delete "unneeded" commit logs, we
also need "resaveAllEppResources" to run periodically.
</description>
<schedule>3rd monday of month 09:00</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/cron/commitLogCheckpoint]]></url>
<description>
This job checkpoints the commit log buckets and exports the diff since last checkpoint to GCS.
</description>
<schedule>every 1 minutes synchronized</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/cron/fanout?queue=retryable-cron-tasks&endpoint=/_dr/task/exportDomainLists&runInEmpty]]></url>
<description>
This job exports lists of all active domain names to Google Cloud Storage.
</description>
<schedule>every 12 hours synchronized</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/task/deleteContactsAndHosts]]></url>
<description>
This job runs a mapreduce that processes batch asynchronous deletions of
contact and host resources by mapping over all EppResources and checking
for any references to the contacts/hosts in pending deletion.
</description>
<schedule>every 5 minutes synchronized</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/task/refreshDnsOnHostRename]]></url>
<description>
This job runs a mapreduce that asynchronously handles DNS refreshes for
host renames by mapping over all domains and creating DNS refresh tasks
for any domains that reference a renamed host.
</description>
<schedule>every 5 minutes synchronized</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/task/expandRecurringBillingEvents]]></url>
<description>
This job runs a mapreduce that creates synthetic OneTime billing events from Recurring billing
events. Events are created for all instances of Recurring billing events that should exist
between the RECURRING_BILLING cursor's time and the execution time of the mapreduce.
</description>
<schedule>every day 03:00</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/cron/fanout?queue=export-snapshot&endpoint=/_dr/task/exportSnapshot&runInEmpty]]></url>
<description>
This job fires off a Datastore backup-as-a-service job that generates snapshot files in GCS.
It also enqueues a new task to wait on the completion of that job and then load the resulting
snapshot into BigQuery.
</description>
<!-- Keep the task-age-limit for this job's task queue less than this cron interval. -->
<schedule>every day 06:00</schedule>
<target>backend</target>
</cron>
<!--
Removed for the duration of load testing
TODO(b/71607184): Restore after loadtesting is done
<cron>
<url><![CDATA[/_dr/cron/fanout?queue=retryable-cron-tasks&endpoint=/_dr/task/deleteProberData&runInEmpty]]></url>
<description>
This job clears out data from probers and runs once a week.
</description>
<schedule>every monday 14:00</schedule>
<timezone>UTC</timezone>
<target>backend</target>
</cron>
-->
<cron>
<url><![CDATA[/_dr/cron/fanout?queue=retryable-cron-tasks&endpoint=/_dr/task/exportReservedTerms&forEachRealTld]]></url>
<description>
Exports the reserved terms for each real TLD to Google Drive once daily.
</description>
<schedule>every day 05:30</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/cron/readDnsQueue?jitterSeconds=45]]></url>
<description>
Lease all tasks from the dns-pull queue, group by TLD, and invoke PublishDnsUpdates for each
group.
</description>
<schedule>every 1 minutes synchronized</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_ah/sessioncleanup?clear]]></url>
<description>
Delete up to 100 expired _ah_SESSION entities from Datastore.
</description>
<schedule>every 15 minutes</schedule>
<target>backend</target>
</cron>
<cron>
<url><![CDATA[/_dr/cron/fanout?queue=retryable-cron-tasks&endpoint=/_dr/task/verifyEntityIntegrity&runInEmpty]]></url>
<description>
This job verifies entity integrity and runs once daily.
</description>
<schedule>every day 06:30</schedule>
<timezone>UTC</timezone>
<target>backend</target>
</cron>
</cronentries>