Commit graph

1434 commits

Author SHA1 Message Date
Ben McIlwain
1dc810dd93 Don't enforce billing account map check on TEST TLDs (#1622)
* Don't enforce billing account map check on TEST TLDs

This was affecting monitoring (i.e. prober TLDs). Note that test TLDs are
already excluded from the billing account map check in the Registrar builder()
method (see PR #1601), but we forgot to make that same test TLD exclusion in the
EPP flows check (see PR #1605).
2022-05-04 16:59:25 -04:00
Michael Muller
7dfdffbbad Add missing transaction for whois lookups (#1614)
* Add missing transaction for whois lookups

Nameserver whois lookups are failing under SQL for hosts with superordinate
domains because the query in this case is not done in a transaction.  We
missed this during testing because a) we didn't have a test for lookups of
hosts with superordinate domains and b) we missed converting
NameserverWhoisResponseTest to a DualDatabaseTest.

This PR fixes the problem and adds the requisite testing.

* Use a single transaction to get host registrars

* Replace streaming with Maps.toMap()
2022-05-04 07:29:45 -04:00
Weimin Yu
9c564f9d85 Downgrade dependencies that no longer support Java8 (#1617)
* Downgrade dependencies that no longer support Java8

Downgrade two dependencies whose latest versions no longer support
java8.

A follow up PR will add java8 compatibility to presubmit tests.
2022-05-04 02:03:34 -04:00
Weimin Yu
f5981a1bf9 Use Gradle dependency dynamic versioning (#1612)
* Use Gradle dependency dynamic versioning

Use dynamic versioning for Gradle dependencies when possible.
Please refer to go/dr-dependency-upgrade for more information about the
automation plan.

This PR calls out all dependencies that must be pinned to specific
versions for various reasons. The remaining ones are converted to
open-ended version ranges ("[version_str,)").
2022-05-02 14:10:52 -04:00
Ben McIlwain
a321f3f36b Re-enable prober data deletion cron jobs in prod & sandbox (#1611)
This reverts commit 52c759d1db.
2022-05-02 13:46:02 -04:00
sarahcaseybot
cea132f536 Check PAK is present on billing domain flows (#1605)
* Check PAK on domain create

* Add unit test

* update docs

* Remove unneccesary setup

* Fix blank line

* Add check and test to all relevant flows

* Change error message
2022-04-29 14:30:24 -04:00
Rachel Guan
e1d36bd728 Change from assertThat(assertThrow()) to thrown = assertThrows() then assertThat(thrown) (#1606) 2022-04-28 16:09:36 -04:00
Michael Muller
d61a4671c8 Ignore version UIDs during txn deserialization (#1607)
* Ignore version UIDs during txn deserialization

When deserializing transactions for replay to datastore, ignore class version
UIDs that don't match those of the local classes and just use the local class
descriptors instead.  This is a simple solution for the problem of persisted
VKeys containing references to classes where the class has been updated and
the serial version UID has changed.

Also add a "replay_txns" command that replays the transactions from a given
start point so we can verify all transactions are deserializable.

TESTED:
    Ran replay_txns against all transactions on sandbox beginning with
    transaction id 1828385, which includes Recurring billing events containing
    both the old and the new serial version UIDs.
2022-04-27 15:40:27 -04:00
Ben McIlwain
103f43465c Convert more Guava caches to Caffeine (#1603)
* Convert more Guava caches to Caffeine
2022-04-26 11:26:51 -04:00
Weimin Yu
ed1f4a046e Remove stray file that slipped in the repo (#1604)
* Remove stray file that slipped in the repo
2022-04-25 17:08:58 -04:00
Ben McIlwain
9f145d4634 Validate that a registrar has billing accounts for all its allowed TLDs (#1601)
This will require edits to a substantial number of registrars on sandbox (nearly
all of them) because almost all of them have access to at least one TLD, but
almost none of them have any billing accounts set. Until this is set, any updates
to the existing registrars that aren't adding the billing accounts will cause
failures.

Unfortunately, there wasn't any less invasive foolproof way to implement this
change, and we already had one attempt to implement it on create registrar
command that wasn't working (because allowed TLDs tend not to be added on
initial registrar creation, but rather, afterwards as an update).
2022-04-22 16:33:18 -04:00
sarahcaseybot
0371ef54e6 Don't fail invoicing on missing PAK (#1595)
* Don't fail invoicing on missing PAK

* Skip line if missing PAK

* Add log check in test
2022-04-22 13:00:50 -04:00
Ben McIlwain
a4fb66b94f Downgrade Caffeine to 2.9.3 (#1600)
* Downgrade Caffeine to 2.9.3

Apparently Caffeine >=3.* requires Java 11, and we're still stuck on Java 8
because of App Engine Standard.  Fortunately this doesn't affect the exposed
interface we're using, so we can simply go back to the newest Caffeine version
once Registry 3.0 Phase 3 (GKE migration) is completed.
2022-04-20 14:05:37 -04:00
Lai Jiang
6fe63c65cb Remove the BEAM RDE pipeline side job (#1599)
Now that SQL is the default, we do not need this side job to run
alongside the main one. Its purpose was to validate the BEAM pipeline
while Datastore was primary.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1599)
<!-- Reviewable:end -->
2022-04-20 12:12:23 -04:00
Rachel Guan
5e62595de8 Fix build warning for unused variable (#1594) 2022-04-19 14:11:23 -04:00
Lai Jiang
ee5b247883 Fix build (#1597) 2022-04-19 08:27:06 -04:00
Rachel Guan
2a6d4298c6 Replace .get()).isEqualTo() with hasValue() (#1589)
* Replace .get()).isEqualTo() with hasValue()

* Use containsExactly for list comparison

* Fix spacing
2022-04-18 18:35:46 -04:00
gbrodman
9939833c25 Create a Dataflow pipeline to resave EPP resources (#1553)
* Create a Dataflow pipeline to resave EPP resources

This has two modes.

If `fast` is false, then we will just load all EPP resources, project them to the current time, and save them.

If `fast` is true, we will attempt to intelligently load and save only resources that we expect to have changes applied when we project them to the current time. This means resources with pending transfers that have expired, domains with expired grace periods, and non-deleted domains that have expired (we expect that they autorenewed).
2022-04-15 15:46:35 -04:00
Rachel Guan
94017b694e Make code change for AllocationToken schema change (#1581)
* Make code change for AllocationToken schema change
2022-04-15 15:28:39 -04:00
Lai Jiang
98461c1875 Fix a in issue with RDE (#1593)
For some inexplicable reason, the RDE beam pipeline in both sandbox and
production has been broken for the past week or so. Our investigations
revealed that during the CoGropuByKey stage, some repo ID -> revision ID
pairs were duplicated. This may be a problem with the Dataflow runtime
which somehow introduced the duplicate during reshuffling.

This PR attempts to fix the symptom only by deduping the revision IDs. We
will do some more investigation and possibly follow up with the Dataflow
team if we determine it is an upstream issue.

TESTED=deployed the pipeline and successfully run sandbox RDE with it.
2022-04-15 10:32:44 -04:00
Ben McIlwain
a3b8ad4cfc Begin migration from Guava Cache to Caffeine (#1590)
* Begin migration from Guava Cache to Caffeine

Caffeine is apparently strictly superior to the older Guava Cache (and is even
recommended in lieu of Guava Cache on Guava Cache's own documentation).

This adds the relevant dependencies and switch over just a single call site to
use the new Caffeine cache. It also implements a new pattern, asynchronously
refreshing the cache value starting from half of our configuration time. For
frequently accessed entities this will allow us to NEVER block on a load, as it
will be asynchronously refreshed in the background long before it ever expires
synchronously during a read operation.
2022-04-14 13:38:53 -04:00
Rachel Guan
9ec303bd71 Change to hasSize() for assertions (#1588)
* Change to hasSize() for assertions
2022-04-13 18:20:34 -04:00
Rachel Guan
f79e4740f6 Add new columns to BillingEvent (#1573)
* Add new columns to BillingEvent.java

* Improve PR and modifyJodaMoneyType to handle null currency in override

* Add test cases for edge cases of nullSafeGet in JodaMoneyType

* Improve assertions
2022-04-11 20:09:26 -04:00
Weimin Yu
430e136920 Remove dos.xml from the configs (#1587)
* Remove dos.xml from the configs

We don't have dos config right now, and applying dos from "gcloud app
deploy" is deprecated and has started causing problems.

If we add dos configs, it should be using "gcloud app firewall-rules".
2022-04-11 15:22:42 -04:00
Weimin Yu
d5f2529b07 Remove Optional.isEmpty() in code (#1585)
* Remove Optional.isEmpty() in code
2022-04-08 21:30:22 +00:00
Ben McIlwain
99b1e93cc4 Canonicalize domain/host names in nomulus tool commands (#1583)
* Canonicalize domain/host names in nomulus tool commands

This helps prevent some common user errors.
2022-04-06 18:35:38 -04:00
Michael Muller
6314eb4ee7 Ignore read-only when saving commit logs (#1584)
* Ignore read-only when saving commit logs

Ignore read-only when saving commit logs and commit log mutations so that we
can safely replicate in read-only mode.  This should be safe, as we only ever
to the situation of saving commit logs and mutations when something has
already actually been modified in a transaction, meaning that we should have hit
the "read only" sentinel already.

This also introduces the ability to set the Clock in the
TransactionManagerFactory so that we can test this functionality.

* Changes per review

* Fix issues affecting tests

- Restore clobbered async phase in testNoInMigrationState_doesNothing
- Restore system clock to TransactionManagerFactory to avoid affecting other
  tests.
2022-04-06 13:06:08 -04:00
sarahcaseybot
82a215c807 Change use of BillingIdentifier to BillingAccountMap in invoicing pipeline (#1577)
* Change billingIdentifier to BillingAccountMap in invoicing pipeline

* Add a default for billing account map

* Throw error on missing PAK

* Add unit test
2022-04-04 16:16:43 -04:00
Michael Muller
e74113247f Check for error suggesting another nomulus running (#1582)
Check for a PSQLException referencing a failed connection to "google:5433",
which likely indicates that there is another nomulus tool instance running.

It's worth giving this hint because in cases like this it's not at all obvious
that the other instance of nomulus is problematic.
2022-04-04 11:14:43 -04:00
Ben McIlwain
396e1fb0cf Add a no-async actions DB migration phase (#1579)
* Add a no-async actions DB migration phase

This needs to be set several hours prior to entering the READONLY stage. This is
not a read-only stage; all synchronous actions under Datastore (such as domain
creates) will continue to succeed. The only thing that will fail is host
deletes, host renames, and contact deletes, as these three actions require a
mapreduce to run before they are complete, and we don't want mapreduces hanging
around and executing during what is supposed to be a short duration READONLY
period.
2022-04-01 16:55:51 -04:00
gbrodman
86226d6ffa Use UrlFetch for RDE and default TLS (1.2) for other URL connections (#1578)
* Use UrlFetch for RDE and default TLS (1.2) for other URL connections

This removes the TLS 1.3-settings in the module providers and,
essentially, reverts the changes in #1535 only to the RdeReporter and
RdeReportActionTest
2022-03-31 14:08:28 -04:00
Michael Muller
0e355e71ed Fix a few references to "Datastore" in comments (#1576)
* Fix a few references to "Datastore" in comments

Fix references to Datastore in the comments of classes that are now SQL-only.
2022-03-30 15:17:38 -04:00
Lai Jiang
cc244fb4df Make a best effort guess on the RDE folder name (prefix) when not provided. (#1574)
We have a cron job that runs the RDE upload action every 4 hours for all
TLD. Normally this should be a no-op beacuse a RDE upload is scheduled
after RDE staging is completed, and when it fails with non-2XX status it
will retry. However if for some reason it failed due to 20X status (like
waiting for the SFTP cursor), it will not retry but rely on the cron job to
catch up.

With the BEAM RDE pipeline every staging job saves all its deposits in a
uniquely named folder to avoid the need to use a lock, which is not
practical in BEAM. However the cron job has no way of knowing what the
prefixes are for each TLD so it will fail in SQL mode.

In this PR we implemented a logic to guess what the prefix should be and
use it, if we are in SQL mode and a prefix is not provided.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1574)
<!-- Reviewable:end -->
2022-03-30 11:36:24 -04:00
Lai Jiang
fc0e461c44 Set the initial worker count for the RDE beam pipeline at 24 (#1572)
* Set the initial worker count for the RDE beam pipeline at 24

This likely will speed up the pipeline by skipping the initially slow
process of spinning up instances.
2022-03-27 22:51:58 -04:00
Weimin Yu
7135219151 Fix sporadic SQL Snapshot failure (#1571)
* Fix sporadic SQL Snapshot failure

The Postgresql set-snapshot statement (called in
JpaTransactionManager.setDatabaseSnapshot() method) must be the first
statement in the SQL transaction.

Currenty the JpaTransaction.transact() method may insert a query for
DatabaseMigrationStateSchedule before the user query when the cache is
empty or the cached value expires.

This PR proactively preloads the cache in RegistryJpaIO to prevent cache
loading  inside the transaction.

This PR also changes some DatabaseSnapshotTest tests to be retrying, in
case they run just after the cache expires. (This has happened before in
CI).
2022-03-25 15:55:00 -04:00
Michael Muller
914b795232 Add a "list_txns" to dump Transaction table (#1569)
* Add a "list_txns" to dump Transaction table

Add the list_txns command which can dump the entire contents of the
Transaction table, either in csv format or as human readable transactions.

The CSV format is useful for storing the transaction table at a specific point
in time for later reference without requiring us to repeatedly hit the
replica.

Creating this without tests because this command has a very short shelf-life
and is really only intended to be run by developers.  Tested all features
locally.

* Reformatted
2022-03-25 12:41:10 -04:00
gbrodman
a55bb7edaf Use TLS v1.3 explicitly in RDE reporting (#1564)
* Use TLS v1.3 explicitly in RDE reporting

The default Java 1.8 TLS version is 1.2 which isn't supported by the
ICANN upload site.
2022-03-25 12:09:46 -04:00
Weimin Yu
187432890a Ignore trivial differences when comparing DB (#1567)
* Ignore trivial differences when comparing DB

Some data difference are due to entity model differences and also
harmless. We should igore them when comparing Datastore and SQL.

  This PR ignores the following diffs:
  - null vs empty collection
  - the empty string in Address.stree field, which is a list
2022-03-23 13:52:31 -04:00
sarahcaseybot
eefd197ad5 Address some tiny TODOs (#1566)
* Address some tiny TODOs

* Format fix
2022-03-23 12:23:29 -04:00
gbrodman
df0758a494 Bump flogger and beam dependency versions (#1562)
* Bump flogger and beam dependency versions

Beam 2.34.0 -> 2.37.0
Flogger 0.7.3 -> 0.7.4

Intellij keeps getting confused about which version of Flogger we're
bringing in. Even though we had previously locked Flogger to 0.7.3, for
some reason it was still bringing in the Beam transitive dependency of
0.6.0 which was causing the a bunch of class initialization errors.

Bumping Beam to 2.34.0 bumps the transitive dependency to 0.7.4 so we
can always use that.
2022-03-22 16:08:32 -04:00
Michael Muller
d9c7a9e9f5 Fix build warning (#1565) 2022-03-22 10:34:36 -04:00
Lai Jiang
b5480185f1 Some code health fixes (#1563)
1. testRun_withPrefix() in RdeUploadActionTest does calls a mock lock
   handler and does not actually try to read from the fake GCS
   implementation. Therefore there's no point settig it up.

2. Remove an unused field in UploadDatastoreBackupActionTest.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1563)
<!-- Reviewable:end -->
2022-03-22 09:49:14 -04:00
Rachel Guan
4941798b4e Remove static methods in back up actions (#1481)
* Remove static methods in back up actions

* Remove BigqueryPollJob helper class

* Add schedule time in task comparison

* Change payload type from byte[] to ByteString
2022-03-17 17:54:20 -04:00
Lai Jiang
917e8d5a1e Fix a subtle issue in BRDA copy caused by Cloud Tasks (#1556)
* Fix a subtle issue in BRDA copy caused by Cloud Tasks

After the Cloud Tasks migration and #1508, the BRDA copy job now
routinely fail on the first try because the revision update is not
commited by the time the Cloud Tasks job enqueued in the same
transaction runs for  the first time. This is because the enqueueing is
a side effect and not part of the transaction. The job eventually
succeeds because of retries.

This PR attempts to mitigate the initial failure by adding a delay to
the enqueued job, and checking the cursor in the job itself to prevent
it from running before the transaction is commited.
2022-03-17 16:32:07 -04:00
Michael Muller
2a2652c7f5 Fix issues with saving and deleting gap records (#1561)
* Fix issues with saving and deleting gap records

Datastore limits us to mutating up to 25 records per transaction.  We
sometimes exceed that when deleting expired gap records.  In addition, it is
theoretically possible for us to accumulate enough continuous gap records to
exceed this count while replaying the original transaction.

Deal with deletion by breaking up the gap records to be deleted into a batch
size that is small enough to be deleted transactionally (in practice, we don't
much care about the transactionality but it doesn't seem like we can delete
batches without it).

Deal with the possibility of too many additions by always breaking out gap
record storage and last transaction number updates into their own
transaction(s) (separate from the replay of the original SQL transaction).
2022-03-17 15:09:59 -04:00
Ben McIlwain
784d193e58 Add domain repo ID and token type indexes to AllocationToken table (#1560)
These are useful for the purposes of filtering by one-time/multi-use tokens, and
for determining which one-time tokens have been used (and if so, for which
domain).
2022-03-17 13:58:45 -04:00
Michael Muller
b248071eb3 Track and replay Transaction table gaps (#1557)
* Track and replay Transaction table gaps

Id gaps in the Transaction table can be the result of a transactions committed
out of order.  To deal with this, keep track of gaps for up to five minutes
and check to see if they've been back-filled prior to applying the next batch
of transactions during reply.

* Changes for review

* Calculate gap expiration time before gap queries

* Reformat.
2022-03-16 11:08:45 -04:00
Ben McIlwain
a3ca052115 Add 3 more SQL indexes to the Host table (#1559)
* Add 3 more SQL indexes to the Host table

These indexes on creationTime, deletionTime, and currentSponsorRegistrarId are
present on the other two EPP resource tables (Domain and Contact), and are
useful for a wide variety of operations/analytics queries.
2022-03-15 22:16:36 -04:00
Weimin Yu
df7c60aec5 Improve cache loading in Registries.java (#1558)
* Improve cache loading in Registries.java

The loader for the TLD cache in Registries.java unnecessarily reads from
another cache when running with SQL, potentially triggering additional
database access. This code runs in the whois query path, and contributes
to the high latency in sandbox.
2022-03-15 22:02:05 -04:00
Ben McIlwain
2427884293 Make DigestType.fromWireValue() more performant (#1555)
* Make DigestType.fromWireValue() more performant
2022-03-15 13:00:54 -04:00