Commit graph

3940 commits

Author SHA1 Message Date
Rachel Guan
b5c016dda2 Replace email content with placeholders (#1530)
* Replace email content with placeholders

* Improve sample email wording
2022-03-14 11:30:58 -04:00
Weimin Yu
b6acbc77f1 Revise host.inet_addresses query to use gin index (#1550)
* Revise host.inet_addresses query to use gin index
2022-03-09 23:32:17 -05:00
Weimin Yu
9a022ff088 Reorganize new schema changes (#1551)
* Reorganize new schema changes

Reorganized new schema changes and make each flyway script update a
single table.

Each flyway script is executed in a single database transaction so that
the script can be rolled back in one shot. It acquires a shared lock on
all tables touched by the script. This is deadlock-prone because in a
busy database, there may be user queries that attempt to lock the same
set of tables, but in different order. By limiting each script to one
table, we avoid the problem.

We should have some a presubmit check to enforce this rule.

All changes have been deployed to Sandbox out-of-band. When doing so,
we changed all CREATE INDEX statements to CREATE INDEX IF NOT EXISTS.

Future deployments should be able to proceed normally.
2022-03-09 20:47:24 -05:00
Ben McIlwain
13e2b15a9d Add anti-deadlock instructions to DB update README (#1552) 2022-03-09 18:50:15 -05:00
Weimin Yu
c1ba725d3d Revise Host index on inet_addresses (#1549)
* Revise Host index on inet_addresses

The index on the 'inet_addresses array column should be of gin or gist
type, which index individual array elements. We use gin for now since
host updates are not often, and gin has better accuracy.

Since flyway script V108__... has not been deployed, we  edit the file
in place instead of adding a new script.

This will be followed up with a modified query that can take advantage
of the gin index. Until then we don't expect to see performance
improvement.

The suspected bottlenect query in the whois path is:

select * from "Host" where 'non-ip-string' = any(inet_address) and
deletion_time < now();

It needs to be revised into:

select * from "Host" where array['non-ip-string'] <@ inet_address and
deletion_time < now();

The combined change reduces the query time from 90ms to 30ms in Sandbox,
and from 150ms to 40ms in production.

It is unclear if this solves all problem with whois latency.
2022-03-08 14:22:01 -05:00
Ben McIlwain
30c3a51b8d Add deletionTime/inetAddresses indexes to Host table to support WHOIS (#1548)
* Add deletionTime/inetAddresses indexes to Host table to support WHOIS

Weimin identified these as missing, and being the cause of slowdowns in
NameserverLookupByIpCommand that we're seeing in sandbox.

This is the first of two PRs, adding just the Flyway/schema changes. The
second PR adding the Java object model changes is #1547.
2022-03-07 16:07:11 -05:00
Michael Muller
8395d17ba2 Improve Transaction gap processing (#1546)
Skip multiple gaps in one pass and write the correct transaction id to
datastore.
2022-03-07 13:26:18 -05:00
Ben McIlwain
f382f63d00 Add domainRepoId indexes to billing events (#1545)
* Add domainRepoId indexes to billing events #1544

The query analyzer identified this is a missing index on the BillingEvent table,
and I added it for recurrences and cancellations as well as it's likely to be a
problem for them too. "Give me all the billing events associated with a given
domain by its repo ID" seems like a pretty common use case for the DB (and does
appear to be used by our invoicing pipeline).

This is the first of two PRs that makes just the DB changes. The second PR
(#1544) will add the Java code changes, and will be committed after this one is
deployed.
2022-03-07 12:21:40 -05:00
Lai Jiang
2702eeb56d Update the the latest LTS Node version (#1543)
<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1543)
<!-- Reviewable:end -->
2022-03-07 11:49:18 -05:00
Ben McIlwain
7b11b29ea3 Add 9 more indexes to SQL schema (#1541)
* Add 9 more indexes to SQL schema

This indexes were identified as missing by PostgreSQL's query analyzer in our
sandbox environment (where we get enough realistic EPP traffic to identify these
deficiencies).

This is the first of two PRs -- the second PR (#1540) will be merged only after
this one is live in production. Note that this is the PR that actually modifies
the database though, so once this one is deployed we will already have the
benefit of the new indexes.
2022-03-05 17:57:36 -05:00
sarahcaseybot
e41cbd53b9 Check signature length in DS records (#1538)
* Check signature length in DS records

* Small fixes

* Add unit tests

* Formatting fix
2022-03-04 15:18:14 -05:00
gbrodman
0d62ac0410 Use built-in Java URL connections instead of UrlFetchService (#1535)
- Use the standard HttpsURLConnection to write/read data
- Rewrite RdeReporter, Nordn*Action, and Marksdb classes and related
  tests to conform to the new format
- Remove FakeURLFetchService and ForwardingUrlFetchService as they weren't used
- Refactor UrlFetchException to UrlConnectionException
- Refactor UrlFetchUtils to UrlConnectionUtils

I will need to test this on Alpha. Fortunately the connections that
don't require auth (e.g. TMDB downloading) should be testable.
2022-03-04 14:16:22 -05:00
Michael Muller
cc4f2c0c52 Allow replicateToDatastore to skip gaps (#1539)
* Allow replicateToDatastore to skip gaps

As it turns out, gaps in the transaction id sequence number are expected
because rollbacks do not rollback sequence numbers.

To deal with this, stop checking these.

This change is not adequate in and of itself, as it is possible for a gap to
be introduced if two transactions are committed out of order of their sequence
number.  We are currently discussing several strategies to mitigate this.

* Remove println, add a record verification
2022-03-04 09:04:13 -05:00
Weimin Yu
1ae676d3ad Pass stack trace to validate_datastore user (#1537)
* Pass stack trace to validate_datastore user
2022-03-03 16:10:31 -05:00
Weimin Yu
fba4eb4d67 Fix hanging test (#1536)
* Fix hanging test

Tests using the TestServerExtension may hang forever if an underlying
component (e.g., testcontainer for psql) fails. This may be the cause
of the some kokoro runs that timeed out after three hours.
2022-03-03 14:43:32 -05:00
Rachel Guan
eaf8d5e6d5 Inject CloudTasksUtil to AsyncTaskEnqueuer (#1522)
* Inject CloudTasksUtil to AsyncTasksEnqueuer

* Rebase

* Remove QUEUE_ASYNC_DELETE from AsyncTasksEnqueuer

* Refactor create() 

* Remove AppEngineServiceUtil depdendency from AsyncTaskEnqueuer
2022-03-02 11:31:45 -05:00
Weimin Yu
d882847fd7 Save db discrepancies to GCS (#1527)
* Save db discrepancies to GCS
2022-02-28 16:52:09 -05:00
Weimin Yu
e47be4fa2c Add a tools command to launch SQL validation job (#1526)
* Add a tools command to launch SQL validation job

Stopping using Pipeline.run().waitUntilFinish in
ValidateDatastorePipeline. Flex-templalate does not support blocking
wait in the main thread.

This PR adds a new ValidateSqlCommand that launches the pipeline and
maintains the SQL snapshot while the pipeline is running.

This PR also added more parameters to both ValidateSqlCommand and
ValidateDatastoreCommand:
- The -c option to supply an optional incremental comparison start time
- The -r option to supply an optional release tag that is not 'live',
  e.g., nomulus-DDDDYYMM-RC00

If the manual launch option (-m) is enabled, the commands will print the
gcloud command that can launch the pipeline.

Tested with sandbox, qa and the dev project.
2022-02-28 13:14:57 -05:00
Michael Muller
55c368cf13 Nullify contact fields in setContactFields (#1533)
When setting contact fields from a set of DesignatedContact's, nullify the
existing fields so they don't stick around if they're not in the new set.
2022-02-28 10:57:35 -05:00
Michael Muller
fbdee623de Don't reset the update time for TLD updates (#1532)
* Don't reset the update time for TLD updates

It turns out that the reason that the Registrar update timestamp isn't updated
for some of the tests is because the record is updated unchanged.  We can
avoid this problem by not trying to update the registrar to the same value.
So in this case, if the registrar alreay contains the TLD we're adding, don't
try to add it.
2022-02-25 13:09:36 -05:00
Rachel Guan
f7b105e96b Inject CloudTasksUtils to DomainLockUtils (#1519)
* Move enqueueDomainRelock to DomainLockUtils

* Rebase and improve PR

* Inject CloudTaskUtils to DomainLockUtils
2022-02-25 11:38:38 -05:00
gbrodman
5b7514096f Fix DTR creation in one location and clean up replay comparison (#1529)
* Fix DTR creation in one location and clean up replay comparison
2022-02-23 11:07:10 -05:00
Lai Jiang
71b9b7fc7d Disable sending cert expiration emails on sandbox (#1528) 2022-02-22 14:46:27 -05:00
Michael Muller
bb7d0b6b89 Do full database comparison during replay tests (#1524)
* Fix entity delete replication, compare db @ replay

Replay tests currently only verify that the contents of a transaction are
can be successfully replicated to the other database.  They do not verify that
the contents of both databases are equivalent.  As a result, we miss any
changes omitted from the transaction (as was the case with entity deletions).

This change adds a final database comparison to ReplayExtension so we can
safely say that the databases are in the same state.

This comparison is introduced in part as a unit test for the one-line fix for
replication of an "entity delete" operation (where we delete using an entity
object instead of the object's key) which so far has only affected PollMessage
deletion.  The fix is also included in this commit within
JpaTransactionManagerImpl.

* Exclude tests and entities with failing comparisons

* Get all tests to pass and fix more timestamp

Fix most of the unit tests that were broken by this change.

- Fix timestamp updates after grace period changes in DomainContent and for
  TLD changes in Registrar.
- Reenable full database comparison for most DomainCreateFlowTest's.
- Make some test entities NonReplicated so they don't break when used with
  jpaTm().delete()
- Diable checking of a few more entity types that are failing comparisons.
- Add some formatting fixes.

* Remove unnecessary "NoDatabaseCompare"

I turns out that after other fixes/elisions we no longer need these for
any tests in DomainCreateFlowTest.

* Changes for review

* Remove old "compare" flag.

* Reformatted.
2022-02-22 10:49:57 -05:00
Lai Jiang
d355da362f Make a few quality-of-life improvements in CloudTasksUtils (#1521)
* Make a few quality-of-life improvements in CloudTasksUtils

1. Update the method names. There are too many overloaded methods and it
   is hard to figure out which one does which without checking the
   javadoc.

2. Added a method in the task matcher to specify the delay time in
   DateTime, so the caller does not need to convert it to Timestamp.

3. Remove the expilict dependency on a clock when enqueueing a task with
   delay, the clock is now injected directly into the util instance
   itself.
2022-02-18 20:21:56 -05:00
Ben McIlwain
88f274a601 Disable prober data deletion cron job in prod & sandbox (#1525)
* Disable prober data deletion cron job in prod & sandbox

This is going to unnecessarily make the database migration more complex, and we
don't need them that badly. We'll re-enable these cron jobs once we've written
the new version of this action that handles Cloud SQL correctly (the current
version only does Datastore anyway).
2022-02-17 08:46:40 -08:00
Weimin Yu
065ced89e9 Ignore prober data when comparing databases (#1523)
* Ignore prober data when comparing databases

Completely ignore prober data when comparing Datastore and SQL.

Prober data deletions are not propagated from Datastore to SQL. It is
difficult to distinguish soft-deletes from normal updates, therefore
difficult to avoid false positives when looking for differences.
2022-02-15 12:01:20 -05:00
Ben McIlwain
94c33eba7e Make NordnUploadAction resilient to duplicate task queue tasks (#1516)
This is necessary because the Cloud Tasks API is not transactionally enrolled,
so it's possible that multiple tasks might end up being enqueued. We need to be
able to handle them.
2022-02-14 14:59:46 -05:00
Michael Muller
1af163d9ba Fix update timestamps for DomainContent types (#1517)
* Fix update timestamps for DomainContent types

We expect update timestamps to be updated whenever a containing entity is
modified and persisted, but unfortunately Hibernate doesn't seem to do this --
instead it appears to regard such an entity as unchanged.

To work around this, we explicitly reset the update timestamp whenever a
nested collection is modified in the Builder.

Note that this change only solves the problem for DomainContent.  All other
entitities containing UpdateAutoTimestamp will need to be audited and
instrumented with a similar change.

* Fix a handful of tests broken by this change

* Reformatted.
2022-02-14 11:31:03 -05:00
Rachel Guan
27d09af124 Use CloudTasksUtils to enqueue in RegistrarSettingsAction (#1467)
* Use CloudTaskUtils to enqueue

* Add CloudTasksUtilsModule to FrontendComponent

* Fix Uri query issue

* Remove header and check service in matcher

* Use a ThreadLocal boolean in TestServer to determine enqueueing

* Extract enqueuing and email sending from tm().transact()
2022-02-10 11:16:28 -05:00
Weimin Yu
4ca390da98 Compare datastore to sql action (#1507)
* Add action to DB comparison pipeline

Add a backend Action in Nomulus server that lanuches the pipeline for
comparing datastore (secondary) with Cloud SQL (primary).

* Save progress

* Revert test changes

* Add pipeline launching
2022-02-10 10:43:36 -05:00
Rachel Guan
7b86cba44f Fix protobuf-java-util dependency (#1518) 2022-02-09 14:11:09 -05:00
Rachel Guan
c1065daeac Use CloudTasksUtil to enqueue task in IcannReportingStagingAction (#1489)
* Use CloudTasksUtil to enqueue task

* Use schedule time helper and add schedule time comparison
2022-02-09 12:33:56 -05:00
Michael Muller
f66a60673c Fix create/update timestamp replay problems (#1515)
* Fix create/update timestamp replay problems

When CreateAutoTimestamp and UpdateAutoTimestamp are inserted into a
Transaction, their values are not populated in the same way as when they are
stored in the course of an SQL commit.  This results in different timestamp
values between SQL and datastore during the SQL -> DS replay.

Fix this by providing these values from the JPA transaction time when we're
doing transaction serialization.

This change also removes the initialization of the Ofy clock in
ExpandRecurringBillingEventsActionTest.  It's not necessary as the
ReplayExtension already takes care of this and doing it after the
ReplayExtension as we were breaks a test now that the update timestamps are
correct.
2022-02-09 08:48:51 -05:00
Rachel Guan
fb140808a6 Remove ReportingUtils and use CloudTasksUtil to enqueue tasks in GenerateInvoicesAction and GenerateSpec11ReportAction (#1491)
* Remove ReportingUtils and use CloudTaskUtil to enqueue 

* Use schedule time helper to enqueue and update schedule time comparison

* Fix comment, indentation in gradle file and improve time comparison
2022-02-08 17:48:47 -05:00
Rachel Guan
3af6ade080 Change from TaskQueueUtils to CloudTasksUtils in LoadTestAction (#1468)
* Change from TaskQueueUtils to CloudTasksUtils in LoadTestAction

* Put X_CSRF_TOKEN in task headers

* Fix schedule time and gradle issue

* Remove TaskQueue constant dependency

* Double run seconds

* Add comment for X_CSRF_TOKEN
2022-02-08 17:44:24 -05:00
Lai Jiang
004e90913b Make EscrowDepositEncryptor work with BRDA deposits (#1512)
Also make it possible to specify a revision number.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1512)
<!-- Reviewable:end -->
2022-02-07 12:40:00 -05:00
Weimin Yu
201d2ef95b Fix flaky RdeStagingActionDatastoreTest (#1514)
* Fix flaky RdeStagingActionDatastoreTest

Fixed the most common cause that makes one method flaky (Clock and
timestamp problem). Added a TODO to rethink test case.

Also added notes on tasks potentially enqueued multiple times.
2022-02-04 10:40:52 -05:00
Rachel Guan
98ba687005 Add support for delay of duration when scheduling a task (#1493)
* Add support for delay by duration when scheduling task

* Fix comments

* Add test for negative duration

* Change delay parameter type to duration
2022-02-03 22:25:39 -05:00
Lai Jiang
1688d27b4e Correctly delete all stopped versions except for the most recent 3 (#1511)
The gcloud command does some weird stuff with sorting when custom format
is used. Here we instead rely on linux sort and head command to sort the
versions list.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1511)
<!-- Reviewable:end -->
2022-02-03 16:04:58 -05:00
Weimin Yu
cdcdde2c96 Add an index on Host.host_name column (#1510)
* Add an index on Host.host_name column

This field is queried during host creation and needs an index to speed
up the query.

Since Hibernate does not explicitly refer to indexes, we can change the
code and schema in one PR.
2022-02-03 15:57:15 -05:00
gbrodman
4d08fadc11 Use the built-in replicaJpaTm() in RDAP (#1506)
* Use the built-in replicaJpaTm() in RDAP

This includes a test for the replica-simulating transaction manager and
removal of any replica-specific code in RDAP tests, because it's
unnecessary due to the existing tests.
2022-02-03 11:14:26 -05:00
Weimin Yu
77600ba404 Count duplicates when comparing Databases (#1509)
* Count duplicates when comparing Databases

Cursors may have duplicates in Datastore if imported across projects.
Count them instead of throwing.
2022-02-03 10:59:03 -05:00
Lai Jiang
86d227c748 Copy the latest revision of BRDA during upload (#1508)
The revision was hardcoded to 0, which caused problem when we need to
re-run BRDA.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1508)
<!-- Reviewable:end -->
2022-02-02 21:54:42 -05:00
Rachel Guan
de258d4dfc Change from TaskQueueUtils to CloudTasksUtils in RdeStaging (#1411)
* Change from TaskQueueUtils to CloudTaskUtils in RdeStaging
2022-02-01 20:41:56 -05:00
sarahcaseybot
68b0dd2595 Add DS validation to match Cloud DNS (#1487)
* Add DS validation to match Cloud DNS

* Add checks to flows

* Add some flow tests

* Add tests for DomainCreateFlow

* Add tests for UpdateDomainCommand

* Fix docs test

* Small fixes

* Remove builder from tests
2022-02-01 15:25:00 -05:00
Lai Jiang
5b56a38e01 Allow the beam parameter in RDE standard mode (#1505)
Standard mode will determine the watermarks based on the cursors and
kick off subsequent uploading steps. In order to run both the Beam and
the Mapreduce pipeline in parallel, we need to allow setting the beam
parameter when in standard mode. This changes should have been part of
https://github.com/google/nomulus/pull/1500.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1505)
<!-- Reviewable:end -->
2022-01-31 14:20:23 -05:00
gbrodman
1654f5d8e4 Use the replica jpaTm in FKI and EppResource cache methods (#1503)
The cached methods are only used in situations where we don't really
care about being 100% synchronously up to date (e.g. whois), and they're
not used frequently anyway, so it's safe to use the replica in these
locations.
2022-01-28 18:05:18 -05:00
Weimin Yu
ac2c08b6e1 Release ValidateSqlPipeline as container image (#1504)
* Release ValidateSqlPipeline as container image
2022-01-28 14:57:31 -05:00
Weimin Yu
62b2c18791 Release ValidateDatastorePipeline (#1501)
* Release ValidateDatastorePipeline
2022-01-26 13:38:19 -05:00