* Update the Datastore to SQL migration pipeline
The pipeline now includes all entity types to be migrated by it, and has
completed successfully using the Sandbox data set. The running time in Sandbox
is about 3 hours, extrapolating by entity count to a 12-hour run with
production data. However, actual running time is likely to be longer since
throughput is lower with domains, which accounts for a higher percentage
of the total in production. More optimization will be needed.
The migrated data has not been validated.
* Use better null-handling around registrar certificates
Now with Optional it's always very clear whether they do or do not have values.
isNullOrEmpty() shouldn't be necessary anymore (indeed it wasn't necessary prior
to this either, as the relevant setters in the Registrar builder already coerced
empty strings to null). And also the cert hash is a required HTTP header, so it
will error out in the Dagger component if null or empty long before getting to
any other code.
* Merge branch 'master' into optional-get-certs
* Allow BEAM pipeline to choose JDBC isolation levels
Some BEAM pipelines may only perform READ-ONLY (e.g., reporting) or
blind-write (datastore to sql data migration) operations, which do not
need the default TRANSACTION_SERIALIZABLE isolation level. In such
cases, a less strict level allows better performance.
* Refactor naming and behavior of bulk load methods in TransactionManager
The contract of loadByKeys(Iterable<VKey>) specifies that the method will
throw a NoSuchElementException if any of the specified keys don't exist.
We don't do that before this PR, but now we do.
Existing calls (when necessary) were converted to the new load*
methods, which have the same behavior as the previous methods.
Existing methods were also renamed to be more clear -- see b/176239831
for more details and discussion.
* Add unique constraints on domain_hosts
Add unique constraints on DomainHost (child of DomainBase) and
DomainHistoryHost (child of DomainHistory). DomainHost is non-entity
embedded object and Hibernate does not define indexes automatically.
This should improve read and write performance of the parent entities.
* Add a fast mode to the ResaveAllEppResourcesAction mapreduce
This new mode avoids writing no-op mutations for entities that don't actually
have any changes to write. The cronjobs use fast mode by default, but manual
invocations do not, as manual invocations are often used to trigger @OnLoad
migrations, and fast mode won't pick up on those changes.
* Convert AllocationToken-related classes to tm()
For the most part this is a fairly simple converstion -- changing Key
references to VKey references, using JPA transactions when necessary,
and using the TransactionManager interface. There's a bit of cleanup too
in related code
* Use PollMessageVKey to replace VKey<PollMessage> in DomainBase
* Revert changes to DomainContent
* Use BillingVKey in GracePeriodBase to restore symmetric vkey
* Rebase on HEAD
* Validate SQL credentials in Secret Manager
Load SQL credentials from the SecretManager and compare them with the
ones currently in use in Nomulus server, beam pipeline, and the registry
tool. Normal operations are not affected by failures related to the
SecretManager, be it IOException, insufficient permission , or wrong or
missing credential.
The appengine and compute engine default service accounts must be
granted the permission to access the secret data. In the short term, we
will grant the secretmanager.secretAccessor role to these accounts. In
the long term, with the proposed privilege service, access will be granted
on per-secret basis.
* Modify SignedMarkRevocationList to not swallow CloudSQL failures in unittests
* restore package-lock.json
* Added suppressExceptionUnlessInTest()
* Add a DatabaseMigrationUtils class
* small changes
This parses through all pre-existing Spec11 files in GCS (starting at
2019-01-01 which is basically when the new format started) and maps them
to the new Spec11ThreatMatch objects.
Because the old format stored domain names only and the new format stores
names + repo IDs, we need to retrieve the DomainBase objects from the
point in time of the scan (failing if they don't exist). Because the
same domains appear multiple times (we estimate a total of 100k+ entries
but only 1-2k unique domains) we cache the DomainBase objects that we
retrieve from Datastore.
* Allow disabling UpdateAutoTimestamp updates
Allow us to disable timestamp updates within a try-with-resources block for a
given thread. This functionality will be needed for transaction replays both
to and from datastore.
As part of this, also upgrade the UpdateAutoTimestampTest to a
DualDatabaseTest so we can verify that the functionality works both on
Datastore and Cloud SQL.
* Use 10 workers instead of the default 100 for re-save all EPP resources
The intended/desired effect is to have a larger number of GCS commit log diffs
spread out over a longer period of time, with each diff itself being
significantly smaller. This should retain roughly the same amount of total work
for the async Cloud SQL replication action to have to deal with, but spread
across 10X as much time.
* Add a credential store backed by Secret Manager
Added a SqlCredentialStore that stores user credentials with one level
of indirection: for each credential, an addtional secret is used to
identify the 'live' version of the credential. This is a work in
progress and the overall design is explained in
go/dr-sql-security.
Also added two nomulus commands for credential management. They are
stop-gap measures that will be deprecated by the planned privilege
management system.
* Parameterize the serialization of objects being written to SQL
We shouldn't require that objects written to SQL during a Beam pipeline
be VersionedEntity objects -- they may be non-Objectify entities. As a
result, we should allow the user to specify what the objects are that
should be written to SQL.
Note: we will need to clean up the Spec11PipelineTest more but that can
be out of the scope of this PR.
* Overload the method and add a bit of javadoc
* Actually use the overloaded function
* Resurrect symmetric vkeys when possible.
AbstractVKeyConverter had never been updated to generate symmetric VKeys for
simple (i.e. non-composite) VKey types. Update it to default to generating a
VKey with a simple Ofy key.
With this change, composite VKeys must specify the "compositeKey = true" field
in the With*VKey annotations. Doing so creates an asymmetric (SQL only) VKey
that can triggers higher-level logic to generate a corrected VKey containing
the composite Key.
* Add an action to replay commit logs to SQL
- We import from the commit-log GCS files so that we are sure that we
have a consistent snapshot.
- We use the object weighting (moved to ObjectWeights) to verify that we
are replaying objects in the correct order.
- We add a config setting (default false) of whether or not to replay
the logs.
- The action is triggered after the export if the aforementioned config
setting is on.
* Responses to CR
- Remove triggering of replay from the export action and remove the test
changes
- Add a method to load commit log diffs by transaction
- Replay one Datastore transaction at a time, per SQL transaction
- Minor logging / comment changes
- Change ObjectWeights to EntityWritePriorities and flesh out javadoc
* More CR responses
- Use one transaction per GCS diff file
- Fix up comments minutiae
* Add a class-level javadoc
* Add a log message and some periods
* bit of formatting
* Merge remote-tracking branch 'origin/master' into replayAction
* Handle toSqlEntity rather than toSqlEntities
* Convert DatastoreEntity/SqlEntity to return Optional, not List
We don't have any entities that convert to more than one entity, so we
can use an Optional instead for clarity and simplicity.
* Minor fixes:
- Initialize "requestedByRegistrar" to false (it's non-nullable).
- Store test entities (registrar, hosts and contacts) in JPA.
* Flyway changes
* Add ReplayExtension to DomainDeleteFlowTest
* Check in latest ER diagrams
* Ensure that all relevant Keys can be converted to VKeys
When replaying commit logs to SQL (specifically deletes) we will need to
convert Datastore Keys to SQL VKeys in order to know what (if anything)
to delete.
The test added to EntityTest (and the associated code changes) mean that
for every relevant object we'll be able to call
VKeyTranslatorFactory.createVKey(Key<?>) for all possible keys that we
care about. Note that we do not care about entities that only exist in
Datastore or entities that are non-replicated -- by their nature,
deletes for those types of objects in Datastore are not relevant in SQL.
* Responses to code review
- changing comments / method names
- using ModelUtils
* Reproduce DomainHistory double write failure
* Add fix for cascade sets and clean up hacks
* Fix DatastoreHelper to work with name change.
* Remove Ignored entities from ofy schema
* Make sure post load work happens in GracePeriod
The GracePeriod method with ofy @OnLoad annotation is not called.
Apparently Ofy only checks for @OnLoad on first-class entities,
not embedded ones.
Added a call to this method from DomainContent's OnLoad method.
Reproduced issue with a test and verified that the fix works.
* Rename DatastoreHelper -> DatabaseHelper
It already contains some functionality for dealing with Cloud SQL and will
increasingly contain more, so it should be renamed so that it does not falsely
imply it is specific only to Datastore.
* Allow addition of extra entity classes for VKey conversion
This allows us to create VKeys from Keys for test objects that may not
be part of the original codebase.
This isn't used anywhere directly yet but it will be useful in the
future when testing the replay of SQL transactions.
* Add an extension to verify transaction replay
Add ReplayExtension, which can be applied to test suites to verify that
transactions committed to datastore can be replayed to SQL.
This introduces a ReplayQueue class, which serves as a stand-in for the
current lack of replay-from-commit-logs. It also includes replay logic in
TransactionInfo which introduces the concept of "entity class weights."
Entity weighting allows us store and delete objects in an order that is
consistent with the direction of foreign key and deferred foreign key
relationships. As a general rule, lower weight classes must have no direct or
indirect non-deferred foreign key relationships on higher weight classes.
It is expected that much of this code will change when the final replay
mechanism is implemented.
* Minor fixes:
- Initialize "requestedByRegistrar" to false (it's non-nullable). [reverted
during rebase: non-nullable was removed in another PR]
- Store test entities (registrar, hosts and contacts) in JPA.
* Make testbed save replay
This changes the replay system to make datastore saves initiated from the
testbed (as opposed to just the tested code) replay when the ReplayExtension
is enabled. This requires modifications to DatastoreHelper and the
AppEngineExtension that the ReplayExtension can plug into.
This changes also has some necessary fixes to objects that are persisted by
the testbed (such as PremiumList).
* Make some columns nullable in History tables
xmlBytes is made nullable in all history tables since changes performed
by backend actions would not have it. In addition, epp requests are not saved to
ContactHistory since data may contain PII.
requestedByRegistrar in all history tables are made nullable. This
property is set from metadata in epp requests. Null means not provided.
* Add schema for GracePeriodHistory
Rebase on HEAD
Rebase on HEAD
Rebase on HEAD and rename column
Use OfyService to generate id
Refactor GracePeriodsSubject
Rebase on HEAD
Remove GracePeriodSubject and GracePeriodsSubject
Rebase on HEAD
Rebase on HEAD
Rebase on HEAD
Add gracePeriodHistoryRevisionId and remove some foreign key
* Rebase on HEAD
* Add SQL replay checkpoint object to Datastore
This will be part of the asynchronous commit-log replay to SQL. Whenever
we successfully export commits up to a particular time, we should
persist that time so we don't replay the same commits again (it is not
idempotent)
* Move SqlReplayCheckpoint from DS to SQL
* Responses to CR
* Small SQL persistence fixes to model classes
- Add a createVKey() method to Registry (Registry vkeys are composite)
- Add/fix toSqlEntities() methods in premium and reserved list classes.
* Remove fixes addressed by #866