* Add NotLoggedInException tests to flows and flow docs
This wasn't included in flows.md before because the test existed in
ResourceFlowTestCase. So even though the exception could be thrown, and
even though this was tested, it never appeared in the documentation,
because the documentation is generated from the corresponding concrete
test class.
* Validate SQL with Datastore being primary
Validates the data asynchronously replicated from Datastore to SQL.
This is a short-term tool optimized for the current production database.
Tested in production.
We want to keep the read-only-mode exception as an unchecked exception,
so we introduce a temporary check in the EppController that provides a
specific error message for this situation (rather than letting it fall
through to the generic "command failed" messaging).
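A minimal sketch of the shape of that check, with invented names (ReadOnlyModeException here stands in for whatever the real unchecked exception is called):

```java
/** Illustrative only; everything except the EppController name is an assumption. */
class ReadOnlyModeException extends RuntimeException {}

class EppController {
  String handleCommand(Runnable flow) {
    try {
      flow.run();
      return "1000 Command completed successfully";
    } catch (ReadOnlyModeException e) {
      // Temporary: map the unchecked exception to a targeted message instead
      // of letting it fall through to the generic "command failed" response.
      return "2400 Registry is in read-only mode; please try again later";
    }
  }
}
```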
* Replace with stringify() and VKey.create(string) (see the sketch after this list)
* Convert implicit cases of VKey.fromWebsafeKey(string)
* Convert from Key to VKey to use stringify()
* Modify existing code to show correct string representation of a key
* Use VKey.create(websafeKey) to get ofy key in ResaveEntitiesCommand
* Add TODO note in CommitLogMutation and determine if key string should be modified
* Revert from stringify() to getOfyKey().getString()
* Add bug ids to TODOs
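Taken together, the intended round trip looks roughly like this (a sketch assuming the VKey methods named in the commits above; signatures unverified):

```java
// stringify() replaces getOfyKey().getString(); VKey.create(String) replaces
// the implicit VKey.fromWebsafeKey(String) conversions.
VKey<DomainBase> key = domain.createVKey();
String flattened = key.stringify();
VKey<DomainBase> restored = VKey.create(flattened);
assert restored.equals(key);
```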
* Ignore read-only mode in SQL->DS replication process
We need to be able to save indices and save data about the replication
even when we're in read-only mode.
We can handle it the same way that we handle UpdateAutoTimestamp, where
we simply populate it in SQL if it doesn't exist. This has the following
benefits:
1. The converter is unnecessary code.
2. We get non-null column definitions for free (overridden in
EppResource to allow null creation times so that legacy *History objects
can contain null in that field).
3. More importantly, this allows for proper SQL->DS replay. If the
field is filled out using a converter (as before this PR), then the field
is only actually filled out on transaction commit (rather than when the
write occurs within the transaction). This means that when we serialize
the Transaction object during the transaction (the data that gets
replayed to Datastore), we are crucially missing the creation time.
If the creation time is written on commit, we have to start a new
transaction to write the Transaction object, and it's an absolute
necessity that the record of the transaction be included in the
transaction itself so as to avoid situations where the transaction
succeeds but the record fails.
If the field is filled out in a @PrePersist method, crucially that
occurs on the object write itself (before transaction commit).
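A hedged sketch of the @PrePersist approach (standard JPA; the field and class names are illustrative, not the actual UpdateAutoTimestamp-style code):

```java
import java.time.Instant;
import javax.persistence.Column;
import javax.persistence.MappedSuperclass;
import javax.persistence.PrePersist;

@MappedSuperclass
abstract class CreateTimestampedEntity {
  // Set on the object write itself, before transaction commit, so the value
  // is present when the Transaction object is serialized for SQL->DS replay.
  @Column(nullable = false)
  Instant creationTime;

  @PrePersist
  void fillCreationTime() {
    if (creationTime == null) {
      creationTime = Instant.now();
    }
  }
}
```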
The original RDE pipeline was a direct translation of the App Engine
MapReduce logic. It turned out to be too slow (taking more than a day to
run) due to the way it finds the most recent history entry.
This PR overhauled the pipeline by using embedded EPP resource entities
inside history entries (only available in SQL) and finding the most
recent entries using the SQL engine. It cuts the run time down to ~2h.
Note that there are quota limits on the CPU cores and external IP
addresses for a given GCP region inside a project, which will need to
accommodate the resource requirements for the pipeline. More details are
provided in comments.
Also merged the update cursor stage and enqueue next action stage in
RdeIO so that they can be done within a transaction, same as how
MapReduce handles them.
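Finding the most recent history entry per domain in the SQL engine might look something like this (a sketch; entity and column names are not the actual Nomulus schema):

```java
List<DomainHistory> latest =
    entityManager
        .createQuery(
            "FROM DomainHistory h "
                + "WHERE h.modificationTime = "
                + "  (SELECT MAX(h2.modificationTime) FROM DomainHistory h2 "
                + "   WHERE h2.domainRepoId = h.domainRepoId "
                + "   AND h2.modificationTime <= :watermark)",
            DomainHistory.class)
        .setParameter("watermark", watermark)
        .getResultList();
```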
* Change TaskOptions to Task in CommitLogFanoutAction
* Add a createTask method that takes clock and jitterSeconds
* Change createTask parameter type and improve test cases
* Improve comments and test cases
* Improve test cases that handle jitterSeconds (see the sketch below)
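A hedged sketch of such a createTask helper using the Cloud Tasks v2 client (the real signature in the codebase may differ):

```java
import com.google.cloud.tasks.v2.HttpMethod;
import com.google.cloud.tasks.v2.HttpRequest;
import com.google.cloud.tasks.v2.Task;
import com.google.protobuf.Timestamp;
import java.time.Clock;
import java.time.Instant;
import java.util.Random;

static Task createTask(
    String url, HttpMethod method, Clock clock, int jitterSeconds, Random random) {
  // Spread the schedule time by up to jitterSeconds to avoid thundering herds.
  Instant when =
      Instant.now(clock).plusSeconds(jitterSeconds > 0 ? random.nextInt(jitterSeconds) : 0);
  return Task.newBuilder()
      .setHttpRequest(HttpRequest.newBuilder().setUrl(url).setHttpMethod(method).build())
      .setScheduleTime(Timestamp.newBuilder().setSeconds(when.getEpochSecond()).build())
      .build();
}
```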
* Grandfather in old data for one-time billing event requirement
We have data from 2018 and earlier where we didn't consistently set periodYears
for OneTime BillingEvents with certain reasons. This grandfathers in that old
data so that we can successfully move it over to Cloud SQL for now, then we can
later run a query that will backfill it, after which we can then tighten up the
requirement again. Note that the requirement is still being enforced for all
billing events from 2019 onwards.
This also improves the handling of validation by adding a private field to the
Reason enum rather than creating a throwaway inline ImmutableSet in the
Builder.
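A sketch of what that enum-driven validation might look like (names and the cutoff representation are illustrative):

```java
import java.time.Instant;

enum Reason {
  CREATE(true),
  RENEW(true),
  TRANSFER(true),
  SERVER_STATUS(false);

  private static final Instant CUTOFF = Instant.parse("2019-01-01T00:00:00Z");

  private final boolean requiresPeriodYears;

  Reason(boolean requiresPeriodYears) {
    this.requiresPeriodYears = requiresPeriodYears;
  }

  // Called from the Builder; pre-2019 events are grandfathered in.
  void checkPeriodYears(Integer periodYears, Instant eventTime) {
    if (requiresPeriodYears && periodYears == null && !eventTime.isBefore(CUTOFF)) {
      throw new IllegalStateException("periodYears is required for reason " + name());
    }
  }
}
```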
BSD sed requires a parameter to -i to indicate the backup suffix. By
adding a blank suffix the sed command works on both Linux and macOS.
* Make TaskMatcher default to POST methods
TaskOptions.Builder.withUrl() defaults to the POST method. Therefore, it seems
reasonable to verify that task queue methods are using the POST method,
especially given that the method must now be identified explicitly when using
CloudTaskUtils. This check would have guarded against the bug fixed by #1413;
see the sketch below.
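Sketched test-side effect, assuming a TaskMatcher-style fluent API (not the exact helper code):

```java
// Before: the method had to be stated explicitly in every assertion.
assertTasksEnqueued("async-actions",
    new TaskMatcher().url("/_dr/task/resaveEntity").method("POST"));

// After: POST is the default, so an accidentally-GET task now fails the match.
assertTasksEnqueued("async-actions",
    new TaskMatcher().url("/_dr/task/resaveEntity"));
```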
* Elaborate on comment
* Further improved the comment
* Remove the ineffective SQL injection check
Remove the ineffective SQL-injection attack check in go/r3pr/954. It is
quite restrictive, causing a long exemption list, and it doesn't protect
queries made through helpers such as QueryComposer.
We will start from scratch on a new solution.
* Add the Cloud SQL queries for transaction reports
* Add the remaining queries
* Some query fixes
* Fix comments
* Fix indentation in total_nameservers
* Fix indentation on other Case condition
* Fix InitSqlPipeline regarding synthesized history
There are a few bad domains in Datastore that we hardcoded to ignore
during SQL population. They had no history, so we didn't need to filter
when writing history entries.
Recently we created synthesized history for domains, including the bad
ones, so now we need to filter History entries as well; see the sketch below.
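The added filtering could be as simple as this Beam sketch (identifiers and the accessor are assumptions):

```java
import java.util.Set;
import org.apache.beam.sdk.transforms.Filter;
import org.apache.beam.sdk.values.PCollection;

static PCollection<HistoryEntry> removeIgnoredDomains(
    PCollection<HistoryEntry> histories, Set<String> ignoredRepoIds) {
  // Drop history (including the newly synthesized entries) for the
  // hardcoded bad domains.
  return histories.apply(
      "FilterIgnoredDomainHistory",
      Filter.by(h -> !ignoredRepoIds.contains(h.getDomainRepoId())));
}
```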
* Support shared database snapshot
Allow multiple workers to share a CONSISTENT database snapshot. The
motivating use case is SQL database snapshot loading, where it is too
slow to depend on one worker to load everything.
This currently is postgresql-specific, but will be improved to be
vendor-independent.
Also made sure AppEngineEnvironment.java clears the cached environment
in all cases when tearing down.
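The underlying postgresql mechanism (pg_export_snapshot() plus SET TRANSACTION SNAPSHOT, both standard postgresql features) looks like this over JDBC; the surrounding Nomulus plumbing is omitted:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// One worker opens a transaction and exports a snapshot id...
static String exportSnapshot(Connection conn) throws SQLException {
  conn.setAutoCommit(false);
  try (Statement stmt = conn.createStatement()) {
    stmt.execute("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ");
    try (ResultSet rs = stmt.executeQuery("SELECT pg_export_snapshot()")) {
      rs.next();
      // Valid only while the exporting transaction stays open.
      return rs.getString(1);
    }
  }
}

// ...and every other worker attaches to the same snapshot.
static void attachToSnapshot(Connection conn, String snapshotId) throws SQLException {
  conn.setAutoCommit(false);
  try (Statement stmt = conn.createStatement()) {
    stmt.execute("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ");
    stmt.execute("SET TRANSACTION SNAPSHOT '" + snapshotId + "'");
  }
}
```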
* Update terraform files and instructions
Update proxy terraform files based on current best practices and allow
exclusion of forwarding rules for HTTP endpoints. Specifically:
- Add a "public_web_whois" input to allow disabling the public HTTP
whois forwarding.
- Add "description" fields to all variables.
- Move outputs of the top-level module into "outputs.tf".
- Auto-reformat using hclfmt.
* Make entities serializable for DB validation
Make entities that are asynchronously replicated between Datastore and
Cloud SQL serializable so that they may be used in the BEAM-pipeline-based
comparison tool.
Introduced an UnsafeSerializable interface (extending Serializable) and
added to relevant classes. Implementing classes are allowed some
shortcuts as explained in the interface's Javadoc. Post migration we
will decide whether to revert this change or properly implement
serialization.
Verified with production data.
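At its core the interface is just a marker (a sketch; the real Javadoc documents the allowed shortcuts in detail):

```java
import java.io.Serializable;

/**
 * Marker for entities that are serializable only for the purposes of the
 * Datastore/SQL comparison pipeline; implementations may take shortcuts
 * that would be unsafe for general-purpose serialization.
 */
public interface UnsafeSerializable extends Serializable {}
```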
This is used for the replay locks so that Beam pipelines (which will be
used for database comparison) can acquire / release locks as necessary
to avoid database contention. If we're comparing contents of Datastore
and SQL databases, we shouldn't have replay actively running during the
comparison, so the pipeline will grab the locks.
Beam doesn't always play nicely with loading from / saving to Datastore,
so we need to make sure that we store the replay locks in SQL at all
times, even when Datastore is the primary DB.
* Re-enable replay tests for most environments
This enables the replay tests except in environments where
the NOMULUS_DISABLE_REPLAY_TESTS environment variable is set to "true".
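With stock JUnit 5 this kind of gating can be expressed as below; whether the actual tests use this annotation or a custom extension is an assumption:

```java
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.condition.DisabledIfEnvironmentVariable;

@DisabledIfEnvironmentVariable(named = "NOMULUS_DISABLE_REPLAY_TESTS", matches = "true")
class ReplayExtensionTest {
  @Test
  void commitLogsAreReplayed() {
    // ...
  }
}
```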
* Add a check for null
* Alt entity model for fast JPA bulk query
Defined an alternative JPA entity model that allows fast bulk loading of
multi-level entities, DomainBase and DomainHistory. The idea is to bulk
load the base table as well as the child tables separately, and assemble them
into the target entity in memory in a pipeline.
For DomainBase:
- Defined a DomainBaseLite class that models the "Domain" table only.
- Defined a DomainHost class that models the "DomainHost" table
(nsHosts field).
- Exposed ID fields in GracePeriod so that they can be mapped to domains
after being loaded into memory.
For DomainHistory:
- Defined a DomainHistoryLite class that models the "DomainHistory"
table only.
- Defined a DomainHistoryHost class that models its namesake table.
- Exposed ID fields in GracePeriodHistory and DomainDsDataHistory
classes so that they can be mapped to DomainHistory after being
loaded into memory.
In PersistenceModule, provisioned a JpaTransactionManager that uses
the alternative entity model.
Also added a pipeline option that specifies which JpaTransactionManager
to use in a pipeline.
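The in-memory assembly step might look roughly like this (accessor and factory names are assumptions):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Bulk load each table separately, then join by repo id in memory.
static List<DomainBase> assemble(List<DomainBaseLite> bases, List<DomainHost> hosts) {
  Map<String, List<DomainHost>> hostsByDomain =
      hosts.stream().collect(Collectors.groupingBy(DomainHost::getDomainRepoId));
  return bases.stream()
      .map(b -> b.toDomainBase(hostsByDomain.getOrDefault(b.getRepoId(), List.of())))
      .collect(Collectors.toList());
}
```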
I observed an instance in which a couple queries from this action were,
for whatever reason, hanging around as idle for >30 minutes. Assuming
the behavior that we saw before where "an open idle serializable
transaction means all pg read-locks stick around forever" still holds,
that's the reason why the number of read locks in use spirals out of
control.
I'm not sure why those queries aren't timing out, but that's a separate
issue.
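For reference, such transactions show up in the standard pg_stat_activity view, e.g. via JDBC (threshold arbitrary):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

static void logIdleTransactions(Connection conn) throws SQLException {
  String sql =
      "SELECT pid, now() - xact_start AS open_for, query "
          + "FROM pg_stat_activity "
          + "WHERE state = 'idle in transaction' "
          + "AND now() - xact_start > interval '30 minutes'";
  try (Statement stmt = conn.createStatement();
      ResultSet rs = stmt.executeQuery(sql)) {
    while (rs.next()) {
      System.out.printf("pid=%d open_for=%s%n", rs.getInt("pid"), rs.getString("open_for"));
    }
  }
}
```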
* Fix problems with the format tasks
The format check uses python2, and if "python" doesn't exist on the path
(or isn't python 2, or there is any other error in the python code or in the
shell script...) the format check just silently succeeds.
This change:
- Refactors out the gradle code that finds a python3 executable and uses it
to get the python executable for the format check.
- Upgrades google-java-format-diff.py to python3 and removes the #! line.
- Fixes the shell script to ensure that failures are propagated.
- Suppresses error output when checking for python commands.
Tested:
- verified that python errors cause the build to fail
- verified that introducing a bad format diff causes check to fail
- verified that javaIncrementalFormatDryRun shows the diffs that would be
introduced.
- verified that javaIncrementalFormatApply reformats a file.
- verified that well formatted code passes the format check.
- verified that an invalid or missing PYTHON env var causes
google-java-format-git-diff.sh to fail with the appropriate error.
* Fix presubmit issues
Omit the format presubmit when not in a git repo and remove unused "string"
import.
* Add a beam pipeline to create synthetic history entries in SQL
The logic is mostly lifted from CreateSyntheticHistoryEntriesAction. We
do not need to test for the existence of an embedded EPP resource in the
history entry before creating a synthetic one, because after
InitSqlPipeline runs it is guaranteed that no embedded resource exists.
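The heart of the pipeline is a per-resource DoFn along these lines (a sketch; DomainHistory.createSyntheticFrom is an invented stand-in for the real synthesis logic, and jpaTm() stands for the JPA transaction manager):

```java
import org.apache.beam.sdk.transforms.DoFn;

class CreateSyntheticHistoryEntryFn extends DoFn<DomainBase, Void> {
  @ProcessElement
  public void processElement(@Element DomainBase domain) {
    // No need to check for an existing embedded EPP resource: after
    // InitSqlPipeline runs, none exists.
    jpaTm().transact(() -> jpaTm().put(DomainHistory.createSyntheticFrom(domain)));
  }
}
```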
* Set payload in success response after sending expiring certificate notification emails
* Modify log message and test cases for run() in sendExpiringCertificateNotificationEmailAction
* Resolve merge conflict
* Include reason and requestedByRegistrar in URS test file
* Modify test cases for new parameters in renew flow
* Add reason and registrar_request to renew domain command
* Update comments for new params in renew flow
* Make changes based on feedback