* Modify DomainCreateFlow to check for an applicable defaultPromoToken
* Add handling for deleted tokens
* Change cache to allocation token cache
* Abstract away cache methods
* Use AllocationToken.getAll in create flow
* Filter out empty tokens
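Below is a rough sketch of the intended token handling in the create flow. The helper and parameter names are hypothetical; it only illustrates the idea of resolving the TLD's configured default promo tokens through a cached bulk lookup while skipping tokens that no longer exist (deleted tokens come back as empty values):

```java
import static com.google.common.collect.ImmutableList.toImmutableList;

import com.google.common.collect.ImmutableList;
import java.util.Optional;
import java.util.function.Function;

class DefaultPromoTokenSketch {
  /**
   * Returns the still-existing default promo tokens, in configuration order.
   *
   * <p>{@code loadTokens} stands in for the cached bulk lookup (e.g. an
   * AllocationToken.getAll-style method); deleted tokens are represented as
   * empty Optionals and are filtered out.
   */
  static <K, T> ImmutableList<T> resolveDefaultPromoTokens(
      ImmutableList<K> defaultPromoTokenKeys,
      Function<ImmutableList<K>, ImmutableList<Optional<T>>> loadTokens) {
    return loadTokens.apply(defaultPromoTokenKeys).stream()
        .filter(Optional::isPresent) // drop deleted/empty tokens
        .map(Optional::get)
        .collect(toImmutableList());
  }
}
```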
This is an 'easy' upgrade that requires a minor change in
common/build.gradle and the removal of an unnecessary import in buildSrc.
Gradle 7.4 and above introduces breaking changes that are incompatible with the
latest nebula lint plugin, so we may have to wait a while before upgrading further.
Due to the way the Beam pipeline is designed, it expands a recurring
billing event when its event time, rather than its billing time, is in
scope for expansion. This means that the OneTime is generated 45 days
earlier, which negates the need to check whether the expansion has
finished before generating monthly invoices.
We will need to backfill the past 45 days of OneTimes before the new
code is deployed. As an illustration, with the old code, a cursor time
of 2023-01-17 means that all auto-renewals whose billing time is before
2023-01-17 were created, which corresponds to an effective cursor time
of 2022-12-03 (45 days before 2023-01-17) for event time. This cursor
will need to be brought to 2023-01-17 to ensure that there is no gap in
generated event times when switching to use the new code.
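As a sanity check on the dates above, here is a small standalone sketch (purely illustrative, not repository code) of the 45-day relationship between the billing-time cursor and the effective event-time cursor:

```java
import java.time.LocalDate;

class CursorOffsetCheck {
  // Billing time trails event time by the 45-day autorenew grace period.
  private static final int BILLING_TIME_OFFSET_DAYS = 45;

  public static void main(String[] args) {
    LocalDate billingTimeCursor = LocalDate.parse("2023-01-17");
    // Effective event-time cursor under the old code: 45 days earlier.
    LocalDate effectiveEventTimeCursor =
        billingTimeCursor.minusDays(BILLING_TIME_OFFSET_DAYS);
    System.out.println(effectiveEventTimeCursor); // prints 2022-12-03
    // The cursor must be moved up to 2023-01-17 before the new code (which
    // works on event time) is deployed, so that event times in
    // [2022-12-03, 2023-01-17) are backfilled and no gap is left.
  }
}
```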
Currently we synthesize an instance ID, which requires the use of the
App Engine Modules API. Given that we only have one version of code
running at any time, and HTTP is stateless, there is no point in
tracking exactly which GAE "instance" is serving a request. We do lose
information on which service (default, backend, etc.) is writing the
metric, but that does not seem very important.
Using a constant fake instance ID allows us to get rid of another GAE
dependency.
The temporary queue.xml file is not deleted in the afterEach() method,
which likely causes the flaky tests we have seen, as concurrent tests
can overwrite each other's copy of the file.
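A minimal JUnit 5 sketch of the fix (file location and class name are illustrative, not the actual extension code):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import org.junit.jupiter.api.AfterEach;

class QueueXmlCleanupSketch {
  // Wherever the test extension writes its temporary queue.xml (illustrative path).
  private final Path queueXml = Path.of(System.getProperty("java.io.tmpdir"), "queue.xml");

  @AfterEach
  void afterEach() throws IOException {
    // Remove the temporary file so concurrently running tests cannot clobber
    // each other's copy.
    Files.deleteIfExists(queueXml);
  }
}
```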
Most of its usage can be replaced by JpaIntegrationTestExtension. In
places where specific GAE APIs are still needed, namely where the pull
queue or the User service is used, two simplified extensions are used
instead, which makes those usages much easier to identify once the APIs
are no longer needed.
We've been using it in production for three weeks now. Everything seems
to be working fine. Removing the code related to checking the migration
state and using the override.
This will replace the ExpandRecurringBillingEventsAction, which has a
couple of issues:
1) The action starts with too many Recurrings that are later filtered out
because their expanded OneTimes are not actually in scope. This is due
to the Recurrings not recording their latest expanded event time, and
therefore many Recurrings that are not yet due for renewal get included
in the initial query.
2) The action works sequentially, which exacerbates the issue in 1) and
makes it very slow to run if the window of operation is wider than
one day, which in turn makes it impossible to run any catch-up
expansions with any significant gap to fill.
3) The action only expands the recurrence when the billing time becomes
   due, but most of its logic works on event time, which is 45 days
   before billing time, making the code hard to reason about and
   error-prone. This has led to b/258822640, where a premature
   optimization intended to fix 1) caused some autorenewals to not be
   expanded correctly when subsequent manual renewals within the
   autorenew grace period closed the original recurrence.
The new pipeline addresses the above issues in the following ways:
1) Update the recurrenceLastExpansion field on the Recurring when a new
   expansion occurs, and narrow down the Recurrings in scope for
   expansion by only looking for the ones that have not been expanded
   for more than a year (see the query sketch after this list).
2) Make it a Beam pipeline so expansions can happen in parallel. The
Recurrings are grouped into batches in order to not overwhelm the
database with writes for each expansion.
3) Create new expansions when the event time, as opposed to billing
time, is within the operation window. This streamlines the logic and
makes it clearer and easier to reason about. This also aligns with
   how other (cancellable) operations for which there are accompanying
   grace periods are handled, where the corresponding data is always
speculatively created at event time. Lastly, doing this negates the
need to check if the expansion has finished running before generating
the monthly invoices, because the billing events are now created not
just-in-time, but 45 days in advance.
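A rough sketch of the narrowed query from 1), assuming hypothetical entity and field names (the real schema may differ):

```java
import java.time.ZonedDateTime;
import java.util.List;
import javax.persistence.EntityManager;

class RecurringsInScopeSketch {
  /** Loads recurrences that may need expansion within [windowStart, windowEnd). */
  static List<?> loadRecurringsInScope(
      EntityManager em, ZonedDateTime windowStart, ZonedDateTime windowEnd) {
    return em.createQuery(
            // Skip recurrences already expanded within the past year, and only
            // pick up those whose event times can fall inside the window.
            "FROM Recurring r WHERE r.recurrenceLastExpansion < :oneYearAgo "
                + "AND r.eventTime < :windowEnd AND r.recurrenceEndTime > :windowStart")
        .setParameter("oneYearAgo", windowEnd.minusYears(1))
        .setParameter("windowEnd", windowEnd)
        .setParameter("windowStart", windowStart)
        .getResultList();
  }
}
```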
Note that this PR only adds the pipeline. It does not switch the default
behavior to using the pipeline, which is still done by
ExpandRecurringBillingEventsAction. We will first use this pipeline to
generate missing billing events and domain histories caused by
b/258822640. This also allows us to test it in production, as it
backfills data that will not affect ongoing invoice generation. If
anything goes wrong, we can always delete the generated billing events
and domain histories, based on the unique "reason" in them.
This pipeline can only run after we switch to use SQL sequence based ID
allocation, introduced in #1831.
We can use the saved refresh token associated with the nomulus tool to
request an ID token with an audience of the IAP client in order to
satisfy IAP when using the Nomulus tool.
Note: this requires that the user of the Nomulus tool, e.g.
"gbrodman@google.com" has a User object stored in SQL.
Tested on QA
This is similar to where we store the SQL files for beam pipelines, and
frankly makes more sense. Also streamlined the use of the API to read
SQL files from a jar.
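For reference, a generic way to read a bundled SQL file from the jar via the classpath (resource path and class name are illustrative):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class SqlResourceReader {
  /** Reads a SQL file that is packaged into the jar as a classpath resource. */
  static String readSql(String resourcePath) throws IOException {
    try (InputStream in = SqlResourceReader.class.getResourceAsStream(resourcePath)) {
      if (in == null) {
        throw new IOException("Resource not found: " + resourcePath);
      }
      return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }
  }
}
```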
The AppEngine thread factory is only useful if we can't create our own
(this is no longer the case) or if we need access to AppEngine APIs
(this is no longer the case).
The Concurrent class is only used by the DNS writer and the
CreateGroupsAction.
This parameter is misleading and does not do what it purports to do;
namely, it does not impact the level of parallelism. Given an input of n for this
parameter, and m for the batch size, the elements are divided (keyed) into n
groups, each of which is then spread evenly across all threads, and the elements
on each thread are eventually batched into batches of size m:
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/GroupIntoBatches.java#L227
This is also evident in the implementation itself, where the ShardedKey
is determined by the unique number for a worker/thread combo and the
original key:
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/GroupIntoBatches.java#L268
Using a more concrete example, suppose we have 100 elements and 10
worker threads, with a target batch size of 5. If the "shard" number is set to
1, we first spread the 100 elements across 10 threads, resulting in 10
elements per thread; each thread then batches its elements into 2
batches of size 5.
If the "shard" number is set to 2, the 100 elements are first divided into 2
"shards" of 50 each. Each "shard" is then distributed within the 10
threads, resulting in 5 elements per "shard" per thread. They then get
turned into 1 batch per "shard" per thread. In the end, each thread
still processes 2 batches, even though they are from 2 different "shards".
Therefore this "shard" number does not perform horizontal partitioning
that one normally associates with sharding, and provides no
performance benefits but rather confuses the user.
It is also suggested that using withShardedKey() alone is already
sufficient to achieve auto-sharding within the keyed group. There is no
need to manually divide the input by keying the elements differently based on
the "shard" number specified:
https://youtu.be/jses0W4Zalc?t=967
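To make the recommendation concrete, here is a minimal self-contained Beam pipeline (not the actual expansion pipeline) that batches 100 elements under a single key into batches of 5, relying solely on withShardedKey() for auto-sharding:

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.KvCoder;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.coders.VarIntCoder;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.GroupIntoBatches;
import org.apache.beam.sdk.util.ShardedKey;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

public class GroupIntoBatchesDemo {
  public static void main(String[] args) {
    Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // 100 elements under a single key, mirroring the example above.
    PCollection<KV<String, Integer>> elements =
        pipeline.apply(
            Create.of(
                    IntStream.range(0, 100)
                        .mapToObj(i -> KV.of("all", i))
                        .collect(Collectors.toList()))
                .withCoder(KvCoder.of(StringUtf8Coder.of(), VarIntCoder.of())));

    // withShardedKey() lets the runner auto-shard the single key across workers;
    // no manual "shard" number is needed to achieve parallelism.
    PCollection<KV<ShardedKey<String>, Iterable<Integer>>> batches =
        elements.apply(
            "BatchIntoGroupsOf5",
            GroupIntoBatches.<String, Integer>ofSize(5).withShardedKey());

    pipeline.run().waitUntilFinish();
  }
}
```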
Some retriers are no longer needed because transactions are
automatically retried by the JPA transaction manager when there's a
transient exception.
We no longer use App Engine Remote API as of #1858.
The pipeline endpoint was only used for GAE MapReduce, which we stopped
using a while ago.
TESTED=deployed to alpha and used nomulus tool built from master to
connect to the tools service.