Add a Beam pipeline to expand recurring billing events (#1881)

This will replace the ExpandRecurringBillingEventsAction, which has
several issues:

1) The action starts with too many Recurrings that are later filtered out
   because their expanded OneTimes are not actually in scope. This happens
   because a Recurring does not record its latest expanded event time,
   so many Recurrings that are not yet due for renewal get included in
   the initial query.

2) The action works in sequence, which exacerbates the issue in 1) and
   makes it very slow to run if the window of operation is wider than
   one day, which in turn makes it impossible to run any catch-up
   expansions with any significant gap to fill.

3) The action only expands the recurrence when the billing time becomes
   due, but most of its logic works on event time, which is 45 days
   before billing time, making the code hard to reason about and
   error-prone. This has led to b/258822640, where a premature
   optimization intended to fix 1) caused some autorenewals to not be
   expanded correctly when subsequent manual renews within the autorenew
   grace period closed the original recurrence.

As a result, the new pipeline addresses the above issues in the
following way:

1) Update the recurrenceLastExpansion field on the Recurring when a new
   expansion occurs, and narrow down the Recurrings in scope for
   expansion by only looking for the ones that have not been expanded for
   more than a year.

2) Make it a Beam pipeline so expansions can happen in parallel. The
   Recurrings are grouped into batches in order to not overwhelm the
   database with writes for each expansion.

3) Create new expansions when the event time, as opposed to billing
   time, is within the operation window. This streamlines the logic and
   makes it clearer and easier to reason about. It also aligns with how
   other (cancellable) operations that have accompanying grace periods
   are handled, where the corresponding data is always speculatively
   created at event time. Lastly, doing this removes the need to check
   whether the expansion has finished running before generating the
   monthly invoices, because the billing events are now created not
   just-in-time, but 45 days in advance.
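
The scoping rules in 1) and 3) can be sketched roughly as follows. This is
a simplified illustration, not the pipeline's actual code: the class and
method names are hypothetical, plain dates stand in for timestamps, and
the exact boundary conditions (strict vs. inclusive comparisons, and
measuring "more than a year" against the end of the window) are
assumptions.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; all names here are hypothetical. */
final class ExpansionSketch {
  /** Billing time trails event time by 45 days, as described above. */
  static final int BILLING_OFFSET_DAYS = 45;

  /** Simplified stand-in for a Recurring (dates instead of timestamps). */
  static final class Recurring {
    final LocalDate eventTime;     // first recurrence event time
    final LocalDate lastExpansion; // mirrors recurrenceLastExpansion
    final LocalDate endTime;       // recurrence end time (assumed exclusive)

    Recurring(LocalDate eventTime, LocalDate lastExpansion, LocalDate endTime) {
      this.eventTime = eventTime;
      this.lastExpansion = lastExpansion;
      this.endTime = endTime;
    }
  }

  /** Assumed rule: in scope if the last expansion is more than a year before windowEnd. */
  static boolean inScope(Recurring r, LocalDate windowEnd) {
    return r.lastExpansion.plusYears(1).isBefore(windowEnd);
  }

  /** Yearly event times in [start, end) that are still open and not yet expanded. */
  static List<LocalDate> eventTimesToExpand(Recurring r, LocalDate start, LocalDate end) {
    List<LocalDate> result = new ArrayList<>();
    for (int n = 0; ; n++) {
      // Always add years to the original event time to avoid Feb 29 drift.
      LocalDate t = r.eventTime.plusYears(n);
      if (!t.isBefore(end) || !t.isBefore(r.endTime)) {
        break; // past the window, or the recurrence has been closed
      }
      if (!t.isBefore(start) && t.isAfter(r.lastExpansion)) {
        result.add(t);
      }
    }
    return result;
  }

  /** Billing time for an expansion created at the given event time. */
  static LocalDate billingTime(LocalDate eventTime) {
    return eventTime.plusDays(BILLING_OFFSET_DAYS);
  }

  public static void main(String[] args) {
    Recurring r = new Recurring(
        LocalDate.of(2021, 3, 1), LocalDate.of(2021, 3, 1), LocalDate.of(9999, 12, 31));
    LocalDate start = LocalDate.of(2022, 1, 1);
    LocalDate end = LocalDate.of(2023, 1, 10);
    if (inScope(r, end)) {
      for (LocalDate t : eventTimesToExpand(r, start, end)) {
        System.out.println("event time " + t + ", billing time " + billingTime(t));
      }
    }
  }
}
```

Keying everything on event time means the billing time is a pure function
of the expansion's event time, so no separate billing-time window has to
be tracked.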

Note that this PR only adds the pipeline. It does not switch the default
behavior to using the pipeline, which is still done by
ExpandRecurringBillingEventsAction. We will first use this pipeline to
generate missing billing events and domain histories caused by
b/258822640. This also allows us to test it in production, as it
backfills data that will not affect ongoing invoice generation. If
anything goes wrong, we can always delete the generated billing events
and domain histories, based on the unique "reason" in them.

This pipeline can only run after we switch to using SQL-sequence-based
ID allocation, introduced in #1831.
Lai Jiang 2023-01-09 17:41:56 -05:00 committed by GitHub
parent 9789cf3b00
commit 2294c77306
29 changed files with 1194 additions and 27 deletions

@@ -76,6 +76,7 @@
reason text not null,
domain_name text not null,
recurrence_end_time timestamptz,
recurrence_last_expansion timestamptz not null,
recurrence_time_of_year text,
renewal_price_amount numeric(19, 2),
renewal_price_currency text,
@@ -774,6 +775,7 @@ create index IDXd3gxhkh0jk694pjvh9pyn7wjc on "BillingRecurrence" (registrar_id);
create index IDX6syykou4nkc7hqa5p8r92cpch on "BillingRecurrence" (event_time);
create index IDXoqttafcywwdn41um6kwlt0n8b on "BillingRecurrence" (domain_repo_id);
create index IDXp3usbtvk0v1m14i5tdp4xnxgc on "BillingRecurrence" (recurrence_end_time);
create index IDXp0pxi708hlu4n40qhbtihge8x on "BillingRecurrence" (recurrence_last_expansion);
create index IDXjny8wuot75b5e6p38r47wdawu on "BillingRecurrence" (recurrence_time_of_year);
create index IDX3y752kr9uh4kh6uig54vemx0l on "Contact" (creation_time);
create index IDXtm415d6fe1rr35stm33s5mg18 on "Contact" (current_sponsor_registrar_id);