Commit graph

46 commits

Author SHA1 Message Date
Weimin Yu
cbffc454b5 Prober ssl cert update automation (#2019)
Defined CloudBuild script and docker image that automatically
updates probers' SSL certs
2023-05-03 15:57:50 -04:00
gbrodman
53549afd0d Fix typo in pipeline name (#2016) 2023-04-28 17:05:24 -04:00
gbrodman
668a7a47a2 Refactor / rename Billing object classes (#1993)
This includes renaming the billing classes to match the SQL table names,
as well as splitting them out into their own separate top-level classes.
The rest of the changes are mostly renaming variables and comments etc.

We now use `BillingBase` as the name of the common billing superclass,
because one-time events are called BillingEvents
2023-04-28 14:27:37 -04:00
Lai Jiang
2981c1c10c Refactor contact history PII wipeout logic into a Beam pipeline (#1994)
Because we need to check if a contact history is the most recent for its
underlying contact resource, the query-wipe out-repeat loop no longer works
ideally due to the added overhead with the query.

Instead, we refactor the logic into a Beam pipeline where the query only
needs to be performed once and history entries eligible for wipe out are
handled individually in their own transforms. Because history entries
are otherwise immutable, we can run the pipeline in relatively relaxed
repeatable read isolation level. We also do not worry about batching for
performance, as we do not anticipate this operation to put a lot of
strains on the particular table.
2023-04-19 13:04:45 -04:00
Lai Jiang
2294c77306 Add a beam pipeline to expand recurring billing event (#1881)
This will replace the ExpandRecurringBillingEventsAction, which has a
couple of issues:

1) The action starts with too many Recurrings that are later filtered out
   because their expanded OneTimes are not actually in scope. This is due
   to the Recurrings not recording its latest expanded event time, and
   therefore many Recurrings that are not yet due for renewal get included
   in the initial query.

2) The action works in sequence, which exacerbated the issue in 1) and
   makes it very slow to run if the window of operation is wider than
   one day, which in turn makes it impossible to run any catch-up
   expansions with any significant gap to fill.

3) The action only expands the recurrence when the billing times because
   due, but most of its logic works on event time, which is 45 days
   before billing time, making the code hard to reason about and
   error-prone.  This has led to b/258822640 where a premature
   optimization intended to fix 1) caused some autorenwals to not be
   expanded correctly when subsequent manual renews within the autorenew
   grace period closed the original recurrece.

As a result, the new pipeline addresses the above issues in the
following way:

1) Update the recurrenceLastExpansion field on the Recurring when a new
   expansion occurs, and narrow down the Recurrings in scope for
   expansion by only looking for the ones that have not been expanded for
   more than a year.

2) Make it a Beam pipeline so expansions can happen in parallel. The
   Recurrings are grouped into batches in order to not overwhelm the
   database with writes for each expansion.

3) Create new expansions when the event time, as opposed to billing
   time, is within the operation window. This streamlines the logic and
   makes it clearer and easier to reason about. This also aligns with
   how other (cancelllable) operations for which there are accompanying
   grace periods are handled, when the corresponding data is always
   speculatively created at event time. Lastly, doing this negates the
   need to check if the expansion has finished running before generating
   the monthly invoices, because the billing events are now created not
   just-in-time, but 45 days in advance.

Note that this PR only adds the pipeline. It does not switch the default
behavior to using the pipeline, which is still done by
ExpandRecurringBillingEventsAction. We will first use this pipeline to
generate missing billing events and domain histories caused by
b/258822640. This also allows us to test it in production, as it
backfills data that will not affect ongoing invoice generation. If
anything goes wrong, we can always delete the generated billing events
and domain histories, based on the unique "reason" in them.

This pipeline can only run after we switch to use SQL sequence based ID
allocation, introduced in #1831.
2023-01-09 17:41:56 -05:00
gbrodman
f62732547f Delete DatastoreTM and most other references to Datastore (#1681)
This includes:
- deletion of helper DB methods in tests
- deletion of various old Datastore-only classes and removal of any
  endpoints
- removal of the dual-database test concept
- removal of 'ofy' from the AppEngineExtension
2022-07-01 13:33:38 -04:00
gbrodman
471205ad77 Delete code relating to SQL init and scheduling (#1661)
One of the more significant changes introduced in this PR is that we use
SQL as the backing database in all tests unless otherwise specified,
e.g. by using the TmOverrideExtension. We change various ofy-related
tests to use this.

This includes various changes:
- Deletion of SqlEntity/DatastoreEntity and related classes. Includes
  any necessary changes because of that (e.g. getting a nice SQL key on
  error in RegistryJpaIO).
- Deletion of classes that used libraries from the init-sql code
  (RefreshDnsOnHostRenameAction)
- Removal of the JpaTransactionManager's backup implementation
- Modification of RegistryJpaWriteTest to not use init-sql code
- Removal of the Transaction class and related classes, however it does
  not remove the TransactionEntity class as that would require DB
  changes
- Removal of anything related to the actual usage of the database
  migration schedule or read-only phases
- Various test changes and fixes to account for the differences in SQL
  (like how foreign keys need to exist)

This deliberately doesn't do anything to alter the objects actually
stored in the DB yet, just how we use them
2022-06-13 15:10:35 -04:00
gbrodman
2879f3dac5 Remove functional SQL<->DS replay code (#1659)
This includes:
- removing the actions that do the replay
- removing the tests for the replay
- removing the ReplayExtension and adjusting the various tests that used
  it appropriately
- removing functionality relating to "things that happen during replay",
  e.g. beforeSqlSaveOnReplay

This does not include:
- removing the InitSqlPipeline or similar tasks
- removing e.g. SqlEntity (it's used in other places)
- removing Transforms/RegistryJpaIO and other SQL-pipeline-creation code
2022-06-09 07:44:01 -04:00
Weimin Yu
d03cd5bb76 Verify schema using Cloud Build (#1627)
* Add tool to compare  golden and actual schema
2022-05-16 16:10:09 -04:00
Weimin Yu
61c50d811a Tag nomulus-tool in schema deployment script (#1621)
* Tag nomulus-tool in schema deployment script
2022-05-05 12:46:32 -04:00
Lai Jiang
5352c06c7b Do not delete build cache when building release candidates (#1619)
We would like to re-use the build cache when building RCs for different
environments. There's not much practical use in doing a "clean" for
every build when Gradle should be able to figure out which artifacts
need to be rebuilt. It also does not make sense to build each
environment in a separate step, which also introduces redunency because
not all artifacts are cached across steps. The build cache is enabled by
default.

Lastly, the cache needs to be inside the /workspace folder, which is the
default persisted storage location.

TESTED=tried to build the RCs on alpha and saved about 10 min.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1619)
<!-- Reviewable:end -->
2022-05-04 12:08:10 -04:00
Lai Jiang
e1c0725fb1 Increase Nomulus build timeout (#1613)
We have recently started to routinely breach the 1h timeout. Increasing
this value to 2h. We should also look into reusing the artifacts when
building RCs for different environments.
2022-05-02 16:11:11 -04:00
gbrodman
9939833c25 Create a Dataflow pipeline to resave EPP resources (#1553)
* Create a Dataflow pipeline to resave EPP resources

This has two modes.

If `fast` is false, then we will just load all EPP resources, project them to the current time, and save them.

If `fast` is true, we will attempt to intelligently load and save only resources that we expect to have changes applied when we project them to the current time. This means resources with pending transfers that have expired, domains with expired grace periods, and non-deleted domains that have expired (we expect that they autorenewed).
2022-04-15 15:46:35 -04:00
Weimin Yu
e47be4fa2c Add a tools command to launch SQL validation job (#1526)
* Add a tools command to launch SQL validation job

Stopping using Pipeline.run().waitUntilFinish in
ValidateDatastorePipeline. Flex-templalate does not support blocking
wait in the main thread.

This PR adds a new ValidateSqlCommand that launches the pipeline and
maintains the SQL snapshot while the pipeline is running.

This PR also added more parameters to both ValidateSqlCommand and
ValidateDatastoreCommand:
- The -c option to supply an optional incremental comparison start time
- The -r option to supply an optional release tag that is not 'live',
  e.g., nomulus-DDDDYYMM-RC00

If the manual launch option (-m) is enabled, the commands will print the
gcloud command that can launch the pipeline.

Tested with sandbox, qa and the dev project.
2022-02-28 13:14:57 -05:00
Weimin Yu
ac2c08b6e1 Release ValidateSqlPipeline as container image (#1504)
* Release ValidateSqlPipeline as container image
2022-01-28 14:57:31 -05:00
Weimin Yu
62b2c18791 Release ValidateDatastorePipeline (#1501)
* Release ValidateDatastorePipeline
2022-01-26 13:38:19 -05:00
Lai Jiang
6a44565acd Fix beam deployment script again. (#1369)
uberjar task and uberjar name are now different (beamPipelineCommon and
beam_pipeline_common, respectively). This is more idiomatic with regard
to naming conventions but we need to take two different variables now.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1369)
<!-- Reviewable:end -->
2021-10-04 14:23:28 -04:00
Lai Jiang
4903465f26 Change Beam uber jar name in Nomulu release GCB config (#1367)
The uber jar name was changed in #1351.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1367)
<!-- Reviewable:end -->
2021-10-04 10:47:27 -04:00
Lai Jiang
b068e459c2 Add a Beam pipeline to generate RDE deposit (part 1) (#1219)
This is the first part of the RdeStagingAction SQL migration where the
mapper logic is implemented in Beam.

A few helper methods are added to convert the DomainContent, HostBase
and ContactBase to their respective terminal child classes. This is
necessary and possible because the child classes do not have extra
fields and the base classes exist only to be embedded to other entities
(such as the various HistoryEntry entities). The conversion is necessary
because most of our code expects the terminal classes, such as the
RdeMarshaller's various marshallXXX() methods. The alternative would be
to change all the call sites, which seems to be much more disruptive.

Unfortunately there is is no good way to do this conversion than just
creating a builder and setting every fields there is.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1219)
<!-- Reviewable:end -->
2021-06-30 13:54:24 -04:00
Lai Jiang
8034193b88 Upload the GCB delete job yaml file to GCS (#1135)
<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1135)
<!-- Reviewable:end -->
2021-05-05 21:43:51 -04:00
Lai Jiang
4c0f221e8c Re-enable tests in RC build (#1130)
There has been a case where the CI was broken on Friday and no one
noticied or fixed it and a RC build was built with broken tests.
The tests were disabled due to unknown test failures that have since
been fixed.

Also update the machine type used by GCB to be more powerful. This is
necessary for the tests to past because N1_HIGHCPU_8 is RAM constraint
and the tests crashes. I updated all jobs to use the new type which
hopefully will make the build faster as well.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1130)
<!-- Reviewable:end -->
2021-05-05 13:53:21 -04:00
Lai Jiang
514f2fbc5c Fix build (#1109) 2021-04-26 10:34:29 -04:00
Lai Jiang
feab3633a6 Migrate the billing pipeline to flex template (#1100)
This is similar to the migration of the spec11 pipeline in #1073. Also removed
a few Dagger providers that are no longer needed.

TESTED=tested the dataflow job on alpha.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1100)
<!-- Reviewable:end -->
2021-04-22 10:26:15 -04:00
Lai Jiang
8aea804841 Migrate Spec11 pipeline to flex template (#1073)
* Migrate Spec11 pipeline to flex template

Unfortunately this PR has turned out to be much bigger than I initially
conceived. However this is no good way to separate it out because the
changes are intertwined. This PR includes 3 main changes:

1. Change the spec11 pipline to use Dataflow Flex Template.
2. Retire the use of the old JPA layer that relies on credential saved
   in KMS.
3. Some extensive refactoring to streamline the logic and improve test
   isolation.

* Fix job name and remove projectId from options

* Add parameter logs

* Set RegistryEnvironment

* Remove logging and modify safe browsing API key regex

* Rename a test method and rebase

* Remove unused Junit extension

* Specify job region
2021-04-21 00:09:50 -04:00
Weimin Yu
7c3d0dd1a9 Use shared jar to stage BEAM pipeline if possible (#1008)
* Use shared jar to stage BEAM pipeline if possible

Allow multiple BEAM pipelines with the same classes and dependencies to
share one Uber jar.

Added metadata for BulkDeleteDatastorePipeline.

Updated shell and Cloud Build scripts to stage all pipelines in one
step.
2021-03-16 13:19:30 -04:00
Weimin Yu
39816bf7cd Stage the init_sql_pipeline in CloudBuild (#1004)
* Stage the init_sql_pipeline in CloudBuild

Defined metadata file and added Gradle uberJar task for the pipeline,
which are needed for staging.

Updated cloud build script to stage this pipeline during the build
processs.
2021-03-12 10:36:57 -05:00
Lai Jiang
18b0d074c8 Disable tests in RC builds (#752)
For reasons unclear at the moment the tests are not passing. Disabling
them for now so that release candidates can be built. We have CI runs
after each merge so we should be pretty confident if the build is broken
or not.
2020-08-07 17:51:34 -04:00
Weimin Yu
20ad27cfdd Use JSON API for Maven Repo on GCS (#483)
* Use JSON API for Maven Repo on GCS

The url pattern https://storage.googleapis.com/{Bucket}/{Path}
uses the legacy XML API, which seems to be less robust than
the JSON API. We have observed connection resets after a few
thousand-file download bursts over 30 minutes.

This PR changes all urls to registry's Maven repo on GCS to
gcs://{Bucket}/{Path}. Gradle uses the JSON API for such urls.

TESTED=In Cloud Build with local change
2020-02-12 14:03:50 -05:00
Weimin Yu
7acf136218 Use dependency cache in all Gradle tasks in GCB (#481)
* Use dependency cache in all Gradle tasks in GCB

Make the initial test and the final publishing steps use the shared
dependency cache.

Also make the initial test use the registry's own maven repo instead
of Maven Central.
2020-02-11 14:50:22 -05:00
Weimin Yu
36cfd31b80 Upload Cloud Build schema-deploy config to GCS (#435)
* Upload Cloud Build schema-deploy config to GCS

Forgot to upload cloudbuild-schema-deploy.yaml to GCS.
2020-01-09 15:04:10 -05:00
Weimin Yu
22004a4ee4 Run cross-release SQL integration tests (#403)
* Run cross-release SQL integration tests

Run SQL integration tests across arbitrary schema and server
releases.

Refer to integration/README.md in this change for more information.

TESTED=Cloud build changes tested with cloud-build-local
       Used the published jars to test sqlIntegration task locally.
2019-12-12 13:47:49 -05:00
Weimin Yu
3d2c68b350 Stop publish Cloud SQL schema jar to maven repo (#383)
* Stop publish Cloud SQL schema jar to maven repo

The original purpose of the maven publication is for
use in server/schema compatibility tests. A commandline
flag can direct a test run to use different versions of
the schema jar. However, this won't work due to dependency
locking.
2019-11-25 18:23:02 -05:00
Weimin Yu
bd53fb3bc0 Release SQL schema in Cloud Build (#341)
* Release SQL schema in Cloud Build

Tentatively release SQL schema at the same time as the server release.
Publish schema jar to gs://domain-registry-maven-repository/nomulus
and also upload it with server artifacts.

Also removed the Gradle 'version' variable which is not used.

Tested=On cloud-build with a simplified version of
cloudbuild-nomulus.yaml.
2019-11-04 10:22:05 -05:00
Weimin Yu
3638fb1cec Save current deployment tag for every environment (#332)
* Save release tag during deployment

* Save current tag for every environment

Store tag of the current deployment in each environment.
This is used by the server-sql compatibility test.

* Save current tag for every environment

Store tag of the current deployment in each environment.
This is used by the server-sql compatibility test.
2019-10-30 13:58:56 -04:00
Lai Jiang
f080259e5e Merge beam and GAE configs deployment to one GCB job (#182)
* Merge beam and GAE configs deployment to one GCB job

Deployment of GAE configs requires that the credential used by gcloud to
have GAE admin role of the project to be managed. We do not want to
grant the GCB service account that role, because it would all *any* GCB
job to deploy anything to GAE. Instead we use a dedicated credential
originally created to deploy beam pipelines. This credential is
encrypted by KMS and stored on GCS. Since the beam pipeline deployment
GCB job already does the decryption, it make sense to add the config
deployment step there as well. The beam deployment steps are tweaked to
use the nomulus tool docker image instead of the jar file.

Also moved the content of deploy_configs_to_env.sh to the GCB yaml file
itself because the shell script is not uploaded to GC Bat the same time
as the yaml file when the job is triggered by Spinnaker.

Lastly, due to b/137891685, using GCB to deploy cron jobs does not work
as we cannot use service account credential to deploy to projects under
google.com.
2019-07-19 16:54:56 -04:00
gbrodman
1abfd169f0 Add a Cloud Build task to update YAML configs (#177)
* Add a Cloud Build task to update YAML configs

* CR responses

* Move config deployment to a script

* Pin builder version

* Create different beam and deploy-config files per environment

* Update comments and make a for loop
2019-07-18 12:15:15 -04:00
Michael Muller
ba8d67ed30 Build docker image of nomulus tool (#142)
* Build docker image of nomulus tool

In the course of "gradle build", build a docker image of nomulus tool so that
users can run this to allow us to bundle the java version with the image.
2019-07-16 20:18:44 -04:00
Lai Jiang
f20fd64537 Update GCB beam deployment pipeline (#134)
* Update GCB beam deployment pipeline

Some of the texts are not really secerts because they are per-project.
Also changed the location of the credential file to `secerts` so that in
the future we may add more secerts in that folder.

The encrypted file is base64 encoded, consistent with how the proxy
certificates are encoded. Also made some changes to the other pipelines
to facilitate automation with Spinnaker
2019-06-24 14:36:56 -04:00
Lai Jiang
96f7217ed2 Always clone the internal repo to nomulus-internal
Also updated .gcloudignore to not pull in unnecessary files when running
`gcloud builds submit`.
2019-06-20 14:26:36 -04:00
Lai Jiang
ad20178f18 Fix builds after refactor (#99)
Fixed both GAE and proxy builds after #90 refactored the code structure.

Also removed now unnecessary chmod and chown from GCB scripts.
2019-06-13 18:01:30 -04:00
Gus Brodman
38cfc9f693 Refactor to be more in line with a standard Gradle project structure 2019-06-13 09:41:11 -04:00
jianglai
b664102048 Add GCB workflows to promote the nomulus tool command after deployment
With https://github.com/spinnaker/spinnaker/issues/4048 Spinnaker now natively supports GCB. We are able to start a GCB job from Spinnaker, and also there is better support to consume GCB pub/sub messages. Some changes are made to remove the workaround no longer needed.

Two new workflows are added, one to rsync a GCS folder to live/ after the deployment is done (so that the nomulus.jar file can then be fetched to x20 by a []cron job), and the other to tag the proxy image as live once it is deployed.

Lastly, the docs/ folders are needed when running tests. Remove it from .gcloudignore so that when a test run is kicked off by running "gcloud builds submit" the folder is sent to GCB. Ideally .gcloudignore should be identical to .gitignore but since they both are version controlled it is hard it make one a symlink of another.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=252625901
2019-06-12 13:08:11 -04:00
jianglai
538c659609 Upload the tool jar to GCS
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=250496565
2019-05-30 12:52:21 -04:00
jianglai
4dc21e076f Remove failing tests from Nomulus GCB pipeline
Also upgrade to Gradle 5.4.1.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=246829399
2019-05-06 16:58:15 -04:00
jianglai
94aa5cd6ec Update Nomulus release pipeline
Refactor out the build and package logic to a reusable script. Also removed the gradle task flag to skip lint check, as failing lint check is no longer a fatal error.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=245296563
2019-04-27 00:01:06 -04:00
jianglai
926e68e806 Update proxy deployment pipeline
The pipeline is broken into two. The first one is to be triggered when the public repo is tagged. It then tags the private repo, builds and upload the builder and base images, and push a new commit to the release (merged repo). This pipeline also does text manipulation on several files in the release repo to ensure that the images uploaded in this pipeline is always used to reproducibly build the release repo at the same commit.

The second pipeline is then triggered by commit into the release repo, which builds, signs and uploads the proxy image.

Also updated the dependency lock files to use the latest plugins dependencies, which are uploaded to the GCS repo.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=244666211
2019-04-22 13:02:39 -04:00
Renamed from cloudbuild-nomulus.yaml (Browse further)