Commit graph

78 commits

Author SHA1 Message Date
Lai Jiang
f4436b54cf
Do not delete build cache when building release candidates (#1619)
We would like to re-use the build cache when building RCs for different
environments. There's not much practical use in doing a "clean" for
every build when Gradle should be able to figure out which artifacts
need to be rebuilt. It also does not make sense to build each
environment in a separate step, which also introduces redunency because
not all artifacts are cached across steps. The build cache is enabled by
default.

Lastly, the cache needs to be inside the /workspace folder, which is the
default persisted storage location.

TESTED=tried to build the RCs on alpha and saved about 10 min.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1619)
<!-- Reviewable:end -->
2022-05-04 12:08:10 -04:00
Lai Jiang
4ec8b71f42
Increase Nomulus build timeout (#1613)
We have recently started to routinely breach the 1h timeout. Increasing
this value to 2h. We should also look into reusing the artifacts when
building RCs for different environments.
2022-05-02 16:11:11 -04:00
Michael Muller
7716eebfff
Change check for root directory during rollback (#1602)
* Change check for root directory during rollback

`rollback_tool` tries to infer the root of the nomulus tree by checking for a
directory named "nomulus".  This is potentially problematic (and, indeed, was
for me) since there is no guarantee what that directory will be named.

There are a number of features that characterize the root directory.  Check
for the presence of the `rollback_tool` wrapper script, as this is both at
root level and tightly coupled to the python code, so hopefully we won't
move it without testing that the script still works.
2022-04-25 12:39:16 -04:00
gbrodman
073d0a416a
Create a Dataflow pipeline to resave EPP resources (#1553)
* Create a Dataflow pipeline to resave EPP resources

This has two modes.

If `fast` is false, then we will just load all EPP resources, project them to the current time, and save them.

If `fast` is true, we will attempt to intelligently load and save only resources that we expect to have changes applied when we project them to the current time. This means resources with pending transfers that have expired, domains with expired grace periods, and non-deleted domains that have expired (we expect that they autorenewed).
2022-04-15 15:46:35 -04:00
Weimin Yu
65c2570b8f
Remove dos.xml from the configs (#1587)
* Remove dos.xml from the configs

We don't have dos config right now, and applying dos from "gcloud app
deploy" is deprecated and has started causing problems.

If we add dos configs, it should be using "gcloud app firewall-rules".
2022-04-11 15:22:42 -04:00
Weimin Yu
4f33de10f3
Add a tools command to launch SQL validation job (#1526)
* Add a tools command to launch SQL validation job

Stopping using Pipeline.run().waitUntilFinish in
ValidateDatastorePipeline. Flex-templalate does not support blocking
wait in the main thread.

This PR adds a new ValidateSqlCommand that launches the pipeline and
maintains the SQL snapshot while the pipeline is running.

This PR also added more parameters to both ValidateSqlCommand and
ValidateDatastoreCommand:
- The -c option to supply an optional incremental comparison start time
- The -r option to supply an optional release tag that is not 'live',
  e.g., nomulus-DDDDYYMM-RC00

If the manual launch option (-m) is enabled, the commands will print the
gcloud command that can launch the pipeline.

Tested with sandbox, qa and the dev project.
2022-02-28 13:14:57 -05:00
Lai Jiang
fa9b784c5c
Correctly delete all stopped versions except for the most recent 3 (#1511)
The gcloud command does some weird stuff with sorting when custom format
is used. Here we instead rely on linux sort and head command to sort the
versions list.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1511)
<!-- Reviewable:end -->
2022-02-03 16:04:58 -05:00
Weimin Yu
1253fa479a
Release ValidateSqlPipeline as container image (#1504)
* Release ValidateSqlPipeline as container image
2022-01-28 14:57:31 -05:00
Weimin Yu
5f0dd24906
Release ValidateDatastorePipeline (#1501)
* Release ValidateDatastorePipeline
2022-01-26 13:38:19 -05:00
Lai Jiang
ceade7f954
Use the service account credential to delete unused versions (#1484) 2022-01-07 11:06:19 -05:00
Lai Jiang
f6920454f6
Fix the beam staging script, take 3 (#1370)
The number of arguments changed in https://github.com/google/nomulus/pull/1369, so the check needs to change as well.

<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1370)
<!-- Reviewable:end -->
2021-10-04 16:44:32 -04:00
Lai Jiang
9103216a46
Fix beam deployment script again. (#1369)
uberjar task and uberjar name are now different (beamPipelineCommon and
beam_pipeline_common, respectively). This is more idiomatic with regard
to naming conventions but we need to take two different variables now.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1369)
<!-- Reviewable:end -->
2021-10-04 14:23:28 -04:00
Lai Jiang
737f65bd33
Change Beam uber jar name in Nomulu release GCB config (#1367)
The uber jar name was changed in #1351.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1367)
<!-- Reviewable:end -->
2021-10-04 10:47:27 -04:00
Michael Muller
420a0b8b9a
Use debian10 image for builder, not ubuntu1804 (#1345)
The debian10 image is generally a bit more recent and, in particular, includes
python 3.7.3, which we're currently using as a baseline for our builds.
2021-09-28 14:49:13 -04:00
Lai Jiang
047444831b
Add a Beam pipeline to generate RDE deposit (part 1) (#1219)
This is the first part of the RdeStagingAction SQL migration where the
mapper logic is implemented in Beam.

A few helper methods are added to convert the DomainContent, HostBase
and ContactBase to their respective terminal child classes. This is
necessary and possible because the child classes do not have extra
fields and the base classes exist only to be embedded to other entities
(such as the various HistoryEntry entities). The conversion is necessary
because most of our code expects the terminal classes, such as the
RdeMarshaller's various marshallXXX() methods. The alternative would be
to change all the call sites, which seems to be much more disruptive.

Unfortunately there is is no good way to do this conversion than just
creating a builder and setting every fields there is.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1219)
<!-- Reviewable:end -->
2021-06-30 13:54:24 -04:00
Lai Jiang
ce03556683
Fix a GCB job description (#1215)
<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1215)
<!-- Reviewable:end -->
2021-06-22 13:51:26 -04:00
Weimin Yu
66867e4397
Use SecretManager for nomulus-tool-cloudbuild cred (#1188)
* Use SecretManager for nomulus-tool-cloudbuild cred

Store cloudbuild's nomulus-tool credential in SecretManager and make the
deployment pipeline load it from the SecretManager.

The tool-credential.json.enc file in the
gs://domain-registry-dev-deploy/secrets folder is no longer needed.
2021-06-02 09:32:57 -04:00
Lai Jiang
5120397607
Upload the GCB delete job yaml file to GCS (#1135)
<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1135)
<!-- Reviewable:end -->
2021-05-05 21:43:51 -04:00
Lai Jiang
3f6ec8f1b0
Re-enable tests in RC build (#1130)
There has been a case where the CI was broken on Friday and no one
noticied or fixed it and a RC build was built with broken tests.
The tests were disabled due to unknown test failures that have since
been fixed.

Also update the machine type used by GCB to be more powerful. This is
necessary for the tests to past because N1_HIGHCPU_8 is RAM constraint
and the tests crashes. I updated all jobs to use the new type which
hopefully will make the build faster as well.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1130)
<!-- Reviewable:end -->
2021-05-05 13:53:21 -04:00
Lai Jiang
e63085fb6a
Add a GCB job to delete stopped GAE versions (#1128) 2021-05-05 11:27:46 -04:00
Lai Jiang
8884425a05
Fix build (#1109) 2021-04-26 10:34:29 -04:00
Lai Jiang
217b37b9d5
Migrate the billing pipeline to flex template (#1100)
This is similar to the migration of the spec11 pipeline in #1073. Also removed
a few Dagger providers that are no longer needed.

TESTED=tested the dataflow job on alpha.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1100)
<!-- Reviewable:end -->
2021-04-22 10:26:15 -04:00
Lai Jiang
464f9aed1f
Migrate Spec11 pipeline to flex template (#1073)
* Migrate Spec11 pipeline to flex template

Unfortunately this PR has turned out to be much bigger than I initially
conceived. However this is no good way to separate it out because the
changes are intertwined. This PR includes 3 main changes:

1. Change the spec11 pipline to use Dataflow Flex Template.
2. Retire the use of the old JPA layer that relies on credential saved
   in KMS.
3. Some extensive refactoring to streamline the logic and improve test
   isolation.

* Fix job name and remove projectId from options

* Add parameter logs

* Set RegistryEnvironment

* Remove logging and modify safe browsing API key regex

* Rename a test method and rebase

* Remove unused Junit extension

* Specify job region
2021-04-21 00:09:50 -04:00
Weimin Yu
4d04e4fd15
Add -r when rsync a release to the live folder (#1063)
* Add -r when rsync a release to the live folder

Release folders now are no longer flat. Each of them has a 'beam'
subfolder with pipeline metadata files.
2021-04-07 10:07:00 -04:00
Weimin Yu
9dd08c48bc
Use credential in secretmanager to deploy schema (#1055)
* Use credential in secretmanager to deploy schema

Fetch the schema_deployer credential from SecretManager when deploying
the schema to Cloud SQL.
2021-04-06 09:43:15 -04:00
Weimin Yu
ccfa145ab7
Allow nom_build to run in Cloudbuild (#1021)
* Allow nom_build to run in Cloudbuild

Our builder comes with python3.6 and cannot support nom_build out of
box. Nom_build requires dataclasses which is introduced in v3.7.

I haven't found an easy way to get python3.7+ without changing the base
linux image. This PR explicitly installs dataclasses.
2021-03-19 11:28:18 -04:00
Weimin Yu
eb2e1c60ca
Use shared jar to stage BEAM pipeline if possible (#1008)
* Use shared jar to stage BEAM pipeline if possible

Allow multiple BEAM pipelines with the same classes and dependencies to
share one Uber jar.

Added metadata for BulkDeleteDatastorePipeline.

Updated shell and Cloud Build scripts to stage all pipelines in one
step.
2021-03-16 13:19:30 -04:00
Weimin Yu
1bbc38c65e
Stage the init_sql_pipeline in CloudBuild (#1004)
* Stage the init_sql_pipeline in CloudBuild

Defined metadata file and added Gradle uberJar task for the pipeline,
which are needed for staging.

Updated cloud build script to stage this pipeline during the build
processs.
2021-03-12 10:36:57 -05:00
Weimin Yu
94ef81dca4
Script to rolling-start Nomulus (#888)
* Script to rolling-start Nomulus

Add a script to restart Nomulus non-disruptively. This can be used after
a configuration change to external resources (e.g.,  Cloud SQL
credential) to make Nomulus pick up the latest config.

Also added proper support to paging based List api methods, replacing the
current hack that forces the server to return everything in one response.
The List method for instances has a lower limit on page size than others
which is not sufficient for our project.
2020-12-01 10:14:05 -05:00
Weimin Yu
83918e92b5
Sync the live folder after Nomulus rollback (#854)
* Sync the live folder after Nomulus rollback

To update the nomulus tool on corp desktop, the artifacts from the
rollback target release should be copied to the 'live' folder.

* Fix a test
2020-10-29 16:21:56 -04:00
Weimin Yu
db2e896d42
An automated rollback tool for Nomulus (#847)
* An automated rollback tool for Nomulus

A tool that directs traffic between deployed versions. It handles the
conversion between Nomulus tags and AppEngine versions, executes schema
compatibility tests, ensures that steps are executed in the correct order,
and updates deployment records appropriately.
2020-10-29 10:37:20 -04:00
Shicong Huang
3705f37fab
Add a build task to upload ER diagrams to GCS (#844)
* Add a build task to upload ER diagrams to GCS

* Merge ER diagram task into cloudbuild-javadoc
2020-10-27 10:41:12 -04:00
Weimin Yu
13d30b0bfb
Maintain a release-to-Version map in deployment (#831)
* Maintain a release-to-Version map in deployment

Keep track of the mapping between Nomulus release tags and AppEngine
version ids with a mapping file. This is necessary because AppEngine
does not support custom versioning. With this mapping, rollbacks could
be automated. Automation of rollbacks is important since there are
test-supporting metadata to be updated, but are easily forgotten.

During the last stage of deployment, current per-service version ids
are fetched using gcloud and are appended to a file on GCS. Each line
is of the format "{RELEASE_TAG},{APPENGINE_SERVICE},{APPENGINE_VERSION}.

This change has been tested in crash. The rollback script is still a
work in progress.
2020-10-09 13:32:52 -04:00
Lai Jiang
98db79d3d1
Re-enable invoicing pipeline deployment (#764)
Now that beam deployment is compatible with Java 11. Re-enable this
step.
2020-08-11 17:26:17 -04:00
Lai Jiang
7b2f7c08e4
Comment out invoicing pipeline deployment temporarily (#759)
Currently it doesn't work with Java > 8. Fix inflight. Disable it to
unblock deployment.
2020-08-10 15:11:34 -04:00
Lai Jiang
ab4cecba22
Temporarily disable spec 11 pipeline deployment in GCB (#755)
The current setup causes the GCB job to fail validation and not run because it
uses backticks in the configuration yaml, which is not allowed -- there is no
shell to perform backtick substitution. See the error message here:

https://spinnaker.endpoints.domain-registry-dev.cloud.goog/gate/pipelines/01EF5GRMD625613H6Z033DBD3Z

In the future please make sure to test the GCB pipeline as instructed in
the comments at the beginning of each file before committing.

I tried to work around it by downloading the nomulus tool jar file
instead (running the nomulus-tool docker image inside a docker image is
not advisable). However the "nomulus deploy_spec11_pipeline" command
still fails. I'm not sure why. Has the command itself been tested
locally? The error message is shown below:

```
Step #2: Aug 09, 2020 3:11:46 AM org.apache.beam.runners.dataflow.DataflowRunner fromOptions
Step #2: WARNING: --region not set; will default to us-central1. Future releases of Beam will require the user to set the region explicitly. https://cloud.google.com/compute/docs/regions-zones/regions-zones
Step #2: Aug 09, 2020 3:11:46 AM org.apache.beam.sdk.extensions.gcp.options.GcpOptions$GcpTempLocationFactory tryCreateDefaultBucket
Step #2: INFO: No tempLocation specified, attempting to use default bucket: dataflow-staging-us-central1-937378958468
Step #2: Aug 09, 2020 3:11:47 AM org.apache.beam.sdk.extensions.gcp.util.RetryHttpRequestInitializer$LoggingHttpBackOffHandler handleResponse
Step #2: WARNING: Request failed with code 409, performed 0 retries due to IOExceptions, performed 0 retries due to unsuccessful status codes, HTTP framework says request can be retried, (caller responsible for retrying): https://www.googleapis.com/storage/v1/b?predefinedAcl=projectPrivate&predefinedDefaultObjectAcl=projectPrivate&project=domain-registry-alpha.
Step #2: Exception in thread "main"
Step #2: java.lang.RuntimeException: Failed to construct instance from factory method DataflowRunner#fromOptions(interface org.apache.beam.sdk.options.PipelineOptions)
Step #2:        at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:224)
Step #2:        at org.apache.beam.sdk.util.InstanceBuilder.build(InstanceBuilder.java:155)
Step #2:
Step #2:        at org.apache.beam.sdk.PipelineRunner.fromOptions(PipelineRunner.java:55)
Step #2:        at org.apache.beam.sdk.Pipeline.create(Pipeline.java:147)
Step #2:
Step #2:        at google.registry.beam.spec11.Spec11Pipeline.deploy(Spec11Pipeline.java:157)
Step #2:        at google.registry.tools.DeploySpec11PipelineCommand.run(DeploySpec11PipelineCommand.java:80)
Step #2:        at google.registry.tools.RegistryCli.runCommand(RegistryCli.java:257)
Step #2:        at google.registry.tools.RegistryCli.run(RegistryCli.java:182)
Step #2:        at google.registry.tools.RegistryTool.main(RegistryTool.java:129)
Step #2: Caused by: java.lang.reflect.InvocationTargetException
Step #2:        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Step #2:
Step #2:        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
Step #2:        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Step #2:        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
Step #2:        at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:214)
Step #2:        ... 8 more
Step #2: Caused by: java.lang.IllegalArgumentException: Unable to use ClassLoader to detect classpath elements. Current ClassLoader is jdk.internal.loader.ClassLoaders$AppClassLoader@5cb0d902, only URLClassLoaders are supported.
Step #2:        at org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage(PipelineResources.java:58)
Step #2:
Step #2:        at org.apache.beam.runners.dataflow.DataflowRunner.fromOptions(DataflowRunner.java:285)
Step #2:
Step #2:        ... 13 more
```

Lastly the "--project" flag refers to the KMS project. While I'm not
sure which project is that, I don't think we can use the PROJECT_ID
variable as this is a GCB-substituted variable which refers to the
project that the GCB job runs in, which in our cases means
domain-registry-dev. We shouldn't use that project for KMS. I've changed
it to the same project as the one we are deploying to, but please note
that we have a separate project ${project_id}-keys that is used for all
KMS purposes. This is specified in the config file so if that's what you
meant to use, there is no need to specify it in the command line. Actually
if you meant use the project to be deployed to for KMS, it also
shouldn't be necessary to specify it separately as this information is
already known when you specified "nomulus -e ENV".

https://team.git.corp.google.com/domain-registry-eng/nomulus-internal/+/refs/heads/master/core/src/main/java/google/registry/config/files/nomulus-config-production.yaml#168

Can you add more description on what the KMS project is supposed to be?
I don't think we specify a project for KMS purpose in any other
commands.

Given that there are several unresolved issues, I've commented out my
proposed solution so that deployment can proceed.
2020-08-09 22:41:31 -04:00
Lai Jiang
00941ddc3d
Disable tests in RC builds (#752)
For reasons unclear at the moment the tests are not passing. Disabling
them for now so that release candidates can be built. We have CI runs
after each merge so we should be pretty confident if the build is broken
or not.
2020-08-07 17:51:34 -04:00
Lai Jiang
95f4ae0e3a
Use nodesource to install node (#742)
The node installed by nvm gives errors when running "npm install".

Also installs Python as it is need. Presumbly the system provided npm
version has python as a dependency so it was installed when npm was
installed.
2020-08-05 14:56:40 -04:00
Lai Jiang
6d110c77ac
Use the latest version of node in the builder image (#741)
The default node version from the base image (Ubuntu 18.04) is too older
and karma is not happy about it.
2020-08-03 17:40:50 -04:00
Legina Chen
a79940e822
Persist ThreatMatches into Spec11ThreatMatch (#723)
* Replace jpaTm with a JpaSupplierFactory

* Style

* Style

* Pipeline takes in a SerializableSupplier instead

* Change the ordering of imports

* Test a good domain in addition to a bad one

* Rename and check good domain for Transact Answer

* Use standard Mockito verify

* Verify transact call and no more interactions

* Remove Answer comment

* Naming chsnges

* Deploy Spec 11 pipeline correctly

* Fix formatting of deploy file

* Use a file to persist state across Cloud Build steps

Co-authored-by: Gus Brodman <gbrodman@google.com>
2020-08-03 14:40:00 -07:00
Lai Jiang
5f2be914a1
Use Java 11 in GCB to build release candidates (#736) 2020-08-03 13:13:08 -04:00
Lai Jiang
d27fe8ead5
Fix a typo (#610) 2020-06-05 15:53:17 -04:00
Lai Jiang
da65a38782
Add a GCB job to build and publish javadoc (#609) 2020-06-05 13:00:15 -04:00
Weimin Yu
62433c2238
Use JSON API for Maven Repo on GCS (#483)
* Use JSON API for Maven Repo on GCS

The url pattern https://storage.googleapis.com/{Bucket}/{Path}
uses the legacy XML API, which seems to be less robust than
the JSON API. We have observed connection resets after a few
thousand-file download bursts over 30 minutes.

This PR changes all urls to registry's Maven repo on GCS to
gcs://{Bucket}/{Path}. Gradle uses the JSON API for such urls.

TESTED=In Cloud Build with local change
2020-02-12 14:03:50 -05:00
Weimin Yu
f134c4bf37
Use dependency cache in all Gradle tasks in GCB (#481)
* Use dependency cache in all Gradle tasks in GCB

Make the initial test and the final publishing steps use the shared
dependency cache.

Also make the initial test use the registry's own maven repo instead
of Maven Central.
2020-02-11 14:50:22 -05:00
Weimin Yu
ce80278ab7
Make Gradle dependency cache shareable in GCB (#479)
* Make Gradle dependency cache shareable in GCB

Make Gradle put its caches in the source tree so that
they can be preserved across steps. When left at their
default location, caches are lost after each step.
2020-02-10 11:20:11 -05:00
Weimin Yu
ce2f98f680
Work around Spinnaker issue wrt variables (#465)
* Work around Spinnaker issue wrt variables

Cloud Build variable reference need to stay from the  ${var} pattern
to prevent Spinnaker from trying to resolve it. In all files that
are used by Spinnaker, we change variable reference to the $var form.

We made the minimum amount of change possible, and will review this
issue after the permanent solution is available.
2020-01-30 13:28:36 -05:00
Lai Jiang
cdcbe1311a
Make the builder script error out when a command fails (#457) 2020-01-23 21:41:17 -05:00
Weimin Yu
d4b1048363
Upload Cloud Build schema-deploy config to GCS (#435)
* Upload Cloud Build schema-deploy config to GCS

Forgot to upload cloudbuild-schema-deploy.yaml to GCS.
2020-01-09 15:04:10 -05:00
Lai Jiang
76603812e3
Create kzip file for Kythe cross-referencing (#424)
This is set up per b/141716384.

TESTED=Tested on alpha and successfully uploaded the merged kzip file to
GCS.
2020-01-03 16:57:39 -05:00