mirror of
https://github.com/google/nomulus.git
synced 2025-04-30 12:07:51 +02:00
Add more explanation to architecture document
This also renames the document to clarify its scope as being all of Google Cloud Platform, not just App Engine.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=169543846
parent e0f432aafb
commit c64e9fe788
4 changed files with 87 additions and 72 deletions
|
@ -1,12 +1,21 @@
|
||||||
# App Engine architecture
|
# Architecture
|
||||||
|
|
||||||
This document contains information on the overall architecture of Nomulus as
|
This document contains information on the overall architecture of Nomulus on
|
||||||
pertains to App Engine.
|
[Google Cloud Platform](https://cloud.google.com/). It covers the App Engine
|
||||||
|
architecture as well as other Cloud Platform services used by Nomulus.
|
||||||
|
|
||||||
## Services
|
## App Engine
|
||||||
|
|
||||||
Nomulus contains three
|
[Google App Engine](https://cloud.google.com/appengine/) is a cloud computing
|
||||||
[services](https://cloud.google.com/appengine/docs/python/an-overview-of-app-engine),
|
platform that runs web applications in the form of servlets. Nomulus consists of
|
||||||
|
Java servlets that process web requests. These servlets use other features
|
||||||
|
provided by App Engine, including task queues and cron jobs, as explained
|
||||||
|
below.
|
||||||
|
|
||||||
|
### Services
|
||||||
|
|
||||||
|
Nomulus contains three [App Engine
|
||||||
|
services](https://cloud.google.com/appengine/docs/python/an-overview-of-app-engine),
|
||||||
which were previously called modules in earlier versions of App Engine. The
|
which were previously called modules in earlier versions of App Engine. The
|
||||||
services are: default (also called front-end), backend, and tools. Each service
|
services are: default (also called front-end), backend, and tools. Each service
|
||||||
runs independently in a lot of ways, including that they can be upgraded
|
runs independently in a lot of ways, including that they can be upgraded
|
||||||
|
@ -25,7 +34,7 @@ The reason that the dot is escaped rather than forming subdomains is because the
|
||||||
SSL certificate for `appspot.com` is only valid for `*.appspot.com` (no double
|
SSL certificate for `appspot.com` is only valid for `*.appspot.com` (no double
|
||||||
wild-cards).
|
wild-cards).
|
||||||
|
|
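As a rough illustration of the `-dot-` naming described in this hunk, the hostname for a service could be derived as follows. This is a minimal sketch, not Nomulus code: the function name `service_hostname` is hypothetical, and it assumes the default service is served directly at `project-id.appspot.com`.

```python
def service_hostname(service: str, project_id: str) -> str:
    """Return the appspot.com hostname for an App Engine service.

    Hypothetical helper for illustration only. Non-default services use
    "-dot-" instead of a real dot so that the hostname stays a single
    label under appspot.com and is covered by the *.appspot.com
    certificate.
    """
    if service == "default":
        return f"{project_id}.appspot.com"
    return f"{service}-dot-{project_id}.appspot.com"


if __name__ == "__main__":
    for name in ("default", "backend", "tools"):
        print(service_hostname(name, "project-id"))
```

Every hostname produced this way has exactly one label in front of `appspot.com`, which is what a single-wildcard certificate requires.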
||||||
### Default service
|
#### Default service
|
||||||
|
|
||||||
The default service is responsible for all registrar-facing
|
The default service is responsible for all registrar-facing
|
||||||
[EPP](https://en.wikipedia.org/wiki/Extensible_Provisioning_Protocol) command
|
[EPP](https://en.wikipedia.org/wiki/Extensible_Provisioning_Protocol) command
|
||||||
|
@ -36,7 +45,7 @@ begin to impact users immediately. Requests to the default service are handled
|
||||||
by the `FrontendServlet`, which provides all of the endpoints exposed in
|
by the `FrontendServlet`, which provides all of the endpoints exposed in
|
||||||
`FrontendRequestComponent`.
|
`FrontendRequestComponent`.
|
||||||
|
|
||||||
### Backend service
|
#### Backend service
|
||||||
|
|
||||||
The backend service is responsible for executing all regularly scheduled
|
The backend service is responsible for executing all regularly scheduled
|
||||||
background tasks (using cron) as well as all asynchronous tasks. Requests to the
|
background tasks (using cron) as well as all asynchronous tasks. Requests to the
|
||||||
|
@ -57,7 +66,7 @@ sized to support not just the normal ongoing DNS load but also the load incurred
|
||||||
by MapReduces, both scheduled (such as RDE) and on-demand (asynchronous
|
by MapReduces, both scheduled (such as RDE) and on-demand (asynchronous
|
||||||
contact/host deletion).
|
contact/host deletion).
|
||||||
|
|
||||||
### Tools service
|
#### Tools service
|
||||||
|
|
||||||
The tools service is responsible for servicing requests from the `nomulus`
|
The tools service is responsible for servicing requests from the `nomulus`
|
||||||
command line tool, which provides administrative-level functionality for
|
command line tool, which provides administrative-level functionality for
|
||||||
|
@ -74,18 +83,19 @@ tool subcommands like `generate_zone_files` and by manually hitting URLs under
|
||||||
https://tools-dot-project-id.appspot.com, like
|
https://tools-dot-project-id.appspot.com, like
|
||||||
`/_dr/task/refreshDnsForAllDomains`.
|
`/_dr/task/refreshDnsForAllDomains`.
|
||||||
|
|
||||||
## Task queues
|
### Task queues
|
||||||
|
|
||||||
[Task queues](https://cloud.google.com/appengine/docs/java/taskqueue/) in App
|
App Engine [task
|
||||||
Engine provide an asynchronous way to enqueue tasks and then execute them on
|
queues](https://cloud.google.com/appengine/docs/java/taskqueue/) provide an
|
||||||
some kind of schedule. There are two types of queues, push queues and pull
|
asynchronous way to enqueue tasks and then execute them on some kind of
|
||||||
queues. Tasks in push queues are always executing up to some throttlable limit.
|
schedule. There are two types of queues, push queues and pull queues. Tasks in
|
||||||
Tasks in pull queues remain there until the queue is polled by code that is
|
push queues are always executing up to some throttlable limit. Tasks in pull
|
||||||
running for some other reason. Essentially, push queues run their own tasks
|
queues remain there until the queue is polled by code that is running for some
|
||||||
while pull queues just enqueue data that is used by something else.
|
other reason. Essentially, push queues run their own tasks while pull queues
|
||||||
Many other parts of App Engine are implemented using task queues. For example,
|
just enqueue data that is used by something else. Many other parts of App Engine
|
||||||
[App Engine cron](https://cloud.google.com/appengine/docs/java/config/cron) adds
|
are implemented using task queues. For example, [App Engine
|
||||||
tasks to push queues at regularly scheduled intervals, and the [MapReduce
|
cron](https://cloud.google.com/appengine/docs/java/config/cron) adds tasks to
|
||||||
|
push queues at regularly scheduled intervals, and the [MapReduce
|
||||||
framework](https://cloud.google.com/appengine/docs/java/dataprocessing/) adds
|
framework](https://cloud.google.com/appengine/docs/java/dataprocessing/) adds
|
||||||
tasks for each phase of the MapReduce algorithm.
|
tasks for each phase of the MapReduce algorithm.
|
||||||
|
|
||||||
|
@ -183,12 +193,60 @@ explicitly marked as otherwise.
|
||||||
spreadsheet. Tasks are enqueued by `RegistrarServlet` when changes are made
|
spreadsheet. Tasks are enqueued by `RegistrarServlet` when changes are made
|
||||||
to registrar fields and are executed by `SyncRegistrarsSheetAction`.
|
to registrar fields and are executed by `SyncRegistrarsSheetAction`.
|
||||||
|
|
||||||
|
### Cron jobs
|
||||||
|
|
||||||
|
Nomulus uses App Engine [cron
|
||||||
|
jobs](https://cloud.google.com/appengine/docs/java/config/cron) to run periodic
|
||||||
|
scheduled actions. These actions run as frequently as once per minute (in the
|
||||||
|
case of syncing DNS updates) or as infrequently as once per month (in the case
|
||||||
|
of RDE exports). Cron tasks are specified in `cron.xml` files, with one per
|
||||||
|
environment. There are more tasks that run in Production than in other
|
||||||
|
environments because tasks like uploading RDE dumps are only done for the live
|
||||||
|
system. Cron tasks execute on the `backend` service.
|
||||||
|
|
||||||
|
Most cron tasks use the `TldFanoutAction` which is accessed via the
|
||||||
|
`/_dr/cron/fanout` URL path. This action, which is run by the BackendServlet on
|
||||||
|
the backend service, fans out a given cron task for each TLD that exists in the
|
||||||
|
registry system, using the queue that is specified in the `cron.xml` entry.
|
||||||
|
Because some tasks may be computationally intensive and could risk spiking
|
||||||
|
system latency if all start executing immediately at the same time, there is a
|
||||||
|
`jitterSeconds` parameter that spreads out tasks over the given number of
|
||||||
|
seconds. This is used with DNS updates and commit log deletion.
|
||||||
|
|
||||||
|
The reason the `TldFanoutAction` exists is that a lot of tasks need to be done
|
||||||
|
separately for each TLD, such as RDE exports and NORDN uploads. It's simpler to
|
||||||
|
have a single cron entry that will create tasks for all TLDs than to have to
|
||||||
|
specify a separate cron task for each action for each TLD (though that is still
|
||||||
|
an option). Task queues also provide retry semantics in the event of transient
|
||||||
|
failures that a raw cron task does not. This is why there are some tasks that do
|
||||||
|
not fan out across TLDs that still use `TldFanoutAction` -- it's so that the
|
||||||
|
tasks retry in the face of transient errors.
|
||||||
|
|
||||||
|
The full list of URL parameters to `TldFanoutAction` that can be specified in
|
||||||
|
cron.xml is:
|
||||||
|
|
||||||
|
* `endpoint` -- The path of the action that should be executed (see
|
||||||
|
`web.xml`).
|
||||||
|
* `queue` -- The cron queue to enqueue tasks in.
|
||||||
|
* `forEachRealTld` -- Specifies that the task should be run in each TLD of
|
||||||
|
type `REAL`. This can be combined with `forEachTestTld`.
|
||||||
|
* `forEachTestTld` -- Specifies that the task should be run in each TLD of
|
||||||
|
type `TEST`. This can be combined with `forEachRealTld`.
|
||||||
|
* `runInEmpty` -- Specifies that the task should be run globally, i.e. just
|
||||||
|
once, rather than individually per TLD. This is provided to allow tasks to
|
||||||
|
retry. It is called "`runInEmpty`" for historical reasons.
|
||||||
|
* `excludes` -- A list of TLDs to exclude from processing.
|
||||||
|
* `jitterSeconds` -- The execution of each per-TLD task is delayed by a
|
||||||
|
different random number of seconds between zero and this max value.
|
||||||
|
|
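The fan-out behaviour described by the parameters above can be sketched roughly as follows. This is an illustrative model only, not the actual `TldFanoutAction` implementation: the `Tld` and `FanoutTask` types, the `fan_out` function, and the endpoint and queue names in the usage lines are all hypothetical, and the sketch assumes that a `runInEmpty` task is a single task with no TLD and no jitter.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tld:
    name: str
    tld_type: str  # "REAL" or "TEST"


@dataclass(frozen=True)
class FanoutTask:
    endpoint: str
    queue: str
    tld: Optional[str]  # None models the single global (runInEmpty) task
    delay_seconds: int


def fan_out(tlds, endpoint, queue, *, for_each_real_tld=False,
            for_each_test_tld=False, run_in_empty=False,
            excludes=(), jitter_seconds=0, rng=None):
    """Build the list of tasks that one cron entry would enqueue."""
    rng = rng or random.Random()
    wanted_types = set()
    if for_each_real_tld:
        wanted_types.add("REAL")
    if for_each_test_tld:
        wanted_types.add("TEST")

    tasks = []
    if run_in_empty:
        # Assumption: the global task runs once and is not jittered.
        tasks.append(FanoutTask(endpoint, queue, None, 0))
    for tld in tlds:
        if tld.tld_type in wanted_types and tld.name not in excludes:
            # Each per-TLD task gets its own random delay in [0, jitter_seconds].
            delay = rng.randint(0, jitter_seconds) if jitter_seconds > 0 else 0
            tasks.append(FanoutTask(endpoint, queue, tld.name, delay))
    return tasks


if __name__ == "__main__":
    registry = [Tld("app", "REAL"), Tld("dev", "REAL"), Tld("test", "TEST")]
    # Hypothetical endpoint and queue names, for illustration only.
    for task in fan_out(registry, "/_dr/task/example", "example-queue",
                        for_each_real_tld=True, jitter_seconds=60):
        print(task)
```

Because the fanned-out work is enqueued as tasks rather than run directly by cron, each task inherits the queue's retry behaviour, which is the reason given above for routing even non-per-TLD jobs through the fan-out.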
||||||
## Environments
|
## Environments
|
||||||
|
|
||||||
Nomulus comes pre-configured with support for a number of different
|
Nomulus comes pre-configured with support for a number of different
|
||||||
environments, all of which are used in Google's registry system. Other registry
|
environments, all of which are used in Google's registry system. Other registry
|
||||||
operators may choose to use more or fewer environments, depending on their
|
operators may choose to use more or fewer environments, depending on their
|
||||||
needs.
|
needs. Each environment consists of a separate Google Cloud Platform project,
|
||||||
|
which includes a separate database and separate bulk storage in Cloud Storage.
|
||||||
|
Each environment is thus completely independent.
|
||||||
|
|
||||||
The different environments are specified in `RegistryEnvironment`. Most
|
The different environments are specified in `RegistryEnvironment`. Most
|
||||||
correspond to a separate App Engine app except for `UNITTEST` and `LOCAL`, which
|
correspond to a separate App Engine app except for `UNITTEST` and `LOCAL`, which
|
||||||
|
@ -243,49 +301,6 @@ of experience running a production registry using this codebase.
|
||||||
errors, it can be pushed to Production.
|
errors, it can be pushed to Production.
|
||||||
5. Repeat once weekly, or potentially more often.
|
5. Repeat once weekly, or potentially more often.
|
||||||
|
|
||||||
## Cron tasks
|
|
||||||
|
|
||||||
All [cron tasks](https://cloud.google.com/appengine/docs/java/config/cron) are
|
|
||||||
specified in `cron.xml` files, with one per environment. There are more tasks
|
|
||||||
that execute in Production than in other environments, because tasks like
|
|
||||||
uploading RDE dumps are only done for the live system. Cron tasks execute on the
|
|
||||||
`backend` service.
|
|
||||||
|
|
||||||
Most cron tasks use the `TldFanoutAction` which is accessed via the
|
|
||||||
`/_dr/cron/fanout` URL path. This action, which is run by the BackendServlet on
|
|
||||||
the backend service, fans out a given cron task for each TLD that exists in the
|
|
||||||
registry system, using the queue that is specified in the `cron.xml` entry.
|
|
||||||
Because some tasks may be computationally intensive and could risk spiking
|
|
||||||
system latency if all start executing immediately at the same time, there is a
|
|
||||||
`jitterSeconds` parameter that spreads out tasks over the given number of
|
|
||||||
seconds. This is used with DNS updates and commit log deletion.
|
|
||||||
|
|
||||||
The reason the `TldFanoutAction` exists is that a lot of tasks need to be done
|
|
||||||
separately for each TLD, such as RDE exports and NORDN uploads. It's simpler to
|
|
||||||
have a single cron entry that will create tasks for all TLDs than to have to
|
|
||||||
specify a separate cron task for each action for each TLD (though that is still
|
|
||||||
an option). Task queues also provide retry semantics in the event of transient
|
|
||||||
failures that a raw cron task does not. This is why there are some tasks that do
|
|
||||||
not fan out across TLDs that still use `TldFanoutAction` -- it's so that the
|
|
||||||
tasks retry in the face of transient errors.
|
|
||||||
|
|
||||||
The full list of URL parameters to `TldFanoutAction` that can be specified in
|
|
||||||
cron.xml is:
|
|
||||||
|
|
||||||
* `endpoint` -- The path of the action that should be executed (see
|
|
||||||
`web.xml`).
|
|
||||||
* `queue` -- The cron queue to enqueue tasks in.
|
|
||||||
* `forEachRealTld` -- Specifies that the task should be run in each TLD of
|
|
||||||
type `REAL`. This can be combined with `forEachTestTld`.
|
|
||||||
* `forEachTestTld` -- Specifies that the task should be run in each TLD of
|
|
||||||
type `TEST`. This can be combined with `forEachRealTld`.
|
|
||||||
* `runInEmpty` -- Specifies that the task should be run globally, i.e. just
|
|
||||||
once, rather than individually per TLD. This is provided to allow tasks to
|
|
||||||
retry. It is called "`runInEmpty`" for historical reasons.
|
|
||||||
* `excludes` -- A list of TLDs to exclude from processing.
|
|
||||||
* `jitterSeconds` -- The execution of each per-TLD task is delayed by a
|
|
||||||
different random number of seconds between zero and this max value.
|
|
||||||
|
|
||||||
## Cloud Datastore
|
## Cloud Datastore
|
||||||
|
|
||||||
Nomulus uses [Cloud
|
Nomulus uses [Cloud
|
|
@ -12,8 +12,8 @@ updated by running `nomulus` commands without having to deploy a new version.
|
||||||
Here's a checklist of things that need to be configured upon initial
|
Here's a checklist of things that need to be configured upon initial
|
||||||
installation of the project:
|
installation of the project:
|
||||||
|
|
||||||
* Create Google Cloud Storage buckets (see the [App Engine architecture
|
* Create Google Cloud Storage buckets (see the [Architecture
|
||||||
guide](./app-engine-architecture.md)).
|
documentation](./architecture.md) for more information).
|
||||||
* Modify `ConfigModule.java` and set project-specific settings such as product
|
* Modify `ConfigModule.java` and set project-specific settings such as product
|
||||||
name (see below).
|
name (see below).
|
||||||
* Copy and edit `ProductionRegistryConfigExample.java` with your
|
* Copy and edit `ProductionRegistryConfigExample.java` with your
|
||||||
|
@ -28,8 +28,8 @@ different values for different environments. This is especially pronounced in
|
||||||
the `UNITTEST` and `LOCAL` environments, which don't run on App Engine at all.
|
the `UNITTEST` and `LOCAL` environments, which don't run on App Engine at all.
|
||||||
As an example, some timeouts may be long in production and short in unit tests.
|
As an example, some timeouts may be long in production and short in unit tests.
|
||||||
|
|
||||||
See the [App Engine architecture](./app-engine-architecture.md) documentation
|
See the [Architecture documentation](./architecture.md) for more details on
|
||||||
for more details on environments as used by Nomulus.
|
environments as used by Nomulus.
|
||||||
|
|
||||||
## App Engine configuration
|
## App Engine configuration
|
||||||
|
|
||||||
|
|
|
@ -106,8 +106,8 @@ Cloud Platform. Make sure to choose a good Project ID, as it will be used
|
||||||
repeatedly in a large number of places. If your company is named Acme, then a
|
repeatedly in a large number of places. If your company is named Acme, then a
|
||||||
good Project ID for your production environment would be "acme-registry". Keep
|
good Project ID for your production environment would be "acme-registry". Keep
|
||||||
in mind that project IDs for non-production environments should be suffixed with
|
in mind that project IDs for non-production environments should be suffixed with
|
||||||
the name of the environment (see the [App Engine architecture
|
the name of the environment (see the [Architecture
|
||||||
guide](./app-engine-architecture.md) for more details). For the purposes of this
|
documentation](./architecture.md) for more details). For the purposes of this
|
||||||
example we'll deploy to the "alpha" environment, which is used for developer
|
example we'll deploy to the "alpha" environment, which is used for developer
|
||||||
testing. The Project ID will thus be `acme-registry-alpha`.
|
testing. The Project ID will thus be `acme-registry-alpha`.
|
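The naming convention in this paragraph can be written down as a small helper. This is a sketch under the assumption that only the production environment omits the suffix; the function name `project_id_for` is hypothetical and not part of Nomulus.

```python
def project_id_for(base_project_id: str, environment: str) -> str:
    """Return the Project ID for an environment.

    Hypothetical helper: production uses the base ID unchanged, and every
    other environment appends its lowercased name as a suffix.
    """
    env = environment.lower()
    if env == "production":
        return base_project_id
    return f"{base_project_id}-{env}"


if __name__ == "__main__":
    print(project_id_for("acme-registry", "production"))
    print(project_id_for("acme-registry", "alpha"))
```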
||||||
|
|
||||||
|
|
|
@ -25,8 +25,8 @@ the [first steps tutorial](./first-steps-tutorial.md).
|
||||||
## How to load an escrow file
|
## How to load an escrow file
|
||||||
|
|
||||||
First of all, ensure that all of the cloud storage buckets are set up for
|
First of all, ensure that all of the cloud storage buckets are set up for
|
||||||
nomulus. See the [architecture documentation](./app-engine-architecture.md) for
|
nomulus. See the [Architecture documentation](./architecture.md) for details.
|
||||||
details. The escrow file that will be imported should be uploaded to the
|
The escrow file that will be imported should be uploaded to the
|
||||||
`PROJECT-rde-import` cloud storage bucket. The escrow file should not be
|
`PROJECT-rde-import` cloud storage bucket. The escrow file should not be
|
||||||
compressed or encrypted. When launching each mapreduce job, reference the
|
compressed or encrypted. When launching each mapreduce job, reference the
|
||||||
absolute path to the file (just the path, not the bucket name) in the `path`
|
absolute path to the file (just the path, not the bucket name) in the `path`
|
||||||
|
|