mirror of
https://github.com/google/nomulus.git
synced 2025-04-30 12:07:51 +02:00
This also renames the document to clarify its scope as being all of Google Cloud Platform, not just App Engine. ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=169543846
122 lines
5.7 KiB
Markdown
122 lines
5.7 KiB
Markdown
# Registry Data Import Usage Guide
|
|
|
|
In order to import an RDE escrow file into Nomulus, four different mapreduce
|
|
processes must be run in sequence. Future iterations of this feature may
|
|
automate the sequence, but for now each step of the process must be run
|
|
manually.
|
|
|
|
Note that only contacts, hosts and domains are imported by this process. Other
|
|
objects such as registrars and top level domains are not imported automatically,
|
|
and must be configured before running this import process (see below).
|
|
|
|
## Prerequisites
|
|
|
|
This document assumes that the import process is being executed against a fully
|
|
functioning instance of Nomulus that is set up with its own unique project
|
|
id. See the other docs (particularly the [install guide](./install.md)) for
|
|
details and helpful links.
|
|
|
|
Before launching any of the import jobs, the top level domain (TLD) and all
|
|
registrars that are included in the escrow file must be created in the registry;
|
|
these cannot be created automatically. Detailed instructions on how to create
|
|
TLDs, registrars and registrar contacts can be found in
|
|
the [first steps tutorial](./first-steps-tutorial.md).
|
|
|
|
## How to load an escrow file
|
|
|
|
First of all, ensure that all of the cloud storage buckets are set up for
|
|
nomulus. See the [Architecture documentation](./architecture.md) for details.
|
|
The escrow file that will be imported should be uploaded to the
|
|
`PROJECT-rde-import` cloud storage bucket. The escrow file should not be
|
|
compressed or encrypted. When launching each mapreduce job, reference the
|
|
absolute path to the file (just the path, not the bucket name) in the `path`
|
|
argument.
|
|
|
|
__TODO:__ Add `PROJECT-rde-import` bucket requirement to architecture doc.
|
|
|
|
## Overview of import steps
|
|
|
|
Due to the huge variety in size of domain registries, the number of objects
|
|
included in escrow files can also vary widely. In order to process these files
|
|
in a way that scales, the import of contacts, hosts and domains has been
|
|
implemented as a series of four mapreduce jobs. The four jobs are summarized as
|
|
follows:
|
|
|
|
* Contacts Import - Creates contacts in the registry.
|
|
* Hosts Import - Creates host objects in the registry. Note that only host
|
|
objects are supported at this time; escrow files with host attributes are not
|
|
supported and will be rejected by the import process.
|
|
* Domains Import - Creates domain objects in the registry. This step will also
|
|
publish NS records and A records to dns as necessary based on referenced host
|
|
objects.
|
|
* Hosts Link - Links hosts to their superordinate domain objects. This step is
|
|
necessary to establish data integrity for imported objects.
|
|
|
|
The import steps __must__ be run in this order: Contacts Import, Hosts Import,
|
|
Domains Import, Hosts Link.
|
|
|
|
## Executing the import process
|
|
|
|
The import process steps must be executed by a user that is logged in as an
|
|
administrator of the deployed Nomulus instance. Currently, the way to launch the
|
|
process is to manually enter the proper url into the user's web browser, which
|
|
will kick off the mapreduce job and load its status page. The status page will
|
|
serve as a way to monitor the progress of a job until completion. Once each job
|
|
is completed, the user can launch the next step in the same fashion.
|
|
|
|
Parameters:
|
|
|
|
* path - This is the path to the escrow file in cloud storage.
|
|
* mapShards - This is the number of shards that will be used to process the
|
|
file. The process has been tested with a mapShards setting of 100, which
|
|
seems to perform well in most cases.
|
|
|
|
The import process is deployed in the `backend` service, so "backend-dot-" will
|
|
be prepended to the hostname. Replace `PROJECT` below with the unique project
|
|
name under which the Nomulus instance is deployed, and `PATH` with the path to
|
|
the escrow file.
|
|
|
|
Launch Contacts Import:
|
|
`https://backend-dot-PROJECT.appspot.com/_dr/importRdeContacts?path=PATH&mapShards=100`
|
|
|
|
Launch Hosts Import:
|
|
`https://backend-dot-PROJECT.appspot.com/_dr/importRdeHosts?path=PATH&mapShards=100`
|
|
|
|
Launch Domains Import:
|
|
`https://backend-dot-PROJECT.appspot.com/_dr/importRdeDomains?path=PATH&mapShards=100`
|
|
|
|
Launch Hosts Link:
|
|
`https://backend-dot-PROJECT.appspot.com/_dr/linkRdeHosts?path=PATH&mapShards=100`
|
|
|
|
For each job, the mapreduce user interface will display the status of the
|
|
running job. The job is finished when all of the boxes on the left turn green
|
|
(at which point the next step of the process can be safely launched). If any of
|
|
the boxes turn red, it means that the job failed and the logs should be
|
|
consulted for errors (see below).
|
|
|
|
Note that each of these steps is idempotent, and can be run many times with no
|
|
harmful side effects or duplicated data.
|
|
|
|
## Monitoring and troubleshooting
|
|
|
|
On the status page, several counters will be shown as the job progresses,
|
|
indicating the number of operations attempted, how many succeeded, how many were
|
|
ignored, and if there were any errors. This is a good tool to understand at a
|
|
high level how the job is progressing. The counters from a completed job can
|
|
also be compared against the known count of resources from an escrow file to
|
|
determine if the import was successful or not.
|
|
|
|
For a more detailed view, open up the logging for the project in
|
|
the [Google Cloud Console](https://console.cloud.google.com). The application
|
|
logs from the import process will show up under requests with a url that starts
|
|
with `/_dr/mapreduce/workerCallback` - the logs provide a detailed view of which
|
|
resources were read from file, which were imported, which had already existed
|
|
before the import, and which failed to import.
|
|
|
|
## Known limitations
|
|
|
|
Currently, the ID of all registrars in the escrow file must match those that are
|
|
already configured in the registry. Future work is planned to map between
|
|
different internal and external registrar IDs by using the IANA IDs, which are
|
|
always consistent between registries.
|
|
|