mirror of
https://github.com/google/nomulus.git
synced 2025-04-30 03:57:51 +02:00
Document TLD import architecture
------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=149575471
Parent: 815dae2749
Commit: d4a428fc24
1 changed file with 74 additions and 0 deletions
74
docs/rde-import-architecture.md
Normal file
# Registry Data Import Architecture

*See also the [RDE usage guide](./rde-import-usage.md).*

The Registry Data Import feature was designed to handle escrow files from other
registries with millions of domains. In the spirit of divide and conquer, the
mapreduce library is used to break the import into smaller chunks that can each
be processed in a reasonable period of time. The process is split into four
separate mapreduce jobs that must be run in sequence, both because of how
datastore transactions work and because of dependencies between registry
objects. The steps are as follows:

__Initial Setup__ - A set of manual steps that must be completed before the
process can be run. These steps are outside the scope of this document; see
[Usage](./rde-import-usage.md) for more details.

__Contacts Import__ - Reads contact entries from an escrow file and saves them
as `ContactResource` entities. A `HistoryEntry` entity is also created for each
contact. This step depends on initial setup, but does not depend on any other
import step being run.

__Hosts Import__ - Reads host entries from an escrow file and saves them as
`HostResource` entities. A `HistoryEntry` entity is also created for each host.
This step depends on initial setup, but does not depend on any other import
step being run.

__Domains Import__ - Reads domain entries from an escrow file and saves them as
`DomainResource` entities. For each domain imported, a history entry, an
autorenew billing event, and an autorenew poll message are also created. For
domains in a pending transfer state, the import also creates the future
entities needed for automatic server approval, in the same fashion as a domain
transfer request EPP message. Domains cannot be imported until the contacts and
hosts they require have been imported by the previous steps.

__Hosts Link__ - Reads host entries from an escrow file and links in-zone hosts
to their superordinate domains. This is the last step because both hosts and
domains have to be imported before the link can be made in both directions.
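
The ordering constraints between the steps above can be sketched as a small
dependency check. This is only an illustration: `ImportStep` and `ImportPlan`
are hypothetical names that do not come from the Nomulus codebase, and the
prerequisites are read directly from the step descriptions (contacts and hosts
depend only on the manual setup, domains need both, and hosts link needs hosts
and domains).

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of the four import jobs and their prerequisites. */
enum ImportStep {
  CONTACTS,
  HOSTS,
  DOMAINS,
  HOSTS_LINK;

  /** Returns the steps that must have completed before this one may start. */
  Set<ImportStep> prerequisites() {
    switch (this) {
      case DOMAINS:
        // Domains reference contacts and hosts, so both must already exist.
        return EnumSet.of(CONTACTS, HOSTS);
      case HOSTS_LINK:
        // Linking needs both sides of the host/domain relationship.
        return EnumSet.of(HOSTS, DOMAINS);
      default:
        // Contacts and hosts depend only on the manual initial setup.
        return EnumSet.noneOf(ImportStep.class);
    }
  }
}

/** Hypothetical helper that validates a proposed run order. */
final class ImportPlan {
  private ImportPlan() {}

  /** Returns true if every step runs after all of its prerequisites. */
  static boolean isValidOrder(List<ImportStep> order) {
    Set<ImportStep> done = EnumSet.noneOf(ImportStep.class);
    for (ImportStep step : order) {
      if (!done.containsAll(step.prerequisites())) {
        return false;
      }
      done.add(step);
    }
    return true;
  }
}
```

Under these assumptions the documented order (contacts, hosts, domains, hosts
link) passes the check, while any order that starts domains before contacts
and hosts are finished is rejected.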

## Components

Each mapreduce job (with the exception of Hosts Link) is made up of a similar
set of components. Much of the work done by each job closely resembles the
inverse of the Registry Data Export feature, and reuses the JAXB
representations of the XML elements that compose escrow files.

__Parser__ - The parser is the lowest level of the import process. This
component parses an escrow file (provided as an open stream from Google Cloud
Storage) into discrete JAXB objects. The parser maintains an internal cursor in
the XML file that represents the next element to be read, and can advance to
and skip any number of elements.
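
The cursor behaviour described for the parser can be sketched with the JDK's
StAX API. This is a simplified illustration rather than the actual parser:
`ElementCursor` is a hypothetical class, it returns element text as plain
strings instead of unmarshalling JAXB objects, and it assumes the elements of
interest are flat, text-only, and never nested inside one another.

```java
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical cursor over the repeated elements of one type in an XML stream. */
final class ElementCursor implements AutoCloseable {
  private final XMLStreamReader xml;
  private final String elementName;

  ElementCursor(InputStream in, String elementName) throws XMLStreamException {
    this.xml = XMLInputFactory.newInstance().createXMLStreamReader(in);
    this.elementName = elementName;
  }

  /** Moves to the start tag of the next matching element; false at end of file. */
  private boolean advance() throws XMLStreamException {
    while (xml.hasNext()) {
      if (xml.next() == XMLStreamConstants.START_ELEMENT
          && xml.getLocalName().equals(elementName)) {
        return true;
      }
    }
    return false;
  }

  /** Skips up to {@code count} matching elements and returns how many were skipped. */
  int skip(int count) throws XMLStreamException {
    int skipped = 0;
    while (skipped < count && advance()) {
      skipped++;
    }
    return skipped;
  }

  /** Returns the text of the next matching element, or null when none remain. */
  String readNext() throws XMLStreamException {
    return advance() ? xml.getElementText() : null;
  }

  @Override
  public void close() throws XMLStreamException {
    xml.close();
  }
}
```

Because the stream is read forward only, a reader that needs the elements
starting at some offset has to skip the preceding ones first; under the
assumptions above, that is what `skip` followed by repeated `readNext` calls
does.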

__Reader__ - The reader is configured by each mapreduce job to load an escrow
file from Cloud Storage and use a parser to read a selected subset of the file,
forwarding the results to a mapper.

__Input__ - An input is responsible for determining how many reader instances to
create and which section of the escrow file should be consumed by each reader.
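
One way an input could divide an escrow file among readers is to give each
reader an offset and an element count. The sketch below is an assumption about
how such a split might look, not the scheme the import jobs actually use;
`ReaderRange` and `ShardPlanner` are hypothetical names, and the total element
count is assumed to be known in advance.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical slice of an escrow file assigned to a single reader. */
final class ReaderRange {
  final int offset; // number of elements to skip before reading
  final int count; // number of elements this reader should consume

  ReaderRange(int offset, int count) {
    this.offset = offset;
    this.count = count;
  }
}

/** Hypothetical planner that splits elements as evenly as possible. */
final class ShardPlanner {
  private ShardPlanner() {}

  /** Returns one range per non-empty shard, covering every element exactly once. */
  static List<ReaderRange> plan(int totalElements, int shardCount) {
    if (totalElements < 0 || shardCount < 1) {
      throw new IllegalArgumentException("Need totalElements >= 0 and shardCount >= 1");
    }
    List<ReaderRange> ranges = new ArrayList<>();
    int base = totalElements / shardCount;
    int remainder = totalElements % shardCount;
    int offset = 0;
    for (int i = 0; i < shardCount; i++) {
      // The first `remainder` shards take one extra element each.
      int size = base + (i < remainder ? 1 : 0);
      if (size > 0) {
        ranges.add(new ReaderRange(offset, size));
      }
      offset += size;
    }
    return ranges;
  }
}
```

With 10 elements and 3 shards this yields ranges of 4, 3, and 3 elements
starting at offsets 0, 4, and 7. Empty shards are dropped, so fewer readers
than requested are created when there are fewer elements than shards.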

__Converter__ - The converter accepts a JAXB object and returns an equivalent
resource that can be saved to the datastore.

__Import Utility Logic__ - Common import logic, such as creation of index
entities and escrow file validation, is consolidated into a single place.

__Mapper__ - The mapper accepts a stream of JAXB objects from the reader and
uses the converter to map them to resource objects. For each resource object
produced, the mapper then attempts to save the resource and any related objects
to the datastore in a transaction. This operation is idempotent: a resource
that has already been imported by the process is ignored.
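
The idempotent save performed by the mapper can be illustrated with an
in-memory map standing in for the datastore. Everything here is a hypothetical
sketch: `IdempotentMapper` is not a Nomulus class, the `keyOf` function is an
assumed way of identifying a resource, and `putIfAbsent` only approximates the
check-then-save that the real mapper performs inside a transaction.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical mapper that converts parsed elements and saves each one at most once. */
final class IdempotentMapper<X, R> {
  private final Function<X, R> converter;
  private final Function<R, String> keyOf;
  private final Map<String, R> datastore; // in-memory stand-in for the real datastore

  IdempotentMapper(
      Function<X, R> converter, Function<R, String> keyOf, Map<String, R> datastore) {
    this.converter = converter;
    this.keyOf = keyOf;
    this.datastore = datastore;
  }

  /** Converts and saves the given elements; returns how many were newly saved. */
  int map(List<X> parsed) {
    int saved = 0;
    for (X element : parsed) {
      R resource = converter.apply(element);
      // A resource whose key is already present was imported earlier and is skipped.
      if (datastore.putIfAbsent(keyOf.apply(resource), resource) == null) {
        saved++;
      }
    }
    return saved;
  }
}
```

Running the same input through such a mapper twice saves nothing the second
time, which is the property that lets a failed or retried shard be re-run
safely.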

__Action Endpoint__ - The action endpoint is responsible for accepting requests
to launch each step of the import process, bootstrapping mapreduce jobs, and
redirecting the client to the status page of the import job. This is the entry
point of the import process.