diff --git a/README.md b/README.md index c1becf530..8939b29e1 100644 --- a/README.md +++ b/README.md @@ -15,10 +15,10 @@ the Markdown documents in the `docs` directory. When it comes to internet land, ownership flows down the following hierarchy: -1. [ICANN][icann] -2. [Registries][registry] (e.g. Google Registry) -3. [Registrars][registrar] (e.g. Google Domains) -4. Registrants (e.g. you) +1. [ICANN][icann] +2. [Registries][registry] (e.g. Google Registry) +3. [Registrars][registrar] (e.g. Google Domains) +4. Registrants (e.g. you) A registry is any organization that operates an entire top-level domain. For example, Verisign controls all the .COM domains and Affilias controls all the @@ -50,9 +50,9 @@ are limited to four minutes and ten megabytes in size. Furthermore, queries and indexes that span entity groups are always eventually consistent, which means they could take seconds, and very rarely, days to update. While most online services find eventual consistency useful, it is not appropriate for a service -conducting financial exchanges. Therefore Domain Registry has been engineered -to employ performance and complexity tradeoffs that allow strong consistency to -be applied throughout the codebase. +conducting financial exchanges. Therefore Domain Registry has been engineered to +employ performance and complexity tradeoffs that allow strong consistency to be +applied throughout the codebase. Domain Registry has a commit log system. Commit logs are retained in datastore for thirty days. They are also streamed to Cloud Storage for backup purposes. @@ -63,8 +63,8 @@ order to do restores. Each EPP resource entity also stores a map of its past mutations with 24-hour granularity. This makes it possible to have point-in-time projection queries with effectively no overhead. -The Registry Data Escrow (RDE) system is also built with reliability in mind. 
-It executes on top of App Engine task queues, which can be double-executed and +The Registry Data Escrow (RDE) system is also built with reliability in mind. It +executes on top of App Engine task queues, which can be double-executed and therefore require operations to be idempotent. RDE isn't idempotent. To work around this, RDE uses datastore transactions to achieve mutual exclusion and serialization. We call this the "Locking Rolling Cursor Pattern." One benefit of @@ -94,14 +94,15 @@ proxy listening on port 700. Poll message support is also included. To supplement EPP, Domain Registry also provides a public API for performing domain availability checks. This service listens on the `/check` path. -* [RFC 5730: EPP](http://tools.ietf.org/html/rfc5730) -* [RFC 5731: EPP Domain Mapping](http://tools.ietf.org/html/rfc5731) -* [RFC 5732: EPP Host Mapping](http://tools.ietf.org/html/rfc5732) -* [RFC 5733: EPP Contact Mapping](http://tools.ietf.org/html/rfc5733) -* [RFC 3915: EPP Grace Period Mapping](http://tools.ietf.org/html/rfc3915) -* [RFC 5734: EPP Transport over TCP](http://tools.ietf.org/html/rfc5734) -* [RFC 5910: EPP DNSSEC Mapping](http://tools.ietf.org/html/rfc5910) -* [Draft: EPP Launch Phase Mapping (Proposed)](http://tools.ietf.org/html/draft-tan-epp-launchphase-11) +* [RFC 5730: EPP](http://tools.ietf.org/html/rfc5730) +* [RFC 5731: EPP Domain Mapping](http://tools.ietf.org/html/rfc5731) +* [RFC 5732: EPP Host Mapping](http://tools.ietf.org/html/rfc5732) +* [RFC 5733: EPP Contact Mapping](http://tools.ietf.org/html/rfc5733) +* [RFC 3915: EPP Grace Period Mapping](http://tools.ietf.org/html/rfc3915) +* [RFC 5734: EPP Transport over TCP](http://tools.ietf.org/html/rfc5734) +* [RFC 5910: EPP DNSSEC Mapping](http://tools.ietf.org/html/rfc5910) +* [Draft: EPP Launch Phase Mapping + (Proposed)](http://tools.ietf.org/html/draft-tan-epp-launchphase-11) ### Registry Data Escrow (RDE) @@ -114,17 +115,22 @@ This service exists for ICANN regulatory purposes.
ICANN needs to know that, should a registry business ever implode, that they can quickly migrate their TLDs to a different company so that they'll continue to operate. -* [Draft: Registry Data Escrow Specification](http://tools.ietf.org/html/draft-arias-noguchi-registry-data-escrow-06) -* [Draft: Domain Name Registration Data (DNRD) Objects Mapping](http://tools.ietf.org/html/draft-arias-noguchi-dnrd-objects-mapping-05) -* [Draft: ICANN Registry Interfaces](http://tools.ietf.org/html/draft-lozano-icann-registry-interfaces-05) +* [Draft: Registry Data Escrow + Specification](http://tools.ietf.org/html/draft-arias-noguchi-registry-data-escrow-06) +* [Draft: Domain Name Registration Data (DNRD) Objects + Mapping](http://tools.ietf.org/html/draft-arias-noguchi-dnrd-objects-mapping-05) +* [Draft: ICANN Registry + Interfaces](http://tools.ietf.org/html/draft-lozano-icann-registry-interfaces-05) ### Trademark Clearing House (TMCH) Domain Registry integrates with ICANN and IBM's MarksDB in order to protect trademark holders, when new TLDs are being launched. -* [Draft: TMCH Functional Spec](http://tools.ietf.org/html/draft-lozano-tmch-func-spec-08) -* [Draft: Mark and Signed Mark Objects Mapping](https://tools.ietf.org/html/draft-lozano-tmch-smd-02) +* [Draft: TMCH Functional + Spec](http://tools.ietf.org/html/draft-lozano-tmch-func-spec-08) +* [Draft: Mark and Signed Mark Objects + Mapping](https://tools.ietf.org/html/draft-lozano-tmch-smd-02) ### WHOIS @@ -134,8 +140,10 @@ internal HTTP endpoint running on `/_dr/whois`. A separate proxy running on port 43 forwards requests to that path. Domain Registry also implements a public HTTP endpoint that listens on the `/whois` path.
-* [RFC 3912: WHOIS Protocol Specification](https://tools.ietf.org/html/rfc3912) -* [RFC 7485: Inventory and Analysis of Registration Objects](http://tools.ietf.org/html/rfc7485) +* [RFC 3912: WHOIS Protocol + Specification](https://tools.ietf.org/html/rfc3912) +* [RFC 7485: Inventory and Analysis of Registration + Objects](http://tools.ietf.org/html/rfc7485) ### Registration Data Access Protocol (RDAP) @@ -143,23 +151,24 @@ RDAP is the new standard for WHOIS. It provides much richer functionality, such as the ability to perform wildcard searches. Domain Registry makes this HTTP service available under the `/rdap/...` path. -* [RFC 7480: RDAP HTTP Usage](http://tools.ietf.org/html/rfc7480) -* [RFC 7481: RDAP Security Services](http://tools.ietf.org/html/rfc7481) -* [RFC 7482: RDAP Query Format](http://tools.ietf.org/html/rfc7482) -* [RFC 7483: RDAP JSON Responses](http://tools.ietf.org/html/rfc7483) -* [RFC 7484: RDAP Finding the Authoritative Registration Data](http://tools.ietf.org/html/rfc7484) +* [RFC 7480: RDAP HTTP Usage](http://tools.ietf.org/html/rfc7480) +* [RFC 7481: RDAP Security Services](http://tools.ietf.org/html/rfc7481) +* [RFC 7482: RDAP Query Format](http://tools.ietf.org/html/rfc7482) +* [RFC 7483: RDAP JSON Responses](http://tools.ietf.org/html/rfc7483) +* [RFC 7484: RDAP Finding the Authoritative + Registration Data](http://tools.ietf.org/html/rfc7484) ### Backups The registry provides a system for generating and restoring from backups with -strong point-in-time consistency. Datastore backups are written out once daily +strong point-in-time consistency. Datastore backups are written out once daily to Cloud Storage using the built-in Datastore snapshot export functionality. Separately, entities called commit logs are continuously exported to track changes that occur in between the regularly scheduled backups.
A restore involves wiping out all entities in Datastore, importing the most recent complete daily backup snapshot, then replaying all of the commit logs -since that snapshot. This yields a system state that is guaranteed +since that snapshot. This yields a system state that is guaranteed transactionally consistent. ### Billing @@ -173,24 +182,26 @@ monthly invoices per registrar. Because the registry runs on the Google Cloud Platform stack, it benefits from high availability, automatic fail-over, and horizontal auto-scaling of compute -and database resources. This makes it quite flexible for running TLDs of any +and database resources. This makes it quite flexible for running TLDs of any size. ### Automated tests The registry codebase includes ~400 test classes with ~4,000 total unit and -integration tests. This limits regressions, ensures correct system +integration tests. This limits regressions, ensures correct system functionality, and allows for easy continued future development and refactoring. ### DNS An interface for DNS operations is provided, along with a sample implementation -that uses the [Google Cloud DNS](https://cloud.google.com/dns/) API. A bulk +that uses the [Google Cloud DNS](https://cloud.google.com/dns/) API. A bulk export tool is also provided to export a zone file for an entire TLD in BIND format. -* [RFC 1034: Domain Names - Concepts and Facilities](https://www.ietf.org/rfc/rfc1034.txt) -* [RFC 1035: Domain Names - Implementation and Specification](https://www.ietf.org/rfc/rfc1034.txt) +* [RFC 1034: Domain Names - Concepts and + Facilities](https://www.ietf.org/rfc/rfc1034.txt) +* [RFC 1035: Domain Names - Implementation and + Specification](https://www.ietf.org/rfc/rfc1035.txt) ### Exports @@ -202,21 +213,20 @@ ICANN-mandated reports, database snapshots, and reserved terms. ### Metrics and reporting The registry records metrics and regularly exports them to BigQuery so that -analyses can be run on them using full SQL queries.
Metrics include which EPP +analyses can be run on them using full SQL queries. Metrics include which EPP commands were run and when and by whom, information on failed commands, activity per registrar, and length of each request. [BigQuery][bigquery] reporting scripts are provided to generate the required -per-TLD monthly -[registry reports](https://www.icann.org/resources/pages/registry-reports) for -ICANN. +per-TLD monthly [registry +reports](https://www.icann.org/resources/pages/registry-reports) for ICANN. ### Registrar console The registry includes a web-based registrar console that registrars can access -in a browser. It provides the ability for registrars to view their billing +in a browser. It provides the ability for registrars to view their billing invoices in Google Drive, contact the registry provider, and modify WHOIS, -security (including SSL certificates), and registrar contact settings. Main +security (including SSL certificates), and registrar contact settings. Main registry commands such as creating domains, hosts, and contacts must go through EPP and are not provided in the console. @@ -231,7 +241,7 @@ system, and creating new TLDs. ### Plug-and-play pricing engines The registry has the ability to configure per-TLD pricing engines to -programmatically determine the price of domain names on the fly. An +programmatically determine the price of domain names on the fly. An implementation is provided that uses the contents of a static list of prices (this being by far the most common type of premium pricing used for TLDs). @@ -240,23 +250,23 @@ implementation is provided There are a few things that the registry cannot currently do, and a few things that are out of scope that it will never do. -* You will need a DNS system in order to run a fully-fledged registry. If you - are planning on using anything other than Google Cloud DNS you will need to - provide an implementation.
-* You will need an invoicing system to convert the internal registry billing - events into registrar invoices using whatever accounts receivable setup you - already have. A partial implementation is provided that generates generic CSV - invoices (see `MakeBillingTablesCommand`), but you will need to integrate it - with your payments system. -* You will likely need monitoring to continuously monitor the status of the - system. Any of a large variety of tools can be used for this, or you can - write your own. -* You will need a proxy to forward traffic on EPP and WHOIS ports to the HTTPS - endpoint on App Engine, as App Engine only allows incoming traffic on - HTTP/HTTPS ports. Similarly, App Engine does not yet support IPv6, so your - proxy would have to support that as well if you need IPv6 support. Future - versions of [App Engine Flexible][flex] should provide these out of the box, - but they aren't ready yet. +* You will need a DNS system in order to run a fully-fledged registry. If you + are planning on using anything other than Google Cloud DNS you will need to + provide an implementation. +* You will need an invoicing system to convert the internal registry billing + events into registrar invoices using whatever accounts receivable setup you + already have. A partial implementation is provided that generates generic + CSV invoices (see `MakeBillingTablesCommand`), but you will need to + integrate it with your payments system. +* You will likely need monitoring to continuously monitor the status of the + system. Any of a large variety of tools can be used for this, or you can + write your own. +* You will need a proxy to forward traffic on EPP and WHOIS ports to the HTTPS + endpoint on App Engine, as App Engine only allows incoming traffic on + HTTP/HTTPS ports. Similarly, App Engine does not yet support IPv6, so your + proxy would have to support that as well if you need IPv6 support. 
Future + versions of [App Engine Flexible][flex] should provide these out of the box, + but they aren't ready yet. [bigquery]: https://cloud.google.com/bigquery/ [datastore]: https://cloud.google.com/datastore/docs/concepts/overview diff --git a/g3doc/app-engine-architecture.md b/g3doc/app-engine-architecture.md index 20229dbd3..6b37500ed 100644 --- a/g3doc/app-engine-architecture.md +++ b/g3doc/app-engine-architecture.md @@ -5,19 +5,19 @@ Registry project as it is implemented in App Engine. ## Services -The Domain Registry contains three -[services](https://cloud.google.com/appengine/docs/python/an-overview-of-app-engine), -which were previously called modules in earlier versions of App Engine. The -services are: default (also called front-end), backend, and tools. Each service +The Domain Registry contains three +[services](https://cloud.google.com/appengine/docs/python/an-overview-of-app-engine), +which were previously called modules in earlier versions of App Engine. The +services are: default (also called front-end), backend, and tools. Each service runs independently in a lot of ways, including that they can be upgraded individually, their log outputs are separate, and their servers and configured scaling are separate as well. Once you have your app deployed and running, the default service can be accessed at `https://project-id.appspot.com`, substituting whatever your App Engine app -is named for "project-id". Note that that is the URL for the production -instance of your app; other environments will have the environment name appended -with a hyphen in the hostname, e.g. `https://project-id-sandbox.appspot.com`. +is named for "project-id". Note that that is the URL for the production instance +of your app; other environments will have the environment name appended with a +hyphen in the hostname, e.g. `https://project-id-sandbox.appspot.com`.
The URL for the backend service is `https://backend-dot-project-id.appspot.com` and the URL for the tools service is `https://tools-dot-project-id.appspot.com`. @@ -27,32 +27,32 @@ wild-cards). ### Default service -The default service is responsible for all registrar-facing -[EPP](https://en.wikipedia.org/wiki/Extensible_Provisioning_Protocol) command +The default service is responsible for all registrar-facing +[EPP](https://en.wikipedia.org/wiki/Extensible_Provisioning_Protocol) command traffic, all user-facing WHOIS and RDAP traffic, and the admin and registrar web -consoles, and is thus the most important service. If the service has any +consoles, and is thus the most important service. If the service has any problems and goes down or stops servicing requests in a timely manner, it will -begin to impact users immediately. Requests to the default service are handled +begin to impact users immediately. Requests to the default service are handled by the `FrontendServlet`, which provides all of the endpoints exposed in `FrontendRequestComponent`. ### Backend service The backend service is responsible for executing all regularly scheduled -background tasks (using cron) as well as all asynchronous tasks. Requests to -the backend service are handled by the `BackendServlet`, which provides all of -the endpoints exposed in `BackendRequestComponent`. These include tasks for +background tasks (using cron) as well as all asynchronous tasks. Requests to the +backend service are handled by the `BackendServlet`, which provides all of the +endpoints exposed in `BackendRequestComponent`. These include tasks for generating/exporting RDE, syncing the trademark list from TMDB, exporting backups, writing out DNS updates, handling asynchronous contact and host deletions, writing out commit logs, exporting metrics to BigQuery, and many -more.
Issues in the backend service will not immediately be apparent to end users, but the longer it is down, the more obvious it will become that user-visible tasks such as DNS and deletion are not being handled in a timely manner. The backend service is also where all MapReduces run, which includes some of the aforementioned tasks such as RDE and asynchronous resource deletion, as well as -any one-off data migration MapReduces. Consequently, the backend service should +any one-off data migration MapReduces. Consequently, the backend service should be sized to support not just the normal ongoing DNS load but also the load incurred by MapReduces, both scheduled (such as RDE) and on-demand (asynchronous contact/host deletion). @@ -61,48 +61,48 @@ contact/host deletion). ### Tools service The tools service is responsible for servicing requests from the `registry_tool` command line tool, which provides administrative-level functionality for -developers and tech support employees of the registry. It is thus the least -critical of the three services. Requests to the tools service are handled by -the `ToolsServlet`, which provides all of the endpoints exposed in -`ToolsRequestComponent`. Some example functionality that this service provides +developers and tech support employees of the registry. It is thus the least +critical of the three services. Requests to the tools service are handled by the +`ToolsServlet`, which provides all of the endpoints exposed in +`ToolsRequestComponent`. Some example functionality that this service provides includes the server-side code to update premium lists, run EPP commands from the -tool, and manually modify contacts/hosts/domains/and other resources. Problems +tool, and manually modify contacts, hosts, domains, and other resources. Problems with the tools service are not visible to users.
## Task queues [Task queues](https://cloud.google.com/appengine/docs/java/taskqueue/) in App Engine provide an asynchronous way to enqueue tasks and then execute them on -some kind of schedule. There are two types of queues, push queues and pull -queues. Tasks in push queues are always executing up to some throttlable limit. +some kind of schedule. There are two types of queues, push queues and pull +queues. Tasks in push queues are always executing up to some throttlable limit. Tasks in pull queues remain there indefinitely until the queue is polled by code -that is running for some other reason. Essentially, push queues run their own -tasks while pull queues just enqueue data that is used by something else. Many -other parts of App Engine are implemented using task queues. For example, -[App Engine cron](https://cloud.google.com/appengine/docs/java/config/cron) adds -tasks to push queues at regularly scheduled intervals, and the -[MapReduce framework](https://cloud.google.com/appengine/docs/java/dataprocessing/) -adds tasks for each phase of the MapReduce algorithm. +that is running for some other reason. Essentially, push queues run their own +tasks while pull queues just enqueue data that is used by something else. Many +other parts of App Engine are implemented using task queues. For example, [App +Engine cron](https://cloud.google.com/appengine/docs/java/config/cron) adds +tasks to push queues at regularly scheduled intervals, and the [MapReduce +framework](https://cloud.google.com/appengine/docs/java/dataprocessing/) adds +tasks for each phase of the MapReduce algorithm. The Domain Registry project uses a particular pattern of paired push/pull queues -that is worth explaining in detail. Push queues are essential because App +that is worth explaining in detail. 
Push queues are essential because App Engine's architecture does not support long-running background processes, and so push queues are thus the fundamental building block that allows asynchronous and background execution of code that is not in response to incoming web requests. However, they also have limitations in that they do not allow batch processing -or grouping. That's where the pull queue comes in. Regularly scheduled tasks -in the push queue will, upon execution, poll the corresponding pull queue for a -specified number of tasks and execute them in a batch. This allows the code to +or grouping. That's where the pull queue comes in. Regularly scheduled tasks in +the push queue will, upon execution, poll the corresponding pull queue for a +specified number of tasks and execute them in a batch. This allows the code to execute in the background while taking advantage of batch processing. Particulars on the task queues in use by the Domain Registry project are -specified in the `queue.xml` file. Note that many push queues have a direct +specified in the `queue.xml` file. Note that many push queues have a direct one-to-one correspondence with entries in `cron.xml` because they need to be fanned-out on a per-TLD or other basis (see the Cron section below for more -explanation). The exact queue that a given cron task will use is passed as the +explanation). The exact queue that a given cron task will use is passed as the query string parameter "queue" in the url specification for the cron task. -Here are the task queues in use by the system. All are push queues unless +Here are the task queues in use by the system. All are push queues unless explicitly marked as otherwise. * `bigquery-streaming-metrics` -- Queue for metrics that are asynchronously @@ -181,245 +181,249 @@ explicitly marked as otherwise. 
## Environments The domain registry codebase comes pre-configured with support for a number of -different environments, all of which are used in Google's registry system. -Other registry operators may choose to user more or fewer environments, -depending on their needs. +different environments, all of which are used in Google's registry system. Other +registry operators may choose to use more or fewer environments, depending on +their needs. -The different environments are specified in `RegistryEnvironment`. Most +The different environments are specified in `RegistryEnvironment`. Most correspond to a separate App Engine app except for `UNITTEST` and `LOCAL`, which -by their nature do not use real environments running in the cloud. The +by their nature do not use real environments running in the cloud. The recommended naming scheme for the App Engine apps that has the best possible compatibility with the codebase and thus requires the least configuration is to pick a name for the production app and then suffix it for the other -environments. E.g., if the production app is to be named 'registry-platform', +environments. E.g., if the production app is to be named 'registry-platform', then the sandbox app would be named 'registry-platform-sandbox'. The full list of environments supported out-of-the-box, in descending order from real to not, is: -* `PRODUCTION` -- The real production environment that is actually running live - TLDs. Since the Domain Registry is a shared registry platform, there need - only ever be one of these. -* `SANDBOX` -- A playground environment for external users to test commands in - without the possibility of affecting production data. This is the environment - new registrars go through - [OT&E](https://www.icann.org/resources/unthemed-pages/registry-agmt-appc-e-2001-04-26-en) - in. Sandbox is also useful as a final sanity check to push a new prospective - build to and allow it to "bake" before pushing it to production.
-* `QA` -- An internal environment used by business users to play with and sign - off on new features to be released. This environment can be pushed to - frequently and is where manual testers should be spending the majority of - their time. -* `CRASH` -- Another environment similar to QA, except with no expectations of - data preservation. Crash is used for testing of backup/restore (which brings - the entire system down until it is completed) without affecting the QA - environment. -* `ALPHA` -- The developers' playground. Experimental builds are routinely - pushed here in order to test them on a real app running on App Engine. You - may end up wanting multiple environments like Alpha if you regularly - experience contention (i.e. developers being blocked from testing their code - on Alpha because others are already using it). -* `LOCAL` -- A fake environment that is used when running the app locally on a - simulated App Engine instance. -* `UNITTEST` -- A fake environment that is used in unit tests, where everything - in the App Engine stack is simulated or mocked. +* `PRODUCTION` -- The real production environment that is actually running + live TLDs. Since the Domain Registry is a shared registry platform, there + need only ever be one of these. +* `SANDBOX` -- A playground environment for external users to test commands in + without the possibility of affecting production data. This is the + environment new registrars go through + [OT&E](https://www.icann.org/resources/unthemed-pages/registry-agmt-appc-e-2001-04-26-en) + in. Sandbox is also useful as a final sanity check to push a new prospective + build to and allow it to "bake" before pushing it to production. +* `QA` -- An internal environment used by business users to play with and sign + off on new features to be released. This environment can be pushed to + frequently and is where manual testers should be spending the majority of + their time.
+* `CRASH` -- Another environment similar to QA, except with no expectations of + data preservation. Crash is used for testing of backup/restore (which brings + the entire system down until it is completed) without affecting the QA + environment. +* `ALPHA` -- The developers' playground. Experimental builds are routinely + pushed here in order to test them on a real app running on App Engine. You + may end up wanting multiple environments like Alpha if you regularly + experience contention (i.e. developers being blocked from testing their code + on Alpha because others are already using it). +* `LOCAL` -- A fake environment that is used when running the app locally on a + simulated App Engine instance. +* `UNITTEST` -- A fake environment that is used in unit tests, where + everything in the App Engine stack is simulated or mocked. ## Release process The following is a recommended release process based on Google's several years of experience running a production registry using this codebase. -1. Developers write code and associated unit tests verifying that the new code - works properly. -2. New features or potentially risky bug fixes are pushed to Alpha and tested by - the developers before being committed to the source code repository. -3. New builds are cut and first pushed to Sandbox. -4. Once a build has been running successfully in Sandbox for a day with no - errors, it can be pushed to Production. -5. Repeat once weekly, or potentially more often. +1. Developers write code and associated unit tests verifying that the new code + works properly. +2. New features or potentially risky bug fixes are pushed to Alpha and tested + by the developers before being committed to the source code repository. +3. New builds are cut and first pushed to Sandbox. +4. Once a build has been running successfully in Sandbox for a day with no + errors, it can be pushed to Production. +5. Repeat once weekly, or potentially more often. 
## Cron tasks All [cron tasks](https://cloud.google.com/appengine/docs/java/config/cron) are -specified in `cron.xml` files, with one per environment. There are more tasks +specified in `cron.xml` files, with one per environment. There are more tasks that execute in Production than in other environments, because tasks like -uploading RDE dumps are only done for the live system. Cron tasks execute on -the `backend` service. +uploading RDE dumps are only done for the live system. Cron tasks execute on the +`backend` service. Most cron tasks use the `TldFanoutAction` which is accessed via the -`/_dr/cron/fanout` URL path. This action, which is run by the BackendServlet on +`/_dr/cron/fanout` URL path. This action, which is run by the BackendServlet on the backend service, fans out a given cron task for each TLD that exists in the registry system, using the queue that is specified in the `cron.xml` entry. Because some tasks may be computationally intensive and could risk spiking system latency if all start executing immediately at the same time, there is a `jitterSeconds` parameter that spreads out tasks over the given number of -seconds. This is used with DNS updates and commit log deletion. +seconds. This is used with DNS updates and commit log deletion. The reason the `TldFanoutAction` exists is that a lot of tasks need to be done -separately for each TLD, such as RDE exports and NORDN uploads. It's simpler to +separately for each TLD, such as RDE exports and NORDN uploads. It's simpler to have a single cron entry that will create tasks for all TLDs than to have to specify a separate cron task for each action for each TLD (though that is still -an option). Task queues also provide retry semantics in the event of transient -failures that a raw cron task does not. This is why there are some tasks that -do not fan out across TLDs that still use `TldFanoutAction` -- it's so that the +an option). 
Task queues also provide retry semantics in the event of transient +failures that a raw cron task does not. This is why there are some tasks that do +not fan out across TLDs that still use `TldFanoutAction` -- it's so that the tasks retry in the face of transient errors. The full list of URL parameters to `TldFanoutAction` that can be specified in cron.xml is: -* `endpoint` -- The path of the action that should be executed (see `web.xml`). -* `queue` -- The cron queue to enqueue tasks in. -* `forEachRealTld` -- Specifies that the task should be run in each TLD of type - `REAL`. This can be combined with `forEachTestTld`. -* `forEachTestTld` -- Specifies that the task should be run in each TLD of type - `TEST`. This can be combined with `forEachRealTld`. -* `runInEmpty` -- Specifies that the task should be run globally, i.e. just - once, rather than individually per TLD. This is provided to allow tasks to - retry. It is called "`runInEmpty`" for historical reasons. -* `excludes` -- A list of TLDs to exclude from processing. -* `jitterSeconds` -- The execution of each per-TLD task is delayed by a - different random number of seconds between zero and this max value. + +* `endpoint` -- The path of the action that should be executed (see + `web.xml`). +* `queue` -- The cron queue to enqueue tasks in. +* `forEachRealTld` -- Specifies that the task should be run in each TLD of + type `REAL`. This can be combined with `forEachTestTld`. +* `forEachTestTld` -- Specifies that the task should be run in each TLD of + type `TEST`. This can be combined with `forEachRealTld`. +* `runInEmpty` -- Specifies that the task should be run globally, i.e. just + once, rather than individually per TLD. This is provided to allow tasks to + retry. It is called "`runInEmpty`" for historical reasons. +* `excludes` -- A list of TLDs to exclude from processing. 
+* `jitterSeconds` -- The execution of each per-TLD task is delayed by a
+  different random number of seconds between zero and this max value.

## Cloud Datastore

-The Domain Registry platform uses
-[Cloud Datastore](https://cloud.google.com/appengine/docs/java/datastore/) as
-its primary database. Cloud Datastore is a NoSQL document database that
-provides automatic horizontal scaling, high performance, and high availability.
-All information that is persisted to Cloud Datastore takes the form of Java
-classes annotated with `@Entity` that are located in the `model` package. The
-[Objectify library](https://cloud.google.com/appengine/docs/java/gettingstarted/using-datastore-objectify)
+The Domain Registry platform uses [Cloud
+Datastore](https://cloud.google.com/appengine/docs/java/datastore/) as its
+primary database. Cloud Datastore is a NoSQL document database that provides
+automatic horizontal scaling, high performance, and high availability. All
+information that is persisted to Cloud Datastore takes the form of Java classes
+annotated with `@Entity` that are located in the `model` package. The
+[Objectify library](https://cloud.google.com/appengine/docs/java/gettingstarted/using-datastore-objectify)
is used to persist instances of these classes in a format that Datastore
understands.

A brief overview of the different entity types found in the App Engine Datastore
-Viewer may help administrators understand what they are seeing. Note that some
+Viewer may help administrators understand what they are seeing. Note that some
of these entities are part of App Engine tools that are outside of the domain
registry codebase:

-* `_AE_*` -- These entities are created by App Engine.
-* `_ah_SESSION` -- These entities track App Engine client sessions.
-* `_GAE_MR_*` -- These entities are generated by App Engine while running
-  MapReduces.
-* `BackupStatus` -- There should only be one of these entities, used to maintain
-  the state of the backup process. 
-* `Cancellation` -- A cancellation is a special type of billing event which - represents the cancellation of another billing event such as a OneTime or - Recurring. -* `ClaimsList`, `ClaimsListShard`, and `ClaimsListSingleton` -- These entities - store the TMCH claims list, for use in trademark processing. -* `CommitLog*` -- These entities store the commit log information. -* `ContactResource` -- These hold the ICANN contact information (but not - registrar contacts, who have a separate entity type). -* `Cursor` -- We use Cursor entities to maintain state about daily processes, - remembering which dates have been processed. For instance, for the RDE export, - Cursor entities maintain the date up to which each TLD has been exported. -* `DomainApplicationIndex` -- These hold domain applications received during the - sunrise period. -* `DomainBase` -- These hold the ICANN domain information. -* `DomainRecord` -- These are used during the DNS update process. -* `EntityGroupRoot` -- There is only one EntityGroupRoot entity, which serves as - the Datastore parent of many other entities. -* `EppResourceIndex` -- These entities allow enumeration of EPP resources (such - as domains, hosts and contacts), which would otherwise be difficult to do in - Datastore. -* `ExceptionReportEntity` -- These entities are generated automatically by - ECatcher, a Google-internal logging and debugging tool. Non-Google users - should not encounter these entries. -* `ForeignKeyContactIndex`, `ForeignKeyDomainIndex`, and `ForeignKeyHostIndex` - -- These act as a unique index on contacts, domains and hosts, allowing - transactional lookup by foreign key. -* `HistoryEntry` -- A HistoryEntry is the record of a command which mutated an - EPP resource. It serves as the parent of BillingEvents and PollMessages. -* `HostRecord` -- These are used during the DNS update process. -* `HostResource` -- These hold the ICANN host information. 
-* `Lock` -- Lock entities are used to control access to a shared resource such - as an App Engine queue. Under ordinary circumstances, these locks will be - cleaned up automatically, and should not accumulate. -* `LogsExportCursor` -- This is a single entity which maintains the state of log - export. -* `MR-*` -- These entities are generated by the App Engine MapReduce library in - the course of running MapReduces. -* `Modification` -- A Modification is a special type of billing event which - represents the modification of a OneTime billing event. -* `OneTime` -- A OneTime is a billing event which represents a one-time charge - or credit to the client (as opposed to Recurring). -* `pipeline-*` -- These entities are also generated by the App Engine MapReduce - library. -* `PollMessage` -- PollMessages are generated by the system to notify registrars - of asynchronous responses and status changes. -* `PremiumList`, `PremiumListEntry`, and `PremiumListRevision` -- The standard - method for determining which domain names receive premium pricing is to - maintain a static list of premium names. Each PremiumList contains some number - of PremiumListRevisions, each of which in turn contains a PremiumListEntry for - each premium name. -* `RdeRevision` -- These entities are used by the RDE subsystem in the process - of generating files. -* `Recurring` -- A Recurring is a billing event which represents a recurring - charge to the client (as opposed to OneTime). -* `Registrar` -- These hold information about client registrars. -* `RegistrarContact` -- Registrars have contacts just as domains do. These are - stored in a special RegistrarContact entity. -* `RegistrarCredit` and `RegistrarCreditBalance` -- The system supports the - concept of a registrar credit balance, which is a pool of credit that the - registrar can use to offset amounts they owe. This might come from promotions, - for instance. These entities maintain registrars' balances. 
-* `Registry` -- These hold information about the TLDs supported by the Registry - system. -* `RegistryCursor` -- These entities are the predecessor to the Cursor - entities. We are no longer using them, and will be deleting them soon. -* `ReservedList` -- Each ReservedList entity represents an entire list of - reserved names which cannot be registered. Each TLD can have one or more - attached reserved lists. -* `ServerSecret` -- this is a single entity containing the secret numbers used - for generating tokens such as XSRF tokens. -* `SignedMarkRevocationList` -- The entities together contain the Signed Mark - Data Revocation List file downloaded from the TMCH MarksDB each day. Each - entity contains up to 10,000 rows of the file, so depending on the size of the - file, there will be some handful of entities. -* `TmchCrl` -- This is a single entity containing ICANN's TMCH CA Certificate - Revocation List. +* `_AE_*` -- These entities are created by App Engine. +* `_ah_SESSION` -- These entities track App Engine client sessions. +* `_GAE_MR_*` -- These entities are generated by App Engine while running + MapReduces. +* `BackupStatus` -- There should only be one of these entities, used to + maintain the state of the backup process. +* `Cancellation` -- A cancellation is a special type of billing event which + represents the cancellation of another billing event such as a OneTime or + Recurring. +* `ClaimsList`, `ClaimsListShard`, and `ClaimsListSingleton` -- These entities + store the TMCH claims list, for use in trademark processing. +* `CommitLog*` -- These entities store the commit log information. +* `ContactResource` -- These hold the ICANN contact information (but not + registrar contacts, who have a separate entity type). +* `Cursor` -- We use Cursor entities to maintain state about daily processes, + remembering which dates have been processed. For instance, for the RDE + export, Cursor entities maintain the date up to which each TLD has been + exported. 
+* `DomainApplicationIndex` -- These hold domain applications received during + the sunrise period. +* `DomainBase` -- These hold the ICANN domain information. +* `DomainRecord` -- These are used during the DNS update process. +* `EntityGroupRoot` -- There is only one EntityGroupRoot entity, which serves + as the Datastore parent of many other entities. +* `EppResourceIndex` -- These entities allow enumeration of EPP resources + (such as domains, hosts and contacts), which would otherwise be difficult to + do in Datastore. +* `ExceptionReportEntity` -- These entities are generated automatically by + ECatcher, a Google-internal logging and debugging tool. Non-Google users + should not encounter these entries. +* `ForeignKeyContactIndex`, `ForeignKeyDomainIndex`, and + `ForeignKeyHostIndex` -- These act as a unique index on contacts, domains + and hosts, allowing transactional lookup by foreign key. +* `HistoryEntry` -- A HistoryEntry is the record of a command which mutated an + EPP resource. It serves as the parent of BillingEvents and PollMessages. +* `HostRecord` -- These are used during the DNS update process. +* `HostResource` -- These hold the ICANN host information. +* `Lock` -- Lock entities are used to control access to a shared resource such + as an App Engine queue. Under ordinary circumstances, these locks will be + cleaned up automatically, and should not accumulate. +* `LogsExportCursor` -- This is a single entity which maintains the state of + log export. +* `MR-*` -- These entities are generated by the App Engine MapReduce library + in the course of running MapReduces. +* `Modification` -- A Modification is a special type of billing event which + represents the modification of a OneTime billing event. +* `OneTime` -- A OneTime is a billing event which represents a one-time charge + or credit to the client (as opposed to Recurring). +* `pipeline-*` -- These entities are also generated by the App Engine + MapReduce library. 
+* `PollMessage` -- PollMessages are generated by the system to notify
+  registrars of asynchronous responses and status changes.
+* `PremiumList`, `PremiumListEntry`, and `PremiumListRevision` -- The standard
+  method for determining which domain names receive premium pricing is to
+  maintain a static list of premium names. Each PremiumList contains some
+  number of PremiumListRevisions, each of which in turn contains a
+  PremiumListEntry for each premium name.
+* `RdeRevision` -- These entities are used by the RDE subsystem in the process
+  of generating files.
+* `Recurring` -- A Recurring is a billing event which represents a recurring
+  charge to the client (as opposed to OneTime).
+* `Registrar` -- These hold information about client registrars.
+* `RegistrarContact` -- Registrars have contacts just as domains do. These are
+  stored in a special RegistrarContact entity.
+* `RegistrarCredit` and `RegistrarCreditBalance` -- The system supports the
+  concept of a registrar credit balance, which is a pool of credit that the
+  registrar can use to offset amounts they owe. This might come from
+  promotions, for instance. These entities maintain registrars' balances.
+* `Registry` -- These hold information about the TLDs supported by the
+  Registry system.
+* `RegistryCursor` -- These entities are the predecessor to the Cursor
+  entities. We are no longer using them, and will be deleting them soon.
+* `ReservedList` -- Each ReservedList entity represents an entire list of
+  reserved names which cannot be registered. Each TLD can have one or more
+  attached reserved lists.
+* `ServerSecret` -- This is a single entity containing the secret numbers used
+  for generating tokens such as XSRF tokens.
+* `SignedMarkRevocationList` -- These entities together contain the Signed Mark
+  Data Revocation List file downloaded from the TMCH MarksDB each day. 
Each
+  entity contains up to 10,000 rows of the file, so depending on the size of
+  the file, there will be a handful of entities.
+* `TmchCrl` -- This is a single entity containing ICANN's TMCH CA Certificate
+  Revocation List.

## Cloud Storage buckets

-The Domain Registry platform uses
-[Cloud Storage](https://cloud.google.com/storage/) for bulk storage of large
-flat files that aren't suitable for Datastore. These files include backups, RDE
-exports, Datastore snapshots (for ingestion into BigQuery), and reports. Each
-bucket name must be unique across all of Google Cloud Storage, so we use the
-common recommended pattern of prefixing all buckets with the name of the App
-Engine app (which is itself globally unique). Most of the bucket names are
-configurable, but the defaults are as follows, with PROJECT standing in as a
-placeholder for the App Engine app name:
+The Domain Registry platform uses [Cloud
+Storage](https://cloud.google.com/storage/) for bulk storage of large flat
+files that aren't suitable for Datastore. These files include backups, RDE
+exports, Datastore snapshots (for ingestion into BigQuery), and reports. Each
+bucket name must be unique across all of Google Cloud Storage, so we use the
+common recommended pattern of prefixing all buckets with the name of the App
+Engine app (which is itself globally unique). Most of the bucket names are
+configurable, but the defaults are as follows, with PROJECT standing in as a
+placeholder for the App Engine app name:

-* `PROJECT-billing` -- Monthly invoice files for each registrar.
-* `PROJECT-commits` -- Daily exports of commit logs that are needed for
-  potentially performing a restore.
-* `PROJECT-domain-lists` -- Daily exports of all registered domain names per
-  TLD.
-* `PROJECT-gcs-logs` -- This bucket is used at Google to store the GCS access
-  logs and storage data. This bucket is not required by the Registry system,
-  but can provide useful logging information. 
For instructions on setup, see - the - [Cloud Storage documentation](https://cloud.google.com/storage/docs/access-logs). -* `PROJECT-icann-brda` -- This bucket contains the weekly ICANN BRDA files. - There is no lifecycle expiration; we keep a history of all the files. This - bucket must exist for the BRDA process to function. -* `PROJECT-icann-zfa` -- This bucket contains the most recent ICANN ZFA - files. No lifecycle is needed, because the files are overwritten each time. -* `PROJECT-rde` -- This bucket contains RDE exports, which should then be - regularly uploaded to the escrow provider. Lifecycle is set to 90 days. The - bucket must exist. -* `PROJECT-reporting` -- Contains monthly ICANN reporting files. -* `PROJECT-snapshots` -- Contains daily exports of Datastore entities of types - defined in `ExportConstants.java`. These are imported into BigQuery daily to - allow for in-depth querying. -* `PROJECT.appspot.com` -- Temporary MapReduce files are stored here. By - default, the App Engine MapReduce library places its temporary files in a - bucket named {project}.appspot.com. This bucket must exist. To keep temporary - files from building up, a 90-day or 180-day lifecycle should be applied to the - bucket, depending on how long you want to be able to go back and debug - MapReduce problems. At 30 GB per day of generate temporary files, this bucket - may be the largest consumer of storage, so only save what you actually use. +* `PROJECT-billing` -- Monthly invoice files for each registrar. +* `PROJECT-commits` -- Daily exports of commit logs that are needed for + potentially performing a restore. +* `PROJECT-domain-lists` -- Daily exports of all registered domain names per + TLD. +* `PROJECT-gcs-logs` -- This bucket is used at Google to store the GCS access + logs and storage data. This bucket is not required by the Registry system, + but can provide useful logging information. 
For instructions on setup, see
+  the [Cloud Storage
+  documentation](https://cloud.google.com/storage/docs/access-logs).
+* `PROJECT-icann-brda` -- This bucket contains the weekly ICANN BRDA files.
+  There is no lifecycle expiration; we keep a history of all the files. This
+  bucket must exist for the BRDA process to function.
+* `PROJECT-icann-zfa` -- This bucket contains the most recent ICANN ZFA files.
+  No lifecycle is needed, because the files are overwritten each time.
+* `PROJECT-rde` -- This bucket contains RDE exports, which should then be
+  regularly uploaded to the escrow provider. Lifecycle is set to 90 days. The
+  bucket must exist.
+* `PROJECT-reporting` -- Contains monthly ICANN reporting files.
+* `PROJECT-snapshots` -- Contains daily exports of Datastore entities of types
+  defined in `ExportConstants.java`. These are imported into BigQuery daily to
+  allow for in-depth querying.
+* `PROJECT.appspot.com` -- Temporary MapReduce files are stored here. By
+  default, the App Engine MapReduce library places its temporary files in a
+  bucket named {project}.appspot.com. This bucket must exist. To keep
+  temporary files from building up, a 90-day or 180-day lifecycle should be
+  applied to the bucket, depending on how long you want to be able to go back
+  and debug MapReduce problems. At 30 GB per day of generated temporary files,
+  this bucket may be the largest consumer of storage, so only save what you
+  actually use.

## Commit logs

diff --git a/g3doc/configuration.md b/g3doc/configuration.md
index ab7b288b2..6070e695b 100644
--- a/g3doc/configuration.md
+++ b/g3doc/configuration.md
@@ -1,19 +1,19 @@
# Configuration

There are multiple different kinds of configuration that go into getting a
-working registry system up and running. Broadly speaking, configuration works
-in two ways -- globally, for the entire sytem, and per-TLD. 
Global
-configuration is managed by editing code and deploying a new version, whereas
-per-TLD configuration is data that lives in Datastore in `Registry` entities,
-and is updated by running `registry_tool` commands without having to deploy a
-new version.
+working registry system up and running. Broadly speaking, configuration works in
+two ways -- globally, for the entire system, and per-TLD. Global configuration
+is managed by editing code and deploying a new version, whereas per-TLD
+configuration is data that lives in Datastore in `Registry` entities, and is
+updated by running `registry_tool` commands without having to deploy a new
+version.

## Environments

Before getting into the details of configuration, it's important to note that a
-lot of configuration is environment-dependent. It is common to see `switch`
+lot of configuration is environment-dependent. It is common to see `switch`
statements that operate on the current `RegistryEnvironment`, and return
-different values for different environments. This is especially pronounced in
+different values for different environments. This is especially pronounced in
the `UNITTEST` and `LOCAL` environments, which don't run on App Engine at all.
As an example, some timeouts may be long in production and short in unit tests.

@@ -27,34 +27,34 @@ thoroughly documented in the
[App Engine configuration docs][app-engine-config]. 
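The environment-dependent `switch` pattern described in the Environments section above can be pictured with a minimal, self-contained sketch. The enum constants, method, and timeout values here are assumptions chosen for illustration and do not mirror the real `RegistryEnvironment` class:

```java
// Illustrative sketch of environment-dependent configuration. The enum
// constants and the timeout values are invented for this example; they are
// not the real RegistryEnvironment class from the codebase.
enum RegistryEnvironment { PRODUCTION, SANDBOX, LOCAL, UNITTEST }

class EnvironmentTimeouts {
  /** Returns a request timeout (in seconds) tuned per environment. */
  static int requestTimeoutSeconds(RegistryEnvironment env) {
    switch (env) {
      case PRODUCTION:
        return 60; // generous, to tolerate slow upstream services
      case UNITTEST:
        return 1;  // short, to keep unit tests fast
      default:
        return 10; // middle ground for development environments
    }
  }
}
```

Because each value is selected explicitly per environment, tuning one environment cannot silently change the behavior of another.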
The main files of note that come pre-configured along with
the domain registry are:

-* `cron.xml` -- Configuration of cronjobs
-* `web.xml` -- Configuration of URL paths on the webserver
-* `appengine-web.xml` -- Overall App Engine settings including number and type
-  of instances
-* `datastore-indexes.xml` -- Configuration of entity indexes in Datastore
-* `queue.xml` -- Configuration of App Engine task queues
-* `application.xml` -- Configuration of the application name and its services
+* `cron.xml` -- Configuration of cronjobs
+* `web.xml` -- Configuration of URL paths on the webserver
+* `appengine-web.xml` -- Overall App Engine settings including number and type
+  of instances
+* `datastore-indexes.xml` -- Configuration of entity indexes in Datastore
+* `queue.xml` -- Configuration of App Engine task queues
+* `application.xml` -- Configuration of the application name and its services

Cron, web, and queue are covered in more detail in the "App Engine architecture"
doc, and the rest are covered in the general App Engine documentation. If you
are not writing new code to implement custom features, it is unlikely that you
will need to make any modifications beyond simple changes to
-`application.xml` and `appengine-web.xml`. If you are writing new features,
-it's likely you'll need to add cronjobs, URL paths, Datastore indexes, and task
+`application.xml` and `appengine-web.xml`. If you are writing new features, it's
+likely you'll need to add cronjobs, URL paths, Datastore indexes, and task
queues, and thus edit those associated XML files.

## Global configuration

There are two different mechanisms by which global configuration is managed:
-`RegistryConfig` (the old way) and `ConfigModule` (the new way). Ideally there
+`RegistryConfig` (the old way) and `ConfigModule` (the new way). Ideally there
would just be one, but the required code cleanup hasn't been completed yet.
If you are adding new options, prefer adding them to `ConfigModule`. 
**`RegistryConfig`** is an interface for which you write an implementing class
-containing the configuration values. `RegistryConfigLoader` is the class that
+containing the configuration values. `RegistryConfigLoader` is the class that
provides the instance of `RegistryConfig`, and defaults to returning
-`ProductionRegistryConfigExample`. In order to create a configuration specific
+`ProductionRegistryConfigExample`. In order to create a configuration specific
to your registry, we recommend copying the `ProductionRegistryConfigExample`
class to a new class that will not be shared publicly, setting the
`com.google.domain.registry.config` system property in `appengine-web.xml` to
@@ -64,16 +64,16 @@ configuration options.

The `RegistryConfig` class has documentation on all of
the methods that should be sufficient to explain what each option is, and
-`ProductionRegistryConfigExample` provides an example value for each one. Some
+`ProductionRegistryConfigExample` provides an example value for each one. Some
example configuration options in this interface include the App Engine project
ID, the number of days to retain commit logs, various Cloud Storage bucket
names, and URLs for some required services both external and internal.

**`ConfigModule`** is a Dagger module that provides injectable configuration
options (some of which come from `RegistryConfig` above, but most of which do
-not). This is preferred over `RegistryConfig` for new configuration options
+not). This is preferred over `RegistryConfig` for new configuration options
because being able to inject configuration options is a nicer pattern that makes
-for cleaner code. Some configuration options that can be changed in this class
+for cleaner code. Some configuration options that can be changed in this class
include timeout lengths and buffer sizes for various tasks, email addresses and
URLs to use for various services, more Cloud Storage bucket names, and WHOIS
disclaimer text. 
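As a rough sketch of why injected configuration makes for cleaner code than a global config object: real code uses Dagger's `@Module`/`@Provides` annotations, but the hand wiring and the specific option names below are invented for illustration.

```java
// Hand-wired sketch of the injectable-configuration idea behind ConfigModule.
// In the real codebase Dagger generates this wiring from @Module/@Provides
// annotations; the option names and values below are invented examples.
class SketchConfigModule {
  int provideRdeUploadTimeoutSeconds() { return 600; }
  String provideSupportEmail() { return "support@example.test"; }
}

// A consumer declares exactly what it needs in its constructor instead of
// reaching into a global config object, so tests can pass any value directly.
class UploadTask {
  final int timeoutSeconds;
  UploadTask(int timeoutSeconds) { this.timeoutSeconds = timeoutSeconds; }
}

class Wiring {
  static UploadTask newUploadTask() {
    // Dagger would perform this lookup-and-construct step automatically.
    return new UploadTask(
        new SketchConfigModule().provideRdeUploadTimeoutSeconds());
  }
}
```

Since `UploadTask` never names the module, a test can construct it with a tiny timeout directly, without touching global state.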
@@ -83,39 +83,39 @@ disclaimer text. Some configuration values, such as PGP private keys, are so sensitive that they should not be written in code as per the configuration methods above, as that would pose too high a risk of them accidentally being leaked, e.g. in a source -control mishap. We use a secret store to persist these values in a secure +control mishap. We use a secret store to persist these values in a secure manner, and abstract access to them using the `Keyring` interface. The `Keyring` interface contains methods for all sensitive configuration values, which are primarily credentials used to access various ICANN and ICANN- -affiliated services (such as RDE). These values are only needed for real -production registries and PDT environments. If you are just playing around with +affiliated services (such as RDE). These values are only needed for real +production registries and PDT environments. If you are just playing around with the platform at first, it is OK to put off defining these values until -necessary. To that end, a `DummyKeyringModule` is included that simply provides -an `InMemoryKeyring` populated with dummy values for all secret keys. This +necessary. To that end, a `DummyKeyringModule` is included that simply provides +an `InMemoryKeyring` populated with dummy values for all secret keys. This allows the codebase to compile and run, but of course any actions that attempt to connect to external services will fail because none of the keys are real. To configure a production registry system, you will need to write a replacement module for `DummyKeyringModule` that loads the credentials in a secure way, and provides them using either an instance of `InMemoryKeyring` or your own custom -implementation of `Keyring`. You then need to replace all usages of +implementation of `Keyring`. You then need to replace all usages of `DummyKeyringModule` with your own module in all of the per-service components -in which it is referenced. 
The functions in `PgpHelper` will likely prove -useful for loading keys stored in PGP format into the PGP key classes that -you'll need to provide from `Keyring`, and you can see examples of them in -action in `DummyKeyringModule`. +in which it is referenced. The functions in `PgpHelper` will likely prove useful +for loading keys stored in PGP format into the PGP key classes that you'll need +to provide from `Keyring`, and you can see examples of them in action in +`DummyKeyringModule`. ## Per-TLD configuration `Registry` entities, which are persisted to Datastore, are used for per-TLD -configuration. They contain any kind of configuration that is specific to a -TLD, such as the create/renew price of a domain name, the pricing engine +configuration. They contain any kind of configuration that is specific to a TLD, +such as the create/renew price of a domain name, the pricing engine implementation, the DNS writer implementation, whether escrow exports are -enabled, the default currency, the reserved label lists, and more. The -`update_tld` command in `registry_tool` is used to set all of these options. -See the "Registry tool" documentation for more information, as well as the -command-line help for the `update_tld` command. Unlike global configuration +enabled, the default currency, the reserved label lists, and more. The +`update_tld` command in `registry_tool` is used to set all of these options. See +the "Registry tool" documentation for more information, as well as the +command-line help for the `update_tld` command. Unlike global configuration above, per-TLD configuration options are stored as data in the running system, and thus do not require code pushes to update. diff --git a/g3doc/install.md b/g3doc/install.md index 4712b280c..eedb04b9c 100644 --- a/g3doc/install.md +++ b/g3doc/install.md @@ -5,25 +5,27 @@ working running instance. 
## Prerequisites

-* A recent version of the
-[Java 7 JDK](http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html)
-(note that Java 8 support should be coming to App Engine soon).
-* [Bazel](http://bazel.io/), which is the buld system that
-the Domain Registry project uses. The minimum required version is 0.3.1.
-* [Google App Engine SDK for Java](https://cloud.google.com/appengine/downloads#Google_App_Engine_SDK_for_Java),
-especially `appcfg`, which is a command-line tool that runs locally that is used
-to communicate with the App Engine cloud.
-* [Create an application](https://cloud.google.com/appengine/docs/java/quickstart)
-  on App Engine to deploy to, and set up `appcfg` to connect to it.
+* A recent version of the [Java 7
+  JDK](http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html)
+  (note that Java 8 support should be coming to App Engine soon).
+* [Bazel](http://bazel.io/), which is the build system that the Domain
+  Registry project uses. The minimum required version is 0.3.1.
+* [Google App Engine SDK for
+  Java](https://cloud.google.com/appengine/downloads#Google_App_Engine_SDK_for_Java),
+  especially `appcfg`, a command-line tool that runs locally and is used to
+  communicate with the App Engine cloud.
+* [Create an
+  application](https://cloud.google.com/appengine/docs/java/quickstart) on App
+  Engine to deploy to, and set up `appcfg` to connect to it.

## Downloading the code

-Start off by grabbing the latest version from the
-[Domain Registry project on GitHub](https://github.com/google/domain-registry).
-This can be done either by cloning the Git repo (if you expect to make code
-changes to contribute back), or simply by downloading the latest release as a
-zip file. This guide will cover cloning from Git, but should work almost
-identically for downloading the zip file. 
+Start off by grabbing the latest version from the [Domain Registry project on +GitHub](https://github.com/google/domain-registry). This can be done either by +cloning the Git repo (if you expect to make code changes to contribute back), or +simply by downloading the latest release as a zip file. This guide will cover +cloning from Git, but should work almost identically for downloading the zip +file. $ git clone git@github.com:google/domain-registry.git Cloning into 'domain-registry'... @@ -36,19 +38,19 @@ identically for downloading the zip file. The most important directories are: -* `docs` -- the documentation (including this install guide) -* `java/google/registry` -- all of the source code of the main project -* `javatests/google/registry` -- all of the tests for the project -* `python` -- Some Python reporting scripts -* `scripts` -- Scripts for configuring development environments +* `docs` -- the documentation (including this install guide) +* `java/google/registry` -- all of the source code of the main project +* `javatests/google/registry` -- all of the tests for the project +* `python` -- Some Python reporting scripts +* `scripts` -- Scripts for configuring development environments Everything else, especially `third_party`, contains dependencies that are used by the project. ## Building and verifying the code -The first step is to verify that the project successfully builds. This will -also download and install dependencies. +The first step is to verify that the project successfully builds. This will also +download and install dependencies. $ bazel --batch build //java{,tests}/google/registry/... INFO: Found 584 targets... @@ -56,7 +58,7 @@ also download and install dependencies. INFO: Elapsed time: 124.433s, Critical Path: 116.92s There may be some warnings thrown, but if there are no errors, then you are good -to go. Next, run the tests to verify that everything works properly. The tests +to go. 
Next, run the tests to verify that everything works properly. The tests can be pretty resource intensive, so experiment with different values of parameters to optimize between low running time and not slowing down your computer too badly. @@ -68,10 +70,10 @@ computer too badly. ## Running a development instance locally `RegistryTestServer` is a lightweight test server for the registry that is -suitable for running locally for development. It uses local versions of all -Google Cloud Platform dependencies, when available. Correspondingly, its +suitable for running locally for development. It uses local versions of all +Google Cloud Platform dependencies, when available. Correspondingly, its functionality is limited compared to a Domain Registry instance running on an -actual App Engine instance. To see its command-line parameters, run: +actual App Engine instance. To see its command-line parameters, run: $ bazel run //javatests/google/registry/server -- --help @@ -86,13 +88,13 @@ http://localhost:8080/registrar . ## Deploying the code You are going to need to configure a variety of things before a working -installation can be deployed (see the Configuration guide for that). It's +installation can be deployed (see the Configuration guide for that). It's recommended to at least confirm that the default version of the code can be pushed at all first before diving into that, with the expectation that things won't work properly until they are configured. 
-All of the [EAR](https://en.wikipedia.org/wiki/EAR_(file_format)) and
-[WAR](https://en.wikipedia.org/wiki/WAR_(file_format)) files for the different
+All of the [EAR](https://en.wikipedia.org/wiki/EAR_\(file_format\)) and
+[WAR](https://en.wikipedia.org/wiki/WAR_\(file_format\)) files for the different
environments, which were built in the previous step, are outputted to the
`bazel-genfiles` directory as follows:

@@ -115,7 +117,8 @@ an environment in the file name), whereas there is one WAR file per service per
environment, with there being three services in total: default, backend, and
tools.

-Then, use `appcfg` to [deploy the WAR files](https://cloud.google.com/appengine/docs/java/tools/uploadinganapp):
+Then, use `appcfg` to
+[deploy the WAR files](https://cloud.google.com/appengine/docs/java/tools/uploadinganapp):

    $ cd /path/to/downloaded/appengine/app
    $ /path/to/appcfg.sh update /path/to/registry_default.war
@@ -126,15 +129,15 @@ Then, use `appcfg` to [deploy the WAR files](https://cloud.google.com/appengine/

Once the code is deployed, the next step is to play around with creating some
entities in the registry, including a TLD, a registrar, a domain, a contact, and
-a host. Note: Do this on a non-production environment! All commands below use
+a host. Note: Do this on a non-production environment! All commands below use
`registry_tool` to interact with the running registry system; see the
-documentation on `registry_tool` for additional information on it. We'll assume
+documentation on `registry_tool` for additional information on it. We'll assume
that all commands below are running in the `alpha` environment; if you named
your environment differently, then use that everywhere that `alpha` appears.

### Create a TLD

-Pick the name of a TLD to create. For the purposes of this example we'll use
+Pick the name of a TLD to create. 
For the purposes of this example we'll use
"example", which conveniently happens to be an ICANN reserved string, meaning
it'll never be created for real on the Internet at large.

@@ -144,25 +147,25 @@ it'll never be created for real on the Internet at large.
    Perform this command? (y/N): y
    Updated 1 entities.

-The name of the TLD is the main parameter passed to the command. The initial
-TLD state is set here to general availability, bypassing sunrise and landrush,
-so that domain names can be created immediately in the following steps. The TLD
+The name of the TLD is the main parameter passed to the command. The initial TLD
+state is set here to general availability, bypassing sunrise and landrush, so
+that domain names can be created immediately in the following steps. The TLD
type is set to `TEST` (the other alternative being `REAL`) for obvious reasons.
`roid_suffix` is the suffix that will be used for repository ids of domains on
the TLD -- it must be all uppercase and a maximum of eight ASCII characters.
-ICANN
-[recommends](https://www.icann.org/resources/pages/correction-non-compliant-roids-2015-08-26-en)
-a unique ROID suffix per TLD. The easiest way to come up with one is to simply
+ICANN
+[recommends](https://www.icann.org/resources/pages/correction-non-compliant-roids-2015-08-26-en)
+a unique ROID suffix per TLD. The easiest way to come up with one is to simply
use the entire uppercased TLD string if it is eight characters or fewer, or
-abbreviate it in some sensible way down to eight if it is longer. The full repo
-id of a domain resource is a hex string followed by the suffix,
-e.g. `12F7CDF3-EXAMPLE` for our example TLD.
+abbreviate it in some sensible way down to eight if it is longer. The full repo
+id of a domain resource is a hex string followed by the suffix, e.g.
+`12F7CDF3-EXAMPLE` for our example TLD.

### Create a registrar

Now we need to create a registrar and give it access to operate on the example
-TLD. 
For the purposes of our example we'll name the registrar "Acme". +TLD. For the purposes of our example we'll name the registrar "Acme". $ registry_tool -e alpha create_registrar acme --name 'ACME Corp' \ --registrar_type TEST --password hunter2 \ @@ -175,27 +178,27 @@ TLD. For the purposes of our example we'll name the registrar "Acme". support it. In the command above, "acme" is the internal registrar id that is the primary -key used to refer to the registrar. The `name` is the display name that is used -less often, primarily in user interfaces. We again set the type of the resource -here to `TEST`. The `password` is the EPP password that the registrar uses to -log in with. The `icann_referral_email` is the email address associated with -the initial creation of the registrar -- note that the registrar cannot change -it later. The address fields are self-explanatory (note that other parameters -are available for international addresses). The `allowed_tlds` parameter is a +key used to refer to the registrar. The `name` is the display name that is used +less often, primarily in user interfaces. We again set the type of the resource +here to `TEST`. The `password` is the EPP password that the registrar uses to +log in with. The `icann_referral_email` is the email address associated with the +initial creation of the registrar -- note that the registrar cannot change it +later. The address fields are self-explanatory (note that other parameters are +available for international addresses). The `allowed_tlds` parameter is a comma-delimited list of TLDs that the registrar has access to, and here is set to the example TLD. ### Create a contact Now we want to create a contact, as a contact is required before a domain can be -created. Contacts can be used on any number of domains across any number of +created. Contacts can be used on any number of domains across any number of TLDs, and contain the information on who owns or provides technical support for -a TLD. 
These details will appear in WHOIS queries. Note the `-c` parameter, +a TLD. These details will appear in WHOIS queries. Note the `-c` parameter, which stands for client identifier: This is used on most `registry_tool` commands, and is used to specify the id of the registrar that the command will -be executed using. Contact, domain, and host creation all work by constructing +be executed using. Contact, domain, and host creation all work by constructing an EPP message that is sent to the registry, and EPP commands need to run under -the context of a registrar. The "acme" registrar that was created above is used +the context of a registrar. The "acme" registrar that was created above is used for this purpose. $ registry_tool -e alpha create_contact -c acme --id abcd1234 \ @@ -204,24 +207,24 @@ for this purpose. [ ... snip EPP response ... ] The `id` is the contact id, and is referenced elsewhere in the system (e.g. when -a domain is created and the admin contact is specified). The `name` is the +a domain is created and the admin contact is specified). The `name` is the display name of the contact, which is usually the name of a company or of a -person. Again, the address fields are required, along with an `email`. +person. Again, the address fields are required, along with an `email`. ### Create a host Hosts are used to specify the IP addresses (either v4 or v6) that are associated -with a given nameserver. Note that hosts may either be in-bailiwick (on a TLD -that this registry runs) or out-of-bailiwick. In-bailiwick hosts may +with a given nameserver. Note that hosts may either be in-bailiwick (on a TLD +that this registry runs) or out-of-bailiwick. In-bailiwick hosts may additionally be subordinate (a subdomain of a domain name that is on this -registry). Let's create an out-of-bailiwick nameserver, which is the simplest +registry). Let's create an out-of-bailiwick nameserver, which is the simplest type. 
$ registry_tool -e alpha create_host -c acme --host ns1.google.com
    [ ... snip EPP response ... ]

Note that hosts are required to have IP addresses if they are subordinate, and
-must not have IP addresses if they are not subordinate. Use the `--addresses`
+must not have IP addresses if they are not subordinate. Use the `--addresses`
parameter to set the IP addresses on a host, passing in a comma-delimited list
of IP addresses in either IPv4 or IPv6 format.

@@ -236,7 +239,7 @@ and host.
    [ ... snip EPP response ... ]

Note how the same contact id (from above) is used for the administrative,
-technical, and registrant contact. This is quite common on domain names.
+technical, and registrant contact. This is quite common on domain names.

To verify that everything worked, let's query the WHOIS information for
fake.example:
diff --git a/g3doc/registry-tool.md b/g3doc/registry-tool.md
index 9a701b50e..52998f9e1 100644
--- a/g3doc/registry-tool.md
+++ b/g3doc/registry-tool.md
@@ -1,11 +1,11 @@
# Registry tool

The registry tool is a command-line registry administration tool that is invoked
-using the `registry_tool` command. It has the ability to view and change a
-large number of things in a running domain registry environment, including
-creating registrars, updating premium and reserved lists, running an EPP command
-from a given XML file, and performing various backend tasks like re-running RDE
-if the most recent export failed. Its code lives inside the tools package
+using the `registry_tool` command. It has the ability to view and change a large
+number of things in a running domain registry environment, including creating
+registrars, updating premium and reserved lists, running an EPP command from a
+given XML file, and performing various backend tasks like re-running RDE if the
+most recent export failed. 
Its code lives inside the tools package (`java/google/registry/tools`), and is compiled by building the `registry_tool` target in the Bazel BUILD file in that package. @@ -15,11 +15,11 @@ To build the tool and display its command-line help, execute this command: For future invocations you should alias the compiled binary in the `bazel-genfiles/java/google/registry` directory or add it to your path so that -you can run it more easily. The rest of this guide assumes that it has been +you can run it more easily. The rest of this guide assumes that it has been aliased to `registry_tool`. The registry tool is always called with a specific environment to run in using -the -e parameter. This looks like: +the -e parameter. This looks like: $ registry_tool -e production {command name} {command parameters} @@ -37,7 +37,7 @@ There are actually two separate tools, `gtech_tool`, which is a collection of lower impact commands intended to be used by tech support personnel, and `registry_tool`, which is a superset of `gtech_tool` that contains additional commands that are potentially more destructive and can change more aspects of -the system. A full list of `gtech_tool` commands can be found in +the system. A full list of `gtech_tool` commands can be found in `GtechTool.java`, and the additional commands that only `registry_tool` has access to are in `RegistryTool.java`. @@ -47,7 +47,7 @@ There are two broad ways that commands are implemented: some that send requests to `ToolsServlet` to execute the action on the server (these commands implement `ServerSideCommand`), and others that execute the command locally using the [Remote API](https://cloud.google.com/appengine/docs/java/tools/remoteapi) -(these commands implement `RemoteApiCommand`). Server-side commands take more +(these commands implement `RemoteApiCommand`). Server-side commands take more work to implement because they require both a client and a server-side component, e.g. 
`CreatePremiumListCommand.java` and `CreatePremiumListAction.java` respectively for creating a premium list. @@ -56,35 +56,36 @@ Engine, including running a large MapReduce, because they execute on the tools service in the App Engine cloud. Local commands, by contrast, are easier to implement, because there is only a -local component to write, but they aren't as powerful. A general rule of thumb +local component to write, but they aren't as powerful. A general rule of thumb for making this determination is to use a local command if possible, or a server-side command otherwise. ## Common tool patterns All tools ultimately implement the `Command` interface located in the `tools` -package. If you use an IDE such as Eclipse to view the type hierarchy of that +package. If you use an IDE such as Eclipse to view the type hierarchy of that interface, you'll see all of the commands that exist, as well as how a lot of them are grouped using sub-interfaces or abstract classes that provide -additional functionality. The most common patterns that are used by a large +additional functionality. The most common patterns that are used by a large number of other tools are: -* **`BigqueryCommand`** -- Provides a connection to BigQuery for tools that need - it. -* **`ConfirmingCommand`** -- Provides the methods `prompt()` and `execute()` to - override. `prompt()` outputs a message (usually what the command is going to - do) and prompts the user to confirm execution of the command, and then - `execute()` actually does it. -* **`EppToolCommand`** -- Commands that work by executing EPP commands against - the server, usually by filling in a template with parameters that were passed - on the command-line. -* **`MutatingEppToolCommand`** -- A sub-class of `EppToolCommand` that provides - a `--dry_run` flag, that, if passed, will display the output from the server - of what the command would've done without actually committing those changes. 
-* **`GetEppResourceCommand`** -- Gets individual EPP resources from the server
-  and outputs them.
-* **`ListObjectsCommand`** -- Lists all objects of a specific type from the
-  server and outputs them.
-* **`MutatingCommand`** -- Provides a facility to create or update entities in
-  Datastore, and uses a diff algorithm to display the changes that will be made
-  before committing them.
+* **`BigqueryCommand`** -- Provides a connection to BigQuery for tools that
+  need it.
+* **`ConfirmingCommand`** -- Provides the methods `prompt()` and `execute()`
+  to override. `prompt()` outputs a message (usually what the command is going
+  to do) and prompts the user to confirm execution of the command, and then
+  `execute()` actually does it.
+* **`EppToolCommand`** -- Commands that work by executing EPP commands against
+  the server, usually by filling in a template with parameters that were
+  passed on the command-line.
+* **`MutatingEppToolCommand`** -- A sub-class of `EppToolCommand` that
+  provides a `--dry_run` flag that, if passed, will display the output from
+  the server of what the command would've done without actually committing
+  those changes.
+* **`GetEppResourceCommand`** -- Gets individual EPP resources from the server
+  and outputs them.
+* **`ListObjectsCommand`** -- Lists all objects of a specific type from the
+  server and outputs them.
+* **`MutatingCommand`** -- Provides a facility to create or update entities in
+  Datastore, and uses a diff algorithm to display the changes that will be
+  made before committing them.
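The `ConfirmingCommand` flow described above (print what will happen, ask for a y/N confirmation, then execute) can be sketched in shell; `confirm_and_run` is our own illustrative name, not project code:

```shell
# Sketch of the ConfirmingCommand pattern: emit the prompt() message, ask
# for confirmation, then run execute(). Purely illustrative.
confirm_and_run() {
  prompt_msg="$1"
  shift
  echo "$prompt_msg"
  # Prompt goes to stderr so it doesn't mix with the command's output.
  printf 'Perform this command? (y/N): ' >&2
  read -r answer
  case "$answer" in
    y|Y) "$@" ;;
    *) echo 'Command aborted.' ;;
  esac
}

# Example: echo y | confirm_and_run 'Will update 1 entity.' echo 'Updated 1 entities.'
```

Anything other than `y` aborts, mirroring the `Perform this command? (y/N):` transcripts shown earlier in this guide.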