It seems that even though the token is supposed to be valid for 60min, in
practice it expires before that. Reducing the caching time to 30min solves the
problem (at least as far as I can tell). This should not add much load, as we
are only calling the API twice an hour instead of once.
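A minimal sketch of the intended behavior, using Guava's memoizing supplier
(fetchAuthToken() here is a hypothetical stand-in for the real API call):
---
import com.google.common.base.Supplier;
import com.google.common.base.Suppliers;
import java.util.concurrent.TimeUnit;

// Cache the token for 30 minutes rather than the advertised 60.
private final Supplier<String> cachedToken =
    Suppliers.memoizeWithExpiration(this::fetchAuthToken, 30, TimeUnit.MINUTES);
---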
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=186830395
The RDAP Pilot Program operational profile document indicates that domain
responses should list, in addition to their normal contacts, a special entity
for the registrar.
1.5.12. The domain object in the RDAP response MUST contain an entity with the registrar role (called registrar entity in this section). The handle of the entity MUST be equal to the IANA Registrar ID. A valid fn member MUST be present in the registrar entity. Other members MAY be present in the entity (as specified in RFC6350, the vCard Format Specification and its corresponding JSON mapping RFC7095). Contracted parties MUST include an entity with the abuse role (called Abuse Entity in this section) within the registrar entity. The Abuse Entity MUST include tel and email members, and MAY include other members.
1.5.13. The entity with the registrar role in the RDAP response MUST contain a publicIDs member [RFC7483] to identify the IANA Registrar ID from the IANA’s Registrar ID registry (https://www.iana.org/assignments/registrar-ids/registrar-ids.xhtml). The type value of the publicID object MUST be equal to IANA Registrar ID.
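For illustration only, a registrar entity along these lines would satisfy 1.5.12 and 1.5.13 (the handle, fn, and abuse contact values are made up):
---
{
  "objectClassName": "entity",
  "handle": "292",
  "roles": ["registrar"],
  "publicIds": [{"type": "IANA Registrar ID", "identifier": "292"}],
  "vcardArray": ["vcard", [
    ["version", {}, "text", "4.0"],
    ["fn", {}, "text", "Example Registrar, Inc."]
  ]],
  "entities": [{
    "objectClassName": "entity",
    "roles": ["abuse"],
    "vcardArray": ["vcard", [
      ["version", {}, "text", "4.0"],
      ["fn", {}, "text", "Abuse Desk"],
      ["tel", {"type": "voice"}, "uri", "tel:+1.5555550100"],
      ["email", {}, "text", "abuse@registrar.example"]
    ]]
  }]
}
---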
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=186797360
I'm actually surprised that we had this in our code, as it seems like a huge
oversight, but we were individually loading each and every referenced contact
and host during domain/application create/update/allocate flows. Loading them
all as a single batch should reduce round trips to Datastore by a good deal,
thus improving performance.
We aren't giving up any transactional consistency in doing so, and the only
potential downside I can think of is that we're always loading all contacts/
hosts instead of only some of them, in the rare case that one of the earlier
contacts/hosts is actually in pending delete (which previously allowed
short-circuiting). However, the gains from making only one round trip should
swamp the occasional cost of loading more data than is strictly necessary.
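A sketch of the change, assuming Objectify-style loading as used throughout the
codebase (names illustrative):
---
// Before: one Datastore round trip per referenced resource.
for (Key<ContactResource> key : contactKeys) {
  ContactResource contact = ofy().load().key(key).now();
  // ... validate contact ...
}

// After: all referenced contacts/hosts fetched in a single batched round trip.
Map<Key<ContactResource>, ContactResource> contacts = ofy().load().keys(contactKeys);
---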
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=186656118
Currently we validate the fee extension by summing up all fees present in the extension and comparing it against the total fee to be charged. While this works in most cases, we'd like the ability to validate each fee individually. This is especially useful during EAP, when two fees are charged: a regular "create" fee that would also be the amount we charge during renewal, and a one-time "EAP" fee.
Because we can only distinguish fees by their descriptions, we try to match the description to the format string of the fee type enums. We also only require individual fee matches when we are charging more than one type of fee, which makes the change compatible with most existing use cases, where only one fee is charged and the description field in the extension is ignored.
We expect the workflow to be that a registrar sends a domain check, we reply with exactly the fees we are expecting, and the registrar then uses the descriptions in the response to send us a domain create with the correct fees.
Note that we aggregate fees of the same FeeType together. Normally there will only be one fee per type, but with custom logic there could be more than one fee for the same type. There is no way to distinguish them, as they all use the same description, so it is simpler to just aggregate them.
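A rough sketch of the description matching (getFormat() is a hypothetical accessor for the enum's format string):
---
// Match a fee's free-form description to a FeeType; empty if nothing matches.
static Optional<FeeType> matchFeeType(String description) {
  for (FeeType type : FeeType.values()) {
    if (Ascii.toLowerCase(description).contains(Ascii.toLowerCase(type.getFormat()))) {
      return Optional.of(type);
    }
  }
  return Optional.empty();
}
---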
This CL also includes some reformatting that conforms to google-java-format output.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=186530316
The unlimited exponential backoff makes cascading failure a serious problem
when encountering burst DNS load. Originally the backoff was exponential, with
a minimum of 1 second and a maximum of 1 hour.
This changes it to scale linearly from 30 seconds to 10 minutes. The 30-second
minimum avoids over-retrying due to lock contention, and the 10-minute maximum
allows for more retries within our 1-hour SLA. Finally, we're switching to
linear scaling to increase the number of 'quick' retries at low backoff times,
before ultimately settling on the upper bound of 10 minutes (if a task ever
gets to that point, it's probably misconfigured).
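The new schedule, sketched (retry plumbing elided):
---
// Linear backoff: 30s, 60s, 90s, ... capped at 10 minutes.
static Duration backoff(int attempt) {  // attempt starts at 1
  return Duration.standardSeconds(Math.min(30L * attempt, 600L));
}
---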
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=186041553
Added:
- dns/update_latency, which measures the time from when a DNS update was added to the pull queue until that update is committed to the DnsWriter
  - It doesn't check that, after being committed, the update was actually published in the DNS.
- dns/publish_queue_delay, which measures the time from the initial insertion into the push queue until a publishDnsUpdate action was handled. It measures both successes (which are what we care about) and the various failures (which matter because the success delay for that publishDnsUpdate will be greater than that of any of the previous failures)
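Illustrative computation of dns/update_latency for a single task (the metric
object and label values are hypothetical stand-ins):
---
// enqueuedTime comes from the task payload; clock is the injected Clock.
Duration updateLatency = new Duration(enqueuedTime, clock.nowUtc());
updateLatencyMetric.record(updateLatency.getMillis(), tld, status);
---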
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185995678
The START_DATE_SUNRISE phase allows registration of domains only with a signed mark. In all other respects, it is identical to the GENERAL_AVAILABILITY phase.
Note that Anchor Tenants bypass all checks, and are hence able to register domains without a signed mark.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185534793
The task-queue API only allows reading 1000 tasks at a time, hence the original reason for this limit. We get around that limit by reading (and processing) items from the queue in a loop, 1000 at a time.
This is important because the 1000 dns-updates are shared among all TLDs,
meaning that a TLD with >1000 waiting updates can affect the update latency of
other TLDs.
In addition, this partially fixes the bug where more than 1000 updates to
paused / non-existent TLDs completely block all updates to all TLDs.
By partially fixed, I mean: if we have around 1000 updates to paused TLDs, we will read them every time ReadDnsUpdates is called, ignore them, and only then get to the actual updates we want to process.
This works when on the order of 1000 updates are waiting, but if paused TLDs have tens or hundreds of thousands of updates waiting, this might still choke other TLDs (not to mention that we would keep reading / updating tens or hundreds of thousands of tasks in the queue, which is... bad).
A more thorough fix will come in a future CL, as it requires a more thorough change in the code.
Note that the queue lease command supports a maximum of 10 QPS. Any more than
that and we get errors / empty results. Hence we limit our QPS to 9 to be on
the safe side.
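A sketch of the loop, using the App Engine task queue API with a Guava
RateLimiter to stay under the lease QPS cap (queue name and lease period are
illustrative):
---
Queue queue = QueueFactory.getQueue("dns-pull");
RateLimiter leaseLimiter = RateLimiter.create(9.0);  // stay under the 10 QPS cap
List<TaskHandle> batch;
do {
  leaseLimiter.acquire();
  batch = queue.leaseTasks(
      LeaseOptions.Builder.withLeasePeriod(20, TimeUnit.MINUTES).countLimit(1000));
  processLeasedTasks(batch);  // hypothetical: dispatch and delete the leased tasks
} while (batch.size() == 1000);  // a full batch means there may be more waiting
---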
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185218684
When a quota request is rejected, increment the metric counter by one.
Also makes both frontend and backend metrics singletons, because all of their fields are static.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185146804
SoySyntaxException is an abstract exception type and is never even declared to be thrown (all such declarations were changed about 2 years ago). So places catching it should either change to catch SoyCompilationException, or do nothing and let it propagate.
Tested:
TAP sample presubmit queue
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185050724
The quota handler terminates connections when quota is exceeded.
The next CL will add instrumentation for quota related metrics.
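A bare-bones sketch of such a handler in Netty (the quota API shown is a
placeholder, not the actual interface):
---
/** Terminates the connection when the quota check fails. Sketch only. */
class QuotaHandler extends ChannelInboundHandlerAdapter {
  private final QuotaManager quotaManager;

  QuotaHandler(QuotaManager quotaManager) {
    this.quotaManager = quotaManager;
  }

  @Override
  public void channelRead(ChannelHandlerContext ctx, Object msg) {
    if (!quotaManager.tryAcquire(ctx.channel())) {  // hypothetical method
      ctx.close();  // quota exceeded: terminate the connection
      return;
    }
    ctx.fireChannelRead(msg);
  }
}
---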
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185042675
Changes the code to be in compliance with the RDAP Pilot Profile document,
which specifies:
1.4.11. If permitted or required by an ICANN agreement provision, waiver, or Consensus Policy, an RDAP response may contain redacted registrant, administrative, technical and/or other contact information. If any information is redacted, the response MUST include a remarks member with title "Data Policy", type "object truncated due to authorization", a description containing the string "Some of the data in this object has been removed" and a links member with the elements rel:alternate and href indicating where the data policy can be found. An entity with redacted information MUST include the "removed" value in the status element.
We were using the "removed" status to indicate deleted contacts and inactive
registrars. Instead, we will now use "inactive", so that we can use "removed"
to indicate redaction.
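For illustration, a redacted entity would carry something like the following
(the href is a placeholder):
---
"status": ["removed"],
"remarks": [{
  "title": "Data Policy",
  "type": "object truncated due to authorization",
  "description": ["Some of the data in this object has been removed."],
  "links": [{
    "rel": "alternate",
    "type": "text/html",
    "href": "https://www.example.com/data-policy"
  }]
}]
---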
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185039201
Every time you run nomulus tool you currently get a bunch of useless output
to the console that looks like this:
---
Feb 08, 2018 3:11:18 PM google.registry.config.YamlUtils mergeYaml
INFO: Successfully loaded environment configuration YAML file.
Feb 08, 2018 3:11:20 PM com.google.wrappers.base.GoogleInit logArgs
INFO: First call to GoogleInit.initialize - removeFlags: false, args: [ProcessUtils, --noinstall_signal_handlers]
Feb 08, 2018 3:11:20 PM com.google.wrappers.base.GoogleInit logArgs
INFO: Subsequent call to GoogleInit.initialize, ignoring - removeFlags: false, args: [SecureWrapperBindings (via google.registry.tools.RegistryTool), --noinstall_signal_handlers]
Feb 08, 2018 3:11:25 PM com.google.monitoring.metrics.MetricRegistryImpl newIncrementableMetric
INFO: Registered new counter: /lock/acquire_lock_requests
Feb 08, 2018 3:11:25 PM com.google.monitoring.metrics.MetricRegistryImpl newEventMetric
INFO: Registered new event metric: /lock/lock_duration
---
This CL fixes that by increasing the console logging threshold from INFO to
WARNING for the relevant paths, for nomulus tool only.
I also had to decrease the logging level of one statement inside YamlUtils
from INFO to FINE, because it was being called by AppEngineConnectionFlags'
constructor in building the HostAndPort server field, which is executed
from the first line of RegistryCli.runCommand(), whereas
loggingParams.configureLogging(), which actually reads in and takes action
on the logging.properties file, isn't called until much later. This is fine,
though, because there's little value in logging "Successfully loaded
environment configuration YAML file." every time a command or flow is
executed. We certainly do log errors if loading ever fails, which is the
important part.
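In logging.properties terms, the change amounts to something like this (exact
entries are illustrative):
---
# nomulus tool: only warnings and above reach the console.
java.util.logging.ConsoleHandler.level = WARNING
# The YamlUtils success message now logs at FINE, so it is suppressed by default.
google.registry.config.YamlUtils.level = INFO
---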
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185036329
The higher the number, the better for serious launches. These used to be 100
but had been detuned because instances weren't dying correctly when no longer
needed, thus contributing to higher costs than necessary. That problem was
fixed when we migrated to the Java 8 runtime, however, so there's no reason
not to use the higher number.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=184742738
In publishDomain, we load the subordinate hosts of the domain from Datastore and compare the domain's nameservers to them. For any nameserver that is in-bailiwick, we call publishSubordinateHost on it and stage the A/AAAA records of the host for publication.
This is superior to the old approach, which used hostName.endsWith(domainName) to check whether a nameserver is in-bailiwick, because that mistakes ns.another-example.tld for a subordinate host of example.tld. It is also better than checking hostName.endsWith("." + domainName), which avoids such false positives but falls short in a corner case where the nameserver has been deleted before its superordinate domain's record is updated. In that case, subordinateHosts.contains(hostName) will be false but hostName.endsWith("." + domainName) will still be true.
Note that we still use the suffix check in filterGlueRecords because it is filtering on existing records from Cloud DNS. It is even advantageous to do so because if there were (and there shouldn't be if everything is consistent) any orphaned glue records (suffix matches to the domain, but not actually in its subordinate host list), they would be retained by the filter and therefore be deleted when the staged changes are committed.
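The three checks side by side (hostName and domainName are plain strings here):
---
// Authoritative: only hosts actually subordinate to this domain qualify.
boolean inBailiwick = domain.getSubordinateHosts().contains(hostName);

// Suffix checks are only approximations:
hostName.endsWith(domainName);        // false positive: "ns.another-example.tld"
hostName.endsWith("." + domainName);  // stale positive: host deleted, domain record not yet updated
---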
Also fixed a few tests that should have failed had we checked subordinate hosts...
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=184732005
See [] for more information.
Created with the tools in []
Tested:
TAP --sample for global presubmit queue
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=184727400
It's been long enough since the format change adding in years that all
registrars should no longer have any IDs in the old format lying around
that they're still attempting to ACK. All poll messages have already been
coming back to registrars with the new format for months now.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=184714735
Previously, CloudDnsWriter used InetAddress.toString() to produce the IPv4/IPv6
address string (e.g. 127.0.0.1 or 0:0:0:0:0:0:0:1) used as an argument to the
Cloud DNS API. However, this fails because InetAddress.toString() uses the
format "hostName/ipAddress", with the empty string as the host name if
unspecified. This resulted in the erroneous use of a prefixed slash (e.g.
"/127.0.0.1") as an address argument, causing all glue record updates to
fail.
This change replaces InetAddress.toString() with InetAddress.getHostAddress(),
which properly generates the IP address for the InetAddress. This also replaces
a lot of logic in the corresponding test with concrete equivalents, preventing
obvious errors like this from creeping up on us in the future.
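The difference in a nutshell:
---
InetAddress addr = InetAddress.getByName("127.0.0.1");
addr.toString();        // "/127.0.0.1" -- "hostName/ipAddress" with an empty host name
addr.getHostAddress();  // "127.0.0.1"  -- just the IP address, as Cloud DNS expects
---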
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=184708896
Now that we've verified the new Beam billing pipeline works, we can delete the
old manual commands we used to use.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=184707182
The DS records consist of 4 values:
- keyTag: unsigned short (2 bytes)
- alg: unsigned byte
- digestType: unsigned byte
- digest: binary hex
NOTE: the current CL doesn't support keyData, either as the optional field in dsData or as a replacement for dsData.
The command tool accepts DS records as a string, where the 4 values are given
as one string separated by white-spaces as follows:
<keyTag> <alg> <digestType> <digest>
e.g. something like:
60485 5 2 D4B7D520E7BB5F0F67674A0CCEB1E3E0614B93C4F9E99B8383F6A1E4469DA50A
which is how it's written in Zone files, allowing easy copy-paste from existing values.
(Using commas as separators was rejected, since values copied from zone files are space-separated.)
The various "numbers" (keyTag, alg, digestType) are only checked that they are
positive integers - the rest is left for the server.
digest it checked to be an even-lengthed hex string.
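A sketch of the parsing and validation described above (DsRecord is a
hypothetical value class):
---
// Parse "<keyTag> <alg> <digestType> <digest>", e.g. a line copied from a zone file.
static DsRecord parse(String input) {
  List<String> parts =
      Splitter.on(CharMatcher.whitespace()).omitEmptyStrings().splitToList(input);
  checkArgument(parts.size() == 4, "Expected 4 whitespace-separated values: %s", input);
  int keyTag = Integer.parseInt(parts.get(0));
  int alg = Integer.parseInt(parts.get(1));
  int digestType = Integer.parseInt(parts.get(2));
  String digest = parts.get(3);
  checkArgument(keyTag >= 0 && alg >= 0 && digestType >= 0, "Values must be positive");
  checkArgument(
      digest.length() % 2 == 0 && digest.matches("[0-9A-Fa-f]+"),
      "Digest must be an even-length hex string: %s",
      digest);
  return new DsRecord(keyTag, alg, digestType, digest);
}
---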
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=184583068
Also changed the version checker tuple from strings to ints, so that 0.10.0 compares as larger than 0.4.2.
I think we should just get rid of the version checker altogether. It still requires 0.4.2 as the minimal Bazel version, which most likely will not work with Nomulus at this point.
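The underlying issue, sketched in Java for brevity (the real checker lives in
the Bazel tooling):
---
// Lexicographically, "0.10.0" < "0.4.2" because '1' < '4'; comparing the parts
// as ints gives the right answer.
static int compareVersions(String a, String b) {
  String[] as = a.split("\\."), bs = b.split("\\.");
  for (int i = 0; i < Math.max(as.length, bs.length); i++) {
    int ai = i < as.length ? Integer.parseInt(as[i]) : 0;
    int bi = i < bs.length ? Integer.parseInt(bs[i]) : 0;
    if (ai != bi) {
      return Integer.compare(ai, bi);
    }
  }
  return 0;
}
---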
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=184536748
Rosie CL for []/third_party (local approval/rejection).
[]
b/71392935
Tested:
TAP --sample for global presubmit queue
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=184412611
When enabled for a registrar, all EPP operations on premium domains that have
costs (e.g. creates, renews, transfers) will fail unless the EPP fee extension
is used to explicitly ack the fee amount as part of the EPP transaction.
This ack is required regardless of whether premium fee acking is required at
the registry level. No data migration is necessary since false is the desired
default for this new attribute.
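For example, a create on a premium domain would need to carry something like
this in its <extension> block (the amount and fee extension version are
illustrative):
---
<fee:create xmlns:fee="urn:ietf:params:xml:ns:fee-0.6">
  <fee:currency>USD</fee:currency>
  <fee:fee>100.00</fee:fee>
</fee:create>
---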
This CL also contains some slight refactoring of the static utility methods used
to perform fee verification; there was short-circuiting at call sites in two
places when what was really needed was two methods, one implementing additional
functionality on top of the other, with the inner method called in the places
where short-circuiting had previously been necessary.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=184229363
"keepTasks" is a flag that prevents ReadDnsQueueAction from removing dns-update
tasks from the dns-pull queue, while still launching PublishDnsUpdates tasks to
update the DNS (meaning these tasks will be processed again in the next
ReadDnsQueueAction).
I'm not sure what the purpose of this flag is, but given that we now allow
multiple writers (meaning we can already publish the same DNS update multiple
times) and that we can now recover from a bad writer (if a writer doesn't
belong to a TLD, we put the dns-updates queued for that writer back into the
dns-pull queue), I suspect we don't need it anymore.
Alternative considered: changing this to a "dryRun" flag that wouldn't actually
launch PublishDnsUpdates tasks, but would log which tasks it would have
launched. Decided against it because we would still need to "own" any task for
a significant amount of time if there are many (tens of thousands of) tasks in
the queue, so a "dryRun" would still affect any actual runs for some time.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=183997187
This is a follow-up to []
Also added jaxws-api Maven dependency and upgraded activation artifacts to 1.2.0, in parity with //third_party/java/activation.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=183714304
There are 2 types of changes done here:
- reorder the existing cron jobs to be in the same order as production (for
easier diffing)
- add missing cron-jobs to either alpha or sandbox
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=183232936
This moves the default yearMonth logic into a common ReportingModule, rather than the coarse-scoped BackendModule, which may not want the default parameter extraction logic. It also moves the 'yearMonth' parameter constant to the common package in which it's used. This provides a basis for future consolidation of the ReportingEmailUtils and BillingEmailUtils classes, which have modest overlap.
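A sketch of the provided binding, assuming Dagger and the codebase's
request-parameter helpers (exact signatures may differ):
---
@Module
public final class ReportingModule {
  public static final String PARAM_YEAR_MONTH = "yearMonth";

  /** Provides the yearMonth parameter, defaulting to the previous month if absent. */
  @Provides
  @Parameter(PARAM_YEAR_MONTH)
  static YearMonth provideYearMonth(HttpServletRequest req, Clock clock) {
    return RequestParameters.extractOptionalParameter(req, PARAM_YEAR_MONTH)
        .map(YearMonth::parse)
        .orElseGet(() -> new YearMonth(clock.nowUtc().minusMonths(1)));
  }
}
---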
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=183130311
This uses an extensibility mechanism similar to that of WhoisCommandFactory
and CustomLogicFactory, namely, that a fully qualified Java class is
specified in the YAML file for each environment with the allocation token
custom logic to be used. By default, this points to a no-op base class
that does nothing. Users that wish to add their own allocation token
custom logic can simply create a new class that extends
AllocationTokenCustomLogic and then configure it in their .yaml config
files.
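A user extension would look roughly like this, with the fully qualified class
name then set in the environment's .yaml config file (the overridden hook is
illustrative; see the base class for the actual methods):
---
/** Rejects tokens that don't match some registry-specific rule. Sketch only. */
public class MyAllocationTokenCustomLogic extends AllocationTokenCustomLogic {
  @Override
  public AllocationToken verifyToken(DomainCommand.Create command, AllocationToken token)
      throws EppException {
    // Custom validation goes here; throw an EppException to reject the token.
    return token;
  }
}
---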
This also renames the existing *FlowCustomLogic *Flow instance variables
from customLogic to flowCustomLogic, to avoid the potential confusion with
the new AllocationTokenCustomLogic class that also now exists.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=183003112
NotModifiedException was using HttpServletResponse.SC_NOT_FOUND instead of SC_NOT_MODIFIED (likely an autocomplete typo).
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182976671
To make the FOSS build compile, third_party vendoring rules for JAXB are added to package all JAXB-related targets imported from Maven into an uber jar, mirroring the same practice done in //third_party/java/jaxb.
Cloned from CL 182666460 by 'g4 patch'.
Original change by cushon@cushon:rosie182283995-0071_Rosie:47348:citc on 2018/01/20 13:36:15.
More information:
https://docs.google.com/document/d/1htErgDIoHMEuMBfGwrtS_O4WwhTw8QOGLva-7aYYvYs/edit?usp=sharing
Tested:
TAP --sample for global presubmit queue
[] passed FOSS test
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182855173
This also updates to a newer version of Closure Rules and fixes a protobuf dep
compile issue.
Full description of the change:
Soy supports 2 kinds of loops:
* foreach- for iterating over items in a collection, e.g.
{foreach $item in $list}...{/foreach}
* for - for indexed iteration, e.g. {for $i in range(0, 10)}...{/for}
The reason Soy has two different loops is an accident of history; Soy didn't
use to have a proper grammar for expressions, so the alternate 'for...range'
syntax was added to make it possible to write indexed loops. As the grammar
has improved, having two syntaxes is no longer necessary, so we are
eliminating one of them.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182843207
This fixes up the following problems:
1. Using string concatenation instead of the formatting variant methods.
2. Logging or swallowing exception messages without logging the exception
itself (this swallows the stack trace).
3. Unnecessary logging on re-thrown exceptions.
4. Use of formatting variant methods when not necessary.
5. Complicated logging statements involving significant processing not being
wrapped inside of a logging level check.
6. Redundant logging both of an exception itself and its message (this is
unnecessary duplication).
7. Use of the base Logger class instead of our FormattingLogger class.
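Items 1 and 2, illustrated with FormattingLogger's formatting variants (treat
as a sketch):
---
// Bad: string concatenation, and the stack trace is swallowed.
logger.severe("Failed to save " + id + ": " + e.getMessage());

// Good: formatting variant, with the exception itself passed along.
logger.severefmt(e, "Failed to save %s", id);
---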
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182419837
Having a short maximum lock duration doesn't affect lock performance, since
the lock is only in use while the command is running anyway (which doesn't
depend on the maximum lock duration).
It only affects the behavior if the command's running time is longer than the
maximum lock duration. If that happens, the command will fail, retry, and fail
again forever.
This may be a leftover from the old code, where publishDnsUpdates itself read
the domains from the pull queue and published them, meaning that killing the
command wouldn't undo the work already done.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182255446
This allows grouping metrics based on the DnsWriter. We can already group by
the TLD, but since a TLD can have multiple writers, and since different writers
perform very differently from one another, it could be important to group by
writer as well.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182255398
Trying to debug the 20s delay in requests, it would help to know whether the
delay happens before or after our code is called.
Right now all we know is that the delay happens before our first logging line,
which is in RequestAuthenticator.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182211285
Remove the hack that allows us to use JSch with Java 7 on App Engine -
basically, we have a modified version of JSch that lets us attach a
GAE-friendly thread factory and use that for all JSch threads.
TESTED: Verified that RDE SFTP uploads still work on alpha.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182208225
The TokenStore is configured by a QuotaConfig for a protocol (EPP/WHOIS). It accepts concurrent take, put, and refresh requests, granting tokens to and accepting tokens back from callers.
The QuotaManager contains a TokenStore and provides abstractions that are appropriate for a quota leasing entity to use. Quota return calls are executed asynchronously by the QuotaManager, and quota refresh tasks are scheduled by the QuotaManager to run periodically.
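Roughly, the shape of the two abstractions (method names here are assumptions,
not the actual API):
---
/** Grants and reclaims tokens per the QuotaConfig; safe for concurrent use. Sketch only. */
interface TokenStore {
  boolean take(String userId);  // lease a token; false if the quota is exhausted
  void put(String userId);      // return a previously leased token
  void refresh();               // scheduled periodically to restore tokens
}

/** Wraps a TokenStore, returning quota asynchronously and scheduling refreshes. */
final class QuotaManager {
  private final TokenStore store;
  private final ScheduledExecutorService executor;

  QuotaManager(TokenStore store, ScheduledExecutorService executor) {
    this.store = store;
    this.executor = executor;
    executor.scheduleAtFixedRate(store::refresh, 0, 1, TimeUnit.MINUTES);
  }

  boolean acquireQuota(String userId) {
    return store.take(userId);
  }

  void returnQuota(String userId) {
    executor.execute(() -> store.put(userId));  // quota returns are asynchronous
  }
}
---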
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182109341
This CL also fixes a bug: registrars were returned in an arbitrary order, which caused cursor-based pagination to fail. Now we always sort by registrar name (even for handle searches) and use the registrar name in the cursor, to ensure proper behavior.
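The cursor pattern, sketched with an Objectify-style query (field name
illustrative):
---
// A stable sort plus a cursor on the same field makes pagination deterministic.
Query<Registrar> query = ofy().load().type(Registrar.class).order("registrarName");
if (cursor.isPresent()) {
  query = query.filter("registrarName >", cursor.get());
}
List<Registrar> page = query.limit(pageSize).list();
---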
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182098187