Commit graph

27 commits

Author SHA1 Message Date
guyben
9d2b1e7572 Consolidate all Set parameter parsing
Currently, we have two different ways to parse a "set" parameter:
key=value1&key=value2&key=value3...
and
keys=value1,value2,value3

This is error prone for several reasons:
- different parts of the code must be "synchronized" to use the same style (the
  place that creates the request, and the place that parses the request)
- for the key=value1&key=value2, we often use the same key name for the single
  value and the set value. This can result in subtle bugs where part of the
  code will successfully read the key assuming there's only one key (and will
  get the first key=value1, ignoring the rest)

Here we transition everything to the keys=value1,value2,value3 method. This one
was chosen because:
- it's shorter
- it's more intuitive for users
- the key name is plural, differentiating it from the singular key=value that
  other requests might need

-----------------------------------

To make sure there are not "transition issues", we will continue to support
(with warnings) the key=value1&key=value2 parameter parsing until we're sure we
haven't forgotten to update any part of the code.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=198810681
2018-06-06 15:04:02 -04:00
jianglai
70b13596e4 Migrate to Flogger (green)
This is a 'green' Flogger migration CL. Green CLs are intended to be as
safe as possible and should be easy to review and submit.

No changes should be necessary to the code itself prior to submission,
but small changes to BUILD files may be required.

Changes within files are completely independent of each other, so this CL
can be safely split up for review using tools such as Rosie.

For more information, see []
Base CL: 197826149

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=198560170
2018-05-30 12:18:54 -04:00
jianglai
fc60890136 Migrate to internal FormattingLogger in preparation of migration to Flogger
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=197744904
2018-05-30 12:18:54 -04:00
guyben
89d8ba93f2 Remove transition code from []
The parameters were optional during the transition to allow old jobs stuck in the queue to work properly. It's been 2 months now so it's time to end the transition.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=190235532
2018-04-02 16:35:20 -04:00
larryruili
fa989e754b Add sharded DNS publishing capability
This enables sharded DNS publishing on a per-TLD basis. Instead of a TLD-wide lock, the sharded scheme locks each update on the shard number, allowing parallel writes to DNS.

We allow N (the number of shards) to be 0 or 1 for no sharding, and N > 1 for an N-way sharding scheme. Unless explicitly set, all TLDs default to a numShards of 0, so we don't have to reload all registry objects explicitly.

WARNING: This will change the lock name upon deployment for the PublishDnsAction from "<TLD> Dns Updates" to "<TLD> Dns Updates shard 0". This may cause concurrency issues if the underlying DNSWriter is not parallel-write tolerant (currently all production usages are ZonemanWriter, which is parallel-tolerant, so no issues are expected).

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=187525655
2018-03-06 19:14:26 -05:00
guyben
6e4b2bd6a8 Add metric for DNS UPDATE latency
Added:
- dns/update_latency, which measures the time since a DNS update was added to the pull queue until that updates is committed to the DnsWriter
- - It doesn't check that after being committed, it was actually published in the DNS.

- dns/publish_queue_delay, which measures how long since the initial insertion to the push queue until a publishDnsUpdate action was handled. It measures both for successes (which is what we care about) and various failures (which are important because the success for that publishDnsUpdate will be > than any of the previous failures)

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185995678
2018-02-20 15:54:15 -05:00
guyben
bba975a991 Allow over 1000 dns-updates to be handled at once
The task-queue API only allows reading 1000 tasks at a time, hence the original reason for this limit. We get over that limit by reading (and processing) items from the queue in a loop - 1000 at a time.

This is important because the 1000 dns-updates are shared among all TLDs,
meaning that a TLD with >1000 waiting updates can affect the update latency of
other TLDs.

In addition, partially fixes the bug where if there are more than 1000 updates to paused
/ non-existing TLDs, we completely block all updated to all TLDs.

By partially fixed, I mean "if we have around 1000 updates to paused TLDs, we will read them every time ReadDnsUpdates is called, ignore then, and only then get to the actual updates we want to process".

This works for a number of 1000 updates waiting - but if paused TLDs have tens or hundreds of thousands of updates waiting - this might still choke up other TLDs (not to mention we keep reading / updating 10s or 100s of thousands of tasks in the queue, that's... bad.)

A more thorough fix will come in a future CL, as it requires a more thorough change in the code.

Note that the queue lease command supports a maximum of 10 QPS. Any more than
that - and we get errors / empty results. Hence we limit our QPS to 9 to be on
the safe side.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185218684
2018-02-20 15:42:09 -05:00
mcilwain
81dc2bbbc3 Rationalize logging statements across codebase
This fixes up the following problems:
1. Using string concatenation instead of the formatting variant methods.
2. Logging or swallowing exception messages without logging the exception
   itself (this swallows the stack trace).
3. Unnecessary logging on re-thrown exceptions.
4. Unnecessary use of formatting variant methods when not necessary.
5. Complicated logging statements involving significant processing not being
   wrapped inside of a logging level check.
6. Redundant logging both of an exception itself and its message (this is
   unnecessary duplication).
7. Use of the base Logger class instead of our FormattingLogger class.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182419837
2018-01-19 14:56:45 -05:00
guyben
bf321ca044 Add label for the DnsWriter in the publishDnsUpdates metrics
This allows grouping metrics based on the DnsWriter. We can already group by
the TLD, but since a TLD can have multiple writers, and since different writers
perform very differently from one another, it could be important to group by
writer as well.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182255398
2018-01-19 14:53:21 -05:00
guyben
8e33bc898f Requeue domains on wrong DnsWriter.
Currently, if for some reason publishDnsUpdates gets a request to publish
domains to a DnsWriter that doesn't belong to said domain - it logs a warning
but published anyway.

This can happen when Writers are changed (swapped for a different writer)
leaving update commands "stuck" with the wrong writer.

Normally you'd expect these update commands to just publish their data and be
on their way. However, if the update fails for some reason (likely - if the
Writer change happened BECAUSE the updates are failing) then the same
publishDnsUpdate command will continue to run forever.

This CL changes the behavior for "publish to wrong DnsWriter" to instead
requeue the batched domains / hosts back to the Dns-pull queue, allowing them
to be re-batched (and hence published) with the correct DnsWriter(s). This
re-batching will take place in ReadDnsQueueAction.java

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=177863076
2017-12-13 12:43:45 -05:00
guyben
d577a281b8 Add stackdriver metrics to publishDnsUpdates
Adding the following metrics:

- how long does an update take, per TLD
- number of domains published, per TLD
- number of hosts published, per TLD

All are distributions.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=172933834
2017-10-24 16:53:47 -04:00
guyben
c3861f6e95 Swap all uses of Lock to LockHandler
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=167661348
2017-09-12 15:51:50 -04:00
guyben
d5ac03aae4 Make DnsWriter truly atomic
Right now - if there's an error during DnsWriter.publish*, all the publish from
before that error will be committed, while all the publish after that error
will not.

More than that - in some writers partial publishes can be committed, depending
on implementation.

This defines a new contract that publish* are only committed when .commit is
called. That way any error will simply mean no publish is committed.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=165708063
2017-08-29 16:40:07 -04:00
mcilwain
2a29ada032 Allow multiple DNS writers on TLDs
This completes the data/functionality migration for multiple DNS writers.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=163835077
2017-08-01 17:10:33 -04:00
guyben
aee4f7acc2 Remove queueing from Lock
It was buggy (didn't work) and was never actually used.

Why never actually used: for it to be used executeWithLock has to be called
with different requesters on the same lockId. That never happend in the code.

How it was buggy: Logically, the queue is deleted on release of the lock (meaning it was
meaningless the only time it mattered - when the lock isn't taken). In
addition, a different bug meant that having items in the queue prevented the
lock from being released forcing all other tasks to have to wait for lock
timeout even if the task that acquired the lock is long done.

Alternative: fix the queue. This would mean we don't want to delete the lock on release (since we want to keep the queue). Instead, we resave the same lock with expiration date being START_OF_TIME. In addition - we need to fix the .equals used to determine if the lock the same as the acquired lock - instead use some isSame function that ignores the queue.

Note: the queue is dangerous! An item (calling class / action) in the first place of a queue means no other calling class can get that lock. Everything is waiting for the first calling class to be re-run - but that might take a long time (depending on that action's rerun policy) and even might never happen (if for some reason that action decided it was no longer needed without acquiring the lock) - causing all other actions to stall forever!

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=163705463
2017-08-01 17:06:20 -04:00
guyben
fa858ac5cf Remove unneeded "requester" from publishDnsUpdates locking
This is a quick fix we can hopefully get out fast before fixing the underlying problem.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=163485468
2017-08-01 17:04:56 -04:00
mcilwain
4a921973ea Add capability to sync DNS using multiple writers if configured
This is written in such a way that it can safely handle task items in the
old format so long as the DNS writer to use for the given TLD is unambiguous
(which it is for now, until we allow multiple DNS writers to be configured).

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=162293412
2017-08-01 16:38:36 -04:00
guyben
e224a67eda Change @Auth to an AutoValue, and created a set of predefined Auths
We want to be safer and more explicit about the authentication needed by the many actions that exist.

As such, we make the 'auth' parameter required in @Action (so it's always clear who can run a specific action) and we replace the @Auth with an enum so that only pre-approved configurations that are aptly named and documented can be used.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=162210306
2017-08-01 16:33:10 -04:00
mmuller
b70f57b7c7 Update copyright year on all license headers
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=146111211
2017-02-02 16:27:22 -05:00
mcilwain
eaec03e670 Move ConfigModule and LocalTestConfig into RegistryConfig
This is the final preparatory step necessary in order to load and load
configuration from YAML in a static context and then provide it either via
Dagger (using ConfigModule) or through RegistryConfig's existing static
functions.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=143819983
2017-01-09 12:01:09 -05:00
shikhman
f76bc70f91 Preserve test logs and test summary output for Kokoro CI runs
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=135494972
2016-10-14 16:57:43 -04:00
shikhman
1c5b4e16c6 Add metrics to PublishDnsUpdatesAction
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=131190862
2016-08-26 09:43:30 -04:00
Greg Shikhman
a620d06239 Add dagger map for injecting DnsWriter implementations
This is one of several CLs in a sequence for allowing per-TLD DNS
implementations.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=129445641
2016-08-05 20:38:21 -04:00
Greg Shikhman
1ba739a6b6 Refactor DnsWriter into the model package
This is one of several CLs in order to support per-TLD DnsWriter
implementations, modeled on the work done for PremiumPricingEngine.

Since DnsWriters will be set inside the Registry object, the DnsWriter
interface definition needs to be moved to models to create minimal
dependency on the rest of the registry codebase to avoid cyclic
dependency.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=128711643
2016-08-02 19:10:49 -04:00
mcilwain
aa2f283f7c Convert entire project to strict lexicographical import sort ordering
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=127234970
2016-07-13 15:59:53 -04:00
Michael Muller
c458c05801 Rename Java packages to use the .google TLD
The dark lord Gosling designed the Java package naming system so that
ownership flows from the DNS system. Since we own the domain name
registry.google, it seems only appropriate that we should use
google.registry as our package name.
2016-05-13 20:04:42 -04:00
Justine Tunney
5012893c1d mv com/google/domain/registry google/registry
This change renames directories in preparation for the great package
rename. The repository is now in a broken state because the code
itself hasn't been updated. However this should ensure that git
correctly preserves history for each file.
2016-05-13 18:55:08 -04:00
Renamed from java/com/google/domain/registry/dns/PublishDnsUpdatesAction.java (Browse further)