Currently, we have two different ways to parse a "set" parameter:
key=value1&key=value2&key=value3...
and
keys=value1,value2,value3
This is error-prone for several reasons:
- different parts of the code must be "synchronized" to use the same style (the
place that creates the request, and the place that parses the request)
- for the key=value1&key=value2 style, we often use the same key name for the
single value and the set value. This can result in subtle bugs where part of the
code will successfully read the key assuming there's only one value (and will
get only the first key=value1, ignoring the rest)
Here we transition everything to the keys=value1,value2,value3 method. This one
was chosen because:
- it's shorter
- it's more intuitive for users
- the key name is plural, differentiating it from the singular key=value that
other requests might need
-----------------------------------
To make sure there are no "transition issues", we will continue to support
(with warnings) the key=value1&key=value2 parameter parsing until we're sure we
haven't forgotten to update any part of the code.
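As a rough illustration, a comma-delimited set parameter in the new style could
be parsed with a small helper like the sketch below. The helper name, the
example parameter name, and the use of Guava's Splitter are assumptions for
illustration, not the actual code:

    import com.google.common.base.Splitter;
    import com.google.common.collect.ImmutableSet;
    import javax.servlet.http.HttpServletRequest;

    final class SetParameterExample {
      /** Parses a plural, comma-delimited parameter (e.g. ?tlds=com,net,org) into a set. */
      static ImmutableSet<String> extractSetOfParameters(HttpServletRequest req, String name) {
        String value = req.getParameter(name);
        return value == null
            ? ImmutableSet.of()
            : ImmutableSet.copyOf(
                Splitter.on(',').trimResults().omitEmptyStrings().split(value));
      }
    }

A request like ?tlds=com,net,org would then yield the set {com, net, org},
while a missing parameter yields the empty set.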
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=198810681
This is a 'green' Flogger migration CL. Green CLs are intended to be as
safe as possible and should be easy to review and submit.
No changes should be necessary to the code itself prior to submission,
but small changes to BUILD files may be required.
Changes within files are completely independent of each other, so this CL
can be safely split up for review using tools such as Rosie.
For more information, see []
Base CL: 197826149
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=198560170
The parameters were optional during the transition to allow old jobs stuck in the queue to work properly. It's been 2 months now, so it's time to end the transition.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=190235532
This enables sharded DNS publishing on a per-TLD basis. Instead of a TLD-wide lock, the sharded scheme locks each update on the shard number, allowing parallel writes to DNS.
We allow N (the number of shards) to be 0 or 1 for no sharding, and N > 1 for an N-way sharding scheme. Unless explicitly set, all TLDs default to a numShards of 0, so we don't have to reload all registry objects explicitly.
WARNING: This will change the lock name upon deployment for the PublishDnsAction from "<TLD> Dns Updates" to "<TLD> Dns Updates shard 0". This may cause concurrency issues if the underlying DNSWriter is not parallel-write tolerant (currently all production usages are ZonemanWriter, which is parallel-tolerant, so no issues are expected).
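A minimal sketch of the lock-naming scheme described above; the method and
variable names are assumptions, and only the resulting name format is taken
from this description:

    final class ShardedDnsLockName {
      /** Returns the lock name for a DNS publish task on the given TLD and shard. */
      static String lockName(String tld, int numShards, int shard) {
        // numShards of 0 or 1 means "no sharding": everything maps to shard 0,
        // giving the "<TLD> Dns Updates shard 0" name mentioned in the warning.
        int effectiveShard = numShards <= 1 ? 0 : shard % numShards;
        return tld + " Dns Updates shard " + effectiveShard;
      }
    }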
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=187525655
Added:
- dns/update_latency, which measures the time from when a DNS update was added to the pull queue until that update is committed to the DnsWriter
  - It doesn't check that, after being committed, the update was actually published in the DNS.
- dns/publish_queue_delay, which measures how long it takes from the initial insertion into the push queue until a publishDnsUpdate action is handled. It is measured both for successes (which is what we care about) and for the various failures (which are important because the delay of the eventual success for that publishDnsUpdate will be greater than that of any of the preceding failures)
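For clarity, the update-latency measurement amounts to a duration between two
timestamps. The sketch below uses Joda-Time; the metric-recording call shown in
the comment is only a hypothetical placeholder, not the actual metrics API:

    import org.joda.time.DateTime;
    import org.joda.time.Duration;

    final class DnsUpdateLatencyExample {
      /** Time from when the update entered the pull queue until it was committed. */
      static Duration updateLatency(DateTime enqueueTime, DateTime commitTime) {
        return new Duration(enqueueTime, commitTime);
      }
      // In the action, something along these lines would record the distribution:
      //   recordLatencyDistribution(tld, updateLatency(enqueued, commitTime).getMillis());
      // where recordLatencyDistribution stands in for whatever metric recorder is used.
    }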
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185995678
The task-queue API only allows reading 1000 tasks at a time, which was the original reason for this limit. We get around that limit by reading (and processing) items from the queue in a loop, 1000 at a time.
This is important because the 1000 dns-updates are shared among all TLDs,
meaning that a TLD with >1000 waiting updates can affect the update latency of
other TLDs.
In addition, this partially fixes the bug where, if there are more than 1000 updates to paused
/ non-existent TLDs, we completely block all updates to all TLDs.
By partially fixed, I mean "if we have around 1000 updates to paused TLDs, we will read them every time ReadDnsUpdates is called, ignore them, and only then get to the actual updates we want to process".
This works when around 1000 updates are waiting - but if paused TLDs have tens or hundreds of thousands of updates waiting, this might still choke other TLDs (not to mention that we would keep reading / updating tens or hundreds of thousands of tasks in the queue, which is... bad).
A more thorough fix will come in a future CL, as it requires a more thorough change in the code.
Note that the queue lease command supports a maximum of 10 QPS. Any more than
that and we get errors / empty results, so we limit our QPS to 9 to be on the
safe side.
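A sketch of the lease-in-a-loop approach with the 9 QPS cap, using the App
Engine task-queue API and Guava's RateLimiter; the queue name and lease period
here are illustrative assumptions:

    import com.google.appengine.api.taskqueue.LeaseOptions;
    import com.google.appengine.api.taskqueue.Queue;
    import com.google.appengine.api.taskqueue.QueueFactory;
    import com.google.appengine.api.taskqueue.TaskHandle;
    import com.google.common.util.concurrent.RateLimiter;
    import java.util.List;
    import java.util.concurrent.TimeUnit;

    final class DnsQueueDrainExample {
      static void drainQueue() {
        Queue queue = QueueFactory.getQueue("dns-pull");     // queue name is an assumption
        RateLimiter rateLimiter = RateLimiter.create(9.0);   // stay under the 10 QPS lease limit
        List<TaskHandle> batch;
        do {
          rateLimiter.acquire();
          batch = queue.leaseTasks(
              LeaseOptions.Builder.withLeasePeriod(20, TimeUnit.MINUTES).countLimit(1000));
          // ... group the leased tasks by TLD and dispatch them for publishing ...
        } while (batch.size() == 1000);  // a short batch means the queue is (momentarily) drained
      }
    }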
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=185218684
This fixes up the following problems:
1. Using string concatenation instead of the formatting variant methods.
2. Logging or swallowing exception messages without logging the exception
itself (this swallows the stack trace).
3. Unnecessary logging on re-thrown exceptions.
4. Use of formatting variant methods when no formatting is actually necessary.
5. Complicated logging statements that involve significant processing and are
   not wrapped inside a logging level check.
6. Redundant logging both of an exception itself and its message (this is
unnecessary duplication).
7. Use of the base Logger class instead of our FormattingLogger class.
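By way of illustration only (using java.util.logging as a stand-in, since
FormattingLogger is internal), items 1, 2, and 6 above amount to preferring
patterns like these:

    import java.util.logging.Level;
    import java.util.logging.Logger;

    final class LoggingPatternExample {
      private static final Logger logger =
          Logger.getLogger(LoggingPatternExample.class.getName());

      static void handleFailure(String domainName, Exception e) {
        // Item 1: use a parameterized/formatting variant instead of string concatenation.
        logger.log(Level.INFO, "Publishing DNS updates for {0}", domainName);

        // Items 2 and 6: log the exception object itself (preserving the stack trace),
        // exactly once, rather than logging only e.getMessage() or logging both the
        // exception and its message.
        logger.log(Level.SEVERE, "DNS publish failed", e);
      }
    }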
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182419837
This allows grouping metrics based on the DnsWriter. We can already group by
the TLD, but since a TLD can have multiple writers, and since different writers
perform very differently from one another, it could be important to group by
writer as well.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=182255398
Currently, if for some reason publishDnsUpdates gets a request to publish
domains with a DnsWriter that doesn't belong to said domains, it logs a warning
but publishes anyway.
This can happen when Writers are changed (swapped for a different writer)
leaving update commands "stuck" with the wrong writer.
Normally you'd expect these update commands to just publish their data and be
on their way. However, if the update fails for some reason (likely - if the
Writer change happened BECAUSE the updates are failing) then the same
publishDnsUpdate command will continue to run forever.
This CL changes the behavior for "publish to wrong DnsWriter" to instead
requeue the batched domains / hosts back to the Dns-pull queue, allowing them
to be re-batched (and hence published) with the correct DnsWriter(s). This
re-batching will take place in ReadDnsQueueAction.java
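A hedged sketch of what requeuing a single stuck item back to the pull queue
might look like with the App Engine task-queue API; the queue name and
parameter names here are assumptions, not the actual ones:

    import com.google.appengine.api.taskqueue.Queue;
    import com.google.appengine.api.taskqueue.QueueFactory;
    import com.google.appengine.api.taskqueue.TaskOptions;
    import com.google.appengine.api.taskqueue.TaskOptions.Method;

    final class RequeueExample {
      /** Re-adds one domain refresh item to the DNS pull queue so it can be re-batched. */
      static void requeueDomain(String domainName, String tld) {
        Queue pullQueue = QueueFactory.getQueue("dns-pull");
        pullQueue.add(
            TaskOptions.Builder.withMethod(Method.PULL)
                .param("Target-Type", "domain")
                .param("Target-Name", domainName)
                .param("tld", tld));
      }
    }

ReadDnsQueueAction would then lease such items again and batch them with
whatever DnsWriter(s) the TLD is currently configured to use.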
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=177863076
Adding the following metrics:
- how long an update takes, per TLD
- number of domains published, per TLD
- number of hosts published, per TLD
All are distributions.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=172933834
Right now, if there's an error during DnsWriter.publish*, all the publishes from
before that error will be committed, while all the publishes after that error
will not.
More than that, in some writers partial publishes can be committed, depending
on the implementation.
This defines a new contract: publish* calls are only committed when .commit is
called. That way any error simply means no publish is committed.
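The contract can be summarized with an interface sketch like the one below;
the method names are assumptions made for illustration, and only the
staged-until-commit semantics come from this description:

    interface DnsWriter {
      /** Stages a domain's records for publication; nothing is visible until commit(). */
      void publishDomain(String domainName);

      /** Stages a host's records for publication; nothing is visible until commit(). */
      void publishHost(String hostName);

      /** Atomically commits everything staged so far; if this throws, nothing is committed. */
      void commit();
    }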
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=165708063
This completes the data/functionality migration for multiple DNS writers.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=163835077
It was buggy (didn't work) and was never actually used.
Why it was never actually used: for it to be used, executeWithLock has to be called
with different requesters on the same lockId. That never happened in the code.
How it was buggy: logically, the queue is deleted on release of the lock (meaning it was
meaningless at the only time it mattered, when the lock isn't taken). In
addition, a different bug meant that having items in the queue prevented the
lock from being released, forcing all other tasks to wait for the lock
timeout even when the task that acquired the lock was long done.
Alternative: fix the queue. This would mean we don't delete the lock on release (since we want to keep the queue). Instead, we re-save the same lock with the expiration date set to START_OF_TIME. In addition, we need to fix the .equals check used to determine whether the lock is the same as the acquired lock, instead using some isSame function that ignores the queue.
Note: the queue is dangerous! An item (calling class / action) at the front of the queue means no other calling class can get that lock. Everything waits for the first calling class to be re-run, but that might take a long time (depending on that action's rerun policy) and might never happen at all (if for some reason that action decided it was no longer needed without acquiring the lock), causing all other actions to stall forever!
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=163705463
This is a quick fix we can hopefully get out fast before fixing the underlying problem.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=163485468
This is written in such a way that it can safely handle task items in the
old format so long as the DNS writer to use for the given TLD is unambiguous
(which it is for now, until we allow multiple DNS writers to be configured).
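A sketch of that fallback with hypothetical names (dnsWriterParam and the
writerPerTld map are illustrative, not the actual code):

    import java.util.Map;
    import javax.annotation.Nullable;

    final class WriterFallbackExample {
      /** Resolves the writer for a task item, tolerating old-format items with no writer set. */
      static String resolveDnsWriter(
          @Nullable String dnsWriterParam, String tld, Map<String, String> writerPerTld) {
        // New-format task: the writer is named explicitly in the task parameters.
        if (dnsWriterParam != null) {
          return dnsWriterParam;
        }
        // Old-format task: fall back to the TLD's single configured writer, which is
        // unambiguous until multiple writers per TLD are allowed.
        return writerPerTld.get(tld);
      }
    }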
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=162293412
We want to be safer and more explicit about the authentication needed by the many actions that exist.
As such, we make the 'auth' parameter required in @Action (so it's always clear who can run a specific action), and we replace @Auth with an enum so that only pre-approved configurations that are aptly named and documented can be used.
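For illustration, the resulting annotation might look roughly like this; the
enum constant, the path, and the annotation fields shown are assumptions about
the pre-approved configurations, not necessarily the real ones:

    @Action(
        path = "/_dr/task/publishDnsUpdates",   // illustrative path
        method = Action.Method.POST,
        auth = Auth.AUTH_INTERNAL_OR_ADMIN)     // a named, pre-approved auth configuration
    public final class PublishDnsUpdatesAction implements Runnable {
      @Override
      public void run() {
        // ... publish the batched DNS updates ...
      }
    }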
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=162210306
This is the final preparatory step necessary in order to load configuration
from YAML in a static context and then provide it either via Dagger (using
ConfigModule) or through RegistryConfig's existing static functions.
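A minimal sketch of what loading YAML in a static context might look like,
assuming SnakeYAML and invented class names; the real RegistryConfig /
ConfigModule wiring is not shown:

    import java.io.InputStream;
    import org.yaml.snakeyaml.Yaml;

    final class YamlConfigExample {
      /** Loaded once, statically, so both Dagger providers and static callers can read it. */
      private static final ConfigSettings CONFIG = load();

      private static ConfigSettings load() {
        InputStream in =
            YamlConfigExample.class.getResourceAsStream("/config/default-config.yaml");
        return new Yaml().loadAs(in, ConfigSettings.class);
      }

      static ConfigSettings getConfig() {
        return CONFIG;
      }

      /** A stand-in settings POJO; real fields would mirror the YAML structure. */
      public static class ConfigSettings {
        public String projectId;
      }
    }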
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=143819983
This is one of several CLs in a sequence for allowing per-TLD DNS
implementations.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=129445641
This is one of several CLs in order to support per-TLD DnsWriter
implementations, modeled on the work done for PremiumPricingEngine.
Since DnsWriters will be set inside the Registry object, the DnsWriter
interface definition needs to be moved into the model, keeping its
dependencies on the rest of the registry codebase minimal so as to avoid a
cyclic dependency.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=128711643
The dark lord Gosling designed the Java package naming system so that
ownership flows from the DNS system. Since we own the domain name
registry.google, it seems only appropriate that we should use
google.registry as our package name.
This change renames directories in preparation for the great package
rename. The repository is now in a broken state because the code
itself hasn't been updated. However this should ensure that git
correctly preserves history for each file.
2016-05-13 18:55:08 -04:00
Renamed from java/com/google/domain/registry/dns/PublishDnsUpdatesAction.java