From a365b82d42a577774e327ad3bf8a2ef3dbf6a00f Mon Sep 17 00:00:00 2001
From: larryruili
Date: Fri, 16 Feb 2018 13:49:38 -0800
Subject: [PATCH] Update publish queue with practical retry params

The unlimited exponential backoff makes cascading failure a serious
problem when encountering bursts of DNS load.

Originally, the queue used exponential backoff with a minimum of 1 second
and a maximum of 1 hour. This changes it to scale linearly from 30 seconds
to 10 minutes.

A minimum of 30 seconds avoids over-retrying due to lock contention.
A maximum of 10 minutes allows for more retries within our 1 hour SLA.

Finally, we're switching to linear scaling to increase the number of
'quick' retries at low backoff times, before ultimately settling on the
upper bound of 10 minutes (if a task ever gets to that point, it's
probably misconfigured).

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=186041553
---
 java/google/registry/env/common/default/WEB-INF/queue.xml | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/java/google/registry/env/common/default/WEB-INF/queue.xml b/java/google/registry/env/common/default/WEB-INF/queue.xml
index 61e29cfee..77b050968 100644
--- a/java/google/registry/env/common/default/WEB-INF/queue.xml
+++ b/java/google/registry/env/common/default/WEB-INF/queue.xml
@@ -10,6 +10,12 @@
     <name>dns-publish</name>
     <rate>100/s</rate>
     <bucket-size>100</bucket-size>
+
+    <retry-parameters>
+      <min-backoff-seconds>30</min-backoff-seconds>
+      <max-backoff-seconds>600</max-backoff-seconds>
+      <max-doublings>0</max-doublings>
+    </retry-parameters>
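
To make the retry schedule in this patch concrete, here is a minimal sketch of how the three values (30, 600, 0) might translate into wait times. It assumes the backoff rule documented for App Engine push queues: the interval starts at the minimum, doubles `max_doublings` times, then grows linearly by `2**max_doublings * min_backoff` until it is capped at the maximum. The function name `retry_intervals` and its parameters are hypothetical and not part of this change, and the sketch ignores task execution time and any jitter the service may apply.

```python
def retry_intervals(min_backoff, max_backoff, max_doublings, attempts):
    """Return the wait in seconds before each of the first `attempts` retries.

    Assumed rule (hypothetical helper, not the service's actual code):
    double `max_doublings` times, then grow linearly, then cap at max_backoff.
    """
    intervals = []
    interval = min_backoff
    # Once the doublings are used up, the interval grows by this fixed step.
    linear_step = min_backoff * 2 ** max_doublings
    for retry in range(attempts):
        # Never wait longer than the configured upper bound.
        intervals.append(min(interval, max_backoff))
        if retry < max_doublings:
            interval *= 2
        else:
            interval += linear_step
    return intervals


# Values from this patch: min 30 s, max 600 s, no doublings (purely linear).
new_schedule = retry_intervals(30, 600, 0, 22)
```

Under these assumptions the new schedule is 30, 60, 90, ... seconds, reaching the 600-second cap at the 20th retry and staying there.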