Shorten the lock timeout for rdeStaging

Sometimes rdeStaging reduce shards die after the lock is acquired. When that happens, the automatic rerun of the shard fails because the lock is still in place, so that specific TLD is not staged and has to wait for the next call to rdeStaging.

rdeStaging runs every 4 hours, but the current lock lives for 5 hours.

This means that on the next run of rdeStaging the lock still hasn't timed out, so staging fails again and we have to wait for the subsequent run, a total delay of 8 hours.

Shortening the lock timeout to less than the 4-hour rdeStaging run interval solves this issue.
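
The arithmetic above can be checked with a small sketch. It is not part of this change: the class and method names are hypothetical, it uses java.time.Duration rather than the Joda Duration in RegistryConfig, and it assumes the shard dies right after taking the lock at a scheduled run and that a lock whose timeout has exactly elapsed counts as expired.

```java
import java.time.Duration;

/** Hypothetical helper illustrating the delay described in this commit message. */
public final class LockDelaySketch {

  /**
   * Returns how long after the stale lock was acquired the first scheduled
   * run happens that finds the lock expired. Runs are assumed to occur at
   * exact multiples of runInterval after the failed run.
   */
  static Duration firstRunAfterExpiry(Duration lockTimeout, Duration runInterval) {
    if (runInterval.isZero() || runInterval.isNegative()) {
      throw new IllegalArgumentException("runInterval must be positive");
    }
    Duration nextRun = runInterval;
    // A run can only take the lock once the stale lock's timeout has elapsed.
    while (nextRun.compareTo(lockTimeout) < 0) {
      nextRun = nextRun.plus(runInterval);
    }
    return nextRun;
  }

  public static void main(String[] args) {
    Duration interval = Duration.ofHours(4);
    // Old timeout of 5 hours: the run at 4 hours is still blocked.
    System.out.println(firstRunAfterExpiry(Duration.ofHours(5), interval).toHours()); // prints 8
    // New timeout of 2 hours: the very next run can proceed.
    System.out.println(firstRunAfterExpiry(Duration.ofHours(2), interval).toHours()); // prints 4
  }
}
```

Under these assumptions, any timeout of 4 hours or less lets the next scheduled run proceed; the sketch says nothing about the immediate automatic rerun of the shard, which still hits the lock.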

NOTE: This is just a "quick patch" solution. To really fix the rdeStaging failure we need to fix the lock itself.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=166102387
guyben 2017-08-22 13:05:07 -07:00 committed by Ben McIlwain
parent ffa093716c
commit e94ab94d13

@@ -579,7 +579,7 @@ public final class RegistryConfig {
   @Provides
   @Config("rdeStagingLockTimeout")
   public static Duration provideRdeStagingLockTimeout() {
-    return Duration.standardHours(5);
+    return Duration.standardHours(2);
   }
   /**