mirror of
https://github.com/google/nomulus.git
synced 2025-04-30 12:07:51 +02:00
An automated rollback tool for Nomulus (#847)
* An automated rollback tool for Nomulus

  A tool that directs traffic between deployed versions. It handles the conversion between Nomulus tags and AppEngine versions, executes schema compatibility tests, ensures that steps are executed in the correct order, and updates deployment records appropriately.
This commit is contained in:
parent
478064f32b
commit
db2e896d42
11 changed files with 1552 additions and 0 deletions
release/rollback/README.md
@@ -0,0 +1,151 @@
## Summary

This package contains an automated rollback tool for the Nomulus server on
AppEngine. When given the Nomulus tag of a deployed release, the tool directs
all traffic in the four recognized services (backend, default, pubapi, and
tools) to that release. In the process, it translates the Nomulus tag to
AppEngine version IDs, checks the target binary's compatibility with the SQL
schema, starts/stops versions and redirects traffic in the proper sequence, and
updates deployment metadata appropriately.

The tool has two limitations:

1. This tool only accepts one release tag as the rollback target, which is
   applied to all services.
2. The tool immediately migrates all traffic to the new versions. It does not
   support gradual migration. This is not an issue now since gradual migration
   is only available in automatically scaled versions, while none of the
   versions uses automatic scaling.

Although this tool is named a rollback tool, it can also reverse a rollback,
that is, roll forward to a newer release.

## Prerequisites

This tool requires Python 3.7+. It also requires two GCP client
libraries: google-cloud-storage and google-api-python-client. They can be
installed using pip.

Registry team members should use either non-sudo pip3 or virtualenv/venv to
install the GCP libraries. A 'sudo pip install' may interfere with the Linux
tooling on your corp desktop. The non-sudo 'pip3 install' command installs the
libraries under $HOME/.local. The virtualenv or venv methods allow more control
over the installation location.

Below is an example of using venv to install the libraries:

```shell
sudo apt-get install virtualenv python3-venv
python3 -m venv myproject
source myproject/bin/activate
pip install google-cloud-storage
pip install google-api-python-client
deactivate
```

If using a virtual environment, make sure to run
'source myproject/bin/activate' before running the rollback script.

## Usage

The tool can be invoked using the rollback_tool script in the Nomulus root
directory. The following parameters are common to all subcommands:

* dev_project: This is the GCP project that hosts the release and deployment
  infrastructure, including the Spinnaker pipelines.
* project: This is the GCP project that hosts the Nomulus server to be rolled
  back.
* env: This is the name of the Nomulus environment, e.g., sandbox or
  production. Although the project-to-environment mapping is available in
  Gradle scripts and internal configuration files, it is not easy to extract.
  Therefore, we require the user to provide it for now.

A typical workflow goes as follows:

### Check Which Release is Serving

From the Nomulus root directory:

```shell
rollback_tool show_serving_release --dev_project ... --project ... --env ...
```

The output may look like:

```
backend nomulus-v049 nomulus-20201019-RC00
default nomulus-v049 nomulus-20201019-RC00
pubapi nomulus-v049 nomulus-20201019-RC00
tools nomulus-v049 nomulus-20201019-RC00
```
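
Each output line pairs a service and its serving AppEngine version with the
Nomulus release tag that produced it. A minimal sketch of how such lines could
be assembled is shown here. The function name `format_serving_releases`, the
tuple-based version type, and the placeholder for unmapped versions are
assumptions made for this illustration, not the tool's actual code.

```python
from typing import Dict, List, Set, Tuple

# (service_id, version_id) pair; a hypothetical stand-in for the tool's own
# version identifier type.
Version = Tuple[str, str]


def format_serving_releases(serving: Set[Version],
                            releases: Dict[Version, str]) -> List[str]:
    """Returns one 'service version release' line per serving version."""
    lines = []
    for service_id, version_id in sorted(serving):
        # A version missing from the mapping is flagged rather than dropped.
        tag = releases.get((service_id, version_id), '<unknown release>')
        lines.append(f'{service_id} {version_id} {tag}')
    return lines


if __name__ == '__main__':
    serving = {('backend', 'nomulus-v049'), ('default', 'nomulus-v049')}
    releases = {
        ('backend', 'nomulus-v049'): 'nomulus-20201019-RC00',
        ('default', 'nomulus-v049'): 'nomulus-20201019-RC00',
    }
    print('\n'.join(format_serving_releases(serving, releases)))
```

Sorting the pairs keeps the output stable across runs, which makes it easier
to compare before and after a rollback.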

### Review Recent Deployments

```shell
rollback_tool show_recent_deployments --dev_project ... --project ... --env ...
```

This command displays up to the 3 most recent deployments. The output (from
sandbox, which had only two tracked deployments when this document was written)
may look like:

```
backend nomulus-v048 nomulus-20201012-RC00
default nomulus-v048 nomulus-20201012-RC00
pubapi nomulus-v048 nomulus-20201012-RC00
tools nomulus-v048 nomulus-20201012-RC00
backend nomulus-v049 nomulus-20201019-RC00
default nomulus-v049 nomulus-20201019-RC00
pubapi nomulus-v049 nomulus-20201019-RC00
tools nomulus-v049 nomulus-20201019-RC00
```
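
The deployment records behind this output are kept as lines of the form
`{RELEASE_TAG},{APP_ENGINE_SERVICE_ID},{APP_ENGINE_VERSION}`, one line per
service per deployment. The sketch below shows one way the most recent
deployments could be selected from such a file. It assumes that records are
appended in chronological order and that every deployment covers all four
services; the function name `recent_deployments` is hypothetical.

```python
from typing import List, Tuple

SERVICES = ('backend', 'default', 'pubapi', 'tools')


def recent_deployments(file_content: str,
                       num_deployments: int = 3) -> List[Tuple[str, str, str]]:
    """Returns (service, version, tag) records of the latest deployments."""
    # One line is recorded per service per deployment, so the newest
    # deployments occupy the trailing lines of the file.
    num_records = num_deployments * len(SERVICES)
    if num_records <= 0:
        # Guard: a slice starting at -0 would return the whole file.
        return []
    records = []
    for line in file_content.splitlines()[-num_records:]:
        tag, service_id, version_id = line.split(',')
        records.append((service_id, version_id, tag))
    return records


if __name__ == '__main__':
    content = ('nomulus-20201012-RC00,backend,nomulus-v048\n'
               'nomulus-20201019-RC00,backend,nomulus-v049\n')
    for record in recent_deployments(content):
        print(' '.join(record))
```

When the file holds fewer records than requested, as in the sandbox example
above, the slice simply returns everything available.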

### Roll to the Target Release

```shell
rollback_tool rollback --dev_project ... --project ... --env ... \
    --target_release {YOUR_CHOSEN_TAG} --run_mode ...
```

The rollback subcommand has two additional parameters:

* target_release: This is the Nomulus tag of the target release, in the form
  of nomulus-YYYYMMDD-RC[0-9][0-9].
* run_mode: This is the execution mode of the rollback action. There are three
  modes:
    1. dryrun: The tool will only output information about every step of the
       rollback, including commands that a user can copy and run elsewhere.
    2. interactive: The tool will prompt the user before executing each step.
       The user may choose to abort the rollback, skip the step, or continue
       with the step.
    3. automatic: The tool will execute all steps in one shot.
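
To make the parameter constraints concrete, here is a hedged sketch of how the
rollback parameters could be validated with `argparse`. It is not the tool's
actual command-line code: the helper names `release_tag` and
`build_rollback_parser` are hypothetical, and marking every flag as required is
an assumption. The pattern only checks the shape of the tag (eight digits and
a two-digit RC number), not whether the date is valid.

```python
import argparse
import re

# Tag format taken from the description above: nomulus-YYYYMMDD-RC[0-9][0-9].
_TAG_PATTERN = re.compile(r'nomulus-\d{8}-RC\d{2}')
RUN_MODES = ('dryrun', 'interactive', 'automatic')


def release_tag(value: str) -> str:
    """argparse type function that rejects malformed release tags."""
    if _TAG_PATTERN.fullmatch(value) is None:
        raise argparse.ArgumentTypeError(
            f'{value!r} is not of the form nomulus-YYYYMMDD-RCnn')
    return value


def build_rollback_parser() -> argparse.ArgumentParser:
    """Builds a parser for the rollback parameters (illustrative only)."""
    parser = argparse.ArgumentParser(prog='rollback_tool rollback')
    for name in ('dev_project', 'project', 'env'):
        parser.add_argument(f'--{name}', required=True)
    parser.add_argument('--target_release', required=True, type=release_tag)
    parser.add_argument('--run_mode', required=True, choices=RUN_MODES)
    return parser


if __name__ == '__main__':
    args = build_rollback_parser().parse_args([
        '--dev_project', 'my-dev-project', '--project', 'my-project',
        '--env', 'sandbox', '--target_release', 'nomulus-20201019-RC00',
        '--run_mode', 'dryrun'
    ])
    print(args.target_release, args.run_mode)
```

Rejecting a malformed tag at parse time means a typo is reported before any
call is made to AppEngine or GCS.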

The rollback steps are organized according to the following logic:

```
for service in ['backend', 'default', 'pubapi', 'tools']:
    if service is on basicScaling: (See Notes #1)
        start the target version
    if service is on manualScaling:
        start the target version
        set num_instances to its originally configured value

for service in ['backend', 'default', 'pubapi', 'tools']:
    direct traffic to target version

for service in ['backend', 'default', 'pubapi', 'tools']:
    if originally serving version is not the target version:
        if originally serving version is on basicScaling:
            stop the version
        if originally serving version is on manualScaling:
            stop the version
            set num_instances to 1 (See Notes #2)
```

Notes:

1. Versions on automatic scaling cannot be started or stopped by gcloud or the
   AppEngine Admin REST API.

2. The minimum value assignable to num_instances through the REST API is 1.
   This instance will eventually be released too.
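
The three-phase ordering of the rollback steps can also be expressed as a small
planning function. The sketch below is only an illustration of that ordering:
the `Version` and `Scaling` types, the `plan_rollback` function, and the
textual step descriptions are hypothetical names, and it assumes exactly one
serving version per service. It is not the tool's own implementation.

```python
import dataclasses
import enum
from typing import Dict, List, Optional, Sequence


class Scaling(enum.Enum):
    AUTOMATIC = 'automaticScaling'
    BASIC = 'basicScaling'
    MANUAL = 'manualScaling'


@dataclasses.dataclass(frozen=True)
class Version:
    service_id: str
    version_id: str
    scaling: Scaling
    manual_instances: Optional[int] = None


SERVICES = ('backend', 'default', 'pubapi', 'tools')


def plan_rollback(target: Dict[str, Version],
                  serving: Dict[str, Version],
                  services: Sequence[str] = SERVICES) -> List[str]:
    """Returns human-readable steps in the order they should execute."""
    steps = []
    # Phase 1: bring up every target version before any traffic moves.
    for service in services:
        new = target[service]
        name = f'{service}/{new.version_id}'
        # Automatically scaled versions cannot be started (Note 1).
        if new.scaling in (Scaling.BASIC, Scaling.MANUAL):
            steps.append(f'start {name}')
        if new.scaling == Scaling.MANUAL:
            steps.append(
                f'set num_instances of {name} to {new.manual_instances}')
    # Phase 2: switch traffic for all services.
    for service in services:
        steps.append(
            f'direct traffic of {service} to {target[service].version_id}')
    # Phase 3: shut down versions that no longer serve.
    for service in services:
        old = serving[service]
        if old.version_id == target[service].version_id:
            continue
        name = f'{service}/{old.version_id}'
        if old.scaling in (Scaling.BASIC, Scaling.MANUAL):
            steps.append(f'stop {name}')
        if old.scaling == Scaling.MANUAL:
            # 1 is the lowest value accepted by the REST API (Note 2).
            steps.append(f'set num_instances of {name} to 1')
    return steps


if __name__ == '__main__':
    target = {'backend': Version('backend', 'v048', Scaling.MANUAL, 4)}
    serving = {'backend': Version('backend', 'v049', Scaling.BASIC)}
    for step in plan_rollback(target, serving, services=('backend',)):
        print(step)
```

Producing the plan as data before executing anything is what makes a dryrun or
interactive mode straightforward: the same list can be printed, confirmed step
by step, or executed in one pass.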
release/rollback/appengine.py
@@ -0,0 +1,198 @@
# Copyright 2020 The Nomulus Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Helper for using the AppEngine Admin REST API."""

import time
from typing import Any, Dict, FrozenSet, Set

from googleapiclient import discovery
from googleapiclient import http

import common

# AppEngine services under management.
SERVICES = frozenset(['backend', 'default', 'pubapi', 'tools'])
# Forces 'list' calls (for services and versions) to return all
# results in one shot, to avoid having to handle pagination. This value
# should be greater than the maximum allowed services and versions in any
# project (
# https://cloud.google.com/appengine/docs/standard/python/an-overview-of-app-engine#limits).
_PAGE_SIZE = 250
# Number of times to check the status of an operation before timing out.
_STATUS_CHECK_TIMES = 5
# Delay between status checks of a long-running operation, in seconds.
_STATUS_CHECK_INTERVAL = 5


class PagingError(Exception):
    """Error for unexpected partial results.

    List calls in this module do not handle pagination. This error is raised
    when a partial result is received.
    """

    def __init__(self, uri: str):
        # Pass only the message. Passing `self` as well would make the
        # exception instance part of its own args.
        super().__init__(
            f'Received paged response unexpectedly when calling {uri}. '
            'Consider increasing _PAGE_SIZE.')


class AppEngineAdmin:
    """Wrapper around the AppEngine Admin REST API client.

    This class provides wrapper methods around the REST API for service and
    version queries and for migrating between versions.
    """

    def __init__(self,
                 project: str,
                 service_lookup: discovery.Resource = None,
                 status_check_interval: int = _STATUS_CHECK_INTERVAL) -> None:
        """Initialize this instance for an AppEngine (GCP) project."""
        self._project = project

        if service_lookup is not None:
            apps = service_lookup.apps()
        else:
            apps = discovery.build('appengine', 'v1beta').apps()

        self._services = apps.services()
        self._operations = apps.operations()
        self._status_check_interval = status_check_interval

    @property
    def project(self):
        return self._project

    def _checked_request(self, request: http.HttpRequest) -> Dict[str, Any]:
        """Verifies that all results are returned for a request."""
        response = request.execute()
        if 'nextPageToken' in response:
            raise PagingError(request.uri)

        return response

    def get_serving_versions(self) -> FrozenSet[common.VersionKey]:
        """Returns the serving versions of every Nomulus service.

        For each service in appengine.SERVICES, gets the version(s) actually
        serving traffic. Versions with the 'SERVING' status but no allocated
        traffic are not included. Services not included in appengine.SERVICES
        are also ignored.

        Returns: An immutable collection of the serving versions grouped by
            service.
        """
        response = self._checked_request(
            self._services.list(appsId=self._project, pageSize=_PAGE_SIZE))

        # Response format is specified at
        # http://googleapis.github.io/google-api-python-client/docs/dyn/appengine_v1beta5.apps.services.html#list.

        versions = []
        for service in response.get('services', []):
            if service['id'] in SERVICES:
                # yapf: disable
                versions_with_traffic = (
                    service.get('split', {}).get('allocations', {}).keys())
                # yapf: enable
                for version in versions_with_traffic:
                    versions.append(common.VersionKey(service['id'], version))

        return frozenset(versions)


    # yapf: disable # argument indent wrong
    def get_version_configs(
            self, versions: Set[common.VersionKey]
    ) -> FrozenSet[common.VersionConfig]:
        # yapf: enable
        """Returns the configuration of requested versions.

        For each version in the request, gets the rollback-related data from
        its static configuration (found in appengine-web.xml).

        Args:
            versions: A set of VersionKey objects, each identifying a
                version being queried and the service it belongs to.

        Returns:
            The version configurations in an immutable set.
        """
        requested_services = {version.service_id for version in versions}

        version_configs = []
        # Sort the requested services for ease of testing. For now the mocked
        # AppEngine admin in appengine_test can only respond in a fixed order.
        for service_id in sorted(requested_services):
            response = self._checked_request(self._services.versions().list(
                appsId=self._project,
                servicesId=service_id,
                pageSize=_PAGE_SIZE))

            # Format of the response is defined at
            # https://googleapis.github.io/google-api-python-client/docs/dyn/appengine_v1beta5.apps.services.versions.html#list.

            for version in response.get('versions', []):
                if common.VersionKey(service_id, version['id']) in versions:
                    # A version must declare exactly one scaling scheme.
                    scalings = [
                        s for s in list(common.AppEngineScaling)
                        if s.value in version
                    ]
                    if len(scalings) != 1:
                        raise common.CannotRollbackError(
                            f'Expecting exactly one scaling, found {scalings}')

                    scaling = common.AppEngineScaling(list(scalings)[0])
                    if scaling == common.AppEngineScaling.MANUAL:
                        manual_instances = version.get(
                            scaling.value).get('instances')
                    else:
                        manual_instances = None

                    version_configs.append(
                        common.VersionConfig(service_id, version['id'],
                                             scaling, manual_instances))

        return frozenset(version_configs)

    def set_manual_scaling_num_instance(self, service_id: str, version_id: str,
                                        manual_instances: int) -> None:
        """Changes the number of instances of a manually scaled version.

        Sends the update request, then polls the resulting long-running
        operation until it completes or the status checks are exhausted.
        """
        update_mask = 'manualScaling.instances'
        body = {'manualScaling': {'instances': manual_instances}}
        response = self._services.versions().patch(appsId=self._project,
                                                   servicesId=service_id,
                                                   versionsId=version_id,
                                                   updateMask=update_mask,
                                                   body=body).execute()

        operation_id = response.get('name').split('operations/')[1]
        for _ in range(_STATUS_CHECK_TIMES):
            if self.query_operation_status(operation_id):
                return
            time.sleep(self._status_check_interval)

        raise common.CannotRollbackError(
            f'Operation {operation_id} timed out.')

    def query_operation_status(self, operation_id):
        """Returns True if the operation has completed successfully.

        Returns False while the operation is still in progress, and raises
        CannotRollbackError if the operation has failed.
        """
        response = self._operations.get(appsId=self._project,
                                        operationsId=operation_id).execute()
        if response.get('response') is not None:
            return True

        if response.get('error') is not None:
            raise common.CannotRollbackError(response['error'])

        assert not response.get('done'), 'Operation done but no results.'
        return False
release/rollback/appengine_test.py
@@ -0,0 +1,133 @@
# Copyright 2020 The Nomulus Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Unit tests for appengine."""
from typing import Any, Dict, List, Tuple, Union
import unittest
from unittest import mock
from unittest.mock import patch

import appengine
import common


def setup_appengine_admin() -> Tuple[object, object]:
    """Helper for setting up a mocked AppEngineAdmin instance.

    Returns:
        An AppEngineAdmin instance and a request with which API responses can
        be mocked.
    """

    # Assign mocked API response to mock_request.execute.
    mock_request = mock.MagicMock()
    # The code under test reads `request.uri` as an attribute, not a method.
    mock_request.uri = 'myuri'
    # Mocked resource shared by services, versions, and operations.
    resource = mock.MagicMock()
    resource.list.return_value = mock_request
    resource.get.return_value = mock_request
    resource.patch.return_value = mock_request
    # Root resource of AppEngine API. Exact type unknown.
    apps = mock.MagicMock()
    apps.services.return_value = resource
    resource.versions.return_value = resource
    apps.operations.return_value = resource
    service_lookup = mock.MagicMock()
    service_lookup.apps.return_value = apps
    appengine_admin = appengine.AppEngineAdmin('project', service_lookup, 1)

    return (appengine_admin, mock_request)


class AppEngineTestCase(unittest.TestCase):
    """Unit tests for appengine."""

    def setUp(self) -> None:
        self._client, self._mock_request = setup_appengine_admin()
        self.addCleanup(patch.stopall)

    # yapf: disable
    def _set_mocked_response(
            self,
            responses: Union[Dict[str, Any], List[Dict[str, Any]]]) -> None:
        # yapf: enable
        if isinstance(responses, list):
            self._mock_request.execute.side_effect = responses
        else:
            self._mock_request.execute.return_value = responses

    def test_checked_request_multipage_raises(self) -> None:
        self._set_mocked_response({'nextPageToken': ''})
        self.assertRaises(appengine.PagingError,
                          self._client.get_serving_versions)

    def test_get_serving_versions(self) -> None:
        self._set_mocked_response({
            'services': [{
                'split': {
                    'allocations': {
                        'my_version': 3.14,
                    }
                },
                'id': 'pubapi'
            }, {
                'split': {
                    'allocations': {
                        'another_version': 2.71,
                    }
                },
                'id': 'error_dashboard'
            }]
        })
        self.assertEqual(
            self._client.get_serving_versions(),
            frozenset([common.VersionKey('pubapi', 'my_version')]))

    def test_get_version_configs(self):
        self._set_mocked_response({
            'versions': [{
                'basicScaling': {
                    'maxInstances': 10
                },
                'id': 'version'
            }]
        })
        self.assertEqual(
            self._client.get_version_configs(
                frozenset([common.VersionKey('default', 'version')])),
            frozenset([
                common.VersionConfig('default', 'version',
                                     common.AppEngineScaling.BASIC)
            ]))

    def test_async_update(self):
        self._set_mocked_response([
            {
                'name': 'project/operations/op_id',
                'done': False
            },
            {
                'name': 'project/operations/op_id',
                'done': False
            },
            {
                'name': 'project/operations/op_id',
                'response': {},
                'done': True
            },
        ])
        self._client.set_manual_scaling_num_instance('service', 'version', 1)
        self.assertEqual(self._mock_request.execute.call_count, 3)


if __name__ == '__main__':
    unittest.main()
release/rollback/common.py
@@ -0,0 +1,111 @@
# Copyright 2020 The Nomulus Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Declares data types that describe AppEngine services and versions."""

import dataclasses
import enum
import pathlib
import re
from typing import Optional


class CannotRollbackError(Exception):
    """Indicates that rollback cannot be done by this tool.

    This error is for situations where rollbacks are either not allowed or
    cannot be planned. Example scenarios include:
    - The target release is incompatible with the SQL schema.
    - The target release has never been deployed to AppEngine.
    - The target release is no longer available, e.g., has been manually
      deleted by the operators.
    - A state-changing call to AppEngine Admin API has failed.

    The user must manually fix such problems before trying again to roll back.
    """
    pass


class AppEngineScaling(enum.Enum):
    """Types of scaling schemes supported in AppEngine.

    The value of each name is the property name in the REST API requests and
    responses.
    """

    AUTOMATIC = 'automaticScaling'
    BASIC = 'basicScaling'
    MANUAL = 'manualScaling'


@dataclasses.dataclass(frozen=True)
class VersionKey:
    """Identifier of a deployed version on AppEngine.

    AppEngine versions as deployable units are managed on a per-service basis.
    Each instance of this class uniquely identifies an AppEngine version.

    This class implements the __eq__ method so that its equality property
    applies to subclasses by default unless they override it.
    """

    service_id: str
    version_id: str

    def __eq__(self, other):
        return (isinstance(other, VersionKey)
                and self.service_id == other.service_id
                and self.version_id == other.version_id)


@dataclasses.dataclass(frozen=True, eq=False)
class VersionConfig(VersionKey):
    """Rollback-related static configuration of an AppEngine version.

    Contains data found in the appengine-web.xml for this version.

    Attributes:
        scaling: The scaling scheme of this version. This value determines what
            steps are needed for the rollback. If a version is on automatic
            scaling, we only need to direct traffic to it or away from it. The
            version cannot be started, stopped, or have its number of instances
            updated. If a version is on manual scaling, it not only needs to be
            started or stopped explicitly, its instances need to be updated too
            (to 1, the lowest allowed number) when it is shut down, and to its
            originally configured number of VM instances when brought up.
        manual_scaling_instances: The originally configured VM instances to use
            for each version that is on manual scaling.
    """

    scaling: AppEngineScaling
    manual_scaling_instances: Optional[int] = None


def get_nomulus_root() -> pathlib.Path:
    """Finds the current Nomulus root directory.

    Returns:
        The absolute path to the Nomulus root directory.
    """
    for folder in pathlib.Path(__file__).parents:
        if folder.name != 'nomulus':
            continue
        if not folder.joinpath('settings.gradle').exists():
            continue
        with open(folder.joinpath('settings.gradle'), 'r') as file:
            for line in file:
                # The dot is escaped so that it only matches a literal '.'.
                if re.match(r"^rootProject\.name\s*=\s*'nomulus'\s*$", line):
                    return folder.absolute()

    raise RuntimeError(
        'Do not move this file out of the Nomulus directory tree.')
release/rollback/gcs.py
@@ -0,0 +1,148 @@
# Copyright 2020 The Nomulus Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Helper for managing Nomulus deployment records on GCS."""

from typing import Dict, FrozenSet, Set

from google.cloud import storage

import common


def _get_version_map_name(env: str):
    return f'nomulus.{env}.versions'


def _get_schema_tag_file(env: str):
    return f'sql.{env}.tag'


class GcsClient:
    """Manages Nomulus deployment records on GCS."""

    def __init__(self, project: str, gcs_client=None) -> None:
        """Initializes the instance for a GCP project.

        Args:
            project: The GCP project with Nomulus deployment records.
            gcs_client: Optional API client to use.
        """

        self._project = project

        if gcs_client is not None:
            self._client = gcs_client
        else:
            self._client = storage.Client(self._project)

    @property
    def project(self):
        return self._project

    def _get_deploy_bucket_name(self):
        return f'{self._project}-deployed-tags'

    def _get_release_to_version_mapping(
            self, env: str) -> Dict[common.VersionKey, str]:
        """Returns the content of the release to version mapping file.

        The content is parsed into a dict that maps each AppEngine version to
        its release tag. File content is read in utf-8 encoding. Each line in
        the file is in this format:
        '{RELEASE_TAG},{APP_ENGINE_SERVICE_ID},{APP_ENGINE_VERSION}'.
        """
        file_content = self._client.get_bucket(
            self._get_deploy_bucket_name()).get_blob(
                _get_version_map_name(env)).download_as_text()

        mapping = {}
        for line in file_content.splitlines(False):
            tag, service_id, version_id = line.split(',')
            mapping[common.VersionKey(service_id, version_id)] = tag

        return mapping

    def get_versions_by_release(self, env: str,
                                nom_tag: str) -> FrozenSet[common.VersionKey]:
        """Returns AppEngine version ids of a given Nomulus release tag.

        Fetches the version mapping file maintained by the deployment process
        and parses its content into a collection of VersionKey instances.

        A release may map to multiple versions in a service if it has been
        deployed multiple times. This is not intended behavior and may only
        happen by mistake.

        Args:
            env: The environment of the deployed release, e.g., sandbox.
            nom_tag: The Nomulus release tag.

        Returns:
            An immutable set of VersionKey instances.
        """
        mapping = self._get_release_to_version_mapping(env)
        return frozenset(
            [version for version in mapping if mapping[version] == nom_tag])

    def get_releases_by_versions(
            self, env: str,
            versions: Set[common.VersionKey]) -> Dict[common.VersionKey, str]:
        """Gets the release tags of the AppEngine versions.

        Args:
            env: The environment of the deployed release, e.g., sandbox.
            versions: The AppEngine versions.

        Returns:
            A mapping of versions to release tags.
        """
        mapping = self._get_release_to_version_mapping(env)
        return {
            version: tag
            for version, tag in mapping.items() if version in versions
        }

    def get_recent_deployments(
            self, env: str, num_records: int) -> Dict[common.VersionKey, str]:
        """Gets the most recent deployment records.

        Deployment records are stored in a file, with one line per service.
        Caller should adjust num_records according to the number of services
        in AppEngine.

        Args:
            env: The environment of the deployed release, e.g., sandbox.
            num_records: The number of lines to go back.

        Returns:
            A mapping of versions to release tags for the selected records.
        """
        file_content = self._client.get_bucket(
            self._get_deploy_bucket_name()).get_blob(
                _get_version_map_name(env)).download_as_text()

        mapping = {}
        for line in file_content.splitlines(False)[-num_records:]:
            tag, service_id, version_id = line.split(',')
            mapping[common.VersionKey(service_id, version_id)] = tag

        return mapping

    def get_schema_tag(self, env: str) -> str:
        """Gets the release tag of the SQL schema in the given environment.

        This tag is needed for the server/schema compatibility test.
        """
        file_content = self._client.get_bucket(
            self._get_deploy_bucket_name()).get_blob(
                _get_schema_tag_file(env)).download_as_text().splitlines(False)
        assert len(
            file_content
        ) == 1, f'Unexpected content in {_get_schema_tag_file(env)}.'
        return file_content[0]
release/rollback/gcs_test.py
@@ -0,0 +1,152 @@
# Copyright 2020 The Nomulus Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Unit tests for gcs."""
import textwrap
import unittest
from unittest import mock

import common
import gcs


def setup_gcs_client(env: str):
    """Sets up a mocked GcsClient.

    Args:
        env: Name of the Nomulus environment.

    Returns:
        A GcsClient instance and two mocked blobs representing the schema
        tag file and the version map file on GCS.
    """

    schema_tag_blob = mock.MagicMock()
    schema_tag_blob.download_as_text.return_value = 'tag\n'
    version_map_blob = mock.MagicMock()
    blobs_by_name = {
        f'nomulus.{env}.versions': version_map_blob,
        f'sql.{env}.tag': schema_tag_blob
    }

    bucket = mock.MagicMock()
    bucket.get_blob.side_effect = lambda blob_name: blobs_by_name[blob_name]
    google_client = mock.MagicMock()
    google_client.get_bucket.return_value = bucket
    gcs_client = gcs.GcsClient('project', google_client)

    return (gcs_client, schema_tag_blob, version_map_blob)


class GcsTestCase(unittest.TestCase):
    """Unit tests for gcs."""
    _ENV = 'crash'

    def setUp(self) -> None:
        self._client, self._schema_tag_blob, self._version_map_blob = \
            setup_gcs_client(self._ENV)
        self.addCleanup(mock.patch.stopall)

    def test_get_schema_tag(self):
        self.assertEqual(self._client.get_schema_tag(self._ENV), 'tag')

    def test_get_versions_by_release(self):
        self._version_map_blob.download_as_text.return_value = \
            'nomulus-20200925-RC02,backend,nomulus-backend-v008'
        self.assertEqual(
            self._client.get_versions_by_release(self._ENV,
                                                 'nomulus-20200925-RC02'),
            frozenset([common.VersionKey('backend', 'nomulus-backend-v008')]))

    def test_get_versions_by_release_not_found(self):
        self._version_map_blob.download_as_text.return_value = \
            'nomulus-20200925-RC02,backend,nomulus-backend-v008'
        self.assertEqual(
            self._client.get_versions_by_release(self._ENV, 'no-such-tag'),
            frozenset([]))

def test_get_versions_by_release_multiple_service(self):
|
||||
self._version_map_blob.download_as_text.return_value = textwrap.dedent(
|
||||
"""\
|
||||
nomulus-20200925-RC02,backend,nomulus-backend-v008
|
||||
nomulus-20200925-RC02,default,nomulus-default-v008
|
||||
""")
|
||||
self.assertEqual(
|
||||
self._client.get_versions_by_release(self._ENV,
|
||||
'nomulus-20200925-RC02'),
|
||||
frozenset([
|
||||
common.VersionKey('backend', 'nomulus-backend-v008'),
|
||||
common.VersionKey('default', 'nomulus-default-v008')
|
||||
]))
|
||||
|
||||
def test_get_versions_by_release_multiple_deployment(self):
|
||||
self._version_map_blob.download_as_text.return_value = textwrap.dedent(
|
||||
"""\
|
||||
nomulus-20200925-RC02,backend,nomulus-backend-v008
|
||||
nomulus-20200925-RC02,backend,nomulus-backend-v018
|
||||
""")
|
||||
self.assertEqual(
|
||||
self._client.get_versions_by_release(self._ENV,
|
||||
'nomulus-20200925-RC02'),
|
||||
frozenset([
|
||||
common.VersionKey('backend', 'nomulus-backend-v008'),
|
||||
common.VersionKey('backend', 'nomulus-backend-v018')
|
||||
]))
|
||||
|
||||
def test_get_releases_by_versions(self):
|
||||
self._version_map_blob.download_as_text.return_value = textwrap.dedent(
|
||||
"""\
|
||||
nomulus-20200925-RC02,backend,nomulus-backend-v008
|
||||
nomulus-20200925-RC02,default,nomulus-default-v008
|
||||
""")
|
||||
self.assertEqual(
|
||||
self._client.get_releases_by_versions(
|
||||
self._ENV, {
|
||||
common.VersionKey('backend', 'nomulus-backend-v008'),
|
||||
common.VersionKey('default', 'nomulus-default-v008')
|
||||
}), {
|
||||
common.VersionKey('backend', 'nomulus-backend-v008'):
|
||||
'nomulus-20200925-RC02',
|
||||
common.VersionKey('default', 'nomulus-default-v008'):
|
||||
'nomulus-20200925-RC02',
|
||||
})
|
||||
|
||||
def test_get_recent_deployments(self):
|
||||
file_content = textwrap.dedent("""\
|
||||
nomulus-20200925-RC02,backend,nomulus-backend-v008
|
||||
nomulus-20200925-RC02,default,nomulus-default-v008
|
||||
""")
|
||||
self._version_map_blob.download_as_text.return_value = file_content
|
||||
self.assertEqual(
|
||||
self._client.get_recent_deployments(self._ENV, 2), {
|
||||
common.VersionKey('default', 'nomulus-default-v008'):
|
||||
'nomulus-20200925-RC02',
|
||||
common.VersionKey('backend', 'nomulus-backend-v008'):
|
||||
'nomulus-20200925-RC02'
|
||||
})
|
||||
|
||||
def test_get_recent_deployments_fewer_lines(self):
|
||||
self._version_map_blob.download_as_text.return_value = textwrap.dedent(
|
||||
"""\
|
||||
nomulus-20200925-RC02,backend,nomulus-backend-v008
|
||||
nomulus-20200925-RC02,default,nomulus-default-v008
|
||||
""")
|
||||
self.assertEqual(
|
||||
self._client.get_recent_deployments(self._ENV, 1), {
|
||||
common.VersionKey('default', 'nomulus-default-v008'):
|
||||
'nomulus-20200925-RC02'
|
||||
})
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
unittest.main()
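The `setup_gcs_client` helper in this test module uses `side_effect` so that a single mocked bucket returns a different blob for each file name. The following is a minimal, standalone sketch of that pattern. The `make_bucket` helper and its `contents_by_name` argument are illustrative assumptions and are not part of the commit.

```python
from unittest import mock


def make_bucket(contents_by_name):
    """Builds a mocked bucket whose get_blob() looks blobs up by name.

    `contents_by_name` is a hypothetical mapping of blob name to text.
    """
    blobs = {}
    for name, text in contents_by_name.items():
        blob = mock.MagicMock()
        blob.download_as_text.return_value = text
        blobs[name] = blob
    bucket = mock.MagicMock()
    # side_effect receives the same arguments as the get_blob() call.
    bucket.get_blob.side_effect = lambda blob_name: blobs[blob_name]
    return bucket


bucket = make_bucket({'sql.crash.tag': 'tag\n'})
schema_tag = bucket.get_blob('sql.crash.tag').download_as_text().strip()
```

A lookup of a name that was never registered raises `KeyError`, which makes an unexpected file access fail loudly in a test instead of silently returning a fresh mock.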
|
195
release/rollback/plan.py
Normal file
|
@ -0,0 +1,195 @@
|
|||
# Copyright 2020 The Nomulus Authors. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
"""Generates a sequence of operations for execution."""
|
||||
from typing import FrozenSet, Tuple
|
||||
|
||||
import appengine
|
||||
import common
|
||||
import dataclasses
|
||||
|
||||
import gcs
|
||||
import steps
|
||||
|
||||
|
||||
@dataclasses.dataclass(frozen=True)
|
||||
class ServiceRollback:
|
||||
"""Data needed for rolling back one service.
|
||||
|
||||
Holds the configurations of both the currently serving version(s) and the
|
||||
rollback target in a service.
|
||||
|
||||
Attributes:
|
||||
target_version: The version to roll back to.
|
||||
serving_versions: The currently serving versions to be stopped. This
|
||||
set may be empty. It may also have multiple versions (when traffic
|
||||
is split).
|
||||
"""
|
||||
|
||||
target_version: common.VersionConfig
|
||||
serving_versions: FrozenSet[common.VersionConfig]
|
||||
|
||||
def __post_init__(self):
|
||||
"""Validates that all versions are for the same service."""
|
||||
|
||||
if self.serving_versions:
|
||||
for config in self.serving_versions:
|
||||
assert config.service_id == self.target_version.service_id
|
||||
|
||||
|
||||
# yapf: disable
|
||||
def _get_service_rollback_plan(
|
||||
target_configs: FrozenSet[common.VersionConfig],
|
||||
serving_configs: FrozenSet[common.VersionConfig]
|
||||
) -> Tuple[ServiceRollback, ...]:
|
||||
# yapf: enable
|
||||
"""Determines the versions to bring up/down in each service.
|
||||
|
||||
In each service, this method makes sure that at least one version is found
|
||||
for the rollback target. If multiple versions are found, which may only
|
||||
happen if the target release was deployed multiple times, arbitrarily choose
|
||||
one.
|
||||
|
||||
If a target version is already serving traffic, instead of checking if it
|
||||
gets 100 percent of traffic, this method still generates operations to
|
||||
start it and direct all traffic to it. This is not a problem since these
|
||||
operations are idempotent.
|
||||
|
||||
Args:
|
||||
target_configs: The rollback target versions in each managed service
|
||||
(as defined in appengine.SERVICES).
|
||||
serving_configs: The currently serving versions in each service.
|
||||
|
||||
Raises:
|
||||
CannotRollbackError: Rollback is impossible because a target version
|
||||
cannot be found for some service.
|
||||
|
||||
Returns:
|
||||
For each service, the versions to bring up/down if applicable.
|
||||
"""
|
||||
targets_by_service = {}
|
||||
for version in target_configs:
|
||||
targets_by_service.setdefault(version.service_id, set()).add(version)
|
||||
|
||||
serving_by_service = {}
|
||||
for version in serving_configs:
|
||||
serving_by_service.setdefault(version.service_id, set()).add(version)
|
||||
|
||||
# The target_configs parameter only has configs for managed services.
|
||||
# Since targets_by_service is derived from it, its key set should be equal
|
||||
# to appengine.SERVICES.
|
||||
if targets_by_service.keys() != appengine.SERVICES:
|
||||
cannot_rollback = appengine.SERVICES.difference(
|
||||
targets_by_service.keys())
|
||||
raise common.CannotRollbackError(
|
||||
f'Target version(s) not found for {cannot_rollback}')
|
||||
|
||||
plan = []
|
||||
for service_id, versions in targets_by_service.items():
|
||||
serving_configs = serving_by_service.get(service_id, set())
|
||||
versions_to_stop = serving_configs.difference(versions)
|
||||
chosen_target = list(versions)[0]
|
||||
plan.append(ServiceRollback(chosen_target,
|
||||
frozenset(versions_to_stop)))
|
||||
|
||||
return tuple(plan)
|
||||
|
||||
|
||||
# yapf: disable
|
||||
def _generate_steps(
|
||||
gcs_client: gcs.GcsClient,
|
||||
appengine_admin: appengine.AppEngineAdmin,
|
||||
env: str,
|
||||
target_release: str,
|
||||
rollback_plan: Tuple[ServiceRollback, ...]
|
||||
) -> Tuple[steps.RollbackStep, ...]:
|
||||
# yapf: enable
|
||||
"""Generates the sequence of operations for execution.
|
||||
|
||||
A rollback consists of the following steps:
|
||||
1. Run schema compatibility test for the target release.
|
||||
2. For each service,
|
||||
a. If the target version does not use automatic scaling, start it.
|
||||
i. If the target version uses manual scaling, set its instances to the
|
||||
configured values.
|
||||
b. If the target version uses automatic scaling, do nothing.
|
||||
3. For each service, immediately direct all traffic to the target version.
|
||||
4. For each service, go over its versions to be stopped:
|
||||
a. If a version uses automatic scaling, do nothing.
|
||||
b. If a version does not use automatic scaling, stop it.
|
||||
i. If a version uses manual scaling, set its instances to 1 (one, the
|
||||
lowest value allowed on the REST API) to release the instances.
|
||||
5. Update the appropriate deployed tag file on GCS with the target release
|
||||
tag.
|
||||
|
||||
Returns:
|
||||
The sequence of operations to execute for rollback.
|
||||
"""
|
||||
rollback_steps = [
|
||||
steps.check_schema_compatibility(gcs_client.project, target_release,
|
||||
gcs_client.get_schema_tag(env))
|
||||
]
|
||||
|
||||
for plan in rollback_plan:
|
||||
if plan.target_version.scaling != common.AppEngineScaling.AUTOMATIC:
|
||||
rollback_steps.append(
|
||||
steps.start_or_stop_version(appengine_admin.project, 'start',
|
||||
plan.target_version))
|
||||
if plan.target_version.scaling == common.AppEngineScaling.MANUAL:
|
||||
rollback_steps.append(
|
||||
steps.set_manual_scaling_instances(
|
||||
appengine_admin, plan.target_version,
|
||||
plan.target_version.manual_scaling_instances))
|
||||
|
||||
for plan in rollback_plan:
|
||||
rollback_steps.append(
|
||||
steps.direct_service_traffic_to_version(appengine_admin.project,
|
||||
plan.target_version))
|
||||
|
||||
for plan in rollback_plan:
|
||||
for version in plan.serving_versions:
|
||||
if version.scaling != common.AppEngineScaling.AUTOMATIC:
|
||||
rollback_steps.append(
|
||||
steps.start_or_stop_version(appengine_admin.project,
|
||||
'stop', version))
|
||||
if version.scaling == common.AppEngineScaling.MANUAL:
|
||||
# Release all but one instance. Cannot set num_instances to 0
|
||||
# with this api.
|
||||
rollback_steps.append(
|
||||
steps.set_manual_scaling_instances(appengine_admin,
|
||||
version, 1))
|
||||
|
||||
rollback_steps.append(
|
||||
steps.update_deploy_tags(gcs_client.project, env, target_release))
|
||||
|
||||
return tuple(rollback_steps)
|
||||
|
||||
|
||||
def get_rollback_plan(gcs_client: gcs.GcsClient,
|
||||
appengine_admin: appengine.AppEngineAdmin, env: str,
|
||||
target_release: str) -> Tuple[steps.RollbackStep, ...]:
|
||||
"""Generates the sequence of rollback operations for execution."""
|
||||
target_versions = gcs_client.get_versions_by_release(env, target_release)
|
||||
serving_versions = appengine_admin.get_serving_versions()
|
||||
all_version_configs = appengine_admin.get_version_configs(
|
||||
target_versions.union(serving_versions))
|
||||
|
||||
target_configs = frozenset([
|
||||
config for config in all_version_configs if config in target_versions
|
||||
])
|
||||
serving_configs = frozenset([
|
||||
config for config in all_version_configs if config in serving_versions
|
||||
])
|
||||
rollback_plan = _get_service_rollback_plan(target_configs, serving_configs)
|
||||
return _generate_steps(gcs_client, appengine_admin, env, target_release,
|
||||
rollback_plan)
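The ordering described in the `_generate_steps` docstring can be sketched without any GCP dependency. This is a simplified illustration, not the tool's implementation: the `Version` tuple, the string scaling names, and the step labels are hypothetical stand-ins for `common.VersionConfig`, `common.AppEngineScaling`, and `steps.RollbackStep`.

```python
from typing import List, NamedTuple, Tuple


class Version(NamedTuple):
    """Hypothetical stand-in for common.VersionConfig."""
    service_id: str
    version_id: str
    scaling: str  # 'automatic', 'basic', or 'manual'


def order_steps(plan: List[Tuple[Version, List[Version]]]) -> List[str]:
    """Returns step labels in the order the docstring describes."""
    ordered = ['check-schema']
    # Start every target before any traffic is redirected.
    for target, _ in plan:
        if target.scaling != 'automatic':
            ordered.append(f'start {target.version_id}')
        if target.scaling == 'manual':
            ordered.append(f'set-instances {target.version_id}')
    # Then move traffic in every service.
    for target, _ in plan:
        ordered.append(f'set-traffic {target.service_id}')
    # Only afterwards stop the versions that no longer serve.
    for _, to_stop in plan:
        for version in to_stop:
            if version.scaling != 'automatic':
                ordered.append(f'stop {version.version_id}')
            if version.scaling == 'manual':
                ordered.append(f'set-instances {version.version_id} 1')
    ordered.append('update-deploy-tag')
    return ordered


example = order_steps([
    (Version('pubapi', 'v009', 'manual'),
     [Version('pubapi', 'v010', 'manual')]),
    (Version('tools', 'v009', 'automatic'),
     [Version('tools', 'v010', 'automatic')]),
])
```

Running the three loops over the whole plan, rather than handling one service at a time, keeps all starts ahead of all traffic changes and all traffic changes ahead of all stops.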
|
129
release/rollback/rollback_test.py
Normal file
|
@ -0,0 +1,129 @@
|
|||
# Copyright 2020 The Nomulus Authors. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
"""End-to-end test of rollback."""
|
||||
import textwrap
|
||||
from typing import Any, Dict
|
||||
import unittest
|
||||
from unittest import mock
|
||||
|
||||
import appengine_test
|
||||
import gcs_test
|
||||
import plan
|
||||
|
||||
|
||||
def _make_serving_version(service: str, version: str) -> Dict[str, Any]:
|
||||
"""Creates description of one serving version in API response."""
|
||||
return {
|
||||
'split': {
|
||||
'allocations': {
|
||||
version: 1,
|
||||
}
|
||||
},
|
||||
'id': service
|
||||
}
|
||||
|
||||
|
||||
def _make_version_config(version: str,
|
||||
scaling: str,
|
||||
instance_tag: str,
|
||||
instances: int = 10) -> Dict[str, Any]:
|
||||
"""Creates one version config as part of an API response."""
|
||||
return {scaling: {instance_tag: instances}, 'id': version}
|
||||
|
||||
|
||||
class RollbackTestCase(unittest.TestCase):
|
||||
"""End-to-end test of rollback."""
|
||||
def setUp(self) -> None:
|
||||
self._appengine_admin, self._appengine_request = (
|
||||
appengine_test.setup_appengine_admin())
|
||||
self._gcs_client, self._schema_tag, self._version_map = (
|
||||
gcs_test.setup_gcs_client('crash'))
|
||||
self.addCleanup(mock.patch.stopall)
|
||||
|
||||
def test_rollback_success(self):
|
||||
self._schema_tag.download_as_text.return_value = (
|
||||
'nomulus-20201014-RC00')
|
||||
self._version_map.download_as_text.return_value = textwrap.dedent("""\
|
||||
nomulus-20201014-RC00,backend,nomulus-backend-v009
|
||||
nomulus-20201014-RC00,default,nomulus-default-v009
|
||||
nomulus-20201014-RC00,pubapi,nomulus-pubapi-v009
|
||||
nomulus-20201014-RC00,tools,nomulus-tools-v009
|
||||
nomulus-20201014-RC01,backend,nomulus-backend-v011
|
||||
nomulus-20201014-RC01,default,nomulus-default-v010
|
||||
nomulus-20201014-RC01,pubapi,nomulus-pubapi-v010
|
||||
nomulus-20201014-RC01,tools,nomulus-tools-v010
|
||||
""")
|
||||
self._appengine_request.execute.side_effect = [
|
||||
# Response to get_serving_versions:
|
||||
{
|
||||
'services': [
|
||||
_make_serving_version('backend', 'nomulus-backend-v011'),
|
||||
_make_serving_version('default', 'nomulus-default-v010'),
|
||||
_make_serving_version('pubapi', 'nomulus-pubapi-v010'),
|
||||
_make_serving_version('tools', 'nomulus-tools-v010')
|
||||
]
|
||||
},
|
||||
# Responses to get_version_configs. AppEngineAdmin queries the
|
||||
# services in alphabetical order to facilitate this test.
|
||||
{
|
||||
'versions': [
|
||||
_make_version_config('nomulus-backend-v009',
|
||||
'basicScaling', 'maxInstances'),
|
||||
_make_version_config('nomulus-backend-v011',
|
||||
'basicScaling', 'maxInstances')
|
||||
]
|
||||
},
|
||||
{
|
||||
'versions': [
|
||||
_make_version_config('nomulus-default-v009',
|
||||
'basicScaling', 'maxInstances'),
|
||||
_make_version_config('nomulus-default-v010',
|
||||
'basicScaling', 'maxInstances')
|
||||
]
|
||||
},
|
||||
{
|
||||
'versions': [
|
||||
_make_version_config('nomulus-pubapi-v009',
|
||||
'manualScaling', 'instances'),
|
||||
_make_version_config('nomulus-pubapi-v010',
|
||||
'manualScaling', 'instances')
|
||||
]
|
||||
},
|
||||
{
|
||||
'versions': [
|
||||
_make_version_config('nomulus-tools-v009',
|
||||
'automaticScaling',
|
||||
'maxTotalInstances'),
|
||||
_make_version_config('nomulus-tools-v010',
|
||||
'automaticScaling',
|
||||
'maxTotalInstances')
|
||||
]
|
||||
}
|
||||
]
|
||||
|
||||
steps = plan.get_rollback_plan(self._gcs_client, self._appengine_admin,
|
||||
'crash', 'nomulus-20201014-RC00')
|
||||
self.assertEqual(len(steps), 14)
|
||||
self.assertRegex(steps[0].info(),
|
||||
'.*nom_build :integration:sqlIntegrationTest.*')
|
||||
self.assertRegex(steps[1].info(), '.*gcloud app versions start.*')
|
||||
self.assertRegex(steps[5].info(),
|
||||
'.*gcloud app services set-traffic.*')
|
||||
self.assertRegex(steps[9].info(), '.*gcloud app versions stop.*')
|
||||
self.assertRegex(steps[13].info(),
|
||||
r'.*echo nomulus-20201014-RC00 \| gsutil cp -.*')
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
unittest.main()
|
178
release/rollback/rollback_tool.py
Normal file
|
@ -0,0 +1,178 @@
|
|||
# Copyright 2020 The Nomulus Authors. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
"""Script to roll back the Nomulus server on AppEngine."""
|
||||
|
||||
import argparse
|
||||
import dataclasses
|
||||
import sys
|
||||
import textwrap
|
||||
from typing import Any, Optional, Tuple
|
||||
|
||||
import appengine
|
||||
import gcs
|
||||
import plan
|
||||
|
||||
MAIN_HELP = 'Script to roll back the Nomulus server on AppEngine.'
|
||||
ROLLBACK_HELP = 'Rolls back Nomulus to the target release.'
|
||||
GET_SERVING_RELEASE_HELP = 'Shows the release tag(s) of the serving versions.'
|
||||
GET_RECENT_DEPLOYMENTS_HELP = ('Shows recently deployed versions and their '
|
||||
'release tags.')
|
||||
ROLLBACK_MODE_HELP = textwrap.dedent("""\
|
||||
The execution mode.
|
||||
- dryrun: Prints descriptions of all steps.
|
||||
- interactive: Prompts for confirmation before executing
|
||||
each step.
|
||||
- auto: Executes all steps in one go.
|
||||
""")
|
||||
|
||||
|
||||
@dataclasses.dataclass(frozen=True)
|
||||
class Argument:
|
||||
"""Describes a command line argument.
|
||||
|
||||
This class is for use with argparse.ArgumentParser. Except for the
|
||||
'arg_names' attribute which specifies the argument name and/or flags, all
|
||||
other attributes must match an accepted parameter in the parser's
|
||||
add_argument() method.
|
||||
"""
|
||||
|
||||
arg_names: Tuple[str, ...]
|
||||
help: str
|
||||
default: Optional[Any] = None
|
||||
required: bool = True
|
||||
choices: Optional[Tuple[str, ...]] = None
|
||||
|
||||
def get_arg_attrs(self):
|
||||
return dict((k, v) for k, v in vars(self).items() if k != 'arg_names')
|
||||
|
||||
|
||||
ARGUMENTS = (Argument(('--dev_project', '-d'),
|
||||
'The GCP project with Nomulus deployment records.'),
|
||||
Argument(('--project', '-p'),
|
||||
'The GCP project where the Nomulus server is deployed.'),
|
||||
Argument(('--env', '-e'),
|
||||
'The name of the Nomulus server environment.',
|
||||
choices=('production', 'sandbox', 'crash', 'alpha')))
|
||||
|
||||
ROLLBACK_ARGUMENTS = (Argument(('--target_release', '-t'),
|
||||
'The release to be deployed.'),
|
||||
Argument(('--run_mode', '-m'),
|
||||
ROLLBACK_MODE_HELP,
|
||||
required=False,
|
||||
default='dryrun',
|
||||
choices=('dryrun', 'interactive', 'auto')))
|
||||
|
||||
|
||||
def rollback(dev_project: str, project: str, env: str, target_release: str,
|
||||
run_mode: str) -> None:
|
||||
"""Rolls back a Nomulus server to the target release.
|
||||
|
||||
Args:
|
||||
dev_project: The GCP project with deployment records.
|
||||
project: The GCP project of the Nomulus server.
|
||||
env: The environment name of the Nomulus server.
|
||||
target_release: The tag of the release to be brought up.
|
||||
run_mode: How to handle the rollback steps: print-only (dryrun),
|
||||
one step at a time with user confirmation (interactive),
|
||||
or all steps in one shot (auto).
|
||||
"""
|
||||
steps = plan.get_rollback_plan(gcs.GcsClient(dev_project),
|
||||
appengine.AppEngineAdmin(project), env,
|
||||
target_release)
|
||||
|
||||
print('Rollback steps:\n\n')
|
||||
|
||||
for step in steps:
|
||||
print(f'{step.info()}\n')
|
||||
|
||||
if run_mode == 'dryrun':
|
||||
continue
|
||||
|
||||
if run_mode == 'interactive':
|
||||
confirmation = input(
|
||||
'Do you wish to (c)ontinue, (s)kip, or (a)bort? ')
|
||||
if confirmation == 'a':
|
||||
return
|
||||
if confirmation == 's':
|
||||
continue
|
||||
|
||||
step.execute()
|
||||
|
||||
|
||||
def show_serving_release(dev_project: str, project: str, env: str) -> None:
|
||||
"""Shows the release tag(s) of the currently serving versions."""
|
||||
serving_versions = appengine.AppEngineAdmin(project).get_serving_versions()
|
||||
versions_to_tags = gcs.GcsClient(dev_project).get_releases_by_versions(
|
||||
env, serving_versions)
|
||||
print(f'{project}:')
|
||||
for version, tag in versions_to_tags.items():
|
||||
print(f'{version.service_id}\t{version.version_id}\t{tag}')
|
||||
|
||||
|
||||
def show_recent_deployments(dev_project: str, project: str, env: str) -> None:
|
||||
"""Shows the release and version of recent deployments."""
|
||||
num_services = len(appengine.SERVICES)
|
||||
num_records = 3 * num_services
|
||||
print(f'{project}:')
|
||||
for version, tag in gcs.GcsClient(dev_project).get_recent_deployments(
|
||||
env, num_records).items():
|
||||
print(f'{version.service_id}\t{version.version_id}\t{tag}')
|
||||
|
||||
|
||||
def main() -> int:
|
||||
parser = argparse.ArgumentParser(prog='nom_rollback',
|
||||
description=MAIN_HELP)
|
||||
subparsers = parser.add_subparsers(dest='command',
|
||||
help='Supported commands', required=True)
|
||||
|
||||
rollback_parser = subparsers.add_parser(
|
||||
'rollback',
|
||||
help=ROLLBACK_HELP,
|
||||
formatter_class=argparse.RawTextHelpFormatter)
|
||||
for flag in ARGUMENTS:
|
||||
rollback_parser.add_argument(*flag.arg_names, **flag.get_arg_attrs())
|
||||
for flag in ROLLBACK_ARGUMENTS:
|
||||
rollback_parser.add_argument(*flag.arg_names, **flag.get_arg_attrs())
|
||||
|
||||
show_serving_release_parser = subparsers.add_parser(
|
||||
'show_serving_release', help=GET_SERVING_RELEASE_HELP)
|
||||
for flag in ARGUMENTS:
|
||||
show_serving_release_parser.add_argument(*flag.arg_names,
|
||||
**flag.get_arg_attrs())
|
||||
|
||||
show_recent_deployments_parser = subparsers.add_parser(
|
||||
'show_recent_deployments', help=GET_RECENT_DEPLOYMENTS_HELP)
|
||||
for flag in ARGUMENTS:
|
||||
show_recent_deployments_parser.add_argument(*flag.arg_names,
|
||||
**flag.get_arg_attrs())
|
||||
|
||||
args = parser.parse_args()
|
||||
command = args.command
|
||||
args = {k: v for k, v in vars(args).items() if k != 'command'}
|
||||
|
||||
{
|
||||
'rollback': rollback,
|
||||
'show_recent_deployments': show_recent_deployments,
|
||||
'show_serving_release': show_serving_release
|
||||
}[command](**args)
|
||||
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
try:
|
||||
sys.exit(main())
|
||||
except Exception as ex: # pylint: disable=broad-except
|
||||
print(ex)
|
||||
sys.exit(1)
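The way `main()` attaches one shared tuple of argument descriptions to several subcommands can be tried in isolation. The sketch below is an assumption-labeled miniature: `Flag`, `COMMON`, and `build_parser` are hypothetical names, and only `argparse` and `dataclasses` from the standard library are used.

```python
import argparse
import dataclasses
from typing import Any, Dict, Optional, Tuple


@dataclasses.dataclass(frozen=True)
class Flag:
    """Hypothetical, trimmed-down counterpart of the Argument class."""
    arg_names: Tuple[str, ...]
    help: str
    required: bool = True
    default: Optional[Any] = None

    def attrs(self) -> Dict[str, Any]:
        # Everything except the names is passed through to add_argument().
        return {k: v for k, v in vars(self).items() if k != 'arg_names'}


COMMON = (Flag(('--project', '-p'), 'Illustrative project flag.'),)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog='demo')
    subparsers = parser.add_subparsers(dest='command')
    for name in ('rollback', 'show_serving_release'):
        sub = subparsers.add_parser(name)
        for flag in COMMON:
            sub.add_argument(*flag.arg_names, **flag.attrs())
    return parser


args = build_parser().parse_args(['rollback', '-p', 'my-project'])
```

Because each field name of the dataclass matches a keyword of `add_argument()`, a new shared flag only needs one more entry in the tuple.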
|
152
release/rollback/steps.py
Normal file
|
@ -0,0 +1,152 @@
|
|||
# Copyright 2020 The Nomulus Authors. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
"""Definition of rollback steps and factory methods to create them."""
|
||||
|
||||
import dataclasses
|
||||
import subprocess
|
||||
import textwrap
|
||||
from typing import Tuple
|
||||
|
||||
import appengine
|
||||
import common
|
||||
|
||||
|
||||
@dataclasses.dataclass(frozen=True)
|
||||
class RollbackStep:
|
||||
"""One rollback step.
|
||||
|
||||
Most steps are implemented using commandline tools, e.g., gcloud and
|
||||
gsutil, and execute their commands by forking a subprocess. Each step
|
||||
also has an info method that returns its command with a description.
|
||||
|
||||
Two steps are handled differently. The _UpdateDeployTag step gets a piped
|
||||
shell command, which needs to be handled differently. The
|
||||
_SetManualScalingNumInstances step uses the AppEngine Admin API client in
|
||||
this package to set the number of instances. The Nomulus set_num_instances
|
||||
command is not working right now.
|
||||
"""
|
||||
|
||||
description: str
|
||||
command: Tuple[str, ...]
|
||||
|
||||
def info(self) -> str:
|
||||
return f'# {self.description}\n' f'{" ".join(self.command)}'
|
||||
|
||||
def execute(self) -> None:
|
||||
"""Executes the step.
|
||||
|
||||
Raises:
|
||||
CannotRollbackError: If the command fails.
|
||||
"""
|
||||
if subprocess.call(self.command) != 0:
|
||||
raise common.CannotRollbackError(f'Failed: {self.description}')
|
||||
|
||||
|
||||
def check_schema_compatibility(dev_project: str, nom_tag: str,
|
||||
sql_tag: str) -> RollbackStep:
|
||||
|
||||
return RollbackStep(description='Check compatibility with SQL schema.',
|
||||
command=(f'{common.get_nomulus_root()}/nom_build',
|
||||
':integration:sqlIntegrationTest',
|
||||
f'--schema_version={sql_tag}',
|
||||
f'--nomulus_version={nom_tag}',
|
||||
'--publish_repo='
|
||||
f'gcs://{dev_project}-deployed-tags/maven'))
|
||||
|
||||
|
||||
@dataclasses.dataclass(frozen=True)
|
||||
class _SetManualScalingNumInstances(RollbackStep):
|
||||
"""Sets the number of instances for a manual scaling version.
|
||||
|
||||
The Nomulus set_num_instances command is currently broken. This step uses
|
||||
the AppEngine REST API to update the version.
|
||||
"""
|
||||
|
||||
appengine_admin: appengine.AppEngineAdmin
|
||||
version: common.VersionKey
|
||||
num_instance: int
|
||||
|
||||
def execute(self) -> None:
|
||||
self.appengine_admin.set_manual_scaling_num_instance(
|
||||
self.version.service_id, self.version.version_id,
|
||||
self.num_instance)
|
||||
|
||||
|
||||
def set_manual_scaling_instances(appengine_admin: appengine.AppEngineAdmin,
|
||||
version: common.VersionConfig,
|
||||
num_instances: int) -> RollbackStep:
|
||||
|
||||
cmd_description = textwrap.dedent("""\
|
||||
Nomulus set_num_instances command is currently broken.
|
||||
This script uses the AppEngine REST API to update the version.
|
||||
To set this value without using this tool, you may use the REST API at
|
||||
https://cloud.google.com/appengine/docs/admin-api/reference/rest/v1beta/apps.services.versions/patch
|
||||
""")
|
||||
return _SetManualScalingNumInstances(
|
||||
f'Set number of instances for manual-scaling version '
|
||||
f'{version.version_id} in {version.service_id} to {num_instances}.',
|
||||
(cmd_description, ''), appengine_admin, version, num_instances)
|
||||
|
||||
|
||||
def start_or_stop_version(project: str, action: str,
|
||||
version: common.VersionKey) -> RollbackStep:
|
||||
"""Creates a rollback step that starts or stops an AppEngine version.
|
||||
|
||||
Args:
|
||||
project: The GCP project of the AppEngine application.
|
||||
action: The action to take, either 'start' or 'stop'.
|
||||
version: The version being managed.
|
||||
"""
|
||||
return RollbackStep(
|
||||
f'{action.title()} {version.version_id} in {version.service_id}',
|
||||
('gcloud', 'app', 'versions', action, version.version_id, '--quiet',
|
||||
'--service', version.service_id, '--project', project))
|
||||
|
||||
|
||||
def direct_service_traffic_to_version(
|
||||
project: str, version: common.VersionKey) -> RollbackStep:
|
||||
return RollbackStep(
|
||||
f'Direct all traffic to {version.version_id} in {version.service_id}.',
|
||||
('gcloud', 'app', 'services', 'set-traffic', version.service_id,
|
||||
'--quiet', f'--splits={version.version_id}=1', '--project', project))
|
||||
|
||||
|
||||
@dataclasses.dataclass(frozen=True)
|
||||
class _UpdateDeployTag(RollbackStep):
|
||||
"""Updates the deployment tag on GCS."""
|
||||
|
||||
nom_tag: str
|
||||
destination: str
|
||||
|
||||
def execute(self) -> None:
|
||||
with subprocess.Popen(('gsutil', 'cp', '-', self.destination),
|
||||
stdin=subprocess.PIPE) as p:
|
||||
try:
|
||||
p.communicate(self.nom_tag.encode('utf-8'))
|
||||
if p.wait() != 0:
|
||||
raise common.CannotRollbackError(
|
||||
f'Failed: {self.description}')
|
||||
except:
|
||||
p.kill()
|
||||
raise
|
||||
|
||||
|
||||
def update_deploy_tags(dev_project: str, env: str,
|
||||
nom_tag: str) -> RollbackStep:
|
||||
destination = f'gs://{dev_project}-deployed-tags/nomulus.{env}.tag'
|
||||
|
||||
return _UpdateDeployTag(
|
||||
f'Update Nomulus tag in {env}',
|
||||
(f'echo {nom_tag} | gsutil cp - {destination}', ''), nom_tag,
|
||||
destination)
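The `_UpdateDeployTag` step feeds the release tag to a child process through its standard input instead of going through a shell pipe. A minimal sketch of that technique follows. The `pipe_text` helper is hypothetical, and a child Python process stands in for `gsutil cp - <destination>` so that the example runs without GCP tooling.

```python
import subprocess
import sys


def pipe_text(command, text: str) -> int:
    """Feeds `text` to the stdin of `command` and returns its exit code."""
    with subprocess.Popen(command, stdin=subprocess.PIPE) as process:
        # communicate() writes the input, closes stdin, and waits for exit.
        process.communicate(text.encode('utf-8'))
        return process.returncode


# Hypothetical stand-in for gsutil: exits with 0 only if it reads the
# expected tag from its standard input.
checker = (sys.executable, '-c',
           'import sys; sys.exit(0 if sys.stdin.read() == "tag" else 1)')
exit_code = pipe_text(checker, 'tag')
```

Passing the command as a tuple and the payload through `communicate()` avoids building a shell string such as `echo ... | ...`, so the tag is never subject to shell quoting.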
|
5
rollback_tool
Executable file
|
@ -0,0 +1,5 @@
|
|||
#!/bin/sh
|
||||
# Wrapper for rollback_tool.py.
|
||||
cd "$(dirname "$0")"
|
||||
python3 ./release/rollback/rollback_tool.py "$@"
|
||||
exit $?
|