mirror of
https://github.com/cisagov/manage.get.gov.git
synced 2025-08-16 06:24:12 +02:00
Add more ops debugging tips
This commit is contained in:
parent
78a9c50561
commit
737184130f
1 changed files with 111 additions and 1 deletions
@ -1,5 +1,4 @@
# Operations
========================

Some basic information and setup steps are included in this README.

@ -46,3 +45,114 @@ Your sandbox space should've been setup as part of the onboarding process. If th
We are using the [WhiteNoise](http://whitenoise.evans.io/en/stable/index.html) plugin to serve our static assets on cloud.gov. This plugin is added to the `MIDDLEWARE` list in our app's `settings.py`.

Note that it’s a good idea to run `collectstatic` locally or in the docker container before pushing files up to your sandbox. This is because `collectstatic` relies on timestamps when deciding whether to overwrite the existing assets in `/public`. Due to the way files are uploaded, the compiled css in the `/assets/css` folder on your sandbox will have a slightly earlier timestamp than the files in `/public/css`, and consequently running `collectstatic` on your sandbox will not update `public/css` as you may expect. For convenience, both the `deploy.sh` and `build.sh` scripts take care of that.

# Debugging

Debugging errors observed in applications running on Cloud.gov requires being
able to see the log information from the environment that the application is
running in. There are (at least) three different ways to see that information:
the Cloud.gov dashboard, the CloudFoundry CLI application, and Cloud.gov Kibana
logging queries. There is also SSH access into Cloud.gov containers, and there
are GitHub Actions that can be used for specific tasks.

## Cloud.gov dashboard

At <https://dashboard.fr.cloud.gov/applications> there is a list of all of the
applications that a Cloud.gov user has access to. Clicking on an application
goes to a screen for that individual application, e.g.
<https://dashboard.fr.cloud.gov/applications/2oBn9LBurIXUNpfmtZCQTCHnxUM/53b88024-1492-46aa-8fb6-1429bdb35f95/summary>.
On that page is a left-hand link for "Log Stream", e.g.
<https://dashboard.fr.cloud.gov/applications/2oBn9LBurIXUNpfmtZCQTCHnxUM/53b88024-1492-46aa-8fb6-1429bdb35f95/log-stream>.
That log stream shows a stream of Cloud.gov log messages. Cloud.gov has
different layers that log requests. One is `RTR`, which is the router within
Cloud.gov. Messages from our Django app are prefixed with `APP/PROC/WEB`. While
it is possible to search inside the browser for particular log messages, this
is not a sophisticated interface for querying logs.

## CloudFoundry CLI

When logged in with the CloudFoundry CLI (see
[above](#authenticating-to-cloudgov-via-the-command-line)), CloudFoundry
application logs can be viewed with the `cf logs <application>` command, where
`<application>` is the name of the application in the currently targeted space.
By default `cf logs` starts a streaming view of log messages from the
application. It appears to show the same information as the dashboard web
application, but in the terminal. There is a `--recent` option that dumps
recent messages from before the current time rather than starting a stream of
the present log messages, but that is also not a full log archive and search
system.

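Because the `APP/PROC/WEB` prefix distinguishes Django messages from router (`RTR`) messages, a small filter can narrow saved or piped log text to the application's own lines. The following is a minimal sketch, not part of the project: the function name `app_lines` is hypothetical, and the sample lines are made up for illustration and do not reproduce the real `cf logs` layout.

```python
from typing import Iterable, Iterator

# Marker taken from the text above: Django app messages carry this prefix.
APP_MARKER = "APP/PROC/WEB"


def app_lines(lines: Iterable[str], marker: str = APP_MARKER) -> Iterator[str]:
    """Yield only the lines that mention the given log source marker."""
    for line in lines:
        if marker in line:
            yield line.rstrip("\n")


# Made-up sample lines for illustration only; real `cf logs` output
# has its own timestamp and column layout.
sample = [
    "[RTR/0] OUT example router line\n",
    "[APP/PROC/WEB/0] OUT example message from Django\n",
    "[APP/PROC/WEB/0] ERR example error from Django\n",
]
filtered = list(app_lines(sample))
for line in filtered:
    print(line)
```

The same function accepts any iterable of lines, so it could be fed `sys.stdin` when the output of `cf logs <application> --recent` is piped into a script.
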
CloudFoundry also offers a `run-task` command that can be used to run a single
command in a separate, short-lived container created from the application's
droplet. For example, to run our Django admin command that loads test fixture
data:

```
cf run-task getgov-nmb --command "./manage.py load" --name fixtures
```

However, this task runs asynchronously in the background without printing any
command output to the terminal, so it can be hard to know whether the command
has completed and, if so, whether it was successful.

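One way to follow up on a background task is the CLI's `cf tasks <application>` command, which lists an application's tasks along with their state. The sketch below only builds the argument lists for the two commands so they can be passed to `subprocess.run`; the helper names `run_task_argv` and `tasks_argv` are hypothetical, and nothing here invokes the CLI.

```python
import shlex
from typing import List


def run_task_argv(app: str, command: str, name: str) -> List[str]:
    """Build the argument list for `cf run-task`, mirroring the example above."""
    return ["cf", "run-task", app, "--command", command, "--name", name]


def tasks_argv(app: str) -> List[str]:
    """Build the argument list for `cf tasks`, which lists an app's tasks."""
    return ["cf", "tasks", app]


argv = run_task_argv("getgov-nmb", "./manage.py load", "fixtures")
# shlex.join quotes arguments so the printed line can be pasted into a shell.
print(shlex.join(argv))
print(shlex.join(tasks_argv("getgov-nmb")))
```

Passing a list such as `argv` to `subprocess.run(argv, check=True)` avoids shell quoting problems with the `--command` string. Task output is normally written to the application's log stream, so `cf logs <application> --recent` may also show what the task printed.
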
## Cloud.gov Kibana

Cloud.gov provides an instance of the log query program Kibana at
<https://logs.fr.cloud.gov>. Kibana is powerful but complicated software
that can take time to learn to use effectively. A few hints:

- Set the timeframe of the display appropriately; the default is the last
  15 minutes, which may not show any results in some environments.

- Kibana queries and filters can be used to narrow in on particular
  environments. Try the query `@source.type:APP` to focus on messages from the
  Django application or `@cf.app:"getgov-nmb"` to see results from a single
  environment.

Currently, our application emits Python's default log format, which is textual
and not record-based. In particular, tracebacks span multiple lines and show
up in Kibana as multiple records that are not necessarily connected. As the
application gets closer to production, we may want to switch to a JSON log format
in which each error is captured by Kibana as a single message, at the cost of a
slightly more difficult developer experience when reading logs by eye.

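To illustrate what a JSON log format could look like, here is a minimal sketch using only Python's standard `logging` and `json` modules. It is not the project's configuration: the class name `JsonFormatter`, the field names, and the logger name `jsonlog_demo` are all assumptions made for the example. The point is that a multi-line traceback ends up inside a single one-line record.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record, including any traceback, as one JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # formatException returns the multi-line traceback text; json.dumps
            # escapes the newlines so the whole record stays on one line.
            payload["traceback"] = self.formatException(record.exc_info)
        return json.dumps(payload)


# Log to an in-memory stream so the example is self-contained.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("jsonlog_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

try:
    1 / 0
except ZeroDivisionError:
    logger.exception("division failed")

output = stream.getvalue()
print(output, end="")
```

In a Django project a formatter like this would typically be referenced from the `LOGGING` setting; the trade-off noted above still applies, since the escaped traceback is harder to read directly in a terminal.
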
## SSH access

The CloudFoundry CLI provides SSH access to the running container of an
application. Use `cf ssh <application>` to SSH into the container. To make sure
that your shell is seeing the same configuration as the running application, be
sure to run `/tmp/lifecycle/shell` first.

Inside the container, the Python code should be in `/app`, and you can check
there to see if the expected version of the code is deployed in a particular file.
There is no hot-reloading inside the container, so it isn't possible to make
code changes there and see the results reflected in the running application.
(Templates may be read directly from disk on every page load, so it is possible
that you could change a page template and see the result in the application.)

Inside the container, it can be useful to run various Django admin commands
using `./manage.py`. For example, `./manage.py shell` gives a
Python interpreter where code can be run to modify objects in the database, say
to make a user an administrator.

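As a rough sketch of the "make a user an administrator" case, the helper below sets the two standard Django flags, `is_staff` and `is_superuser`, and saves the object. It assumes the project's user model exposes those fields, as Django's built-in user models do; the function name `promote_to_admin` and the example username are hypothetical. A stand-in object is used so the snippet runs without Django installed.

```python
from types import SimpleNamespace


def promote_to_admin(user):
    """Set the standard Django admin flags on a user object and save it."""
    user.is_staff = True       # may log in to the Django admin site
    user.is_superuser = True   # has all permissions without explicit grants
    user.save()
    return user


# In `./manage.py shell` the user would come from the ORM, for example:
#   from django.contrib.auth import get_user_model
#   user = get_user_model().objects.get(username="someone@example.gov")  # hypothetical
# The stand-in below only lets the sketch run outside of Django.
saved = []
stub = SimpleNamespace(is_staff=False, is_superuser=False)
stub.save = lambda: saved.append(True)  # records that save() was called
promote_to_admin(stub)
```

Keeping the change in a small function makes it easy to paste into the shell and apply to whichever user object was looked up.
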
## GitHub Actions

To allow some ops activities by people without the CloudFoundry CLI on a
laptop, we have some ops-related actions under
<https://github.com/cisagov/getgov/actions>.

### Migrate data

This GitHub action runs Django's `manage.py migrate` command on the specified
environment. **This is the first thing to try when fixing 500 errors from an
application environment**. The migrations should be idempotent (Django records
which migrations have already been applied and skips them), so running the
same migrations more than once should never cause an additional problem.

### Reset database

Very occasionally, there are migrations that don't succeed when run against a
database with data already in it. This action drops the database and re-creates
it with the latest model schema. Once the application has launched, this should
never be used on the `stable` environment, but during development it may be
useful on the various sandbox environments. After launch, fixing a failed
schema change of this kind may require the involvement of a skilled DBA.