Scheduled tasks¶
We normally write scheduled tasks as Django management commands, which are executed on the command line.
The tasks¶
- django-admin clearsessions - once a day (not necessary, but useful). Built-in to Django.
- django-admin update_index - once a day (not necessary, but useful to make sure the search index stays intact). Defined by the Haystack package.
- django-admin publish_scheduled_pages - every 10 minutes or more often. This is necessary to make publishing scheduled pages work. Defined by the Wagtail package.
- django-admin import_cis_content - once a week (see the CIS import scheduling page). User-defined in the /tate/art/management/commands/ directory.
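The first three commands are provided by Django, Haystack and Wagtail respectively; user-defined ones such as import_cis_content are ordinary Django management commands. As a reminder of what such a command looks like, here is a minimal sketch (the file name and logic are hypothetical, not the real importer):

```python
# Hypothetical example file, e.g. /tate/art/management/commands/my_scheduled_task.py
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = "Example scheduled task implemented as a management command."

    def handle(self, *args, **options):
        # The real commands (e.g. import_cis_content) do their work here.
        # Anything written to self.stdout becomes part of the command output.
        self.stdout.write(self.style.SUCCESS("Done."))
```

Once defined, it is run and scheduled in exactly the same way as the built-in commands, e.g. django-admin my_scheduled_task.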
In Azure¶
For Azure App Service hosting, there is no cron-like scheduler for commands – instead, we use Azure Functions since it supports scheduled function activations. Functions trigger HTTP requests to the main application, which then executes the management commands.
Each scheduled job has its own function, so it can have its own schedule and can be triggered manually from within the Azure portal: select the specific function and go to "Code and test".
The commands app in the Django project exposes the remote commands to the functions, with token-based authentication to make sure only Azure Functions can trigger commands.
Command execution logs are redirected back to the HTTP response for each trigger, so they are logged in the Azure Functions logs.
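As an illustration of this pattern (a sketch, not the project's actual function code), a timer-triggered function written with the Azure Functions Python library might look roughly like the following. It reads its configuration from the Function App settings described below and POSTs to the relevant command URL:

```python
# Illustrative sketch of one function, e.g. functions/PublishScheduledPages/__init__.py.
# The real function code lives in the functions/ folder; this is not copied from it.
import logging
import os
import uuid

import azure.functions as func
import requests


def main(mytimer: func.TimerRequest) -> None:
    base_url = os.environ["BASE_URL"]  # e.g. https://www.tate.org.uk, no trailing slash
    headers = {
        "cron-authentication-token": os.environ["AUTHENTICATION_TOKEN"],
        "cron-request-id": str(uuid.uuid4()),
        "accept": "application/json",
    }
    # Basic auth credentials are only needed when the Django app has basic auth enabled.
    auth = None
    if os.environ.get("BASIC_AUTH_LOGIN"):
        auth = (os.environ["BASIC_AUTH_LOGIN"], os.environ["BASIC_AUTH_PASSWORD"])

    url = f"{base_url}/commands/publish-scheduled-pages/"
    logging.info("Making a HTTP POST request to %s.", url)
    response = requests.post(url, headers=headers, auth=auth, timeout=300)
    # The command's output comes back in the response body, so logging it here
    # is what makes it visible in the Azure Functions logs.
    logging.info(response.text)
    response.raise_for_status()
```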
The Function App needs to have the following configuration variables set:
AUTHENTICATION_TOKEN
The token used by the function to authenticate with the main Django app. This
should match the CRON_AUTHENTICATION_TOKEN environment variable set on the
Django app.
This should be a reasonably secure random string.
BASE_URL
The base URL where the main Django app lives. e.g. https://www.tate.org.uk
NB: When setting this variable, do not include the trailing slash.
BASIC_AUTH_LOGIN and BASIC_AUTH_PASSWORD
These are required when the Django app has basic auth enabled. Otherwise, you can leave these variables unset.
Deployments¶
Functions are deployed to Azure Functions via GitHub Actions, just like the main application. We only trigger function deployments once the main application deployment has been successful. There isn’t any particular sync mechanism during deployments, so there is always a risk that functions trigger executions against an application version newer than expected.
Local development¶
First off, it’s always possible to manually run any of the commands from the command line.
To test the Azure Functions setup, you will need the Azure Functions core tools v3 (installed in Docker by default).
# 1. Go to the `functions` subfolder.
cd functions
# 2. Create a local config file.
cp local.settings.json.example local.settings.json
# 3. In this config file, update AzureWebJobsStorage with the connection string to an Azure Storage account.
The AzureWebJobsStorage Azure storage account connection string is needed so function logs are stored somewhere. It should look like: DefaultEndpointsProtocol=https;AccountName=mystorageaccountname;AccountKey=mybase64storageaccountkey;EndpointSuffix=core.windows.net. There is a local development option, UseDevelopmentStorage=true, but unfortunately it is only available on Windows.
To retrieve your connection string, go to the intended Storage account in Azure Portal, then go to Access keys under Security + networking.
Once this is all set up, you can run func start to start the functions host. Make sure the local Django server is running. Here is sample output (truncated) to demonstrate what the local setup should do:
Azure Functions Core Tools
Core Tools Version: 3.0.3568 Commit hash: e30a0ede85fd498199c28ad699ab2548593f759b (64-bit)
Function Runtime Version: 3.0.15828.0
Functions:
ClearSessions: timerTrigger
PublishScheduledPages: timerTrigger
UpdateIndex: timerTrigger
For detailed output, run func with --verbose flag.
Worker process started and initialized.
Host lock lease acquired by instance ID '0000000000000000000000007FE97AD0'.
Executing 'Functions.PublishScheduledPages' (Reason='Timer fired at 2021-07-30T09:03:29.9712856+00:00', Id=[…])
Making a HTTP POST request to http://localhost:8000/commands/publish-scheduled-pages/.
{'detail': {}}
Executed 'Functions.PublishScheduledPages' (Succeeded, Id=[…], Duration=73ms)
Celery Background Worker¶
To prevent long-running tasks from timing out the Gunicorn request before completing, Celery is used to schedule and execute tasks in the background. For consistency and to allow automatic retries where sensible, all the scheduled tasks described above are added to the Celery queue upon receipt of the 'cron' web request from Azure Functions.
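Putting the pieces together, the view behind each /commands/ URL does little more than authenticate the request and hand the work to Celery. The sketch below is illustrative only; the helper task name (run_management_command) and the exact settings attribute are assumptions, not the project's actual code:

```python
# Rough sketch of a commands view; names are illustrative, not the real module.
from django.conf import settings
from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt
from django.views.decorators.http import require_POST

from tate.background_tasks.tasks import run_management_command  # hypothetical task name


@csrf_exempt
@require_POST
def trigger_command(request, command_name):
    token = request.headers.get("cron-authentication-token")
    if token != settings.CRON_AUTHENTICATION_TOKEN:
        return JsonResponse({"detail": "Invalid token."}, status=403)

    # Queue the management command so the HTTP request returns quickly and
    # Celery can retry the work where sensible.
    run_management_command.delay(command_name)
    return JsonResponse({"detail": {}})
```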
Celery config¶
The Celery tasks are defined in /tate/background_tasks/tasks.py, and extra configuration such as the number of retries and the delay between attempts is defined on each task.
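A task with its retry configuration might look roughly like this; the task shown and its retry values are illustrative, the real ones live on each task in tasks.py:

```python
# Illustrative only; the real tasks and retry values are in /tate/background_tasks/tasks.py.
from celery import shared_task
from django.core.management import call_command


@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def publish_scheduled_pages(self):
    try:
        call_command("publish_scheduled_pages")
    except Exception as exc:
        # Re-queue the task; Celery waits default_retry_delay seconds between attempts.
        raise self.retry(exc=exc)
```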
Starting the Celery worker¶
- The Celery worker will start automatically in the tate-wagtail-web container via the startup command defined in /rootfs/app/start-celery.sh, which is executed by Supervisor as defined in /rootfs/etc/supervisor/conf.d/celery.conf.
- To start the worker locally, call the shell file manually from the terminal.
- It's important for Celery to run in the web container to allow it to receive tasks as triggered by the cron web requests, and execute the tasks in the same Django application.
Testing 'cron' web requests locally¶
To trigger a web-requested command locally, open Postman or similar and set up a request as follows:
- POST http://localhost:8000/commands/{{command_name}}/ (the trailing slash is required).
- Headers:
- cron-authentication-token = {{cronToken}}
- cron-request-id = A_UUID_IDENTIFIER
- accept = application/json
Where command_name is defined in /tate/commands/urls.py as kebab-case names, e.g. clear-sessions.
For local testing, cronToken is llamasavers.
This is effectively the same as the request that will be made by Azure Functions to the live site.
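If you prefer a script to Postman, a roughly equivalent request using Python's requests library (with the local test token above) is:

```python
import uuid

import requests

response = requests.post(
    "http://localhost:8000/commands/clear-sessions/",  # trailing slash required
    headers={
        "cron-authentication-token": "llamasavers",  # local test token
        "cron-request-id": str(uuid.uuid4()),
        "accept": "application/json",
    },
)
print(response.status_code, response.json())
```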
Monitoring Tasks¶
Azure Application Insights¶
If the environment variable APPINSIGHTS_CONNECTION_STRING is set, Celery task events are logged to Application Insights.
Failed tasks are recorded as severity 3 (error). Retry attempts are recorded as severity 2 (warning). Task queuing (sent), start and completion (success) are recorded as severity 1 (information).
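One common way to wire this kind of mapping up is via Celery signals and standard Python logging levels, which Application Insights translates into severities (ERROR becomes severity 3, WARNING severity 2, INFO severity 1). The sketch below illustrates that mapping only; the project's actual exporter and handler configuration may differ:

```python
# Sketch of how task events can be mapped to log levels / App Insights severities.
# The logger name and wiring are illustrative, not the project's actual setup.
import logging

from celery.signals import task_failure, task_retry, task_success

logger = logging.getLogger("celery.tasks")


@task_failure.connect
def on_task_failure(sender=None, exception=None, **kwargs):
    # Logged as ERROR -> Application Insights severity 3.
    logger.error("Task %s failed: %r", sender.name, exception)


@task_retry.connect
def on_task_retry(sender=None, reason=None, **kwargs):
    # Logged as WARNING -> severity 2.
    logger.warning("Task %s retrying: %s", sender.name, reason)


@task_success.connect
def on_task_success(sender=None, **kwargs):
    # Logged as INFO -> severity 1.
    logger.info("Task %s succeeded", sender.name)
```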
Similarly, the Function App used to queue up the tasks logs information, warnings and errors to Application Insights too.
Monitoring alerts can be set up to email an action group upon detection of a log message of given severity or message content. This should be defined in Terraform in the Tate Web Infra repo.
Checking Celery logs¶
If you are testing locally or want more detail on a given web app instance, you can obtain more detailed logs from Celery.
- The logs are set to output to /home/LogFiles/celery.log.
- You can see live updates by running tail -f /home/LogFiles/celery.log. This outputs all logging from Celery, including logs from the tasks being executed.
Checking queued tasks directly in Redis¶
If tasks are currently waiting to be picked up by a worker, or have been scheduled for later (e.g. a retry task with a timeout), then you can see a list of these in Redis under key celery.
- Enter the shell of tate-wagtail-redis using fab sh --service redis.
- Execute the command redis-cli.
- Execute the Redis CLI command lrange celery 0 -1 to get the full list of queued tasks. This does not include running tasks, only those waiting.
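If you prefer to inspect the queue from Python rather than redis-cli, a small sketch along these lines works, assuming the redis-py package and Celery's default JSON message protocol (the host and port shown are for a typical local setup):

```python
# Sketch: list the names of tasks waiting in the Celery queue.
# Assumes the redis-py package and Celery's default JSON task protocol.
import json

import redis

r = redis.Redis(host="localhost", port=6379)  # adjust host/port for your setup

for raw in r.lrange("celery", 0, -1):
    message = json.loads(raw)
    # With the default task protocol, the task name lives in the message headers.
    print(message["headers"]["task"])
```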