Logging is mostly about trade off between easy of troubleshooting and the overhead. Log too much makes the app slow, and noisy log output is also hard to trace. Log too little won’t help at all. Many of the points below are from the Python logging cookbook.
1/ Python’s logging module
Django make use of Python’s built-in logging module. The way you use it in traditional Python app is:
You can also use
logger = logging.getLogger(...), but it’s not necessary.
Django already have logging module setup, so you don’t need to call
basicConfig() anymore, just import logging and uses it.
- Error: Should be use for fatal errors and unrecoverable failure. Good places to use error log:
- wherever you have a try-except block, for example: failed call to external API. If you implement retry for API call, initial failure is a warning, and the final failed attempt is an error
- however, if the exception is handled by upstream handler you don’t need to log the error, for example in
get_querysetif DoesNotExists exception is raised Django will automatically respond with 404, however if you’re raising 404 for other reasons you could still include an error log.
- Warn: Deprecations and non-fatal errors. For example:
- failure on requests to external service when you have implemented retry
- requests with expired JWT.
- For our app the case where someone register with non alumni email address, and when user position does not have month are good places to put a warning.
- Some also do warning on slow queries and requests but I don’t. We can get this information from the timestamp of the log message using Logstash and I think we should use tools that are specialized for metrics like Prometheus or InfluxDB.
- Debug: Places to help you troubleshooting issues
- If you’re lazy you can just add a
logging.debugto the beginning of every function calls, possibly with their parameters included. Doing so adds a lot of noise to the code base but in my experience it’s worth it. Django CBVs don’t have any logging, so you can only do logging on the methods you override. Be careful though, doing this carelessly would lead to embarrassing incidents. https://www.bleepingcomputer.com/news/security/twitter-admits-recording-plaintext-passwords-in-internal-logs-just-like-github/.
- If you don’t like the noise then use
logging.debugon critical flows only.
- Info: Anything else. Some common places are:
- the beginning and the end of a request. Django already have request log but you can do better
- when a background job started and finished
- logging the Django settings on starting up. You can log all the relevant settings in
coremodule. Or just log the relevant settings for each modules on it’s
- end of import tasks where you shows how many records
- Elasticsearch log handler: https://github.com/cmanaha/python-elasticsearch-logger. My past projects all uses custom log format and handlers so I’m not sure about this one. There’s also https://github.com/hoffmann/stash, which also does index rotate daily - very helpful because ES don’t like large indexes.
- Request ID
- https://github.com/dabapps/django-log-request-id or https://github.com/nigma/django-request-id
- If you use ES log handler or structured logging you don’t have to worry about this https://docs.python.org/3/howto/logging-cookbook.html#implementing-structured-logging
- If you log to file, please make sure at least the following information is included:
- Log level
- Request ID
- File location or Package
- Formatting: Call to logging module will always incur overhead, even if that log level is not enabled. To reduce the overhead, use the logging module’s formatting capability.
is better than
because the later always format the log message before sending to the handler, which always call
data.__str__(). The former call does not do that if log level is higher than DEBUG.
2/ Django’s logging Read Django’s guide: https://docs.djangoproject.com/en/2.0/topics/logging/
- Django logging settings are done by LOGGING settings
- Some of the log namespaces that should be enabled:
- django.db.backends: should enable duration and sql, but not params
- django.request (for request id)
These are my personal opinion:
- Between logging and metrics I prefer logging. Metrics are for optimization , they are nice to have but they’re only helpful when you have a lot of users and requests. Logging is more useful and more important in every other situation. Many of the metrics can be measure by analyzing to log, given you have detailed enough logs.
- ES don’t like large indexes, we will need to have the index used for logging rotate every week/day.
- Logging can be a major overhead, especially logging over network (.i.e to ES). Logging might also fail, for example: disk full or network outage. That’s why many people still log to file and log file rotation. And have system like Logstash collect them later. For this project I think it’s ok to use ES for logging.
- Log level is for filtering, I’d like to set logging level to debug all the times, even in production so that I can gather enough data to troubleshoot issues and possibly recover the errors.