Logging

注解

scrapy.log has been deprecated alongside its functions in favor of explicit calls to the Python standard logging. Keep reading to learn more about the new logging system.

Scrapy uses Python’s builtin logging system for event logging. We’ll provide some simple examples to get you started, but for more advanced use-cases it’s strongly suggested to read thoroughly its documentation.

Logging works out of the box, and can be configured to some extent with the Scrapy settings listed in Logging settings.

Scrapy calls scrapy.utils.log.configure_logging() to set some reasonable defaults and handle those settings in Logging settings when running commands, so it’s recommended to manually call it if you’re running Scrapy from scripts as described in 在脚本中运行Scrapy.

Log levels

Python’s builtin logging defines 5 different levels to indicate severity on a given log message. Here are the standard ones, listed in decreasing order:

  1. logging.CRITICAL - for critical errors (highest severity)
  2. logging.ERROR - for regular errors
  3. logging.WARNING - for warning messages
  4. logging.INFO - for informational messages
  5. logging.DEBUG - for debugging messages (lowest severity)

How to log messages

Here’s a quick example of how to log a message using the logging.WARNING level:

import logging
logging.warning("This is a warning")

There are shortcuts for issuing log messages on any of the standard 5 levels, and there’s also a general logging.log method which takes a given level as argument. If you need so, last example could be rewrote as:

import logging
logging.log(logging.WARNING, "This is a warning")

On top of that, you can create different “loggers” to encapsulate messages (For example, a common practice it’s to create different loggers for every module). These loggers can be configured independently, and they allow hierarchical constructions.

Last examples use the root logger behind the scenes, which is a top level logger where all messages are propagated to (unless otherwise specified). Using logging helpers is merely a shortcut for getting the root logger explicitly, so this is also an equivalent of last snippets:

import logging
logger = logging.getLogger()
logger.warning("This is a warning")

You can use a different logger just by getting its name with the logging.getLogger function:

import logging
logger = logging.getLogger('mycustomlogger')
logger.warning("This is a warning")

Finally, you can ensure having a custom logger for any module you’re working on by using the __name__ variable, which is populated with current module’s path:

import logging
logger = logging.getLogger(__name__)
logger.warning("This is a warning")

参见

Module logging, HowTo
Basic Logging Tutorial
Module logging, Loggers
Further documentation on loggers

Logging from Spiders

Scrapy provides a logger within each Spider instance, that can be accessed and used like this:

import scrapy

class MySpider(scrapy.Spider):

    name = 'myspider'
    start_urls = ['http://scrapinghub.com']

    def parse(self, response):
        self.logger.info('Parse function called on %s', response.url)

That logger is created using the Spider’s name, but you can use any custom Python logger you want. For example:

import logging
import scrapy

logger = logging.getLogger('mycustomlogger')

class MySpider(scrapy.Spider):

    name = 'myspider'
    start_urls = ['http://scrapinghub.com']

    def parse(self, response):
        logger.info('Parse function called on %s', response.url)

Logging configuration

Loggers on their own don’t manage how messages sent through them are displayed. For this task, different “handlers” can be attached to any logger instance and they will redirect those messages to appropriate destinations, such as the standard output, files, emails, etc.

By default, Scrapy sets and configures a handler for the root logger, based on the settings below.

Logging settings

These settings can be used to configure the logging:

First couple of settings define a destination for log messages. If LOG_FILE is set, messages sent through the root logger will be redirected to a file named LOG_FILE with encoding LOG_ENCODING. If unset and LOG_ENABLED is True, log messages will be displayed on the standard error. Lastly, if LOG_ENABLED is False, there won’t be any visible log output.

LOG_LEVEL determines the minimum level of severity to display, those messages with lower severity will be filtered out. It ranges through the possible levels listed in Log levels.

LOG_FORMAT and LOG_DATEFORMAT specify formatting strings used as layouts for all messages. Those strings can contain any placeholders listed in logging’s logrecord attributes docs and datetime’s strftime and strptime directives respectively.

Command-line options

There are command-line arguments, available for all commands, that you can use to override some of the Scrapy settings regarding logging.

参见

Module logging.handlers
Further documentation on available handlers

scrapy.utils.log module