view doc/deployment.rst @ 66:f33cb3e93473 draft

Fix bad log invocations.
author David Barts <n5jrn@me.com>
date Fri, 12 Jul 2019 20:27:52 -0700
parents 25fdd985d046

**********
Deployment
**********

========================
Bottle's Built-In Server
========================

The ``launch`` command will simply use the WSGI server built into Bottle (built into the Python standard library, actually) to serve the routes defined by the webapp. This is a multithreaded server (see the :ref:`multithreading-problem`), which is primarily intended for debugging and light-duty use.

Static and Dynamic Routes
-------------------------

If you're using the built-in server, one thing to be aware of is that *by default, only dynamic content is served*. That means, if you create pages that reference style sheet, image, or font files, all these resources will cause 404 errors when a user agent attempts to request them from the server, because no routes exist to serve such content.

If you specify the ``--static`` option to ``launch``, then for every file that is not related to dynamic content being generated by TinCan, a route will be created to serve that file's content. The HTTP ``Content-Type`` header for such routes will be set based on the file's extension, using the Python ``mimetypes`` library.
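
To get a feel for how an extension maps to a ``Content-Type``, here is a minimal sketch using the standard library's ``mimetypes.guess_type``. The helper name ``content_type_for`` and the fallback value are assumptions made for illustration, not necessarily what TinCan does internally.

```python
import mimetypes

def content_type_for(filename, default="application/octet-stream"):
    """Guess a Content-Type from a file name's extension.

    Both this helper and its fallback value are hypothetical; they only
    illustrate the kind of lookup the mimetypes module performs.
    """
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default

for name in ("style.css", "logo.png", "README"):
    print(name, "->", content_type_for(name))
```

A file whose extension is unknown to ``mimetypes`` gets no guess at all, which is why the sketch supplies a generic default.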

.. _example-apache-config:

====
WSGI
====

Bottle, and by implication TinCan, supports WSGI, so any web server that supports WSGI can be used to serve a TinCan webapp. This is generally to be preferred for production use.

How exactly to do this is beyond the scope of this document, but here's an example of serving a webapp using ``mod_wsgi`` under Apache. Note how rewrites are used to ensure that only *dynamic* content is served by TinCan, leaving static content to be served by Apache itself.

In the configuration file for the site in question::

    WSGIScriptAlias /ti_webapp /home/davidb/webapp.wsgi
    RewriteEngine on
    RewriteRule ^/ti/(.*\.pspx)$ /ti_webapp/$1 [PT]
    <LocationMatch "^/ti/WEB-INF/">
        Order deny,allow
        Deny from all
    </LocationMatch>
    <LocationMatch "^/ti/.*\.(py|pyc|pt)$">
        Order deny,allow
        Deny from all
    </LocationMatch>
    <Directory "/var/www/html/ti">
        DirectoryIndex index.html index.pspx
    </Directory>
    <Directory "/home/davidb">
        <Files "webapp.wsgi">
            Order deny,allow
            Allow from all
            Require all granted
        </Files>
    </Directory>
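
If the effect of the ``RewriteRule`` above is not obvious, the following sketch mimics it with Python's ``re`` module. It is only an approximation for experimenting with paths: Apache uses its own regular expression engine, and the function name ``rewrite`` is made up for this example.

```python
import re

# Same pattern as the RewriteRule in the configuration above.
PATTERN = re.compile(r"^/ti/(.*\.pspx)$")

def rewrite(path):
    """Return the path handed to TinCan, or None if Apache serves the file itself."""
    match = PATTERN.match(path)
    if match is None:
        return None
    # Mirrors the substitution "/ti_webapp/$1".
    return "/ti_webapp/" + match.group(1)

for path in ("/ti/index.pspx", "/ti/sub/page.pspx", "/ti/style.css"):
    print(path, "->", rewrite(path))
```

Only paths ending in ``.pspx`` are passed through to the WSGI script; everything else stays with Apache.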

The ``webapp.wsgi`` script::

    #!/usr/bin/env python3

    # C o n s t a n t s

    # Set the following appropriately.
    TINCAN_HOME = "/home/davidb/src/tincan"
    WEBAPP_HOME = "/var/www/html/ti"
    LOG_HOME = "/var/log/wsgi"
    LOG_NAME = "webapp"

    # I m p o r t s

    # First, fix up the path
    import os, sys
    if sys.path[0] == "":
        sys.path[0] = TINCAN_HOME
    else:
        sys.path.insert(0, TINCAN_HOME)

    # Then do the importing
    import bottle
    import logging
    from logging.handlers import RotatingFileHandler
    import tincan

    # M a i n   P r o g r a m

    # Set up logging
    logger = logging.getLogger(LOG_NAME)
    logger.setLevel(logging.INFO)
    handler = RotatingFileHandler(os.path.join(LOG_HOME, LOG_NAME+".log"),
        maxBytes=1048576, backupCount=5)
    handler.setFormatter(
        logging.Formatter(fmt="%(asctime)s - %(levelname)s: %(message)s"))
    logger.addHandler(handler)

    # And away we go
    application, errors = tincan.launch(fsroot=WEBAPP_HOME, logger=logger,
        multithread=False)

Note the bit about setting up logging. The WSGI standard provides no standard means by which to log errors. It *does* mention that WSGI code is strictly forbidden to write to either the standard output or the standard error stream. What this all means is that *unless you explicitly tell TinCan what to do with its error messages, it has no place to log them when running via WSGI. Thus, by default, TinCan is silent under WSGI.* So you almost certainly will want to create a logger and pass it to ``tincan.launch`` as in the example above.

TinCan is not particularly chatty with its logging, preferring to only log what routes it creates when starting up and then only logging something when a truly exceptional condition happens. Logging of requests is left to the main server. Therefore, running with a logging level of ``INFO`` will not produce an undesirably large amount of output to the logs.

.. _multithreading-problem:

==========================
The Multithreading Problem
==========================

Due to fundamental aspects of the language's design and implementation, Python isn't that great at running multithreaded code. The *global interpreter lock* (GIL) lets only a single thread execute Python bytecode at any given time. Thus, a busy multithreaded WSGI server is likely to spend much of its time unable to service new, incoming requests.

The simplest workaround is to avoid using threads with Python. This is what the default ``prefork`` execution model of ``mod_wsgi`` does, and it is recommended that you run ``mod_wsgi`` this way. If you do, you should set ``multithread=False`` when calling ``tincan.launch``, as it will optimize memory use somewhat by maximizing the degree to which templates are shared.

That said, using threads *could* conceivably be useful for webapps that spend a large amount of their time doing I/O (e.g. database queries), as when a Python app is waiting on I/O, it is in the operating system kernel, and thus not actively executing Python bytecode, allowing another thread to hold the GIL.
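
The effect is easy to observe in a small experiment. In the sketch below, ``time.sleep`` stands in for a blocking I/O call (an assumption made to keep the example self-contained). Four threads each wait 0.2 seconds, yet the waits overlap, so the total is far less than the 0.8 seconds a serial run would need.

```python
import threading
import time

def fake_io_request(delay):
    # time.sleep stands in for a blocking I/O call such as a database
    # query; like real blocking I/O, it releases the GIL while waiting.
    time.sleep(delay)

def run_concurrently(n_requests, delay):
    """Run n_requests simulated I/O waits in threads; return elapsed seconds."""
    threads = [threading.Thread(target=fake_io_request, args=(delay,))
               for _ in range(n_requests)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start

elapsed = run_concurrently(4, 0.2)
print("4 overlapping waits took %.2f seconds" % elapsed)
```

Replace the sleep with a CPU-bound loop and the advantage largely disappears, since only one thread can execute bytecode at a time.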

The simple server created by the ``launch`` command runs in a single process and thus always uses multithreading.

So far as being thread safe goes, you don't have to worry much about it. TinCan gives each request a completely separate copy of a page's code-behind, request, and response objects. Although templates *are* shared to some degree, locking is used to ensure only a single thread has access to shared template objects at any one time. In short, multithreading might cause performance issues, but I have taken pains to ensure it should not cause code to execute incorrectly.
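
For readers unfamiliar with the pattern, here is a generic sketch of lock-guarded sharing. It is *not* TinCan's actual code: the class ``SharedTemplateCache`` and its ``loader`` argument are hypothetical, and only illustrate how a lock keeps threads from touching a shared object at the same time.

```python
import threading

class SharedTemplateCache:
    """Hypothetical cache; only illustrates lock-guarded sharing."""

    def __init__(self, loader):
        self._loader = loader          # callable: name -> template object
        self._lock = threading.Lock()
        self._templates = {}

    def get(self, name):
        # Only one thread at a time may inspect or populate the cache.
        with self._lock:
            if name not in self._templates:
                self._templates[name] = self._loader(name)
            return self._templates[name]

loads = []

def load_template(name):
    # Stand-in for parsing a template file; records each load.
    loads.append(name)
    return "<template %s>" % name

cache = SharedTemplateCache(load_template)
workers = [threading.Thread(target=cache.get, args=("page",)) for _ in range(8)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print("template loaded", len(loads), "time(s)")
```

The price of this safety is that threads wanting the same shared object must wait their turn, which is one more reason threading can cost performance without costing correctness.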

==========
Installing
==========

There are basically two strategies for installing a webapp to a production server: *the single-directory strategy*, in which one runs a server against a single directory exactly like the one you created to develop the webapp, and *the two-directory strategy*, in which the static and dynamic parts of a webapp are split into separate directory trees.

The Single-Directory Strategy
-----------------------------

This is arguably the simplest, since it doesn't involve breaking a webapp into two separate directory trees at install time. If you're serving a webapp for test purposes or limited-duty internal usage with the ``launch`` command, this is the only strategy supported.

The rub comes in production use. It's unwise to have a web server run with the permission to create files in the directory trees it serves. However, TinCan needs to do just that if it needs to compile a ``.py`` file into a ``.pyc`` file.

The solution is to run ``launch --compile`` each time you update the webapp, before serving it. This will cause all referenced Python code that needs it to be recompiled, generating fresh ``.pyc`` files. The production server will then see that the byte-compiled files are all newer than the source code they are based upon, and not attempt to compile anything.
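
The freshness test described above amounts to comparing modification times. The sketch below shows the general idea; the function ``needs_recompile`` is hypothetical and is not taken from TinCan's source, which may apply additional checks.

```python
import os

def needs_recompile(source_path, compiled_path):
    """Return True if compiled_path is missing or older than source_path.

    A hypothetical helper illustrating a timestamp-based staleness check.
    """
    try:
        compiled_mtime = os.path.getmtime(compiled_path)
    except OSError:
        # No compiled file yet, so compilation would be required.
        return True
    return compiled_mtime < os.path.getmtime(source_path)
```

As long as every compiled file is newer than its source, a check like this never asks the server to write anything into the directory tree it serves.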

The single-directory strategy is what I typically use myself, and what is used in the example Apache configuration in the :ref:`example-apache-config` section above.

The Two-Directory Strategy
--------------------------

This gets rid of the need to recompile things, at the expense of having to install the webapp into two separate directories. What you must do is:

#. Create an empty directory for serving static content, and point your web server at it. This directory should be *readable but not writeable* by the server.
#. Create a directory for serving the dynamic content. This will be the directory that ``tincan.launch`` gets passed, and it must be both readable and writeable by the server.
#. Place all the webapp's files in the second directory, ensuring all subdirectories created are both readable and writeable by the server.
#. Use the ``install-static`` command to copy or move the static content into the first directory.