Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - CherryPy: solutions for pages with a long processing time

I have a website powered by CherryPy. Some pages need quite a long processing time (a multi-join SQL query on a database with several million rows). The processing sometimes takes 20 seconds or more, and the browser gives up because the request takes too long.

I'm wondering what would be a nice solution here.


1 Reply


Everything here depends on the traffic volume of your website. CherryPy is a threaded server, and once every worker thread is waiting for the database, new requests won't be processed. There is also the request queue to consider, but in general that is how it works.
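To put a rough number on that, here is a back-of-the-envelope sketch (the function name is made up for illustration). It assumes every request is one of the slow ones and holds a worker thread for the whole duration.

```python
def max_slow_requests_per_second(thread_pool, seconds_per_request):
    """Upper bound on throughput when each request blocks one worker thread."""
    return thread_pool / seconds_per_request


# A pool of 10 threads serving 20-second queries:
print(max_slow_requests_per_second(10, 20))  # 0.5
```

So with these numbers, anything above one slow request every two seconds starts to pile up in the queue.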

Poor man's solution

If you know that your traffic is small, you can try to work around the problem. Increase response.timeout if needed (default is 300 seconds). Increase server.thread_pool (defaults to 10). If you use a reverse proxy, like nginx, in front of the CherryPy application, increase the proxy timeout there as well.
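As a sketch, the two CherryPy settings could look like this in a config dict. The values are arbitrary examples, not recommendations. As far as I know, response.timeout was dropped in later CherryPy releases (around version 12), so check that your version still honours it.

```python
# Hypothetical values for a low-traffic site; tune them to your workload.
config = {
    'global': {
        'server.socket_host': '127.0.0.1',
        'server.socket_port': 8080,
        # More worker threads, so a few slow pages do not starve the rest.
        'server.thread_pool': 30,
        # Seconds before a response is considered timed out.
        'response.timeout': 600,
    }
}

# Passed to CherryPy as usual, e.g. cherrypy.quickstart(App(), '/', config).
# Behind nginx, the matching knob is the proxy_read_timeout directive.
```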

The following solutions will require you to redesign your website. Specifically, you make it asynchronous: client code submits a task and then uses pull or push to get its result. This requires changes on both sides of the wire.

CherryPy BackgroundTask

You can make use of cherrypy.process.plugins.BackgroundTask and some intermediary storage (e.g. a new table in your database) on the server side. On the client side, use XMLHttpRequest for pull or WebSockets for push. CherryPy can handle both.
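Stripped of CherryPy, the schedule-then-poll idea looks roughly like the sketch below. It uses a plain threading.Thread instead of BackgroundTask and a dict instead of a database table; slow_query, schedule and poll are illustrative names, not CherryPy API.

```python
import threading
import time
import uuid

results = {}                     # stand-in for the intermediary storage
results_lock = threading.Lock()


def slow_query(arg):
    time.sleep(0.1)              # placeholder for the long SQL request
    return 42 + arg


def schedule(arg):
    """Start the work in the background and return a task id immediately."""
    task_id = str(uuid.uuid4())
    with results_lock:
        results[task_id] = None  # None means "not ready yet"

    def worker():
        value = slow_query(arg)
        with results_lock:
            results[task_id] = value

    threading.Thread(target=worker, daemon=True).start()
    return task_id


def poll(task_id):
    """Report 'wait' until the result is stored, then hand it over once."""
    with results_lock:
        if results[task_id] is None:
            return {'id': task_id, 'status': 'wait', 'result': None}
        return {'id': task_id, 'status': 'ready', 'result': results.pop(task_id)}
```

The client calls schedule once, keeps the id, and calls poll every few seconds until the status is 'ready'.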

Note that because CherryPy runs in a single Python process, the background task's thread runs within that process too. If you do some post-processing of the SQL result set, you will be affected by the GIL. So you may want to rewrite it to use processes instead, which is a little more complicated.
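A minimal sketch of the process-based variant, using the standard library's concurrent.futures rather than anything CherryPy-specific. The postprocess function is a made-up stand-in for whatever CPU-bound work you do on the result set.

```python
from concurrent.futures import ProcessPoolExecutor


def postprocess(rows):
    """CPU-bound stand-in for result-set post-processing."""
    return sum(value * value for value in rows)


if __name__ == '__main__':
    # A worker process has its own interpreter and its own GIL, so the
    # computation does not compete with the server's request threads.
    with ProcessPoolExecutor(max_workers=2) as pool:
        future = pool.submit(postprocess, range(1000))
        print(future.result())
```

The extra complication is that the arguments and the return value have to be picklable, and in a real application you would create the pool once at startup rather than per request.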

Industrial solution

If your website operates, or is expected to operate, at scale, you are better off with a distributed task queue like RQ or Celery. That changes the server side only. The client side is the same pull or push.

Example

Here follows a toy implementation of BackgroundTask with XHR polling.

#!/usr/bin/env python
# -*- coding: utf-8 -*-


import time
import uuid

import cherrypy
from cherrypy.process.plugins import BackgroundTask


config = {
  'global' : {
    'server.socket_host' : '127.0.0.1',
    'server.socket_port' : 8080,
    'server.thread_pool' : 8,
  }
}


class App:

  _taskResultMap = None


  def __init__(self):
    self._taskResultMap = {}

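  def _target(self, task, id, arg):
    time.sleep(10) # long one, right?
    try:
      self._taskResultMap[id] = 42 + arg
    finally:
      # BackgroundTask calls its function repeatedly at the given interval;
      # cancelling here turns it into a one-shot task.
      task.cancel()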

  @cherrypy.expose
  @cherrypy.tools.json_out()
  def schedule(self, arg):
    id = str(uuid.uuid1())
    self._taskResultMap[id] = None
    task = BackgroundTask(
      interval = 0, function = self._target, args = [id, int(arg)], 
      bus = cherrypy.engine)
    task.args.insert(0, task)
    task.start()
    return str(id)

  @cherrypy.expose
  @cherrypy.tools.json_out()
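  def poll(self, id):
    # Unknown (or already collected) ids get a 404 instead of a KeyError / 500.
    if id not in self._taskResultMap:
      raise cherrypy.HTTPError(404, 'Unknown task id')
    if self._taskResultMap[id] is None:
      return {'id': id, 'status': 'wait', 'result': None}
    else:
      return {
        'id'     : id, 
        'status' : 'ready', 
        'result' : self._taskResultMap.pop(id)
      }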

  @cherrypy.expose
  def index(self):
    return '''<!DOCTYPE html>
      <html>
      <head>
        <title>CherryPy BackgroundTask demo</title>
        <script type='text/javascript' 
          src='http://cdnjs.cloudflare.com/ajax/libs/qooxdoo/3.5.1/q.min.js'>
        </script>
        <script type='text/javascript'>
          // Do not structure your real JavaScript application this way. 
          // This callback spaghetti is only for brevity.

          function sendSchedule(arg, callback)
          {
            var xhr = q.io.xhr('/schedule?arg=' + arg);
            xhr.on('loadend', function(xhr) 
            {
              if(xhr.status == 200)
              {
                callback(JSON.parse(xhr.responseText))
              }
            });
            xhr.send();
          };

          function sendPoll(id, callback)
          {
            var xhr = q.io.xhr('/poll?id=' + id);
            xhr.on('loadend', function(xhr) 
            {
              if(xhr.status == 200)
              {
                callback(JSON.parse(xhr.responseText))
              }
            });
            xhr.send();
          }

          function start(event)
          {
            event.preventDefault();

            // example argument to pass to the task
            var arg = Math.round(Math.random() * 100);

            sendSchedule(arg, function(id)
            {
              console.log('scheduled (', arg, ') as', id);
              q.create('<li/>')
                .setAttribute('id', id)
                .append('<span>' + id + ': 42 + ' + arg + 
                  ' = <img src="http://sstatic.net/Img/progress-dots.gif" />' + 
                  '</span>')
                .appendTo('#result-list');

              var poll = function()
              {
                console.log('polling', id);
                sendPoll(id, function(response)
                {
                  console.log('polled', id, '(', response, ')');
                  if(response.status == 'wait')
                  {
                    setTimeout(poll, 2500);
                  }
                  else if(response.status == 'ready')
                  {
                    q('#' + id)
                      .empty()
                      .append('<span>' + id + ': 42 + ' + arg + ' = ' + 
                        response.result + '</span>');
                  }
                });
              };
              setTimeout(poll, 2500);
            });
          }

          q.ready(function()
          {
            q('#run').on('click', start);
          });
        </script>
      </head>
      <body>
        <p>
          <a href='#' id='run'>Run a long task</a>, look in browser console.
        </p>
        <ul id='result-list'></ul>
      </body>
      </html>
    '''


if __name__ == '__main__':
  cherrypy.quickstart(App(), '/', config)
