Questions:
- What is the best practice for keeping track of a thread's progress without locking the GUI ("Not Responding")?
- Generally, what are the best practices for threading as it applies to GUI development?
Question Background:
- I have a PyQt GUI for Windows.
- It is used to process sets of HTML documents.
- It takes anywhere from three seconds to three hours to process a set of documents.
- I want to be able to process multiple sets at the same time.
- I don't want the GUI to lock.
- I'm looking at the threading module to achieve this.
- I am relatively new to threading.
- The GUI has one progress bar.
- I want it to display the progress of the selected thread.
- Display results of the selected thread if it's finished.
- I'm using Python 2.5.
My Idea: Have the threads emit a Qt signal whenever their progress changes, which triggers a function that updates the progress bar. Also emit a signal when processing finishes so the results can be displayed.
#NOTE: this is example code for my idea, you do not have
#      to read this to answer the question(s).

import threading
from PyQt4 import QtCore, QtGui
import re
import copy

class ProcessingThread(threading.Thread, QtCore.QObject):

    __pyqtSignals__ = ( "progressUpdated(str)",
                        "resultsReady(str)")

    def __init__(self, docs):
        #initialise both base classes; QObject needs this before emit() can be used
        threading.Thread.__init__(self)
        QtCore.QObject.__init__(self)
        self.docs = docs
        self.progress = 0   #int between 0 and 100
        self.results = []

    def getResults(self):
        return copy.deepcopy(self.results)

    def run(self):
        num_docs = len(self.docs)
        for i, doc in enumerate(self.docs):
            processed_doc = self.processDoc(doc)
            self.results.append(processed_doc)
            #use i + 1 so a single document does not divide by zero
            new_progress = int((float(i + 1)/num_docs)*100)

            #emit signal only if progress has changed
            if self.progress != new_progress:
                #store the new value first so the slot reads the current progress
                self.progress = new_progress
                self.emit(QtCore.SIGNAL("progressUpdated(str)"), self.getName())
                if self.progress == 100:
                    self.emit(QtCore.SIGNAL("resultsReady(str)"), self.getName())

    def processDoc(self, doc):
        ''' this is trivial for shortness sake '''
        return re.findall('<a [^>]*>.*?</a>', doc)

class GuiApp(QtGui.QMainWindow):

    def __init__(self):
        QtGui.QMainWindow.__init__(self)
        self.processing_threads = {}  #{'thread_name': Thread(processing_thread)}
        self.progress_object = {}     #{'thread_name': int(thread_progress)}
        self.results_object = {}      #{'thread_name': []}
        self.selected_thread = ''     #'thread_name'

    def processDocs(self, docs):
        #create new thread
        p_thread = ProcessingThread(docs)
        thread_name = "example_thread_name"
        p_thread.setName(thread_name)

        #add thread to dict of threads
        self.processing_threads[thread_name] = p_thread

        #init progress_object for this thread
        self.progress_object[thread_name] = p_thread.progress

        #connect thread signals to GuiApp functions; pass the methods
        #themselves (not the result of calling them), then start the thread
        QtCore.QObject.connect(p_thread, QtCore.SIGNAL('progressUpdated(str)'), self.updateProgressObject)
        QtCore.QObject.connect(p_thread, QtCore.SIGNAL('resultsReady(str)'), self.updateResultsObject)
        p_thread.start()

    def updateProgressObject(self, thread_name):
        #update progress_object for all threads
        self.progress_object[thread_name] = self.processing_threads[thread_name].progress

        #update progress bar for selected thread
        if self.selected_thread == thread_name:
            self.setProgressBar(self.progress_object[self.selected_thread])

    def updateResultsObject(self, thread_name):
        #update results_object for thread with results
        self.results_object[thread_name] = self.processing_threads[thread_name].getResults()

        #update results widget for selected thread
        try:
            self.setResultsWidget(self.results_object[self.selected_thread])
        except KeyError:
            self.setResultsWidget(None)

Any commentary on this approach (e.g. drawbacks, pitfalls, praises, etc.) will be appreciated.
Resolution:
I ended up using the QThread class and associated signals and slots to communicate between threads. This is primarily because my program already uses Qt/PyQt4 for the GUI objects/widgets. This solution also required fewer changes to my existing code to implement.
Here is a link to an applicable Qt article that explains how Qt handles threads and signals: http://www.linuxjournal.com/article/9602. Excerpt below:
Fortunately, Qt permits signals and slots to be connected across threads—as long as the threads are running their own event loops. This is a much cleaner method of communication compared to sending and receiving events, because it avoids all the bookkeeping and intermediate QEvent-derived classes that become necessary in any nontrivial application. Communicating between threads now becomes a matter of connecting signals from one thread to the slots in another, and the mutexing and thread-safety issues of exchanging data between threads are handled by Qt.
Why is it necessary to run an event loop within each thread to which you want to connect signals? The reason has to do with the inter-thread communication mechanism used by Qt when connecting signals from one thread to the slot of another thread. When such a connection is made, it is referred to as a queued connection. When signals are emitted through a queued connection, the slot is invoked the next time the destination object's event loop is executed. If the slot had instead been invoked directly by a signal from another thread, that slot would execute in the same context as the calling thread. Normally, this is not what you want (and especially not what you want if you are using a database connection, as the database connection can be used only by the thread that created it). The queued connection properly dispatches the signal to the thread object and invokes its slot in its own context by piggy-backing on the event system. This is precisely what we want for inter-thread communication in which some of the threads are handling database connections. The Qt signal/slot mechanism is at root an implementation of the inter-thread event-passing scheme outlined above, but with a much cleaner and easier-to-use interface.
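To make the excerpt's point about queued connections concrete, here is a toy model in plain Python 3. It is not Qt code and not how Qt is implemented internally; it only mimics the behaviour described above, where an emit from another thread is queued and the slot runs later in the thread that owns the receiver. The names Receiver, post and process_events are hypothetical and exist only for this illustration.

```python
import queue
import threading


class Receiver(object):
    """Toy stand-in for an object living in a thread that runs an event loop."""

    def __init__(self):
        self.events = queue.Queue()  # pending slot invocations
        self.calls = []              # (value, name of the thread that ran the slot)

    def on_progress(self, value):
        # The "slot": record which thread actually executed it.
        self.calls.append((value, threading.current_thread().name))

    def post(self, slot, *args):
        # Queued "emit": safe to call from any thread; it runs nothing itself.
        self.events.put((slot, args))

    def process_events(self):
        # One pass of the "event loop", run by the thread that owns the receiver.
        while True:
            try:
                slot, args = self.events.get_nowait()
            except queue.Empty:
                break
            slot(*args)


receiver = Receiver()


def work():
    # Runs in the worker thread and only queues the calls.
    for value in (25, 50, 100):
        receiver.post(receiver.on_progress, value)


worker = threading.Thread(target=work, name="worker")
worker.start()
worker.join()

receiver.process_events()  # the slots execute here, in the owning thread
```

After process_events() returns, receiver.calls holds the three progress values in order, each tagged with the name of the thread that called process_events() rather than "worker". Calling the slot directly from work() would instead tag them with "worker", which is the situation the excerpt warns about.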
NOTE: eliben also has a good answer, and if I weren't using PyQt4, which handles thread-safety and mutexing, his solution would have been my choice.
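For anyone not using Qt, one toolkit-independent way to track several worker threads is to have each worker post (name, percent) updates to a thread-safe queue that the GUI thread drains periodically, for example from a timer. The sketch below is my own illustration of that general pattern in Python 3, not a reproduction of eliben's answer; process_set, drain and the set names are hypothetical, and the per-document work is a placeholder.

```python
import queue
import threading


def process_set(name, docs, updates):
    """Worker: handle each document and report percent complete."""
    total = len(docs)
    for i, doc in enumerate(docs):
        len(doc)  # placeholder for the real per-document processing
        updates.put((name, int(float(i + 1) / total * 100)))


def drain(updates, progress):
    """Run in the GUI thread: apply every pending update without blocking."""
    while True:
        try:
            name, percent = updates.get_nowait()
        except queue.Empty:
            return
        progress[name] = percent


updates = queue.Queue()
progress = {}  # {'set name': latest percent}, read only by the GUI thread

workers = [
    threading.Thread(target=process_set, args=("set_a", ["<a>x</a>"] * 4, updates)),
    threading.Thread(target=process_set, args=("set_b", ["<a>y</a>"] * 2, updates)),
]
for w in workers:
    w.start()
for w in workers:
    w.join()  # a real GUI would keep polling with drain() instead of joining

drain(updates, progress)
```

Because only the GUI thread touches the progress dict, it needs no lock; the queue is the single point where data crosses threads. The progress bar would then show progress[selected_name] after each drain.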