Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Merge SQLite files into one db file, and 'begin/commit' question

This post refers to this page for merging SQLite databases.

The sequence is as follows. Let's say I want to merge a.db and b.db. On the command line I do the following.

  • sqlite3 a.db
  • attach 'b.db' as toM;
  • begin; <--
  • insert into benchmark select * from toM.benchmark;
  • commit; <--
  • detach database toM;

It works well. On the referenced page, the questioner asks how to speed the merge up, and the answer is to use the 'begin' and 'commit' commands.

Then I came up with the following Python code to do exactly the same thing. I abstract the SQLite function calls with SQLiteDB, and one of its methods is runCommand(). I get the error shown further down even if I delete the self.connector.commit() call.

# run command
def runCommand(self, command):
    self.cursor.execute(command)
    self.connector.commit() # same error even though I delete this line

db = SQLiteDB('a.db')
cmd = 'attach "%s" as toMerge' % "b.db"
print(cmd)
db.runCommand(cmd)
cmd = "begin"
db.runCommand(cmd)
cmd = "insert into benchmark select * from toMerge.benchmark"
db.runCommand(cmd)
cmd = "commit"
db.runCommand(cmd)
cmd = "detach database toMerge"
db.runCommand(cmd)

But I got the following error.

OperationalError: cannot commit - no transaction is active

Despite the error, the resulting db is merged correctly. And without the begin/commit, there's no error at all.

  • Why can't I run the begin/commit command?
  • Is it absolutely necessary to run begin/commit to safely merge the db files? The post says that the purpose of begin/commit is for speedup. Then, what's the difference between using and not using the begin/commit command in terms of speedup?

1 Reply


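Apparently, Cursor.execute doesn't support the 'commit' command, at least in older Python versions: before Python 3.6 the sqlite3 module implicitly committed any open transaction before running a statement that was neither a query nor an INSERT/UPDATE/DELETE/REPLACE, so by the time the COMMIT itself reached SQLite there was no transaction left to commit. It does support the 'begin' command, but this is redundant because sqlite3 begins transactions for you anyway: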

>>> import sqlite3
>>> conn = sqlite3.connect(':memory:')
>>> cur = conn.cursor()
>>> cur.execute('begin')
<sqlite3.Cursor object at 0x0104B020>
>>> cur.execute('CREATE TABLE test (id INTEGER)')
<sqlite3.Cursor object at 0x0104B020>
>>> cur.execute('INSERT INTO test VALUES (1)')
<sqlite3.Cursor object at 0x0104B020>
>>> cur.execute('commit')

Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    cur.execute('commit')
OperationalError: cannot commit - no transaction is active
>>> 

Just use the commit() method of your Connection object instead.
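As a rough sketch of the merge done this way, the function below attaches the second file, copies the rows, and commits through the Connection. The function name merge_benchmark and its path arguments are made up for illustration; the benchmark table and the toMerge alias come from the question. It assumes both files already contain a benchmark table with the same columns.

```python
import sqlite3


def merge_benchmark(target_path, source_path):
    """Append every row of source's benchmark table to target's (hypothetical helper)."""
    conn = sqlite3.connect(target_path)
    try:
        # Attach the second file under the alias used in the question.
        conn.execute("ATTACH DATABASE ? AS toMerge", (source_path,))
        # sqlite3 opens a transaction implicitly before this INSERT.
        conn.execute("INSERT INTO benchmark SELECT * FROM toMerge.benchmark")
        # Commit through the Connection rather than execute('commit').
        conn.commit()
        # DETACH has to run outside a transaction, so it comes after the commit.
        conn.execute("DETACH DATABASE toMerge")
    finally:
        conn.close()
```

Calling merge_benchmark('a.db', 'b.db') would then correspond to the command-line sequence in the question.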

As for your second question, it is not absolutely necessary to call begin/commit when merging the files: just be sure that there is absolutely no disk error, no modification to the databases, and nobody looking at the computer the wrong way while it is happening. So begin/commit is probably a good idea. Of course, if the original databases aren't being modified (I honestly haven't looked), then even that isn't needed. If there is an error, you can just scrap the partial output and start over.

It also provides a speedup because not every change has to be written to disk as it occurs. Changes can be held in memory and written in bulk at commit time. But as mentioned, sqlite3 handles this for you.
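To illustrate the safety and batching points, here is a small sketch in which a whole batch of inserts is either committed together or rolled back together. The helper names are invented for this example, and the PRIMARY KEY on the test table is an assumption added only so that a duplicate id raises an error. It relies on the Connection object working as a context manager that commits on success and rolls back if the body raises.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory table; PRIMARY KEY is assumed so duplicates fail.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY)")
    return conn


def insert_all_or_nothing(conn, ids):
    """Insert all ids in one transaction; keep none of them if any insert fails."""
    try:
        # 'with conn' commits when the block succeeds and rolls back on an exception.
        with conn:
            conn.executemany("INSERT INTO test (id) VALUES (?)", [(i,) for i in ids])
    except sqlite3.IntegrityError:
        return False
    return True


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM test").fetchone()[0]
```

If a batch contains an id that is already present, the rows inserted earlier in that same batch disappear again, so the table never ends up half updated.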

Also, it's worth mentioning that

cmd = 'attach "%s" as toMerge' % "b.db"

is wrong in the sense that %-style formatting is the older style that str.format was meant to replace. If you want to do the wrong thing correctly, it's

cmd = 'attach "{0}" as toMerge'.format("b.db") #why not just one string though?

This is forward compatible with newer versions of Python, which will make porting code easier.

If you want to do the right thing, it's

cmd = "attach ? as toMerge"
cursor.execute(cmd, ('b.db', ))

This avoids SQL injection and is, apparently, slightly faster, so it's win-win.
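A minimal sketch of the parameterized form, assuming a helper named attach_for_merge that is not part of the original code. The file name is passed as a bound parameter, so quotes inside the path need no escaping; the alias toMerge stays in the SQL text. The helper returns the schema names reported by PRAGMA database_list so the caller can check that the attach took effect.

```python
import sqlite3


def attach_for_merge(conn, path):
    """Attach the file at path as toMerge and return the known schema names."""
    # The path travels as a bound parameter, never as part of the SQL string.
    conn.execute("ATTACH DATABASE ? AS toMerge", (path,))
    # Each row of database_list is (seq, name, file); keep the names.
    return [row[1] for row in conn.execute("PRAGMA database_list")]
```

With string formatting, a path containing a quote character would break the statement; with the bound parameter it is passed through unchanged.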

You could modify your runCommand method as follows:

def runCommand(self, sql, params=(), commit=True):
    self.cursor.execute(sql, params)
    if commit:
        self.connector.commit()

Now you can skip the commit after a command by passing commit=False when you don't need one. This preserves the notion of a transaction.
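Here is how that might look in use. The SQLiteDB class below is only a guess at the wrapper from the question, whose real definition is not shown, and load_rows is an invented helper; the attribute names connector and cursor are taken from the question's snippet.

```python
import sqlite3


class SQLiteDB:
    """Hypothetical stand-in for the question's wrapper class."""

    def __init__(self, path):
        self.connector = sqlite3.connect(path)
        self.cursor = self.connector.cursor()

    def runCommand(self, sql, params=(), commit=True):
        self.cursor.execute(sql, params)
        if commit:
            self.connector.commit()


def load_rows(db, ids):
    """Insert several ids, then commit once at the end."""
    for i in ids:
        # commit=False keeps all inserts inside the same open transaction.
        db.runCommand("INSERT INTO benchmark (id) VALUES (?)", (i,), commit=False)
    db.connector.commit()
```

Between the first insert and the final commit, db.connector.in_transaction is true, which is the grouping that committing after every command would otherwise destroy.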

