I've got a big ol' MySQL 5.1 database, and for a variety of stupid reasons, I've been storing, I believe, UTF8 characters encoded as LATIN1 in a UTF8 table. It's... strange. And I'd like to fix it.
The MySQL - Convert latin1 characters on a UTF8 table into UTF8 question has a solution that works -- one column at a time. But I have 24 tables and dozens of columns to convert, so I'm really looking for a solution that will handle at least a whole table at once.
For reference, the single-column solution that works for me is:
UPDATE foo SET col1 = CONVERT(CAST(CONVERT(col1 USING latin1) AS binary) USING utf8);
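For intuition about why the double conversion works, the same round trip can be sketched with plain Ruby string encodings -- no database involved (`ISO-8859-1` is Ruby's name for latin1; the sample string is made up):

```ruby
# A UTF-8 string whose bytes got re-encoded as if they were latin1
# characters -- the double-encoding this question is about.
original = "café"
stored = original.dup.force_encoding("ISO-8859-1").encode("UTF-8")
puts stored  # mojibake: "cafÃ©"

# The repair mirrors CONVERT(CAST(CONVERT(col USING latin1) AS binary) USING utf8):
# encode back to latin1 to recover the raw bytes, then relabel them as UTF-8.
fixed = stored.encode("ISO-8859-1").force_encoding("UTF-8")
puts fixed == original  # true
```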
For tables, I can do:
ALTER TABLE foo CONVERT TO CHARACTER SET latin1;
ALTER TABLE foo CONVERT TO CHARACTER SET binary;
ALTER TABLE foo CHARACTER SET utf8 COLLATE utf8_unicode_ci;
which gets me very close -- however, the CONVERT TO CHARACTER SET binary step turns all my VARCHAR columns into VARBINARY and my TEXT columns into BLOBs in one fell swoop. I can go through and change them back, and all appears to be well... but then I'm back in the "let's modify all the columns individually" world -- in which case, I may as well just run the single-column UPDATE on each one.
I've tried about 50 variations on those SQL statements, but I can't find one that both leaves my columns in character data types and encodes the data properly.
Any suggestions?
Update: Deciding to just fix the columns rather than waiting for a database- or table-wide solution, I came up with:
#!/usr/bin/env ruby
require 'rubygems'
require 'mysql2'

CONNECT_OPTS = {} # whatever you need to connect
Mysql2::Client.default_query_options.merge!(:as => :array)
conn = Mysql2::Client.new(CONNECT_OPTS)

tables = conn.query("SHOW TABLES").map {|row| row[0] }

# See http://dev.mysql.com/doc/refman/5.0/en/charset-column.html
# One might want to include enum and set columns; I don't have them
TYPES_TO_CONVERT = %w(char varchar text)

tables.each do |table|
  puts "converting #{table}"

  # Get all the columns and filter for the ones we want to convert
  columns = conn.query("DESCRIBE #{table}")
  columns_to_convert = columns.find_all {|row|
    # Strip the size suffix, e.g. "varchar(255)" -> "varchar"
    TYPES_TO_CONVERT.include? row[1].gsub(/\(\d+\)/, '')
  }.map {|row| row[0] }

  next if columns_to_convert.empty?

  query = "UPDATE `#{table}` SET "
  query += columns_to_convert.map {|col|
    "`#{col}` = convert(cast(convert(`#{col}` using latin1) as binary) using utf8)"
  }.join ", "
  puts query
  conn.query query
end
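To sanity-check the column filter and query construction without a live database, the same logic can be exercised on hard-coded DESCRIBE-style rows (the table and column names here are made up):

```ruby
TYPES_TO_CONVERT = %w(char varchar text)

# Fake DESCRIBE output: [column_name, column_type] pairs
rows = [
  ["id",    "int(11)"],
  ["name",  "varchar(255)"],
  ["bio",   "text"],
  ["score", "float"],
]

# Strip the size suffix before matching, just as the script does
cols = rows.select { |name, type|
  TYPES_TO_CONVERT.include?(type.gsub(/\(\d+\)/, ''))
}.map(&:first)
puts cols.inspect  # ["name", "bio"]

query = "UPDATE `foo` SET " + cols.map { |c|
  "`#{c}` = convert(cast(convert(`#{c}` using latin1) as binary) using utf8)"
}.join(", ")
puts query
```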
... which gets the job done. Amusingly, this runs on my database in 36 seconds, versus the ALTER TABLE route, which took 13 minutes (and had the VARBINARY problem), or the mysqldump solutions, which would take upwards of twenty minutes, assuming I could get them to run at all.
I'll still accept an answer if someone knows an elegant way to do this for a whole database or table in one step.