I have a MySQL dump and need to extract an image from it. I have written a small text parser for this file, but I have an encoding problem that I cannot solve. Here is a snippet of the MySQL dump:
I pasted it as a screenshot because, when the text is copied and pasted here, the image bytes (0xFF, 0xD8, 0xFF, 0xE0) turn into strange characters (???JFIF).
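The byte change described above can be reproduced in isolation. The following is a minimal sketch, not part of the parser: the class name `Utf8RoundTripDemo` and the method `RoundTrip` are hypothetical, and the sample bytes are just the four header bytes quoted above followed by the ASCII text "JFIF". It decodes the bytes as UTF-8 and encodes the string back, which is effectively what happens to the image data when the whole dump is read as a UTF-8 string.

```csharp
using System;
using System.Text;

public static class Utf8RoundTripDemo
{
    // Hypothetical helper: decodes the bytes as UTF-8, then encodes the
    // resulting string back to UTF-8 bytes.
    public static byte[] RoundTrip(byte[] raw)
    {
        string text = Encoding.UTF8.GetString(raw);
        return Encoding.UTF8.GetBytes(text);
    }

    // Formats a byte array as hex (e.g. "FF-D8-FF-E0") for comparison.
    public static string Hex(byte[] bytes)
    {
        return BitConverter.ToString(bytes);
    }
}
```

Calling `Utf8RoundTripDemo.RoundTrip(new byte[] { 0xFF, 0xD8, 0xFF, 0xE0, 0x4A, 0x46, 0x49, 0x46 })` and printing both arrays with `Hex` should show that the result no longer starts with FF-D8: bytes that are not valid UTF-8 are replaced by U+FFFD, which encodes as EF-BF-BD, so the original values cannot be recovered from the string.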
Here is the code snippet where I am trying to process the image:
    List<ImageRecord> ImagesList = new List<ImageRecord>();

    private void Parse(byte[] sqlFile)
    {
        // The whole dump, including the binary image data, is decoded as UTF-8 text here.
        var sql = Encoding.UTF8.GetString(sqlFile);
        string strStart = @"INSERT INTO `images` VALUES (";
        string strEnd = @"');";

        // Take the text between the start of the VALUES list and the closing "');".
        int Start = sql.IndexOf(strStart, 0) + strStart.Length;
        int End = sql.IndexOf(strEnd, Start);
        var value = sql.Substring(Start, End - Start);

        // Split the VALUES list into individual rows.
        var valueslist = value.Split("'),('");
        foreach (var imagedata in valueslist)
        {
            ImageRecord cfg = new ImageRecord(imagedata);
            this.ImagesList.Add(cfg);
        }
    }
    public class ImageRecord
    {
        public int id { get; set; }
        public DateTime timestamp { get; set; }
        public string user { get; set; } = String.Empty;
        public byte[] imagedata { get; set; }

        public ImageRecord() { }

        public ImageRecord(string sqlpart)
        {
            string value = sqlpart;
            // Drop the leading quote of the first field, if present.
            if (sqlpart[0] == '\'')
                value = value.Substring(1, value.Length - 1);
            var valueslist = value.Split("', '");
            this.id = Convert.ToInt32(valueslist[0]);
            this.timestamp = Convert.ToDateTime(valueslist[1]);
            this.user = valueslist[2];
            // Re-encoding the decoded text does not give back the original image bytes.
            this.imagedata = Encoding.UTF8.GetBytes(valueslist[3]);
        }
    }
I understand that the problem is that I read the file as UTF-8, so those bytes get converted to characters, but I don't know how to do it differently. I also tried another option: find the position in the decoded text where the image starts, then go back to the file's byte representation and take the bytes from that position. That doesn't work because the character position does not match the byte position: the extracted data doesn't start with 0xFF, 0xD8, 0xFF, 0xE0 but a little earlier (in the middle of the table description), and its length does not match what I need. So it seems I can only navigate through this file when I read it as UTF-8, but I need to get the document fragment exactly as it is.
In this example, the image is in *.jpg format, but can be in any other format.
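One way to keep character positions and byte positions from diverging is to never decode the file at all, and to search and slice the `byte[]` directly. The sketch below illustrates that idea under stated assumptions: the dump was written without `--hex-blob`, so the image sits inside a single-quoted literal using MySQL's backslash escapes (`\0`, `\n`, `\r`, `\Z`, `\\`, `\'`, `\"`), and each row has the `(v1, v2, ...)` shape implied by the parser above. The class `DumpBytes` and its methods `IndexOf`, `ReadQuoted`, and `ReadRow` are hypothetical names introduced for illustration only.

```csharp
using System;
using System.Collections.Generic;

public static class DumpBytes
{
    // Returns the index of the first occurrence of pattern at or after start, or -1.
    public static int IndexOf(byte[] data, byte[] pattern, int start)
    {
        for (int i = start; i <= data.Length - pattern.Length; i++)
        {
            int j = 0;
            while (j < pattern.Length && data[i + j] == pattern[j]) j++;
            if (j == pattern.Length) return i;
        }
        return -1;
    }

    // Reads a single-quoted literal whose opening quote is at data[pos].
    // Returns the unescaped bytes; next is set just past the closing quote.
    public static byte[] ReadQuoted(byte[] data, int pos, out int next)
    {
        if (data[pos] != (byte)'\'') throw new FormatException("Expected opening quote.");
        var result = new List<byte>();
        int i = pos + 1;
        while (i < data.Length)
        {
            byte b = data[i];
            if (b == (byte)'\\' && i + 1 < data.Length)
            {
                byte e = data[i + 1];
                switch (e)
                {
                    case (byte)'0': result.Add(0x00); break;
                    case (byte)'n': result.Add(0x0A); break;
                    case (byte)'r': result.Add(0x0D); break;
                    case (byte)'Z': result.Add(0x1A); break;
                    case (byte)'t': result.Add(0x09); break;
                    case (byte)'b': result.Add(0x08); break;
                    default: result.Add(e); break; // \\, \', \" and any other byte
                }
                i += 2;
            }
            else if (b == (byte)'\'')
            {
                // A doubled quote ('') stands for one quote inside the literal.
                if (i + 1 < data.Length && data[i + 1] == (byte)'\'') { result.Add(b); i += 2; }
                else { next = i + 1; return result.ToArray(); }
            }
            else { result.Add(b); i++; }
        }
        throw new FormatException("Unterminated string literal.");
    }

    // Parses one "(v1, v2, ...)" tuple starting at data[pos]; values are returned as raw bytes.
    public static List<byte[]> ReadRow(byte[] data, int pos, out int next)
    {
        if (data[pos] != (byte)'(') throw new FormatException("Expected '('.");
        var fields = new List<byte[]>();
        int i = pos + 1;
        while (true)
        {
            while (i < data.Length && data[i] == (byte)' ') i++; // optional spaces
            if (i >= data.Length) throw new FormatException("Unterminated row.");
            if (data[i] == (byte)'\'')
            {
                fields.Add(ReadQuoted(data, i, out i));
            }
            else
            {
                // Unquoted value (number or NULL): runs until ',', ')' or a space.
                int s = i;
                while (i < data.Length && data[i] != (byte)',' && data[i] != (byte)')' && data[i] != (byte)' ') i++;
                var field = new byte[i - s];
                Array.Copy(data, s, field, 0, i - s);
                fields.Add(field);
            }
            while (i < data.Length && data[i] == (byte)' ') i++;
            if (i >= data.Length) throw new FormatException("Unterminated row.");
            if (data[i] == (byte)')') { next = i + 1; return fields; }
            if (data[i] != (byte)',') throw new FormatException("Expected ',' or ')'.");
            i++;
        }
    }
}
```

A possible usage, again only as a sketch: encode the ASCII prefix "INSERT INTO `images` VALUES " with `Encoding.ASCII.GetBytes`, locate it with `IndexOf`, call `ReadRow` at the `(` that follows, and take the fourth field as the image bytes; further rows can be read by skipping the `,` at `next`. The small text fields (id, timestamp, user) can still be decoded individually with `Encoding.UTF8.GetString`, since only the image field has to stay as bytes. This does not depend on the image format, because it never interprets the image bytes.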
question from:
https://stackoverflow.com/questions/65950830/net-parse-blob-data-from-mysql-dump-file