Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.4k views
in Technique[技术] by (71.8m points)

visual studio 2010 - OLEDB Import of CSV to VB.NET datatable reading '-' as 0

Greetings helpful delightful people,

I have a problem with reading CSV files and converting to datatables in VB.Net. If the CSV file contains a column full of '-' then on import into the datatable they appear as '0' and the entire column is formatted in a numeric format.

The code I have written is:

Public Function LoadCsvFile(filePath As String) As DataTable
    Try

        DTReadCSV = New DataTable

        Dim connection As System.Data.OleDb.OleDbConnection = New System.Data.OleDb.OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & Microsoft.VisualBasic.Left(filePath, InStrRev(filePath, "")) & ";Extended Properties=""text;HDR=Yes;FMT=Delimited""")

        Dim adapter As New OleDb.OleDbDataAdapter("SELECT * FROM [" + Microsoft.VisualBasic.Mid(filePath, InStrRev(filePath, "") + 1) + "]", connection)
        'Dim table As New DataTable()
        adapter.Fill(DTReadCSV)

        'now thats its nicely in a datatable 
        IntAmountRows = DTReadCSV.Rows.Count
        IntAmountColumns = DTReadCSV.Columns.Count


        'System.Diagnostics.Debug.Print(DTReadCSV.Rows.Item(1)(1).ToString)
        Return DTReadCSV

        Exit Function
    Catch ex As Exception
        MsgBox(ex.ToString())

        MsgBox(Err.Number & " " & Chr(13) & Err.Description)
    End Try
End Function

Please can someone smarter figure out how to combat this issue besides modifying the CSV file by taking out the '-' as blanks, which at the moment seems the only long winded way of importing these CSV files.

Many thanks

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Using a Schema.INI you can describe to OleDB what the CSV looks like in detail including resulting column names and data types. For instance, given this data:

"Country","Capital City","Population", "Fake"
"France","Paris","2.25","-----"
"Canada","Toronto","2.5","-----"
"Italy","Rome","2.8","-----"

Create a Schema.ini file in the same folder; it can have multiple sections to define various CSVs in that folder. If there is a Schema.INI in the same folder as the CSV and it has an entry for your CSV, OleDB will automatically use it (nothing special to do).

The Schema.INI entry for the above data may look like this:

[Capitals.CSV]
ColNameHeader=True
CharacterSet=1252
Format=CSVDelimited
TextDelimiter="
Col1="Country" Text Width 254
Col2="Capital City" Text Width 254
Col3="Population" Single
Col4="Fake" Text Width 254

OleDb will use these definitions when reading that file, resulting in a string of dashes for the "Fake" column in the output:

enter image description here

One additional benefit (among many) of using a Schema.INI is that you can name the columns there rather than using or aliasing F1, F2, F3 etc. The SQL for that output was just "SELECT * FROM Capitals.CSV"

Tip (for others to come): to specify UTF8 as the character set, use CharacterSet=65001 in the schema.

For more information see
- Schema.ini Text File Driver
- Code Page Identifiers


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...