Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Read CSV in C# .NET without inputting field names

I need to read a CSV file for use as an array (or list) in C# on .NET Framework. The equivalent of, e.g.:

var animalList = new List<Animal>
{
    new Animal { Name = "German Shepherd", Height = 25, Weight = 77 },
    new Animal { Name = "Chihuahua", Height = 7, Weight = 4.4 },
};

But it should be stored in and read from a CSV file with the columns Name, Height, Weight. I can see ways to do this online, both with and without packages, e.g. with the Lumen CSV Reader package. However, I have two issues:

  1. Ideally I'd like to do this without installing anything (like the Lumen CSV Reader package).
  2. My CSV has up to around 1,000 fields, so the part of the tutorial where a class such as

public class SearchParameters
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
}

is written out is impractical, as I don't want to type out all the field names. I would like to just read them from the header of the CSV. Does anyone know how to do this?

The CSV files are pretty large (up to around 1,000 columns and 20,000 rows; most elements are Boolean: True or False). Reading them in doesn't have to be especially efficient, but the final array will need to be queried with LINQ (System.Linq) as fast as possible.

Question from: https://stackoverflow.com/questions/65888718/read-csv-in-c-sharp-net-without-inputting-field-names

1 Reply

Some packages let you work with a CSV file through a generic record type whose fields you read by index or header name (I cannot find the one I remember). But considering the size of the input, I am unsure how well regular CSV deserializers will perform.

Bear in mind that if there is no underlying class to represent a record, then at some point you will have to tell the code what type to use (each time you access a property). You could instead write a script (in Python, for example) that creates the *.cs file for the class from the first two lines of the CSV (the header and one sample row), and then compile that file into the project.
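A minimal sketch of such a generator script, under stated assumptions: the function names (`infer_cs_type`, `to_identifier`, `generate_class`) are made up for illustration, the type rules (True/False becomes `bool`, whole numbers `int`, other numbers `double`, everything else `string`) are a guess at what the data needs, and the C# type is inferred from a single sample row, so a column whose first value is 77 but later holds 4.4 would be typed too narrowly. Duplicate header names are not handled.

```python
import csv
import io
import re


def infer_cs_type(sample: str) -> str:
    """Guess a C# type from one sample value (assumes the row is representative)."""
    if sample.strip().lower() in ("true", "false"):
        return "bool"
    try:
        int(sample)
        return "int"
    except ValueError:
        pass
    try:
        float(sample)
        return "double"
    except ValueError:
        return "string"


def to_identifier(header: str) -> str:
    """Turn a CSV header into a PascalCase name usable as a C# property."""
    parts = re.findall(r"[A-Za-z0-9]+", header)
    name = "".join(p[0].upper() + p[1:] for p in parts)
    if not name or name[0].isdigit():
        name = "_" + name  # identifiers cannot be empty or start with a digit
    return name


def generate_class(lines, class_name: str) -> str:
    """Build C# source from the first two CSV lines: header and one sample row."""
    reader = csv.reader(lines)
    header = next(reader)
    sample = next(reader)
    out = [f"public class {class_name}", "{"]
    for column, value in zip(header, sample):
        cs_type = infer_cs_type(value)
        out.append(f"    public {cs_type} {to_identifier(column)} {{ get; set; }}")
    out.append("}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # In practice you would pass an open file; an in-memory CSV keeps this runnable.
    demo = io.StringIO("Name,Height,Weight,Is Friendly\nChihuahua,7,4.4,True\n")
    print(generate_class(demo, "Animal"))
```

The generated text would be saved as a *.cs file and added to the project, after which a CSV library can map rows onto the class by header name.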

Regarding not using any packages: you could write some simple code that splits each line on the separator. If it is guaranteed that no field contains the separator (a comma) or a line break, it could work, but you would still have to write code that dynamically matches CSV fields to properties and somehow finds a proper deserializer for each type. I would strongly suggest using a library for this, such as CsvHelper.

As a side note, if you are willing to consider alternatives, I would load the data into a key-value database (you can simulate one with an RDBMS, though it won't be super fast). It might be easier to work with in SQL.
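As a rough illustration of simulating a key-value store with an RDBMS, the sketch below loads a CSV into SQLite as one (row, column, value) entry per cell. The table name `cells`, the function `load_csv_as_key_value`, and the choice of SQLite are all assumptions made for the example, and every value is stored as text. With the sizes from the question (about 1,000 columns by 20,000 rows) this layout would hold roughly 20 million entries.

```python
import csv
import io
import sqlite3


def load_csv_as_key_value(lines, conn: sqlite3.Connection) -> int:
    """Store each CSV cell as (row_id, column_name, value); return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cells ("
        "row_id INTEGER, column_name TEXT, value TEXT, "
        "PRIMARY KEY (row_id, column_name))"
    )
    # Index so that 'which rows have value V in column C' does not scan the table.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_cells_col_val ON cells (column_name, value)"
    )
    reader = csv.reader(lines)
    header = next(reader)  # field names come from the CSV itself
    row_count = 0
    for row_id, row in enumerate(reader):
        conn.executemany(
            "INSERT INTO cells VALUES (?, ?, ?)",
            ((row_id, column, value) for column, value in zip(header, row)),
        )
        row_count += 1
    conn.commit()
    return row_count


if __name__ == "__main__":
    data = io.StringIO(
        "Name,Height,Friendly\nGerman Shepherd,25,True\nChihuahua,7,False\n"
    )
    conn = sqlite3.connect(":memory:")
    print(load_csv_as_key_value(data, conn))
    # Find the rows whose 'Friendly' column is True.
    print(
        conn.execute(
            "SELECT row_id FROM cells WHERE column_name = ? AND value = ? "
            "ORDER BY row_id",
            ("Friendly", "True"),
        ).fetchall()
    )
```

No field names are written out anywhere: they are read from the header and become data in the `column_name` column, at the cost of more awkward multi-column queries.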

TL;DR

  • option 1: generate a class with a script and then use a NuGet package to handle the deserialization (kind of the 'spray and pray' method) - LINQ will be available as normal
  • option 2: use a database, which is better suited to large datasets
