Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


C# - How to validate whether a file is a proper CSV in .NET Core?

I have an API that receives CSV files as IFormFile. I have to check whether the uploaded file is a proper CSV file, so I am doing the checks below.

  1. Checking the File Extension.
  2. Checking the File content type.

Issue: any app that uses the API can change the file extension along with the content type, so both checks can be spoofed. How can I validate that a file is a proper CSV? I haven't found a helpful article so far.

E.g. a PDF file can be renamed to .csv and sent with a CSV content type, but a PDF file is not a valid CSV.

NB: checking the magic number is one approach for .xlsx, .docx, .pdf, etc., but it is not applicable to CSV. I tried it and it failed. Is there any other way to check?

question from:https://stackoverflow.com/questions/65949566/how-to-validate-a-file-is-proper-csv-or-not-in-net-core


1 Reply


The closest you can get is a robust TryParse method.

But instead of reinventing the wheel, first try a few libraries; they might do the job:

https://github.com/TinyCsvParser/TinyCsvParser

https://github.com/nreco/csv

Note that CSV parsing can be a difficult task even though the format looks simple.

Even if you can't use a library, there are plenty of ideas you can grab from them.

If I were to detect CSV content, I'd do the following:

  • ensure that a line contains readable characters
    • optionally, detecting the file encoding might help
  • ensure that a line isn't incredibly long, otherwise it's likely to be binary (see the first point)
  • detect that the first line has repeating separators
  • try to parse the lines
In more concrete terms:

  • find the first index of CR/LF or LF
  • read up to that
  • find separators in it
  • try to parse the rest of the file, checking each line against the column count
    • if it fails then it's probably not CSV
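
A minimal sketch of the steps above, assuming the content has already been read into a string. The names `CsvSniffer` and `LooksLikeCsv` and the list of candidate separators are hypothetical. The sketch also assumes a file whose first line contains no separator should be rejected, which rules out single-column CSV files. Quoted fields are handled loosely (doubled quotes, separators and line breaks inside quotes); the libraries mentioned above are far more thorough.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class CsvSniffer
{
    // Assumed set of separators worth trying.
    private static readonly char[] CandidateSeparators = { ',', ';', '\t', '|' };

    public static bool LooksLikeCsv(string text)
    {
        // Find the first line break and read up to it.
        int end = text.IndexOf('\n');
        string firstLine = (end < 0 ? text : text.Substring(0, end)).TrimEnd('\r');
        if (firstLine.Length == 0) return false;

        // Pick the candidate that occurs most often in the first line.
        char separator = CandidateSeparators
            .OrderByDescending(s => firstLine.Count(ch => ch == s))
            .First();
        if (firstLine.IndexOf(separator) < 0) return false; // no separator at all

        // Parse the whole text and compare every record's column count.
        var counts = new List<int>();
        if (!TryCountFields(text, separator, counts)) return false;
        return counts.Count > 0 && counts.All(n => n == counts[0]);
    }

    // Adds the field count of each non-empty record to 'counts'.
    // Returns false if a quoted field is never closed.
    private static bool TryCountFields(string text, char separator, List<int> counts)
    {
        int fields = 1;
        bool inQuotes = false;
        bool hasContent = false;
        for (int i = 0; i < text.Length; i++)
        {
            char ch = text[i];
            if (inQuotes)
            {
                if (ch == '"')
                {
                    if (i + 1 < text.Length && text[i + 1] == '"') i++; // escaped quote
                    else inQuotes = false;
                }
                continue; // separators and line breaks inside quotes are data
            }
            if (ch == '"') { inQuotes = true; hasContent = true; }
            else if (ch == separator) { fields++; hasContent = true; }
            else if (ch == '\n')
            {
                if (hasContent) counts.Add(fields); // blank lines are skipped
                fields = 1;
                hasContent = false;
            }
            else if (ch != '\r') hasContent = true;
        }
        if (inQuotes) return false;
        if (hasContent) counts.Add(fields);
        return true;
    }
}
```

Like any heuristic, this can be fooled: a text file that happens to have the same number of commas on every line will pass, and a legitimate but ragged CSV will fail.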

Those are pretty much all the heuristics you can try, unless I'm mistaken.

