Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
249 views
in Technique[技术] by (71.8m points)

How to remove unnecessary white space and sort in a datagrid from csv file, using C#?

I going to explain what I need.

In the end of the program, it will be able to input csv file, calculate and output the result. For now I'm doing it step by step.

  1. Able to import csv to datagridview (done)
  2. Remove unnecessary white space, sort it by name (in progress)
  3. Calculation

To makes my question clear and easy to understand, here is the csv file sample.

enter image description here

As you can see that there are repeated 'lotID' is every section, and 2 type to lotID.

And here is what i have done so far. let's call this pic a.I successfully filter out lotID of the 1st type lotID.

enter image description here

This is pic B , as you can see the 'LotID' of second type(MSA) is appear again in each section

enter image description here

As you can see in PIC A, the lotID of each section is not repeated, and it white space appear in each section. This is the first thing i try want to fix.

Secondly, I want to filter out the 'LotID' header of second type lotid.

Here is the code.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;

namespace test2
{
    public partial class Form1 : Form
    {
        OpenFileDialog openFile = new OpenFileDialog();

        public Form1()
        {
            InitializeComponent();
        }


        private void Button1_Click(object sender, EventArgs e)
        {
            if (openFile.ShowDialog() == DialogResult.OK)
            {
                List<string[]> rows = File.ReadLines(openFile.FileName).Select(x => x.Split(',')).ToList();
                DataTable dt = new DataTable();
                List<string> headerNames = rows[0].ToList();
                foreach (var headers in rows[0])
                {
                    dt.Columns.Add(headers);
                }
                foreach (var x in rows.Skip(1))
                {
                    if (x.SequenceEqual(headerNames))   //linq to check if 2 lists are have the same elements (perfect for strings)
                      continue;     //skip the row with repeated headers
                    dt.Rows.Add(x);
                }

                dataGridView1.DataSource = dt;
            }
        }

        private void Form1_Load_1(object sender, EventArgs e)
        {
            openFile.Filter = "CSV|*.csv";
        }
    }
}
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

For sorting by header & removing the blank rows, try this piece of code: (this requires you to know "Lot ID" will be the first column)

private void Button1_Click(object sender, EventArgs e)
{
    if (openFile.ShowDialog() == DialogResult.OK)
    {
        List<string[]> rows = File.ReadLines(openFile.FileName).Select(x => x.Split(',')).ToList();
        DataTable dt = new DataTable();
        List<string> headerNames = rows[0].ToList();
        foreach (var headers in rows[0])
        {
            dt.Columns.Add(headers);
        }
        foreach (var x in rows.Skip(1).OrderBy(r => r.First()))  //sort based on first column of each row
        {
            if (x.SequenceEqual(headerNames))   //linq to check if 2 lists are have the same elements (perfect for strings)
                continue;     //skip the row with repeated headers
            if (x.All(val => string.IsNullOrWhiteSpace(val))) //if all columns of the row are whitespace / empty, skip this row
                continue;
            dt.Rows.Add(x);
        }

        dataGridView1.DataSource = dt;
    }
}

As a kind of hackish way to remove a duplicated header line, you could try this:

if (x[0] == "Lot ID")
    continue;

instead of

if (x.SequenceEqual(headerNames))
    continue;

It's not very elegant, but it will work.

I'll add some explanation to the linq methods used:

File.ReadLines(openFile.FileName).Select(x => x.Split(',')).ToList();

Reads all the lines in the file, the .Select goes through each line and splits based on commma (since it is csv). Split by default returns an array of splitted values, and finally ToList() means this line returns a List of array of strings. The array contains individual cell values while the list contains rows.

List<string> headerNames = rows[0].ToList();

This saves the first row, which contains all the header names into a separate List which we can use later.

foreach (var x in rows.Skip(1).OrderBy(r => r.First()))

Skip() method ignores the first element in the list (and takes all the others), and OrderBy() sorts alphabetically, r => r.First() just means for each row "r", sort based on the First column inside "r.First()". "x" represents each row.

if (x[0] == "Lot ID")

This is not LINQ anymore, it just checks if the first column of this row is "Lot ID" and if it is, "continue" skips to the next row in foreach.

Hope my explanations helped you learn! A link to some basic LINQ is in the comments.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...