Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c# - Removing duplicate values in a column in DevExpress Spreadsheet

The DevExpress Spreadsheet doesn't currently support a "Remove Duplicates" function, so I want to write C# code to do this manually. I have a column of values, some of which are duplicates; the duplicates may or may not be adjacent. I want to remove the entire row of each duplicate value. I tried this code:

IWorkbook workbook = spreadsheetControl.Document;
Worksheet worksheet = workbook.Worksheets["Sheet1"];
CellRange range = worksheet.GetUsedRange();
int LastRow = range.BottomRowIndex;
//MessageBox.Show(Convert.ToString(LastRow));
for (int i = 0; i < LastRow; i++)
{
    for (int j = i+1; j < LastRow; j++)
    {
        if (worksheet.Cells[i,0].Value == worksheet.Cells[j,0].Value)
        {
            worksheet.Rows[j].Delete();
        }
    }
}

It doesn't work properly.

Question from: https://stackoverflow.com/questions/65873905/removing-duplicate-values-in-a-column-in-devepress-spreadsheet


1 Reply


Regarding "It doesn't remove non-adjacent duplicate values":

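You have a bug in your loops. The indexes don't account for the fact that the collection changes when Delete is called: after row j is deleted, the row below it moves up to index j, but the loop then advances to j + 1 and never checks it. LastRow is also never updated after a deletion, and because BottomRowIndex is a zero-based index, the condition < LastRow leaves the last used row out.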
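
To see the skipped item concretely, here is a minimal sketch that swaps the worksheet for a plain List<string>. That swap is an assumption made so the example runs without DevExpress, and ShiftDemo, RemoveNaive and RemoveFixed are illustrative names, not part of any library. RemoveAt shifts the later items up by one, much as deleting a row does, so the naive loop steps over whatever slid into position j.

```csharp
using System;
using System.Collections.Generic;

public static class ShiftDemo
{
    // Same loop shape as in the question: j always advances, even right after a removal.
    public static List<string> RemoveNaive(IEnumerable<string> source)
    {
        var items = new List<string>(source);
        for (int i = 0; i < items.Count; i++)
            for (int j = i + 1; j < items.Count; j++)
                if (items[i] == items[j])
                    items.RemoveAt(j); // the next item now sits at j, and j++ skips it
        return items;
    }

    // Same idea, but j only advances when nothing was removed at j.
    public static List<string> RemoveFixed(IEnumerable<string> source)
    {
        var items = new List<string>(source);
        for (int i = 0; i < items.Count; i++)
        {
            int j = i + 1;
            while (j < items.Count)
            {
                if (items[i] == items[j])
                    items.RemoveAt(j);
                else
                    j++;
            }
        }
        return items;
    }

    public static void Main()
    {
        var column = new[] { "a", "b", "b", "b", "c" };
        Console.WriteLine(string.Join(",", RemoveNaive(column))); // a,b,b,c
        Console.WriteLine(string.Join(",", RemoveFixed(column))); // a,b,c
    }
}
```

With three equal values in a row, the naive version leaves one duplicate behind, because the third "b" moves into the slot that was just checked.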

You're also doing more work than necessary by going through all rows multiple times.

The code below builds a list of the DevExpress Row objects that need to be deleted, and it goes through the rows only once.

//Keep track of every value seen as we go through the rows
var valuesSeen = new HashSet<string>();
//Rows marked for deletion
var duplicateRows = new List<Row>();
//LastRow comes from the question's code (range.BottomRowIndex, a zero-based index),
//so use <= to include the last used row
for (int i = 0; i <= LastRow; i++)
{
    string cellValue = worksheet.Cells[i, 0].DisplayText;
    //Add returns false if the value already exists
    if (!valuesSeen.Add(cellValue))
    {
        //Mark this row for deletion since we've seen the value before
        duplicateRows.Add(worksheet.Rows[i]);
    }
}
//Delete all marked rows only after we're done identifying them.
//Going bottom-up is a defensive choice: a deletion then never shifts a row still waiting to be deleted.
for (int k = duplicateRows.Count - 1; k >= 0; k--)
    duplicateRows[k].Delete();

You may need to qualify Row with its namespace (DevExpress.Spreadsheet.Row), since Row is a fairly common class name.

In the above code I'm comparing the display text, not the actual value. If you don't want this, use HashSet<CellValue> instead of HashSet<string> and read the cell's Value instead of its DisplayText.
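
The difference between the two comparisons can be sketched without the spreadsheet control. The example below is a hedged illustration, not DevExpress code: Cell is a hypothetical class standing in for a worksheet cell (a stored value plus the text it would display), and FindDuplicateIndexes is an illustrative helper that runs the same HashSet pass with a caller-chosen key.

```csharp
using System;
using System.Collections.Generic;

public static class KeyChoiceDemo
{
    // Hypothetical stand-in for a cell: the stored value and the text shown to the user.
    public sealed class Cell
    {
        public object Value { get; }
        public string Text { get; }
        public Cell(object value, string text) { Value = value; Text = text; }
    }

    // One pass: returns the indexes whose key was already seen at an earlier index.
    public static List<int> FindDuplicateIndexes<TKey>(IReadOnlyList<Cell> cells, Func<Cell, TKey> key)
    {
        var seen = new HashSet<TKey>();
        var duplicates = new List<int>();
        for (int i = 0; i < cells.Count; i++)
            if (!seen.Add(key(cells[i])))
                duplicates.Add(i);
        return duplicates;
    }

    public static void Main()
    {
        var column = new List<Cell>
        {
            new Cell(1.0, "1"),    // number displayed as "1"
            new Cell("1", "1"),    // text "1"
            new Cell(1.0, "1.00")  // same number, different number format
        };
        Console.WriteLine(string.Join(",", FindDuplicateIndexes(column, c => c.Text)));  // 1
        Console.WriteLine(string.Join(",", FindDuplicateIndexes(column, c => c.Value))); // 2
    }
}
```

Keyed on display text, the text "1" counts as a duplicate of the number shown as "1"; keyed on the stored value, the differently formatted 1.0 is the duplicate instead. Which behavior you want depends on whether formatting should matter in your sheet.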
