Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
362 views
in Technique[技术] by (71.8m points)

c# - How to Quickly Remove Items From a List

I am looking for a way to quickly remove items from a C# List<T>. The documentation states that the List.Remove() and List.RemoveAt() operations are both O(n)

This is severely affecting my application.

I wrote a few different remove methods and tested them all on a List<String> with 500,000 items. The test cases are shown below...


Overview

I wrote a method that would generate a list of strings that simply contains string representations of each number ("1", "2", "3", ...). I then attempted to remove every 5th item in the list. Here is the method used to generate the list:

private List<String> GetList(int size)
{
    List<String> myList = new List<String>();
    for (int i = 0; i < size; i++)
        myList.Add(i.ToString());
    return myList;
}

Test 1: RemoveAt()

Here is the test I used to test the RemoveAt() method.

private void RemoveTest1(ref List<String> list)
{
     for (int i = 0; i < list.Count; i++)
         if (i % 5 == 0)
             list.RemoveAt(i);
}

Test 2: Remove()

Here is the test I used to test the Remove() method.

private void RemoveTest2(ref List<String> list)
{
     List<int> itemsToRemove = new List<int>();
     for (int i = 0; i < list.Count; i++)
        if (i % 5 == 0)
             list.Remove(list[i]);
}

Test 3: Set to null, sort, then RemoveRange

In this test, I looped through the list one time and set the to-be-removed items to null. Then, I sorted the list (so null would be at the top), and removed all the items at the top that were set to null. NOTE: This reordered my list, so I may have to go put it back in the correct order.

private void RemoveTest3(ref List<String> list)
{
    int numToRemove = 0;
    for (int i = 0; i < list.Count; i++)
    {
        if (i % 5 == 0)
        {
            list[i] = null;
            numToRemove++;
        }
    }
    list.Sort();
    list.RemoveRange(0, numToRemove);
    // Now they're out of order...
}

Test 4: Create a new list, and add all of the "good" values to the new list

In this test, I created a new list, and added all of my keep-items to the new list. Then, I put all of these items into the original list.

private void RemoveTest4(ref List<String> list)
{
   List<String> newList = new List<String>();
   for (int i = 0; i < list.Count; i++)
   {
      if (i % 5 == 0)
         continue;
      else
         newList.Add(list[i]);
   }

   list.RemoveRange(0, list.Count);
   list.AddRange(newList);
}

Test 5: Set to null and then FindAll()

In this test, I set all the to-be-deleted items to null, then used the FindAll() feature to find all the items that are not null

private void RemoveTest5(ref List<String> list)
{
    for (int i = 0; i < list.Count; i++)
       if (i % 5 == 0)
           list[i] = null;
    list = list.FindAll(x => x != null);
}

Test 6: Set to null and then RemoveAll()

In this test, I set all the to-be-deleted items to null, then used the RemoveAll() feature to remove all the items that are not null

private void RemoveTest6(ref List<String> list)
{
    for (int i = 0; i < list.Count; i++)
        if (i % 5 == 0)
            list[i] = null;
    list.RemoveAll(x => x == null);
}

Client Application and Outputs

int numItems = 500000;
Stopwatch watch = new Stopwatch();

// List 1...
watch.Start();
List<String> list1 = GetList(numItems);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());

watch.Reset(); watch.Start();
RemoveTest1(ref list1);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
Console.WriteLine();

// List 2...
watch.Start();
List<String> list2 = GetList(numItems);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());

watch.Reset(); watch.Start();
RemoveTest2(ref list2);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
Console.WriteLine();

// List 3...
watch.Reset(); watch.Start();
List<String> list3 = GetList(numItems);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());

watch.Reset(); watch.Start();
RemoveTest3(ref list3);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
Console.WriteLine();

// List 4...
watch.Reset(); watch.Start();
List<String> list4 = GetList(numItems);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());

watch.Reset(); watch.Start();
RemoveTest4(ref list4);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
Console.WriteLine();

// List 5...
watch.Reset(); watch.Start();
List<String> list5 = GetList(numItems);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());

watch.Reset(); watch.Start();
RemoveTest5(ref list5);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
Console.WriteLine();

// List 6...
watch.Reset(); watch.Start();
List<String> list6 = GetList(numItems);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());

watch.Reset(); watch.Start();
RemoveTest6(ref list6);
watch.Stop(); Console.WriteLine(watch.Elapsed.ToString());
Console.WriteLine();

Results

00:00:00.1433089   // Create list
00:00:32.8031420   // RemoveAt()

00:00:32.9612512   // Forgot to reset stopwatch :(
00:04:40.3633045   // Remove()

00:00:00.2405003   // Create list
00:00:01.1054731   // Null, Sort(), RemoveRange()

00:00:00.1796988   // Create list
00:00:00.0166984   // Add good values to new list

00:00:00.2115022   // Create list
00:00:00.0194616   // FindAll()

00:00:00.3064646   // Create list
00:00:00.0167236   // RemoveAll()

Notes And Comments

  • The first two tests do not actually remove every 5th item from the list, because the list is being reordered after each remove. In fact, out of 500,000 items, only 83,334 were removed (should have been 100,000). I am okay with this - clearly the Remove()/RemoveAt() methods are not a good idea anyway.

  • Although I tried to remove the 5th item from the list, in reality there will not be such a pattern. Entries to be removed will be random.

  • Although I used a List<String> in this example, that will not always be the case. It could be a List<Anything>

  • Not putting the items in the list to begin with is not an option.

  • The other methods (3 - 6) all performed much better, comparatively, however I am a little concerned -- In 3, 5, and 6 I was forced to set a value to null, and then remove all the items according to this sentinel. I don't like that approach because I can envision a scenario where one of the items in the list might be null and it would get removed unintentionally.

My question is: What is the best way to quickly remove many items from a List<T>? Most of the approaches I've tried look really ugly, and potentially dangerous, to me. Is a List the wrong data structure?

Right now, I am leaning towards creating a new list and adding the good items to the new list, but it seems like there should be a better way.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

List isn't an efficient data structure when it comes to removal. You would do better to use a double linked list (LinkedList) as removal simply requires reference updates in the adjacent entries.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...