.NET List Speed

//
// Written by Grant Jenks
// http://www.grantjenks.com/
//
// DISCLAIMER
// THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE
// LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
// OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND,
// EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE
// ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.
// SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY
// SERVICING, REPAIR OR CORRECTION.
//
// Copyright
// This work is licensed under the
// Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.
// To view a copy of this license, visit
// http://creativecommons.org/licenses/by-nc-nd/3.0/
// or send a letter to Creative Commons, 171 Second Street, Suite 300,
// San Francisco, California, 94105, US.A
//
// Description
// Recently I wondered about the performance of List<T> vs. LinkedList<T>.
// If all I need is insertion and traversal (no O(1) random access as List<T>
// provides, nor O(1) item removal as LinkedList<T> provides), then what's
// the performance comparison of using a LinkedList<T> vs. a List<T>.
//
// Data
// Object          Count           Append (ms)     Accumulate (ms)
// List            1               0               0
// List            10              0               0
// List            100             0               0
// List            1000            0               0
// List            10000           0               0
// List            100000          3               1
// List            1000000         17              4
// List            10000000        94              46
// List            100000000       919             468
// LinkedList      1               0               1
// LinkedList      10              0               0
// LinkedList      100             0               0
// LinkedList      1000            0               0
// LinkedList      10000           0               0
// LinkedList      100000          58              1
// LinkedList      1000000         138             12
// LinkedList      10000000        1815            100
// LinkedList      100000000       1848201         392487
//
// Discussion of Data
// Based on the above, a few points are clear:
// 1. If you've fewer than 10,000 elements, don't worry about the choice.
// 2. For larger lists, the List<T> data type is faster by a factor of 2 to 20.
// 3. For really large lists, the List<T> data type is faster by 3-4 magnitudes.
// Note that the times for LinkedList above are not inaccurate at 100,000,000
// elements. With that many elements, Append took 30.8 minutes and Accumulate
// took 6.5 minutes. The parallel List<T> operation took no more than 1 second.
//
// Analysis
// Based solely on the performance characteristics described on MSDN, that is:
// 1. Random removal/insertion from a List<T> object is an O(n) operation
// 2. Random removal/insertion from a LinkedList<T> object is an O(1) operation
// 3. Random access into a List<T> object is an O(1) operation
// 4. Random access into a LinkedList<T> object is an O(n) operation
// I'm inferring that List<T> objects would be better termed ArrayList<T>
// because they are array-backed. In contrast, the LinkedList<T> object uses
// a LinkedListNode<T> to chain together items in a list.
//
// To be honest, I thought the LinkedList<T> datatype would be much faster. I
// wasn't sure whether it would be faster than the List<T> datatype (hence this
// program). It seems clear that object creation in .NET is expensive. On every
// insert, O(n), to LinkedList<T>, a new LinkedListNode<T> is created to
// contain the data. Instead, the List<T> object does an allocation only
// O(log(n)) times (assuming they double the size of the backing array when the
// capacity is exhausted). I believe this accounts for the majority of the
// extra time spent appending to the LinkedList<T> datatype. This object
// creation causes LinkedList<T> to be 2-20 times slower.
//
// With 100,000,000 items, a scary thing happens. From 100,000 to 10,000,000
// items, the LinkedList<T> datatype scales linearly. This is to be expected.
// So what happens to violate this linear scaling at 100,000,000? One word:
// paging. It's not trivial to determine the size of a LinkedListNode<T> but
// we could guess at 8 methods, 4 fields, padding and GC tracking that the
// size may be on the order of 12 * 8 = ~100 bytes. At 100,000,000 items,
// that'll require 10,000,000,000 (ten gigabytes) of memory. My laptop has only
// four gigabytes of memory and so needs to page out about six gigabytes of
// data. In contrast, the List<T> requires only ~4 bytes/item so at
// 100,000,000 items, it consumes 400 megabytes and will fit entirely in
// memory. The takeaway, and to be expected: the disk is about 1,000 times
// slower than memory.
//

using System;
using System.Diagnostics;
using System.Collections.Generic;

class CSharpListSpeed
{

//
// Time and report speed of 'List'
//

static void TimeList(int iterations)
{
   int accum = 0;
   Stopwatch stopwatchAppend = new Stopwatch();
   Stopwatch stopwatchAccum = new Stopwatch();
   List<int> list = new List<int>();

   stopwatchAppend.Start();

   for (int i = 0; i < iterations; i++)
   {
      list.Add(i);
   }

   stopwatchAppend.Stop();

   stopwatchAccum.Start();

   foreach (int value in list)
   {
      accum += value;
   }

   stopwatchAccum.Stop();

   Console.WriteLine("List\t{0}\t{1}\t{2}", iterations,
      stopwatchAppend.ElapsedMilliseconds, stopwatchAccum.ElapsedMilliseconds);
}

//
// Time and report speed of 'LinkedList'
//

static void TimeLinkedList(int iterations)
{
   int accum = 0;
   Stopwatch stopwatchAppend = new Stopwatch();
   Stopwatch stopwatchAccum = new Stopwatch();
   LinkedList<int> list = new LinkedList<int>();

   stopwatchAppend.Start();

   for (int i = 0; i < iterations; i++)
   {
      list.AddLast(i);
   }

   stopwatchAppend.Stop();

   stopwatchAccum.Start();

   foreach (int value in list)
   {
      accum += value;
   }

   stopwatchAccum.Stop();

   Console.WriteLine("LinkedList\t{0}\t{1}\t{2}", iterations,
      stopwatchAppend.ElapsedMilliseconds, stopwatchAccum.ElapsedMilliseconds);
}

//
// Time and report speed of 'List' and 'LinkedList'.
//

static int Main(String[] args)
{
   // Pause at the beginning so that tools can be attached.

   Console.WriteLine("Press enter when ready...");
   Console.ReadLine();

   Console.WriteLine("Object\tCount\tAppend (ms)\tAccumulate (ms)");

   for (int i = 1; i <= 100000000; i *= 10)
   {
      TimeList(i);
   }

   for (int i = 1; i <= 100000000; i *= 10)
   {
      TimeLinkedList(i);
   }

   return 0;
}
}
random/net_list_speed.txt · Last modified: 2011/07/07 05:13 by grant