IEnumerable, IQueryable, ICollection and More
When working with enumerables in C# there are many different data structures that can be used to represent the set of data. Often, these are used interchangeably, creating unnecessary overhead at best, or non-compiling code at worst. It's critical to understand the functionality of each in order to write performant code.
Most of the constructs I'll be detailing below are defined as both generic and non-generic interfaces. The vast majority of the time, you'll want to use their generic counterparts, giving you access to their LINQ extension methods (Select, Count, Max, OrderBy, etc.).
IEnumerable
IEnumerable is the base of all other enumerables in C# (note the extensive "Derived" section in the Microsoft documentation). Any class that wants to implement IEnumerable need only define a single method, GetEnumerator(), which powers foreach. Perhaps the most commonly used IEnumerable is String, although you rarely think of it as such.
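To make that concrete, here is a minimal sketch of a class that implements the generic IEnumerable. Countdown and CountdownDemo are made-up names for illustration, not types from any library. Note that the generic interface inherits the non-generic one, so in practice you end up writing a second, one-line GetEnumerator() as well.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type for illustration: yields start, start - 1, ..., 1.
public class Countdown : IEnumerable<int>
{
    private readonly int _start;

    public Countdown(int start) => _start = start;

    // foreach calls this method behind the scenes to get the values one at a time.
    public IEnumerator<int> GetEnumerator()
    {
        for (var i = _start; i > 0; i--)
        {
            yield return i;
        }
    }

    // IEnumerable<T> inherits the non-generic IEnumerable, which needs its own implementation.
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class CountdownDemo
{
    public static void Main()
    {
        foreach (var n in new Countdown(3))
        {
            Console.WriteLine(n); // prints 3, then 2, then 1
        }

        // A string is an IEnumerable<char>, so foreach works on it too.
        foreach (var c in "abc")
        {
            Console.WriteLine(c);
        }
    }
}
```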
I very infrequently find a need to use IEnumerable directly as a declared type. It can be useful if you want to specify the broadest possible enumerable that a property or variable could represent. However, you will lose a lot of functionality by doing this, and you may end up referring to objects that do not yet actually exist in memory.
Speaking of which...
IQueryable
IQueryable extends IEnumerable. It represents a plan to create some other sort of enumerable, but by itself it does not contain any of the objects in that enumerable. It remains merely a plan until you either enumerate it with a foreach loop or call ToArray() or a similar method to actually execute the query plan and retrieve those objects from the related provider.
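You can watch the "plan, then execute" behaviour without a database. The sketch below is an illustration only: it uses AsQueryable() over an in-memory List rather than a real query provider, and DeferredQueryDemo is a made-up name. An item added after the query is defined still shows up in the result, because nothing runs until ToArray() is called.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredQueryDemo
{
    public static int[] Run()
    {
        var source = new List<int> { 1, 2, 3 };

        // Defining the query does not execute it; no filtering has happened yet.
        IQueryable<int> plan = source.AsQueryable().Where(n => n > 1);

        // Change the underlying data after the plan has been defined.
        source.Add(4);

        // ToArray() executes the plan now, so it sees the item added above.
        return plan.ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints "2, 3, 4"
    }
}
```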
One of the most common uses of IQueryable is Entity Framework Core's DbSet. It is typically used to represent an entire database table, and for the most part you're going to want to load only a small subset of that table into memory for a given operation. IQueryable allows you to plan out the database query that you want to run before you actually execute it, saving you from having to load an entire table into application memory in order to work with it.
An IQueryable in EF Core...
dbContext.Entities.Where(e => e.foo == "bar");
... roughly translates into the SQL query:
SELECT * FROM Entities WHERE Entities.foo = 'bar'
The plan in an IQueryable does not have to be executed entirely in the database. In EF Core 3.0 and later, a final Select projection that cannot be translated to SQL is evaluated in memory; untranslatable operations elsewhere in the query throw an exception rather than silently running in memory.
dbContext.Entities.Where(e => e.foo == "bar") // Will be translated to SQL and passed to the database
    .Select(e => new EntityObject(e.foo)); // Will be evaluated in memory after the rows are returned from the database
ICollection
ICollection extends IEnumerable as well, and is what you probably think of when you envision an enumerable. It represents an actual, in-memory collection of objects or structures. When you enumerate an IQueryable through ToArray(), ToList(), or similar, you are pulling data from the queryable source and transforming it into an ICollection object (Array and List in those examples, respectively).
The majority of concrete datasets you work with in C# are ICollections. There are many implementations, but below are a few of the most common ones, along with a few less common ones that I tend to enjoy.
IList
IList is a collection of which any individual entry can be accessed by specifying its index in the list. By extension, this also implies that the collection is in a pre-set order. Array and List are commonly used examples of this interface, with the former representing a fixed-length data structure, and the latter allowing for dynamic length adjustment as you add or remove items from it.
var third = list[2]; // Returns the 3rd object in list (indexes are zero-based)
ISet
ISet represents a collection of unique objects. When an object is added, the set checks whether an equal object is already present (using the default comparison or, for implementations such as HashSet, an IEqualityComparer you provide); if one is, the new object is not added.
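A small sketch of that uniqueness rule, assuming HashSet as the implementation and a case-insensitive comparer chosen purely for illustration (SetDemo is a made-up name). Add returns a bool telling you whether the item was actually added.

```csharp
using System;
using System.Collections.Generic;

public static class SetDemo
{
    public static (bool First, bool Second, int Count) AddTwice()
    {
        // The comparer decides what "already exists" means; here case is ignored.
        ISet<string> tags = new HashSet<string>(StringComparer.OrdinalIgnoreCase);

        var first = tags.Add("CSharp");  // added, so Add returns true
        var second = tags.Add("csharp"); // equal under the comparer, so Add returns false

        return (first, second, tags.Count);
    }

    public static void Main()
    {
        var result = AddTwice();
        Console.WriteLine($"{result.First}, {result.Second}, {result.Count}"); // prints "True, False, 1"
    }
}
```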
Stack and Queue
Stacks and Queues are collections optimized for adding and removing a specific single object at a time, although they do so in completely different ways. Queues are First-in-First-Out, meaning that added objects are appended to the end of a queue, and removed objects are pulled from the beginning of the queue (think of a line of shoppers at a supermarket). Stacks are the opposite, First-in-Last-Out. Objects are both added and removed from the "top" of the stack.
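The difference is easiest to see side by side. This is just a sketch using the built-in Queue and Stack classes; StackQueueDemo is a made-up name.

```csharp
using System;
using System.Collections.Generic;

public static class StackQueueDemo
{
    public static (string FromQueue, string FromStack) Run()
    {
        var queue = new Queue<string>();
        queue.Enqueue("a");
        queue.Enqueue("b");
        queue.Enqueue("c");

        var stack = new Stack<string>();
        stack.Push("a");
        stack.Push("b");
        stack.Push("c");

        // Dequeue removes the oldest item; Pop removes the newest one.
        return (queue.Dequeue(), stack.Pop());
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine($"{result.FromQueue}, {result.FromStack}"); // prints "a, c"
    }
}
```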
IDictionary
IDictionary represents a collection of key/value pairs. That is to say, the value portion of any given entry is accessed by providing the associated key object. This creates the restriction that all keys must be unique, although you can specify an IEqualityComparer if you want to compare your keys in a non-default way. Dictionary is the most commonly used implementation of this interface. It is implemented as a hash table, making lookup by key very fast indeed (in big-O notation, close to O(1)).
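Here is a short sketch of key lookup and the uniqueness restriction. The stock data and the case-insensitive comparer are assumptions made for the example, and DictionaryDemo is a made-up name.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryDemo
{
    public static (bool Found, int Count, bool DuplicateRejected) Run()
    {
        // The comparer makes "apple" and "APPLE" the same key.
        IDictionary<string, int> stock =
            new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        stock["apple"] = 3;

        // TryGetValue avoids an exception when the key might be missing.
        var found = stock.TryGetValue("APPLE", out var count);

        // Add throws ArgumentException if the key is already present.
        var duplicateRejected = false;
        try
        {
            stock.Add("Apple", 1);
        }
        catch (ArgumentException)
        {
            duplicateRejected = true;
        }

        return (found, count, duplicateRejected);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine($"{result.Found}, {result.Count}, {result.DuplicateRejected}"); // prints "True, 3, True"
    }
}
```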
ConcurrentBag, ConcurrentDictionary, ConcurrentStack, and ConcurrentQueue
If you're leveraging all of the benefits of asynchronous and parallel programming, you may find yourself needing to manipulate a collection from multiple threads simultaneously. The standard collections are not thread-safe, so this can result in race conditions, corrupted state, or exceptions if multiple operations attempt to modify the collection at the same time. Enter the Concurrent family. For a bit of extra overhead, you can safely manipulate these collections from multiple threads all at once. ConcurrentDictionary, ConcurrentStack, and ConcurrentQueue should make it fairly apparent as to what they replace (Dictionary, Stack, and Queue, respectively), but ConcurrentBag doesn't have a direct counterpart in this article. It's a good replacement for an IList; however, it must be noted that there is no guaranteed order or index access, so if these are required, you must convert your ConcurrentBag to a List or Array after you've finished manipulating it.
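As a rough sketch of how that looks in practice, the example below fills a ConcurrentBag and a ConcurrentDictionary from Parallel.For, then converts the bag to an ordered List once the parallel work is finished. The workload (the numbers 0 to 999, counted as even or odd) is invented for the example, and ConcurrentDemo is a made-up name.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ConcurrentDemo
{
    public static (List<int> Ordered, int Evens, int Odds) Run()
    {
        var bag = new ConcurrentBag<int>();
        var counts = new ConcurrentDictionary<string, int>();

        // Many threads add to both collections at once without any explicit locking.
        Parallel.For(0, 1000, i =>
        {
            bag.Add(i);
            counts.AddOrUpdate(i % 2 == 0 ? "even" : "odd", 1, (_, current) => current + 1);
        });

        // The bag has no order or indexer, so sort it into a List once the work is done.
        var ordered = bag.OrderBy(n => n).ToList();

        return (ordered, counts["even"], counts["odd"]);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine($"{result.Ordered.Count}, {result.Evens}, {result.Odds}"); // prints "1000, 500, 500"
    }
}
```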
IEnumerable IRL
Recently I mistakenly used the incorrect enumerable in my code. Like most, I've always found that past errors make for a wonderful case study of "what not to do", so I figure we can use it here. As part of an MVC application, I was attempting to pull some entities from a SQL database, map them to a view model, and then display a view that listed them out for the user to see. My code looked something like this:
public class ViewModel {
    public IEnumerable<ViewModelEnumerableObject> Objects { get; set; }
}
public IActionResult ListObjects() {
    var viewModel = new ViewModel {
        Objects = dbContext.EnumerableObjects
            .Take(100)
            .Select(c => mapper.Map<ViewModelEnumerableObject>(c))
    };
    foreach (var obj in viewModel.Objects) {
        obj.foo = GenerateFooValue(obj.bar);
    }
    return View(viewModel);
}
mapper.Map<ViewModelEnumerableObject>(c) is a reference to a method from AutoMapper, which essentially transforms an incoming object into the type specified by the generic according to a pre-defined plan. It's an excellent package if you find yourself frequently mapping objects from one type to another (which seems to be almost all of the C# projects I undertake).
The issue I experienced was that none of the obj.foo values were appearing in the view, despite being able to step through GenerateFooValue with my debugger, and the value being set after that method had executed. If you've been following along (and I've done a half-decent job of writing this all up), you might see the issue with my code already. Let's walk through it.
var viewModel = new ViewModel {
    Objects = dbContext.EnumerableObjects
        .Take(100)
        .Select(c => mapper.Map<ViewModelEnumerableObject>(c))
};
dbContext.EnumerableObjects represents a SQL table via a DbSet, an IQueryable object.
This code creates a SQL query that looks a bit like: SELECT TOP 100 * FROM EnumerableObjects
That AutoMapper function (the Select part) needs to be executed in application memory after the query has completed. The resulting viewModel.Objects is an IQueryable, which, as IQueryable extends IEnumerable, is a permitted type for that property.
foreach (var obj in viewModel.Objects) {
    obj.foo = GenerateFooValue(obj.bar);
}
At the beginning of the foreach, viewModel.Objects is evaluated, and thus the query plan is executed. The data is pulled from the SQL database, and the EnumerableObjects results are transformed by AutoMapper into ViewModelEnumerableObjects. Each of the resulting objects is enumerated and its foo property is updated. However, the important thing to note is that these objects are only reachable inside the body of the foreach. When I later enumerate viewModel.Objects again in the resulting .cshtml template, those updates no longer exist. This is because a brand new set of objects is created according to the IQueryable query plan in viewModel.Objects, with the original obj.foo values.
So how do we fix that code? The issue stems primarily from the fact that I declared ViewModel.Objects as an IEnumerable. This allowed me to mistakenly save an IQueryable into that property. Since I needed my changes to persist in object memory, I should have declared ViewModel.Objects as ICollection instead. An IQueryable cannot be assigned to an ICollection, so the compiler would have forced me to load the enumerable into memory at the moment of assignment.
public class ViewModel {
    public ICollection<ViewModelEnumerableObject> Objects { get; set; }
}
public IActionResult ListObjects() {
    var viewModel = new ViewModel {
        Objects = dbContext.EnumerableObjects
            .Take(100)
            .Select(c => mapper.Map<ViewModelEnumerableObject>(c))
            .ToArray() // Executes the query and loads the mapped objects into memory
    };
    foreach (var obj in viewModel.Objects) {
        obj.foo = GenerateFooValue(obj.bar);
    }
    return View(viewModel);
}
This time, when viewModel.Objects is enumerated, both in the foreach and the subsequent .cshtml, the same array in application memory will be iterated both times, and thus the same foo property is updated and accessed.
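If you'd like to see the difference without a database or AutoMapper, the sketch below reproduces it with plain LINQ over an array. Item and ReenumerationDemo are made-up names for illustration; the deferred Select stands in for the query plan, and ToArray() stands in for the fix.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the view model object.
public class Item
{
    public int Bar { get; set; }
    public string Foo { get; set; } = "";
}

public static class ReenumerationDemo
{
    public static (string Deferred, string Materialized) Run()
    {
        var rows = new[] { 1, 2, 3 };

        // Deferred: Select runs again on every enumeration and builds new Item objects.
        IEnumerable<Item> deferred = rows.Select(r => new Item { Bar = r });
        foreach (var item in deferred)
        {
            item.Foo = "set"; // updates objects that are discarded after this loop
        }

        // Materialized: ToArray() creates the objects once and keeps them.
        ICollection<Item> materialized = rows.Select(r => new Item { Bar = r }).ToArray();
        foreach (var item in materialized)
        {
            item.Foo = "set"; // updates the objects stored in the array
        }

        // Enumerating again: the deferred version yields fresh, unmodified objects.
        return (deferred.First().Foo, materialized.First().Foo);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine($"deferred: '{result.Deferred}', materialized: '{result.Materialized}'");
        // prints "deferred: '', materialized: 'set'"
    }
}
```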