MongoDB C# Aggregating and Grouping By Distinct Fields

This post describes how to do the equivalent of a SQL SELECT DISTINCT in MongoDB on multiple fields. This seems to be a difficult solution to search for at the moment. The MongoDB .NET driver documentation links to the unit test files on GitHub for examples on aggregation related items, but the links are broken at the time of this post. Looking at the repos on GitHub is useful to find the actual tests, but for the specific problem I was trying to solve, it wasn't entirely helpful.

Basically what I was trying to do would be relatively easy with SQL -- a SELECT DISTINCT from a table/collection based on multiple fields. I really just wanted to get the unique combinations of several fields at the database layer without having to retrieve all the documents from the collection, and then figuring out which were distinct.

Below is a bit of a contrived example using an IMongoCollection<Location> named Locations where the code is grouping and retrieving the distinct set of records for State, StateShortName, StateSlug, and CountrySlug into a List<State>.

var uniqueStates = await _context
        .Locations
        .Aggregate()
        .Group(
            i => new { i.State, i.StateShortName, i.StateSlug, i.CountrySlug },
            g => new State
            {
                Name = g.First().State,
                ShortName = g.First().StateShortName,
                Slug = g.First().StateSlug,
                CountrySlug = g.First().CountrySlug
            })
        .ToListAsync();

The first argument to the Group method is the new _id for the group, and the second is the actual group/output into the output object. The group output needs to correspond to accumulator objects, such as First, Distinct, Max, Min, Sum, etc. This is also possible using BsonDocument, but I wanted to use expressions and make it more strongly typed.

It's more or less the equivalent of something like the below in SQL:

SELECT DISTINCT
	l.State, l.StateShortName, l.StateSlug, l.CountrySlug
	FROM Locations l

There may be a more straightforward way to do this, and if you know a better way, please feel free to let me know.