Ordering Sets in C++

Now some of you may be thinking “a set” like some kind of gang terminology. While I do belong to a gang of some hard core programmers here at DIC, I can assure you that we are only talk programming terms here. So I don’t want you to drive by my house or anyone to ask me to score some weed. However, if you want to know how to order objects in a set, besides being alphabetical, then you have come to the right purple house. Here we will show you how to order some integers in a different sorting manner than simply by ascending/alphabetical order. So take off your bandanna and kick it right here on the Programming Underground!

What is a set? A set is a type of container which stores unique objects in a sorted manner. What is meant by unique? It means that each object placed in a set (like a vector or a hash) formulates an associative key which is not repeated for any other elements. If you store the same object again into the same set, it’s key will be recalculated and overwrite the existing object. This means it is great for taking lists with duplicates values and “weeding” them out. Yeah ok, I can see where you would think I am a drug dealer or something.

Assume we have a list of values 1, 5, 2, 4, 3, 2, 8…. When we store each of these values into a set it will calculate a key based on the object and store it in a sorted order. It will also overwrite the double values thus removing them. You would add 1, then you would add 5, you would add 2 and 4 (which would go between the 1 and 5), you would add 3 (which would go between 2 and 4) and when it hits the second 2 it would go to the place where the first 2 was stored and overwrite it. You would then add the 8 and the result of the set would be…. 1, 2, 3, 4, 5, 8. An ordered set but with no repeat values.

Now that is all fine and dandy, but what if we wanted to create a set that wasn’t ordered in ascending order? What if we wanted a set that was descending? Or maybe we have a list of car objects and we wanted them sorted not by their model name, but by their make/color/year combined value?

Well it so happens that the C++ language enthusiasts thought of a way. When declaring the set instead of a simple declaration like set mylist; you can supply a structure (or function pointer) which implements a comparison function. This will tell the set how to compare two elements during its sort. In other languages, like Java, this would be called a Comparator. C# and VB.NET have similar methods as well.

During the sorting process of a set, C++ takes the value to be added and compares it to other elements already in the list to determine where to place the element. By supplying our own structure, and implementation of comparing, C++ will rely on our method for its comparing to know which way to sort. It does this by taking two values it is comparing and passing it to our function. Based on some arbitrary method that we come up with, we return true or false if one is greater than or lesser than another. Other languages implement a -1, 0, or 1 for the less than/equal/greater than scheme but that is not so here. The idea of equal won’t apply here because remember that if they are equal they are overwritten.

To show this in action we have some code which does two tests. First it creates a set with some standard numbers and then loops through them to show the result of a list in ordered sequence. The default behavior of a set. The second one implements a structure which has the () operator implemented to change the comparison between two objects to be the opposite of a normal sorted comparison. The result is that the list is the same as the first except in DESCENDING order instead of ASCENDING order.

This is a simple example of this process, but as you see the comparing function could actually be quite sophisticated, could access child member variables of the objects, call their functions and use the data IN THE OBJECT to actually determine which should appear first in the order.

#include <iostream>

#include <set>
using namespace std;

struct classcomp {
  bool operator() (const int& lhs, const int& rhs) const
  {return lhs>rhs;}
};

int main() {
	// Setup a standard set
	set<int> a;

	// Even though we insert in unsorted order, they come out sorted
	// according to value. Standard thing with a Set container.
	a.insert(1);
	a.insert(5);
	a.insert(2);

	// Iterator for ints
	set<int>::iterator it;

	// Show they are in order
	for ( it=a.begin(); it != a.end(); it++ )
		cout << " " << *it;

	cout << endl << endl;

	// New set, but this time with a structure used to
	// control the ordering. Takes two elements and returns true
	// false based on our comparison. (look at struct above)
	set<int,classcomp> b;

	// Again we insert just like "a"
	b.insert(1);
	b.insert(5);
	b.insert(2);

	// Iterator that takes int and classcomp struct
	set<int,classcomp>::iterator itt;

	// Iterate over this set
	// You will notice it is now in reverse order because
	// when we inserted it used the struct to determine sort order
	for ( itt=b.begin(); itt != b.end(); itt++ )
		cout << " " << *itt;

	cout << endl;

	return 0;
}

Some of the tricks to watch out for is dealing with the struct being defined as a parameter to the set. Notice that we supply the datatype for this set (notice the “int” for integer saying that this set is an integer set), then a comma, then the structure which implements our comparator function. Now also notice that our iterator (used for looping or “iterating” through the set) also defines the same datatype/comparator setup. This is crucial to keep in mind.

As you read through the code above, and follow my in-code comments, you will quickly see that there are two tests. The first one is the pretty straight forward set setup, with basic integers that will come out sorted, and the second test (test “b”) that will show the same as the first but with our comparing modification.

The last thing I want to point out is that the comparator function in our structure “classcomp” is passed two references to the objects. This means that you are actually given access to the two items compared, not copies. In addition, they are passed as “contants” so while you have complete access to the object, you do not have permission to actually modify the object values. Any attempt to do so will cause a nice fat compiler error.

So in the comparator you could implement something like…

if (strcmp(rhs->MakeColorYear(), lhs->MakeColorYear()) > 0) { 
	return true;
} 
else { return false; }

…to compare the make/color/year values of some car objects. Then test them for equality and based on the make/color/year values sort the cars in our set accordingly. Then, with our iterator, we could iterate through the set and list out the cars. Perhaps we can print their make, color and year values as we go.

Experiment with this and see if you can take a custom object, place it in the set, and sort it based on a value of the object. Keep in mind that the comparison function in our structure is going to determine that order. Then once you master this see if you can implement the sort order using a function pointer. The function pointer will go in place of our structure when setting up the set.

With this in your toolbox you are ready to roll with the DIC posse as we explore the underground even further. Thanks for reading and remember, if some gang wants you to join their “set” you be sure to set them out straight with some of your C++ set knowledge….. well that or run. :blink:

About The Author

Martyr2 is the founder of the Coders Lexicon and author of the new ebooks "The Programmers Idea Book" and "Diagnosing the Problem" . He has been a programmer for over 25 years. He works for a hot application development company in Vancouver Canada which service some of the biggest tech companies in the world. He has won numerous awards for his mentoring in software development and contributes regularly to several communities around the web. He is an expert in numerous languages including .NET, PHP, C/C++, Java and more.