Variance and Standard Deviation of an Array in C#

Statistics… the bane of some coders' existence. Why would it be any fun unless the stats say you are a superstar athlete (which coders are not), are going to win a huge prize like the lottery (which coders realize is futile at something like 1 in 13 million PER SELECTION) or that the probability for love is high… you mean with an actual girl? Hogwash! Nevertheless, statistics are what people want, and they often ask programmers for them. So let's cover a couple of the more basic ones. Given a range of values in an array, we will calculate the variance of those numbers. While we are at it, since we have the variance, why not get the standard deviation too? We will tackle these two basic functions in C#… all here on the blog with the widest variance and the most standard deviants on the net… the Programming Underground!

Let's start off with some basic definitions of these terms and what exactly our solution is going to tell us about the numbers.

Mean: The arithmetic average of all the values. Add all the numbers together and divide by the number of values to get the mean.

Variance: How spread out the values are from the mean. More precisely, it is the average of the squared distances between each value and the mean. If it were a dartboard, with the circle in the middle being the mean, a large variance would have darts spread all over the board. A low variance would have the darts bunched up close to the middle.

Standard Deviation: The standard deviation is the square root of the variance. Because the variance is built from squared distances, taking the square root puts the standard deviation back in the same units as the values and the mean. This makes these two indicators ‘comparable’.
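
For readers who prefer notation, the three definitions above can be written roughly as follows. The symbols are assumptions introduced here for illustration and do not appear elsewhere in this post: x_1 through x_n are the values, n is how many there are, \bar{x} is the mean, s^2 is the variance and s is the standard deviation. The n - 1 divisor is the one used by the code in this post.

```latex
\[
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2
\qquad
s = \sqrt{s^2}
\]
```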

Now knowing these definitions we can see that it isn’t too hard to get these values. The steps we will need to take are the following…

1) Get the average of the numbers in our set.
2) Go through each number in the set, subtract the average from it, and square the result.
3) Add all of those squared differences together.
4) Divide that sum by 1 less than the total number of values. If there is only one number (or none), return 0.
5) Steps 1 – 4 get us our variance. All that is left is to take the square root of that to get our standard deviation.
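
To make the steps concrete, here is the arithmetic worked by hand for the sample values 12, 39, 45, 47 and 56 used in the code for this post. The notation is an assumption added for illustration: x_i is each value, \bar{x} is the average, s^2 is the variance and s is the standard deviation.

```latex
\[
\begin{aligned}
\bar{x} &= \frac{12 + 39 + 45 + 47 + 56}{5} = \frac{199}{5} = 39.8 \\
\sum_{i} \left(x_i - \bar{x}\right)^2 &= (-27.8)^2 + (-0.8)^2 + 5.2^2 + 7.2^2 + 16.2^2 \\
 &= 772.84 + 0.64 + 27.04 + 51.84 + 262.44 = 1114.8 \\
s^2 &= \frac{1114.8}{5 - 1} = 278.7 \\
s &= \sqrt{278.7} \approx 16.69
\end{aligned}
\]
```

So the message box in the code should show a value close to 16.69.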

Here is how we did it in C#….

private void button1_Click(object sender, EventArgs e) {
	 // Our sample values
	 int[] nums = { 12, 39, 45, 47, 56 };

	 // Get the variance of our values
	 double varianceValue = variance(nums);

	 // Now calculate the standard deviation 
	 double stndDeviation = standardDeviation(varianceValue);

	 // Print out our result
	 MessageBox.Show(stndDeviation.ToString());

}

private double variance(int[] nums) {
	 if (nums.Length > 1) {

		  // Get the average of the values
		  double avg = getAverage(nums);

		  // Now figure out how far each point is from the mean
		  // So we subtract from the number the average
		  // Then raise it to the power of 2
		  double sumOfSquares = 0.0;

		  foreach (int num in nums) {
			   sumOfSquares += Math.Pow((num - avg), 2.0);
		  }

		  // Finally divide by n - 1 to get the sample variance
		  // Or divide by the length without subtracting one to get the population variance
		  return sumOfSquares / (double) (nums.Length - 1);
	 }
	 else { return 0.0; }
}

// Square root the variance to get the standard deviation
private double standardDeviation(double variance)
{
	 return Math.Sqrt(variance);
}

// Get the average of our values in the array
private double getAverage(int[] nums) {
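	 // An empty array has no mean, so fail loudly instead of reading nums[0]
	 if (nums.Length == 0) {
		  throw new ArgumentException("Cannot average an empty array.", "nums");
	 }

	 // Use a long so large int values cannot overflow the running total
	 long sum = 0;

	 // Sum up the values
	 foreach (int num in nums) {
		  sum += num;
	 }

	 // Divide by the number of values (a single value is its own average)
	 return sum / (double)nums.Length;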
}

As you can see above, we have an average function called getAverage() to compute the mean of the numbers. We then use that in the variance() function to loop through each value, subtract the average and square the result, summing as we go. At the end of the variance function we divide by one less than the number of values to get the sample variance, which is the right choice when the array is only a sample drawn from a larger population. If the array holds the entire population, we would divide by the total number of values instead to get the population variance.
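
If you want both flavors without editing the divisor by hand, one possible approach is a flag that picks between them. The following is only a minimal sketch, not the code from this post: the StatsSketch class, its method names and the sample parameter are hypothetical names made up for illustration, and it leans on LINQ's Average() and Sum() instead of the hand-written loops.

```csharp
using System;
using System.Linq;

// Hypothetical helper class for illustration; not part of the original code.
public static class StatsSketch
{
    // sample == true divides by n - 1 (sample variance);
    // sample == false divides by n (population variance).
    public static double Variance(int[] nums, bool sample = true)
    {
        if (nums == null) throw new ArgumentNullException(nameof(nums));

        // Fewer than two values: follow the post's convention and return 0
        if (nums.Length < 2) return 0.0;

        // Mean of the values (Average() on an int[] returns a double)
        double avg = nums.Average();

        // Sum of the squared distances from the mean
        double sumOfSquares = nums.Sum(n => (n - avg) * (n - avg));

        // Pick the divisor based on which variance was asked for
        int divisor = sample ? nums.Length - 1 : nums.Length;
        return sumOfSquares / divisor;
    }

    // Square root of whichever variance was requested
    public static double StandardDeviation(int[] nums, bool sample = true)
    {
        return Math.Sqrt(Variance(nums, sample));
    }
}
```

With the sample values from this post, StatsSketch.Variance(new[] { 12, 39, 45, 47, 56 }) should come out at roughly 278.7, while passing false as the second argument should give roughly 222.96, since the same sum of squares is divided by 5 instead of 4.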

We then toss the variance value into our standardDeviation() function to take its square root and return the result. That will be the standard deviation of our values.

Together they provide us all the basic stats we need for our set of values: the mean, the variance, and the standard deviation. Now you are armed with some basic stat-pushing functions to impress your friends or that special lady. Of course if she loves programming and stats she is very special indeed! 😉

Hope you enjoyed the code. Feel free to help yourself to a five finger discount on that code and do what you like with it. It is in the public domain as is all the code on the Programming Underground. Have fun and thanks for reading! 🙂

About The Author

Martyr2 is the founder of the Coders Lexicon and author of the new ebooks "The Programmers Idea Book" and "Diagnosing the Problem". He has been a programmer for over 25 years. He works for a hot application development company in Vancouver, Canada which services some of the biggest tech companies in the world. He has won numerous awards for his mentoring in software development and contributes regularly to several communities around the web. He is an expert in numerous languages including .NET, PHP, C/C++, Java and more.