Threads, ProgressBar, and File Processing in C#

We have had a few questions on the board recently asking about the best way to show the progress of a long-running process, such as reading a HUGE file. If you have a file with tens or hundreds of thousands of lines, or datasets which may have millions of records, it is going to take some time to process. It is fine that it takes a bit of time, it is understandable, but how about showing the user the progress WITHOUT tying up their user interface (so they can do other things)? When asking on the board, the first thing someone is going to say is “use threads”, but how do we go about doing that? How do we process the file on a separate thread while updating the progress bar for the user and allowing them to work with the interface? We cover these topics in this entry of the Programming Underground!

Damn! You try to code your program due at 4pm and it is taking forever to process this simple text file. C# is processing slower than molasses and you don’t even know how far along it is. Is it only half way through? Will you make the deadline? If only you had thrown in that progress bar so you could get an idea how much longer it will take! After all, you have Capty’s birthday to get to and you don’t want to miss him in his drunken, speedo-wearing debut. Glory waits for no one!

Well then you read this blog and there is hope. You will make it to that party I tell you! All you need is a worker thread, a progressbar, and a processing function. The program below is pretty simple if you follow the line of execution, which splits into two threads. We start out in our main thread. This is where you always start out in the program and this is where your forms and controls live. It is your main line of program execution. When this thread reaches the end, your program shuts down (as long as no other foreground threads are still running). But you can create other threads of execution that run alongside your program’s main thread. These threads are often referred to as “worker threads” and they can be set up to run functions and code in parallel with your main thread. So while the user plays with controls on the interface, you can have another thread in the background processing your file.

Let’s take a look at a simple program for processing our big file called “bigfile.txt”. Yeah I am creative I know. I set up this file to have over 17,000 lines of text in it. This isn’t too bad for .NET to process on modern processors, but if we chew through these lines a measly 10 characters at a time, it is going to take a little while. (Still faster than you think if all you are doing is reading and counting.)

I have set up a standard form with nothing but a button (to kick off the process), a progressbar (to show the progress through our file), and a label to act as a numeric representation of the value of our progress bar (this of course is optional and can be thought of as a percentage since our progressbar goes from 0 to 100).

threadexample.cs

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;

// Namespaces to read files (IO) and to handle threads (Threading)
using System.IO;
using System.Threading;

namespace threadingexample
{
	public partial class Form1 : Form
	{
		public Form1()
		{
			InitializeComponent();
		}

		private void btnStart_Click(object sender, EventArgs e)
		{
			// Start by setting up a new thread using the delegate ThreadStart
			// We tell it the entry function (the function to call first in the thread)
			// Think of it as main() for the new thread.
			ThreadStart theprogress = new ThreadStart(ReadFile);

			// Now the thread which we create using the delegate
			Thread startprogress = new Thread(theprogress);

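			// We can give it a name (optional)
			startprogress.Name = "Update ProgressBar";

			// Mark it as a background thread so it cannot keep the program
			// alive after the form has been closed
			startprogress.IsBackground = true;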

			// Start the execution
			startprogress.Start();
		   
		}

		// This delegate tells which function to call. It matches the signature
		// of our UpdateProgress function. During the invoke in ReadFile, we give it
		// the function to call which matches. In this case UpdateProgress.

		public delegate void updatebar();

		// This will run in the main thread so it can update the controls for us.
		private void UpdateProgress()
		{
			// Never step past Maximum; the Value setter throws if we do
			if (progressBar1.Value < progressBar1.Maximum)
			{
				progressBar1.Value += 1;
			}

			// Here we are just updating a label to show the progressbar value
			label1.Text = progressBar1.Value.ToString();
		}

		// This would be our version of main() for the new thread.
		// We start reading a huge file with 17000+ lines in it.
		private void ReadFile()
		{
			String bigFile = @"c:\bigfile.txt";

			// Let's get the length so that when we are reading we know
			// when we have hit a "milestone" and need to update the progress bar.
			FileInfo fileSize = new FileInfo(bigFile);
			long size = fileSize.Length;

			// Next we need to know where we are at and what defines a milestone in the
			// progress. So take the size of the file and divide it into 100 milestones
			// (which will match our 100 marks on the progress bar).
			// Math.Max keeps the milestone at 1 or more for files under 100 bytes.

			long currentSize = 0;
			long incrementSize = Math.Max(1, size / 100);
			
			// Open the big text file for reading. The using block closes the stream
			// for us, even if something throws part way through.
			using (StreamReader stream = new StreamReader(new FileStream(bigFile, FileMode.Open, FileAccess.Read)))
			{
				// This buffer is only 10 characters long so we process the file in 10 char chunks.
				// We could have boosted this up, but we want a slow process to show the slow progress.
				char[] buff = new char[10];

				// Read through the file until end of file
				while (!stream.EndOfStream)
				{
					// Add to the current position in the file. Read returns a count of
					// characters while size is in bytes; the two only match for
					// single-byte encoded text (plain ASCII, for example).
					currentSize += stream.Read(buff, 0, buff.Length);

					// Once we hit a milestone, subtract the milestone value and
					// call our delegate we defined above.
					// We must do this through Invoke since progressbar was created on the other
					// thread.
					if (currentSize >= incrementSize)
					{
						currentSize -= incrementSize;
						progressBar1.Invoke(new updatebar(this.UpdateProgress));
					}
				}
			}

			// The stream is closed by now, so show we are done.
			// When this function returns, our thread ends.
			MessageBox.Show("Done");
		}
	}
}

We first include the namespaces we need to process files (System.IO) and to handle threads (System.Threading). Then we kick off the whole process by loading up the form and having the user press the start button. This will trigger our btnStart_Click event and in there we do a few things. First we create an instance of a delegate called ThreadStart. While this is not absolutely needed, it is a great way to make sure that we set things up right the first time, and you can learn the shortcuts later. A delegate can be thought of as a pointer to a function. In this case it points to the startup function for our new thread, which is ReadFile(). Notice the function is defined as void and takes no parameters, because that is the signature ThreadStart requires. Think of it as a main() that takes no parameters.

The second step is that we then pass this delegate to our actual Thread class. This tells the thread that it must use the delegate, which points to our ReadFile() function, when we go to start the thread. At this same time I also give the thread a name (it is good to name threads to keep track of them and what they are doing) and lastly we start the thread.
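
As for the shortcuts mentioned above, here is a rough sketch of one. You can hand the function name straight to the Thread constructor and the compiler builds the ThreadStart delegate for you. The CountLines function and the ThreadShortcutDemo class are made up for this illustration (they stand in for ReadFile), and it runs as a plain console program so you can try it without a form.

```csharp
using System;
using System.Threading;

static class ThreadShortcutDemo
{
    // Hypothetical counter that our stand-in worker function fills in.
    public static int LinesSeen;

    // Stand-in for ReadFile(): any void function with no parameters fits ThreadStart.
    public static void CountLines()
    {
        for (int i = 0; i < 17000; i++)
        {
            LinesSeen++;
        }
    }

    public static Thread StartWorker()
    {
        // No "new ThreadStart(...)" needed; the compiler creates the delegate.
        Thread worker = new Thread(CountLines)
        {
            Name = "Update ProgressBar",
            IsBackground = true   // do not keep the process alive on our account
        };
        worker.Start();
        return worker;
    }

    static void Main()
    {
        Thread worker = StartWorker();

        // Join is only here so the console demo waits for the result.
        // A UI thread should not block like this.
        worker.Join();

        // Prints: Update ProgressBar finished after 17000 lines
        Console.WriteLine(worker.Name + " finished after " + LinesSeen + " lines");
    }
}
```

It does exactly what the longer two-step version does, so use whichever one reads better to you.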

Now C# is going to create another thread, call the delegate which points to our entry function ReadFile(), and start executing that alongside our main thread. The two threads run independently, each with its own call stack and local variables, although both can still see the form’s fields. The catch is that Windows Forms controls belong to the thread that created them, and touching one from any other thread is not safe. So our progress bar that we wanted to update lives on the main thread and our new worker thread cannot update it directly…. yet!

So we need a mechanism to let the worker thread update the progressbar control. This again is where delegates come in. We create a delegate type called “updatebar” and point an instance of it at “UpdateProgress”, a function we want to run on the main thread. UpdateProgress() is in charge of updating our progress bar and our label. So every time the delegate is invoked, it calls UpdateProgress, where we add one to both the progressbar and label controls in the main thread. That happens whenever a file processing milestone is hit. A milestone is a place in the file that we have decided warrants updating our progress indicators for the user. Since our progress bar has 100 ticks, we divide the file length by 100, and each time we read past another hundredth of the file we update the progress bar.
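
To make the milestone arithmetic concrete, here is a small console sketch. Note that it is a variation and not the code from the listing: instead of counting up to a milestone and subtracting, it works out the percentage directly and only reports when the number changes. The PercentDone helper and the 850,000 byte file size are invented for the example.

```csharp
using System;

static class MilestoneMath
{
    // Percentage of the file consumed so far, clamped to 0..100.
    public static int PercentDone(long bytesRead, long totalBytes)
    {
        if (totalBytes <= 0) return 100;   // treat an empty file as already finished
        long percent = bytesRead * 100 / totalBytes;
        return (int)Math.Min(100, Math.Max(0, percent));
    }

    static void Main()
    {
        long size = 850000;     // hypothetical file size in bytes
        int lastReported = 0;
        int updates = 0;

        // Walk the "file" in 10 byte chunks, much like ReadFile does.
        for (long read = 10; read <= size; read += 10)
        {
            int percent = PercentDone(read, size);
            if (percent != lastReported)
            {
                lastReported = percent;
                updates++;      // this is where the progress bar would be updated
            }
        }

        // Prints: 100 updates, final value 100
        Console.WriteLine(updates + " updates, final value " + lastReported);
    }
}
```

Computing the percentage this way means the value can never run past 100, no matter how small the file is or how the division rounds.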

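Lastly, as we process the file we call Invoke on the progressbar (which the worker thread can see but must not manipulate directly) and hand it the delegate. Invoke passes the call over to the thread that owns the control, waits there until UpdateProgress has finished, and then returns to the worker. Hopefully that isn’t too confusing.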
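
If it helps to see the invoke step on its own, below is a stripped-down sketch that uses the InvokeRequired property of the control to decide whether a hop to the main thread is needed. Everything here (the ProgressForm class, SetProgress, the 50 millisecond sleep standing in for real file work) is made up for illustration, and the form is built in code so there is no designer file to worry about. It assumes a Windows Forms project on Windows.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

// Hypothetical form built in code so the sketch needs no designer file.
class ProgressForm : Form
{
    private readonly ProgressBar bar = new ProgressBar();
    private readonly Button start = new Button();

    public ProgressForm()
    {
        bar.Dock = DockStyle.Top;
        start.Dock = DockStyle.Bottom;
        start.Text = "Start";
        start.Click += StartClicked;
        Controls.Add(bar);
        Controls.Add(start);
    }

    public int CurrentValue
    {
        get { return bar.Value; }
    }

    private void StartClicked(object sender, EventArgs e)
    {
        start.Enabled = false;   // one worker at a time
        Thread worker = new Thread(Work);
        worker.IsBackground = true;
        worker.Start();
    }

    // Runs on the worker thread.
    private void Work()
    {
        try
        {
            for (int percent = 1; percent <= 100; percent++)
            {
                Thread.Sleep(50);   // stand-in for a slice of file processing
                SetProgress(percent);
            }
        }
        catch (InvalidOperationException)
        {
            // The form was closed while we were working; nothing left to update.
        }
    }

    // Safe to call from either thread.
    public void SetProgress(int percent)
    {
        if (bar.InvokeRequired)
        {
            // We are on the worker thread: ask the owning thread to make this same call.
            bar.Invoke(new Action<int>(SetProgress), percent);
            return;
        }

        // We are on the thread that owns the control, so touching it is fine.
        bar.Value = Math.Max(bar.Minimum, Math.Min(bar.Maximum, percent));
    }

    [STAThread]
    static void Main()
    {
        Application.Run(new ProgressForm());
    }
}
```

The nice thing about this pattern is that the caller never has to care which thread it is on. The catch block is a judgment call on my part: it just lets the worker exit quietly if the form disappears mid-run.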

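Once we are done with processing the file, we reach the end of our worker thread. The main thread remains unaffected and keeps running the form. When that main thread ends, the program ends too, with one catch: a thread created with new Thread() is a foreground thread by default, and .NET keeps the process alive until every foreground thread has finished. Set the thread’s IsBackground property to true if closing the form should stop the work as well.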

That is all there is to it. The only trick here is to use the delegate to reach back into the main thread FROM the worker thread to update the controls. Now that our worker thread is busy processing our big file, the main thread is keeping track of our user interface and freeing it up (from the user’s perspective) for the user to do other things like press buttons and manipulate other controls.

Now there is a trade off with using multiple threads and it comes at the price of CPU cycles. The operating system’s scheduler hands each thread a short timeslice, runs it for a little bit, and then moves on to the next one (it will come back to the first one later). It goes around the threads like this, giving you the appearance that you are doing multiple things at once when, on a single core, the CPU is really still doing everything in baby steps. Every switch between threads also costs a little time of its own.

With the arrival of multi-core and multi-CPU computers, threads can truly run at the same time on different cores, so you should see speed improvements over traditional single-CPU computers. But if you have too many threads doing too many things at once, your computer will slow to a crawl. So use them wisely and treat them as a limited resource. Other programs on your computer will be using threads as well, so keep that in mind when using them. If you can keep the processing down to only a few threads, all the better.

In personal experience I have only ever needed 2 or 3 threads at maximum to get most things done (unless you are creating servers or chat software which typically uses a lot of threads.. one handling each client or user you are connected to). But try to keep it under that limit when you go with design and you should be ok.
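
One way to keep the thread count in check without counting by hand is to let .NET’s shared thread pool run the work for you. The sketch below is only an illustration of that idea and is not part of the program above: SumTo is a made-up job, and Task.Run queues each one onto pool threads that .NET creates and reuses as it sees fit.

```csharp
using System;
using System.Threading.Tasks;

static class PoolDemo
{
    // Hypothetical unit of background work: add up the numbers 1..n.
    public static long SumTo(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++) total += i;
        return total;
    }

    public static long[] RunJobs(int[] inputs)
    {
        Task<long>[] jobs = new Task<long>[inputs.Length];
        for (int i = 0; i < inputs.Length; i++)
        {
            int n = inputs[i];                    // copy so each lambda captures its own value
            jobs[i] = Task.Run(() => SumTo(n));   // queued to the shared thread pool
        }

        Task.WaitAll(jobs);                       // fine in a console demo; do not block a UI thread

        long[] results = new long[jobs.Length];
        for (int i = 0; i < jobs.Length; i++) results[i] = jobs[i].Result;
        return results;
    }

    static void Main()
    {
        Console.WriteLine("Logical processors: " + Environment.ProcessorCount);

        long[] results = RunJobs(new[] { 10, 100, 1000 });

        // Prints: 55, 5050, 500500
        Console.WriteLine(string.Join(", ", results));
    }
}
```

For one long job like our file reader a dedicated thread is perfectly reasonable. The pool starts to pay off when you have lots of short jobs and do not want a new thread for each one.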

In short this program goes…

1. Button click
2. Create and start worker thread
3. Call ReadFile for our worker thread
4. Process the file while our main thread continues with the updating of the user interface.
5. During the processing it calls the delegate to call UpdateProgress to update the controls back on the main thread
6. Worker thread ends at the end of the processing and says “Done”.
7. Main thread ends and program ends.

I hope this makes sense. If you follow along with the in-code comments, you can see how this has been accomplished. If you keep one thread for the user interface and other threads for background processes, you can do many things: process images, process big files, transfer files that are quite big, establish other chat connections, etc.

Now get out of that cave and over to the party! Capty is already half way through a keg and is dancing with a lampshade on his head. Thanks for reading and enjoy! 🙂

About The Author

Martyr2 is the founder of the Coders Lexicon and author of the new ebooks "The Programmers Idea Book" and "Diagnosing the Problem". He has been a programmer for over 20 years. He works for a hot application development company in Vancouver, Canada, which services some of the biggest tech companies in the world. He has won numerous awards for his mentoring in software development and contributes regularly to several communities around the web. He is an expert in numerous languages including .NET, PHP, C/C++, Java and more.