Using Hashes To Check If A File Changed in VB.NET

Once in a while you may want to check whether a file's contents have changed. Perhaps you are monitoring a log file or checking whether a virus has tampered with an exe. One way to do this is to hook up a FileSystemWatcher object and monitor the file. That is a good approach for more elaborate programs where you also want to watch for the creation or deletion of files, monitor more than one file or watch an entire directory. However, you may want something simpler. The idea below can be scaled up to monitor several files for small changes, or to check for file changes on your own schedule. We will show you how to use a simple file hash to check for such changes in VB.NET on this entry of the Programming Underground!

To start off, we want to be able to do three things in our example program. First, we want to locate a file. Second, we want to open that file and generate a hash based on its contents. The file could be a text file or a binary file; it doesn't matter. Whatever that hash is, we save it along with the location of the file for convenience. The third and last step is to check the file for a change by opening that same file, recomputing the hash and comparing it to the one we saved. That is all there is to it! 🙂

Now if we wanted to take this to the next level, we could simply open several files and store the hashes in an array (along with the path to each file so we know which hash belongs to which file) and compare them again later. We could even sort the array and check files in ascending file name order, or sort on some other attribute like the file's modified or created date.
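As a rough sketch of that multi-file idea, the module below keeps the hashes in a Dictionary keyed by file path rather than a plain array. The names MultiFileHashes, HashFile, TakeSnapshot and FindChangedFiles are hypothetical and not part of the example program, and treating a deleted file as "changed" is an assumption made for this sketch.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq
Imports System.Security.Cryptography

' Hypothetical helper module, not part of the article's form.
Public Module MultiFileHashes
    ' Hashes a file's contents with MD5, the same algorithm the article uses.
    Public Function HashFile(ByVal filePath As String) As Byte()
        Using alg As HashAlgorithm = MD5.Create()
            Using stream As New FileStream(filePath, FileMode.Open, FileAccess.Read)
                Return alg.ComputeHash(stream)
            End Using
        End Using
    End Function

    ' Builds a path -> hash snapshot for every listed file that currently exists.
    Public Function TakeSnapshot(ByVal filePaths As IEnumerable(Of String)) As Dictionary(Of String, Byte())
        Dim snapshot As New Dictionary(Of String, Byte())
        For Each filePath As String In filePaths
            If File.Exists(filePath) Then
                snapshot(filePath) = HashFile(filePath)
            End If
        Next
        Return snapshot
    End Function

    ' Returns the paths whose current hash differs from the snapshot, sorted by path.
    ' Assumption for this sketch: a file that no longer exists counts as changed.
    Public Function FindChangedFiles(ByVal snapshot As Dictionary(Of String, Byte())) As List(Of String)
        Dim changed As New List(Of String)
        For Each entry As KeyValuePair(Of String, Byte()) In snapshot
            If Not File.Exists(entry.Key) OrElse Not HashFile(entry.Key).SequenceEqual(entry.Value) Then
                changed.Add(entry.Key)
            End If
        Next
        changed.Sort()
        Return changed
    End Function
End Module
```

You would call TakeSnapshot() once with the list of paths you care about, then call FindChangedFiles() with that snapshot whenever you want to check again.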

Below is a simple Windows Forms application in VB.NET with three buttons (btnSelectFile, btnCheckFileChanges and btnChangeSlightly) and an OpenFileDialog named dlgOpenFile.

Imports System.IO
Imports System.Security.Cryptography

Public Class frmCheckFile
    Private fileToCheck As String = ""
    Private fileHash As Byte()

    ' Select file button opens a file dialog to select a file to check.
    Private Sub btnSelectFile_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnSelectFile.Click
        If dlgOpenFile.ShowDialog() = DialogResult.OK Then
            If Not String.IsNullOrEmpty(dlgOpenFile.FileName) Then

                ' Save the file name so we can check later. 
                ' Then compute its hash and save that.
                fileToCheck = dlgOpenFile.FileName
                fileHash = ComputeFileHash(fileToCheck)
            Else
                MessageBox.Show("Please select a valid file to compute the hash for.")
            End If
        End If
    End Sub

    ' Check File button recomputes hash for the saved file name and compares it to the old saved hash.
    Private Sub btnCheckFileChanges_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnCheckFileChanges.Click
        If Not String.IsNullOrEmpty(fileToCheck) Then
            Dim newFileHash As Byte() = ComputeFileHash(fileToCheck)

            If CompareByteHashes(newFileHash, fileHash) Then
                MessageBox.Show("They are the same, no changes were made.")
            Else
                MessageBox.Show("The file " & fileToCheck & " was changed.")
            End If
        Else
            MessageBox.Show("Please select a file to hash before you attempt to check it for changes.")
        End If
    End Sub

    ' Simply opens up the file name we have stored and appends a byte onto the end of it to change it slightly.
    Private Sub btnChangeSlightly_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnChangeSlightly.Click
        If Not String.IsNullOrEmpty(fileToCheck) Then
            Try
                ' Using closes the stream even if the write fails.
                Using fileToChange As New FileStream(fileToCheck, FileMode.Open, FileAccess.Write)

                    ' Write a byte representing "A" (ASCII 65) onto the end of the file.
                    fileToChange.Seek(0, SeekOrigin.End)
                    fileToChange.WriteByte(65)
                End Using

                MessageBox.Show("File successfully modified.")

            Catch ex As IOException
                MessageBox.Show("Error opening or writing to file: " & ex.Message)
            End Try
        Else
            MessageBox.Show("Please select a file to check/change first before trying to change it.")
        End If
    End Sub

    ' Calculates a file's hash value and returns it as a byte array.
    Private Function ComputeFileHash(ByVal fileName As String) As Byte()
        Dim ourHash(0) As Byte

        ' If the file exists, create a HashAlgorithm instance for MD5 (a hashing algorithm, not encryption).
        ' You could use a variant of SHA or RIPEMD160 if you like with larger hash bit sizes.
        If File.Exists(fileName) Then
            Try
                ' Using disposes the algorithm and closes the stream even if hashing fails.
                Using ourHashAlg As HashAlgorithm = MD5.Create()
                    Using fileToHash As New FileStream(fileName, FileMode.Open, FileAccess.Read)

                        ' Compute the hash to return using the Stream we created.
                        ourHash = ourHashAlg.ComputeHash(fileToHash)
                    End Using
                End Using

            Catch ex As IOException
                MessageBox.Show("There was an error opening the file: " & ex.Message)
            End Try
        End If

        Return ourHash
    End Function

    ' Return true/false if the two hashes are the same.
    Private Function CompareByteHashes(ByVal newHash As Byte(), ByVal oldHash As Byte()) As Boolean

        ' If any of these conditions are true, the hashes are definitely not the same.
        ' OrElse short-circuits, so .Length is never read on a Nothing reference.
        If newHash Is Nothing OrElse oldHash Is Nothing OrElse newHash.Length <> oldHash.Length Then
            Return False
        End If

        ' Compare each byte of the two hashes. Any time they are not the same, we know there was a change.
        For i As Integer = 0 To newHash.Length - 1
            If newHash(i) <> oldHash(i) Then
                Return False
            End If
        Next i

        Return True
    End Function
End Class

The code above is commented so that you can follow along as we talk about each part. It starts with three button event handlers which match up to the three buttons on our simple form. The first button is responsible for opening a file dialog and letting the user select a file. Once they do that and press OK, it stores the name of the file and passes it to our ComputeFileHash() function to get a byte array. This byte array is a hash of the file's contents and we save that as well. The hash value is calculated here using an MD5 hashing algorithm provider. You could use SHA1, one of the larger SHA variants, or even RIPEMD160 instead.
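If you want to try one of the SHA variants, a minimal sketch of the swap might look like the following. The module and function names (Sha256FileHash, ComputeSha256) are made up for illustration; the only real change from the MD5 version is the call to SHA256.Create(), which produces a 32-byte hash instead of MD5's 16 bytes.

```vbnet
Imports System.IO
Imports System.Security.Cryptography

' Hypothetical variation on ComputeFileHash() that uses SHA-256 instead of MD5.
Public Module Sha256FileHash
    ' Returns the SHA-256 hash (32 bytes) of the file's contents.
    ' Unlike the article's function, this sketch lets any IOException propagate to the caller.
    Public Function ComputeSha256(ByVal fileName As String) As Byte()
        Using alg As HashAlgorithm = SHA256.Create()
            Using stream As New FileStream(fileName, FileMode.Open, FileAccess.Read)
                Return alg.ComputeHash(stream)
            End Using
        End Using
    End Function
End Module
```

Because CompareByteHashes() only looks at lengths and bytes, it should work unchanged with the longer hash, as long as both hashes come from the same algorithm.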

The next button is for checking the two hashes. One is the hash we stored away when we first selected the file; the other is a newly generated hash based on the current file contents. In other words, we are comparing the hash computed now to the hash computed then. This event handler passes the two hashes to our helper function CompareByteHashes(), which returns True if they match and False if they do not.
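If you would rather display the hash or save it to a text file than keep a byte array around, one option is to turn it into a hex string and compare strings instead. This is only a sketch of that alternative; HashText, ToHex and SameHash are hypothetical names that do not appear in the program above.

```vbnet
Imports System

' Hypothetical helpers for showing or storing a hash as text.
Public Module HashText
    ' Converts a hash to a lowercase hex string with no separators.
    Public Function ToHex(ByVal hash As Byte()) As String
        Return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant()
    End Function

    ' Compares two hex strings, ignoring upper/lower case differences.
    Public Function SameHash(ByVal hexA As String, ByVal hexB As String) As Boolean
        Return String.Equals(hexA, hexB, StringComparison.OrdinalIgnoreCase)
    End Function
End Module
```

For example, passing the fileHash field to ToHex() gives a 32-character string for an MD5 hash, which is easy to show in a label or write to a file next to the file path.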

Our third button changes the file slightly by appending a single byte to it. Once we modify the file, we can press the check file button again and it will report that the file was changed. Even a slight change to the contents of a file drastically alters the hash value generated for it. There is a very low probability that two different versions of a file would produce the same hash by accident, so this is a great way to check for a file change.
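To see that effect without touching a real file, here is a small in-memory sketch that hashes a string, hashes the same string with one extra "A" on the end, and counts how many byte positions of the two hashes differ. AvalancheDemo, Md5Of and CountDifferingBytes are hypothetical names used only for this illustration.

```vbnet
Imports System
Imports System.Security.Cryptography
Imports System.Text

' Hypothetical demo module: compares the MD5 hashes of two nearly identical inputs.
Public Module AvalancheDemo
    ' Returns the MD5 hash of the UTF-8 bytes of the input string.
    Public Function Md5Of(ByVal input As String) As Byte()
        Using alg As HashAlgorithm = MD5.Create()
            Return alg.ComputeHash(Encoding.UTF8.GetBytes(input))
        End Using
    End Function

    ' Counts the positions at which the two hashes hold different bytes.
    Public Function CountDifferingBytes(ByVal a As Byte(), ByVal b As Byte()) As Integer
        Dim count As Integer = 0
        For i As Integer = 0 To Math.Min(a.Length, b.Length) - 1
            If a(i) <> b(i) Then
                count += 1
            End If
        Next
        Return count
    End Function
End Module
```

Calling CountDifferingBytes(Md5Of("hello"), Md5Of("helloA")) should show that most of the 16 bytes differ, even though the inputs differ by a single character.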

By using a simple hash approach like this, we can determine if a file has been changed without our knowledge. Many viruses have been known to infect existing exe files by appending bytes, and this procedure would pick that up as well. However, if you are looking for something much more elaborate, you can again look at the FileSystemWatcher class and get access to all of its functionality. Perhaps you could create your own class based on the code above to do something similar!
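For comparison, a bare-bones FileSystemWatcher setup might look something like the sketch below. It is not part of the example program; WatcherSketch, WatchFile and OnChanged are hypothetical names, and writing to the console is just a placeholder for whatever you would do when a change is reported (such as recomputing the hash).

```vbnet
Imports System
Imports System.IO

' Hypothetical sketch of the FileSystemWatcher alternative mentioned in the article.
Public Module WatcherSketch
    ' Watches a single file and raises Changed when its last write time or size changes.
    ' The caller should keep a reference to the returned watcher and Dispose() it when done.
    Public Function WatchFile(ByVal filePath As String) As FileSystemWatcher
        Dim fullPath As String = Path.GetFullPath(filePath)
        Dim watcher As New FileSystemWatcher(Path.GetDirectoryName(fullPath), Path.GetFileName(fullPath))
        watcher.NotifyFilter = NotifyFilters.LastWrite Or NotifyFilters.Size
        AddHandler watcher.Changed, AddressOf OnChanged
        watcher.EnableRaisingEvents = True
        Return watcher
    End Function

    ' Placeholder handler: reports which file changed.
    Private Sub OnChanged(ByVal sender As Object, ByVal e As FileSystemEventArgs)
        Console.WriteLine("Changed: " & e.FullPath)
    End Sub
End Module
```

The watcher tells you that something wrote to the file as it happens, while the hash approach tells you whether the contents actually differ at the moment you choose to check.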

I am pretty sure you can see what is going on in the code above, and you are more than welcome to rip out the ComputeFileHash() and CompareByteHashes() functions to put in your ever-expanding function library. They make great additions. Thanks for reading! 🙂

About The Author

Martyr2 is the founder of the Coders Lexicon and author of the new ebooks "The Programmers Idea Book" and "Diagnosing the Problem". He has been a programmer for over 20 years. He works for a hot application development company in Vancouver, Canada, which services some of the biggest tech companies in the world. He has won numerous awards for his mentoring in software development and contributes regularly to several communities around the web. He is an expert in numerous languages including .NET, PHP, C/C++, Java and more.