Batch Renaming of Files Using Recursion in Python

Someone came into the Dream.In.Code forums the other day asking about batch renaming of files using Python. I thought I would work through the example for everyone to see and create something you could add to your own personal code library. Python makes this pretty easy using “os” and “os.path”, but one tricky part I found was dealing with isdir() and similar functions, which need a full path. I will discuss that in more detail later. Below I share my solution, which will hopefully save someone some time in the future. Let’s get coding!

The Plan

Before we begin, I want to set some ground rules for this function and how it should work. One feature I wanted was the ability to batch rename files not only in the specified directory, but also in any subdirectories within it. I also wanted a way to limit how far down the function would go, meaning its depth. Should it keep descending until it runs out of subdirectories, or should it stop after, say, 2 levels? In addition, I wanted to make sure it does not follow any symbolic links it runs across. This is more of a Linux/Unix consideration, but it might affect a Windows user as well. I chose not to follow symbolic links so that I only change files in my directory of focus and nothing spills out into some other directory I might have linked to.

To control the depth I simply added an optional integer argument to the function parameter list. I can use this integer to essentially count down the levels as I move down the hierarchy tree. This will allow me to specify a value like 3 and each time I go down into a subdirectory, I subtract 1 from this count until I reach a value less than 0. Then I can stop further processing and return back up the tree.
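As a rough illustration of the countdown on its own, here is a minimal sketch that only records which directories would be visited. The name collect_dirs is hypothetical and not part of the function presented in this article; it simply isolates the depth logic described above.

```python
import os

def collect_dirs(path, depth=1):
    """Return the directories visited, stopping once depth drops below 0."""
    # Base case: the countdown has run out, so go no deeper.
    if depth < 0:
        return []

    visited = [path]
    for name in sorted(os.listdir(path)):
        fullpath = os.path.join(path, name)
        # Only descend into real directories, never symbolic links.
        if os.path.isdir(fullpath) and not os.path.islink(fullpath):
            visited.extend(collect_dirs(fullpath, depth - 1))
    return visited
```

With a depth of 0 only the starting directory is returned, with 1 its immediate subdirectories are included, and so on.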

To avoid the symbolic links, I simply check the file I am processing each time to see if it is a link using the os.path.islink() method. I do this before using a method like isfile() because isfile() actually follows symbolic links.
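To show why the order of the checks matters, here is a small sketch. The helper name classify is made up for this illustration; the os.path calls are the standard ones.

```python
import os

def classify(fullpath):
    """Label a path, testing for a symbolic link before anything else."""
    # islink() inspects the link itself, so it has to come first:
    # isfile() and isdir() both follow links and report on the target.
    if os.path.islink(fullpath):
        return "link"
    if os.path.isdir(fullpath):
        return "directory"
    if os.path.isfile(fullpath):
        return "file"
    return "other"
```

If the isfile() test came first, a link pointing at a regular file would be reported as a plain file and could end up renamed.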

The Code

The code I list below has been tested on Windows 7 and Ubuntu 14.04, but I am sure it will work on other flavors as well. We start by defining the function to accept a base file path to start from and a base file name we want to rename each file to. This base file name has an integer appended to it, and the file keeps its original extension. For instance, if we pass it “test” it will name the files “test_1”, “test_2”, “test_3” and so on (for example “test_1.txt” for a text file). Each time it goes into a subdirectory, it restarts the counting. Here is how this function looks…

import os

def renameFiles(path, basefilename, depth=1):
    # Once we hit depth, just return (base case)
    if depth < 0:
        return
	
    # Make sure that a path was supplied and it is not a symbolic link
    if os.path.isdir(path) and not os.path.islink(path):
        ind = 1

        # Loop through each file in the start directory and create a fullpath
        for file in os.listdir(path):
            fullpath = path + os.path.sep + file

            # Again we don't want to follow symbolic links
            if not os.path.islink(fullpath):

                # If it is a directory, recursively call this function 
                # giving that path and reducing the depth.
                if os.path.isdir(fullpath):
                    renameFiles(fullpath, basefilename, depth - 1)
                else:
                    # Find the extension (if available) and rebuild file name 
                    # using the directory, new base filename, index and the old extension.
                    extension = os.path.splitext(fullpath)[1]
                    os.rename(fullpath, os.path.dirname(fullpath) + os.path.sep + basefilename + "_" + str(ind) + extension)
                    ind += 1
    return

We start off by importing the os module and then defining our function to take the path, the base file name and the depth. Here I specified a depth of 1 as the default value so that we only go one level deep (one set of subdirectories). If you want to process just the base directory, set depth to 0. We then check whether the depth is less than zero. This is our base case for the recursion. Once it is true, we know not to proceed further and immediately return.

The next step is to check that the path specified is a directory and not a symbolic link. If so, we create a counter variable and loop over the files and directories in the current directory. For each entry we build a full path: the base path, plus whatever separator the current environment uses, plus the entry name. On Windows the separator is typically the backslash; on Linux/Unix it is the forward slash. This is the tricky part I mentioned earlier. If you pass isdir() or similar functions just the bare name returned by listdir(), it is treated as relative to the current working directory rather than to the directory being processed, so the checks return False unless that happens to be your working directory. Building the full path as we do here avoids the problem and keeps things explicit.
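The difference is easy to see side by side. The two helper names below are hypothetical and exist only for this comparison; os.path.join is used here as an equivalent way of gluing the path and name together with the right separator.

```python
import os

def subdirs_bare(path):
    # Bare names are resolved against the current working directory,
    # so this usually misses the subdirectories that live under `path`.
    return [name for name in os.listdir(path) if os.path.isdir(name)]

def subdirs_full(path):
    # Joining with `path` first makes the check independent of the cwd.
    return [name for name in os.listdir(path)
            if os.path.isdir(os.path.join(path, name))]
```

Run from any directory other than the one being scanned, subdirs_bare() will normally come back empty while subdirs_full() finds the subdirectories.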

If the current entry is a directory, we recursively call the function, giving it the new full path, the same base name and the depth minus 1. Once the depth drops below zero, we hit our base case and return.

For any current file which is actually a real file, we are going to fetch out the extension of the file and use it along with the directory name of the full path, separator, the base name and index to build a new file name. Here we are making a file in the format “basename_index.ext” like “myfile_1.txt”. If you wish to have a different naming scheme, this is where you would introduce it. I went with an index numbering system to make sure we never had two files the same and to show you a bit of how you could dynamically build a name.
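If you do want to swap in a different naming scheme, it can help to pull the name building out into its own small function. This is only a sketch; build_new_name is a hypothetical helper that produces the same “basename_index.ext” format as the code above.

```python
import os

def build_new_name(fullpath, basefilename, ind):
    """Return the renamed path: same directory, new base, old extension."""
    directory = os.path.dirname(fullpath)
    extension = os.path.splitext(fullpath)[1]  # "" when there is no extension
    return os.path.join(directory, "{}_{}{}".format(basefilename, ind, extension))
```

Note that os.path.splitext() only splits off the last suffix, so a file like “archive.tar.gz” would keep just “.gz”.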

Lastly, we increment the index and move on to the next file or subdirectory until we eventually run out of entries.

The Conclusion

This function is pretty straightforward and should prove useful in projects where you need to do any kind of batch file processing. Here we used the os module to rename files, but if you boil this function down to a pattern, it lets you run almost any operation on each file or directory. The main piece of gold here is the recursive navigation of the directory hierarchy while limiting how deep it will go. We could use this to rename files (as shown here), move files, open files and edit them based on an extension, concatenate files together to create new files, open files in a program, or all of the above in a single sweep. Play around with it and see what you can do. One limitation is the case where a file cannot be renamed. Perhaps you don’t have permissions, or the file is locked because another program is using it. See if you can make it better by checking for this and silently skipping the file.
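One possible way to approach that exercise is sketched below. The helper name try_rename is hypothetical, and the extra check for an existing destination is my own assumption rather than something the original function does.

```python
import os

def try_rename(src, dst):
    """Rename src to dst; return True on success, False if it was skipped."""
    # Assumption for this sketch: skip rather than clobber, because on
    # POSIX systems os.rename() silently replaces an existing file at dst.
    if os.path.exists(dst):
        return False
    try:
        os.rename(src, dst)
    except OSError:
        # Covers permission errors, locked files, a missing source, etc.
        return False
    return True
```

Calling this in place of os.rename(), and only incrementing the index when it returns True, would let the loop carry on past files it cannot touch.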

I hope you find this function useful. As with any of the code displayed on The Coders Lexicon, it is in the open domain for free copying and modification. Thanks for reading! 🙂

About The Author

Martyr2 is the founder of the Coders Lexicon and author of the new ebooks "The Programmers Idea Book" and "Diagnosing the Problem". He has been a programmer for over 18 years. He works for a hot application development company in Vancouver, Canada, which services some of the biggest telecoms in the world. He has won numerous awards for his mentoring in software development and contributes regularly to several communities around the web. He is an expert in numerous languages including .NET, PHP, C/C++, Java and more.
  • Use os.walk: https://docs.python.org/library/os.html#os.walk This would eliminate almost all lines of code in your renameFiles function, and it would avoid the possibility of “maximum recursion depth exceeded” error.

    • That would certainly be another way to do it. I don’t know if it actually shortens things though because you would still have to implement depth limiting, pulling off the extension and dynamically building a filename. There are also discussions that os.walk may actually be slower (can’t say that it is or not at this point). I guess it makes calls to listdir() and stat() in the background.

      But certainly another idea to play with. Thanks for sharing! 🙂