Variable lifespan is a very important topic. But what is it, and how do we use it to our advantage? A variable’s lifespan is the time between its initialization and the point where it either goes out of scope or is purposely destroyed (yeah, that is right, a programmer can be a hitman too!). Most programmers don’t fully realize the implications of setting a variable at the top of a function and then using it much later in the same function. What harm can it do to put all your variables at the top? In short, it can do a lot of harm, and I will explain why in this issue of the Programming Underground!
There are many dangers to setting up a variable with a long lifespan… longer than it really needs to be. The goal is to keep the lifespan down to a minimum to avoid:
1) Accidentally corrupting the variable’s data
2) Losing track of where that variable is used later
3) Complicating maintenance for programmers other than yourself
The first disadvantage is a pretty straightforward one. The longer a variable’s lifespan, the more opportunity there is for other statements to corrupt it. One common way this happens is through a simple assignment statement. Maybe you are using a temp variable named “temp” and you assign it the result of a computation. Perhaps you had used that variable elsewhere and temp is currently holding a value you need later for another loop. By assigning to it again, you may accidentally overwrite the old value with data you didn’t want. You usually find this when a variable does more than one job (running more than one loop, or acting as an accumulator one minute and a loop index the next). That brings up the point that a variable should be used for one purpose and then tossed, like how Britney tossed K-Fed. (That is a whole other blog posting.)
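To make that first danger concrete, here is a minimal sketch. The function names and the "sum of everything except the largest value" task are made up for illustration, and the code assumes non-negative input. The first version lets one long-lived temp do two jobs; the second gives each job its own short-lived variable.

```cpp
#include <vector>

// Hypothetical example: "temp" is declared at the top and does two jobs.
// Intended result: the sum of all values minus the largest one.
// Assumes the values are non-negative.
int SumWithoutLargestReused(const std::vector<int>& values) {
    const int n = static_cast<int>(values.size());
    int temp = 0;      // job 1: accumulator
    int largest = 0;

    for (int v : values) {
        temp += v;     // temp now holds the sum
    }

    // ... imagine many unrelated lines here ...

    // job 2: temp is reused as a loop index, which wipes out the sum
    for (temp = 0; temp < n; temp++) {
        if (values[temp] > largest) {
            largest = values[temp];
        }
    }

    return temp - largest;  // bug: temp is now the element count, not the sum
}

// Same task, but each variable has one purpose and a short lifespan.
int SumWithoutLargestScoped(const std::vector<int>& values) {
    int sum = 0;
    for (int v : values) {
        sum += v;
    }

    int largest = 0;
    for (int v : values) {
        if (v > largest) {
            largest = v;
        }
    }

    return sum - largest;
}
```

For the input {2, 4, 6}, the reused version returns -3 (the count of 3 minus the largest value 6), while the scoped version returns the intended 6. Nothing in the compiler will warn you about the first one, which is what makes this kind of bug so annoying to track down.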
The second disadvantage happens a lot, especially in longer function bodies. You code up a nice function, put the variable declarations at the top, and write a loop further down that uses one of those variables to sum up a row of numbers. Your function is working great! You then find that killer programming job where they let you stay home in your pajamas and sip java (the drink, not the language) and make a quarter million a year because you are leet, so you leave the other company behind. Everything is fabulous! Well, the schmuck behind you on the corporate ladder is then entrusted with your code and needs to get rid of that summing loop you created. Simple, highlight and delete, right? But because the variable associated with the loop was defined at the very top of the function, he doesn’t catch it. Now you have a variable that is sitting there not being used. Better still, he then uses the same name later, only to get an error about a variable redefinition. He spends several hours on it before finding the mistake at the top. Great! He spent 3 hours finding out what was wrong and burned 100-250 dollars of the company’s money! I am sure they are going to love that! By placing the definition closer to the loop where it is actually used, it is easier for him to see the relation between that variable and the loop itself. Below is an example…
Accumulator *Sum = new Accumulator;
for (int i = 0; i < 4; i++) {
    Sum->Add(i);
}
delete Sum;
Now if I had to remove this loop, I could easily see that Sum, which is defined right above the loop, is associated with it. In this small, very simple case, the lifespan of the variable Sum is short: about 5 lines. Lifespan is often measured as the number of lines from a variable’s initialization to its destruction. Now let’s say that I increase the lifespan by putting statements between the definition and where it is used.
Accumulator *Sum = new Accumulator;
//...Do something 1
//...Do something 2
//...Do something 3
//...Do something 4
for (int i = 0; i < 4; i++) {
    Sum->Add(i);
}
delete Sum;
We have increased the lifespan to about 9 lines. By increasing the lifespan here, you have done three things. First, you have made it possible for statements “Do something 1 – 4” to modify “Sum”. Second, you have obscured the obvious relationship between our loop and the variable above it. Lastly, you have made it harder for other programmers to see that relationship when they go to maintain this code, so they may leave incomplete code behind.
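One way to pull the lifespan back down is sketched below. Treat it as an illustration only: the Accumulator class here is a hypothetical stand-in (the real one is not shown in this post), and Total() and SumFirstFour() are names I made up. Sum is created right before the loop, inside its own block, so the “Do something” statements cannot touch it, and because it lives on the stack it is destroyed at the closing brace without a delete.

```cpp
// Hypothetical stand-in for the Accumulator used in the examples above.
// Only Add() appears in the original snippets; Total() is an assumption.
class Accumulator {
public:
    void Add(int value) { total_ += value; }
    int Total() const { return total_; }

private:
    int total_ = 0;
};

// Hypothetical function wrapping the loop from the examples above.
int SumFirstFour() {
    //...Do something 1
    //...Do something 2
    //...Do something 3
    //...Do something 4
    // None of the statements above can modify Sum: it does not exist yet.

    int result = 0;
    {
        Accumulator Sum;               // lifespan starts right where it is needed
        for (int i = 0; i < 4; i++) {
            Sum.Add(i);
        }
        result = Sum.Total();
    }                                  // Sum is destroyed here, no delete required
    return result;
}
```

Calling SumFirstFour() returns 6 (0 + 1 + 2 + 3). If somebody later deletes the loop, the braces make it obvious that Sum goes with it.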
Unlike computers, human memory is often limited to a few statements at a time or, in the case of if statement branching, about three levels of nesting. If a programmer were to look at our second example, they might simply forget what Sum means by the time they reach the loop. This is the heart of disadvantage number 3. Now some of this can be corrected by good variable naming (which I always recommend), but when a function is handling multiple variables, the closer each variable is defined to where it is actually used, the better. A human can then see and remember the relationship when they go to modify the code.
There are plenty of other reasons to keep variable lifespan down… but the three mentioned here do the most to prevent errors and to keep readability, and thus maintainability, as high as possible.
If you find your variables having long lifespans, and you are using each variable the entire time, that is another flag that perhaps you should break that function up. You want to keep a variable’s lifespan under about 20 lines if you can help it.
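As a rough sketch of what breaking a function up can look like, here is a made-up example (the task and the names Mean, CountAbove and CountAboveMean are all hypothetical). Instead of one long function where sum and count both live from top to bottom, each helper owns its own variable for just a few lines.

```cpp
#include <vector>

// Hypothetical helper: "sum" lives only inside this short function.
double Mean(const std::vector<double>& scores) {
    if (scores.empty()) {
        return 0.0;
    }
    double sum = 0.0;
    for (double s : scores) {
        sum += s;
    }
    return sum / scores.size();
}

// Hypothetical helper: "count" lives only inside this short function.
int CountAbove(const std::vector<double>& scores, double threshold) {
    int count = 0;
    for (double s : scores) {
        if (s > threshold) {
            count++;
        }
    }
    return count;
}

// The caller only keeps one variable alive, and only for two lines.
int CountAboveMean(const std::vector<double>& scores) {
    const double mean = Mean(scores);
    return CountAbove(scores, mean);
}
```

With the scores {1.0, 2.0, 3.0, 6.0}, the mean is 3.0 and CountAboveMean returns 1, since only 6.0 is above it. No variable in any of the three functions gets anywhere near the 20 line mark.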