Search a Text File in Java

File handling can be a bit daunting to the beginner. We all know what a file is, but figuring out how to access and use even the simplest text file from a program can turn our brains to mush. That is because programs see files as streams of data. A program opens a file and pulls in its data as a stream, either in chunks or serially (single file, so to speak)… one character or byte at a time. Most beginners to Java have no clue where to even begin. Well, Dream.In.Code and Martyr2 to the rescue! As superheroes, we fight the baddies who cause confusion and mayhem and bring you the straight scoop. So worry not, programming citizen, we are on the job in this episode of the Programming Underground! Up, up and awayyyyyy!

“It is a bird! It is a plane! No, it is a text file! Say what?” Sure, the topic of files doesn’t necessarily fit easily into that memorable phrase, but using our handy Java utility belt I am sure we can open up that file, read its contents and find what lurks inside. In this example we will create a small program which opens a standard text file and searches for a word or phrase provided at runtime as a parameter. The user will be able to type something like java searchfile “hello” and it will search our text file for the phrase “hello” and print the line number, and the position within that line, where the word appears.

We do this by reading the file through a specialized Java class called a “BufferedReader”. This class wraps another reader and buffers its data so that we can read from it efficiently, a whole line at a time. To get the process started we will give it a plain FileReader object, which is what actually opens the file we want to search. For simplicity's sake I hardcoded the file name in the program, but you could easily use another parameter for the name of the file to search. I will leave that part up to you.

So once we have this file open we begin the process of reading through the file line by line. We will do this using a while loop that reads the line, checks if anything was read, then begins our most basic search on that line. The process looks like this…

// Need the input/output package when handling files.
import java.io.*;

public class searchfile {
	public static void main(String args[]) {

		// Check to see if they supplied the search phrase as a parameter.
		// If so, set our searchword from the parameter passed and begin searching.
		if (args.length > 0) {

			// Set the searchword to our parameter 
			// (eg if they type 'java searchfile "hello"' then the search word would be "hello")
			String searchword = args[0];
		 
			// Create a reader which reads our file. In this example searchfile.txt is the file we are searching.
			// Opening it in the try header means Java closes the reader for us, even if reading fails partway.
			try (BufferedReader bReader = new BufferedReader(new FileReader("c:\\searchfile.txt"))) {

				// Keep track of the line we are on and what the line is.
				int lineCount = 0;
				String line;

				// While we loop through the file, read each line until there is nothing left to read.
				// readLine() treats \n, \r or \r\n as the end of a line and returns null at the end of the file.
				while ((line = bReader.readLine()) != null) {
					lineCount++;

					// See if the searchword is in this line, if it is, it will return a position.
					int posFound = line.indexOf(searchword);

					// If we found the word, print the position and our current line.
					if (posFound > -1) {
						System.out.println("Search word found at position " + posFound + " on line " + lineCount);
					}
				}
			}
			catch (IOException e) {
				// We encountered an error with the file, print it to the user.
				System.out.println("Error: " + e.toString());
			}
		}
		else {
			// They obviously didn't provide a search term when starting the program.
			System.out.println("Please provide a word to search the file for.");
		}
	}
}

The parameter we supply to this program comes in through the args[] array. If you have ever wondered what that args[] array was for, now you know. We check this array first to see if the user supplied a search keyword/phrase for us to find. No use conducting the search if there is no word to search for. Once we have it, we can move on to reading the file and our hunt for the elusive search word (them baddies like to hide in the dark alleys, you know).
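To make the args[] check a little more concrete, here is a minimal sketch that also accepts the file name as an optional second parameter. The class name SearchArgs and the helper fileNameFrom are made up for this illustration and are not part of the program above; the default path is simply the one the example hardcodes.

```java
// Hypothetical sketch: SearchArgs and fileNameFrom are illustrative names only.
class SearchArgs {

    // The file the original example searches when no file name is given.
    static final String DEFAULT_FILE = "c:\\searchfile.txt";

    // Returns the second argument if there is one, otherwise the default file.
    static String fileNameFrom(String[] args) {
        return args.length > 1 ? args[1] : DEFAULT_FILE;
    }

    public static void main(String[] args) {
        // No arguments at all means there is nothing to search for.
        if (args.length == 0) {
            System.out.println("Please provide a word to search the file for.");
            return;
        }

        String searchword = args[0];
        String fileName = fileNameFrom(args);
        System.out.println("Searching " + fileName + " for \"" + searchword + "\"");
    }
}
```

Run as java SearchArgs "hello" it would fall back to the default file, while java SearchArgs "hello" notes.txt would pick up notes.txt instead.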

The key part of this program is that while loop. By calling the BufferedReader’s readLine() method we read in a line of the text file and store it in our variable called “line”. We then compare this to “null” to see if something was indeed read. Once we hit the end of the file, readLine() returns null, which ends the loop.
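If you want to watch that loop work without needing a file on disk, the sketch below feeds a BufferedReader from a StringReader instead of a FileReader. That swap is my own assumption for demonstration purposes, and the class name ReadLineDemo and helper readLines are hypothetical. The read-until-null pattern is the same one the program uses.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: reads lines from an in-memory string rather than a file.
class ReadLineDemo {

    // Collects every line by calling readLine() until it returns null.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Not expected for an in-memory reader, but readLine() declares it.
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Three lines, deliberately separated by two different line endings.
        for (String line : readLines("one\ntwo\r\nthree")) {
            System.out.println(line);
        }
    }
}
```

Note that the strings handed back by readLine() do not include the line ending itself, and an empty input makes readLine() return null on the very first call.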

Now that we have our line read in from the text file, we use the String class's indexOf() method to locate the word within the line. The value returned by this method is the position (starting at index 0) of the search word in our current line. If the word is not found, it returns -1. So we can test whether the value is higher than -1 (meaning it found the word) and print the position of that word and the line we found it on. The line count, of course, is incremented each time we read a line to keep track of where we are in the file.
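Here is a tiny sketch of how indexOf() behaves on a single line. The class name IndexOfDemo, the helper describe and the sample sentence are made up for illustration. Keep in mind that indexOf() is case-sensitive, so “Hello” and “hello” are different search words.

```java
// Hypothetical sketch: shows the three outcomes of indexOf() on one line.
class IndexOfDemo {

    // Turns the result of indexOf() into a short message.
    static String describe(String line, String word) {
        int pos = line.indexOf(word);
        return pos > -1 ? "found at position " + pos : "not found";
    }

    public static void main(String[] args) {
        String line = "well, hello there";
        System.out.println(describe(line, "hello"));   // found at position 6
        System.out.println(describe(line, "goodbye")); // not found
        System.out.println(describe(line, "Hello"));   // not found (case matters)
    }
}
```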

The results of this program should look something similar to this…

c:>java searchfile "hello"
Search word found at position 0 on line 1
Search word found at position 9 on line 4
Search word found at position 12 on line 5

So the results tell us that it found the word “hello” three times, on lines 1, 4 and 5 at positions 0, 9 and 12 respectively. Now this program is very basic and works on text files where each line ends with a line terminator. The method readLine() reads a line until it hits a line feed, a carriage return, or a carriage return followed by a line feed (or the end of the file). Another limitation of this program is that it will only find the first match on each line. So let's say on the first line you had two “hello” words. It would only detect the first one at position 0 and not the second at position 13.

You could solve this limitation by putting another loop inside the while loop, entered once you have determined that indexOf() did not return -1. Each pass would call the two-argument form, indexOf(searchword, fromIndex), to start looking just past the previous match, and the loop would end when indexOf() returns -1. This is a simple addition you could make if you were interested not only in whether the line has the word, but in all locations of the word in the text.
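One possible shape for that inner loop is sketched below. The class name FindAllDemo and the helper findAll are hypothetical, and the sample line is made up to match the two-“hello” scenario described earlier. The sketch skips past each whole match, so it reports non-overlapping matches only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: collects every position of a word within one line.
class FindAllDemo {

    // Returns the start position of each non-overlapping match of word in line.
    static List<Integer> findAll(String line, String word) {
        List<Integer> positions = new ArrayList<>();

        // An empty search word would match everywhere and never advance, so bail out.
        if (word.isEmpty()) {
            return positions;
        }

        int pos = line.indexOf(word);
        while (pos > -1) {
            positions.add(pos);
            // Resume the search just past the match we recorded.
            pos = line.indexOf(word, pos + word.length());
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(findAll("hello there, hello again", "hello")); // [0, 13]
    }
}
```

Advancing by pos + 1 instead of pos + word.length() would also pick up overlapping matches, which only matters for search words that can overlap with themselves (such as “aa” inside “aaa”).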

Of course this search method can slow down for huge text files, since we process the file sequentially from start to finish. So for experimental purposes I suggest that you start with a file that has no more than a few hundred lines and then expand it when you want to do some serious processing. If the text file gets too big, you might want to look at an indexed approach. Maybe we will cover this in another entry… who knows!

Read through the in-code comments of the example above and see how this is put together. Each step has been documented so that you can put the pieces together and come up with an idea of what the program is doing. Feel free to edit the code as you see fit and make whatever modifications you want to it. Just keep in mind this approach is ideal for searching text files and not what you want for binary files. Those baddies will have to be handled another day. After all, the super criminals are not easily caught!

So the next time you are in a dark room at night with nowhere else to turn for learning file handling in Java, have no fear, DIC and Martyr2 are here! Thank you for reading! 🙂

About The Author

Martyr2 is the founder of the Coders Lexicon and author of the new ebooks "The Programmers Idea Book" and "Diagnosing the Problem". He has been a programmer for over 18 years. He works for a hot application development company in Vancouver, Canada, which services some of the biggest telecoms in the world. He has won numerous awards for his mentoring in software development and contributes regularly to several communities around the web. He is an expert in numerous languages including .NET, PHP, C/C++, Java and more.