Historically, the question “How do I log in to a website from X language?” gets asked from time to time on Dream.In.Code. The answers to this question have not been the greatest, and these questions tend to be largely ignored. I intend to put an end to this for the Java language with a simple example. The answer, while simple, does expect the reader to be at least familiar with how a web form works. We are going to assume the reader knows the page/script the form is submitted to and the names of the fields used in the form. In most cases, for a login form, this is simply a username field, a password field and a submit button. However, be sure to check the source code of the form to know for sure, and make sure all fields are covered when submitting the data from Java. So let's get down to business submitting form data on this episode of the Programming Underground!
So how do we do this? It is relatively simple, actually, if you know a few Java classes from the java.net package. I am talking about the classes URL and URLConnection. We use the URL object to construct a simple URL pointing to where we want to submit our data, and also use it to get a URL connection to that page/script. Once we get the connection, we set the connection to output mode, write to the connection and then read the response… which will typically be our HTML response.
Let's show a simple example…
import java.net.*;
import java.io.*;

public class ConnectToURL {
    // Variables to hold the URL object and its connection to that URL.
    private static URL URLObj;
    private static URLConnection connect;

    public static void main(String[] args) {
        try {
            // Establish a URL and open a connection to it. Set it to output mode.
            URLObj = new URL("http://www.examplesite.com/login.php");
            connect = URLObj.openConnection();
            connect.setDoOutput(true);
        }
        catch (MalformedURLException ex) {
            System.out.println("The URL specified was unable to be parsed or uses an invalid protocol. Please try again.");
            System.exit(1);
        }
        catch (Exception ex) {
            System.out.println("An exception occurred. " + ex.getMessage());
            System.exit(1);
        }

        try {
            // Create a buffered writer to the URLConnection's output stream and write our form's parameters.
            BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(connect.getOutputStream()));
            writer.write("username=MyUsername&pass=MyPassword&submit=Login");
            writer.close();

            // Now establish a buffered reader to read the URLConnection's input stream.
            BufferedReader reader = new BufferedReader(new InputStreamReader(connect.getInputStream()));
            String lineRead = "";

            // Read all available lines of data from the URL and print them to screen.
            while ((lineRead = reader.readLine()) != null) {
                System.out.println(lineRead);
            }
            reader.close();
        }
        catch (Exception ex) {
            System.out.println("There was an error reading or writing to the URL: " + ex.getMessage());
        }
    }
}
Not a whole lot to this example, which is great for learning the concept! We start by importing our packages to get access to URL and URLConnection along with our BufferedReader and BufferedWriter classes. Next we construct our URL object by giving it the URL our login form submits to. This is not the URL of the page where the form is displayed. We want to send the info to the same URL that the form does when we log in on the webpage. You can find this information in the source of the page which contains the login form: it is the value of the form's action attribute.
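To make "check the source of the form" a little more concrete, here is a rough sketch that pulls the action and the input names out of a form's HTML. The class name FormInspector, the sample markup and the index.php page are made up for illustration, and the regular expressions only handle simple, double-quoted attributes, so treat it as a quick helper and not a real HTML parser.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds where a simple form submits to and which fields it has.
public class FormInspector {
    // Matches action="..." inside a <form ...> tag (double-quoted values only).
    private static final Pattern ACTION =
        Pattern.compile("<form[^>]*\\baction\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);
    // Matches name="..." inside an <input ...> tag (double-quoted values only).
    private static final Pattern FIELD =
        Pattern.compile("<input[^>]*\\bname\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns the form's action attribute, or null if none was found.
    static String findAction(String html) {
        Matcher m = ACTION.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    // Returns the name attribute of every input tag, in document order.
    static List<String> findFieldNames(String html) {
        List<String> names = new ArrayList<>();
        Matcher m = FIELD.matcher(html);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) throws MalformedURLException {
        // Made-up markup standing in for the source of a login page.
        String html = "<form action=\"login.php\" method=\"post\">"
            + "<input type=\"text\" name=\"username\">"
            + "<input type=\"password\" name=\"pass\">"
            + "<input type=\"submit\" name=\"submit\" value=\"Login\">"
            + "</form>";

        System.out.println(findAction(html));     // login.php
        System.out.println(findFieldNames(html)); // [username, pass, submit]

        // A relative action is resolved against the page that contains the form.
        URL page = new URL("http://www.examplesite.com/index.php");
        System.out.println(new URL(page, findAction(html))); // http://www.examplesite.com/login.php
    }
}
```

The resolved URL is the one you would hand to the URL constructor in the main example, and the field names are the ones you would write to the connection.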
If the URL is somehow not correct, the constructor will throw a MalformedURLException when it tries to create the URL. This is typically caused by a missing or unknown protocol, or some other error that pops up while parsing the URL. (A misspelled host name parses just fine; that kind of mistake only shows up later, when we actually talk to the server.) Next we use this new URL object to open a connection to that URL. It will return our URLConnection object and allow us to get at the underlying streams. Note: To set this connection up for writing, we call setDoOutput(true). If we don’t do this, the connection keeps its default output mode of false and will not allow us to write to it.
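If you want to see these two points for yourself, the small sketch below tries a few URL strings and then checks the output flag on a fresh connection. The class name UrlChecks and the helper parses are made up for this illustration. Nothing here touches the network, since openConnection only creates the connection object.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical demo class: shows what does and does not count as a malformed URL.
public class UrlChecks {
    // Returns true if the string can be turned into a URL object.
    static boolean parses(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException ex) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(parses("http://www.examplesite.com/login.php")); // true
        System.out.println(parses("htp://www.examplesite.com/login.php"));  // false: unknown protocol
        System.out.println(parses("www.examplesite.com/login.php"));        // false: no protocol
        System.out.println(parses("http://www.exmaplesite.com/login.php")); // true: a typo in the host still parses

        // openConnection() builds the connection object without contacting the server yet.
        URLConnection connect = new URL("http://www.examplesite.com/login.php").openConnection();
        System.out.println(connect.getDoOutput()); // false (the default)
        connect.setDoOutput(true);
        System.out.println(connect.getDoOutput()); // true
    }
}
```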
Once we have the connection, we can wrap its output stream in a BufferedWriter (or some other Writer object) and write our form parameters to the URL. In our example we submit a username, password and the login button. These field names must match the field names in your specific login form. Their values will be the values you would have typed in to log in. Separate each field=value pair with an ampersand, just like you would in a URL query string. If a value contains special characters such as & or =, it needs to be URL-encoded first, again just like in a URL. Since we turned on output mode and write a body to an HTTP connection, the request is sent as a POST, which is how most login forms work.
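The hard-coded string in the example works because none of its values contain special characters. As a rough sketch of how you might build the string more safely, the helper below URL-encodes each name and value with java.net.URLEncoder before joining them. The class name FormBody and the sample password are made up, and the Charset overload of URLEncoder.encode used here assumes Java 10 or newer.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: turns field names and values into a form-encoded string.
public class FormBody {
    // Encodes every name and value, then joins the pairs with ampersands.
    static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in the order we add them.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", "MyUsername");
        fields.put("pass", "p&ss=word 1"); // made-up value with characters that need encoding
        fields.put("submit", "Login");

        System.out.println(encode(fields));
        // prints username=MyUsername&pass=p%26ss%3Dword+1&submit=Login
    }
}
```

The result of encode is what you would pass to writer.write in place of the literal string.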
After writing our parameters, we then set up a reader on the connection and read back the response (which is typically an HTML page showing that you are logged in or that the login failed). Based on the response content, we must then determine whether we did in fact log in correctly before going about fetching pages. That is really all there is to it!
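How you decide whether the login worked depends entirely on the site. One simple approach, sketched below, is to collect the response into a string and look for some text that only appears on the success page. The class name LoginResponseCheck and the marker text "Welcome back" are assumptions for illustration; you would pick a marker from the real page you get back.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper: reads a response and checks it for a success marker.
public class LoginResponseCheck {
    // Reads everything from the source into one string, one '\n' after each line.
    static String readAll(Reader source) {
        StringBuilder page = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                page.append(line).append('\n');
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return page.toString();
    }

    // Very simple check: does the page contain the text we expect after a good login?
    static boolean looksLoggedIn(String page, String successMarker) {
        return page.contains(successMarker);
    }

    public static void main(String[] args) {
        // A StringReader stands in for new InputStreamReader(connect.getInputStream()).
        String page = readAll(new StringReader(
            "<html>\n<body>Welcome back, MyUsername!</body>\n</html>"));

        System.out.println(looksLoggedIn(page, "Welcome back"));     // true
        System.out.println(looksLoggedIn(page, "Invalid password")); // false
    }
}
```

In the main example you would pass the connection's input stream, wrapped in an InputStreamReader, to readAll instead of the StringReader.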
You can use the code provided in the example above to submit data through any web form you like. Just make sure you change the URL and the parameters to match the target web form. I hope you find the code useful.
Enjoy and thanks again for reading! 🙂