HTTP POST from a Java application


I'm trying to POST to an HTML form from a Java application. The form is on a page with an .html extension, if that matters (e.g. http://www.domain.tld/somepage.html), and looks like this:

<Form method="POST">
<input type="hidden" name="op" value="checkfiles">
<Textarea name="list" rows=12 style="width:100%;font:12px Arial"></Textarea>
<br><input type="submit" name="process" value="Check">
</Form>

I tried Apache HttpComponents, but so far my attempts have been unsuccessful. Here's the function I'm using:

private static void submit(String text) throws Exception{

    HttpClient httpclient = new DefaultHttpClient();
    HttpPost httppost = new HttpPost("http://www.domain.tld/somepage.html");

    List <NameValuePair> params = new ArrayList<NameValuePair>();
    params.add(new BasicNameValuePair("list", text));

    httppost.setEntity(new UrlEncodedFormEntity(params, HTTP.UTF_8));

    HttpResponse response = httpclient.execute(httppost);

    BufferedReader rd = new BufferedReader(
                    new InputStreamReader(response.getEntity().getContent()));

    StringBuffer result = new StringBuffer();
    String line = "";
    while ((line = rd.readLine()) != null) {
        result.append(line);
    }

    System.out.println(result.toString());

}

For some reason, this returns the page at http://www.domain.tld/ instead. Help please.


There are 2 answers

Antoniossss (best answer)

To start with (though unrelated to the question), use EntityUtils.toString() to get the page content as a String instead of reading it manually. Also remember to consume the entity with EntityUtils.consume() to release resources (return connections to the pool, etc.).
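With HttpComponents, EntityUtils.toString(response.getEntity()) covers the reading part in one call. As a dependency-free illustration of the same two points (read the whole body at once, always release the stream), here is a minimal sketch using only the JDK (Java 9+). BodyReader and readAll are hypothetical names, not part of any library:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

class BodyReader {

    // Reads the entire stream into a String and closes it afterwards,
    // so the underlying connection can be released even on failure.
    // Unlike a readLine() loop, this keeps the line breaks of the body.
    static String readAll(InputStream in, Charset charset) {
        try (InputStream body = in) {
            return new String(body.readAllBytes(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that the loop in the question appends lines without their terminators, so the printed page ends up on a single line; reading the bytes in one go avoids that.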

As for the question, your code looks fine. In my experience, the page you are submitting to is redirecting you to the main page, which can happen for a variety of reasons. Some I can think of:

  1. Your POST parameters are wrong: the remote system treats your request as malformed and redirects you to the main page (that is normal behaviour)
  2. You are missing some input parameters that should be parsed from the page (e.g. __VIEWSTATE on .NET systems)
  3. You should first GET the page to initialize a session, e.g. to receive session cookies
  4. You are posting to the wrong URL
  5. Your client is not sending some expected headers, e.g. the client name (User-Agent)

There can be other problems, but these are the most common ones I have encountered. I have implemented page managers for dozens of websites, and in every case you have to analyze how the POST is performed by a web browser. I strongly suggest using Firefox with the Firebug extension and double-checking what is actually sent. Basically, you almost always have to recreate the exact behaviour of the browser, including reading pages, initializing sessions, holding cookies, etc.
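Points 3 and 5 from the list above can be sketched roughly as follows. This is only an illustration: it uses the JDK's built-in java.net.http client (Java 11+) instead of HttpComponents to stay dependency-free, and the class name, method names, and User-Agent string are all made up for the example:

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class BrowserLikeSession {

    // Hypothetical client name; some sites reject requests without one.
    static final String USER_AGENT = "Mozilla/5.0 (compatible; ExampleClient/1.0)";

    // A client that stores cookies between requests, like a browser does.
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .cookieHandler(new CookieManager(null, CookiePolicy.ACCEPT_ALL))
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
    }

    static HttpRequest get(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", USER_AGENT)
                .GET()
                .build();
    }

    static HttpRequest post(String url, String encodedBody) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", USER_AGENT)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodedBody))
                .build();
    }

    // GET the page first so the server can set session cookies, then POST the form.
    static HttpResponse<String> submit(HttpClient client, String url, String encodedBody)
            throws IOException, InterruptedException {
        client.send(get(url), HttpResponse.BodyHandlers.discarding());
        return client.send(post(url, encodedBody), HttpResponse.BodyHandlers.ofString());
    }
}
```

After submit(...) returns, comparing response.uri() with the URL you posted to shows whether the server redirected you somewhere else, which is the symptom described in the question.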

EDIT

Your form has 3 inputs, but you are submitting only list. Try submitting the op and (optionally) process fields as well.
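In the HttpComponents code from the question, this amounts to adding two more BasicNameValuePair entries ("op" with value "checkfiles", and "process" with value "Check") to params. The sketch below shows the same idea with only the JDK (Java 11+), so it can be run without extra libraries; FormPost and its methods are hypothetical names chosen for the example:

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class FormPost {

    // Every field a browser would send for the form shown in the question.
    static Map<String, String> formFields(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("op", "checkfiles");  // hidden input
        fields.put("list", text);        // textarea
        fields.put("process", "Check");  // submit button (optional)
        return fields;
    }

    // Encodes the fields as application/x-www-form-urlencoded.
    static String encode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Builds the POST request; sending it is left to an HttpClient.
    static HttpRequest buildRequest(String url, String text) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encode(formFields(text))))
                .build();
    }

    public static void main(String[] args) {
        // Prints the request body only; no network access is made here.
        System.out.println(encode(formFields("file1.txt\nfile2.txt")));
    }
}
```

The form has no action attribute, so a browser posts back to the page's own URL, which is what the question's code already targets.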

Rubicon

You cannot POST to a web form using Apache HttpComponents/HttpClient. That library is intended for sending data over HTTP, NOT for entering data into a form, which implies DOM manipulation/interaction. A good question to ask is this: if HttpComponents could be used to enter form data, would it have a "click" method that lets you click a button (e.g. the Submit button on a form)?

If you really want to automate entering data into a form, try a library that interacts with the DOM, such as Selenium WebDriver, which is an industry-standard way to interact with web elements.

Some Selenium documentation that can help:

http://docs.seleniumhq.org/projects/webdriver/
Selenium Webdriver: Entering text into text field

Hope this helps! :)