I am learning Ruby and attempting a simple request from the command line to scrape a website. The page has two input elements, with ids "tb_radius_miles" and "locationSearchTextBox".
I am trying to make a request with these values filled in and then read the result, which is displayed in the textarea with id="tb_output".
Everything I have tried results in all of these tags having empty values when I read them out.
Here is my Ruby script:
require "httparty"
require "nokogiri"
require 'json'
response = HTTParty.post("https://www.freemaptools.com/find-zip-codes-inside-radius.htm",
{
:body => [ { "tb_radius_miles" => "10", "locationSearchTextBox" => "10118" } ].to_json,
:headers => { 'Content-Type' => 'application/json', 'Accept' => 'application/json'}
})
# parse html of web page
document = Nokogiri::HTML(response.body)
puts document.at_css("textarea#tb_output")
puts document.at_css("input#tb_radius_miles")
puts document.at_css("input#locationSearchTextBox")
And here is my output, with empty values in these elements:
<textarea cols="50" rows="4" id="tb_output" name="tb_output" readonly></textarea>
<input type="text" id="tb_radius_miles" value size="4" maxlength="4" onchange="tb_radius_miles_changed(this.value);">
<input type="text" id="locationSearchTextBox" style="width:300px;" placeholder="Example:10118">
I am simply expecting the three printouts to display the values I am sending in, plus the result that is populated in the tb_output element when I run the search manually in my browser.
I have tried different syntax for forming the HTTParty request, such as using query instead of body, removing the :, and sending the request without JSON formatting.
Thanks for any tips. I have spent an embarrassing amount of time reading, googling, and trying to get this most basic script to work.
I am not using any tools except VS Code and the installed Ruby 3.2.3. Tips on other tools to use would be appreciated as well.
EDIT: Inspecting the page's network tab shows a PHP request with parameters when I click the search button. Would it be proper to send this URL directly, though? It does not seem like the correct way to go about it. (Screenshots of the request headers and payload were attached.)
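HTTParty only fetches the HTML that the server sends back; it does not execute JavaScript. If the page fills in tb_output with JavaScript after the search button is clicked, which the PHP request in your network tab suggests, then posting values to the .htm page will most likely just return the same empty form, however the request is formatted.
Instead of scraping the rendered output, consider using an API if one is available. If you do submit a form, double-check the encoding: HTML forms normally send form-encoded data (application/x-www-form-urlencoded) rather than JSON, so you might need to mimic that instead. Check the website's robots.txt and terms of use to see whether automated access is allowed. If the result can only be produced by running the page's JavaScript, switching to a browser automation tool like Selenium might be more effective for your task.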
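To make the encoding point concrete, here is a minimal sketch that compares the JSON body from the question with a form-encoded body, and then builds a GET URL with query parameters in the way a page's background request typically would. It uses only Ruby's standard library (json, net/http, uri) rather than HTTParty so it runs without extra gems. The endpoint URL and the parameter names radius and zip are placeholders I made up, not the site's real ones; you would copy the actual URL and parameter names from the network tab. The helper names build_request_uri and fetch_body are also just illustrative.

```ruby
require "json"
require "net/http"
require "uri"

# The two input ids from the question, used here only to compare encodings.
fields = { "tb_radius_miles" => "10", "locationSearchTextBox" => "10118" }

# What the original script sent: a JSON array as the request body.
json_body = [fields].to_json

# What an HTML form would normally send: application/x-www-form-urlencoded pairs.
form_body = URI.encode_www_form(fields)

puts json_body
puts form_body

# Build a GET URL whose query string carries the given parameters.
def build_request_uri(endpoint, params)
  uri = URI.parse(endpoint)
  uri.query = URI.encode_www_form(params)
  uri
end

# Fetch the raw response body; raises unless the server answers with a 2xx status.
def fetch_body(uri)
  response = Net::HTTP.get_response(uri)
  response.value # raises an HTTP error on non-2xx responses
  response.body
end

# HYPOTHETICAL endpoint and parameter names (assumptions, not taken from the site).
# Replace both with what the network tab actually shows.
endpoint = "https://example.com/path/to/endpoint.php"
params = { "radius" => "10", "zip" => "10118" }

uri = build_request_uri(endpoint, params)
puts uri

# Uncomment once the real endpoint and parameters are filled in:
# puts fetch_body(uri)
```

Whatever the real endpoint returns (plain text, JSON, or something else) is the data the page's script would write into tb_output, so there may be no HTML to parse with Nokogiri at that point. With HTTParty, the equivalent call would be HTTParty.get(endpoint, query: params).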