Detailed procedure to generate a HAR file from a given URL with a command-line tool

Could anybody advise how to generate a HAR file from a given URL via the command line on Linux? Details of the tools used and any guidelines are much appreciated.

Thanks

There are 3 answers

xdebug (best answer)

You can use PhantomJS for this. Its netsniff.js example prints a HAR to standard output, which you can redirect to a file:

phantomjs examples/netsniff.js "some_url" > out.har

Alternatively, take a look at BrowserMob Proxy.
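If you need to run the PhantomJS command above for more than one URL, a small Python wrapper may help. This is only a sketch: it assumes `phantomjs` is on your PATH and that `examples/netsniff.js` is reachable from the working directory, and the function names `netsniff_command` and `write_har` are made up for illustration.

```python
import subprocess


def netsniff_command(url, script="examples/netsniff.js", phantomjs="phantomjs"):
    """Build the argument list for: phantomjs examples/netsniff.js "some_url"."""
    # The default script and binary locations are assumptions; adjust them to your install.
    return [phantomjs, script, url]


def write_har(url, out_path="out.har", **kwargs):
    """Run netsniff.js and redirect its standard output to out_path."""
    with open(out_path, "wb") as out:
        # check=True raises CalledProcessError if PhantomJS exits with a non-zero status.
        subprocess.run(netsniff_command(url, **kwargs), stdout=out, check=True)
    return out_path
```

Calling `write_har("http://stackoverflow.com", "so.har")` should then behave like the shell redirect shown above.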

Paras Dahal

I have used PhantomJS to produce HAR files, but they are less reliable than the HAR files generated by real browsers such as Chrome and Firefox. With Selenium and BrowserMob Proxy you can generate HAR files directly from a browser, using a Python script such as this:

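from browsermobproxy import Server
from selenium import webdriver
import json

# Start the BrowserMob Proxy server and open a proxy port on it.
server = Server("path/to/browsermob-proxy")
server.start()
proxy = server.create_proxy()

# Route Firefox through the proxy (older FirefoxProfile-based Selenium API).
profile = webdriver.FirefoxProfile()
profile.set_proxy(proxy.selenium_proxy())
driver = webdriver.Firefox(firefox_profile=profile)

# Start recording, load the page, then dump the captured HAR as JSON.
proxy.new_har("http://stackoverflow.com", options={'captureHeaders': True})
driver.get("http://stackoverflow.com")
result = json.dumps(proxy.har, ensure_ascii=False)
print(result)

# Shut down the browser first, then the proxy server.
driver.quit()
server.stop()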
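Since the question asks for a HAR file rather than console output, the captured data can also be written straight to disk. A minimal sketch, assuming `har` is the dictionary returned by `proxy.har`; the helper name `save_har` and the default file name are hypothetical:

```python
import json


def save_har(har, path="out.har"):
    """Write a HAR dictionary (e.g. the value of proxy.har) to a UTF-8 JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-ASCII text readable, as in the script above.
        json.dump(har, f, ensure_ascii=False, indent=2)
    return path
```

In the script above, `save_har(proxy.har, "stackoverflow.har")` could take the place of the `json.dumps` and print lines.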

If you are looking for a command-line tool that generates HAR and performance data headlessly with Chrome and Firefox, have a look at Speedprofile.

Bob

PhantomJS's HAR files contain only an abbreviated list of assets. When you visit a web page with Chrome or another browser, files keep loading over a period of a few seconds, but PhantomJS takes an instantaneous snapshot of the site before all the assets have had time to load.

It also excludes data and image files (because they're not part of the HAR spec).

You can work around this by modifying the netsniff.js example file.

I've forked that project and made those modifications at the link below. Please note that I've set the timer to wait 20 seconds before generating the HAR. I've also added a bit of error handling to ignore JavaScript errors, because PhantomJS was creating invalid HAR files whenever it encountered an error. (I also commented out the function that excludes data/image files.)

So this may not be exactly what you want, but it's a starting point for you or anyone else looking to use PhantomJS.

After these changes, I went from consistently getting four asset files to about 25.

https://github.com/associatedpress/phantomjs/blob/netsniff-timer/examples/netsniff.js
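To check how many assets a capture actually contains, for example when comparing the stock and the modified netsniff.js, you can count the entries in the resulting file. A small sketch, assuming the file follows the usual HAR layout with a top-level `log` object holding an `entries` list; `load_har` and `summarize_har` are hypothetical helper names:

```python
import json
from collections import Counter


def load_har(path):
    """Parse a HAR file; json.load raises ValueError if the file is not valid JSON."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize_har(har):
    """Return (entry count, Counter of response MIME types) for a HAR dictionary."""
    entries = har["log"]["entries"]
    # Missing fields are tolerated because incomplete captures may omit them.
    mime_types = Counter(
        entry.get("response", {}).get("content", {}).get("mimeType", "unknown")
        for entry in entries
    )
    return len(entries), mime_types
```

Running `summarize_har(load_har("out.har"))` on two captures of the same page gives a quick way to see whether the longer wait picked up more requests.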