I am running into a weird error with spynner, though the question is a generic one. Spynner is a stateful web-browser module for Python. It works fine when it works, but on almost every run I get a failure like this:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/spynner-2.16.dev0-py2.7.egg/spynner/browser.py", line 1651, in createRequest
    self.cookies,
AttributeError: 'Browser' object has no attribute 'cookies'
Segmentation fault (core dumped)
The problem here is that it's segfaulting and not letting me continue.
Looking at the code for spynner I see that the cookies variable is in fact initialized in the __init__() function for the Browser class like this:
self.cookies = []
Now, on failure, it's effectively saying that __init__() was not run, since it doesn't see the cookies variable. I do not understand how that can be possible. Without restricting the answer to the spynner module, can someone venture a guess as to how a Python object could fail with an error like this?
EDIT: I definitely would have pasted my code here, except it's not all in one place for me to show compactly. I should have done this earlier, but here is the overall structure and how I instantiate and use spynner.
# helper class to get url data
class C:
    def __init__(self):
        self.browser = spynner.Browser()

    def get_data(self, url):
        try:
            self.browser.load(url)
            return self.browser.html
        except:
            raise

# class that does other stuff among saving url data to disk
class B:
    def save_url_to_disk(self, url):
        urlObj = C()
        html = urlObj.get_data(url)
        # do stuff with html

# class that drives everything
class A:
    def do_stuff_and_save_url_data(self, url):
        fileObj = B()
        fileObj.save_url_to_disk(url)

driver = A()
# call this function for multiple URLs.
driver.do_stuff_and_save_url_data(url)
The way I run it is:
# xvfb-run python myfile.py
The segfault is probably caused by something else I am doing. Maybe it's because of the xvfb I am using and not handling properly? I don't know yet. I should mention that I am relatively new to Python.
I noticed that when I run the code above with, say, 'http://www.google.com', I get the segfault every other time.
The code block of do_stuff_and_save_url_data() doesn't use the reference self, so the execution of this function doesn't depend on driver.
The code block of save_url_to_disk() doesn't use the reference self either, so the execution of this second function doesn't depend on the object fileObj.
Only the code block of get_data() uses the reference self, and more precisely the reference self.browser, so its execution and result depend on the attribute browser of the instance urlObj of class C. This attribute is in fact an instance of the spynner.Browser class.
In the end, you "do stuff with html" with just the data output by spynner.Browser().html, and creating driver and fileObj isn't needed at all.
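To illustrate the point independently of spynner, here is a small sketch showing that a method whose body never touches self returns the same thing whichever instance it is called on, and the same thing as a plain function. The names Greeter, shout and shout_function are hypothetical and only serve the example.

```python
class Greeter(object):
    def shout(self, text):
        # self is never used, so no instance state can influence the result
        return text.upper()


def shout_function(text):
    # the same logic as a plain function, with no class around it
    return text.upper()


first = Greeter()
second = Greeter()

# Two distinct instances and the plain function all give the same result.
results = [first.shout("html"), second.shout("html"), shout_function("html")]
```

In other words, the instances add nothing here; only a method that reads or writes attributes through self actually depends on its object.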
Another point is that when the instruction driver.do_stuff_and_save_url_data(url) is executed, the bound method driver.do_stuff_and_save_url_data is first created, then called, and finally discarded, because it hasn't been assigned to any identifier.
Then the identifier fileObj, which belongs to the local namespace of the function do_stuff_and_save_url_data(), is lost too, which means the instance of class B it referred to is no longer available for later use, since no identifier bound to it remains alive.
It's the same for save_url_to_disk(): after the method fileObj.save_url_to_disk(url) has been created and executed, the object urlObj of class C, which contains a browser instance (an object created by spynner.Browser()), is lost, and the created browser and all its data are lost with it.
I wonder whether this destruction of the browser instance after each execution of do_stuff_and_save_url_data() and save_url_to_disk() is what destroys the cookies information before a later call.
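The lifetime issue can be observed with a small sketch that does not involve spynner at all. It assumes CPython, where an object is reclaimed as soon as its last reference disappears; Resource, watchers and use_resource_once are hypothetical names, and Resource is only a stand-in for something like a browser.

```python
import gc
import weakref


class Resource(object):
    """Hypothetical stand-in for an object such as a browser."""
    def __init__(self):
        self.cookies = []


watchers = []


def use_resource_once():
    res = Resource()                    # local name, like urlObj or fileObj
    watchers.append(weakref.ref(res))   # a weak reference does not keep res alive
    return len(res.cookies)


use_resource_once()
gc.collect()  # not needed in CPython, harmless on other implementations

# Once the function has returned, nothing refers to the Resource any more,
# so the weak reference no longer resolves to an object.
still_alive = watchers[0]() is not None
```

This only shows that the object created inside the function does not survive the call. Whether that is what triggers the AttributeError and the segfault in spynner is a guess on my part.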
So, in my opinion, your code only embeds two functions in the definitions of classes A and B, and they are used as plain functions, not as methods.
1/ I don't think it is a good coding pattern. When one wants only plain functions, they should be written outside of any class.
2/ The problem is that if the operations are triggered by functions, a new browser is created each time these functions are called, even if they are dressed up as methods.
You will tell me that you want these functions to act on data provided by the browser created by spynner.Browser().
That's why I think they should not be functions embedded in classes, as they are now, but real methods attached to a stable instance of a browser. That is the purpose of an object: to keep the data and the tools that process it in the same namespace.
All that said, I would personally restructure the code so that one long-lived object owns the browser and the functions become its methods.
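A minimal sketch of one object that owns a single browser and exposes the operations as methods, under several assumptions: UrlSaver, FakeBrowser, browser_factory, save_url and pages are hypothetical names, and FakeBrowser is only a stand-in that mimics the load() and html used in the question, so that the example runs without spynner or an X server.

```python
class FakeBrowser(object):
    """Stand-in for spynner.Browser; counts how many browsers get created."""
    created = 0

    def __init__(self):
        FakeBrowser.created += 1
        self.cookies = []
        self.html = None

    def load(self, url):
        # A real browser would fetch the page; here we only fake some markup.
        self.html = "<html>%s</html>" % url


class UrlSaver(object):
    """Creates one browser in __init__ and reuses it for every URL."""

    def __init__(self, browser_factory=FakeBrowser):
        self.browser = browser_factory()
        self.pages = {}

    def get_data(self, url):
        self.browser.load(url)
        return self.browser.html

    def save_url(self, url):
        # Stand-in for "do stuff with html": keep the markup in memory.
        self.pages[url] = self.get_data(url)
```

The driver then becomes saver = UrlSaver() followed by saver.save_url(url) for each URL, so the same browser stays alive for as long as saver does. With spynner installed, passing spynner.Browser as browser_factory would presumably plug in the real browser; I have not tested that.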
But I'm not sure I have understood all your considerations correctly, and I warn you that I didn't know spynner before reading your post. Everything I've written could be beside the point with respect to your real problem. Please keep a critical eye on my post.