I am running into a weird error with spynner, though the question is a generic one. Spynner is a stateful web-browser module for Python. It works fine when it works, but on almost every run I get a failure like this:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/spynner-2.16.dev0-py2.7.egg/spynner/browser.py", line 1651, in createRequest
    self.cookies,
AttributeError: 'Browser' object has no attribute 'cookies'
Segmentation fault (core dumped)
The problem here is that it's segfaulting and not letting me continue.
Looking at the code for spynner, I see that the cookies variable is in fact initialized in the __init__() function of the Browser class like this:
self.cookies = []
Now, on failure, it's really saying that __init__() was not run, since it's not seeing the cookies variable. I do not understand how that can be possible. Without restricting to the spynner module, can someone venture a guess as to how a Python object could fail with an error like this?
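For illustration only, and independent of spynner: one generic way an object can lack an attribute that its __init__() assigns is for a method to be invoked before __init__() has reached that assignment, for example when a base-class initializer calls an overridden hook. The class names below (Base, Child, on_create) are hypothetical, and this sketch is not a claim about what spynner actually does.

```python
class Base(object):
    def __init__(self):
        # The base initializer calls a hook before the subclass
        # has finished setting up its own attributes.
        self.on_create()

    def on_create(self):
        pass


class Child(Base):
    def __init__(self):
        Base.__init__(self)   # on_create() runs inside this call ...
        self.cookies = []     # ... so this attribute does not exist yet

    def on_create(self):
        # Overridden hook that assumes self.cookies is already set.
        return len(self.cookies)


def try_create():
    """Return the AttributeError message raised while building Child, if any."""
    try:
        Child()
    except AttributeError as exc:
        return str(exc)
    return None


message = try_create()
print(message)
```

The same symptom can also appear when an instance is created without running __init__() at all (for example through Child.__new__(Child)), or when a callback fires on an object that is already being torn down.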
EDIT: I definitely would have pasted my code here, except it's not all in one place for me to show compactly. I should have done it earlier, but here is the overall structure and how I instantiate and use spynner.
# helper class to get url data
class C:
    def __init__(self):
        self.browser = spynner.Browser()

    def get_data(self, url):
        try:
            self.browser.load(url)
            return self.browser.html
        except:
            raise

# class that does other stuff among saving url data to disk
class B:
    def save_url_to_disk(self, url):
        urlObj = C()
        html = urlObj.get_data(url)
        # do stuff with html

# class that drives everything
class A:
    def do_stuff_and_save_url_data(self, url):
        fileObj = B()
        fileObj.save_url_to_disk(url)

driver = A()
# call this function for multiple URLs.
driver.do_stuff_and_save_url_data(url)
The way I run it is:
# xvfb-run python myfile.py
The segfault is probably caused by something else I am doing. Maybe it's because of the xvfb I am using and not handling properly? I don't know yet. I should mention that I am relatively new to Python.
I noticed that when I run the code above with, say, 'http://www.google.com', I get the segfault every other time.
The code block of do_stuff_and_save_url_data() doesn't use the reference self: the execution of this function therefore doesn't depend on driver.
The code block of save_url_to_disk() doesn't use the reference self either: the execution of this second function doesn't depend on the object fileObj.
Only the code block of get_data() uses the reference self, and more precisely the reference self.browser: its execution and result depend on the attribute browser of the instance urlObj of class C. This attribute is in fact an instance of the spynner.Browser class.
In the end, you "do stuff with html" with just the data output by spynner.Browser().html. The creation of driver and fileObj isn't necessary in any way.
Another point is that when the instruction driver.do_stuff_and_save_url_data(url) is executed, the bound method driver.do_stuff_and_save_url_data is first created, then executed, and finally "destroyed" (or, more precisely, forgotten somewhere in RAM) because it hasn't been assigned to any identifier.
Then the identifier fileObj, which belongs to the local namespace of the function driver.do_stuff_and_save_url_data(), is lost too, which means the instance fileObj of class B is also lost for later use, since no identifier refers to it any more.
It's the same for save_url_to_disk(): after the creation and execution of the method fileObj.save_url_to_disk(url), the object urlObj of class C, which contains a browser instance (an object created by spynner.Browser()), is lost: the created browser and all its data are lost.
I wonder whether this destruction of the browser instance after each execution of do_stuff_and_save_url_data() and save_url_to_disk() is what destroys the cookies information before a later call.
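The lifetime argument above can be checked with a small sketch. It uses a hypothetical FakeBrowser class as a stand-in for spynner.Browser (so it says nothing about spynner's internals) and weak references to observe whether each browser survives the call that created it. The immediate release shown here relies on CPython's reference counting.

```python
import gc
import weakref


class FakeBrowser(object):
    """Hypothetical stand-in for spynner.Browser; holds per-instance state."""
    def __init__(self):
        self.cookies = []


class Holder(object):
    """Plays the role of class C: owns one browser per instance."""
    def __init__(self):
        self.browser = FakeBrowser()


# Weak references only, so this list does not keep any browser alive.
created = []


def save_url_to_disk(url):
    holder = Holder()                      # new Holder and new browser per call
    holder.browser.cookies.append(url)     # state stored on that browser
    created.append(weakref.ref(holder.browser))
    # On return, nothing refers to holder any more, so it and its
    # browser (with the cookies list) become unreachable.


save_url_to_disk("http://www.google.com")
save_url_to_disk("http://www.google.com")
gc.collect()

# For each browser created above: is it still alive after its call returned?
alive = [ref() is not None for ref in created]
print(alive)
```

Each call builds and then discards its own browser, so no state such as cookies carries over from one URL to the next.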
So, in my opinion, your code only embeds two functions in the definitions of classes A and B, and they are used as plain functions, not as methods.
1/ I don't think it is a good coding pattern. When one only wants plain functions, they should be written outside of any class.
2/ The problem is that if operations are triggered by functions, a new browser is created each time these functions are called, even if they are dressed up as methods.
You will tell me that you want these functions to act on data provided by the browser created by spynner.Browser().
That's why I think they should not be functions embedded in classes as they are now, but real methods attached to a stable instance of a browser. That is the purpose of an object: to keep the data and the tools that process the data in the same namespace.
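A minimal sketch of that idea, under stated assumptions: FakeBrowser is a hypothetical stand-in for spynner.Browser (it only mimics the load() method and html attribute used in the question), and UrlSaver, save_url and pages are illustrative names. The point is only that the browser is created once and reused for every URL.

```python
class FakeBrowser(object):
    """Hypothetical stand-in for spynner.Browser so the sketch runs anywhere."""
    instances = 0  # counts how many browsers have been created

    def __init__(self):
        FakeBrowser.instances += 1
        self.html = None

    def load(self, url):
        # A real browser would fetch the page; here we fabricate markup.
        self.html = "<html>%s</html>" % url


class UrlSaver(object):
    """Owns a single browser and exposes real methods that use it."""

    def __init__(self, browser_factory=FakeBrowser):
        # Created once; every later call goes through this same instance.
        self.browser = browser_factory()
        self.pages = {}

    def get_data(self, url):
        self.browser.load(url)
        return self.browser.html

    def save_url(self, url):
        # Stand-in for "do stuff with html and save it to disk".
        self.pages[url] = self.get_data(url)


saver = UrlSaver()
for url in ("http://www.google.com", "http://www.python.org"):
    saver.save_url(url)

print(FakeBrowser.instances, sorted(saver.pages))
```

With the real library, passing spynner.Browser as browser_factory would be the natural substitution, assuming its load() and html behave as in the question's code.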
All that said, I would personally write:
But I'm not sure I have understood all your considerations well, and I warn you that I didn't know spynner before reading your post. All that I've written could be wrong with respect to your real problem. Please keep a critical eye on my post.