I have a project for an NLP course that is about classifying colloquial Arabic dialects. I need to scrape a lot of data from different domains to train the classifier properly.
I'm using Python 2.7
on Windows 10 64-bit with the Eclipse IDE (using PyDev32). The most popular and effective framework I have found is Scrapy.
I have followed all the installation steps carefully.
When installing with
pip install scrapy
, it outputs no error, but when I start a project or execute
scrapy shell "google.com"
, I get the following error (the last few lines):
from OpenSSL._util import (
File "c:\python27\lib\site-packages\OpenSSL\_util.py", line 6, in <module>
from cryptography.hazmat.bindings.openssl.binding import Binding
File "c:\python27\lib\site-packages\cryptography\hazmat\bindings\openssl\binding.py", line 14, in <module>
from cryptography.hazmat.bindings._openssl import ffi, lib
ImportError: DLL load failed: %1 is not a valid Win32 application.
Another error:
Could not find function xmlCheckVersion in library libxml2. Is libxml2 installed?
Notes:
I'm using 32-bit Python because I read (after googling) that Scrapy doesn't work on 64-bit
I have found many solutions on Stack Overflow, but all in vain, so I don't think my question is a duplicate
I tried turning the firewall on and off, with no benefit
I installed both versions of OpenSSL (32-bit and 64-bit), but neither fixed the problem
I thought that the problem is with
lxml
but it's not related to it.
I'm a total beginner, and my project should be finished in less than a week
I tried running
scrapy
on Anaconda (as they recommended), but I got the same errors
I'm sorry for my modest question; I'm hopeful that someone can help :)
You are most likely having trouble with the
lxml
dependency, which is notoriously difficult to compile on Windows systems. The best thing you can do is install a binary of it, as mentioned in the official documentation. You can download unofficial binaries directly from here
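
A prebuilt binary only helps if it matches the interpreter that will load it. The "not a valid Win32 application" message usually points to a 32-bit/64-bit mismatch between Python and a compiled extension, so it is worth checking what your interpreter actually is before downloading anything. The following is a minimal sketch, not an official tool: the helper names (interpreter_bits, wheel_tags, can_import) are made up for illustration, and it assumes the binaries use the usual wheel file name tags (for example cp27 with win32 or win_amd64).

```python
import struct
import sys


def interpreter_bits():
    """Return 32 or 64, based on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8


def wheel_tags(version_info=None, bits=None):
    """Return the (python_tag, platform_tag) pair to look for in a Windows wheel name.

    Hypothetical helper: defaults to the running interpreter when no
    arguments are given.
    """
    version_info = sys.version_info if version_info is None else version_info
    bits = interpreter_bits() if bits is None else bits
    python_tag = "cp{0}{1}".format(version_info[0], version_info[1])
    platform_tag = "win32" if bits == 32 else "win_amd64"
    return python_tag, platform_tag


def can_import(module_name):
    """Return True if the module imports cleanly in this interpreter."""
    try:
        __import__(module_name)
        return True
    except ImportError:
        # Covers both "module not installed" and "DLL load failed" on Windows.
        return False


if __name__ == "__main__":
    print("Interpreter: {0}-bit, look for tags {1}".format(
        interpreter_bits(), wheel_tags()))
    # Assumed list of the compiled packages named in the traceback above.
    for name in ("lxml", "OpenSSL", "cryptography"):
        print("{0}: {1}".format(name, "ok" if can_import(name) else "FAILED"))
```

Run it with the same interpreter that Eclipse/PyDev uses. If the reported bitness differs from the binaries you installed, uninstall them and install the ones whose file name carries the tags printed above.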