First of all, I've never had pandas hang (freeze, loop indefinitely) on me before. Second, it's not the files; they were being read fine before.
Doing a bit of research, I stumbled upon this issue, where the cause is traced back to pd._libs.cp36. There is another similar one.
I looked in my own pd._libs and found various .py files like algos.cp38-win.... A couple of things came to mind. First, I upgraded to Python 3.8 (the environment is called work38, by the way), but trying a different environment didn't work either.
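For what it's worth, here is a rough sketch of how one could check whether the compiled files in pandas/_libs carry a tag the running interpreter can actually load. The helper names are made up for illustration, and it assumes the files follow the usual name.tag.pyd (or .so) pattern; it locates pandas without importing it, so it should not hang.

```python
import importlib.machinery
import importlib.util
from pathlib import Path


def compiled_files(directory):
    """Return the names of compiled extension files (.pyd / .so) in a directory."""
    return sorted(
        p.name for p in Path(directory).iterdir() if p.suffix in {".pyd", ".so"}
    )


def mismatched(names, suffixes=None):
    """Return file names whose tag is not one this interpreter would load.

    The module name is taken to be everything before the first dot; the
    remainder must exactly equal one of the interpreter's extension suffixes.
    """
    if suffixes is None:
        suffixes = importlib.machinery.EXTENSION_SUFFIXES
    bad = []
    for name in names:
        _, _, rest = name.partition(".")
        if "." + rest not in suffixes:
            bad.append(name)
    return bad


def find_libs_dir(package="pandas"):
    """Locate <package>/_libs without importing the package; None if not found."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    libs = Path(list(spec.submodule_search_locations)[0]) / "_libs"
    return libs if libs.is_dir() else None


if __name__ == "__main__":
    libs_dir = find_libs_dir()
    if libs_dir is None:
        print("pandas/_libs not found in this environment")
    else:
        print("suffixes this interpreter loads:", importlib.machinery.EXTENSION_SUFFIXES)
        print("files with a different tag:", mismatched(compiled_files(libs_dir)))
```

An empty list of mismatches would suggest the binaries at least match the Python version, which would point away from the cp36/cp38 explanation.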
The only other thing is that I installed fbprophet. To install fbprophet I installed pystan.
To install pystan I had to run this command, as per their docs:
conda install libpython m2w64-toolchain -c msys2
There are many guides to installing pystan that encourage you to install packages in a particular order (first pystan, then numpy, cython, pandas, etc.). I don't know if there's a reason for this.
In any case, my theory is that the command above messed up my whole Anaconda installation with some C compilers, and now pandas is broken in all environments, even after I pip uninstall pandas and then pip install --no-cache-dir pandas.
Two questions. First, do you know what's happening here, and could you explain it to me? Second, any idea how I can repair this, or must I uninstall Anaconda and reinstall everything (and then, of course, pip install -r requirements.txt)?
Edit: Maybe the C compiler stuff is unrelated. I just let read_excel run for a painfully long time and it eventually returned a dataframe with 65000 rows and 250 columns. I also see that when I convert the xlsx to csv (with a CLI script), the csv contains a bunch of empty rows and columns.
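To put numbers on the "bunch of empty rows and columns", something like the following sketch could compare the total size of the converted csv with the area that actually holds data. The function name is hypothetical and the sample text is invented; with a real file you would pass a csv.reader over the opened file instead.

```python
import csv
import io


def used_extent(rows):
    """Return (total_rows, total_cols, used_rows, used_cols).

    used_rows / used_cols are the 1-based positions of the last row and
    last column that contain a non-blank cell.
    """
    total_rows = total_cols = used_rows = used_cols = 0
    for r_idx, row in enumerate(rows, start=1):
        total_rows = r_idx
        total_cols = max(total_cols, len(row))
        for c_idx, cell in enumerate(row, start=1):
            if cell.strip():
                used_rows = r_idx
                used_cols = max(used_cols, c_idx)
    return total_rows, total_cols, used_rows, used_cols


# Invented sample: two rows of data padded with blank rows and columns.
sample = "a,b,,\n1,2,,\n,,,\n,,,\n"
print(used_extent(csv.reader(io.StringIO(sample))))  # (4, 4, 2, 2)
```

A large gap between the total and used values would be consistent with the sheet's stored range being much bigger than the real data.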
TLDR: I have an xlsx file with 250 rows and ~20 columns but apparently the empty cells aren't empty?
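If the extra cells really do come through as missing values, one possible cleanup after loading is to drop the rows and columns that are entirely NaN. This is only a sketch on a made-up dataframe standing in for the read_excel result; the column names are invented, and it would not catch cells that contain whitespace or formatting-only values that pandas reads as non-missing.

```python
import pandas as pd


def drop_blank_rows_and_columns(df):
    """Remove rows and columns in which every cell is missing (NaN)."""
    return df.dropna(axis=0, how="all").dropna(axis=1, how="all")


# Made-up stand-in for the oversized result: real data in the top-left
# corner, padded with all-NaN rows and one all-NaN column.
padded = pd.DataFrame(
    {
        "name": ["x", "y", None, None],
        "value": [1.0, 2.0, None, None],
        "Unnamed: 2": [None, None, None, None],
    }
)
clean = drop_blank_rows_and_columns(padded)
print(clean.shape)  # (2, 2)
```

This only tidies the result and does nothing for the load time. read_excel also accepts usecols and nrows to restrict what is returned, though I'm not sure whether that would make the read itself any faster here.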