How to use tabulate lib with float64 : python


To pretty-print the data, I am using the tabulate library in Python. Here is the code I am using:

import pandas as pd
import matplotlib.pyplot as plt
from tabulate import tabulate

train = pd.read_csv('../misc/data/train.csv')
test = pd.read_csv('../misc/data/test.csv')

# Prints the head of data prettily :)
print(tabulate(train.head(), headers='keys', tablefmt='psql'))

The data is the Titanic dataset from Kaggle. Now I need to use tabulate on data that has float64 values. Here is the code that gives me the error:

surv_age = train[train['Survived'] == 1]['Age'].value_counts()
dead_age = train[train['Survived'] == 0]['Age'].value_counts()

print(tabulate(surv_age, headers='keys', tablefmt='psql'))

df = pd.DataFrame([surv_age, dead_age])
df.index = ['Survived', 'Dead']
df.plot(kind='hist', stacked=True, figsize=(15, 8))
plt.xlabel('Age')
plt.ylabel('Number of passengers')
plt.show()

The error is:

Traceback (most recent call last):
  File "main.py", line 49, in <module>
    print(tabulate(surv_age, headers='keys', tablefmt='psql'))
  File "/usr/local/lib/python2.7/dist-packages/tabulate.py", line 1109, in tabulate
    tabular_data, headers, showindex=showindex)
  File "/usr/local/lib/python2.7/dist-packages/tabulate.py", line 741, in _normalize_tabular_data
    rows = [list(row) for row in vals]
TypeError: 'numpy.float64' object is not iterable

Line 49 is the print(tabulate(surv_age, ...)) line from the code above.

How do I iterate over the float64 values of the data so that I can pretty-print them with tabulate? If it's not possible in tabulate, please suggest an alternative way of pretty-printing that can do so. Here is a sample of what tabulate can do:

+----+---------------+------------+----------+-----------------------------------------------------+--------+-------+---------+---------+------------------+---------+---------+------------+
|    |   PassengerId |   Survived |   Pclass | Name                                                | Sex    |   Age |   SibSp |   Parch | Ticket           |    Fare | Cabin   | Embarked   |
|----+---------------+------------+----------+-----------------------------------------------------+--------+-------+---------+---------+------------------+---------+---------+------------|
|  0 |             1 |          0 |        3 | Braund, Mr. Owen Harris                             | male   |    22 |       1 |       0 | A/5 21171        |  7.25   | nan     | S          |
|  1 |             2 |          1 |        1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female |    38 |       1 |       0 | PC 17599         | 71.2833 | C85     | C          |
|  2 |             3 |          1 |        3 | Heikkinen, Miss. Laina                              | female |    26 |       0 |       0 | STON/O2. 3101282 |  7.925  | nan     | S          |
|  3 |             4 |          1 |        1 | Futrelle, Mrs. Jacques Heath (Lily May Peel)        | female |    35 |       1 |       0 | 113803           | 53.1    | C123    | S          |
|  4 |             5 |          0 |        3 | Allen, Mr. William Henry                            | male   |    35 |       0 |       0 | 373450           |  8.05   | nan     | S          |
+----+---------------+------------+----------+-----------------------------------------------------+--------+-------+---------+---------+------------------+---------+---------+------------+

There is 1 answer

Best answer, by martianwars

Quoting from the tabulate documentation,

The following tabular data types are supported:

  • list of lists or another iterable of iterables
  • list or another iterable of dicts (keys as columns)
  • dict of iterables (keys as columns)
  • two-dimensional NumPy array
  • NumPy record arrays (names as columns)
  • pandas.DataFrame
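Every supported type in that list is two-dimensional in some sense: each element is itself a row. A rough sketch in plain Python of why a one-dimensional input trips the line shown in the traceback follows. The helper normalize_rows is hypothetical and only imitates that one line; it is not tabulate's API, and plain Python floats stand in for numpy.float64 here.

```python
def normalize_rows(vals):
    # Imitates the failing line from the traceback:
    # every element of vals must itself be iterable.
    return [list(row) for row in vals]


two_d = [[22.0, 1], [38.0, 2]]  # iterable of iterables: each element is a row
one_d = [22.0, 38.0]            # flat sequence: each element is a scalar

rows_ok = normalize_rows(two_d)

try:
    normalize_rows(one_d)
    error_message = None
except TypeError as exc:
    # list(22.0) fails because a float cannot be iterated
    error_message = str(exc)

# Wrapping each scalar in its own list turns the flat data into one-column rows.
rows_fixed = normalize_rows([[value] for value in one_d])

print(rows_ok)
print(error_message)
print(rows_fixed)
```

The last step is the same idea as reshaping a 1-D array to shape (-1, 1): each scalar becomes a row with a single column.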

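Your variable surv_age is one-dimensional: value_counts() returns a pandas Series, so each element tabulate iterates over is a scalar rather than a row. You will need to reshape its values into a 2-D NumPy array. You can do this easily using numpy.reshape:

import numpy as np
surv_age = np.reshape(surv_age.values, (-1, 1))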

You can also do this using np.expand_dims:

surv_age = np.expand_dims(surv_age, axis=1)