I have a Pandas dataframe that looks like this:
import pandas as pd

df = pd.DataFrame({
    'id': [1, 17, 19, 17, 22, 3, 0, 3],
    'color': ['Green', 'Blue', 'Orange', 'Yellow', 'White', 'Silver', 'Purple', 'Black'],
    'shape': ['Circle', 'Square', 'Circle', 'Triangle', 'Rectangle', 'Circle', 'Square', 'Triangle'],
    'person': ['Sally', 'Bob', 'Tim', 'Sue', 'Bill', 'Diane', 'Brian', 'Sandy']
})
df
   id   color      shape person
0   1   Green     Circle  Sally
1  17    Blue     Square    Bob
2  19  Orange     Circle    Tim
3  17  Yellow   Triangle    Sue
4  22   White  Rectangle   Bill
5   3  Silver     Circle  Diane
6   0  Purple     Square  Brian
7   3   Black   Triangle  Sandy
I set the index to color:
df.set_index('color', inplace=True)
        id      shape person
color
Green    1     Circle  Sally
Blue    17     Square    Bob
Orange  19     Circle    Tim
Yellow  17   Triangle    Sue
White   22  Rectangle   Bill
Silver   3     Circle  Diane
Purple   0     Square  Brian
Black    3   Triangle  Sandy
I'd like to select only the columns id and person, and only the rows at integer positions 2 and 3. To do so, I'm using the following:
new_df = df.loc[:, ['id', 'person']][2:4]
new_df
        id person
color
Orange  19    Tim
Yellow  17    Sue
It feels like this might not be the most 'elegant' approach. Instead of tacking on [2:4] to slice the rows, is there a way to effectively combine .loc (to get the columns) and .iloc (to get the rows)?
Thanks!
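One way to do this in a single indexer is to translate one side so that both rows and columns are either positional or label-based. The following is a minimal sketch, assuming the df from the question with color already set as the index; the variable names col_positions, by_position and by_label are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 17, 19, 17, 22, 3, 0, 3],
    'color': ['Green', 'Blue', 'Orange', 'Yellow', 'White', 'Silver', 'Purple', 'Black'],
    'shape': ['Circle', 'Square', 'Circle', 'Triangle', 'Rectangle', 'Circle', 'Square', 'Triangle'],
    'person': ['Sally', 'Bob', 'Tim', 'Sue', 'Bill', 'Diane', 'Brian', 'Sandy']
}).set_index('color')

# Option 1: stay in .iloc and convert the column labels to integer positions.
col_positions = df.columns.get_indexer(['id', 'person'])
by_position = df.iloc[2:4, col_positions]

# Option 2: stay in .loc and convert the row positions to index labels.
by_label = df.loc[df.index[2:4], ['id', 'person']]

print(by_position)
print(by_label)
```

Both options should give the same two rows here (Orange and Yellow). Option 2 relies on the index labels being unique: .loc returns every row that matches a label, so with duplicate labels it could return more rows than the positional slice.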
Alternatively, you can start with df.iloc (specifying slices or arbitrary indices) and filter column names at the end:
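A minimal sketch of that idea, assuming the df from the question with color set as the index; the names new_df and picked are illustrative, and the position list [0, 3, 6] is an arbitrary example.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 17, 19, 17, 22, 3, 0, 3],
    'color': ['Green', 'Blue', 'Orange', 'Yellow', 'White', 'Silver', 'Purple', 'Black'],
    'shape': ['Circle', 'Square', 'Circle', 'Triangle', 'Rectangle', 'Circle', 'Square', 'Triangle'],
    'person': ['Sally', 'Bob', 'Tim', 'Sue', 'Bill', 'Diane', 'Brian', 'Sandy']
}).set_index('color')

# Select rows by position first (a slice), then keep columns by name.
new_df = df.iloc[2:4][['id', 'person']]

# The same pattern works with an arbitrary list of row positions.
picked = df.iloc[[0, 3, 6]][['id', 'person']]

print(new_df)
print(picked)
```

This is chained indexing, which is fine for reading data. If you plan to assign into the result, it may be safer to call .copy() on it first, since some pandas versions emit a SettingWithCopyWarning in that situation.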