I have a data frame representing some restaurants and their names. What I want to do is add a column is_chain to my initial DataFrame df that indicates whether the restaurant is part of a food chain. This new column takes the value 0 or 1. The value 1 indicates that the restaurant is part of a chain (e.g. McDonald's). A restaurant is considered part of a chain if there is another restaurant in the database with the same name.
import pandas as pd

data = {
    'restaurant_id': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12'],
    'restaurant_name': ['Dennys', 'Dennys', 'Pho U', 'Pho U', 'Dennys', 'Japanese Cafe', 'Japanese Cafe', 'Midori', 'Midori', 'xxx', 'yyy', 'zzz'],
}
df = pd.DataFrame(data, columns=['restaurant_id', 'restaurant_name'])
df.head(15)
So, for example, xxx, yyy and zzz here are not part of a chain.
I'm not sure about the correct pandas syntax to achieve something like this. If any clarification is needed, please ask.
Thank you.
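This sounds like a job for duplicated: flagging every row whose restaurant_name occurs more than once and casting the boolean result to int gives the 0/1 is_chain column.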
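A minimal sketch of how duplicated could be applied to the question's df. The keep=False argument and the astype(int) cast are assumptions about how to get a 0/1 flag on every repeated name (including its first occurrence), not necessarily the exact call intended above.

```python
import pandas as pd

data = {
    'restaurant_id': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12'],
    'restaurant_name': ['Dennys', 'Dennys', 'Pho U', 'Pho U', 'Dennys', 'Japanese Cafe',
                        'Japanese Cafe', 'Midori', 'Midori', 'xxx', 'yyy', 'zzz'],
}
df = pd.DataFrame(data, columns=['restaurant_id', 'restaurant_name'])

# keep=False marks ALL rows sharing a name as duplicates; the default
# keep='first' would leave the first occurrence of each name unflagged.
df['is_chain'] = df.duplicated(subset='restaurant_name', keep=False).astype(int)

print(df)
# is_chain is 1 for the Dennys, Pho U, Japanese Cafe and Midori rows,
# and 0 for xxx, yyy and zzz.
```

With the default keep='first', the first Dennys, Pho U, Japanese Cafe and Midori rows would come out as 0, which does not match the definition of a chain given in the question.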