My dataset looks like this:
Country | year | poverty rate | sales |
---|---|---|---|
Austria | 1950 | 0.54 | 142 |
Austria | 1951 | 0.32 | 12441 |
Austria | 1952 | 0.32 | 12441 |
Bangladesh | 1950 | 0.11 | 142123123 |
Bangladesh | 1951 | 0.52 | 1234 |
Bangladesh | 1952 | 0.32 | 12441 |
Sri Lanka | 1950 | 0.95 | 4215 |
Sri Lanka | 1951 | 0.21 | 142421 |
Sri Lanka | 1952 | 0.32 | 12441 |
I want to `tsset` the data so that I can (for example) create a new variable for the change in sales per year for each country. When I try `tsset country year`, I see "repeated time values within panel". How can I create a new variable that is the change in sales per year for each country and year? I have more variables, so I would want to be able to specify the variable.
`country` looks like a string variable from here, but if it were, then `tsset` would fail for that reason. So, suppose `country` is a numeric variable with value labels. Then it is essential to follow up the report of repeated time values by looking at the observations that share the same country and year. The next step depends on what you see, for example:
1. The duplicates are just junk with missing values on one of those variables. `drop` the junk.
2. Accidental duplicate observations. `drop` the duplicates.
3. Something more serious.
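
One way to inspect the repeated values is sketched below. The toy data are made up for illustration (one Austria row is deliberately repeated), and the variable names `poverty_rate` and `dup` are assumptions, not names from the question. The final `duplicates drop` only suits the second case above, where the repeats are exact copies.

```stata
* Toy data with one accidental exact duplicate (values are illustrative)
clear
input str12 country int year float poverty_rate long sales
"Austria" 1950 0.54 142
"Austria" 1951 0.32 12441
"Austria" 1951 0.32 12441
"Austria" 1952 0.32 12441
end

* Count how many country-year pairs occur more than once
duplicates report country year

* List the observations that share a country and year
duplicates list country year

* Tag them so they can be examined alongside the other variables
duplicates tag country year, generate(dup)
list if dup > 0, sepby(country)

* If the repeats are exact copies on every variable, drop the surplus
duplicates drop
```

If the repeats differ on other variables, `duplicates drop` with no variable list will leave them in place, which is the signal that something more serious needs sorting out by hand.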
See also FAQ https://www.stata.com/support/faqs/data-management/repeated-time-values/
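
Once each country and year combination occurs only once, a minimal sketch of the rest might look like this. It assumes `country` is a string (hence the `encode` step; skip it if the variable is already numeric with value labels), and the names `poverty_rate`, `country_id` and `change_*` are hypothetical. The data are the rows shown in the question.

```stata
clear
input str12 country int year float poverty_rate long sales
"Austria"    1950 0.54 142
"Austria"    1951 0.32 12441
"Austria"    1952 0.32 12441
"Bangladesh" 1950 0.11 142123123
"Bangladesh" 1951 0.52 1234
"Bangladesh" 1952 0.32 12441
"Sri Lanka"  1950 0.95 4215
"Sri Lanka"  1951 0.21 142421
"Sri Lanka"  1952 0.32 12441
end

* Stops with an error unless country and year jointly identify observations
isid country year

* tsset needs a numeric panel identifier, so encode the string variable
encode country, generate(country_id)
tsset country_id year

* D. is the first-difference operator: value in year t minus value in year t-1,
* computed within each country; the first year of each panel is missing
foreach v of varlist sales poverty_rate {
    generate double change_`v' = D.`v'
}

list country year sales change_sales, sepby(country)
```

Changing the variable list in the `foreach` loop is how you would specify which variables get a change variable.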