Excluding CSV Columns from Data Dictionary

81 views Asked by At

I am attempting to create a data dictionary that does not include all of the columns in the source csv file. I have managed to create one that does include all the columns, but want to exclude some of them.

The code I am using is this:

input_file = csv.DictReader(open(DATA_FILE))
fieldnames = input_file.fieldnames
data_large_countries = {fn: [] for fn in fieldnames}
for line in input_file:
    for k, v in line.items():
        if (v == ''): 
            v=0
        try:
            data_large_countries[k].append(int(v))
        except ValueError:
            try:
                data_large_countries[k].append(float(v))
            except ValueError:
                data_large_countries[k].append(v)
                
for k, v in data_large_countries.items():
    data_large_countries[k] = np.array(v)
     
print(data_large_countries.keys())

with the output:

dict_keys(['iso_code', 'continent', 'location', 'date', 'total_cases', 'new_cases', 'new_cases_smoothed', 'total_deaths', 'new_deaths', 'new_deaths_smoothed', 'total_cases_per_million', 'new_cases_per_million', 'new_cases_smoothed_per_million', 'total_deaths_per_million', 'new_deaths_per_million', 'new_deaths_smoothed_per_million', 'reproduction_rate', 'icu_patients', 'icu_patients_per_million', 'hosp_patients', 'hosp_patients_per_million', 'weekly_icu_admissions', 'weekly_icu_admissions_per_million', 'weekly_hosp_admissions', 'weekly_hosp_admissions_per_million', 'total_tests', 'new_tests', 'total_tests_per_thousand', 'new_tests_per_thousand', 'new_tests_smoothed', 'new_tests_smoothed_per_thousand', 'positive_rate', 'tests_per_case', 'tests_units', 'total_vaccinations', 'people_vaccinated', 'people_fully_vaccinated', 'total_boosters', 'new_vaccinations', 'new_vaccinations_smoothed', 'total_vaccinations_per_hundred', 'people_vaccinated_per_hundred', 'people_fully_vaccinated_per_hundred', 'total_boosters_per_hundred', 'new_vaccinations_smoothed_per_million', 'new_people_vaccinated_smoothed', 'new_people_vaccinated_smoothed_per_hundred', 'stringency_index', 'population', 'population_density', 'median_age', 'aged_65_older', 'aged_70_older', 'gdp_per_capita', 'extreme_poverty', 'cardiovasc_death_rate', 'diabetes_prevalence', 'female_smokers', 'male_smokers', 'handwashing_facilities', 'hospital_beds_per_thousand', 'life_expectancy', 'human_development_index', 'excess_mortality_cumulative_absolute', 'excess_mortality_cumulative', 'excess_mortality', 'excess_mortality_cumulative_per_million'])

I only need 6 of these keys in my data dictionary. How do I amend my code to get only the keys I want?

0

There are 0 answers