My professor wants us to download an Excel file directly from a website, and as part of the analysis we need to compute the sum of some of the columns (the professor suggested that we use starts_with). The problem is that the column names I need (the second header, if I can call it that) are in the second row, and R reads that row as an observation instead of as a proper header. I tried to delete the first row, but R deleted the header I did not want deleted. Here is the code. Initially, I tried this:
install.packages("tidyverse", dependencies = T)
install.packages("data.table", dependencies = T)
install.packages("readxl", dependencies = T)
install.packages("ggplot2", dependencies = T)
install.packages("openxlsx", dependencies = T)
library(tidyverse)
library(data.table)
library(readxl)
library(ggplot2)
library(openxlsx)
datatable <- data.table(openxlsx::read.xlsx('https://doi.org/10.1371/journal.pone.0242866.s001')) %>%
tail(-1)
Later on, I tried it in separate steps (I loaded the same file without the tail(-1)), and on a second line I wrote:
dt <- dt[-1,]
I also tried something that I saw on the internet:
names(dt) = NULL
but it gave me this error: Error in View : Internal error: length of names (0) is not length of dt (61)
Can someone tell me the proper way to do this? (In the second and third attempts I assigned dt = datatable, which is why the object name differs from the first attempt.)
You just need to use the startRow parameter of read.xlsx() so that reading begins at the second row, which makes that row your column names.
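As a minimal sketch of how startRow behaves, the example below builds a small stand-in workbook with two header rows and reads it back. The temporary file, the sheet name, and the column names (id, score_a, score_b) are made up for illustration; they are not the columns of the real PLOS ONE file. It also assumes the names you want sit in row 2, as described in the question.

```r
library(openxlsx)

# Hypothetical stand-in for the downloaded file:
# row 1 is a grouping header, row 2 holds the names we actually want.
tmp <- tempfile(fileext = ".xlsx")
wb <- createWorkbook()
addWorksheet(wb, "Sheet1")

top <- data.frame(x1 = "Block A", x2 = "Block A", x3 = "Block B")
writeData(wb, "Sheet1", top, startRow = 1, colNames = FALSE)

real <- data.frame(id = 1:3, score_a = c(1, 2, 3), score_b = c(10, 20, 30))
writeData(wb, "Sheet1", real, startRow = 2, colNames = TRUE)

saveWorkbook(wb, tmp, overwrite = TRUE)

# startRow = 2 skips the first header row, so row 2 becomes the column names
dt <- read.xlsx(tmp, startRow = 2)
names(dt)

# For the file in the question, the same idea would presumably be:
# dt <- read.xlsx('https://doi.org/10.1371/journal.pone.0242866.s001', startRow = 2)
```

Because the unwanted row is skipped at read time, there is no need for tail(-1) or dt[-1, ] afterwards, and the columns keep their proper types instead of being read as text.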
Then you can use starts_with() to select columns.
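A short sketch of the summing step, again on made-up data: the data frame and the "score" prefix are assumptions for illustration, so substitute the prefix shared by the columns in your own file.

```r
library(dplyr)

# Hypothetical data; in practice this would be the result of read.xlsx()
dt <- data.frame(id = 1:3, score_a = c(1, 2, 3), score_b = c(10, 20, 30))

# Column totals for every column whose name begins with "score"
totals <- dt %>%
  summarise(across(starts_with("score"), sum))

# Row-wise total across the same set of columns
dt$score_total <- rowSums(select(dt, starts_with("score")))
```

If some of the selected columns contain missing values, passing na.rm = TRUE to sum() and rowSums() is one way to keep the totals from becoming NA.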