How to open a xls file that seems to have something wrong in R

72 views Asked by At

I have a xls file that I cannot open in R. I've tried a lot and the best I got was something like:

data = read.table('D:/Usuario/pivo1/table.xls', sep = ';')

but the df comes in this form:

enter image description here

while it should be something like this:

enter image description here

Aditionally:

print(data[9,1])

enter image description here

We can see the numbers of interest. Dates, hours and etc. but this form, in line 9 and column 1 of the df.

None of the excel functions worked, this was the best result i got.

I have a dropbox link for the file access:

https://www.dropbox.com/scl/fi/jlo15oasidxrbvvbcl3ge/table.xls?rlkey=ny8804ox4yqb1nqs9wn33j5ak&dl=0

Thanks for the help, i still have no solution. Working manually on the data.

2

There are 2 answers

0
margusl On BEST ANSWER

The file contains HTML for a table and is somehow saved with xls extension:

<table id="tb1">            
            <tbody><tr>
                <th>DATA</th>
                <th>HORA</th>
                <th>VOLUME (m³)</th>
                <th>HORIMETRO</th>
            </tr>
            
        <tr>...</tr></tbody></table>

Somewhat surprisingly Excel can actually handle this, even though it does complain about format and extension mismatch.

You can open the file and parse the table with {rvest}:

library(rvest)
read_html("table.xls") |>
  html_element("table") |>
  html_table()
#> # A tibble: 162 × 4
#>    DATA     HORA  `VOLUME (m³)` HORIMETRO
#>    <chr>    <chr> <chr>             <dbl>
#>  1 23/04/23 00:00 0253037 m³         514.
#>  2 23/04/23 01:00 0253037 m³         514.
#>  3 23/04/23 02:00 0253037 m³         514.
#>  4 23/04/23 03:00 0253037 m³         514.
#>  5 23/04/23 04:00 0253037 m³         514.
#>  6 23/04/23 05:00 0253037 m³         514.
#>  7 23/04/23 06:00 0253037 m³         514.
#>  8 23/04/23 07:00 0253037 m³         514.
#>  9 23/04/23 08:00 0253037 m³         514.
#> 10 23/04/23 09:00 0253037 m³         514.
#> # ℹ 152 more rows

Created on 2023-10-28 with reprex v2.0.2

0
oszkar On

To read Excel files (actually worksheets or part(s) of worksheets from Excel files) in R you have several options. e.g.: