I am working with a file in .tar.Z format. I manually renamed it so that it only has the .tar extension, and I'm struggling to open it and read the data. I can't figure out what I did wrong.
!pip install tslearn #Library for Time Series
!pip install hmmlearn #Library for Hidden Markov Models
import pandas as pd
import numpy as np
import time # For optimization purposes
import matplotlib.pyplot as plt
from matplotlib import cm
import pylab as pl
import io
from google.colab import drive, files # files provides files.upload()
#Jupyter notebook option for display
pd.set_option('display.max_rows', None)
np.set_printoptions(threshold=np.inf)
%matplotlib inline
filename='diabetes-data'
uploaded = files.upload()
columnsNames = [
'sequenceName',
'TagIdentificator',
'timestamp',
'dateFORMAT',
'x-coordinate-of-the-tag',
'y-coordinate-of-the-tag',
'z-coordinate-of-the-tag',
'activity'
]
data = pd.read_csv(io.BytesIO(uploaded[filename+'.tar']),encoding='latin1',header=None,names=columnsNames)
I did some research and ended up adding "encoding='latin1'" when an error about character reading occurred, but I have no idea how to solve this one. Thank you so much!
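The .tar.Z extension indicates that this is not just a tar file: it is a tar archive that has additionally been compressed with the Unix compress utility (LZW compression, which is not the same as zip). Renaming the file to .tar does not change its contents, so the data is still compressed. You need to decompress it first (for example with uncompress or gzip -d) and then extract the tar archive before reading the files inside with pandas.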
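As a rough sketch of the two steps in Python: the standard library's tarfile module does not read the .Z format, so this version pipes the uploaded bytes through the external `gzip` tool (which can decompress .Z data) and only then opens the result as a tar archive. The helper names `is_unix_compress`, `decompress_with_gzip`, and `read_tar_members` are made up for illustration, and the code assumes a `gzip` binary is available on the PATH, as it normally is on a Linux notebook host.

```python
import io
import subprocess
import tarfile

# A file written by Unix `compress` starts with these two bytes.
COMPRESS_MAGIC = b"\x1f\x9d"


def is_unix_compress(data: bytes) -> bool:
    """Return True if the bytes look like .Z (Unix compress) data."""
    return data[:2] == COMPRESS_MAGIC


def decompress_with_gzip(data: bytes) -> bytes:
    """Decompress .Z (or .gz) bytes by piping them through `gzip -dc`.

    Assumption: a `gzip` executable is on the PATH.
    """
    result = subprocess.run(
        ["gzip", "-dc"], input=data, capture_output=True, check=True
    )
    return result.stdout


def read_tar_members(tar_bytes: bytes) -> dict:
    """Return {member name: file contents} for every regular file in the tar."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        return {
            member.name: tar.extractfile(member).read()
            for member in tar.getmembers()
            if member.isfile()
        }
```

With the upload from the question, this would be used roughly as `members = read_tar_members(decompress_with_gzip(uploaded['diabetes-data.tar']))`, where the key is whatever name the file had when it was uploaded. Each value in `members` is the raw content of one file from the archive, so it can be passed to `pd.read_csv(io.BytesIO(...))` one file at a time. An archive usually contains several files, so check `members.keys()` first to see which ones hold the data.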