Change list with strings

I tried np.array, but this is the code I currently have:

x_train = []
y_train = []

with open(file_to_open) as f:
    for line in f:
        # strip the trailing newline, then split the line into columns
        line = line.strip()
        columns = line.split(",")
        if columns[0] == "1":
            x_train.append(line)
        if columns[0] == "2":
            y_train.append(line)
        #print(line, end='')

print(x_train)

I get this result:

['1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065', '1,13.2,1.78,2.14,11.2,100,2.65,2.76,.26,1.28,4.38,1.05,3.4,1050', '1,13.16,2.36,2.67,18.6,101,2.8,3.24,.3,2.81,5.68,1.03,3.17,1185', '1,14.37,1.95,2.5,16.8,113,3.85,3.49,.24,2.18,7.8,.86,3.45,1480', '1,13.24,2.59,2.87,21,118,2.8,2.69,.39,1.82,4.32,1.04,2.93,735', '1,14.2,1.76,2.4 ....]

But I would like to have it in this form, one row per line:

1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065
1,13.2,1.78,2.14,11.2,100,2.65,2.76,.26,1.28,4.38,1.05,3.4,1050
1,13.16,2.36,2.67,18.6,101,2.8,3.24,.3,2.81,5.68,1.03,3.17,1185
1,14.37,1.95,2.5,16.8,113,3.85,3.49,.24,2.18,7.8,.86,3.45,1480
1,13.24,2.59,2.87,21,118,2.8,2.69,.39,1.82,4.32,1.04,2.93,735
1,14.2,1.76,2.45,15.2,112,3.27,3.39,.34,1.97,6.75,1.05,2.85,1450
1,14.39,1.87,2.45,14.6,96,2.5,2.52,.3,1.98,5.25,1.02,3.58,1290
1,14.06,2.15,2.61,17.6,121,2.6,2.51,.31,1.25,5.05,1.06,3.58,1295
1,14.83,1.64,2.17,14,97,2.8,2.98,.29,1.98,5.2,1.08,2.85,1045
1,13.86,1.35,2.27,16,98,2.98,3.15,.22,1.85,7.22,1.01,3.55,1045
1,14.1,2.16,2.3,18,105,2.95,3.32,.22,2.38,5.75,1.25,3.17,1510
1,14.12,1.48,2.32,16.8,95,2.2,2.43,.26,1.57,5,1.17,2.82,1280
1,13.75,1.73,2.41,16,89,2.6,2.76,.29,1.81,5.6,1.15,2.9,1320

This is part of the txt file I load:

1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065
1,13.2,1.78,2.14,11.2,100,2.65,2.76,.26,1.28,4.38,1.05,3.4,1050
1,13.16,2.36,2.67,18.6,101,2.8,3.24,.3,2.81,5.68,1.03,3.17,1185
1,14.37,1.95,2.5,16.8,113,3.85,3.49,.24,2.18,7.8,.86,3.45,1480
1,13.24,2.59,2.87,21,118,2.8,2.69,.39,1.82,4.32,1.04,2.93,735
1,14.2,1.76,2.45,15.2,112,3.27,3.39,.34,1.97,6.75,1.05,2.85,1450
1,14.39,1.87,2.45,14.6,96,2.5,2.52,.3,1.98,5.25,1.02,3.58,1290
1,14.06,2.15,2.61,17.6,121,2.6,2.51,.31,1.25,5.05,1.06,3.58,1295

3 Answers

0
narn On Best Solutions

If you are certain that every element in the list is numeric, with no alphabetic values, you can tweak your code to convert the strings to floats with the following expression:

[float(element) for element in columns]

In your code you can use it like so:

if columns[0] == "1":
    x_train.append([float(element) for element in columns])

if columns[0] == "2":
    y_train.append([float(element) for element in columns])
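
Put together, the whole loop might look like the sketch below. This is only an illustration under a few assumptions: the helper name split_by_class is hypothetical, an in-memory io.StringIO stands in for open(file_to_open), and the last sample row is a placeholder for a class-2 line, since the excerpt in the question only shows class-1 rows.

```python
import io


def split_by_class(lines):
    """Return (x_train, y_train) as lists of float rows, chosen by the first column."""
    x_train, y_train = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines so float() never sees an empty string
        columns = line.split(",")
        row = [float(element) for element in columns]
        if columns[0] == "1":
            x_train.append(row)
        elif columns[0] == "2":
            y_train.append(row)
    return x_train, y_train


# Stand-in for the real file. The first two rows come from the question;
# the third is a placeholder class-2 row, not real data.
sample = io.StringIO(
    "1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065\n"
    "1,13.2,1.78,2.14,11.2,100,2.65,2.76,.26,1.28,4.38,1.05,3.4,1050\n"
    "2,0,0,0,0,0,0,0,0,0,0,0,0,0\n"
)

x_train, y_train = split_by_class(sample)
print(len(x_train), len(y_train))
print(x_train[0])
```

Note that float() accepts values such as ".28" without a leading zero, so the rows from the question parse as they are.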
1
holdenweb On

You are appending strings to a list, so what you get back is, unsurprisingly, a list of strings. You don't make it clear how you are using an np.array - I can see no sign of one in your code - and neither do you make it obvious what actual data format you want.

I have therefore assumed that you would like a list of lists in your x_train and y_train variables. Instead of appending the line, append a list of columns converted to floats:

    ...
    x_train.append([float(x) for x in columns])
    ...
    y_train.append([float(x) for x in columns])
    ...

It should then be quite easy to convert x_train and y_train into NumPy ndarrays, or to print each row in the format you want.
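
As a rough sketch of both follow-up steps, assuming x_train already holds lists of floats: the rows below are copied from the question, and the helper format_row is a hypothetical name introduced only for this example.

```python
import numpy as np

# Two rows from the question, parsed the same way as in the loop above.
rows = [
    "1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065",
    "1,13.2,1.78,2.14,11.2,100,2.65,2.76,.26,1.28,4.38,1.05,3.4,1050",
]
x_train = [[float(x) for x in row.split(",")] for row in rows]

# A list of equal-length float lists converts directly to a 2-D array.
x_array = np.array(x_train)
print(x_array.shape)


def format_row(row):
    """Join one row back into comma-separated text, dropping trailing zeros."""
    return ",".join(f"{value:g}" for value in row)


# Print one row per line instead of the whole list on one line.
for row in x_array:
    print(format_row(row))
```

The "g" format writes 0.28 rather than .28, so the output is not byte-identical to the file. If the exact original text matters, keeping the stripped lines as strings and printing them one per line is the simpler route.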

0
Arpan On

Since it looks like a comma-separated file, you can try pandas:

import pandas as pd

data = pd.read_csv('train.txt', header=None)
x_train = data[data.iloc[:,0]==1]
y_train = data[data.iloc[:,0]==2]
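
To try the pandas approach without the real file, a self-contained sketch could read from an in-memory string instead. The conventional pd alias, the io.StringIO stand-in for 'train.txt', and the placeholder class-2 row are assumptions made for this example only.

```python
import io

import pandas as pd

# Stand-in for 'train.txt'. The first two rows come from the question;
# the third is a placeholder class-2 row, not real data.
text = (
    "1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065\n"
    "1,13.2,1.78,2.14,11.2,100,2.65,2.76,.26,1.28,4.38,1.05,3.4,1050\n"
    "2,0,0,0,0,0,0,0,0,0,0,0,0,0\n"
)

# header=None tells pandas the first line is data, not column names.
data = pd.read_csv(io.StringIO(text), header=None)

# Boolean masks on the first column select the rows for each class.
x_train = data[data.iloc[:, 0] == 1]
y_train = data[data.iloc[:, 0] == 2]

print(x_train.shape, y_train.shape)

# to_numpy() gives a plain NumPy array if one is needed downstream.
x_array = x_train.to_numpy()
```

Both frames still contain the class label in column 0; drop it with something like x_train.iloc[:, 1:] if only the measurements are wanted.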

If you would rather start from the list of strings your own code produces, try this:

np.array(list(map(float, ','.join(x_train).split(',')))).reshape(len(x_train), 14)