Parsing data from text files with Python

Preparing data: parsing data from a text file. Before the features can be fed into a classifier, the data must be converted into a format the classifier accepts. The code and its explanation are as follows:


from numpy import zeros

def file2matrix(filename):
    fr = open(filename)
    arrayOLines = fr.readlines()
    fr.close()
    numberOfLines = len(arrayOLines)             # number of rows in the file
    returnMat = zeros((numberOfLines, 3))        # one row per line, three feature columns
    classLabelVector = []
    index = 0
    for line in arrayOLines:
        line = line.strip()                      # strip() removes leading and trailing whitespace from the row
        listFromLine = line.split('\t')          # split the row on tab characters into a list of elements
        returnMat[index, :] = listFromLine[0:3]  # the first three elements are the features
        # append() adds a new object to the end of the list; -1 refers to the last column of the row
        classLabelVector.append(int(listFromLine[-1]))
        index += 1
    return returnMat, classLabelVector
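As a quick check of the parsing step, the following is a minimal, self-contained sketch that writes a tiny tab-separated file and parses it back. The sample values are made up for illustration, and this variant of file2matrix (with a `with` block and explicit float conversion) is an assumption about how one might write it today, not the original listing. It also assumes every non-blank row has three numeric features followed by an integer label.

```python
import os
import tempfile

import numpy as np


def file2matrix(filename):
    """Parse a tab-separated file: three numeric features, then an integer label."""
    with open(filename) as fr:
        # Skip blank lines so a trailing newline does not create an empty row.
        lines = [ln.strip() for ln in fr if ln.strip()]
    return_mat = np.zeros((len(lines), 3))
    class_labels = []
    for index, line in enumerate(lines):
        fields = line.split('\t')
        return_mat[index, :] = [float(x) for x in fields[0:3]]
        class_labels.append(int(fields[-1]))  # last column is the class label
    return return_mat, class_labels


# Hypothetical sample data, invented for this example only.
sample = "1.0\t2.0\t3.0\t1\n4.0\t5.0\t6.0\t2\n7.0\t8.0\t9.0\t3\n"
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

features, labels = file2matrix(path)
os.remove(path)
print(features.shape)
print(labels)
```

Reading the file once inside a `with` block closes it automatically, and converting each field with float() makes a malformed row fail with a clear error.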


Preparing data: normalizing values


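from numpy import zeros, shape, tile

def autoNorm(dataSet):
    minVals = dataSet.min(0)                          # minimum of each column
    maxVals = dataSet.max(0)                          # maximum of each column
    ranges = maxVals - minVals
    normDataSet = zeros(shape(dataSet))
    m = dataSet.shape[0]                              # number of rows
    normDataSet = dataSet - tile(minVals, (m, 1))     # subtract the column minimum from every row
    normDataSet = normDataSet / tile(ranges, (m, 1))  # element-wise division by the column range
    return normDataSet, ranges, minVals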
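The normalization above rescales each column to the range 0 to 1 using newValue = (oldValue - min) / (max - min). As a sketch of the same idea, the version below relies on NumPy broadcasting instead of tile(). The function name auto_norm_broadcast and the sample matrix are hypothetical, chosen only for this illustration.

```python
import numpy as np


def auto_norm_broadcast(data_set):
    """Min-max normalize each column; broadcasting replaces the explicit tile()."""
    min_vals = data_set.min(axis=0)           # per-column minimum
    ranges = data_set.max(axis=0) - min_vals  # per-column range (max - min)
    # min_vals and ranges have shape (3,), so they broadcast across every row.
    norm_data_set = (data_set - min_vals) / ranges
    return norm_data_set, ranges, min_vals


# Made-up data: three rows, three features on very different scales.
data = np.array([[0.0, 10.0, 100.0],
                 [5.0, 20.0, 300.0],
                 [10.0, 30.0, 500.0]])

norm, ranges, min_vals = auto_norm_broadcast(data)
print(norm)      # each column now runs from 0.0 to 1.0
print(ranges)
print(min_vals)
```

Note that a column whose values are all identical has a range of zero, so the division would produce NaN; neither this sketch nor the tile() version guards against that case.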
