How to create feature vectors out of document of words and do operations on them?

393 views Asked by Sid At 21 December 2016 at 18:18

I am currently trying to implement a scholarly paper recommendation system. The first part of this project is to create a profile for each junior researcher using the research paper they have published and the papers referenced in that paper. The mathematical notation for this is:

Where P1 is the researcher's paper, f is the feature vector, W is the weight assigned to each vector to give appropriate importance to each referenced paper, and ref is a reference paper.

The data for each paper, both research and reference, is given as words and their term frequencies. For example:

For the individual files I have no problem in constructing the feature vector. I use this code:

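from collections import defaultdict
from pandas import DataFrame

def create_fvector_p(file_name):
    # Each line is expected to hold a word followed by its term frequency.
    feature_dict = defaultdict(float)
    with open(file_name, 'r') as file:
        for line in file:
            feature = line.split()
            # Store the frequency as a number, not as a string.
            feature_dict[feature[0]] = float(feature[1])
    # One-row DataFrame: columns are words, values are frequencies.
    feature_vector = DataFrame([feature_dict])
    return feature_vector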

When it comes to doing operations with these vectors, though, I am lost. I don't know how to manipulate this vector space model so that I can fit it into those equations. What am I doing wrong, and how should I fix it?
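
To make the kind of operation in question concrete, here is a minimal sketch. It assumes the profile is a weighted sum of the form f(P1) plus the sum over references of W times f(ref); that form is a guess based on the symbol descriptions above, not the actual equation. The function name `build_profile` and the toy data are hypothetical. Each vector is kept as a plain word-to-frequency dict, so words missing from one paper simply count as 0.

```python
from collections import defaultdict

def build_profile(paper, references, weights):
    """Combine term-frequency dicts as paper + sum_j weights[j] * references[j].

    This weighted-sum form is an assumption; adjust it to the actual equation.
    """
    if len(references) != len(weights):
        raise ValueError("need exactly one weight per reference paper")
    profile = defaultdict(float)
    for word, freq in paper.items():
        profile[word] += float(freq)
    for ref, weight in zip(references, weights):
        for word, freq in ref.items():
            # A word not seen so far starts at 0.0 before the weighted add.
            profile[word] += weight * float(freq)
    return dict(profile)

# Hypothetical toy data, not taken from the question.
paper = {"recommendation": 3, "paper": 2}
refs = [{"paper": 1, "citation": 4}, {"recommendation": 2}]
weights = [0.5, 0.25]
profile = build_profile(paper, refs, weights)
print(profile)
```

If the vectors are kept in pandas instead, `Series.add(other, fill_value=0)` performs the same word-by-word alignment, treating a word absent from one vector as frequency 0.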

