Faiss how to access data in a database, after indexing and retrieving the indexes?

2k views Asked by At

Good afternoon,

I am facing a challenge related to retrieving sentences based on questions asked. To solve this problem, I chose to use the FAISS model. The process I followed involved coding the question, creating a database, and coding the data within it using the same template. In addition, I performed FastText coding in the Python language.

So far, I am having success generating answers based on the index, however, I am experiencing difficulties when trying to retrieve the phrases that were previously stored in the database. To carry out this storage, I created a table in PostgreSQL.

The current difficulty is in extracting these specific phrases that were previously stored together with the encoded data in the database.

My attempts included several strategies to achieve different results, but the system continued to return only the first sentence from the list of available sentences instead of providing varied answers. However, when I tried to get the program to explore all stored phrases, I encountered the "out of range" error, indicating that my attempt to access phrases beyond the database's capacity was unsuccessful.

This challenge has been a source of frustration, and I am looking for additional solutions to understand why the program is not responding as expected and how I can fix this problem. Any additional help or guidance would be greatly appreciated.

the code where i retrieve the information and do the similarity search

data = request.get_json()
        cursor = conn.cursor()

        query_text = data.get('query_text') 

        query_vector = model.encode([query_text])
        
        if not query_text:
            return jsonify({'error': 'Missing query_text parameter'}), 400

        cursor.execute("SELECT * FROM indexes")
        results = cursor.fetchall()

        if not results:
            return jsonify({'error': 'No indexes found'}), 404

        search_results = []

        for result in results:
            index_data = dict(zip(cursor.column_names, result))
    
            vectors_from_db = json.loads(index_data['vectors'])
            vectors_np = np.array(vectors_from_db)
    
            sentences = json.loads(index_data['sentences'])

            top_k = 5 
            top_results = []
            for idx in np.argsort(similarities)[::-1][:top_k]:
                top_results.append({'index': idx, 'sentence': sentences[idx], 'similarity': similarities[idx]})

        return jsonify({'results': search_results}), 200
0

There are 0 answers