I have a movie list that I want OpenAI to return data from. Initially this list was a CSV file with columns like movie title, genre, studio, year, and main actors, but the model did not handle it well. So I transformed each item in the list into a sentence, like so:
1. "The Godfather" is a crime drama created by Francis Ford Coppola in 1972 with the actors Marlon Brando, Al Pacino, and James Caan.
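For reference, the transformation was roughly along these lines. This is a minimal sketch, not the exact script: the column names (`title`, `genre`, `creator`, `year`, `actors`), the `;` separator between actors, and the function names are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical CSV layout; the real column names and actor separator may differ.
SAMPLE_CSV = """title,genre,creator,year,actors
The Godfather,crime drama,Francis Ford Coppola,1972,Marlon Brando;Al Pacino;James Caan
"""


def row_to_sentence(index, row):
    """Turn one CSV row (a dict) into a numbered natural-language sentence."""
    actors = [a.strip() for a in row["actors"].split(";") if a.strip()]
    if len(actors) > 2:
        # Three or more names: comma-separated with "and" before the last one.
        actor_text = ", ".join(actors[:-1]) + ", and " + actors[-1]
    else:
        actor_text = " and ".join(actors)
    return (
        f'{index}. "{row["title"]}" is a {row["genre"]} created by '
        f'{row["creator"]} in {row["year"]} with the actors {actor_text}.'
    )


def csv_to_sentences(text):
    """Read CSV text and return one sentence per row, numbered from 1."""
    reader = csv.DictReader(io.StringIO(text))
    return [row_to_sentence(i, row) for i, row in enumerate(reader, start=1)]


sentences = csv_to_sentences(SAMPLE_CSV)
print(sentences[0])
```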
The list is 100 items long and stored as a PDF, and the results improved greatly with this format. However, when I ask things like "Give me 10 action movies from my list", it returns some movies that are in the list and some that aren't. The same happens with questions like "Give me Al Pacino movies from my list."
How can I improve this? Any thoughts?
The tech stack for this project is OpenAI, LangChain, and Pinecone.