OpenAI Function Calling Unstructured Data

56 views Asked by At

I am trying to create a structured JSON output from sample text that is related to two students activities at school. My current code only returns the first student's information, I would like to collect all student info.

Text = "Ravi Patel is a sophomore majoring in computer science at the University of Michigan. He is South Asian Indian American and has a 3.7 GPA. Ravi is an active member of the university's Chess Club and the South Asian Student Association. He hopes to pursue a career in software engineering after graduating. David Nguyen is a sophomore majoring in computer science at Stanford University. He is Asian American and has a 3.8 GPA. David is known for his programming skills and is an active member of the university's Robotics Club. He hopes to pursue a career in artificial intelligence after graduating."

Variables I am looking to extract: {'name':'major':'school':'grades':'club':}

My Code:

%pip install --upgrade openai -q

import os
from openai import OpenAI

my_key = "API KEY"
os.environ["OPENAI_API_KEY"] = my_key
client = OpenAI()

student_1_description = "Ravi Patel is a sophomore majoring in computer science at the University of Michigan. He is South Asian Indian American and has a 3.7 GPA. Ravi is an active member of the university's Chess Club and the South Asian Student Association. He hopes to pursue a career in software engineering after graduating. David Nguyen is a sophomore majoring in computer science at Stanford University. He is Asian American and has a 3.8 GPA. David is known for his programming skills and is an active member of the university's Robotics Club. He hopes to pursue a career in artificial intelligence after graduating."

import json

# Loading the response as a JSON object
#json_response = json.loads(openai_response.choices[0].message.content)
#json_response

student_custom_functions = [
    {
        'name': 'extract_student_info',
        'description': 'Extraction of all individuals mentioned in the text, including their names, majors, schools, grades and clubs',
        'parameters': {
            'type': 'object',
            'properties': {
                'name': {
                    'type': 'string',
                    'description': 'Name of the person'
                },
                'major': {
                    'type': 'string',
                    'description': 'Major subject.'
                },
                'school': {
                    'type': 'string',
                    'description': 'The university name.'
                },
                'grades': {
                    'type': 'integer',
                    'description': 'GPA of the student.'
                },
                'club': {
                    'type': 'string',
                    'description': 'School club for extracurricular activities. '
                }
                
            }
        }
    }
]

student_description = [student_1_description]
for i in student_description:
    response = client.chat.completions.create(
        model = 'gpt-3.5-turbo',
        messages = [{'role': 'user', 'content': i}],
        functions = student_custom_functions,
        function_call = 'auto'
    )

    # Loading the response as a JSON object
    json_response = json.loads(response.choices[0].message.function_call.arguments)
    print(json_response)

Code output:

{'name': 'Ravi Patel', 'major': 'Computer Science', 'school': 'University of Michigan', 'grades': 3.7, 'club': 'Chess Club, South Asian Student Association'}

I was expecting to see both student's information but the code only returned one student's info

1

There are 1 answers

0
nick On

The function calls seem to have some issues with arrays.

I changed your custom function to this:

student_custom_functions = [
    {
        'name': 'extract_student_info',
        'description': 'Extraction of all individuals mentioned in the text, including their names, majors, schools, grades and clubs',
        'parameters': {
            "type": "object",
            "properties": {
                "students": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {
                                "type": "string",
                                "description": "Name of the person"
                            },
                            "major": {
                                "type": "string",
                                "description": "Major subject."
                            },
                            "school": {
                                "type": "string",
                                "description": "The university name."
                            },
                            "grades": {
                                "type": "integer",
                                "description": "GPA of the student."
                            },
                            "club": {
                                "type": "string",
                                "description": "School club for extracurricular activities."
                            }
                        },
                    },
                },
            }
        },
    },
]

And it returns:

{'students': [{'name': 'Ravi Patel', 'major': 'computer science', 'school': 'University of Michigan', 'grades': 3.7, 'club': 'Chess Club and South Asian Student Association'}, {'name': 'David Nguyen', 'major': 'computer science', 'school': 'Stanford University', 'grades': 3.8, 'club': 'Robotics Club'}]}