I want to generate text with GPT-2 after fine-tuning, using a text file as the input to the model's generate function, but processing it line by line rather than as one block of text.
I started with this code:
text_data = open('/content/drive/My Drive/output_data.txt', 'w')
with open('/content/drive/My Drive/input_data.txt') as lines:
    for line in lines:
        ids = tokenizer.encode(f'{line}', add_special_tokens=True, return_tensors='pt')
        final_outputs = model.generate(
            ids,
            do_sample=True,
            max_new_tokens=ids.shape[1] + 1,
            pad_token_id=model.config.eos_token_id,
            top_k=50,
            top_p=0.95,
            num_return_sequences=1
        )
        a = tokenizer.decode(final_outputs[0], skip_special_tokens=True)
        text_data.write(a)
text_data.close()
However, instead of looping over the lines of input_data.txt and processing them one by one, it treats the file as a single block of text and spends a long time on this line:
final_outputs = model.generate(...)
And in the end, the output file contains the result for only one line of input_data.txt.
I have tried several other ideas, but they all give the same result:
- reading the input file separately with readlines() and looping over those lines to generate the output;
- converting the input .txt file to CSV, reading it as a DataFrame, and looping over its rows.
Any suggestion would help. Thanks in advance.
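To make the expected behaviour concrete, here is a minimal sketch of the line-by-line loop I am aiming for, with a made-up stub (process_line) standing in for the tokenizer.encode -> model.generate -> tokenizer.decode chain, so only the looping and writing logic is shown:

```python
import io

def process_line(line):
    # Hypothetical stub; the real script would encode the line,
    # call model.generate, and decode the result here.
    return f'generated: {line.strip()}\n'

# In-memory stand-ins for input_data.txt and output_data.txt.
input_text = io.StringIO('first line\nsecond line\nthird line\n')
output = io.StringIO()

for line in input_text:      # iterates once per line, not once per file
    if not line.strip():     # skip blank lines
        continue
    output.write(process_line(line))

print(output.getvalue())
```

This is what I expect my loop to do: one generate call and one write per input line, with every result present in the output file.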