How to download the pretrained dataset of huggingface RagRetriever to a custom directory


I'm playing with a RAG example from facebook (huggingface) https://huggingface.co/facebook/rag-token-nq#usage.

Here is a very nice explanation of it: https://ai.facebook.com/blog/retrieval-augmented-generation-streamlining-the-creation-of-intelligent-natural-language-processing-models/

The code is very simple, but the dataset it downloads in this step is quite large (75GB):

retriever = RagRetriever.from_pretrained("facebook/rag-token-nq", index_name="exact", use_dummy_dataset=True)

It downloads the dataset to /root/.cache/huggingface/datasets/, which I'd like to change if possible. The output of that line of code is:

Downloading and preparing dataset wiki_dpr/psgs_w100.nq.no_index (download: Unknown size, generated: Unknown size, post-processed: Unknown size, total: Unknown size) to /root/.cache/huggingface/datasets/wiki_dpr/

My question is: how can I change the folder that RagRetriever.from_pretrained downloads its dataset (the 75GB one) into, so that it uses a directory other than /root/.cache/huggingface/datasets/?
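
To make the goal concrete, here is a minimal sketch of one approach that might work. It assumes the download goes through the `datasets` library, which reads its cache location from the `HF_DATASETS_CACHE` environment variable when it is imported. The path `/data/hf_datasets` and the helper `build_retriever` are hypothetical names used only for illustration, and this has not been verified against the full 75GB download.

```python
import os

# Hypothetical target directory (an assumption, not from the original post);
# replace it with a path on a disk that has enough free space.
CUSTOM_CACHE = "/data/hf_datasets"

# Set this before `datasets` or `transformers` is imported, because the
# `datasets` library reads the variable once, at import time.
os.environ["HF_DATASETS_CACHE"] = CUSTOM_CACHE


def build_retriever():
    """Hypothetical helper: create the retriever after the cache path is set."""
    # Imported lazily so the environment variable above is already in place.
    from transformers import RagRetriever

    return RagRetriever.from_pretrained(
        "facebook/rag-token-nq", index_name="exact", use_dummy_dataset=True
    )
```

Calling `build_retriever()` would then be expected to place the wiki_dpr files under the custom directory instead of /root/.cache/huggingface/datasets/. Exporting the same variable in the shell before starting Python should have the same effect.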

Thanks!
