Suppress LlamaCpp stats output


How can I suppress the LlamaCpp stats output in LangChain? Equivalent code:

from langchain.llms import LlamaCpp

llm = LlamaCpp(model_path=..., ...)
llm('who is Caesar')


> who is Caesar ?
 Julius Caesar was a Roman general and statesman who played a critical role in the events that led to the demise of the Roman Republic and the rise of the Roman Empire. He is widely considered one of Rome's greatest warlords and is often ranked alongside his adopted son, Octavian, as one of the two most important figures in ancient
llama_print_timings:        load time =   532.05 ms
llama_print_timings:      sample time =    32.74 ms /    71 runs   (    0.46 ms per token,  2168.40 tokens per second)
llama_print_timings: prompt eval time = 29011.08 ms /   432 tokens (   67.16 ms per token,    14.89 tokens per second)
llama_print_timings:        eval time = 10284.56 ms /    70 runs   (  146.92 ms per token,     6.81 tokens per second)
llama_print_timings:       total time = 39599.38 ms
 Rome.

There are 2 answers

Answered by sten

The reason is that LangChain doesn't support a "verbose" parameter. You can edit the __init__() method of its LlamaCpp class and add it yourself.
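
Depending on the installed LangChain version, you may be able to forward the flag without editing the library, for example through the wrapper's model_kwargs pass-through. A minimal sketch, assuming your version has a model_kwargs field that is merged into the arguments of the underlying llama_cpp.Llama constructor:

from langchain.llms import LlamaCpp

# Assumption: model_kwargs is forwarded to llama_cpp.Llama; if your version
# lacks it, the verbose flag has to be wired into the wrapper's __init__ /
# validator by hand, as described above.
llm = LlamaCpp(
    model_path="/path/to/model.gguf",
    model_kwargs={"verbose": False},
)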

Answered by Malgo

The code below worked fine for me with GGUF models; I added the verbose parameter while loading the model.

from llama_cpp import Llama

# verbose=False suppresses llama.cpp's load info and timing stats
llm = Llama(model_path="/path/to/model.gguf", verbose=False)

This stopped the stats output for model loading as well as inference.

Source: https://llama-cpp-python.readthedocs.io/en/latest/api-reference/
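
If neither option fits your setup, a version-independent workaround (a sketch, not part of either answer) is to silence the C-level stderr stream around model loading and inference, since llama.cpp prints its load and timing stats from native code and a plain contextlib.redirect_stderr does not catch it:

import os
import sys
from contextlib import contextmanager

@contextmanager
def suppress_native_stderr():
    # Redirect file descriptor 2 (stderr) to /dev/null for the duration of the block.
    devnull = os.open(os.devnull, os.O_WRONLY)
    saved = os.dup(2)
    try:
        sys.stderr.flush()
        os.dup2(devnull, 2)
        yield
    finally:
        sys.stderr.flush()
        os.dup2(saved, 2)
        os.close(saved)
        os.close(devnull)

# Illustrative usage with the wrapper from the question:
# with suppress_native_stderr():
#     llm = LlamaCpp(model_path=...)
#     print(llm('who is Caesar'))

Note that this also hides any other output written to stderr while the block is active.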