transformers-stream-generator 1.0.0
This is a text generation method that returns a generator, streaming out each token in real time during inference, based on Hugging Face Transformers.
Installation:
pip install transformers-stream-generator

Usage:
1. Add two lines before your original code:
from transformers_stream_generator import init_stream_support
init_stream_support()
2. Add do_stream=True to the model.generate call and keep do_sample=True; you then get a generator:
generator = model.generate(input_ids, do_stream=True, do_sample=True)
for token in generator:
    word = tokenizer.decode(token)
    print(word)
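
Putting it together, a minimal end-to-end sketch; the gpt2 checkpoint, the prompt text, and the max_new_tokens value are illustrative choices, not part of this package:

from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers_stream_generator import init_stream_support

init_stream_support()  # patch generate() with streaming support; call once before generating

# Illustrative model choice: any causal LM from Hugging Face should work the same way
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("The quick brown fox", return_tensors="pt").input_ids

# do_stream=True returns a generator; do_sample=True must stay enabled
generator = model.generate(input_ids, do_stream=True, do_sample=True, max_new_tokens=64)
for token in generator:
    # each yielded item is a token id; decode and print it as soon as it arrives
    print(tokenizer.decode(token), end="", flush=True)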