lmppl
1.0.0
Perplexity measures how predictable a text is by a language model (LM), and it is often used to evaluate the fluency or proto-typicality of a text (the lower the perplexity, the more fluent or proto-typical the text is). LM-PPL is a Python library for computing the perplexity of a text with any type of pre-trained LM: we compute ordinary perplexity for recurrent LMs such as GPT3 (Brown et al., 2020) and the perplexity of the decoder for encoder-decoder LMs such as BART (Lewis et al., 2020) or T5 (Raffel et al., 2020), while we compute pseudo-perplexity (Wang and Cho, 2018) for masked LMs.
Install via pip:

```
pip install lmppl
```
As an example, let's solve sentiment analysis with perplexity! Remember that text with lower perplexity is better, so we compare two texts (positive and negative) and pick the one with the lower perplexity as the model's prediction.
```python
import lmppl

# Recurrent LM: ordinary perplexity.
scorer = lmppl.LM('gpt2')
text = [
    'sentiment classification: I dropped my laptop on my knee, and someone stole my coffee. I am happy.',
    'sentiment classification: I dropped my laptop on my knee, and someone stole my coffee. I am sad.'
]
ppl = scorer.get_perplexity(text)
print(list(zip(text, ppl)))
>>> [
  ('sentiment classification: I dropped my laptop on my knee, and someone stole my coffee. I am happy.', 136.64255272925908),
  ('sentiment classification: I dropped my laptop on my knee, and someone stole my coffee. I am sad.', 139.2400838400971)
]
print(f"prediction: {text[ppl.index(min(ppl))]}")
>>> "prediction: sentiment classification: I dropped my laptop on my knee, and someone stole my coffee. I am happy."
```
```python
import lmppl

# Masked LM: pseudo-perplexity.
scorer = lmppl.MaskedLM('microsoft/deberta-v3-small')
text = [
    'sentiment classification: I dropped my laptop on my knee, and someone stole my coffee. I am happy.',
    'sentiment classification: I dropped my laptop on my knee, and someone stole my coffee. I am sad.'
]
ppl = scorer.get_perplexity(text)
print(list(zip(text, ppl)))
>>> [
  ('sentiment classification: I dropped my laptop on my knee, and someone stole my coffee. I am happy.', 1190212.1699246117),
  ('sentiment classification: I dropped my laptop on my knee, and someone stole my coffee. I am sad.', 1152767.482071837)
]
print(f"prediction: {text[ppl.index(min(ppl))]}")
>>> "prediction: sentiment classification: I dropped my laptop on my knee, and someone stole my coffee. I am sad."
```
```python
import lmppl

# Encoder-decoder LM: perplexity of the decoder given the input.
scorer = lmppl.EncoderDecoderLM('google/flan-t5-small')
inputs = [
    'sentiment classification: I dropped my laptop on my knee, and someone stole my coffee.',
    'sentiment classification: I dropped my laptop on my knee, and someone stole my coffee.'
]
outputs = [
    'I am happy.',
    'I am sad.'
]
ppl = scorer.get_perplexity(input_texts=inputs, output_texts=outputs)
print(list(zip(outputs, ppl)))
>>> [
  ('I am happy.', 4138.748977714201),
  ('I am sad.', 2991.629250051472)
]
print(f"prediction: {outputs[ppl.index(min(ppl))]}")
>>> "prediction: I am sad."
```
Below are some examples of popular models and the corresponding model type to use with the lmppl package.
| Model | HuggingFace ID | Model Type |
|---|---|---|
| BERT | google-bert/bert-base-uncased | MaskedLM |
| RoBERTa | roberta-large | MaskedLM |
| GPT-2 | gpt2-xl | LM |
| Flan-UL2 | google/flan-ul2 | EncoderDecoderLM |
| GPT-NeoX | EleutherAI/gpt-neox-20b | LM |
| OPT | facebook/opt-30b | LM |
| Mixtral | mistralai/Mixtral-8x22B-v0.1 | LM |
| LLaMA 3 | meta-llama/Meta-Llama-3-8B | LM |
Max Token Length: each LM has its own maximum token length (`max_length` for recurrent/masked LMs, and `max_length_encoder` and `max_length_decoder` for encoder-decoder LMs). Limiting those maximum lengths will reduce the time needed to process the text, but may affect the accuracy of the perplexity, so experiment on your texts to decide the optimal token length.
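For example, a sketch assuming the limits are set as keyword arguments when constructing the scorer:

```python
import lmppl

# Cap input length to speed up scoring (may affect perplexity accuracy);
# assumes max_length is accepted at construction time.
scorer = lmppl.LM('gpt2', max_length=256)

# Encoder-decoder LMs cap the encoder and decoder sides separately.
scorer = lmppl.EncoderDecoderLM(
    'google/flan-t5-small', max_length_encoder=256, max_length_decoder=64)
```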
Batch Size: you can pass a batch size to `get_perplexity` (e.g. `get_perplexity(text, batch_size=32)`). By default it processes all the texts in one pass, which may cause a memory error if the number of texts is too large.
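For example:

```python
import lmppl

scorer = lmppl.LM('gpt2')
texts = ['I am happy.', 'I am sad.'] * 500  # 1,000 texts
# Score in chunks of 32 instead of a single pass over all 1,000 texts,
# keeping the memory footprint small.
ppl = scorer.get_perplexity(texts, batch_size=32)
```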