
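Python Client for NLP Cloud

This is the Python client for the NLP Cloud API. See the documentation for more details.

NLP Cloud serves high-performance pre-trained or custom models for NER, sentiment analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keyword and keyphrase extraction, text generation, image generation, source code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic similarity, tokenization, POS tagging, embeddings, and dependency parsing. It is ready for production and is served through a REST API.

You can use the NLP Cloud pre-trained models, fine-tune your own models, or deploy your own models.

If you run into an issue, don't hesitate to raise it as a Github issue. Thanks!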

Installation

Install via pip.

pip install nlpcloud

Examples

Here is a full example that summarizes a text using Facebook's Bart Large CNN model, with a fake token:

import nlpcloud

client = nlpcloud.Client("bart-large-cnn", "4eC39HqLyjWDarjtT1zdp7dc")
client.summarization("""One month after the United States began what has become a
  troubled rollout of a national COVID vaccination campaign, the effort is finally
  gathering real steam. Close to a million doses -- over 951,000, to be more exact --
  made their way into the arms of Americans in the past 24 hours, the U.S. Centers
  for Disease Control and Prevention reported Wednesday. That's the largest number
  of shots given in one day since the rollout began and a big jump from the
  previous day, when just under 340,000 doses were given, CBS News reported.
  That number is likely to jump quickly after the federal government on Tuesday
  gave states the OK to vaccinate anyone over 65 and said it would release all
  the doses of vaccine it has available for distribution. Meanwhile, a number
  of states have now opened mass vaccination sites in an effort to get larger
  numbers of people inoculated, CBS News reported.""")

Here is a full example that does the same thing, but on a GPU:

import nlpcloud

client = nlpcloud.Client("bart-large-cnn", "4eC39HqLyjWDarjtT1zdp7dc", True)
client.summarization("""One month after the United States began what has become a
  troubled rollout of a national COVID vaccination campaign, the effort is finally
  gathering real steam. Close to a million doses -- over 951,000, to be more exact --
  made their way into the arms of Americans in the past 24 hours, the U.S. Centers
  for Disease Control and Prevention reported Wednesday. That's the largest number
  of shots given in one day since the rollout began and a big jump from the
  previous day, when just under 340,000 doses were given, CBS News reported.
  That number is likely to jump quickly after the federal government on Tuesday
  gave states the OK to vaccinate anyone over 65 and said it would release all
  the doses of vaccine it has available for distribution. Meanwhile, a number
  of states have now opened mass vaccination sites in an effort to get larger
  numbers of people inoculated, CBS News reported.""")

Here is a full example that does the same thing, but on a French text:

import nlpcloud

client = nlpcloud.Client("bart-large-cnn", "4eC39HqLyjWDarjtT1zdp7dc", True, "fra_Latn")
client.summarization("""Sur des images aériennes, prises la veille par un vol de surveillance
  de la Nouvelle-Zélande, la côte d’une île est bordée d’arbres passés du vert
  au gris sous l’effet des retombées volcaniques. On y voit aussi des immeubles
  endommagés côtoyer des bâtiments intacts. « D’après le peu d’informations
  dont nous disposons, l’échelle de la dévastation pourrait être immense,
  spécialement pour les îles les plus isolées », avait déclaré plus tôt
  Katie Greenwood, de la Fédération internationale des sociétés de la Croix-Rouge.
  Selon l’Organisation mondiale de la santé (OMS), une centaine de maisons ont
  été endommagées, dont cinquante ont été détruites sur l’île principale de
  Tonga, Tongatapu. La police locale, citée par les autorités néo-zélandaises,
  a également fait état de deux morts, dont une Britannique âgée de 50 ans,
  Angela Glover, emportée par le tsunami après avoir essayé de sauver les chiens
  de son refuge, selon sa famille.""")

A JSON object is returned:

{
  "summary_text" : " Over 951,000 doses were given in the past 24 hours. That's the largest number of shots given in one day since the  rollout began. That number is likely to jump quickly after the federal government gave states the OK to vaccinate anyone over 65. A number of states have now opened mass vaccination sites. "
}
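
As a small illustration of working with such a response, the sketch below parses a JSON string of the same shape and strips the padding whitespace around summary_text. The extract_summary helper is hypothetical and not part of the nlpcloud package, and the sketch assumes the client hands back the same structure as a Python dictionary.

```python
import json


def extract_summary(response: dict) -> str:
    """Return the summary text without leading or trailing whitespace."""
    return response["summary_text"].strip()


# A shortened response with the same shape as the one shown above.
raw = '{"summary_text": " Over 951,000 doses were given in the past 24 hours. "}'
response = json.loads(raw)
summary = extract_summary(response)
print(summary)
```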

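Usage

Client Initialization

Pass the model you want to use and your NLP Cloud token to the client during initialization.

The model can be a pre-trained model such as en_core_web_lg or bart-large-mnli, or one of your custom models, referenced as custom_model/<model id> (for example custom_model/2568). See the documentation for a comprehensive list of all the available models.

Your token can be retrieved from your NLP Cloud dashboard.

import nlpcloud

client = nlpcloud.Client("<model>", "<your token>")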
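
A minimal sketch of preparing the two initialization values without hardcoding them follows. Both helpers are hypothetical, and the NLPCLOUD_TOKEN variable name is an assumption chosen for illustration; nothing here is provided by the nlpcloud package itself.

```python
import os


def custom_model_name(model_id: int) -> str:
    """Build the custom_model/<model id> string described above."""
    return f"custom_model/{model_id}"


def token_from_env(variable: str = "NLPCLOUD_TOKEN") -> str:
    """Read the API token from an environment variable (hypothetical name)."""
    token = os.environ.get(variable, "")
    if not token:
        raise RuntimeError(f"Set the {variable} environment variable first.")
    return token


model = custom_model_name(2568)
print(model)  # custom_model/2568
```

The two values could then be passed to the client as nlpcloud.Client(model, token_from_env()).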

If you want to use a GPU, pass gpu=True.

import nlpcloud

client = nlpcloud.Client("<model>", "<your token>", gpu=True)

If you want to use the multilingual add-on to process non-English text, pass lang="<your language code>". For example, to process French text, set lang="fra_Latn".

import nlpcloud

client = nlpcloud.Client("<model>", "<your token>", lang="<your language code>")

If you want to make asynchronous requests, pass asynchronous=True.

import nlpcloud

client = nlpcloud.Client("<model>", "<your token>", asynchronous=True)

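If you make asynchronous requests, you will always receive a quick response containing a URL. You should then poll this URL with async_result() on a regular basis (for example every 10 seconds) to check whether the result is available. Here is an example:

client.async_result("https://api.nlpcloud.io/v1/get-async-result/21718218-42e8-4be9-a67f-b7e18e03b436")

The above command returns a JSON object when the response is ready. Otherwise it returns None.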
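
The polling described above could be sketched as a small loop. The poll_until_ready helper, its parameters, and the attempt limit are assumptions made for illustration, not part of the nlpcloud package. A stand-in function replaces client.async_result() so that the sketch runs without network access.

```python
import time
from typing import Any, Callable, Optional


def poll_until_ready(
    fetch: Callable[[], Optional[Any]],
    interval_seconds: float = 10.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Any]:
    """Call fetch() until it returns something other than None.

    Returns the first non-None result, or None if max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        result = fetch()
        if result is not None:
            return result
        # Wait before the next attempt, except after the last one.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    return None


# Stand-in for client.async_result(url): not ready twice, then ready.
_fake_responses = iter([None, None, {"summary_text": "done"}])
result = poll_until_ready(lambda: next(_fake_responses), sleep=lambda s: None)
print(result)
```

With a real client, fetch would be something like lambda: client.async_result(url), where url is the URL received in the initial response.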

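Automatic Speech Recognition (Speech to Text) Endpoint

Call the asr() method and pass the following arguments:

(Optional: either this or encoded_file should be set) url: a URL where your audio or video file is hosted
(Optional: either this or url should be set) encoded_file: a base 64 encoded version of your file
(Optional) input_language: the language of your file, as an ISO code

client.asr("Your url")

The above command returns a JSON object.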
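
For the encoded_file option, a base 64 string can be produced with Python's standard library. This is a minimal sketch: encode_audio_bytes is a hypothetical helper, and the sample bytes are a placeholder rather than real audio.

```python
import base64


def encode_audio_bytes(data: bytes) -> str:
    """Return the base 64 encoding of raw file bytes as an ASCII string."""
    return base64.b64encode(data).decode("ascii")


# Placeholder bytes standing in for the contents of an audio file.
sample = b"RIFF-fake-audio"
encoded = encode_audio_bytes(sample)
print(encoded)
```

With a real file, the bytes would come from open(path, "rb").read(), and the resulting string would be passed as the encoded_file argument, assuming the keyword name matches the parameter listed above.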

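Chatbot Endpoint

Call the chatbot() method and pass your input. Optionally, you can also pass a context and a conversation history, which is a list of dictionaries. Each dictionary is made of an input and a response from the chatbot.

client.chatbot("Your input", "Your context", [{"input": "input 1", "response": "response 1"}, {"input": "input 2", "response": "response 2"}, ...])

The above command returns a JSON object.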
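
One way to keep the conversation history in the format described above is to append each exchange after it completes. The sketch below uses a stand-in function instead of client.chatbot() so that it runs offline. The chat_turn helper is hypothetical, and how the reply text is extracted from the returned JSON is left out because the response fields are not shown here.

```python
from typing import Callable, Dict, List

History = List[Dict[str, str]]


def chat_turn(
    ask: Callable[[str, str, History], str],
    user_input: str,
    context: str,
    history: History,
) -> str:
    """Send one message, then record the exchange in history."""
    reply = ask(user_input, context, history)
    history.append({"input": user_input, "response": reply})
    return reply


# Stand-in for the model: echoes the input and reports how many turns it saw.
def fake_ask(user_input: str, context: str, history: History) -> str:
    return f"echo({user_input}) after {len(history)} turns"


history: History = []
chat_turn(fake_ask, "Hello", "A friendly assistant.", history)
chat_turn(fake_ask, "How are you?", "A friendly assistant.", history)
print(history)
```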

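Classification Endpoint

Call the classification() method and pass the following arguments:

The text you want to classify, as a string
The candidate labels for your text, as a list of strings
(Optional) multi_class: whether the classification should be multi-class, as a boolean. Defaults to true.

client.classification("<Your block of text>", ["label 1", "label 2", "..."])

The above command returns a JSON object.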
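
The shape of the returned object is not shown here. Assuming it contains parallel "labels" and "scores" lists (an assumption to check against the API documentation), the labels could be ranked as in the sketch below. The rank_labels helper and the sample values are made up for illustration.

```python
from typing import Dict, List, Tuple


def rank_labels(response: Dict[str, list]) -> List[Tuple[str, float]]:
    """Pair each label with its score and sort from highest to lowest score."""
    pairs = zip(response["labels"], response["scores"])
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Made-up response; the "labels" and "scores" keys are an assumption.
sample = {"labels": ["label 1", "label 2", "label 3"], "scores": [0.2, 0.7, 0.1]}
ranked = rank_labels(sample)
print(ranked[0])  # ('label 2', 0.7)
```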

Code Generation Endpoint

Call the code_generation() method and pass the instruction for the program you want to generate:

client.code_generation("<Your instruction>")

The above command returns a JSON object.

Dependencies Endpoint

Call the dependencies() method and pass the text you want to perform part-of-speech (POS) tagging + arcs on.

client.dependencies("<Your block of text>")

The above command returns a JSON object.

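Embeddings Endpoint

Call the embeddings() method and pass a list of the blocks of text you want to extract embeddings from.

client.embeddings(["<Text 1>", "<Text 2>", "<Text 3>", ...])

The above command returns a JSON object.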

Entities Endpoint

Call the entities() method and pass the text you want to perform named entity recognition (NER) on.

client.entities("<Your block of text>")

The above command returns a JSON object.

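Generation Endpoint

Call the generation() method and pass the following arguments:

The block of text that starts the generated text. 256 tokens maximum for GPT-J on CPU, 1024 tokens maximum for GPT-J and GPT-NeoX 20B on GPU, and 2048 tokens maximum for Fast GPT-J and Finetuned GPT-NeoX 20B on GPU.
(Optional) max_length: the maximum number of tokens that the generated text should contain. 256 tokens maximum for GPT-J on CPU, 1024 tokens maximum for GPT-J and GPT-NeoX 20B on GPU, and 2048 tokens maximum for Fast GPT-J and Finetuned GPT-NeoX 20B on GPU. If length_no_input is false, the size of the generated text is the difference between max_length and the length of your input text. If length_no_input is true, the size of the generated text is simply max_length. Defaults to 50.
(Optional) length_no_input: whether min_length and max_length should exclude the length of the input text, as a boolean. If false, min_length and max_length include the length of the input text. If true, min_length and max_length do not include the length of the input text. Defaults to false.
(Optional) end_sequence: a specific token that should be the end of the generated sequence, as a string. For example, it could be . or \n or ### or anything else shorter than 10 characters.
(Optional) remove_input: whether you want to remove the input text from the result, as a boolean. Defaults to false.
(Optional) num_beams: the number of beams for beam search, as an integer. 1 means no beam search. Defaults to 1.
(Optional) num_return_sequences: the number of independently computed returned sequences for each element in the batch, as an integer. Defaults to 1.
(Optional) top_k: the number of highest-probability vocabulary tokens to keep for top-k filtering, as an integer. Maximum 1000 tokens. Defaults to 0.
(Optional) top_p: if set to a float < 1, only the most probable tokens with probabilities that add up to top_p or higher are kept for generation. This is a float and should be between 0 and 1. Defaults to 0.7.
(Optional) temperature: the value used to modulate the next-token probabilities, as a float. Should be between 0 and 1. Defaults to 1.
(Optional) repetition_penalty: the parameter for the repetition penalty, as a float. 1.0 means no penalty. Defaults to 1.0.
(Optional) bad_words: a list of tokens that are not allowed to be generated, as a list of strings. Defaults to empty.
(Optional) remove_end_sequence: whether you want to remove the end_sequence string from the result. Defaults to false.

client.generation("<Your input text>")

The above command returns a JSON object.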
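
To make the interaction between max_length and length_no_input concrete, here is a minimal sketch of the arithmetic described above, together with a local check of the documented ranges. Both functions are hypothetical helpers rather than part of the nlpcloud package. The input length is taken as an already known token count, and clamping the result at zero is an assumption of this sketch.

```python
def new_token_budget(max_length: int, input_tokens: int,
                     length_no_input: bool = False) -> int:
    """Return how many tokens may be generated for a given max_length."""
    if length_no_input:
        # max_length counts generated tokens only.
        return max_length
    # max_length counts input and generated tokens together.
    # Clamping at zero is an assumption of this sketch.
    return max(max_length - input_tokens, 0)


def check_options(top_k: int = 0, top_p: float = 0.7,
                  temperature: float = 1.0, end_sequence: str = "") -> None:
    """Raise ValueError if an option falls outside the ranges listed above."""
    if not 0 <= top_k <= 1000:
        raise ValueError("top_k should be between 0 and 1000")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p should be between 0 and 1")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature should be between 0 and 1")
    if len(end_sequence) >= 10:
        raise ValueError("end_sequence should be shorter than 10 characters")


print(new_token_budget(50, 20))                        # 30
print(new_token_budget(50, 20, length_no_input=True))  # 50
check_options(top_k=50, top_p=0.9, end_sequence="###")
```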

Grammar and Spelling Correction Endpoint

Call the gs_correction() method and pass the text you want to correct:

client.gs_correction("<Your block of text>")

The above command returns a JSON object.

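Image Generation Endpoint

Call the image_generation() method and pass the text instruction for the new image you want to generate:

client.image_generation("<Your block of text>")

The above command returns a JSON object.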

Intent Classification Endpoint

Call the intent_classification() method and pass the text you want to extract intents from:

client.intent_classification("<Your block of text>")

The above command returns a JSON object.

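Keywords and Keyphrases Extraction Endpoint

Call the kw_kp_extraction() method and pass the text you want to extract keywords and keyphrases from:

client.kw_kp_extraction("<Your block of text>")

The above command returns a JSON object.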

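Language Detection Endpoint

Call the langdetection() method and pass the text you want to analyze in order to detect its language.

client.langdetection("<The text you want to analyze>")

The above command returns a JSON object.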

Paraphrasing Endpoint

Call the paraphrasing() method and pass the text you want to paraphrase.

client.paraphrasing("<Your text to paraphrase>")

The above command returns a JSON object.

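Question Answering Endpoint

Call the question() method and pass the following:

Your question
(Optional) A context that the model will use to try to answer your question

client.question("<Your question>", "<Your context>")

The above command returns a JSON object.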

Semantic Search Endpoint

Call the semantic_search() method and pass your search query.

client.semantic_search("Your search query")

The above command returns a JSON object.

Semantic Similarity Endpoint

Call the semantic_similarity() method and pass a list made up of the 2 blocks of text you want to compare.

client.semantic_similarity(["<Block of text 1>", "<Block of text 2>"])

The above command returns a JSON object.

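Sentence Dependencies Endpoint

Call the sentence_dependencies() method and pass a block of text made up of several sentences you want to perform POS + arcs on.

client.sentence_dependencies("<Your block of text>")

The above command returns a JSON object.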

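Sentiment Analysis Endpoint

Call the sentiment() method and pass the following:

The text you want to analyze and get the sentiment of
(Optional) The target element that the sentiment should apply to

client.sentiment("<Your block of text>", "<Your target element>")

The above command returns a JSON object.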

Speech Synthesis Endpoint

Call the speech_synthesis() method and pass the text you want to convert to audio:

client.speech_synthesis("<Your block of text>")

The above command returns a JSON object.

Summarization Endpoint

Call the summarization() method and pass the text you want to summarize.

client.summarization("<Your text to summarize>")

The above command returns a JSON object.

Tokenization Endpoint

Call the tokens() method and pass the text you want to tokenize.

client.tokens("<Your block of text>")

The above command returns a JSON object.

Translation Endpoint

Call the translation() method and pass the text you want to translate.

client.translation("<Your text to translate>")

The above command returns a JSON object.
