
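Python Client For NLP Cloud

This is the Python client for the NLP Cloud API. See the documentation for more details.

NLP Cloud serves high-performance pre-trained or custom models for NER, sentiment analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keyword and keyphrase extraction, text generation, image generation, source code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic similarity, tokenization, POS tagging, embeddings, and dependency parsing. It is served through a REST API and is ready for production.

You can use the NLP Cloud pre-trained models, fine-tune your own models, or deploy your own models.

If you run into an issue, don't hesitate to raise it as a GitHub issue. Thanks!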

Installation

Install via pip.

pip install nlpcloud

Examples

Here is a full example that summarizes a text using Facebook's Bart Large CNN model, with a fake token:

import nlpcloud

client = nlpcloud.Client("bart-large-cnn", "4eC39HqLyjWDarjtT1zdp7dc")
client.summarization("""One month after the United States began what has become a 
  troubled rollout of a national COVID vaccination campaign, the effort is finally 
  gathering real steam. Close to a million doses -- over 951,000, to be more exact -- 
  made their way into the arms of Americans in the past 24 hours, the U.S. Centers 
  for Disease Control and Prevention reported Wednesday. That's the largest number 
  of shots given in one day since the rollout began and a big jump from the 
  previous day, when just under 340,000 doses were given, CBS News reported. 
  That number is likely to jump quickly after the federal government on Tuesday 
  gave states the OK to vaccinate anyone over 65 and said it would release all 
  the doses of vaccine it has available for distribution. Meanwhile, a number 
  of states have now opened mass vaccination sites in an effort to get larger 
  numbers of people inoculated, CBS News reported.""")

Here is a full example that does the same thing, but on a GPU:

import nlpcloud

client = nlpcloud.Client("bart-large-cnn", "4eC39HqLyjWDarjtT1zdp7dc", gpu=True)
client.summarization("""One month after the United States began what has become a 
  troubled rollout of a national COVID vaccination campaign, the effort is finally 
  gathering real steam. Close to a million doses -- over 951,000, to be more exact -- 
  made their way into the arms of Americans in the past 24 hours, the U.S. Centers 
  for Disease Control and Prevention reported Wednesday. That's the largest number 
  of shots given in one day since the rollout began and a big jump from the 
  previous day, when just under 340,000 doses were given, CBS News reported. 
  That number is likely to jump quickly after the federal government on Tuesday 
  gave states the OK to vaccinate anyone over 65 and said it would release all 
  the doses of vaccine it has available for distribution. Meanwhile, a number 
  of states have now opened mass vaccination sites in an effort to get larger 
  numbers of people inoculated, CBS News reported.""")

Here is a full example that does the same thing, but on a French text:

import nlpcloud

client = nlpcloud.Client("bart-large-cnn", "4eC39HqLyjWDarjtT1zdp7dc", gpu=True, lang="fra_Latn")
client.summarization("""Sur des images aériennes, prises la veille par un vol de surveillance 
  de la Nouvelle-Zélande, la côte d’une île est bordée d’arbres passés du vert 
  au gris sous l’effet des retombées volcaniques. On y voit aussi des immeubles
  endommagés côtoyer des bâtiments intacts. « D’après le peu d’informations
  dont nous disposons, l’échelle de la dévastation pourrait être immense, 
  spécialement pour les îles les plus isolées », avait déclaré plus tôt 
  Katie Greenwood, de la Fédération internationale des sociétés de la Croix-Rouge.
  Selon l’Organisation mondiale de la santé (OMS), une centaine de maisons ont
  été endommagées, dont cinquante ont été détruites sur l’île principale de
  Tonga, Tongatapu. La police locale, citée par les autorités néo-zélandaises,
  a également fait état de deux morts, dont une Britannique âgée de 50 ans,
  Angela Glover, emportée par le tsunami après avoir essayé de sauver les chiens
  de son refuge, selon sa famille.""")

A JSON object is returned:

{
  "summary_text" : " Over 951,000 doses were given in the past 24 hours. That's the largest number of shots given in one day since the  rollout began. That number is likely to jump quickly after the federal government gave states the OK to vaccinate anyone over 65. A number of states have now opened mass vaccination sites. "
}
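As a minimal sketch of how the returned object might be consumed, the helper below reads the "summary_text" key shown in the sample response and trims the surrounding spaces. The helper name extract_summary is hypothetical and is not part of the nlpcloud library; the sample dictionary is a shortened copy of the response above.

```python
def extract_summary(response):
    """Return the summary from a summarization response, without padding spaces."""
    # "summary_text" is the key shown in the sample response above.
    return response["summary_text"].strip()


# Shortened copy of the sample response, used here instead of a live API call.
sample = {"summary_text": " Over 951,000 doses were given in the past 24 hours. "}
print(extract_summary(sample))
```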

Usage

Client Initialization

Pass the model you want to use and your NLP Cloud token to the client during initialization.

The model can be a pre-trained model such as en_core_web_lg or bart-large-mnli, or one of your custom models, referenced as custom_model/<model id> (for example custom_model/2568). See the documentation for the full list of available models.

Your token can be retrieved from your NLP Cloud dashboard.

import nlpcloud

client = nlpcloud.Client("<model>", "<your token>")

If you want to use a GPU, pass gpu=True.

import nlpcloud

client = nlpcloud.Client("<model>", "<your token>", gpu=True)

If you want to use the multilingual add-on to process non-English text, pass lang="<your language code>". For example, to process French text, set lang="fra_Latn".

import nlpcloud

client = nlpcloud.Client("<model>", "<your token>", lang="<your language code>")

If you want to make asynchronous requests, pass asynchronous=True.

import nlpcloud

client = nlpcloud.Client("<model>", "<your token>", asynchronous=True)

When you make asynchronous requests, you always receive a quick response containing a URL. You should then poll this URL with async_result() on a regular basis (every 10 seconds, for example) to check whether the result is available. Here is an example:

client.async_result("https://api.nlpcloud.io/v1/get-async-result/21718218-42e8-4be9-a67f-b7e18e03b436")

The above command returns a JSON object once the response is ready. Otherwise it returns None.
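The polling described above can be sketched as a small loop. This is only an illustration: wait_for_async_result is a hypothetical helper, not part of the nlpcloud library, and the interval and attempt limit are arbitrary choices. It relies only on the behavior stated above, namely that async_result() returns None until the result is ready.

```python
import time


def wait_for_async_result(client, url, interval=10.0, max_attempts=30, sleep=time.sleep):
    """Poll client.async_result(url) until it returns something other than None."""
    for attempt in range(max_attempts):
        result = client.async_result(url)
        if result is not None:
            return result
        # Wait before the next poll, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"no result after {max_attempts} attempts")


# Possible usage (requires a real token, so it is left commented out):
# client = nlpcloud.Client("<model>", "<your token>", asynchronous=True)
# pending = client.summarization("<Your text to summarize>")
# result = wait_for_async_result(client, pending["url"])  # the "url" key is an assumption
```

The sleep argument is injectable so the loop can be exercised in tests without real delays.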

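Automatic Speech Recognition (Speech to Text) Endpoint

Call the asr() method and pass the following arguments:

(Optional: either this or the encoded file must be set) url: a URL where your audio or video file is hosted
(Optional: either this or the URL must be set) encoded_file: a base 64 encoded version of your file
(Optional) input_language: the language of your file, as an ISO code

client.asr("Your url")

The above command returns a JSON object.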
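When the file is not hosted at a URL, the encoded_file argument needs a base 64 string. A minimal sketch using only the Python standard library follows; encode_file_base64 and the file name recording.mp3 are hypothetical, and the commented call assumes asr() accepts the argument names listed above as keywords.

```python
import base64


def encode_file_base64(path):
    """Read a local file and return its contents as a base 64 ASCII string."""
    with open(path, "rb") as handle:
        return base64.b64encode(handle.read()).decode("ascii")


# Possible usage (requires a real token and a real audio file):
# encoded = encode_file_base64("recording.mp3")  # hypothetical file name
# client.asr(encoded_file=encoded, input_language="en")  # keyword names assumed from the list above
```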

Chatbot Endpoint

Call the chatbot() method and pass your input. Optionally, you can also pass a context and a conversation history, which is a list of dictionaries. Each dictionary is made of an input and a response from the chatbot.

client.chatbot("Your input", "Your context", [{"input": "input 1", "response": "response 1"}, {"input": "input 2", "response": "response 2"}, ...])

The above command returns a JSON object.
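Because the history is a plain list of input/response dictionaries, it can be grown turn by turn. The sketch below shows one way to do that; extend_history is a hypothetical helper, and the "response" key read from the reply in the commented usage is an assumption about the returned JSON, not something stated above.

```python
def extend_history(history, user_input, bot_response):
    """Return a new history list with one more input/response pair appended."""
    return list(history) + [{"input": user_input, "response": bot_response}]


history = []
history = extend_history(history, "input 1", "response 1")
history = extend_history(history, "input 2", "response 2")
print(history)

# Possible usage in a conversation loop (requires a real token):
# reply = client.chatbot("Your input", "Your context", history)
# history = extend_history(history, "Your input", reply["response"])  # "response" key is an assumption
```

Returning a new list instead of mutating the argument keeps earlier snapshots of the conversation intact.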

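Classification Endpoint

Call the classification() method and pass the following arguments:

The text you want to classify, as a string
The candidate labels for your text, as a list of strings
(Optional) multi_class: whether the classification should be multi-class or not, as a boolean. Defaults to true.

client.classification("<Your block of text>", ["label 1", "label 2", "..."])

The above command returns a JSON object.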
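A common follow-up step is picking the best-scoring label. The sketch below assumes the returned JSON carries two parallel lists under "labels" and "scores"; that layout is an assumption and should be checked against the API documentation. top_label is a hypothetical helper and the sample values are made up.

```python
def top_label(result):
    """Return the (label, score) pair with the highest score."""
    labels, scores = result["labels"], result["scores"]
    # Index of the largest score; labels and scores are assumed to be parallel lists.
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], scores[best]


# Made-up response in the assumed layout, used instead of a live API call.
sample_result = {"labels": ["label 1", "label 2"], "scores": [0.25, 0.75]}
print(top_label(sample_result))
```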

Code Generation Endpoint

Call the code_generation() method and pass the instruction for the program you want to generate.

client.code_generation("<Your instruction>")

The above command returns a JSON object.

Dependencies Endpoint

Call the dependencies() method and pass the text you want to perform part-of-speech (POS) tagging + arcs on.

client.dependencies("<Your block of text>")

The above command returns a JSON object.

Embeddings Endpoint

Call the embeddings() method and pass a list of blocks of text that you want to extract embeddings from.

client.embeddings(["<Text 1>", "<Text 2>", "<Text 3>", ...])

The above command returns a JSON object.
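Embeddings are usually compared with cosine similarity. The sketch below implements it in plain Python; cosine_similarity is a hypothetical helper, the vectors are made up, and the "embeddings" key holding one vector per input text is an assumption about the returned JSON.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up response in the assumed layout, used instead of a live API call.
response = {"embeddings": [[1.0, 0.0], [0.6, 0.8]]}
first, second = response["embeddings"]
print(cosine_similarity(first, second))
```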

Entities Endpoint

Call the entities() method and pass the text you want to perform named entity recognition (NER) on.

client.entities("<Your block of text>")

The above command returns a JSON object.

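Generation Endpoint

Call the generation() method and pass the following arguments:

The block of text that starts the generated text. 256 tokens maximum for GPT-J on CPU, 1024 tokens maximum for GPT-J and GPT-NeoX 20B on GPU, and 2048 tokens maximum for Fast GPT-J and Finetuned GPT-NeoX 20B on GPU.
(Optional) max_length: the maximum number of tokens that the generated text should contain. 256 tokens maximum for GPT-J on CPU, 1024 tokens maximum for GPT-J and GPT-NeoX 20B on GPU, and 2048 tokens maximum for Fast GPT-J and Finetuned GPT-NeoX 20B on GPU. If length_no_input is false, the size of the generated text is the difference between max_length and the length of your input text. If length_no_input is true, the size of the generated text is simply max_length. Defaults to 50.
(Optional) length_no_input: whether min_length and max_length should exclude the length of the input text, as a boolean. If false, min_length and max_length include the length of the input text. If true, min_length and max_length do not include the length of the input text. Defaults to false.
(Optional) end_sequence: a specific token that should mark the end of the generated sequence, as a string. For example, it could be ".", "\n", "###", or anything else under 10 characters.
(Optional) remove_input: whether to remove the input text from the result, as a boolean. Defaults to false.
(Optional) num_beams: the number of beams for beam search, as an integer. 1 means no beam search. Defaults to 1.
(Optional) num_return_sequences: the number of independently computed returned sequences for each element in the batch, as an integer. Defaults to 1.
(Optional) top_k: the number of highest-probability vocabulary tokens to keep for top-k filtering, as an integer. Maximum 1000 tokens. Defaults to 0.
(Optional) top_p: if set to a float below 1, only the most probable tokens whose probabilities add up to top_p or higher are kept for generation. This is a float and should be between 0 and 1. Defaults to 0.7.
(Optional) temperature: the value used to modulate the next-token probabilities, as a float. Should be between 0 and 1. Defaults to 1.
(Optional) repetition_penalty: the parameter for repetition penalty, as a float. 1.0 means no penalty. Defaults to 1.0.
(Optional) bad_words: the list of tokens that are not allowed to be generated, as a list of strings. Defaults to null.
(Optional) remove_end_sequence: whether to remove the end_sequence string from the result. Defaults to false.

client.generation("<Your input text>")

The above command returns a JSON object.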
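The interaction between max_length and length_no_input is easy to misread, so here is a small worked sketch of the rule stated above. generated_token_budget is a hypothetical helper, the token counts are made up, and clamping the result at zero when the input is longer than max_length is an assumption of this sketch rather than documented API behavior.

```python
def generated_token_budget(max_length, input_tokens, length_no_input=False):
    """Number of new tokens the model may generate under the rule described above."""
    if length_no_input:
        # max_length counts generated tokens only.
        return max_length
    # max_length also counts the input, so the input length is subtracted.
    # Clamping at zero is an assumption of this sketch.
    return max(max_length - input_tokens, 0)


print(generated_token_budget(50, 20))                        # input counted against max_length
print(generated_token_budget(50, 20, length_no_input=True))  # input not counted

# Possible usage (requires a real token; keyword names assumed from the list above):
# client.generation("<Your input text>", max_length=50, length_no_input=True)
```

With the default max_length of 50 and length_no_input left at false, a 20-token prompt leaves room for 30 generated tokens.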

Grammar and Spelling Correction Endpoint

Call the gs_correction() method and pass the text you want to correct.

client.gs_correction("<Your block of text>")

The above command returns a JSON object.

Image Generation Endpoint

Call the image_generation() method and pass the text instruction for the new image you want to generate.

client.image_generation("<Your block of text>")

The above command returns a JSON object.

Intent Classification Endpoint

Call the intent_classification() method and pass the text you want to extract intents from.

client.intent_classification("<Your block of text>")

The above command returns a JSON object.

Keywords and Keyphrases Extraction Endpoint

Call the kw_kp_extraction() method and pass the text you want to extract keywords and keyphrases from.

client.kw_kp_extraction("<Your block of text>")

The above command returns a JSON object.

Language Detection Endpoint

Call the langdetection() method and pass the text you want to analyze in order to detect its language.

client.langdetection("<The text you want to analyze>")

The above command returns a JSON object.

Paraphrasing Endpoint

Call the paraphrasing() method and pass the text you want to paraphrase.

client.paraphrasing("<Your text to paraphrase>")

The above command returns a JSON object.

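Question Answering Endpoint

Call the question() method and pass the following:

Your question
(Optional) A context that the model will use to try to answer your question

client.question("<Your question>", "<Your context>")

The above command returns a JSON object.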

Semantic Search Endpoint

Call the semantic_search() method and pass your search query.

client.semantic_search("Your search query")

The above command returns a JSON object.

Semantic Similarity Endpoint

Call the semantic_similarity() method and pass a list made up of the 2 blocks of text that you want to compare.

client.semantic_similarity(["<Block of text 1>", "<Block of text 2>"])

The above command returns a JSON object.

Sentence Dependencies Endpoint

Call the sentence_dependencies() method and pass a block of text made up of several sentences that you want to perform POS + arcs on.

client.sentence_dependencies("<Your block of text>")

The above command returns a JSON object.

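Sentiment Analysis Endpoint

Call the sentiment() method and pass the following:

The text you want to analyze and get the sentiment of
(Optional) The target element that the sentiment should apply to

client.sentiment("<Your block of text>", "<Your target element>")

The above command returns a JSON object.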

Speech Synthesis Endpoint

Call the speech_synthesis() method and pass the text you want to convert to audio.

client.speech_synthesis("<Your block of text>")

The above command returns a JSON object.

Summarization Endpoint

Call the summarization() method and pass the text you want to summarize.

client.summarization("<Your text to summarize>")

The above command returns a JSON object.

Tokenization Endpoint

Call the tokens() method and pass the text you want to tokenize.

client.tokens("<Your block of text>")

The above command returns a JSON object.

Translation Endpoint

Call the translation() method and pass the text you want to translate.

client.translation("<Your text to translate>")

The above command returns a JSON object.
