basis embedding

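Code for Structured Word Embedding for Low Memory Neural Network Language Model

Code repository for basis embedding, which reduces model size and memory consumption. This repository is built on the pytorch/examples repository on GitHub.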

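Parameters

basis embedding related arguments:

--basis <0> : number of bases used to decompose the embedding matrix; 0 selects normal mode
--num_clusters : number of clusters for the whole vocabulary
--load_input_embedding : path to a pre-trained embedding matrix for the input embedding
--load_output_embedding : path to a pre-trained embedding matrix for the output embedding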
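
As a rough illustration of what --basis and --num_clusters control, the sketch below assumes a product-quantization-style layout: each embedding is split into `num_basis` sub-vectors, and every sub-vector is picked from a small codebook of `num_clusters` entries. The function names, the random initialisation, and the layout itself are assumptions made for illustration; the repository's actual implementation may differ.

```python
import random


def build_basis_embedding(vocab_size, emsize, num_basis, num_clusters, seed=0):
    """Build hypothetical codebooks and per-word codes (illustration only)."""
    if emsize % num_basis != 0:
        raise ValueError("emsize must be divisible by num_basis")
    sub_dim = emsize // num_basis
    rng = random.Random(seed)
    # codebooks[b][k] is the k-th sub-vector (length sub_dim) of basis b.
    codebooks = [
        [[rng.gauss(0.0, 0.1) for _ in range(sub_dim)] for _ in range(num_clusters)]
        for _ in range(num_basis)
    ]
    # codes[w][b] is the cluster index that word w uses in basis b.
    codes = [
        [rng.randrange(num_clusters) for _ in range(num_basis)]
        for _ in range(vocab_size)
    ]
    return codebooks, codes


def lookup(word_id, codebooks, codes):
    """Rebuild one word vector by concatenating its selected sub-vectors."""
    vector = []
    for basis_index, cluster_index in enumerate(codes[word_id]):
        vector.extend(codebooks[basis_index][cluster_index])
    return vector


def parameter_counts(vocab_size, emsize, num_basis, num_clusters):
    """Return (dense floats, codebook floats, integer codes) for comparison."""
    dense = vocab_size * emsize
    codebook_floats = num_clusters * emsize  # num_basis * num_clusters * sub_dim
    code_ints = vocab_size * num_basis
    return dense, codebook_floats, code_ints
```

Under these assumptions, a 10000-word vocabulary with emsize 200, 4 bases and 400 clusters stores 80000 codebook floats plus 40000 small integer codes instead of 2000000 dense floats. The numbers are illustrative and are not results from the repository.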

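Other options:

-c or --config : path to the configuration file. Its values override the argument parser's defaults and are in turn overridden by command-line options.
--train : train a model, or evaluate an existing one.
--dict <None> : use the given vocabulary file if specified; otherwise use the words in train.txt.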
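
The precedence described for -c/--config (parser defaults, then the config file, then command-line options) can be sketched with argparse's set_defaults. The `key = value` file format, the `read_config` helper, and the default values below are hypothetical and chosen only for illustration; the repository's config files may use a different format.

```python
import argparse


def read_config(text):
    """Parse hypothetical 'key = value' lines; '#' starts a comment."""
    values = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        values[key] = value
    return values


def parse_args(argv, config_text=""):
    """Apply parser defaults, then config values, then command-line options."""
    parser = argparse.ArgumentParser()
    # Default values here are illustrative, not the repository's.
    parser.add_argument("--basis", type=int, default=0)
    parser.add_argument("--num_clusters", type=int, default=400)
    parser.add_argument("--emsize", type=int, default=200)
    # Config values replace the defaults; argparse converts string defaults
    # with each argument's type, and explicit options in argv still win.
    parser.set_defaults(**read_config(config_text))
    return parser.parse_args(argv)
```

For example, `parse_args(["--basis", "8"], "basis = 4")` yields `basis == 8`, while `parse_args([], "basis = 4")` yields `basis == 4`.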

Examples

python main.py -c config/default.conf  # train a cross-entropy baseline
python main.py -c config/ptb_basis_tied.conf # basis embedding initialized from a tied embedding on PTB

If a keyboard interrupt (Ctrl-C) is received during training, training stops and the current model is evaluated on the test dataset.

The main.py script accepts the following arguments:
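
The Ctrl-C behaviour can be pictured as a try/except around the epoch loop. This is a minimal sketch of the pattern, not the code in main.py; `train_one_epoch` and `evaluate` are hypothetical callables supplied by the caller.

```python
def run_training(train_one_epoch, evaluate, epochs):
    """Train for up to `epochs` epochs, then evaluate on the test data.

    A KeyboardInterrupt (Ctrl-C) ends the loop early instead of killing the
    process, so the test evaluation below still runs.
    """
    completed = 0
    try:
        for epoch in range(1, epochs + 1):
            train_one_epoch(epoch)
            completed = epoch
    except KeyboardInterrupt:
        print("Exiting from training early")
    # Reached both after a normal finish and after an interrupt.
    test_loss = evaluate()
    return completed, test_loss
```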

optional arguments:
  -h, --help         show this help message and exit
  -c, --config PATH  preset configurations to load
  --data DATA        location of the data corpus
  --model MODEL      type of recurrent net (RNN_TANH, RNN_RELU, LSTM, GRU)
  --emsize EMSIZE    size of word embeddings
  --nhid NHID        number of hidden units per layer
  --nlayers NLAYERS  number of layers
  --lr LR            initial learning rate
  --clip CLIP        gradient clipping
  --epochs EPOCHS    upper epoch limit
  --batch-size N     batch size
  --dropout DROPOUT  dropout applied to layers (0 = no dropout)
  --tied             tie the word embedding and softmax weights
  --seed SEED        random seed
  --cuda             use CUDA
  --log-interval N   report interval
  --save SAVE        path to save the final model
  ... more from previous basis embedding related parameters