basis embedding

1.0.0

Code for structured word embeddings for low-memory neural network language models

Code repository for basis embedding, a technique for reducing model size and memory consumption. This repository is built on the pytorch/examples repository on GitHub.

Parameters

basis embedding related parameters:

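--basis <0>: number of bases used to decompose the embedding matrix; 0 selects normal mode
--num_clusters: number of clusters for the whole vocabulary
--load_input_embedding: path to a pretrained embedding matrix for the input embedding
--load_output_embedding: path to a pretrained embedding matrix for the output embedding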
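
To give a feel for why decomposing the embedding matrix can save memory, here is a rough parameter-count sketch. It assumes a product-quantization-style layout, which this page does not spell out: each embedding vector is split into `basis` sub-vectors, and each sub-vector is replaced by an index into a codebook of `num_clusters` centroids. The function names and the example sizes are hypothetical and are not taken from the repository.

```python
# Sketch only: assumes a product-quantization-style layout. The actual
# implementation in the repository may store things differently.

def full_embedding_params(vocab_size, emsize):
    """Number of floats in an ordinary vocab_size x emsize embedding matrix."""
    return vocab_size * emsize


def basis_embedding_params(vocab_size, emsize, basis, num_clusters):
    """Return (codebook_floats, code_indices) for the assumed layout.

    codebook_floats: `basis` codebooks, each with `num_clusters` centroids
                     of dimension emsize // basis.
    code_indices:    one integer cluster index per word per basis.
    """
    if basis <= 0:
        raise ValueError("basis must be positive (0 means normal mode)")
    if emsize % basis != 0:
        raise ValueError("emsize must be divisible by basis")
    sub_dim = emsize // basis
    codebook_floats = basis * num_clusters * sub_dim
    code_indices = vocab_size * basis
    return codebook_floats, code_indices


# Illustrative sizes only (hypothetical, not the repository's defaults).
vocab_size, emsize = 10000, 200
basis, num_clusters = 4, 400

full = full_embedding_params(vocab_size, emsize)
codebook, codes = basis_embedding_params(vocab_size, emsize, basis, num_clusters)
print("full embedding floats:", full)
print("codebook floats:", codebook, "plus integer codes:", codes)
```

Under these assumptions the codebooks scale with `num_clusters` rather than with the vocabulary size, and the per-word cost shrinks to `basis` small integers.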

Other options:

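-c or --config: path to a configuration file; its values override the argument parser's defaults and are in turn overridden by command-line options
--train: whether to train, or only evaluate an existing model
--dict <None>: use the vocabulary file if specified, otherwise use the words in train.txt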
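
The precedence described for -c/--config (parser defaults, then the configuration file, then command-line options) can be reproduced with a two-stage argparse setup. The sketch below is not the repository's code: the INI-style file format, the `[options]` section name, and the placeholder default for --num_clusters are assumptions made for illustration.

```python
import argparse
import configparser


def parse_args(argv=None):
    """Parse options with precedence: parser defaults < config file < CLI."""
    # Stage 1: pick up only -c/--config and ignore everything else for now.
    pre = argparse.ArgumentParser(add_help=False)
    pre.add_argument("-c", "--config", metavar="PATH", default=None,
                     help="preset configurations to load")
    known, _ = pre.parse_known_args(argv)

    # Stage 2: the full parser, with its built-in defaults.
    parser = argparse.ArgumentParser(parents=[pre])
    parser.add_argument("--basis", type=int, default=0)
    parser.add_argument("--num_clusters", type=int, default=400)  # placeholder
    parser.add_argument("--dict", default=None)

    if known.config is not None:
        cfg = configparser.ConfigParser()
        with open(known.config) as f:
            cfg.read_file(f)
        # Assumed layout: all options live in an [options] section.
        # Values are strings; argparse passes string defaults through the
        # argument's `type`, so "2" becomes the integer 2 for --basis.
        parser.set_defaults(**dict(cfg["options"]))

    # Anything given explicitly on the command line wins over the defaults
    # installed above.
    return parser.parse_args(argv)
```

With a file containing `basis = 2` under `[options]`, calling `parse_args(["-c", path])` would yield `basis == 2`, while `parse_args(["-c", path, "--basis", "4"])` would yield `basis == 4`.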

Examples

python main.py -c config/default.conf  # train a cross-entropy baseline
python main.py -c config/ptb_basis_tied.conf # basis embedding initialized from a tied embedding on PTB

During training, if a keyboard interrupt (Ctrl-C) is received, training stops and the current model is evaluated on the test dataset.
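
The Ctrl-C behaviour follows a common pattern: wrap the epoch loop in a `try` block, catch `KeyboardInterrupt`, and run the test evaluation afterwards either way. The sketch below illustrates that pattern only; `run_training`, `train_one_epoch`, and `evaluate_on_test` are hypothetical names, not functions from main.py.

```python
def run_training(train_one_epoch, evaluate_on_test, epochs):
    """Run up to `epochs` epochs, then evaluate the current model.

    train_one_epoch(epoch) and evaluate_on_test() are caller-supplied
    callables (hypothetical stand-ins for the real training code).
    Returns (number of completed epochs, test result).
    """
    completed = 0
    try:
        for epoch in range(1, epochs + 1):
            train_one_epoch(epoch)
            completed = epoch
    except KeyboardInterrupt:
        # Ctrl-C lands here: stop training early but keep the model state.
        print("Training interrupted; evaluating the current model")

    # Reached both after a normal finish and after an interrupt.
    test_result = evaluate_on_test()
    return completed, test_result
```

Because the evaluation sits after the `try` block rather than inside it, an interrupted run still reports a test result for whatever state the model had reached.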

The main.py script accepts the following arguments:

optional arguments:
  -h, --help         show this help message and exit
  -c, --config PATH  preset configurations to load
  --data DATA        location of the data corpus
  --model MODEL      type of recurrent net (RNN_TANH, RNN_RELU, LSTM, GRU)
  --emsize EMSIZE    size of word embeddings
  --nhid NHID        number of hidden units per layer
  --nlayers NLAYERS  number of layers
  --lr LR            initial learning rate
  --clip CLIP        gradient clipping
  --epochs EPOCHS    upper epoch limit
  --batch-size N     batch size
  --dropout DROPOUT  dropout applied to layers (0 = no dropout)
  --tied             tie the word embedding and softmax weights
  --seed SEED        random seed
  --cuda             use CUDA
  --log-interval N   report interval
  --save SAVE        path to save the final model
  ... more from previous basis embedding related parameters