trax下載 - trax源代碼下載

Trax - 清晰的代碼和速度深度學習

Trax是一個端到端的庫，用於深度學習，專注於清晰的代碼和速度。它是在Google Brain團隊中積極使用和維護的。本筆記本（在Colab中運行）顯示瞭如何使用TRAX以及您可以在哪裡找到更多信息。

運行預訓練的變壓器：在幾行代碼中創建轉換器
功能和資源：API文檔，在哪裡與我們交談，如何打開問題等等
演練：Trax的工作原理，如何製作新型號並訓練自己的數據

我們歡迎對Trax的貢獻！我們歡迎使用新型號和層的代碼以及對代碼和文檔的改進的PR。我們特別喜歡筆記本，這些筆記本可以解釋模型如何工作並展示如何使用它們來解決問題！

以下是一些示例筆記本： -

trax.data api解釋了：解釋了trax.data api中的一些主要功能
使用Reformer的命名實體識別：使用Kaggle數據集使用改革儀體系結構實現命名實體識別。
深N-gram模型：在莎士比亞作品中訓練的深N-gram模型的實現

一般設置

在運行任何代碼示例之前，執行以下單元格（一次）。

 import os
import numpy as np

!p ip install - q - U trax
import trax

1。運行預訓練的變壓器

這是您用幾行代碼創建英語 - 德語翻譯器的方式：

用trax.models.transformer在trax中創建變壓器模型
從具有模型的預訓練的文件中初始化它。INIT_FROM_FILE
將您的輸入句子引入以用trax.data.tokenize輸入到模型中
用trax.supervised.decoding.autoregreserese_sample從變壓器解碼
將解碼的結果降低，以用trax.data.detokenize獲取翻譯

 # Create a Transformer model.
# Pre-trained model config in gs://trax-ml/models/translation/ende_wmt32k.gin
model = trax . models . Transformer (
    input_vocab_size = 33300 ,
    d_model = 512 , d_ff = 2048 ,
    n_heads = 8 , n_encoder_layers = 6 , n_decoder_layers = 6 ,
    max_len = 2048 , mode = 'predict' )

# Initialize using pre-trained weights.
model . init_from_file ( 'gs://trax-ml/models/translation/ende_wmt32k.pkl.gz' ,
                     weights_only = True )

# Tokenize a sentence.
sentence = 'It is nice to learn new things today!'
tokenized = list ( trax . data . tokenize ( iter ([ sentence ]),  # Operates on streams.
                                    vocab_dir = 'gs://trax-ml/vocabs/' ,
                                    vocab_file = 'ende_32k.subword' ))[ 0 ]

# Decode from the Transformer.
tokenized = tokenized [ None , :]  # Add batch dimension.
tokenized_translation = trax . supervised . decoding . autoregressive_sample (
    model , tokenized , temperature = 0.0 )  # Higher temperature: more diverse results.

# De-tokenize,
tokenized_translation = tokenized_translation [ 0 ][: - 1 ]  # Remove batch and EOS.
translation = trax . data . detokenize ( tokenized_translation ,
                                   vocab_dir = 'gs://trax-ml/vocabs/' ,
                                   vocab_file = 'ende_32k.subword' )
print ( translation )

 Es ist schön, heute neue Dinge zu lernen!

2。功能和資源

Trax包括基本模型（例如Resnet，LSTM，Transformer）和RL算法（例如Readforce，A2C，PPO）。它也被積極地用於研究，並包括改革者和AWR等新型RL算法等新模型。 Trax與大量深度學習數據集具有綁定，包括Tensor2Tensor和TensorFlow數據集。

您可以將TRAX用作您自己的Python腳本和筆記本的庫，也可以用作外殼的二進製文件，這對於培訓大型型號可能更方便。它可以在CPU，GPU和TPU上進行任何更改。

API文檔
與我們聊天
打開一個問題
訂閱trax-discuss以獲取新聞

3。演練

您可以在這裡了解Trax的工作原理，如何創建新模型以及如何在自己的數據上訓練它們。

張量和快速數學

流經Trax模型的基本單元是張量- 多維陣列，有時也稱為Numpy陣列，這是由於使用最廣泛的張量操作的軟件包numpy 。如果您不知道如何在張張器上操作Numpy指南：TRAX還使用Numpy API。

在Trax中，我們希望使用Numpy操作非常快速運行，並利用GPU和TPU來加速它們。我們還希望自動計算張量的功能梯度。這是在trax.fastmath軟件包中完成的，這要歸功於其後端-Jax和Tensorflow Numpy。

 from trax . fastmath import numpy as fastnp
trax . fastmath . use_backend ( 'jax' )  # Can be 'jax' or 'tensorflow-numpy'.

matrix  = fastnp . array ([[ 1 , 2 , 3 ], [ 4 , 5 , 6 ], [ 7 , 8 , 9 ]])
print ( f'matrix = n { matrix } ' )
vector = fastnp . ones ( 3 )
print ( f'vector = { vector } ' )
product = fastnp . dot ( vector , matrix )
print ( f'product = { product } ' )
tanh = fastnp . tanh ( product )
print ( f'tanh(product) = { tanh } ' )

 matrix = 
[[1 2 3]
 [4 5 6]
 [7 8 9]]
vector = [1. 1. 1.]
product = [12. 15. 18.]
tanh(product) = [0.99999994 0.99999994 0.99999994]

可以使用trax.fastmath.grad計算梯度。

 def f ( x ):
  return 2.0 * x * x

grad_f = trax . fastmath . grad ( f )

print ( f'grad(2x^2) at 1 = { grad_f ( 1.0 ) } ' )

 grad(2x^2) at 1 = 4.0

層

層是Trax模型的基本基礎。您將在“層”介紹中了解所有有關它們的知識，但是就目前而言，只需查看一個核心Trax層的實現， Embedding ：

 class Embedding ( base . Layer ):
  """Trainable layer that maps discrete tokens/IDs to vectors."""

  def __init__ ( self ,
               vocab_size ,
               d_feature ,
               kernel_initializer = init . RandomNormalInitializer ( 1.0 )):
    """Returns an embedding layer with given vocabulary size and vector size.

    Args:
      vocab_size: Size of the input vocabulary. The layer will assign a unique
          vector to each ID in `range(vocab_size)`.
      d_feature: Dimensionality/depth of the output vectors.
      kernel_initializer: Function that creates (random) initial vectors for
          the embedding.
    """
    super (). __init__ ( name = f'Embedding_ { vocab_size } _ { d_feature } ' )
    self . _d_feature = d_feature  # feature dimensionality
    self . _vocab_size = vocab_size
    self . _kernel_initializer = kernel_initializer

  def forward ( self , x ):
    """Returns embedding vectors corresponding to input token IDs.

    Args:
      x: Tensor of token IDs.

    Returns:
      Tensor of embedding vectors.
    """
    return jnp . take ( self . weights , x , axis = 0 , mode = 'clip' )

  def init_weights_and_state ( self , input_signature ):
    """Returns tensor of newly initialized embedding vectors."""
    del input_signature
    shape_w = ( self . _vocab_size , self . _d_feature )
    w = self . _kernel_initializer ( shape_w , self . rng )
    self . weights = w

需要使用輸入的簽名（形狀和DTYPE） Embedding具有訓練權重的層，然後可以通過調用它們來運行。

 from trax import layers as tl

# Create an input tensor x.
x = np . arange ( 15 )
print ( f'x = { x } ' )

# Create the embedding layer.
embedding = tl . Embedding ( vocab_size = 20 , d_feature = 32 )
embedding . init ( trax . shapes . signature ( x ))

# Run the layer -- y = embedding(x).
y = embedding ( x )
print ( f'shape of y = { y . shape } ' )

 x = [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
shape of y = (15, 32)

型號

TRAX中的模型是由最常使用Serial和Branch組合器的圖層構建的。您可以在“圖層”介紹中閱讀有關這些組合器的更多信息，並在trax/models/中查看許多模型的代碼，例如，這就是Transformer語言模型的實現。以下是如何構建情感分類模型的示例。

 model = tl . Serial (
    tl . Embedding ( vocab_size = 8192 , d_feature = 256 ),
    tl . Mean ( axis = 1 ),  # Average on axis 1 (length of sentence).
    tl . Dense ( 2 ),      # Classify 2 classes.
    tl . LogSoftmax ()   # Produce log-probabilities.
)

# You can print model structure.
print ( model )

 Serial[
  Embedding_8192_256
  Mean
  Dense_2
  LogSoftmax
]

數據

要訓練模型，您需要數據。在TRAX中，數據流表示為Python迭代器，因此您可以調用next(data_stream)並獲取元組，例如(inputs, targets) 。 TRAX允許您輕鬆使用TensorFlow數據集，並且還可以使用標準open('my_file.txt')從自己的文本文件中獲取迭代器。

 train_stream = trax . data . TFDS ( 'imdb_reviews' , keys = ( 'text' , 'label' ), train = True )()
eval_stream = trax . data . TFDS ( 'imdb_reviews' , keys = ( 'text' , 'label' ), train = False )()
print ( next ( train_stream ))  # See one example.

 (b"This was an absolutely terrible movie. Don't be lured in by Christopher Walken or Michael Ironside. Both are great actors, but this must simply be their worst role in history. Even their great acting could not redeem this movie's ridiculous storyline. This movie is an early nineties US propaganda piece. The most pathetic scenes were those when the Columbian rebels were making their cases for revolutions. Maria Conchita Alonso appeared phony, and her pseudo-love affair with Walken was nothing but a pathetic emotional plug in a movie that was devoid of any real meaning. I am disappointed that there are movies like this, ruining actor's like Christopher Walken's good name. I could barely sit through it.", 0)

使用trax.data模塊，您可以創建輸入處理管道，例如，以代幣化和調整數據。您可以使用trax.data.Serial創建數據管道，它們是您適用於流以創建處理流的流的功能。

 data_pipeline = trax . data . Serial (
    trax . data . Tokenize ( vocab_file = 'en_8k.subword' , keys = [ 0 ]),
    trax . data . Shuffle (),
    trax . data . FilterByLength ( max_length = 2048 , length_keys = [ 0 ]),
    trax . data . BucketByLength ( boundaries = [  32 , 128 , 512 , 2048 ],
                             batch_sizes = [ 256 ,  64 ,  16 ,    4 , 1 ],
                             length_keys = [ 0 ]),
    trax . data . AddLossWeights ()
  )
train_batches_stream = data_pipeline ( train_stream )
eval_batches_stream = data_pipeline ( eval_stream )
example_batch = next ( train_batches_stream )
print ( f'shapes = { [ x . shape for x in example_batch ] } ' )  # Check the shapes.

 shapes = [(4, 1024), (4,), (4,)]

監督培訓

當您擁有模型和數據時，請使用trax.supervised.training來定義培訓和評估任務並創建訓練循環。 TRAX訓練環將優化培訓，並將為您創建張板日誌和型號檢查點。

 from trax . supervised import training

# Training task.
train_task = training . TrainTask (
    labeled_data = train_batches_stream ,
    loss_layer = tl . WeightedCategoryCrossEntropy (),
    optimizer = trax . optimizers . Adam ( 0.01 ),
    n_steps_per_checkpoint = 500 ,
)

# Evaluaton task.
eval_task = training . EvalTask (
    labeled_data = eval_batches_stream ,
    metrics = [ tl . WeightedCategoryCrossEntropy (), tl . WeightedCategoryAccuracy ()],
    n_eval_batches = 20  # For less variance in eval numbers.
)

# Training loop saves checkpoints to output_dir.
output_dir = os . path . expanduser ( '~/output_dir/' )
!r m - rf { output_dir }
training_loop = training . Loop ( model ,
                              train_task ,
                              eval_tasks = [ eval_task ],
                              output_dir = output_dir )

# Run 2000 steps (batches).
training_loop . run ( 2000 )

 Step      1: Ran 1 train steps in 0.78 secs
Step      1: train WeightedCategoryCrossEntropy |  1.33800304
Step      1: eval  WeightedCategoryCrossEntropy |  0.71843582
Step      1: eval      WeightedCategoryAccuracy |  0.56562500

Step    500: Ran 499 train steps in 5.77 secs
Step    500: train WeightedCategoryCrossEntropy |  0.62914723
Step    500: eval  WeightedCategoryCrossEntropy |  0.49253047
Step    500: eval      WeightedCategoryAccuracy |  0.74062500

Step   1000: Ran 500 train steps in 5.03 secs
Step   1000: train WeightedCategoryCrossEntropy |  0.42949259
Step   1000: eval  WeightedCategoryCrossEntropy |  0.35451687
Step   1000: eval      WeightedCategoryAccuracy |  0.83750000

Step   1500: Ran 500 train steps in 4.80 secs
Step   1500: train WeightedCategoryCrossEntropy |  0.41843575
Step   1500: eval  WeightedCategoryCrossEntropy |  0.35207348
Step   1500: eval      WeightedCategoryAccuracy |  0.82109375

Step   2000: Ran 500 train steps in 5.35 secs
Step   2000: train WeightedCategoryCrossEntropy |  0.38129005
Step   2000: eval  WeightedCategoryCrossEntropy |  0.33760912
Step   2000: eval      WeightedCategoryAccuracy |  0.85312500

訓練模型後，像任何一層一樣運行以獲取結果。

 example_input = next ( eval_batches_stream )[ 0 ][ 0 ]
example_input_str = trax . data . detokenize ( example_input , vocab_file = 'en_8k.subword' )
print ( f'example input_str: { example_input_str } ' )
sentiment_log_probs = model ( example_input [ None , :])  # Add batch dimension.
print ( f'Model returned sentiment probabilities: { np . exp ( sentiment_log_probs ) } ' )

 example input_str: I first saw this when I was a teen in my last year of Junior High. I was riveted to it! I loved the special effects, the fantastic places and the trial-aspect and flashback method of telling the story.<br /><br />Several years later I read the book and while it was interesting and I could definitely see what Swift was trying to say, I think that while it's not as perfect as the book for social commentary, as a story the movie is better. It makes more sense to have it be one long adventure than having Gulliver return after each voyage and making a profit by selling the tiny Lilliput sheep or whatever.<br /><br />It's much more arresting when everyone thinks he's crazy and the sheep DO make a cameo anyway. As a side note, when I saw Laputa I was stunned. It looks very much like the Kingdom of Zeal from the Chrono Trigger video game (1995) that also made me like this mini-series even more.<br /><br />I saw it again about 4 years ago, and realized that I still enjoyed it just as much. Really high quality stuff and began an excellent run of Sweeps mini-series for NBC who followed it up with the solid Merlin and interesting Alice in Wonderland.<pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad>
Model returned sentiment probabilities: [[3.984500e-04 9.996014e-01]]

展開