
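Trax - Deep Learning with Clear Code and Speed

Trax is an end-to-end library for deep learning that focuses on clear code and speed. It is actively used and maintained in the Google Brain team. This notebook (run it in Colab) shows how to use Trax and where you can find more information.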

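Run a pre-trained Transformer: create a translator in a few lines of code
Features and resources: API docs, where to talk to us, how to open an issue, and more
Walkthrough: how Trax works, how to make new models and train them on your own data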

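We welcome contributions to Trax! We welcome PRs with code for new models and layers, as well as improvements to our code and documentation. We especially love notebooks that explain how models work and show how to use them to solve problems!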

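Here are a few example notebooks:

trax.data API explained: explains some of the major functions in the trax.data API
Named Entity Recognition using Reformer: uses a Kaggle dataset to implement named entity recognition with the Reformer architecture
Deep N-Gram models: implementation of deep n-gram models trained on Shakespeare's works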

General Setup

Execute the following cell (once) before running any of the code samples.

import os
import numpy as np

!pip install -q -U trax
import trax

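1. Run a pre-trained Transformer

Here is how you create an English-German translator in a few lines of code:

create a Transformer model in Trax with trax.models.Transformer
initialize it from a file with pre-trained weights with model.init_from_file
tokenize your input sentence to feed into the model with trax.data.tokenize
decode from the Transformer with trax.supervised.decoding.autoregressive_sample
de-tokenize the decoded result to get the translation with trax.data.detokenize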

# Create a Transformer model.
# Pre-trained model config in gs://trax-ml/models/translation/ende_wmt32k.gin
model = trax.models.Transformer(
    input_vocab_size=33300,
    d_model=512, d_ff=2048,
    n_heads=8, n_encoder_layers=6, n_decoder_layers=6,
    max_len=2048, mode='predict')

# Initialize using pre-trained weights.
model.init_from_file('gs://trax-ml/models/translation/ende_wmt32k.pkl.gz',
                     weights_only=True)

# Tokenize a sentence.
sentence = 'It is nice to learn new things today!'
tokenized = list(trax.data.tokenize(iter([sentence]),  # Operates on streams.
                                    vocab_dir='gs://trax-ml/vocabs/',
                                    vocab_file='ende_32k.subword'))[0]

# Decode from the Transformer.
tokenized = tokenized[None, :]  # Add batch dimension.
tokenized_translation = trax.supervised.decoding.autoregressive_sample(
    model, tokenized, temperature=0.0)  # Higher temperature: more diverse results.

# De-tokenize.
tokenized_translation = tokenized_translation[0][:-1]  # Remove batch and EOS.
translation = trax.data.detokenize(tokenized_translation,
                                   vocab_dir='gs://trax-ml/vocabs/',
                                   vocab_file='ende_32k.subword')
print(translation)

 Es ist schön, heute neue Dinge zu lernen!
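
To make the decoding step more concrete, here is a minimal, Trax-free sketch of what greedy autoregressive decoding (the temperature = 0.0 case) does conceptually: ask for next-token scores, pick the highest-scoring token, append it, and stop at an end-of-sequence token. The toy_next_token_scores function, the token IDs, and EOS_ID are hypothetical stand-ins for a real model; this is an illustration, not how autoregressive_sample is implemented.

```python
EOS_ID = 1  # Hypothetical end-of-sequence token ID for this toy example.

# Hypothetical "model": a fixed table mapping the last token to next-token scores.
TOY_TABLE = {
    0: {5: 0.9, 7: 0.1},
    5: {7: 0.6, 1: 0.4},
    7: {1: 0.8, 5: 0.2},
}


def toy_next_token_scores(prefix):
    """Returns a dict of candidate next tokens and their scores."""
    return TOY_TABLE[prefix[-1]]


def greedy_decode(start_token, max_len=10):
    """Greedy decoding: always pick the highest-scoring next token."""
    output = [start_token]
    for _ in range(max_len):
        scores = toy_next_token_scores(output)
        next_token = max(scores, key=scores.get)
        output.append(next_token)
        if next_token == EOS_ID:
            break
    return output


decoded = greedy_decode(start_token=0)
print(decoded)        # Includes the start token and the trailing EOS.
print(decoded[1:-1])  # Drop the start token and EOS to keep only the content.
```

With a temperature above zero, a sampler would draw the next token at random in proportion to its (rescaled) score instead of always taking the maximum, which is why higher temperatures give more diverse results.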

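2. Features and resources

Trax includes basic models (such as ResNet, LSTM, and Transformer) and RL algorithms (such as REINFORCE, A2C, and PPO). It is also actively used for research and includes new models such as the Reformer and new RL algorithms such as AWR. Trax has bindings to a large number of deep learning datasets, including Tensor2Tensor and TensorFlow Datasets.

You can use Trax either as a library from your own Python scripts and notebooks or as a binary from the shell, which can be more convenient for training large models. It runs without any changes on CPUs, GPUs, and TPUs.

API docs
Chat with us
Open an issue
Subscribe to trax-discuss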

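3. Walkthrough

Here you can learn how Trax works, how to create new models, and how to train them on your own data.

Tensors and Fast Math

The basic units flowing through Trax models are tensors: multi-dimensional arrays, sometimes also called NumPy arrays after NumPy, the most widely used package for tensor operations. If you don't know how to operate on tensors, take a look at the NumPy guide; Trax uses the NumPy API for that as well.

In Trax we want NumPy operations to run very fast, using GPUs and TPUs to accelerate them. We also want to compute gradients of functions on tensors automatically. This is done in the trax.fastmath package thanks to its backends, JAX and TensorFlow NumPy.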

from trax.fastmath import numpy as fastnp
trax.fastmath.use_backend('jax')  # Can be 'jax' or 'tensorflow-numpy'.

matrix = fastnp.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f'matrix = \n{matrix}')
vector = fastnp.ones(3)
print(f'vector = {vector}')
product = fastnp.dot(vector, matrix)
print(f'product = {product}')
tanh = fastnp.tanh(product)
print(f'tanh(product) = {tanh}')

 matrix = 
[[1 2 3]
 [4 5 6]
 [7 8 9]]
vector = [1. 1. 1.]
product = [12. 15. 18.]
tanh(product) = [0.99999994 0.99999994 0.99999994]
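
As a sanity check on the numbers above, the same vector-matrix product can be reproduced with plain Python lists and the standard math module. This sketch uses neither Trax nor NumPy; it only spells out what dot and tanh compute in this example. The tanh values may differ from the ones printed above in the last digits because of floating-point precision.

```python
import math

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
vector = [1.0, 1.0, 1.0]

# dot(vector, matrix)[j] = sum over i of vector[i] * matrix[i][j].
product = [
    sum(vector[i] * matrix[i][j] for i in range(len(vector)))
    for j in range(len(matrix[0]))
]
print(f'product = {product}')

# tanh is applied element-wise; large inputs saturate very close to 1.
tanh = [math.tanh(p) for p in product]
print(f'tanh(product) = {tanh}')
```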

Gradients can be calculated using trax.fastmath.grad.

def f(x):
  return 2.0 * x * x

grad_f = trax.fastmath.grad(f)

print(f'grad(2x^2) at 1 = {grad_f(1.0)}')

 grad(2x^2) at 1 = 4.0
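
One way to convince yourself that this value is right is to compare it with a numerical estimate. Analytically, the derivative of 2x^2 is 4x, which is 4 at x = 1. The sketch below approximates the same derivative with a central difference in plain Python; numerical_grad is a hypothetical helper for this check, not a Trax function.

```python
def f(x):
    return 2.0 * x * x


def numerical_grad(fn, x, eps=1e-6):
    """Central-difference approximation of the derivative of fn at x."""
    return (fn(x + eps) - fn(x - eps)) / (2.0 * eps)


approx = numerical_grad(f, 1.0)
print(f'numerical grad(2x^2) at 1 ~= {approx:.6f}')
```

Automatic differentiation computes the derivative exactly (up to floating-point rounding) rather than approximating it, so a finite-difference estimate like this is mainly useful as a test.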

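Layers

Layers are the basic building blocks of Trax models. You will learn all about them in the layers intro, but for now just take a look at the implementation of one core Trax layer, Embedding: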

# Imports this excerpt relies on.
from trax.fastmath import numpy as jnp
from trax.layers import base
from trax.layers import initializers as init


class Embedding(base.Layer):
  """Trainable layer that maps discrete tokens/IDs to vectors."""

  def __init__(self,
               vocab_size,
               d_feature,
               kernel_initializer=init.RandomNormalInitializer(1.0)):
    """Returns an embedding layer with given vocabulary size and vector size.

    Args:
      vocab_size: Size of the input vocabulary. The layer will assign a unique
          vector to each ID in `range(vocab_size)`.
      d_feature: Dimensionality/depth of the output vectors.
      kernel_initializer: Function that creates (random) initial vectors for
          the embedding.
    """
    super().__init__(name=f'Embedding_{vocab_size}_{d_feature}')
    self._d_feature = d_feature  # feature dimensionality
    self._vocab_size = vocab_size
    self._kernel_initializer = kernel_initializer

  def forward(self, x):
    """Returns embedding vectors corresponding to input token IDs.

    Args:
      x: Tensor of token IDs.

    Returns:
      Tensor of embedding vectors.
    """
    return jnp.take(self.weights, x, axis=0, mode='clip')

  def init_weights_and_state(self, input_signature):
    """Returns tensor of newly initialized embedding vectors."""
    del input_signature
    shape_w = (self._vocab_size, self._d_feature)
    w = self._kernel_initializer(shape_w, self.rng)
    self.weights = w

Layers with trainable weights, like Embedding, need to be initialized with the signature (shape and dtype) of the input; after that, you run them by calling them.

from trax import layers as tl

# Create an input tensor x.
x = np.arange(15)
print(f'x = {x}')

# Create the embedding layer.
embedding = tl.Embedding(vocab_size=20, d_feature=32)
embedding.init(trax.shapes.signature(x))

# Run the layer -- y = embedding(x).
y = embedding(x)
print(f'shape of y = {y.shape}')

 x = [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
shape of y = (15, 32)
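
The output shape follows directly from what an embedding does: it is a table with one row per token ID, and the forward pass picks one row per input ID. The plain-Python sketch below mirrors that lookup, including the clipping of out-of-range IDs that mode='clip' requests in the implementation above. The table, the random initialization and the embed helper are illustrative assumptions, not the Trax code path.

```python
import random

vocab_size, d_feature = 20, 32
rng = random.Random(0)

# The embedding "weights": one random vector of length d_feature per token ID.
table = [[rng.gauss(0.0, 1.0) for _ in range(d_feature)]
         for _ in range(vocab_size)]


def embed(token_ids):
    """Looks up one row per token ID, clipping out-of-range IDs."""
    clipped = [min(max(t, 0), vocab_size - 1) for t in token_ids]
    return [table[t] for t in clipped]


x = list(range(15))
y = embed(x)
print(f'shape of y = ({len(y)}, {len(y[0])})')

# An out-of-range ID is clipped to the last valid row in this sketch.
print(embed([999])[0] is table[vocab_size - 1])
```

Fifteen input IDs give fifteen rows of 32 features each, which matches the (15, 32) shape reported above.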

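Models

Models in Trax are built from layers, most often using the Serial and Branch combinators. You can read more about those combinators in the layers intro and see the code for many models in trax/models/. Below is an example of how to build a sentiment classification model.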

model = tl.Serial(
    tl.Embedding(vocab_size=8192, d_feature=256),
    tl.Mean(axis=1),  # Average on axis 1 (length of sentence).
    tl.Dense(2),      # Classify 2 classes.
    tl.LogSoftmax()   # Produce log-probabilities.
)

# You can print model structure.
print(model)

 Serial[
  Embedding_8192_256
  Mean
  Dense_2
  LogSoftmax
]
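
Conceptually, a Serial combinator runs its sublayers in order and feeds each output into the next. The sketch below mimics that idea with ordinary Python functions. The serial, mean_over_length and log_softmax helpers are hypothetical and written only for this illustration; they ignore weights, state and batching, and the toy pipeline has no counterpart to the Dense layer.

```python
import math


def serial(*fns):
    """Composes functions left to right, like a pipeline of layers."""
    def run(x):
        for fn in fns:
            x = fn(x)
        return x
    return run


def mean_over_length(vectors):
    """Averages a list of equal-length vectors into a single vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


toy_model = serial(mean_over_length, log_softmax)

# Two "token vectors" with two features each stand in for embedded text.
log_probs = toy_model([[1.0, 3.0], [3.0, 1.0]])
print(log_probs)
```

Averaging over the sequence axis is what lets a classifier like the one above accept sentences of any length: whatever the number of tokens, the mean is a single fixed-size vector.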

Data

To train your model, you need data. In Trax, data streams are represented as Python iterators, so you can call next(data_stream) and get a tuple, e.g., (inputs, targets). Trax lets you use TensorFlow Datasets easily, and you can also get an iterator from your own text file using the standard open('my_file.txt').

train_stream = trax.data.TFDS('imdb_reviews', keys=('text', 'label'), train=True)()
eval_stream = trax.data.TFDS('imdb_reviews', keys=('text', 'label'), train=False)()
print(next(train_stream))  # See one example.

 (b"This was an absolutely terrible movie. Don't be lured in by Christopher Walken or Michael Ironside. Both are great actors, but this must simply be their worst role in history. Even their great acting could not redeem this movie's ridiculous storyline. This movie is an early nineties US propaganda piece. The most pathetic scenes were those when the Columbian rebels were making their cases for revolutions. Maria Conchita Alonso appeared phony, and her pseudo-love affair with Walken was nothing but a pathetic emotional plug in a movie that was devoid of any real meaning. I am disappointed that there are movies like this, ruining actor's like Christopher Walken's good name. I could barely sit through it.", 0)
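
Because a data stream is just a Python iterator, a generator over a text file already qualifies. The sketch below assumes a hypothetical file format with one label, a tab, and the text per line, and it writes a small temporary file so that it runs on its own. The text_file_stream helper is not part of Trax, and looping over the file forever is a design choice of this sketch.

```python
import os
import tempfile


def text_file_stream(path):
    """Yields (text, label) tuples from a tab-separated file, looping forever."""
    while True:  # Restart from the top once the file is exhausted.
        with open(path, encoding='utf-8') as f:
            for line in f:
                label, text = line.rstrip('\n').split('\t', 1)
                yield text, int(label)


# Write a tiny hypothetical dataset so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write('1\tA wonderful film.\n0\tA terrible movie.\n')
    path = tmp.name

stream = text_file_stream(path)
first = next(stream)
second = next(stream)
third = next(stream)  # Wraps around to the first line again.
print(first, second, third)

stream.close()   # Closing the generator also closes the open file.
os.remove(path)
```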

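Using the trax.data module you can create input processing pipelines, e.g., to tokenize and shuffle your data. You create data pipelines with trax.data.Serial; they are functions that you apply to streams to create processed streams.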

data_pipeline = trax.data.Serial(
    trax.data.Tokenize(vocab_file='en_8k.subword', keys=[0]),
    trax.data.Shuffle(),
    trax.data.FilterByLength(max_length=2048, length_keys=[0]),
    trax.data.BucketByLength(boundaries=[  32, 128, 512, 2048],
                             batch_sizes=[256,  64,  16,    4, 1],
                             length_keys=[0]),
    trax.data.AddLossWeights()
  )
train_batches_stream = data_pipeline(train_stream)
eval_batches_stream = data_pipeline(eval_stream)
example_batch = next(train_batches_stream)
print(f'shapes = {[x.shape for x in example_batch]}')  # Check the shapes.

 shapes = [(4, 1024), (4,), (4,)]
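
Bucketing by length groups examples of similar length, so that short texts can go into large batches and long texts into small ones. Note that batch_sizes has one more entry than boundaries: the extra entry covers examples longer than the last boundary. The plain-Python sketch below illustrates the idea on toy data. The comparison used for bucket membership and the policy of padding to the longest example in a batch are assumptions of this sketch, so it should not be read as a description of how BucketByLength pads.

```python
import math


def bucket_by_length(examples, boundaries, batch_sizes, pad_id=0):
    """Groups token lists into length buckets and yields padded batches."""
    assert len(batch_sizes) == len(boundaries) + 1
    bounds = list(boundaries) + [math.inf]  # Last bucket has no upper limit.
    buckets = [[] for _ in batch_sizes]
    for tokens in examples:
        # First bucket whose upper bound fits this example.
        idx = next(i for i, b in enumerate(bounds) if len(tokens) <= b)
        buckets[idx].append(tokens)
        if len(buckets[idx]) == batch_sizes[idx]:
            longest = max(len(t) for t in buckets[idx])
            # Pad every example in the batch to the longest one.
            yield [t + [pad_id] * (longest - len(t)) for t in buckets[idx]]
            buckets[idx] = []


examples = [[1] * 3, [2] * 10, [3] * 2, [4] * 4, [5] * 9]
batches = list(bucket_by_length(examples, boundaries=[4, 16],
                                batch_sizes=[3, 2, 1]))
for batch in batches:
    print(len(batch), len(batch[0]))  # Batch size and padded length.
```

The batch shape printed above, (4, 1024), is consistent with this picture: a sequence length between 512 and 2048 falls into the bucket whose batch size is 4.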

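Supervised training

When you have the model and the data, use trax.supervised.training to define training and eval tasks and create a training loop. The Trax training loop optimizes training and creates TensorBoard logs and model checkpoints for you.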

from trax.supervised import training

# Training task.
train_task = training.TrainTask(
    labeled_data=train_batches_stream,
    loss_layer=tl.WeightedCategoryCrossEntropy(),
    optimizer=trax.optimizers.Adam(0.01),
    n_steps_per_checkpoint=500,
)

# Evaluation task.
eval_task = training.EvalTask(
    labeled_data=eval_batches_stream,
    metrics=[tl.WeightedCategoryCrossEntropy(), tl.WeightedCategoryAccuracy()],
    n_eval_batches=20  # For less variance in eval numbers.
)

# Training loop saves checkpoints to output_dir.
output_dir = os.path.expanduser('~/output_dir/')
!rm -rf {output_dir}
training_loop = training.Loop(model,
                              train_task,
                              eval_tasks=[eval_task],
                              output_dir=output_dir)

# Run 2000 steps (batches).
training_loop.run(2000)

 Step      1: Ran 1 train steps in 0.78 secs
Step      1: train WeightedCategoryCrossEntropy |  1.33800304
Step      1: eval  WeightedCategoryCrossEntropy |  0.71843582
Step      1: eval      WeightedCategoryAccuracy |  0.56562500

Step    500: Ran 499 train steps in 5.77 secs
Step    500: train WeightedCategoryCrossEntropy |  0.62914723
Step    500: eval  WeightedCategoryCrossEntropy |  0.49253047
Step    500: eval      WeightedCategoryAccuracy |  0.74062500

Step   1000: Ran 500 train steps in 5.03 secs
Step   1000: train WeightedCategoryCrossEntropy |  0.42949259
Step   1000: eval  WeightedCategoryCrossEntropy |  0.35451687
Step   1000: eval      WeightedCategoryAccuracy |  0.83750000

Step   1500: Ran 500 train steps in 4.80 secs
Step   1500: train WeightedCategoryCrossEntropy |  0.41843575
Step   1500: eval  WeightedCategoryCrossEntropy |  0.35207348
Step   1500: eval      WeightedCategoryAccuracy |  0.82109375

Step   2000: Ran 500 train steps in 5.35 secs
Step   2000: train WeightedCategoryCrossEntropy |  0.38129005
Step   2000: eval  WeightedCategoryCrossEntropy |  0.33760912
Step   2000: eval      WeightedCategoryAccuracy |  0.85312500
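
The two metrics in this log can be written out in a few lines. The sketch below assumes that the model emits log-probabilities (as the LogSoftmax layer above does), that each example carries a weight (as added by AddLossWeights), and that both metrics are weighted means normalized by the sum of the weights. Treat that normalization as an assumption of this illustration rather than a statement about Trax internals; the batch values are made up.

```python
import math


def weighted_cross_entropy(log_probs, targets, weights):
    """Weighted mean of -log p(target) over a batch."""
    total = sum(-lp[t] * w for lp, t, w in zip(log_probs, targets, weights))
    return total / sum(weights)


def weighted_accuracy(log_probs, targets, weights):
    """Weighted fraction of examples whose argmax equals the target."""
    correct = sum(
        w for lp, t, w in zip(log_probs, targets, weights)
        if max(range(len(lp)), key=lp.__getitem__) == t
    )
    return correct / sum(weights)


# Hypothetical batch of three examples with two classes each.
log_probs = [
    [math.log(0.9), math.log(0.1)],
    [math.log(0.2), math.log(0.8)],
    [math.log(0.6), math.log(0.4)],
]
targets = [0, 1, 1]
weights = [1.0, 1.0, 1.0]

ce = weighted_cross_entropy(log_probs, targets, weights)
acc = weighted_accuracy(log_probs, targets, weights)
print(f'cross-entropy = {ce:.4f}, accuracy = {acc:.4f}')
```

Setting a weight to zero removes that example from both metrics, which is the usual reason for carrying weights alongside the targets.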

After training the model, run it like any layer to get results.

example_input = next(eval_batches_stream)[0][0]
example_input_str = trax.data.detokenize(example_input, vocab_file='en_8k.subword')
print(f'example input_str: {example_input_str}')
sentiment_log_probs = model(example_input[None, :])  # Add batch dimension.
print(f'Model returned sentiment probabilities: {np.exp(sentiment_log_probs)}')

 example input_str: I first saw this when I was a teen in my last year of Junior High. I was riveted to it! I loved the special effects, the fantastic places and the trial-aspect and flashback method of telling the story.<br /><br />Several years later I read the book and while it was interesting and I could definitely see what Swift was trying to say, I think that while it's not as perfect as the book for social commentary, as a story the movie is better. It makes more sense to have it be one long adventure than having Gulliver return after each voyage and making a profit by selling the tiny Lilliput sheep or whatever.<br /><br />It's much more arresting when everyone thinks he's crazy and the sheep DO make a cameo anyway. As a side note, when I saw Laputa I was stunned. It looks very much like the Kingdom of Zeal from the Chrono Trigger video game (1995) that also made me like this mini-series even more.<br /><br />I saw it again about 4 years ago, and realized that I still enjoyed it just as much. Really high quality stuff and began an excellent run of Sweeps mini-series for NBC who followed it up with the solid Merlin and interesting Alice in Wonderland.<pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad>
Model returned sentiment probabilities: [[3.984500e-04 9.996014e-01]]
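
The model returns log-probabilities, so np.exp turns them into probabilities that sum to roughly one, and the index of the largest entry is the predicted class. The sketch below reads the probabilities printed above in plain Python. The label names assume the common imdb_reviews convention of 0 for negative and 1 for positive, which this document does not state.

```python
import math

# Probabilities copied from the output above (batch of one example).
probs = [[3.984500e-04, 9.996014e-01]]

# Assumed label convention for imdb_reviews: 0 = negative, 1 = positive.
label_names = ['negative', 'positive']

for row in probs:
    predicted = max(range(len(row)), key=row.__getitem__)
    print(f'predicted: {label_names[predicted]} (p = {row[predicted]:.4f})')
    # Taking logs again recovers the kind of values the model emitted.
    log_probs = [math.log(p) for p in row]
    print(f'log-probabilities: {log_probs}')
```

Under that assumed convention, the prediction agrees with the clearly favorable review shown above.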
