
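Trax - Deep Learning with Clear Code and Speed

Trax is an end-to-end library for deep learning that focuses on clear code and speed. It is actively used and maintained by the Google Brain team. This notebook (run it in Colab) shows how to use Trax and where you can find more information.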

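Run a pre-trained Transformer: create a translator in a few lines of code
Features and resources: API docs, where to talk to us, how to open an issue, and more
Walkthrough: how Trax works, how to make new models and train them on your own data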

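We welcome contributions to Trax! We welcome PRs with code for new models and layers as well as improvements to our code and documentation. We especially love notebooks that explain how models work and show how to use them to solve problems!

Here are a few example notebooks:

trax.data API explained: explains some of the major functions in the trax.data API
Named Entity Recognition using Reformer: uses a Kaggle dataset to implement Named Entity Recognition with the Reformer architecture
Deep N-Gram models: implementation of deep n-gram models trained on Shakespeare's works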

General Setup

Execute the following cell (once) before running any of the code samples.

import os
import numpy as np

!pip install -q -U trax
import trax

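1. Run a pre-trained Transformer

Here is how you create an English-German translator in a few lines of code:

Create a Transformer model in Trax with trax.models.Transformer
Initialize it from a file with pre-trained weights using model.init_from_file
Tokenize your input sentence for the model with trax.data.tokenize
Decode from the Transformer with trax.supervised.decoding.autoregressive_sample
De-tokenize the decoded result to get the translation with trax.data.detokenize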

# Create a Transformer model.
# Pre-trained model config in gs://trax-ml/models/translation/ende_wmt32k.gin
model = trax.models.Transformer(
    input_vocab_size=33300,
    d_model=512, d_ff=2048,
    n_heads=8, n_encoder_layers=6, n_decoder_layers=6,
    max_len=2048, mode='predict')

# Initialize using pre-trained weights.
model.init_from_file('gs://trax-ml/models/translation/ende_wmt32k.pkl.gz',
                     weights_only=True)

# Tokenize a sentence.
sentence = 'It is nice to learn new things today!'
tokenized = list(trax.data.tokenize(iter([sentence]),  # Operates on streams.
                                    vocab_dir='gs://trax-ml/vocabs/',
                                    vocab_file='ende_32k.subword'))[0]

# Decode from the Transformer.
tokenized = tokenized[None, :]  # Add batch dimension.
tokenized_translation = trax.supervised.decoding.autoregressive_sample(
    model, tokenized, temperature=0.0)  # Higher temperature: more diverse results.

# De-tokenize,
tokenized_translation = tokenized_translation[0][:-1]  # Remove batch and EOS.
translation = trax.data.detokenize(tokenized_translation,
                                   vocab_dir='gs://trax-ml/vocabs/',
                                   vocab_file='ende_32k.subword')
print(translation)

 Es ist schön, heute neue Dinge zu lernen!
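
The temperature argument in the decoding call controls how much randomness goes into picking each output token. The following is a minimal NumPy sketch of the general idea, not Trax's own implementation; the function name sample_with_temperature and the example logits are made up for illustration. It assumes that a temperature of 0.0 means greedy (argmax) decoding, which is the usual convention.

```python
import numpy as np

def sample_with_temperature(logits, temperature, rng):
    """Pick one token ID from unnormalized log-probabilities (hypothetical helper)."""
    logits = np.asarray(logits, dtype=np.float64)
    if temperature == 0.0:
        # Greedy decoding: always take the most likely token.
        return int(np.argmax(logits))
    # Dividing by the temperature flattens (T > 1) or sharpens (T < 1) the distribution.
    scaled = logits / temperature
    scaled -= scaled.max()  # Subtract the max for numerical stability.
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return int(rng.choice(len(probs), p=probs))

rng = np.random.default_rng(0)
logits = [2.0, 1.0, 0.1]  # Illustrative scores for a 3-token vocabulary.
greedy = sample_with_temperature(logits, 0.0, rng)
diverse = {sample_with_temperature(logits, 5.0, rng) for _ in range(200)}
print(greedy, sorted(diverse))
```

With temperature 0.0 the same token is chosen every time, while a high temperature spreads the choices over the whole vocabulary.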

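2. Features and resources

Trax includes basic models (like ResNet, LSTM, Transformer) and RL algorithms (like REINFORCE, A2C, PPO). It is also actively used for research and includes new models like the Reformer and new RL algorithms like AWR. Trax has bindings to a large number of deep learning datasets, including Tensor2Tensor and TensorFlow Datasets.

You can use Trax either as a library from your own Python scripts and notebooks or as a binary from the shell, which can be more convenient for training large models. It runs without any changes on CPUs, GPUs and TPUs.

API docs
Chat with us
Open an issue
Subscribe to trax-discuss for news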

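3. Walkthrough

You can learn here how Trax works, how to create new models and how to train them on your own data.

Tensors and Fast Math

The basic units flowing through Trax models are tensors: multi-dimensional arrays, sometimes also called numpy arrays after numpy, the most widely used package for tensor operations. You should take a look at the numpy guide if you don't know how to operate on tensors, because Trax also uses the numpy API.

In Trax we want numpy operations to run very fast, making use of GPUs and TPUs to accelerate them. We also want to automatically compute gradients of functions on tensors. This is done in the trax.fastmath package thanks to its backends, JAX and TensorFlow numpy.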

from trax.fastmath import numpy as fastnp
trax.fastmath.use_backend('jax')  # Can be 'jax' or 'tensorflow-numpy'.

matrix = fastnp.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f'matrix = \n{matrix}')
vector = fastnp.ones(3)
print(f'vector = {vector}')
product = fastnp.dot(vector, matrix)
print(f'product = {product}')
tanh = fastnp.tanh(product)
print(f'tanh(product) = {tanh}')

 matrix = 
[[1 2 3]
 [4 5 6]
 [7 8 9]]
vector = [1. 1. 1.]
product = [12. 15. 18.]
tanh(product) = [0.99999994 0.99999994 0.99999994]

Gradients can be calculated using trax.fastmath.grad.

def f(x):
  return 2.0 * x * x

grad_f = trax.fastmath.grad(f)

print(f'grad(2x^2) at 1 = {grad_f(1.0)}')

 grad(2x^2) at 1 = 4.0
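
As a sanity check that does not depend on Trax, the same derivative can be approximated with a central finite difference. This is only a sketch for comparison; numerical_grad is a hypothetical helper, and automatic differentiation as used by trax.fastmath.grad works differently (it computes the exact derivative rather than an approximation).

```python
def f(x):
    return 2.0 * x * x

def numerical_grad(fn, x, eps=1e-6):
    """Central finite-difference estimate of d fn / d x at x (hypothetical helper)."""
    return (fn(x + eps) - fn(x - eps)) / (2.0 * eps)

approx = numerical_grad(f, 1.0)
print(f'numerical grad(2x^2) at 1 = {approx:.4f}')
```

Analytically, d/dx (2x^2) = 4x, so the value at x = 1 should agree with the 4.0 reported above up to floating-point error.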

Layers

Layers are the basic building blocks of Trax models. You will learn all about them in the layers intro, but for now just take a look at the implementation of one core Trax layer, Embedding:

class Embedding(base.Layer):
  """Trainable layer that maps discrete tokens/IDs to vectors."""

  def __init__(self,
               vocab_size,
               d_feature,
               kernel_initializer=init.RandomNormalInitializer(1.0)):
    """Returns an embedding layer with given vocabulary size and vector size.

    Args:
      vocab_size: Size of the input vocabulary. The layer will assign a unique
          vector to each ID in `range(vocab_size)`.
      d_feature: Dimensionality/depth of the output vectors.
      kernel_initializer: Function that creates (random) initial vectors for
          the embedding.
    """
    super().__init__(name=f'Embedding_{vocab_size}_{d_feature}')
    self._d_feature = d_feature  # feature dimensionality
    self._vocab_size = vocab_size
    self._kernel_initializer = kernel_initializer

  def forward(self, x):
    """Returns embedding vectors corresponding to input token IDs.

    Args:
      x: Tensor of token IDs.

    Returns:
      Tensor of embedding vectors.
    """
    return jnp.take(self.weights, x, axis=0, mode='clip')

  def init_weights_and_state(self, input_signature):
    """Returns tensor of newly initialized embedding vectors."""
    del input_signature
    shape_w = (self._vocab_size, self._d_feature)
    w = self._kernel_initializer(shape_w, self.rng)
    self.weights = w

Layers with trainable weights like Embedding need to be initialized with the signature (shape and dtype) of the input, and then can be run by calling them.

from trax import layers as tl

# Create an input tensor x.
x = np.arange(15)
print(f'x = {x}')

# Create the embedding layer.
embedding = tl.Embedding(vocab_size=20, d_feature=32)
embedding.init(trax.shapes.signature(x))

# Run the layer -- y = embedding(x).
y = embedding(x)
print(f'shape of y = {y.shape}')

 x = [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
shape of y = (15, 32)
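
The lookup in Embedding.forward is a row selection from the weight matrix. The sketch below reproduces it with plain NumPy instead of the Trax backend, using random numbers as a stand-in for real weights, mainly to show what mode='clip' does with token IDs outside range(vocab_size).

```python
import numpy as np

vocab_size, d_feature = 20, 32
rng = np.random.default_rng(0)
weights = rng.normal(size=(vocab_size, d_feature))  # Stand-in for initialized weights.

x = np.arange(15)
y = np.take(weights, x, axis=0, mode='clip')  # One row of weights per token ID.
print(y.shape)

# With mode='clip', an out-of-range ID is clamped to the last valid row
# instead of raising an error.
out_of_range = np.take(weights, np.array([25]), axis=0, mode='clip')
clipped_to_last = np.array_equal(out_of_range[0], weights[vocab_size - 1])
print(clipped_to_last)
```

One consequence of clipping is that an ID that is too large silently maps to the last embedding vector, so it is worth checking that vocab_size matches the tokenizer.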

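Models

Models in Trax are built from layers, most often using the Serial and Branch combinators. You can read more about those combinators in the layers intro and see the code for many models in trax/models/, e.g., the Transformer language model. Below is an example of how to build a sentiment classification model.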

model = tl.Serial(
    tl.Embedding(vocab_size=8192, d_feature=256),
    tl.Mean(axis=1),  # Average on axis 1 (length of sentence).
    tl.Dense(2),      # Classify 2 classes.
    tl.LogSoftmax()   # Produce log-probabilities.
)

# You can print model structure.
print(model)

 Serial[
  Embedding_8192_256
  Mean
  Dense_2
  LogSoftmax
]
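
To make the data flow through these four layers concrete, here is a rough NumPy sketch of the same forward pass. It is not how Trax executes the model: the weights are random stand-ins, and the Dense layer is assumed to compute a matrix product plus a bias.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_feature, n_classes = 8192, 256, 2

# Random stand-ins for the weights that training would learn.
embedding_w = rng.normal(size=(vocab_size, d_feature))
dense_w = rng.normal(size=(d_feature, n_classes)) * 0.01
dense_b = np.zeros(n_classes)

def log_softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # Numerical stability.
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def forward(token_ids):
    h = embedding_w[token_ids]   # (batch, length, d_feature)
    h = h.mean(axis=1)           # (batch, d_feature): average over the sentence.
    h = h @ dense_w + dense_b    # (batch, n_classes)
    return log_softmax(h)        # (batch, n_classes) log-probabilities.

token_ids = rng.integers(0, vocab_size, size=(4, 10))  # 4 sentences of 10 tokens.
log_probs = forward(token_ids)
print(log_probs.shape)
```

Averaging over axis 1 is what lets the model accept sentences of any length: whatever the length, each sentence is reduced to a single d_feature vector before classification.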

Data

To train your model, you need data. In Trax, data streams are represented as Python iterators, so you can call next(data_stream) and get a tuple, e.g., (inputs, targets). Trax allows you to use TensorFlow Datasets easily, and you can also get an iterator from your own text file using the standard open('my_file.txt').

train_stream = trax.data.TFDS('imdb_reviews', keys=('text', 'label'), train=True)()
eval_stream = trax.data.TFDS('imdb_reviews', keys=('text', 'label'), train=False)()
print(next(train_stream))  # See one example.

 (b"This was an absolutely terrible movie. Don't be lured in by Christopher Walken or Michael Ironside. Both are great actors, but this must simply be their worst role in history. Even their great acting could not redeem this movie's ridiculous storyline. This movie is an early nineties US propaganda piece. The most pathetic scenes were those when the Columbian rebels were making their cases for revolutions. Maria Conchita Alonso appeared phony, and her pseudo-love affair with Walken was nothing but a pathetic emotional plug in a movie that was devoid of any real meaning. I am disappointed that there are movies like this, ruining actor's like Christopher Walken's good name. I could barely sit through it.", 0)
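
For your own text file, a plain Python generator is enough to produce a stream of the same (text, label) form. The sketch below assumes a hypothetical tab-separated format with one "label<TAB>text" pair per line; the function name text_file_stream and the file contents are invented for the example. Repeating the file indefinitely is a design choice of this sketch, so that a training loop never runs out of data.

```python
import os
import tempfile

def text_file_stream(path):
    """Yields (text, label) tuples forever from a tab-separated file (hypothetical format)."""
    while True:  # Loop over the file repeatedly so the stream never ends.
        with open(path, encoding='utf-8') as f:
            for line in f:
                label, text = line.rstrip('\n').split('\t', 1)
                yield text, int(label)

# Write a tiny two-line file to demonstrate.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write('1\tA wonderful film.\n0\tI could barely sit through it.\n')

stream = text_file_stream(tmp.name)
first = next(stream)
second = next(stream)
third = next(stream)  # Wraps around to the first line again.
print(first, second)

stream.close()        # Close the generator (and its open file) before cleanup.
os.remove(tmp.name)
```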

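Using the trax.data module you can create input processing pipelines, e.g., to tokenize and shuffle your data. You create data pipelines using trax.data.Serial; they are functions that you apply to streams to create processed streams.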

data_pipeline = trax.data.Serial(
    trax.data.Tokenize(vocab_file='en_8k.subword', keys=[0]),
    trax.data.Shuffle(),
    trax.data.FilterByLength(max_length=2048, length_keys=[0]),
    trax.data.BucketByLength(boundaries=[  32, 128, 512, 2048],
                             batch_sizes=[256,  64,  16,    4, 1],
                             length_keys=[0]),
    trax.data.AddLossWeights()
)
train_batches_stream = data_pipeline(train_stream)
eval_batches_stream = data_pipeline(eval_stream)
example_batch = next(train_batches_stream)
print(f'shapes = {[x.shape for x in example_batch]}')  # Check the shapes.

 shapes = [(4, 1024), (4,), (4,)]
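
The batch size of 4 in the shapes above follows from the bucketing configuration: longer examples go into buckets with smaller batch sizes, which keeps the number of tokens per batch roughly bounded. The sketch below shows one way to pick the bucket. It assumes an example lands in the first bucket whose boundary is at least its length; the exact boundary handling inside trax.data.BucketByLength may differ, and bucket_for_length is a hypothetical helper.

```python
import bisect

boundaries = [32, 128, 512, 2048]
batch_sizes = [256, 64, 16, 4, 1]  # One more entry than boundaries: the overflow bucket.

def bucket_for_length(length):
    """Returns (bucket index, batch size) for an example of the given length."""
    idx = bisect.bisect_left(boundaries, length)
    return idx, batch_sizes[idx]

for length in (20, 100, 400, 1024):
    print(length, bucket_for_length(length))
```

Under this assumption, an example of length 1024 falls into the bucket bounded by 2048 and is batched in groups of 4. The final batch size of 1 applies only to examples longer than the last boundary, which the preceding FilterByLength step should mostly remove.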

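Supervised training

When you have the model and the data, use trax.supervised.training to define training and eval tasks and create a training loop. The Trax training loop optimizes training and will create TensorBoard logs and model checkpoints for you.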

from trax.supervised import training

# Training task.
train_task = training.TrainTask(
    labeled_data=train_batches_stream,
    loss_layer=tl.WeightedCategoryCrossEntropy(),
    optimizer=trax.optimizers.Adam(0.01),
    n_steps_per_checkpoint=500,
)

# Evaluation task.
eval_task = training.EvalTask(
    labeled_data=eval_batches_stream,
    metrics=[tl.WeightedCategoryCrossEntropy(), tl.WeightedCategoryAccuracy()],
    n_eval_batches=20  # For less variance in eval numbers.
)

# Training loop saves checkpoints to output_dir.
output_dir = os.path.expanduser('~/output_dir/')
!rm -rf {output_dir}
training_loop = training.Loop(model,
                              train_task,
                              eval_tasks=[eval_task],
                              output_dir=output_dir)

# Run 2000 steps (batches).
training_loop.run(2000)

 Step      1: Ran 1 train steps in 0.78 secs
Step      1: train WeightedCategoryCrossEntropy |  1.33800304
Step      1: eval  WeightedCategoryCrossEntropy |  0.71843582
Step      1: eval      WeightedCategoryAccuracy |  0.56562500

Step    500: Ran 499 train steps in 5.77 secs
Step    500: train WeightedCategoryCrossEntropy |  0.62914723
Step    500: eval  WeightedCategoryCrossEntropy |  0.49253047
Step    500: eval      WeightedCategoryAccuracy |  0.74062500

Step   1000: Ran 500 train steps in 5.03 secs
Step   1000: train WeightedCategoryCrossEntropy |  0.42949259
Step   1000: eval  WeightedCategoryCrossEntropy |  0.35451687
Step   1000: eval      WeightedCategoryAccuracy |  0.83750000

Step   1500: Ran 500 train steps in 4.80 secs
Step   1500: train WeightedCategoryCrossEntropy |  0.41843575
Step   1500: eval  WeightedCategoryCrossEntropy |  0.35207348
Step   1500: eval      WeightedCategoryAccuracy |  0.82109375

Step   2000: Ran 500 train steps in 5.35 secs
Step   2000: train WeightedCategoryCrossEntropy |  0.38129005
Step   2000: eval  WeightedCategoryCrossEntropy |  0.33760912
Step   2000: eval      WeightedCategoryAccuracy |  0.85312500
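
The two metrics in the log can be understood as weighted averages over a batch. The following NumPy sketch shows that interpretation on three made-up examples; it is an approximation for intuition, and the actual Trax layers WeightedCategoryCrossEntropy and WeightedCategoryAccuracy may differ in details. The helper names and the numbers are invented for the example.

```python
import numpy as np

def weighted_cross_entropy(log_probs, targets, weights):
    """Weighted mean of -log p(target) over the batch (hypothetical helper)."""
    picked = log_probs[np.arange(len(targets)), targets]
    return float(-(picked * weights).sum() / weights.sum())

def weighted_accuracy(log_probs, targets, weights):
    """Weighted fraction of examples whose argmax matches the target."""
    correct = (log_probs.argmax(axis=-1) == targets).astype(np.float64)
    return float((correct * weights).sum() / weights.sum())

# Three illustrative examples with two classes each.
log_probs = np.log(np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]))
targets = np.array([0, 1, 1])
weights = np.ones(3)  # Every example counts equally.

ce = weighted_cross_entropy(log_probs, targets, weights)
acc = weighted_accuracy(log_probs, targets, weights)
print(ce, acc)
```

Setting a weight to 0 removes that example from both averages, which is how weights can be used to ignore padding or unwanted examples.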

After training the model, run it like any layer to get results.

example_input = next(eval_batches_stream)[0][0]
example_input_str = trax.data.detokenize(example_input, vocab_file='en_8k.subword')
print(f'example input_str: {example_input_str}')
sentiment_log_probs = model(example_input[None, :])  # Add batch dimension.
print(f'Model returned sentiment probabilities: {np.exp(sentiment_log_probs)}')

 example input_str: I first saw this when I was a teen in my last year of Junior High. I was riveted to it! I loved the special effects, the fantastic places and the trial-aspect and flashback method of telling the story.<br /><br />Several years later I read the book and while it was interesting and I could definitely see what Swift was trying to say, I think that while it's not as perfect as the book for social commentary, as a story the movie is better. It makes more sense to have it be one long adventure than having Gulliver return after each voyage and making a profit by selling the tiny Lilliput sheep or whatever.<br /><br />It's much more arresting when everyone thinks he's crazy and the sheep DO make a cameo anyway. As a side note, when I saw Laputa I was stunned. It looks very much like the Kingdom of Zeal from the Chrono Trigger video game (1995) that also made me like this mini-series even more.<br /><br />I saw it again about 4 years ago, and realized that I still enjoyed it just as much. Really high quality stuff and began an excellent run of Sweeps mini-series for NBC who followed it up with the solid Merlin and interesting Alice in Wonderland.<pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad>
Model returned sentiment probabilities: [[3.984500e-04 9.996014e-01]]
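
To turn the returned probabilities into a readable prediction, take the argmax over the class axis. The sketch below starts from the probabilities printed above rather than from a live model, and it assumes that label 0 means negative and label 1 means positive, which is consistent with the negative review labeled 0 in the data example earlier.

```python
import numpy as np

# Log-probabilities corresponding to the probabilities printed above.
sentiment_log_probs = np.log(np.array([[3.984500e-04, 9.996014e-01]]))

label_names = ['negative', 'positive']  # Assumed mapping for label IDs 0 and 1.
probs = np.exp(sentiment_log_probs)
predicted = int(probs.argmax(axis=-1)[0])
print(f'{label_names[predicted]} ({probs[0, predicted]:.4f})')
```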
