Trax - Deep Learning with Clear Code and Speed

Trax is an end-to-end library for deep learning that focuses on clear code and speed. It is actively used and maintained in the Google Brain team. This notebook (run it in Colab) shows how to use Trax and where you can find more information.

Run a pre-trained Transformer: create a translator in a few lines of code
Features and resources: API docs, where to talk to us, how to open an issue and more
Walkthrough: how Trax works, how to make new models and train on your own data

We welcome contributions to Trax! We welcome PRs with code for new models and layers, as well as improvements to our code and documentation. We especially love notebooks that explain how models work and show how to use them to solve problems!

Here are a few example notebooks:

trax.data API explained: explains some of the major functions in the trax.data API
Named Entity Recognition using Reformer: uses a Kaggle dataset to implement Named Entity Recognition using the Reformer architecture
Deep N-Gram models: implementation of deep n-gram models trained on Shakespeare's works

General Setup

Execute the following cell (once) before running any of the code samples.

import os
import numpy as np

!pip install -q -U trax
import trax

1. Run a pre-trained Transformer

Here is how you create an English-to-German translator in a few lines of code:

Create a Transformer model in Trax with trax.models.Transformer
Initialize it from a file with pre-trained weights with model.init_from_file
Tokenize your input sentence to feed into the model with trax.data.tokenize
Decode from the Transformer with trax.supervised.decoding.autoregressive_sample
De-tokenize the decoded result to get the translation with trax.data.detokenize

# Create a Transformer model.
# Pre-trained model config in gs://trax-ml/models/translation/ende_wmt32k.gin
model = trax.models.Transformer(
    input_vocab_size=33300,
    d_model=512, d_ff=2048,
    n_heads=8, n_encoder_layers=6, n_decoder_layers=6,
    max_len=2048, mode='predict')

# Initialize using pre-trained weights.
model.init_from_file('gs://trax-ml/models/translation/ende_wmt32k.pkl.gz',
                     weights_only=True)

# Tokenize a sentence.
sentence = 'It is nice to learn new things today!'
tokenized = list(trax.data.tokenize(iter([sentence]),  # Operates on streams.
                                    vocab_dir='gs://trax-ml/vocabs/',
                                    vocab_file='ende_32k.subword'))[0]

# Decode from the Transformer.
tokenized = tokenized[None, :]  # Add batch dimension.
tokenized_translation = trax.supervised.decoding.autoregressive_sample(
    model, tokenized, temperature=0.0)  # Higher temperature: more diverse results.

# De-tokenize.
tokenized_translation = tokenized_translation[0][:-1]  # Remove batch and EOS.
translation = trax.data.detokenize(tokenized_translation,
                                   vocab_dir='gs://trax-ml/vocabs/',
                                   vocab_file='ende_32k.subword')
print(translation)

 Es ist schön, heute neue Dinge zu lernen!
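
The temperature argument in the decoding step controls how random each sampled token is. The following is a minimal NumPy sketch of temperature sampling in general, not Trax's own implementation. The helper name sample_with_temperature is hypothetical, and treating a temperature of 0 as greedy argmax is an assumption based on the common convention.

```python
import numpy as np

def sample_with_temperature(logits, temperature, rng):
  """Picks one token ID from unnormalized scores (hypothetical helper)."""
  logits = np.asarray(logits, dtype=np.float64)
  if temperature == 0.0:
    # Assumed convention: zero temperature means greedy decoding.
    return int(np.argmax(logits))
  scaled = logits / temperature
  scaled -= scaled.max()  # Subtract the max for numerical stability.
  probs = np.exp(scaled) / np.exp(scaled).sum()
  return int(rng.choice(len(probs), p=probs))

rng = np.random.default_rng(0)
logits = [2.0, 1.0, 0.1]
greedy_token = sample_with_temperature(logits, 0.0, rng)
sampled_tokens = [sample_with_temperature(logits, 2.0, rng) for _ in range(5)]
print(greedy_token, sampled_tokens)
```

Dividing the scores by a larger temperature flattens the distribution, so less likely tokens are picked more often.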

2. Features and resources

Trax includes basic models (like ResNet, LSTM, Transformer) and RL algorithms (like REINFORCE, A2C, PPO). It is also actively used for research and includes new models like the Reformer and new RL algorithms like AWR. Trax has bindings to a large number of deep learning datasets, including Tensor2Tensor and TensorFlow datasets.

You can use Trax either as a library from your own Python scripts and notebooks or as a binary from the shell, which can be more convenient for training large models. It runs without any changes on CPUs, GPUs and TPUs.

API docs
Chat with us
Open an issue
Subscribe to trax-discuss for news

3. Walkthrough

You can learn here how Trax works, how to create new models and how to train them on your own data.

Tensors and Fast Math

The basic units flowing through Trax models are tensors: multi-dimensional arrays, sometimes also known as numpy arrays after numpy, the most widely used package for tensor operations. You should take a look at the numpy guide if you don't know how to operate on tensors: Trax also uses the numpy API for that.

In Trax we want numpy operations to run very fast, making use of GPUs and TPUs to accelerate them. We also want to automatically compute gradients of functions on tensors. This is done in the trax.fastmath package thanks to its backends: JAX and TensorFlow numpy.

from trax.fastmath import numpy as fastnp
trax.fastmath.use_backend('jax')  # Can be 'jax' or 'tensorflow-numpy'.

matrix = fastnp.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f'matrix = \n{matrix}')
vector = fastnp.ones(3)
print(f'vector = {vector}')
product = fastnp.dot(vector, matrix)
print(f'product = {product}')
tanh = fastnp.tanh(product)
print(f'tanh(product) = {tanh}')

 matrix = 
[[1 2 3]
 [4 5 6]
 [7 8 9]]
vector = [1. 1. 1.]
product = [12. 15. 18.]
tanh(product) = [0.99999994 0.99999994 0.99999994]

Gradients can be calculated using trax.fastmath.grad.

def f(x):
  return 2.0 * x * x

grad_f = trax.fastmath.grad(f)

print(f'grad(2x^2) at 1 = {grad_f(1.0)}')

 grad(2x^2) at 1 = 4.0
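
The value 4.0 agrees with the analytic derivative of 2x^2, which is 4x. As a sanity check that does not depend on Trax, the sketch below approximates the same derivative with a central finite difference. The helper central_difference and its step size eps are illustrative choices, not part of the library.

```python
def f(x):
  return 2.0 * x * x

def central_difference(fn, x, eps=1e-5):
  """Approximates the derivative of fn at x (hypothetical helper)."""
  return (fn(x + eps) - fn(x - eps)) / (2.0 * eps)

approx_grad = central_difference(f, 1.0)
print(f'finite-difference grad(2x^2) at 1 ~ {approx_grad:.4f}')
```

Automatic differentiation computes the gradient without any step size, whereas a finite difference is only an approximation and is mainly useful for checks like this one.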

Layers

Layers are the basic building blocks of Trax models. You will learn all about them in the layers intro, but for now just take a look at the implementation of one core Trax layer, Embedding:

# Names used by this excerpt, as imported in trax/layers/core.py.
from trax.fastmath import numpy as jnp
from trax.layers import base
from trax.layers import initializers as init


class Embedding(base.Layer):
  """Trainable layer that maps discrete tokens/IDs to vectors."""

  def __init__(self,
               vocab_size,
               d_feature,
               kernel_initializer=init.RandomNormalInitializer(1.0)):
    """Returns an embedding layer with given vocabulary size and vector size.

    Args:
      vocab_size: Size of the input vocabulary. The layer will assign a unique
          vector to each ID in `range(vocab_size)`.
      d_feature: Dimensionality/depth of the output vectors.
      kernel_initializer: Function that creates (random) initial vectors for
          the embedding.
    """
    super().__init__(name=f'Embedding_{vocab_size}_{d_feature}')
    self._d_feature = d_feature  # feature dimensionality
    self._vocab_size = vocab_size
    self._kernel_initializer = kernel_initializer

  def forward(self, x):
    """Returns embedding vectors corresponding to input token IDs.

    Args:
      x: Tensor of token IDs.

    Returns:
      Tensor of embedding vectors.
    """
    return jnp.take(self.weights, x, axis=0, mode='clip')

  def init_weights_and_state(self, input_signature):
    """Returns tensor of newly initialized embedding vectors."""
    del input_signature
    shape_w = (self._vocab_size, self._d_feature)
    w = self._kernel_initializer(shape_w, self.rng)
    self.weights = w

Layers with trainable weights like Embedding need to be initialized with the signature (shape and dtype) of the input, and then can be run by calling them.

from trax import layers as tl

# Create an input tensor x.
x = np.arange(15)
print(f'x = {x}')

# Create the embedding layer.
embedding = tl.Embedding(vocab_size=20, d_feature=32)
embedding.init(trax.shapes.signature(x))

# Run the layer -- y = embedding(x).
y = embedding(x)
print(f'shape of y = {y.shape}')

 x = [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
shape of y = (15, 32)
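
To see why the output has shape (15, 32), it can help to reproduce the lookup in plain NumPy. The sketch below mirrors the take call in the forward method above, with a random matrix standing in for the layer's trained weights. It illustrates the lookup only and is not a replacement for the layer.

```python
import numpy as np

vocab_size, d_feature = 20, 32
rng = np.random.default_rng(0)
# Stand-in for the layer's trainable weights: one row per token ID.
weights = rng.normal(size=(vocab_size, d_feature))

x = np.arange(15)
y = np.take(weights, x, axis=0, mode='clip')
print(f'shape of y = {y.shape}')

# With mode='clip', an out-of-range ID maps to the last row instead of failing.
clipped = np.take(weights, np.array([25]), axis=0, mode='clip')
```

Each of the 15 token IDs selects one 32-dimensional row, which gives the (15, 32) result.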

Models

Models in Trax are built from layers, most often using the Serial and Branch combinators. You can read more about those combinators in the layers intro and see the code for many models in trax/models/, e.g., this is how the Transformer Language Model is implemented. Below is an example of how to build a sentiment classification model.

model = tl.Serial(
    tl.Embedding(vocab_size=8192, d_feature=256),
    tl.Mean(axis=1),  # Average on axis 1 (length of sentence).
    tl.Dense(2),      # Classify 2 classes.
    tl.LogSoftmax()   # Produce log-probabilities.
)

# You can print model structure.
print(model)

 Serial[
  Embedding_8192_256
  Mean
  Dense_2
  LogSoftmax
]
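
The four layers can be traced step by step in plain NumPy to see how the shapes change. This is a sketch under stated assumptions: the weights are random stand-ins, the dense layer is assumed to be an affine map with a bias, and the function name forward is chosen for illustration. It is not how Trax executes the model.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_feature, n_classes = 8192, 256, 2

# Randomly initialized stand-ins for the trainable weights (illustration only).
embedding_w = rng.normal(size=(vocab_size, d_feature))
dense_w = rng.normal(size=(d_feature, n_classes)) * 0.01
dense_b = np.zeros(n_classes)

def forward(token_ids):
  """Maps (batch, length) token IDs to (batch, 2) log-probabilities."""
  embedded = embedding_w[token_ids]        # (batch, length, d_feature)
  averaged = embedded.mean(axis=1)         # (batch, d_feature)
  logits = averaged @ dense_w + dense_b    # (batch, n_classes)
  shifted = logits - logits.max(axis=-1, keepdims=True)
  return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

token_ids = rng.integers(0, vocab_size, size=(4, 10))
log_probs = forward(token_ids)
print(log_probs.shape)
```

Averaging over axis 1 removes the sentence-length dimension, so sentences of any length produce one fixed-size vector per example before classification.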

Data

To train your model, you need data. In Trax, data streams are represented as Python iterators, so you can call next(data_stream) and get a tuple, e.g., (inputs, targets). Trax allows you to use TensorFlow Datasets easily, and you can also get an iterator from your own text file using the standard open('my_file.txt').

train_stream = trax.data.TFDS('imdb_reviews', keys=('text', 'label'), train=True)()
eval_stream = trax.data.TFDS('imdb_reviews', keys=('text', 'label'), train=False)()
print(next(train_stream))  # See one example.

 (b"This was an absolutely terrible movie. Don't be lured in by Christopher Walken or Michael Ironside. Both are great actors, but this must simply be their worst role in history. Even their great acting could not redeem this movie's ridiculous storyline. This movie is an early nineties US propaganda piece. The most pathetic scenes were those when the Columbian rebels were making their cases for revolutions. Maria Conchita Alonso appeared phony, and her pseudo-love affair with Walken was nothing but a pathetic emotional plug in a movie that was devoid of any real meaning. I am disappointed that there are movies like this, ruining actor's like Christopher Walken's good name. I could barely sit through it.", 0)
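
Because a stream is just a Python iterator, your own text file can serve as one. The sketch below uses only the standard library. The generator name text_file_stream and the temporary file are hypothetical, created here so the example is self-contained.

```python
import os
import tempfile

def text_file_stream(path):
  """Yields one stripped line of text at a time (hypothetical helper)."""
  with open(path, encoding='utf-8') as f:
    for line in f:
      yield line.rstrip('\n')

# Write a small throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
  tmp.write('first example\nsecond example\n')
  path = tmp.name

data_stream = text_file_stream(path)
first = next(data_stream)
second = next(data_stream)
print(first)

data_stream.close()  # Closes the underlying file before it is deleted.
os.remove(path)
```

A generator like this reads lazily, so a large file never has to fit in memory at once.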

Using the trax.data module you can create input processing pipelines, e.g., to tokenize and shuffle your data. You create data pipelines using trax.data.Serial; they are functions that you apply to streams to create processed streams.

data_pipeline = trax.data.Serial(
    trax.data.Tokenize(vocab_file='en_8k.subword', keys=[0]),
    trax.data.Shuffle(),
    trax.data.FilterByLength(max_length=2048, length_keys=[0]),
    trax.data.BucketByLength(boundaries=[32, 128, 512, 2048],
                             batch_sizes=[256, 64, 16, 4, 1],
                             length_keys=[0]),
    trax.data.AddLossWeights()
)
train_batches_stream = data_pipeline(train_stream)
eval_batches_stream = data_pipeline(eval_stream)
example_batch = next(train_batches_stream)
print(f'shapes = {[x.shape for x in example_batch]}')  # Check the shapes.

 shapes = [(4, 1024), (4,), (4,)]
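
The batch size of 4 in this example follows from the bucketing settings: longer sequences are grouped into smaller batches. The sketch below shows one way to map a sequence length to a batch size using the same boundaries and batch_sizes. It assumes a length belongs to the first bucket whose boundary it does not exceed; the library's exact boundary handling may differ, and batch_size_for_length is a hypothetical helper.

```python
import bisect

boundaries = [32, 128, 512, 2048]
batch_sizes = [256, 64, 16, 4, 1]  # One more entry than boundaries.

def batch_size_for_length(length):
  """Returns the batch size of the bucket a sequence length falls into.

  Assumption: a length belongs to the first bucket whose boundary it does
  not exceed; anything longer than the last boundary uses the final entry.
  """
  return batch_sizes[bisect.bisect_left(boundaries, length)]

for length in (20, 100, 1024, 3000):
  print(length, batch_size_for_length(length))
```

Under this assumption a review of around 1024 tokens lands in the bucket with batch size 4, which is consistent with the shapes printed above.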

Supervised training

When you have the model and the data, use trax.supervised.training to define training and eval tasks and create a training loop. The Trax training loop optimizes training and will create TensorBoard logs and model checkpoints for you.

from trax.supervised import training

# Training task.
train_task = training.TrainTask(
    labeled_data=train_batches_stream,
    loss_layer=tl.WeightedCategoryCrossEntropy(),
    optimizer=trax.optimizers.Adam(0.01),
    n_steps_per_checkpoint=500,
)

# Evaluation task.
eval_task = training.EvalTask(
    labeled_data=eval_batches_stream,
    metrics=[tl.WeightedCategoryCrossEntropy(), tl.WeightedCategoryAccuracy()],
    n_eval_batches=20  # For less variance in eval numbers.
)

# Training loop saves checkpoints to output_dir.
output_dir = os.path.expanduser('~/output_dir/')
!rm -rf {output_dir}
training_loop = training.Loop(model,
                              train_task,
                              eval_tasks=[eval_task],
                              output_dir=output_dir)

# Run 2000 steps (batches).
training_loop.run(2000)

 Step      1: Ran 1 train steps in 0.78 secs
Step      1: train WeightedCategoryCrossEntropy |  1.33800304
Step      1: eval  WeightedCategoryCrossEntropy |  0.71843582
Step      1: eval      WeightedCategoryAccuracy |  0.56562500

Step    500: Ran 499 train steps in 5.77 secs
Step    500: train WeightedCategoryCrossEntropy |  0.62914723
Step    500: eval  WeightedCategoryCrossEntropy |  0.49253047
Step    500: eval      WeightedCategoryAccuracy |  0.74062500

Step   1000: Ran 500 train steps in 5.03 secs
Step   1000: train WeightedCategoryCrossEntropy |  0.42949259
Step   1000: eval  WeightedCategoryCrossEntropy |  0.35451687
Step   1000: eval      WeightedCategoryAccuracy |  0.83750000

Step   1500: Ran 500 train steps in 4.80 secs
Step   1500: train WeightedCategoryCrossEntropy |  0.41843575
Step   1500: eval  WeightedCategoryCrossEntropy |  0.35207348
Step   1500: eval      WeightedCategoryAccuracy |  0.82109375

Step   2000: Ran 500 train steps in 5.35 secs
Step   2000: train WeightedCategoryCrossEntropy |  0.38129005
Step   2000: eval  WeightedCategoryCrossEntropy |  0.33760912
Step   2000: eval      WeightedCategoryAccuracy |  0.85312500
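
The two metrics in the log can be understood with a small NumPy sketch. It assumes that the weighted cross-entropy is the weighted mean of the negative log-probability of the target class, and that the weighted accuracy is the weighted fraction of correct argmax predictions. The function names and toy numbers are made up for illustration and do not come from the Trax source.

```python
import numpy as np

def weighted_cross_entropy(log_probs, targets, weights):
  """Weighted mean of -log p(target) over a batch (hypothetical helper)."""
  picked = log_probs[np.arange(len(targets)), targets]
  return float(np.sum(-picked * weights) / np.sum(weights))

def weighted_accuracy(log_probs, targets, weights):
  """Weighted fraction of examples whose argmax equals the target."""
  correct = (np.argmax(log_probs, axis=-1) == targets).astype(np.float64)
  return float(np.sum(correct * weights) / np.sum(weights))

log_probs = np.log(np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]))
targets = np.array([0, 1, 1])
weights = np.ones(3)  # Assumed: all examples weighted equally.

loss = weighted_cross_entropy(log_probs, targets, weights)
accuracy = weighted_accuracy(log_probs, targets, weights)
print(f'loss = {loss:.4f}, accuracy = {accuracy:.4f}')
```

In this toy batch the third example is misclassified, so the accuracy is two thirds while the loss is dominated by that example's low probability for its target.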

After training the model, run it like any layer to get results.

example_input = next(eval_batches_stream)[0][0]
example_input_str = trax.data.detokenize(example_input, vocab_file='en_8k.subword')
print(f'example input_str: {example_input_str}')
sentiment_log_probs = model(example_input[None, :])  # Add batch dimension.
print(f'Model returned sentiment probabilities: {np.exp(sentiment_log_probs)}')

 example input_str: I first saw this when I was a teen in my last year of Junior High. I was riveted to it! I loved the special effects, the fantastic places and the trial-aspect and flashback method of telling the story.<br /><br />Several years later I read the book and while it was interesting and I could definitely see what Swift was trying to say, I think that while it's not as perfect as the book for social commentary, as a story the movie is better. It makes more sense to have it be one long adventure than having Gulliver return after each voyage and making a profit by selling the tiny Lilliput sheep or whatever.<br /><br />It's much more arresting when everyone thinks he's crazy and the sheep DO make a cameo anyway. As a side note, when I saw Laputa I was stunned. It looks very much like the Kingdom of Zeal from the Chrono Trigger video game (1995) that also made me like this mini-series even more.<br /><br />I saw it again about 4 years ago, and realized that I still enjoyed it just as much. Really high quality stuff and began an excellent run of Sweeps mini-series for NBC who followed it up with the solid Merlin and interesting Alice in Wonderland.<pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad>
Model returned sentiment probabilities: [[3.984500e-04 9.996014e-01]]
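
To turn these probabilities into a label, take the index of the larger one. The sketch below starts from log-probabilities built from the numbers printed above. The mapping of index 0 to negative and index 1 to positive is an assumption, consistent with the negative review shown earlier carrying label 0.

```python
import numpy as np

# Log-probabilities shaped like the model output above: (batch, 2).
sentiment_log_probs = np.log(np.array([[3.9845e-04, 9.996014e-01]]))

probs = np.exp(sentiment_log_probs)
predicted = np.argmax(probs, axis=-1)
label_names = ['negative', 'positive']  # Assumed: 0 = negative, 1 = positive.
print(label_names[int(predicted[0])])
```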
