January 28, 2019
Posted by Sara Robinson
Have you ever started building an ML model, only to realize you’re not sure which model architecture will yield the best results? Enter the TensorFlow-based AdaNet framework. With AdaNet, you can feed multiple models into AdaNet’s algorithm and it’ll find the optimal combination of all of them as part of the training process. I’ve been playing with it recently and have bee…
AutoEnsembleEstimator. You can build any type of network with AdaNet (images, text, structured data, etc.). For this example, I’ll build a text classification model to predict the author given a few sentences of text they’ve written. In addition to AdaNet, here are the tools we’ll be using to build this model:
import adanet
import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
import os
import urllib.request
from sklearn.preprocessing import LabelEncoder

urllib, convert it to a Pandas DataFrame, shuffle the data, and preview it:

urllib.request.urlretrieve('https://storage.googleapis.com/authors-training-data/data.csv', 'data.csv')
data = pd.read_csv('data.csv')
data = data.sample(frac=1) # Shuffles the data
data.head()

train_size = int(len(data) * .8)
train_text = data['text'][:train_size]
train_authors = data['author'][:train_size]
test_text = data['text'][train_size:]
test_authors = data['author'][train_size:]

encoder = LabelEncoder()
encoder.fit_transform(np.array(train_authors))
train_encoded = encoder.transform(train_authors)
test_encoded = encoder.transform(test_authors)
num_classes = len(encoder.classes_)

ndim_embeddings = hub.text_embedding_column(
  "ndim",
  module_spec="https://tfhub.dev/google/nnlm-en-dim128/1", trainable=False 
)
encoder_embeddings = hub.text_embedding_column(
  "encoder", 
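  module_spec="https://tfhub.dev/google/universal-sentence-encoder/2", trainable=False)

DNNEstimator for both:

# Note: multi_class_head is defined further down, alongside the AutoEnsembleEstimator; define it before running this snippet.
estimator_ndim = tf.contrib.estimator.DNNEstimator(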
  head=multi_class_head,
  hidden_units=[64,10],
  feature_columns=[ndim_embeddings]
)
estimator_encoder = tf.contrib.estimator.DNNEstimator(
  head=multi_class_head,
  hidden_units=[64,10],
  feature_columns=[encoder_embeddings]
)

hidden_units tells TensorFlow the number of neurons our network will have in each layer. For each of these, it’ll have 64 in the first layer and 10 in the second. feature_columns is a list of the features for our model. In this example we have only one (the sentence of the book).

AutoEnsembleEstimator which makes this pretty simple. It will take both estimators I’ve created, and incrementally create an ensemble by averaging the predictions of each model. For more customization, check out the adanet.subnetwork Builder and Generator classes. With AutoEnsembleEstimator, we can feed both of the models we’ve defined above into the ensemble in the candidate_pool param:

model_dir=os.path.join('/path/to/model/dir')
multi_class_head = tf.contrib.estimator.multi_class_head(
  len(encoder.classes_),
  loss_reduction=tf.losses.Reduction.SUM_OVER_BATCH_SIZE
)
estimator = adanet.AutoEnsembleEstimator(
    head=multi_class_head,
    candidate_pool=[
        estimator_ndim,
        estimator_encoder
    ],
    config=tf.estimator.RunConfig(
      save_summary_steps=1000,
      save_checkpoints_steps=1000,
      model_dir=model_dir
    ),
    max_iteration_steps=5000
)

head is an instance of tf.contrib.estimator.Head, and it tells our model how to compute loss and evaluation metrics for each possible ensemble. AdaNet calls these potential ensemble networks “candidates”. There are many different types of heads (for regression, multi-class classification, etc.). Here we’re using the multi_class_head since there are more than 2 possible label classes in our model. For a model assigning multiple labels to one particular input, we’d use multi_label_head.

config sets up some parameters for running our training job: how often we want to save model summaries and checkpoints, and the directory where TF should save them to. Keep in mind that if you’re training a model in Colab, saving checkpoints too frequently could eat up your available disk space.

max_iteration_steps tells AdaNet how many training steps to perform in a single iteration. An iteration refers to training for a group of candidates, so this number (along with total training steps which we’ll define later) tells AdaNet how often to generate new ensemble candidates.

train_and_evaluate function for this, which will run training and evaluation at the same time. In order to set this up, we need to write our training and evaluation input functions. Input functions handle feeding the data into our model. We’ll use the tf.data API in our input functions. Even though we have two separate models with different feature columns, we can put both features in the same dict so we only need to write one input function:

train_features = {
  "ndim": train_text,
  "encoder": train_text
}
def input_fn_train():
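  # Use the integer-encoded labels: multi_class_head expects integer class ids unless a label_vocabulary is given.
  dataset = tf.data.Dataset.from_tensor_slices((train_features, train_encoded))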
  dataset = dataset.repeat().shuffle(100).batch(64)
  iterator = dataset.make_one_shot_iterator()
  data, labels = iterator.get_next()
  return data, labels

eval_features = {
  "ndim": test_text,
  "encoder": test_text
}
def input_fn_eval():
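  # Use the integer-encoded labels: multi_class_head expects integer class ids unless a label_vocabulary is given.
  dataset = tf.data.Dataset.from_tensor_slices((eval_features, test_encoded))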
  dataset = dataset.batch(64)
  iterator = dataset.make_one_shot_iterator()
  data, labels = iterator.get_next()
  return data, labels

train_spec = tf.estimator.TrainSpec(
  input_fn=input_fn_train,
  max_steps=40000
)
eval_spec=tf.estimator.EvalSpec(
  input_fn=input_fn_eval,
  steps=None,
  start_delay_secs=10,
  throttle_secs=10
)

max_iteration_steps above? The max_steps parameter in our TrainSpec refers to the total number of steps to train for. This means we’ll have 8 iterations total (8 groups of ensemble candidates).

Now it’s time to run training and evaluation:

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

setup.py
config.yaml
trainer/
  model.py
  __init__.py

trainer directory anything you like: this is the name of the Python package we’ll be uploading to ML Engine with our model. __init__.py is an empty file, and model.py contains all of the code above. setup.py contains the name and version of our package, along with any Python package dependencies we’re using to create the model.

config.yaml is where you specify any Cloud-specific parameters for training. These are things like whether you’ll make use of GPUs or TPUs for training, and how many workers you’ll need for your training job. All of the configuration options are listed here.
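If you want to set up that layout quickly, a small script can create the empty files for you. This is only a sketch: the `scaffold_package` helper is a name made up for this example, it writes into a temporary directory, and the files are left empty for you to fill in.

```python
import tempfile
from pathlib import Path


def scaffold_package(root, package_name="trainer"):
    """Create the empty file layout described above under `root`."""
    root = Path(root)
    package_dir = root / package_name
    package_dir.mkdir(parents=True, exist_ok=True)
    files = [
        root / "setup.py",            # package name, version, dependencies
        root / "config.yaml",         # Cloud-specific training parameters
        package_dir / "__init__.py",  # empty marker file
        package_dir / "model.py",     # the model code from this post
    ]
    for path in files:
        path.touch(exist_ok=True)
    return files


# Hypothetical usage: scaffold into a throwaway temp directory and list the result.
created = scaffold_package(tempfile.mkdtemp())
for path in created:
    print(path)
```

Passing a different `package_name` renames the package directory, which matches the point that the directory name is up to you.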
model.py file mentioned above to export your model to Cloud Storage when it’s done training. If you don’t care about this right now, you can skip to the next section.

We’ll export our model using the LatestExporter class. To create an exporter, we’ll need to define a serving input function. This confused me at first, but it’s not too different from the other input functions we defined. It should return two things: the format of inputs our model should expect when it’s served, and the format of inputs the server should expect. In our model these are the same, but in some cases you may want to do some preprocessing on inputs before they’re fed into the model. Because ours are the same, the serving input function is pretty straightforward:
def serving_input_fn():
    feature_placeholders = {
      'ndim' : tf.placeholder(tf.string, [None]),
      'encoder' : tf.placeholder(tf.string, [None])
    }
    return tf.estimator.export.ServingInputReceiver(feature_placeholders, feature_placeholders)

exporter = tf.estimator.LatestExporter('exporter', serving_input_fn, exports_to_keep=None)

export(), we’ll also need our model’s last checkpoint and the eval results from that checkpoint. We can get those with the following:

latest_ckpt = tf.train.latest_checkpoint(model_dir)
last_eval = estimator.evaluate(
    input_fn_eval,
    checkpoint_path=latest_ckpt
)
exporter.export(estimator, model_dir, latest_ckpt, last_eval, is_the_final_export=True)

export JOB_ID=unique_job_name
export JOB_DIR=gs://your/gcs/bucket/path
export PACKAGE_PATH=trainer/
export MODULE=trainer.model
export REGION=your_cloud_project_region

gcloud ml-engine jobs submit training $JOB_ID --package-path $PACKAGE_PATH --module-name $MODULE --job-dir $JOB_DIR --region $REGION --runtime-version 1.12 --python-version 3.5 --config config.yaml
Run the following command to point TensorBoard to your log directory on Cloud Storage:
tensorboard --logdir gs://your/gcs/checkpoint/path

localhost:6006 to view training progress, and navigate to the scalars tab:

Confessions: I had avoided using TensorBoard until now (so many graphs can be intimidating!). But as you’ll soon see, TensorBoard makes it much easier to understand how your model is performing and it’s especially useful for AdaNet. We’ll focus only on the accuracy and adanet_loss graphs here. Let’s start with accuracy, looking at the adanet_weighted_ensemble graph:

Remember that our model has 5000 steps per iteration, meaning every 5000 steps AdaNet will generate new candidate ensembles (with the exception of the first iteration, which includes only the individual networks). If you hover over the graph you can see which iteration and ensemble each line refers to:

We can see that at this point in training, the second ensemble from iteration 7 (t6_DNNEstimator1/eval) has the best accuracy. TensorBoard really shows us the power of combining models with AdaNet: as training continues, ensemble accuracy improves and is much higher than the accuracy of the individual networks on their own (the pink and light blue lines on the left in the graph above).

The loss (or error) graph reveals similar trends: error steadily decreases as AdaNet generates and trains new ensembles.
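The step arithmetic behind these graphs can be sketched in a few lines of plain Python. This is only an illustration: the helper names are made up for this example and are not part of AdaNet, and the zero-based iteration index is an assumption inferred from the t6 prefix on what the text calls iteration 7.

```python
# Illustrative helpers only; these names are not part of AdaNet or TensorFlow.

def num_iterations(max_steps, max_iteration_steps):
    """Iterations that fit in the total step budget, counting a final partial one."""
    return -(-max_steps // max_iteration_steps)  # ceiling division


def iteration_index(global_step, max_iteration_steps):
    """Zero-based iteration a global step falls in (assumed naming convention)."""
    return global_step // max_iteration_steps


# Values from this post: 40000 total steps, 5000 steps per iteration.
print(num_iterations(max_steps=40000, max_iteration_steps=5000))  # 8

# Step 32000 falls in the seventh iteration, which has index 6.
print(iteration_index(32000, 5000))  # 6
```

Lowering max_iteration_steps while keeping max_steps fixed gives AdaNet more, shorter iterations, and therefore more rounds of candidate ensembles.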
 
If you’d like to serve your model on ML Engine (I’ll cover that in a follow-up post), you can point ML Engine to this bucket following the deploy steps here. You can also download these files locally and serve the model however you’d like.

Because it would be sad to leave you hanging without doing any predictions on our trained model, let’s make use of ML Engine’s local predict to make a local prediction on our trained model from the command line. All we need to do is create a newline-delimited JSON file with an input we want a prediction for, following the same format as our serving input function.
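If you would rather generate that file from Python than write it by hand, a minimal sketch might look like the following. The `write_instances` helper and the temporary output path are assumptions for illustration; the two keys mirror the ndim and encoder features in the serving input function.

```python
import json
import os
import tempfile

# Feature keys expected by the serving input function in this post.
FEATURE_KEYS = ("ndim", "encoder")


def write_instances(path, sentences):
    """Write one JSON object per line, repeating each sentence under every feature key."""
    with open(path, "w", encoding="utf-8") as f:
        for sentence in sentences:
            instance = {key: sentence for key in FEATURE_KEYS}
            f.write(json.dumps(instance) + "\n")


# Hypothetical output location; any writable path works.
out_path = os.path.join(tempfile.mkdtemp(), "test.json")
write_instances(out_path, [
    "A strange land indeed! Could it be one with his native New England?",
])
print(out_path)
```

Each line of the resulting file is one JSON object with both keys set to the same text.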
Here’s an example:

{"encoder": "A strange land indeed! Could it be one with his native New England? Did Congress assemble from the Antipodes?", "ndim": "A strange land indeed! Could it be one with his native New England? Did Congress assemble from the Antipodes?"}

gcloud ml-engine local predict --model-dir=gs://path/to/saved_model.pb --json-instances=path/to/test.json

CLASS_IDS  CLASSES   PROBABILITIES
[1]        [u'1']    [0.0043347785249352455, 0.8382837176322937, 0.12185576558113098, 0.025106186047196388, 0.010419543832540512]

encoder.classes_ above), which is Churchill. That’s correct!

AutoEnsembleEstimator and train it on Cloud ML Engine. Want to learn more about what I covered here? Check out these resources:

tf.train_and_evaluate