GCDEC/Engineering Tensorflow/Notes
From charlesreid1
Engineering TensorFlow Models
This section of the course covers two components:
- Feature engineering
- Creating a data pipeline for feeding data to a machine learning model
Module 4a: Feature Engineering
Basic Feature Engineering
Feature engineering and pre-processing using Cloud ML - ways of making our data set better
At this point: we have a way of submitting models to train them in the cloud, so we can train models faster, but we STILL don't have a model that is better than our heuristic
Still have the original TensorFlow model - improve it using feature engineering and hyperparameter tuning
Good features:
- Related to objective - reduce arbitrary data
- Known a priori
- Numeric
- Enough examples
- Bring human insight (domain expertise) into problem
Features Related To Objective
Related to objective:
- Need a reasonable hypothesis for why it matters
- For a given domain, different problems require different features
Stupid quiz: related or not?
Objective: predict total number of customers who will use a discount coupon. Which of the following features are important?
- Font of the text with which the discount is advertised on partner websites (TRUE)
- Price of the item the coupon applies to (TRUE)
- Number of items in stock (FALSE)
Objective: predict whether a credit-card transaction is fraudulent
- Whether cardholder has purchased items at store before (TRUE)
- Credit card chip reader speed (FALSE)
- Category of item being purchased (TRUE)
- Expiry date of credit card (FALSE)
Values Known A Priori
Mainly important for training on old data/predicting on new data
Suppose your data warehouse holds all sorts of information, and you threw all of it into the model
Sales information might have sales data - but it might be stale.
- Some information known immediately
- Some information is not known at prediction time
- If you train a model on data that you don't have at prediction time, your entire model will be useless
- Ensure every feature/every input will be available AT PREDICTION TIME
- There may be some ethical issues with gathering data IMMEDIATELY from user
Quiz: is the value knowable or not?
Objective: predict total number of customers who will use a discount coupon
- Total number of discountable items sold (DEPENDS - too vague)
- Number of discountable items sold previous month (YES - you will most likely have this data in real time, but it does depend on your system)
- Number of customers who viewed ads about item (YES - but question of time... how long does ad analysis take to get back)
Objective: predict whether credit card transaction is fraudulent
- Whether cardholder has purchased items at this store before (DEPENDS - may take 3-5 days to get transaction data... train with data AS OF 3 DAYS AGO... if stale data is what you have in real time, then stale data is what you need to use to TRAIN your model)
- Whether item is new at store (and cannot have been purchased before) (YES - should know from catalog)
- Category of item being purchased (YES - will definitely know the type of the item)
- Online or in-person purchase (YES - ditto... will def. know type of purchase)
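The "train with data AS OF 3 days ago" point can be sketched in plain Python. This is only an illustration, not code from the course: the transaction tuples, the 3-day `LAG`, and the helper name `purchased_before` are all assumptions made up for the example.

```python
from datetime import datetime, timedelta

# Hypothetical lag: assume transaction records reach the warehouse 3 days late.
LAG = timedelta(days=3)

# Hypothetical history of (card_id, store_id, timestamp) records.
history = [
    ("card-1", "store-A", datetime(2017, 10, 1, 12, 0)),
    ("card-1", "store-B", datetime(2017, 10, 9, 9, 30)),
]

def purchased_before(card_id, store_id, now, lag=LAG):
    """Return 1 if an earlier purchase is visible given the lag, else 0."""
    cutoff = now - lag  # anything newer than this is not known yet
    return int(any(
        c == card_id and s == store_id and t <= cutoff
        for c, s, t in history
    ))

now = datetime(2017, 10, 10, 12, 0)
feature_a = purchased_before("card-1", "store-A", now)  # 9 days old: visible
feature_b = purchased_before("card-1", "store-B", now)  # 1 day old: not visible yet
print(feature_a, feature_b)
```

Using the same `lag` when building training rows and when serving predictions keeps the feature consistent between the two.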
Numerical
NN carries out simple mathematical functions on your inputs, so the inputs need to be numeric. Non-numeric features can be used, but we need a way to convert them to numerical form.
Stupid quiz: which is numeric?
Feature of discount coupon to predict number of coupons that will be used:
- Percent value of discount - TRUE
- Size of coupon (4 cm², 24 cm², 48 cm², etc.) - TRUE (but not really a meaningful magnitude)
- Font of advertisement (Arial 18, Times New Roman 24, etc.) - FALSE (no meaningful magnitude)
- Color of coupon (red, black, blue) - FALSE (no meaningful magnitude)
- Item category (1 for dairy, 2 for deli, 3 for canned goods, etc.) - TRUE (using word2vec, you DO get a meaningful vector... meaningful magnitude)
Example questions: what if you subtract two values (e.g., subtract two colors)? Does that have a representative effect on the prediction?
If you do an arbitrary item categorization, you lose meaning and qualities of words (e.g., male/female, soft/hard, positive/negative, etc.). So, word2vec greatly improves the ability of word inputs to help improve the prediction.
Enough Examples
Each feature needs enough examples to be understandable in context
Rule of thumb - need AT LEAST five examples of a value for it to be usable in a model
Quiz: which is difficult to have enough examples of?
Predicting total number of customers who will use a coupon:
- Percent discount of coupon - DEPENDS (find five examples each, or throw it out; if you have continuous numbers, bin them into discrete groups)
- Date that promotional offer starts - DEPENDS (bin them up again, e.g., promotional offers starting in January or in Q1)
- Number of customers who opened advertising emails - TRUE (should have a number of different emails, and know how many customers opened each)
Predict whether CC transaction is fraudulent:
- Whether cardholder has purchased this item at this store - TRUE (should have this, unless it is too specific, e.g., bought diapers between 8 and 9 pm)
- Distance between cardholder and store - DEPENDS (again, should bin these up; may not have 5 examples of cardholders who bought something from a store more than 100 miles from their house)
- Category of item being purchased - TRUE
- Online or in-person purchase - TRUE
How to check? Plot histograms of data.
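A text-only version of the histogram check, as a rough sketch: bin a continuous feature, count examples per bin, and flag bins that fall below the five-example rule of thumb. The distances and the bin edges are hypothetical values chosen for the example.

```python
from collections import Counter

MIN_EXAMPLES = 5  # rule of thumb from the notes

# Hypothetical distances (miles) between cardholder and store.
distances = [0.5, 1.2, 2.0, 2.5, 3.1, 4.8, 7.0, 8.5, 9.9, 12.0, 150.0]

def to_bin(miles):
    """Map a continuous distance onto a coarse, hypothetical bin label."""
    if miles < 5:
        return "0-5"
    if miles < 100:
        return "5-100"
    return "100+"

counts = Counter(to_bin(d) for d in distances)
too_rare = sorted(b for b, n in counts.items() if n < MIN_EXAMPLES)
print(counts)
print(too_rare)
```

Bins that show up in `too_rare` are candidates for merging with a neighbor or for being dropped.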
Turning Raw Data into Numeric Features
Example: running ice cream store, want to predict the rating a customer will give based on how long they've been waiting and what they bought
Raw data to TensorFlow feature column.
Raw data:
{ "transactionId" : 42, "name" : "Ice Cream", "price" : 2.50, "tags" : ["cold", "dessert"], "servedBy" : { "employeeId" : 45042, "waitTime" : 1.4, "customerRating" : 4 }, "storeLocation" : { "latitude" : 35.3, "longitude" : -98.7 } }
(This data comes from a web app, goes into a data warehouse, and is pulled out as JSON data)
Creating a feature column
To turn this into a feature column:
- Some fields can be used directly (e.g., customer rating, price, and wait time)
- Others (e.g., transactionId) should be ignored (don't have more than 5 examples)
- Some (e.g., employeeId) should be transformed - no meaningful magnitude... use one hot encoding
INPUT_COLUMNS = [
    ...
    layers.real_valued_column('price'),
    layers.real_valued_column('waitTime'),
    ...
]
This calls the TensorFlow function real_valued_column because these columns are continuous and their magnitude is meaningful.
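Before any feature columns are defined, the raw JSON record has to be parsed and the usable fields pulled out. A minimal sketch using only the standard library and the record shown under "Raw data"; treating `customerRating` as the label and turning `employeeId` into a string key are assumptions based on the stated goal (predict the rating), not code from the course.

```python
import json

raw = '''{ "transactionId" : 42, "name" : "Ice Cream", "price" : 2.50,
  "tags" : ["cold", "dessert"],
  "servedBy" : { "employeeId" : 45042, "waitTime" : 1.4, "customerRating" : 4 },
  "storeLocation" : { "latitude" : 35.3, "longitude" : -98.7 } }'''

record = json.loads(raw)

# Fields whose magnitude is meaningful can be used as-is.
numeric_features = {
    "price": float(record["price"]),
    "waitTime": float(record["servedBy"]["waitTime"]),
}

# Assumed label: the rating we want to predict.
label = record["servedBy"]["customerRating"]

# employeeId is categorical: keep it as a string key, not as a number.
employee_key = str(record["servedBy"]["employeeId"])

# transactionId is deliberately ignored (unique per row, so never enough examples).
print(numeric_features, label, employee_key)
```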
Preprocessing and Data Vocabulary
Preprocessing the data creates a new "vocabulary" of keys - and it needs to be available for BOTH training AND prediction steps (e.g., prediction sends employeeId 75534, and model needs to know how to convert that to a one hot encoding)
Three example scenarios:
First scenario: you already know the keys beforehand (e.g., employeeId and one hot encoding):
layers.sparse_column_with_keys('employeeId', keys=['12345', '48506', '28488', '23456'])
Second scenario: your data is already indexed 0 to N, but does not have a meaningful magnitude (e.g., hour of the day):
layers.sparse_column_with_integerized_feature('employeeId', 5)
Third scenario: you don't have a vocabulary of all possible values:
layers.sparse_column_with_hash_bucket('employeeId', 500) # Hash the employee ID, and break it into 500 buckets
Hash bucket is similar to one hot encoding, but without having to explicitly build the encoding scheme.
All three of these use sparse_column_* methods in TensorFlow, because they create sparse columns (columns of booleans).
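To make the three scenarios concrete, here is a plain-Python sketch of what each encoding does conceptually. It is not the TensorFlow implementation: the function names are made up for the example, and `hashlib.md5` is only a stand-in for whatever hash TensorFlow uses internally.

```python
import hashlib

def one_hot_with_keys(value, keys):
    """Scenario 1: vocabulary known up front; unknown values map to all zeros."""
    return [1 if value == k else 0 for k in keys]

def one_hot_integerized(index, num_buckets):
    """Scenario 2: value is already an integer in [0, num_buckets)."""
    if not 0 <= index < num_buckets:
        raise ValueError("index out of range")
    return [1 if i == index else 0 for i in range(num_buckets)]

def hash_bucket(value, num_buckets):
    """Scenario 3: no vocabulary; hash the string into a fixed number of buckets."""
    digest = hashlib.md5(str(value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets

keys = ['12345', '48506', '28488', '23456']
encoded = one_hot_with_keys('28488', keys)   # known key
unknown = one_hot_with_keys('75534', keys)   # key missing from the vocabulary
hour = one_hot_integerized(3, 5)
bucket = hash_bucket('75534', 500)           # always the same bucket for this ID
print(encoded, unknown, hour, bucket)
```

The hash bucket version never fails on an unseen ID, at the cost of occasionally putting two different IDs in the same bucket.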
Columns leading to choices
Some columns lead you to choices you have to make.
Two questions:
- Question 1 - what to do with customer rating?
- Question 2 - what to do with missing data?
What approach should we use for customer rating?
- You have a choice.
- One hot encoding if you decide 1 and 2 and 3 and 4 are VERY different
- Continuous if you decide sliding scale is okay
What if you have missing data - if customer didn't provide a rating?
- You have options
- Can use a column to indicate whether the customer left a rating (0 or 1), and another column for the rating (0 if no rating)
- Can use one hot encoding (one column would indicate a rating of 4), so customers who don't leave a rating are just 0, 0, 0, 0, 0
- Be careful not to mix "magic (categorical) numbers" with "real (meaningful) numbers"
This also leads us to the difference between statistics and machine learning.
Statistics - imputation refers to the fact that we often fill in missing values with the average of the rest of the values (we want to preserve the information we have about the entire population as much as we can).
Machine learning - we want to separate out the cases where we have data from the cases where we don't have data, and build SEPARATE models (model behaviors) for those two cases.
Machine learning can build separate models for the data/no data case, because we have enough examples that we don't need to try to stretch our existing data set as far as we can. We just train the model to have different behavior in the data/no data case. (The same argument is true of outliers - in statistics, we throw out outliers because they contaminate the data we do have; in machine learning, we leave the outliers in because we have enough data that the outliers form their own separate model behavior.)
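The two options for a missing customer rating can be sketched as follows. The helper names are hypothetical, and the ratings are assumed to run from 1 to 5.

```python
def rating_with_indicator(rating):
    """Option 1: return (has_rating, rating_value); value is 0 when missing."""
    if rating is None:
        return (0, 0)
    return (1, rating)

def rating_one_hot(rating, num_levels=5):
    """Option 2: one column per rating 1..num_levels; missing is all zeros."""
    return [1 if rating == level else 0 for level in range(1, num_levels + 1)]

print(rating_with_indicator(4), rating_with_indicator(None))
print(rating_one_hot(4), rating_one_hot(None))
```

In option 1 the indicator column is what lets the model tell "no rating" apart from a real number; the 0 in the value column is a magic number and should not be read as a rating.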
Creating New Features
What else can we do to go beyond the raw data?
Feature cross
Example to illustrate why feature crossing is important:
- Deciding whether a picture of a vehicle is a taxi
- Input columns: car color, and city
- Output prediction: is it a taxi
- Suppose we use a linear model - one input variable is color, another input variable is city - and output is whether the car is a taxi
- Then if we give it examples from New York, where all taxis are yellow, model will learn that all yellow vehicles are taxis (yellow gets high weight)
- If we train it on data from Rome, where most taxis are white, model will learn that white vehicles are taxis (white gets high weight)
- Linear model cannot "learn" that different cities have different color taxis
Solution:
- One solution is to add more layers - this will "mix" the inputs. But this creates more parameters, more complexity; especially if there are many inputs, many potential variable interactions, but only one or two variable interactions that are actually significant
- Better solution - take a combination of the two and add it as a new column
- If inputs are strings, one-hot-encode (example, Red Rome becomes RR, White Rome becomes WR, Yellow NYC becomes YN, etc.)
- This makes it EASY for the machine learning model to learn that this combination is important
- Use human insight to make it easier for the machine learning model
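A small sketch of the taxi example: cross color and city into a single categorical column, one-hot encode it, and note that a linear model then only needs one weight per combination. The weights below are hand-picked for illustration, not learned.

```python
from itertools import product

colors = ["red", "white", "yellow"]
cities = ["NYC", "Rome"]

# Vocabulary of every color/city combination, e.g. "yellow_x_NYC".
cross_vocab = ["{}_x_{}".format(c, city) for c, city in product(colors, cities)]

def cross_one_hot(color, city):
    """One-hot encode the crossed (color, city) value."""
    key = "{}_x_{}".format(color, city)
    return [1 if key == v else 0 for v in cross_vocab]

# Hypothetical weights a linear model could learn: one per combination.
weights = {v: 0.0 for v in cross_vocab}
weights["yellow_x_NYC"] = 1.0
weights["white_x_Rome"] = 1.0

def taxi_score(color, city):
    """Linear model on the crossed feature: dot product of weights and one-hot."""
    vec = cross_one_hot(color, city)
    return sum(weights[v] * x for v, x in zip(cross_vocab, vec))

print(taxi_score("yellow", "NYC"), taxi_score("yellow", "Rome"), taxi_score("white", "Rome"))
```

With separate color and city columns, a single "yellow" weight would have to serve both cities; with the cross, yellow in NYC and yellow in Rome get independent weights.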
Feature cross with categorical columns
To do this in TensorFlow, create a crossed column with two sparse columns:
day_hr = layers.crossed_column([dayofweek, hourofday], 24*7)
24*7 is the number of buckets (if we choose fewer buckets, we get some grouping)
In the taxi model, this will help us capture "rush hour" (Thursday 5PM is different from Friday 5PM is different from Saturday 5PM)
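The effect of the bucket count can be seen with a small hashing sketch. This assumes the cross is assigned to buckets by hashing the combined value; `hashlib.md5` and the `day_hour_bucket` helper are stand-ins invented for the example, not what TensorFlow does internally.

```python
import hashlib

DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]

def day_hour_bucket(day, hour, num_buckets=24 * 7):
    """Hash the (day, hour) pair into one of num_buckets buckets."""
    key = "{}_x_{}".format(day, hour).encode("utf-8")
    return int(hashlib.md5(key).hexdigest(), 16) % num_buckets

all_pairs = [(d, h) for d in DAYS for h in range(24)]
used_full = {day_hour_bucket(d, h) for d, h in all_pairs}
used_small = {day_hour_bucket(d, h, num_buckets=10) for d, h in all_pairs}

# 168 distinct (day, hour) pairs; the bucket counts show how much grouping occurs.
print(len(all_pairs), len(used_full), len(used_small))
```

With only 10 buckets, many day/hour pairs are forced to share a bucket. With a hashed cross, some sharing can happen even at 24*7 buckets, because two pairs may hash to the same value.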
Feature cross with real valued columns
How to do feature crosses with real-valued columns?
Need to discretize/bucketize floating point values - coarse buckets help prevent overfitting, which can happen when a dimension is discretized too finely
Example: predicting the price of a home in California
- House price vs Latitude: see two spikes (LA and SF)
- If we discretize too much, 34.001 and 34.002 will be considered "different"
- Group everything into bins
In TensorFlow, can bucketize a real valued column (here, latitude) as follows:
latbuckets = np.linspace(32.0, 42.0, nbuckets).tolist()
discrete_lat = layers.bucketized_column(lat, latbuckets)
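What bucketizing does to individual latitudes can be sketched without TensorFlow or NumPy. The `linspace` and `bucketize` helpers below are stand-ins written for this example, and `nbuckets = 11` (one boundary per degree) is an arbitrary choice.

```python
import bisect

def linspace(start, stop, num):
    """Pure-Python stand-in for np.linspace(start, stop, num).tolist()."""
    if num == 1:
        return [float(start)]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]

def bucketize(value, boundaries):
    """Return the index of the bucket that value falls into.

    With n boundaries there are n + 1 buckets: below the first boundary,
    between each consecutive pair, and at or above the last boundary.
    """
    return bisect.bisect_right(boundaries, value)

nbuckets = 11  # hypothetical choice: boundaries every 1 degree
latbuckets = linspace(32.0, 42.0, nbuckets)

# 34.001 and 34.002 land in the same bucket; San Francisco's latitude does not.
print(bucketize(34.001, latbuckets), bucketize(34.002, latbuckets), bucketize(37.77, latbuckets))
```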
Pipeline for Processing ML Data
Here's what the pipeline looks like now:
- Inputs
- Pre-process inputs & create a model vocabulary (scaling, transforms, bucketizing, labeling, categorical features like states/zip codes/employee IDs)
- Feature creation (feature crossing)
- Train model
- TensorFlow model
Model Architectures
Two kinds of features: dense and sparse
Price: represented by just one real-valued column
- Dense feature
Employee ID: if you have N employees, need N-1 columns
- Sparse feature
Why dense features are easier:
- Suppose we are doing image processing - every pixel of the image is a dense feature
- This is easy for a neural network to deal with
- Images are perfect for doing the operations that neural networks are good at - multiplying, adding, crossing, etc.
Why sparse features are harder:
- Sparse features look very different - lots and lots of zeros, most rows are nearly all zeros
- When you add/subtract rows, the result is still going to be a row with almost all zeros
- This is difficult for a neural network to deal with - many weights in the network will have zero impact
- More likely to get stuck in a local region and be unable to get out of it
Sparse features = linear models
- Linear models do very well with sparse features and sparse representations
- More likely for a small number of neurons to get high weights
Observation:
- If we have many dense features (e.g., images), we want to have lots of layers, lots of neurons, lots of hidden layers, lots of dense embeddings, all leading to sparse features
- If we have sparse features, we want WIDER models - that is, models where there are fewer layers between the inputs and the outputs (neural network equivalent of a linear model - single layer of neurons)
Wide models vs. Deep models
- Sparse data requires wide models
- Wide models have fewer layers, behave more like linear models
- Dense data requires deep models
- Deep models have more layers, denser embeddings, more hidden layers
How to mix these?
- Real models have a mixture of both dense and sparse features
- Take your inputs and divide them into dense and sparse
- The dense inputs go into a deep model, the sparse inputs go into a wide model
Wide and Deep Networks in TensorFlow
To have your cake and eat it too, you can use the DNNLinearCombinedClassifier class in TensorFlow:
model = tf.contrib.learn.DNNLinearCombinedClassifier(
    model_dir = ...,
    linear_feature_columns = wide_columns,
    dnn_feature_columns = deep_columns,
    dnn_hidden_units = [100, 50])
Specify the sparse columns as "wide_columns" by passing to "linear_feature_columns" argument.
Specify the dense columns as "deep_columns" by passing to "dnn_feature_columns" argument.
dnn_hidden_units specifies number of layers and number of nodes to use in the network.
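The idea behind combining a wide part and a deep part can be sketched as a tiny forward pass in plain Python. This is a conceptual illustration only, not how DNNLinearCombinedClassifier is implemented: it uses a single hidden layer, hand-picked weights, and returns a raw score rather than a class probability.

```python
def relu(x):
    return x if x > 0.0 else 0.0

def dense_layer(inputs, weights, biases):
    """One fully connected layer with ReLU; weights[j] feeds output unit j."""
    return [
        relu(sum(w * x for w, x in zip(unit_w, inputs)) + b)
        for unit_w, b in zip(weights, biases)
    ]

def wide_and_deep(sparse_x, dense_x, wide_w, hidden_w, hidden_b, out_w, bias):
    """Sum of a linear (wide) part and a one-hidden-layer (deep) part."""
    wide_part = sum(w * x for w, x in zip(wide_w, sparse_x))
    hidden = dense_layer(dense_x, hidden_w, hidden_b)
    deep_part = sum(w * h for w, h in zip(out_w, hidden))
    return wide_part + deep_part + bias

# Hypothetical inputs and weights, purely for illustration.
sparse_x = [0, 1, 0]                    # one-hot employee ID (wide input)
dense_x = [2.5, 1.4]                    # price, waitTime (deep input)
wide_w = [0.1, 0.3, -0.2]
hidden_w = [[1.0, 0.0], [0.0, -1.0]]    # two hidden units
hidden_b = [0.0, 0.0]
out_w = [0.5, 0.5]

score = wide_and_deep(sparse_x, dense_x, wide_w, hidden_w, hidden_b, out_w, bias=0.0)
print(score)
```

The sparse input only ever touches the linear weights, and the dense input only goes through the hidden layer; the two contributions are added at the output.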
Module 4b: Data Pipeline Engineering
Feature Engineering Laboratory
Link to lab: https://codelabs.developers.google.com/codelabs/dataeng-machine-learning/index.html?index=#10
Link to github repo: https://github.com/GoogleCloudPlatform/training-data-analyst
Start by connecting to the existing cloud datalab instance (or create a new one):
datalab connect cloudml
Open the feature engineering notebook at training-data-analyst/courses/machine_learning/feateng
Here is a sample of the data returned by the BigQuery query that we're going to use to train this machine learning model:
fare_amount,dayofweek,hourofday,pickuplon,pickuplat,dropofflon,dropofflat,passengers,key
10.1,1,21,-74.003402,40.749272,-73.963575,40.77455,1.0,2011-03-06 21:01:00.000000-74.003440.749340.7745-73.9636
9.7,6,18,-73.991711,40.764878,-73.966193,40.795124,1.0,2012-04-27 18:37:09.000000-73.991740.764940.7951-73.9662
6.9,7,13,-74.001624,40.730758,-73.992518,40.75371,1.0,2009-07-04 13:00:57.000000-74.001640.730840.7537-73.9925
8.5,7,17,-74.007512,40.741902,-73.989882,40.763757,5.0,2012-11-03 17:11:00.000000-74.007540.741940.7638-73.9899
10.0,7,17,-73.994495,40.72614,-73.969757,40.753692,1.0,2012-11-03 17:11:00.000000-73.994540.726140.7537-73.9698
4.1,6,7,-73.991414,40.744842,-73.985439,40.753518,1.0,2012-04-20 07:57:20.000000-73.991440.744840.7535-73.9854
9.5,5,13,-73.958801,40.764682,-73.956533,40.784182,1.0,2013-02-07 13:47:59.000000-73.958840.764740.7842-73.9565
10.5,4,16,-73.982177,40.765542,-73.960835,40.763187,5.0,2010-09-29 16:43:00.000000-73.982240.765540.7632-73.9608
11.0,4,5,-73.980926,40.738041,-73.960312,40.775551,1.0,2013-09-25 05:52:12.000000-73.980940.73840.7756-73.9603
16.5,1,23,-74.005266,40.722068,-73.953835,40.775138,1.0,2012-09-09 23:40:03.000000-74.005340.722140.7751-73.9538
Notebook outline and steps:
Setup and function definitions:
- Set environment variables (project, bucket, region)
- Set BigQuery query (constrain by latitude/longitude, passenger count, and fare amount)
- Define two functions:
- to_csv(): convert row dictionary to csv
- preprocess(): create a Beam pipeline to read bigquery, turn results into csv, and write the results of the training and validation data to files
Preprocessing:
- Create an Apache Beam pipeline to process the BigQuery data and turn it into CSV data
- Create a Dataflow job to run the pipeline
- Check that the CSV files of data that result are OK
- Examine the TensorFlow model
- Train the model locally (test that training is OK)
- Evaluate the model locally (test that predictions are OK)
- Train the model in the cloud
(Note that there is an entire, separate lab and notebook on hyperparameter tuning that is not even covered...)
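The to_csv() step from the outline can be sketched as below. This is a guess at its shape, not the notebook's code: the column order is taken from the CSV header shown above, and the day-name mapping (1 = Sunday) is inferred from the data, where the raw rows store dayofweek as a number and the preprocessed CSV files shown later store it as a name.

```python
CSV_COLUMNS = ("fare_amount,dayofweek,hourofday,pickuplon,pickuplat,"
               "dropofflon,dropofflat,passengers,key").split(",")

# Assumption: day numbers run 1 = Sunday ... 7 = Saturday; index 0 is unused.
DAYS = ["null", "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]

def to_csv(rowdict):
    """Convert one row dictionary into a single CSV line."""
    row = dict(rowdict)  # copy so the input row is not modified
    row["dayofweek"] = DAYS[row["dayofweek"]]
    return ",".join(str(row[k]) for k in CSV_COLUMNS)

example = {
    "fare_amount": 9.7, "dayofweek": 6, "hourofday": 18,
    "pickuplon": -73.991711, "pickuplat": 40.764878,
    "dropofflon": -73.966193, "dropofflat": 40.795124,
    "passengers": 1.0,
    "key": "2012-04-27 18:37:09.000000-73.991740.764940.7951-73.9662",
}
line = to_csv(example)
print(line)
```

Converting the day number to a name keeps the model from treating day of week as a magnitude.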
Beam Pipeline for Data Preprocessing
Here is the Beam pipeline to preprocess the BigQuery data:
import datetime
import os
import apache_beam as beam

# create_query(), to_csv(), OUTPUT_DIR, and PROJECT come from the notebook's setup cells.
# The signature is inferred from the preprocess(50*1000, 'DirectRunner') call.
def preprocess(EVERY_N, RUNNER):
    options = {
        'staging_location': os.path.join(OUTPUT_DIR, 'tmp', 'staging'),
        'temp_location': os.path.join(OUTPUT_DIR, 'tmp'),
        'job_name': 'preprocess-taxifeatures' + '-' + datetime.datetime.now().strftime('%y%m%d-%H%M%S'),
        'project': PROJECT,
        'teardown_policy': 'TEARDOWN_ALWAYS',
        'no_save_main_session': True
    }
    opts = beam.pipeline.PipelineOptions(flags=[], **options)
    p = beam.Pipeline(RUNNER, options=opts)
    # One read -> to_csv -> write branch each for the training and validation sets
    for n, step in enumerate(['train', 'valid']):
        query = create_query(n+1, EVERY_N)
        outfile = os.path.join(OUTPUT_DIR, '{}.csv'.format(step))
        (
            p
            | 'read_{}'.format(step) >> beam.io.Read(beam.io.BigQuerySource(query=query))
            | 'tocsv_{}'.format(step) >> beam.Map(to_csv)
            | 'write_{}'.format(step) >> beam.io.Write(beam.io.WriteToText(outfile))
        )
    p.run()
This can be run locally, or using Dataflow:
preprocess(50*1000, 'DirectRunner')
preprocess(50*1000, 'DataflowRunner')
When this is run, the resulting output is a set of sharded CSV files:
$ gsutil ls -l gs://<bucket>/taxifare/ch4/taxi_preproc/
2556219  2017-10-22T03:43:41Z  gs://charlesreid1-ml/taxifare/ch4/taxi_preproc/train.csv-00000-of-00001
2419169  2017-10-22T03:43:41Z  gs://charlesreid1-ml/taxifare/ch4/taxi_preproc/valid.csv-00000-of-00001
And here's what these sharded files look like:
$ gsutil cat gs://<bucket>/taxifare/ch4/taxi_preproc/train.csv-* | head -n3
10.1,Fri,14,-73.966973,40.770045,-73.966973,40.770045,2.0,2010-10-08 14:52:00.000000-73.96740.7740.77-73.967
19.0,Wed,15,-73.981985,40.754645,-74.00003,40.714303,2.0,2012-09-05 15:45:00.000000-73.98240.754640.7143-74
12.1,Wed,10,-73.987306,40.75443,-73.999998,40.720361,2.0,2010-10-20 10:49:58.000000-73.987340.754440.7204-74
Local Model Training
Now it is time to train the model. Before using the Cloud ML Engine, we want to test out the model to make sure it works with the given inputs. Thus, we want to start by testing it LOCALLY.
We can do that from the command line using Python and passing the -m flag to point it to a module:
$ export PYTHONPATH=${PYTHONPATH}:${REPO}/courses/machine_learning/feateng/taxifare
$ python -m trainer.task \
    --train_data_paths="${REPO}/courses/machine_learning/feateng/sample/train*" \
    --eval_data_paths=${REPO}/courses/machine_learning/feateng/sample/valid.csv \
    --output_dir=${REPO}/courses/machine_learning/feateng/taxi_trained \
    --num_epochs=100 --job-dir=/tmp
This generates LOTS of output:
WARNING:tensorflow:The default stddev value of initializer will change from "1/sqrt(vocab_size)" to "1/sqrt(dimension)" after 2017/02/25. WARNING:tensorflow:The default stddev value of initializer will change from "1/sqrt(vocab_size)" to "1/sqrt(dimension)" after 2017/02/25. INFO:tensorflow:Using default config. INFO:tensorflow:Using config: {'_save_checkpoints_secs': 600, '_num_ps_replicas': 0, '_keep_checkpoint_max': 5, '_task_type': None, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f6740620d90>, '_model_dir': '/content/datalab/training-data-analyst/courses/machine_learning/feateng/taxi_trained/', '_save_checkpoints_steps': None, '_keep_checkpoint_every_n_hours': 10000, '_session_config': None, '_tf_random_seed': None, '_environment': 'local', '_num_worker_replicas': 0, '_task_id': 0, '_save_summary_steps': 100, '_tf_config': gpu_options { per_process_gpu_memory_fraction: 1.0 } , '_evaluation_master': '', '_master': ''} INFO:tensorflow:Create CheckpointSaverHook. 2017-10-22 03:50:39.468616: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. 2017-10-22 03:50:39.468699: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. 2017-10-22 03:50:39.468722: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. 2017-10-22 03:50:39.468742: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. 
2017-10-22 03:50:39.468761: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations. INFO:tensorflow:Saving checkpoints for 2 into /content/datalab/training-data-analyst/courses/machine_learning/feateng/taxi_trained/model.ckpt. INFO:tensorflow:loss = 219.36, step = 2 INFO:tensorflow:Starting evaluation at 2017-10-22-03:50:41 INFO:tensorflow:Restoring parameters from /content/datalab/training-data-analyst/courses/machine_learning/feateng/taxi_trained/model.ckpt-2 INFO:tensorflow:Evaluation [1/10] INFO:tensorflow:Evaluation [2/10] INFO:tensorflow:Evaluation [3/10] INFO:tensorflow:Evaluation [4/10] INFO:tensorflow:Evaluation [5/10] INFO:tensorflow:Evaluation [6/10] INFO:tensorflow:Evaluation [7/10] INFO:tensorflow:Evaluation [8/10] INFO:tensorflow:Evaluation [9/10] INFO:tensorflow:Evaluation [10/10] INFO:tensorflow:Finished evaluation at 2017-10-22-03:50:41 INFO:tensorflow:Saving dict for global step 2: global_step = 2, loss = 187.879, rmse = 13.7069, training/hptuning/metric = 13.7069 INFO:tensorflow:Validation (step 1): loss = 187.879, rmse = 13.7069, global_step = 2, training/hptuning/metric = 13.7069 INFO:tensorflow:global_step/sec: 38.3801 INFO:tensorflow:global_step/sec: 117.015 INFO:tensorflow:loss = 85.6259, step = 202 (3.448 sec) INFO:tensorflow:global_step/sec: 115.051 INFO:tensorflow:global_step/sec: 116.872 INFO:tensorflow:loss = 78.5867, step = 402 (1.725 sec) INFO:tensorflow:global_step/sec: 114.599 INFO:tensorflow:global_step/sec: 115.519 INFO:tensorflow:loss = 120.288, step = 602 (1.738 sec) INFO:tensorflow:global_step/sec: 115.812 INFO:tensorflow:global_step/sec: 114.833 INFO:tensorflow:loss = 113.664, step = 802 (1.734 sec) INFO:tensorflow:global_step/sec: 117.19 INFO:tensorflow:global_step/sec: 116.784 INFO:tensorflow:loss = 94.2728, step = 1002 (1.710 sec) INFO:tensorflow:global_step/sec: 117.371 
INFO:tensorflow:global_step/sec: 114.056 INFO:tensorflow:loss = 47.8031, step = 1202 (1.729 sec) INFO:tensorflow:global_step/sec: 117.469 INFO:tensorflow:global_step/sec: 116.353 INFO:tensorflow:loss = 87.418, step = 1402 (1.711 sec) INFO:tensorflow:global_step/sec: 114.669 INFO:tensorflow:global_step/sec: 112.542 INFO:tensorflow:loss = 98.5, step = 1602 (1.761 sec) INFO:tensorflow:global_step/sec: 113.169 INFO:tensorflow:global_step/sec: 115.86 INFO:tensorflow:loss = 59.003, step = 1802 (1.747 sec) INFO:tensorflow:global_step/sec: 114.944 INFO:tensorflow:global_step/sec: 115.029 INFO:tensorflow:loss = 55.2577, step = 2002 (1.739 sec) INFO:tensorflow:global_step/sec: 115.537 INFO:tensorflow:global_step/sec: 116.685 INFO:tensorflow:loss = 72.4153, step = 2202 (1.722 sec) INFO:tensorflow:global_step/sec: 117.122 INFO:tensorflow:global_step/sec: 115.95 INFO:tensorflow:loss = 75.5763, step = 2402 (1.716 sec) INFO:tensorflow:global_step/sec: 117.407 INFO:tensorflow:global_step/sec: 119.286 INFO:tensorflow:loss = 73.8332, step = 2602 (1.690 sec) INFO:tensorflow:global_step/sec: 117.103 INFO:tensorflow:global_step/sec: 116.435 INFO:tensorflow:loss = 102.064, step = 2802 (1.713 sec) INFO:tensorflow:global_step/sec: 114.933 INFO:tensorflow:global_step/sec: 115.552 INFO:tensorflow:loss = 97.372, step = 3002 (1.736 sec) INFO:tensorflow:global_step/sec: 116.721 INFO:tensorflow:global_step/sec: 117.912 INFO:tensorflow:loss = 89.4115, step = 3202 (1.705 sec) INFO:tensorflow:global_step/sec: 118.902 INFO:tensorflow:global_step/sec: 114.69 INFO:tensorflow:loss = 46.1676, step = 3402 (1.713 sec) INFO:tensorflow:global_step/sec: 112.942 INFO:tensorflow:global_step/sec: 116.155 INFO:tensorflow:loss = 88.1739, step = 3602 (1.746 sec) INFO:tensorflow:global_step/sec: 118.704 INFO:tensorflow:global_step/sec: 115.621 INFO:tensorflow:loss = 94.6579, step = 3802 (1.707 sec) INFO:tensorflow:global_step/sec: 113.662 INFO:tensorflow:global_step/sec: 113.851 INFO:tensorflow:loss = 58.6815, 
step = 4002 (1.758 sec) INFO:tensorflow:global_step/sec: 113.976 INFO:tensorflow:global_step/sec: 115.343 INFO:tensorflow:loss = 54.1024, step = 4202 (1.744 sec) INFO:tensorflow:global_step/sec: 118.518 INFO:tensorflow:global_step/sec: 116.656 INFO:tensorflow:loss = 71.1522, step = 4402 (1.701 sec) INFO:tensorflow:global_step/sec: 112.438 INFO:tensorflow:global_step/sec: 111.317 INFO:tensorflow:loss = 74.9945, step = 4602 (1.788 sec) INFO:tensorflow:global_step/sec: 112.002 INFO:tensorflow:global_step/sec: 111.535 INFO:tensorflow:loss = 72.992, step = 4802 (1.789 sec) INFO:tensorflow:global_step/sec: 114.263 INFO:tensorflow:global_step/sec: 111.892 INFO:tensorflow:loss = 96.8931, step = 5002 (1.769 sec) INFO:tensorflow:global_step/sec: 112.397 INFO:tensorflow:global_step/sec: 111.029 INFO:tensorflow:loss = 92.4883, step = 5202 (1.790 sec) INFO:tensorflow:global_step/sec: 111.141 INFO:tensorflow:global_step/sec: 111.983 INFO:tensorflow:loss = 88.0388, step = 5402 (1.793 sec) INFO:tensorflow:global_step/sec: 113.51 INFO:tensorflow:global_step/sec: 114.835 INFO:tensorflow:loss = 45.8094, step = 5602 (1.752 sec) INFO:tensorflow:global_step/sec: 116.208 INFO:tensorflow:global_step/sec: 115.034 INFO:tensorflow:loss = 88.8831, step = 5802 (1.730 sec) INFO:tensorflow:global_step/sec: 111.793 INFO:tensorflow:global_step/sec: 111.613 INFO:tensorflow:loss = 93.4177, step = 6002 (1.790 sec) INFO:tensorflow:global_step/sec: 109.171 INFO:tensorflow:global_step/sec: 110.661 INFO:tensorflow:loss = 58.9872, step = 6202 (1.820 sec) INFO:tensorflow:global_step/sec: 112.375 INFO:tensorflow:global_step/sec: 113.582 INFO:tensorflow:loss = 54.005, step = 6402 (1.770 sec) INFO:tensorflow:global_step/sec: 113.818 INFO:tensorflow:global_step/sec: 111.982 INFO:tensorflow:loss = 70.9862, step = 6602 (1.772 sec) INFO:tensorflow:global_step/sec: 113.755 INFO:tensorflow:global_step/sec: 113.387 INFO:tensorflow:loss = 74.8975, step = 6802 (1.761 sec) INFO:tensorflow:global_step/sec: 116.036 
INFO:tensorflow:global_step/sec: 117.692 INFO:tensorflow:loss = 72.7621, step = 7002 (1.711 sec) INFO:tensorflow:global_step/sec: 118.154 INFO:tensorflow:global_step/sec: 117.842 INFO:tensorflow:loss = 94.6179, step = 7202 (1.695 sec) INFO:tensorflow:global_step/sec: 116.729 INFO:tensorflow:global_step/sec: 118.207 INFO:tensorflow:loss = 90.4749, step = 7402 (1.703 sec) INFO:tensorflow:global_step/sec: 118.334 INFO:tensorflow:global_step/sec: 118.668 INFO:tensorflow:loss = 87.4869, step = 7602 (1.688 sec) INFO:tensorflow:global_step/sec: 119.292 INFO:tensorflow:global_step/sec: 119.383 INFO:tensorflow:loss = 45.7827, step = 7802 (1.676 sec) INFO:tensorflow:global_step/sec: 117.361 INFO:tensorflow:global_step/sec: 118.052 INFO:tensorflow:loss = 89.3567, step = 8002 (1.699 sec) INFO:tensorflow:global_step/sec: 117.366 INFO:tensorflow:global_step/sec: 118.309 INFO:tensorflow:loss = 92.9361, step = 8202 (1.697 sec) INFO:tensorflow:global_step/sec: 115.531 INFO:tensorflow:global_step/sec: 115.798 INFO:tensorflow:loss = 59.2971, step = 8402 (1.729 sec) INFO:tensorflow:global_step/sec: 113.323 INFO:tensorflow:global_step/sec: 112.848 INFO:tensorflow:loss = 54.098, step = 8602 (1.769 sec) INFO:tensorflow:global_step/sec: 113.54 INFO:tensorflow:Saving checkpoints for 8800 into /content/datalab/training-data-analyst/courses/machine_learning/feateng/taxi_trained/model.ckpt. INFO:tensorflow:Loss for final step: 98.7846. 
INFO:tensorflow:Starting evaluation at 2017-10-22-03:51:59 INFO:tensorflow:Restoring parameters from /content/datalab/training-data-analyst/courses/machine_learning/feateng/taxi_trained/model.ckpt-8800 INFO:tensorflow:Evaluation [1/10] INFO:tensorflow:Evaluation [2/10] INFO:tensorflow:Evaluation [3/10] INFO:tensorflow:Evaluation [4/10] INFO:tensorflow:Evaluation [5/10] INFO:tensorflow:Evaluation [6/10] INFO:tensorflow:Evaluation [7/10] INFO:tensorflow:Evaluation [8/10] INFO:tensorflow:Evaluation [9/10] INFO:tensorflow:Evaluation [10/10] INFO:tensorflow:Finished evaluation at 2017-10-22-03:52:00 INFO:tensorflow:Saving dict for global step 8800: global_step = 8800, loss = 77.4217, rmse = 8.79896, training/hptuning/metric = 8.79896 INFO:tensorflow:Restoring parameters from /content/datalab/training-data-analyst/courses/machine_learning/feateng/taxi_trained/model.ckpt-8800 INFO:tensorflow:Assets added to graph. INFO:tensorflow:No assets to write. INFO:tensorflow:SavedModel written to: /content/datalab/training-data-analyst/courses/machine_learning/feateng/taxi_trained/export/Servo/1508644321/saved_model.pb
This dumps out a local result:
$ ls $REPO/courses/machine_learning/feateng/taxi_trained/export/Servo
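A quick sanity check on the training log above: in both evaluation lines, the reported rmse is the square root of the reported loss, which is consistent with the loss being a mean squared error on the evaluation data. The numbers below are copied from the log; the check itself is just arithmetic.

```python
import math

# (loss, rmse) pairs copied from the two evaluation lines of the log.
evaluations = [(187.879, 13.7069), (77.4217, 8.79896)]

for loss, rmse in evaluations:
    print(loss, rmse, math.sqrt(loss))

# Largest difference between sqrt(loss) and the logged rmse.
max_gap = max(abs(math.sqrt(loss) - rmse) for loss, rmse in evaluations)
print(max_gap)
```

Read this way, the evaluation error dropped from roughly 13.7 at step 2 to roughly 8.8 at step 8800, in the same units as the fare.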
Local Model Testing
Now we can provide the model with an input to check if the prediction step works:
$ cat test.json
{"dayofweek":"Sun","hourofday":17,"pickuplon": -73.885262,"pickuplat": 40.773008,"dropofflon": -73.987232,"dropofflat": 40.732403,"passengers": 2}
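If the test instance is generated from code rather than typed by hand, a sketch like the following writes it as newline-delimited JSON, one instance per line. The `write_instances` helper and the temp-directory path are assumptions made for the example.

```python
import json
import os
import tempfile

instance = {
    "dayofweek": "Sun", "hourofday": 17,
    "pickuplon": -73.885262, "pickuplat": 40.773008,
    "dropofflon": -73.987232, "dropofflat": 40.732403,
    "passengers": 2,
}

def write_instances(instances, path):
    """Write one JSON object per line (newline-delimited JSON)."""
    with open(path, "w") as f:
        for inst in instances:
            f.write(json.dumps(inst) + "\n")

# Hypothetical location: the system temp directory (often /tmp on Linux).
path = os.path.join(tempfile.gettempdir(), "test.json")
write_instances([instance], path)

# Read the file back to confirm each line parses as one instance.
with open(path) as f:
    lines = f.read().splitlines()
print(lines)
```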
Next, we run the model in predict mode using gcloud and pointing to the LOCAL model directory:
$ model_dir=$(ls ${REPO}/courses/machine_learning/feateng/taxi_trained/export/Servo)
$ gcloud ml-engine local predict \
    --model-dir=${REPO}/courses/machine_learning/feateng/taxi_trained/export/Servo/${model_dir} \
    --json-instances=/tmp/test.json
The resulting output:
SCORES
12.4769
WARNING: 2017-10-22 03:52:04.141602: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-22 03:52:04.141677: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-22 03:52:04.141700: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-10-22 03:52:04.141715: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-22 03:52:04.141730: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
WARNING:root:MetaGraph has multiple signatures 2. Support for multiple signatures is limited. By default we select named signatures.
Model Training in the Cloud
Once we've tested out the model locally and made sure it works, we can train the model in the cloud by submitting it to ML Engine using gcloud:
OUTDIR=gs://${BUCKET}/taxifare/ch4/taxi_trained
JOBNAME=lab4a_$(date -u +%y%m%d_%H%M%S)
echo $OUTDIR $REGION $JOBNAME
gsutil -m rm -rf $OUTDIR
gcloud ml-engine jobs submit training $JOBNAME \
    --region=$REGION \
    --module-name=trainer.task \
    --package-path=${REPO}/courses/machine_learning/feateng/taxifare/trainer \
    --job-dir=$OUTDIR \
    --staging-bucket=gs://$BUCKET \
    --scale-tier=BASIC \
    --runtime-version=1.2 \
    -- \
    --train_data_paths="gs://$BUCKET/taxifare/ch4/taxi_preproc/train*" \
    --eval_data_paths="gs://${BUCKET}/taxifare/ch4/taxi_preproc/valid*" \
    --output_dir=$OUTDIR \
    --num_epochs=100
Note that, again, we pass the locations of training data and validation data on Google Cloud Storage.
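Everything after the bare -- in the submit command is handed to the trainer.task module rather than interpreted by gcloud, and the stream-logs output further down shows that --job-dir is also appended to the module's command line. A minimal sketch of how a task module might parse these flags with argparse; this is an assumption for illustration, since the actual trainer/task.py is not reproduced in these notes, and the gs://my-bucket paths are hypothetical:

```python
import argparse

def parse_args(argv=None):
    """Parse the flags that appear after the bare -- in the submit command."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--train_data_paths", required=True)
    parser.add_argument("--eval_data_paths", required=True)
    parser.add_argument("--output_dir", required=True)
    parser.add_argument("--num_epochs", type=int, default=None)
    # The job log shows --job-dir being passed to the module as well,
    # so accept it even if the training code ignores it.
    parser.add_argument("--job-dir", default=None)
    return parser.parse_args(argv)

# Hypothetical bucket name, mirroring the shape of the command in the log.
args = parse_args([
    "--train_data_paths=gs://my-bucket/taxifare/ch4/taxi_preproc/train*",
    "--eval_data_paths=gs://my-bucket/taxifare/ch4/taxi_preproc/valid*",
    "--output_dir=gs://my-bucket/taxifare/ch4/taxi_trained",
    "--num_epochs=100",
    "--job-dir", "gs://my-bucket/taxifare/ch4/taxi_trained",
])
print(args.num_epochs, args.job_dir)
```

argparse stores --job-dir under the attribute name job_dir, replacing the hyphen with an underscore.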
Here is the output we'll see:
gs://charlesreid1-ml/taxifare/ch4/taxi_trained us-central1 lab4a_171022_035546
jobId: lab4a_171022_035546
state: QUEUED
Removing gs://charlesreid1-ml/taxifare/ch4/taxi_trained/#1508644012044506...
Removing gs://charlesreid1-ml/taxifare/ch4/taxi_trained/checkpoint#1508644013272533...
Removing gs://charlesreid1-ml/taxifare/ch4/taxi_trained/eval/#1508644018167021...
Removing gs://charlesreid1-ml/taxifare/ch4/taxi_trained/eval/events.out.tfevents.1508644018.master-12933efd23-0-4t4kh#1508644019102690...
Removing gs://charlesreid1-ml/taxifare/ch4/taxi_trained/events.out.tfevents.1508644008.master-12933efd23-0-4t4kh#1508644498836422...
Removing gs://charlesreid1-ml/taxifare/ch4/taxi_trained/graph.pbtxt#1508644009956409...
Removing gs://charlesreid1-ml/taxifare/ch4/taxi_trained/model.ckpt-2.data-00000-of-00001#1508644012444080...
Removing gs://charlesreid1-ml/taxifare/ch4/taxi_trained/model.ckpt-2.index#1508644012659448...
Removing gs://charlesreid1-ml/taxifare/ch4/taxi_trained/model.ckpt-2.meta#1508644013983764...
/ [9/9 objects] 100% Done
Operation completed over 9 objects.
Job [lab4a_171022_035546] submitted successfully.
Your job is still active. You may view the status of your job with the command
  $ gcloud ml-engine jobs describe lab4a_171022_035546
or continue streaming the logs with the command
  $ gcloud ml-engine jobs stream-logs lab4a_171022_035546
The output of the stream-logs command is shown below:
$ gcloud ml-engine jobs stream-logs lab4a_171022_035546
INFO 2017-10-21 20:55:49 -0700 service Validating job requirements...
INFO 2017-10-21 20:55:49 -0700 service Job creation request has been successfully validated.
INFO 2017-10-21 20:55:49 -0700 service Waiting for job to be provisioned.
INFO 2017-10-21 20:55:49 -0700 service Job lab4a_171022_035546 is queued.
INFO 2017-10-21 20:55:49 -0700 service Waiting for TensorFlow to start.
INFO 2017-10-21 20:59:06 -0700 master-replica-0 Running task with arguments: --cluster={"master": ["master-9d41196544-0:2222"]} --task={"type": "master", "index": 0} --job={
INFO 2017-10-21 20:59:06 -0700 master-replica-0 "package_uris": ["gs://charlesreid1-ml/lab4a_171022_035546/4d5ca50256c8423aaacb74ca78d05b8e0c3611284eca9ddade03a1349cfa165d/trainer-0.0.0.tar.gz"],
INFO 2017-10-21 20:59:06 -0700 master-replica-0 "python_module": "trainer.task",
INFO 2017-10-21 20:59:06 -0700 master-replica-0 "args": ["--train_data_paths=gs://charlesreid1-ml/taxifare/ch4/taxi_preproc/train*", "--eval_data_paths=gs://charlesreid1-ml/taxifare/ch4/taxi_preproc/valid*", "--output_dir=gs://charlesreid1-ml/taxifare/ch4/taxi_trained", "--num_epochs=100"],
INFO 2017-10-21 20:59:06 -0700 master-replica-0 "region": "us-central1",
INFO 2017-10-21 20:59:06 -0700 master-replica-0 "runtime_version": "1.2",
INFO 2017-10-21 20:59:06 -0700 master-replica-0 "job_dir": "gs://charlesreid1-ml/taxifare/ch4/taxi_trained"
INFO 2017-10-21 20:59:06 -0700 master-replica-0 }
INFO 2017-10-21 20:59:07 -0700 master-replica-0 Running module trainer.task.
INFO 2017-10-21 20:59:07 -0700 master-replica-0 Downloading the package: gs://charlesreid1-ml/lab4a_171022_035546/4d5ca50256c8423aaacb74ca78d05b8e0c3611284eca9ddade03a1349cfa165d/trainer-0.0.0.tar.gz INFO 2017-10-21 20:59:07 -0700 master-replica-0 Running command: gsutil -q cp gs://charlesreid1-ml/lab4a_171022_035546/4d5ca50256c8423aaacb74ca78d05b8e0c3611284eca9ddade03a1349cfa165d/trainer-0.0.0.tar.gz trainer-0.0.0.tar.gz INFO 2017-10-21 20:59:08 -0700 master-replica-0 Installing the package: gs://charlesreid1-ml/lab4a_171022_035546/4d5ca50256c8423aaacb74ca78d05b8e0c3611284eca9ddade03a1349cfa165d/trainer-0.0.0.tar.gz INFO 2017-10-21 20:59:08 -0700 master-replica-0 Running command: pip install --user --upgrade --force-reinstall --no-deps trainer-0.0.0.tar.gz INFO 2017-10-21 20:59:08 -0700 master-replica-0 Processing ./trainer-0.0.0.tar.gz INFO 2017-10-21 20:59:09 -0700 master-replica-0 Building wheels for collected packages: trainer INFO 2017-10-21 20:59:09 -0700 master-replica-0 Running setup.py bdist_wheel for trainer: started INFO 2017-10-21 20:59:09 -0700 master-replica-0 creating '/tmp/tmptWD6EOpip-wheel-/trainer-0.0.0-cp27-none-any.whl' and adding '.' 
to it INFO 2017-10-21 20:59:09 -0700 master-replica-0 adding 'trainer/model.py' INFO 2017-10-21 20:59:09 -0700 master-replica-0 adding 'trainer/__init__.py' INFO 2017-10-21 20:59:09 -0700 master-replica-0 adding 'trainer/task.py' INFO 2017-10-21 20:59:09 -0700 master-replica-0 adding 'trainer/setup.py' INFO 2017-10-21 20:59:09 -0700 master-replica-0 adding 'trainer-0.0.0.dist-info/DESCRIPTION.rst' INFO 2017-10-21 20:59:09 -0700 master-replica-0 adding 'trainer-0.0.0.dist-info/metadata.json' INFO 2017-10-21 20:59:09 -0700 master-replica-0 adding 'trainer-0.0.0.dist-info/top_level.txt' INFO 2017-10-21 20:59:09 -0700 master-replica-0 adding 'trainer-0.0.0.dist-info/WHEEL' INFO 2017-10-21 20:59:09 -0700 master-replica-0 adding 'trainer-0.0.0.dist-info/METADATA' INFO 2017-10-21 20:59:09 -0700 master-replica-0 adding 'trainer-0.0.0.dist-info/RECORD' INFO 2017-10-21 20:59:09 -0700 master-replica-0 Running setup.py bdist_wheel for trainer: finished with status 'done' INFO 2017-10-21 20:59:09 -0700 master-replica-0 Stored in directory: /root/.cache/pip/wheels/0d/1b/db/f8e86b296734f0b137e17e5d34862f4ae4faf8388755c6272f INFO 2017-10-21 20:59:09 -0700 master-replica-0 Successfully built trainer INFO 2017-10-21 20:59:09 -0700 master-replica-0 Installing collected packages: trainer INFO 2017-10-21 20:59:09 -0700 master-replica-0 Successfully installed trainer-0.0.0 INFO 2017-10-21 20:59:10 -0700 master-replica-0 Running command: pip install --user trainer-0.0.0.tar.gz INFO 2017-10-21 20:59:10 -0700 master-replica-0 Processing ./trainer-0.0.0.tar.gz INFO 2017-10-21 20:59:11 -0700 master-replica-0 Requirement already satisfied (use --upgrade to upgrade): trainer==0.0.0 from file:///user_dir/trainer-0.0.0.tar.gz in /root/.local/lib/python2.7/site-packages INFO 2017-10-21 20:59:11 -0700 master-replica-0 Building wheels for collected packages: trainer INFO 2017-10-21 20:59:11 -0700 master-replica-0 Running setup.py bdist_wheel for trainer: started INFO 2017-10-21 20:59:11 -0700 
master-replica-0 creating '/tmp/tmpjVEkX2pip-wheel-/trainer-0.0.0-cp27-none-any.whl' and adding '.' to it INFO 2017-10-21 20:59:11 -0700 master-replica-0 adding 'trainer/model.py' INFO 2017-10-21 20:59:11 -0700 master-replica-0 adding 'trainer/__init__.py' INFO 2017-10-21 20:59:11 -0700 master-replica-0 adding 'trainer/task.py' INFO 2017-10-21 20:59:11 -0700 master-replica-0 adding 'trainer/setup.py' INFO 2017-10-21 20:59:11 -0700 master-replica-0 adding 'trainer-0.0.0.dist-info/DESCRIPTION.rst' INFO 2017-10-21 20:59:11 -0700 master-replica-0 adding 'trainer-0.0.0.dist-info/metadata.json' INFO 2017-10-21 20:59:11 -0700 master-replica-0 adding 'trainer-0.0.0.dist-info/top_level.txt' INFO 2017-10-21 20:59:11 -0700 master-replica-0 adding 'trainer-0.0.0.dist-info/WHEEL' INFO 2017-10-21 20:59:11 -0700 master-replica-0 adding 'trainer-0.0.0.dist-info/METADATA' INFO 2017-10-21 20:59:11 -0700 master-replica-0 adding 'trainer-0.0.0.dist-info/RECORD' INFO 2017-10-21 20:59:11 -0700 master-replica-0 Running setup.py bdist_wheel for trainer: finished with status 'done' INFO 2017-10-21 20:59:11 -0700 master-replica-0 Stored in directory: /root/.cache/pip/wheels/0d/1b/db/f8e86b296734f0b137e17e5d34862f4ae4faf8388755c6272f INFO 2017-10-21 20:59:11 -0700 master-replica-0 Successfully built trainer INFO 2017-10-21 20:59:11 -0700 master-replica-0 Running command: python -m trainer.task --train_data_paths=gs://charlesreid1-ml/taxifare/ch4/taxi_preproc/train* --eval_data_paths=gs://charlesreid1-ml/taxifare/ch4/taxi_preproc/valid* --output_dir=gs://charlesreid1-ml/taxifare/ch4/taxi_trained --num_epochs=100 --job-dir gs://charlesreid1-ml/taxifare/ch4/taxi_trained WARNING 2017-10-21 20:59:14 -0700 master-replica-0 The default stddev value of initializer will change from "1/sqrt(vocab_size)" to "1/sqrt(dimension)" after 2017/02/25. 
WARNING 2017-10-21 20:59:14 -0700 master-replica-0 The default stddev value of initializer will change from "1/sqrt(vocab_size)" to "1/sqrt(dimension)" after 2017/02/25. WARNING 2017-10-21 20:59:14 -0700 master-replica-0 From /root/.local/lib/python2.7/site-packages/trainer/model.py:105: calling __init__ (from tensorflow.contrib.learn.python.learn.estimators.dnn_linear_combined) with fix_global_step_increment_bug=False is deprecated and will be removed after 2017-04-15. WARNING 2017-10-21 20:59:14 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:14 -0700 master-replica-0 Please set fix_global_step_increment_bug=True and update training steps in your pipeline. See pydoc for details. INFO 2017-10-21 20:59:14 -0700 master-replica-0 Using default config. INFO 2017-10-21 20:59:14 -0700 master-replica-0 Using config: {'_save_checkpoints_secs': 600, '_num_ps_replicas': 0, '_keep_checkpoint_max': 5, '_task_type': u'master', '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f8d67849bd0>, '_model_dir': 'gs://charlesreid1-ml/taxifare/ch4/taxi_trained/', '_save_checkpoints_steps': None, '_keep_checkpoint_every_n_hours': 10000, '_session_config': None, '_tf_random_seed': None, '_environment': u'cloud', '_num_worker_replicas': 0, '_task_id': 0, '_save_summary_steps': 100, '_tf_config': gpu_options { INFO 2017-10-21 20:59:14 -0700 master-replica-0 per_process_gpu_memory_fraction: 1.0 INFO 2017-10-21 20:59:14 -0700 master-replica-0 } INFO 2017-10-21 20:59:14 -0700 master-replica-0 , '_evaluation_master': '', '_master': ''} WARNING 2017-10-21 20:59:14 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/monitors.py:268: __init__ (from tensorflow.contrib.learn.python.learn.monitors) is deprecated and will be removed after 2016-12-05. 
WARNING 2017-10-21 20:59:14 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:14 -0700 master-replica-0 Monitors are deprecated. Please use tf.train.SessionRunHook. WARNING 2017-10-21 20:59:14 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20. WARNING 2017-10-21 20:59:14 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:14 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 20:59:14 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. WARNING 2017-10-21 20:59:14 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 20:59:14 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying WARNING 2017-10-21 20:59:14 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 20:59:14 -0700 master-replica-0 as deprecated. WARNING 2017-10-21 20:59:14 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20. WARNING 2017-10-21 20:59:14 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:14 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 20:59:14 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. 
WARNING 2017-10-21 20:59:14 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 20:59:14 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying WARNING 2017-10-21 20:59:14 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 20:59:14 -0700 master-replica-0 as deprecated. WARNING 2017-10-21 20:59:15 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20. WARNING 2017-10-21 20:59:15 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:15 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 20:59:15 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. WARNING 2017-10-21 20:59:15 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 20:59:15 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying WARNING 2017-10-21 20:59:15 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 20:59:15 -0700 master-replica-0 as deprecated. WARNING 2017-10-21 20:59:15 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20. 
WARNING 2017-10-21 20:59:15 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:15 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 20:59:15 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. WARNING 2017-10-21 20:59:15 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 20:59:15 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying WARNING 2017-10-21 20:59:15 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 20:59:15 -0700 master-replica-0 as deprecated. WARNING 2017-10-21 20:59:15 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20. WARNING 2017-10-21 20:59:15 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:15 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 20:59:15 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. WARNING 2017-10-21 20:59:15 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 20:59:15 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying WARNING 2017-10-21 20:59:15 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 20:59:15 -0700 master-replica-0 as deprecated. 
WARNING 2017-10-21 20:59:15 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20. WARNING 2017-10-21 20:59:15 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:15 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 20:59:15 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. WARNING 2017-10-21 20:59:15 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 20:59:15 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying WARNING 2017-10-21 20:59:15 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 20:59:15 -0700 master-replica-0 as deprecated. WARNING 2017-10-21 20:59:16 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/head.py:625: scalar_summary (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30. WARNING 2017-10-21 20:59:16 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:16 -0700 master-replica-0 Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported. INFO 2017-10-21 20:59:16 -0700 master-replica-0 Create CheckpointSaverHook. 
INFO 2017-10-21 20:59:18 -0700 master-replica-0 Restoring parameters from gs://charlesreid1-ml/taxifare/ch4/taxi_trained/model.ckpt-6966 ERROR 2017-10-21 20:59:37 -0700 master-replica-0 2017-10-22 03:59:37.311904: I tensorflow/core/platform/cloud/retrying_utils.cc:77] The operation failed and will be automatically retried in 1.84837 seconds (attempt 1 out of 10), caused by: Unavailable: Error executing an HTTP request (HTTP response code 503, error code 0, error message '') ERROR 2017-10-21 20:59:37 -0700 master-replica-0 when renaming gs://charlesreid1-ml/taxifare/ch4/taxi_trained/graph.pbtxt.tmp502c020d88af41f7988f84855a66bbbc to gs://charlesreid1-ml/taxifare/ch4/taxi_trained/graph.pbtxt INFO 2017-10-21 20:59:40 -0700 master-replica-0 Saving checkpoints for 6968 into gs://charlesreid1-ml/taxifare/ch4/taxi_trained/model.ckpt. INFO 2017-10-21 20:59:43 -0700 master-replica-0 loss = 70.9161, step = 6968 WARNING 2017-10-21 20:59:43 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20. WARNING 2017-10-21 20:59:43 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:43 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 20:59:43 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. WARNING 2017-10-21 20:59:43 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 20:59:43 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying WARNING 2017-10-21 20:59:43 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 20:59:43 -0700 master-replica-0 as deprecated. 
WARNING 2017-10-21 20:59:44 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:44 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 20:59:44 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 20:59:44 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying WARNING 2017-10-21 20:59:44 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 20:59:44 -0700 master-replica-0 as deprecated. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:44 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 20:59:44 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 20:59:44 -0700 master-replica-0 to concatenate the feature fingerprints. 
And the underlying WARNING 2017-10-21 20:59:44 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 20:59:44 -0700 master-replica-0 as deprecated. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:44 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 20:59:44 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 20:59:44 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying WARNING 2017-10-21 20:59:44 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 20:59:44 -0700 master-replica-0 as deprecated. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:44 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 20:59:44 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. 
WARNING 2017-10-21 20:59:44 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 20:59:44 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying WARNING 2017-10-21 20:59:44 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 20:59:44 -0700 master-replica-0 as deprecated. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:44 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 20:59:44 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 20:59:44 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying WARNING 2017-10-21 20:59:44 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 20:59:44 -0700 master-replica-0 as deprecated. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/head.py:625: scalar_summary (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30. WARNING 2017-10-21 20:59:44 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 20:59:44 -0700 master-replica-0 Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. 
This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported. INFO 2017-10-21 20:59:44 -0700 master-replica-0 Starting evaluation at 2017-10-22-03:59:44 INFO 2017-10-21 20:59:45 -0700 master-replica-0 Restoring parameters from gs://charlesreid1-ml/taxifare/ch4/taxi_trained/model.ckpt-6968 INFO 2017-10-21 20:59:46 -0700 master-replica-0 Evaluation [1/10] INFO 2017-10-21 20:59:46 -0700 master-replica-0 Evaluation [2/10] INFO 2017-10-21 20:59:46 -0700 master-replica-0 Evaluation [3/10] INFO 2017-10-21 20:59:46 -0700 master-replica-0 Evaluation [4/10] INFO 2017-10-21 20:59:46 -0700 master-replica-0 Evaluation [5/10] INFO 2017-10-21 20:59:46 -0700 master-replica-0 Evaluation [6/10] INFO 2017-10-21 20:59:46 -0700 master-replica-0 Evaluation [7/10] INFO 2017-10-21 20:59:46 -0700 master-replica-0 Evaluation [8/10] INFO 2017-10-21 20:59:46 -0700 master-replica-0 Evaluation [9/10] INFO 2017-10-21 20:59:46 -0700 master-replica-0 Evaluation [10/10] INFO 2017-10-21 20:59:46 -0700 master-replica-0 Finished evaluation at 2017-10-22-03:59:46 INFO 2017-10-21 20:59:46 -0700 master-replica-0 Saving dict for global step 6968: global_step = 6968, loss = 77.4875, rmse = 8.8027, training/hptuning/metric = 8.8027 INFO 2017-10-21 20:59:48 -0700 master-replica-0 Validation (step 6967): loss = 77.4875, rmse = 8.8027, global_step = 6968, training/hptuning/metric = 8.8027 INFO 2017-10-21 20:59:57 -0700 master-replica-0 global_step/sec: 6.90495 INFO 2017-10-21 21:00:13 -0700 master-replica-0 global_step/sec: 6.34928 INFO 2017-10-21 21:00:13 -0700 master-replica-0 loss = 74.8829, step = 7168 (30.230 sec) INFO 2017-10-21 21:00:23 -0700 master-replica-0 global_step/sec: 9.91498 WARNING 2017-10-21 21:00:26 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross 
(from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20. WARNING 2017-10-21 21:00:26 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 21:00:26 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 21:00:26 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. WARNING 2017-10-21 21:00:26 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 21:00:26 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying WARNING 2017-10-21 21:00:26 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 21:00:26 -0700 master-replica-0 as deprecated. WARNING 2017-10-21 21:00:26 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20. WARNING 2017-10-21 21:00:26 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 21:00:26 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 21:00:26 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. WARNING 2017-10-21 21:00:26 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 21:00:26 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying WARNING 2017-10-21 21:00:26 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 21:00:26 -0700 master-replica-0 as deprecated. 
WARNING 2017-10-21 21:00:27 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20.
WARNING 2017-10-21 21:00:27 -0700 master-replica-0 Instructions for updating:
WARNING 2017-10-21 21:00:27 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default
WARNING 2017-10-21 21:00:27 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY.
WARNING 2017-10-21 21:00:27 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64
WARNING 2017-10-21 21:00:27 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying
WARNING 2017-10-21 21:00:27 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked
WARNING 2017-10-21 21:00:27 -0700 master-replica-0 as deprecated.
WARNING 2017-10-21 21:00:27 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/head.py:625: scalar_summary (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30.
WARNING 2017-10-21 21:00:27 -0700 master-replica-0 Instructions for updating:
WARNING 2017-10-21 21:00:27 -0700 master-replica-0 Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported.
INFO 2017-10-21 21:00:27 -0700 master-replica-0 Starting evaluation at 2017-10-22-04:00:27
INFO 2017-10-21 21:00:27 -0700 master-replica-0 Restoring parameters from gs://charlesreid1-ml/taxifare/ch4/taxi_trained/model.ckpt-8800
INFO 2017-10-21 21:00:29 -0700 master-replica-0 Evaluation [1/10]
INFO 2017-10-21 21:00:29 -0700 master-replica-0 Evaluation [2/10]
INFO 2017-10-21 21:00:29 -0700 master-replica-0 Evaluation [3/10]
INFO 2017-10-21 21:00:29 -0700 master-replica-0 Evaluation [4/10]
INFO 2017-10-21 21:00:29 -0700 master-replica-0 Evaluation [5/10]
INFO 2017-10-21 21:00:29 -0700 master-replica-0 Evaluation [6/10]
INFO 2017-10-21 21:00:29 -0700 master-replica-0 Evaluation [7/10]
INFO 2017-10-21 21:00:29 -0700 master-replica-0 Evaluation [8/10]
INFO 2017-10-21 21:00:29 -0700 master-replica-0 Evaluation [9/10]
INFO 2017-10-21 21:00:29 -0700 master-replica-0 Evaluation [10/10]
INFO 2017-10-21 21:00:29 -0700 master-replica-0 Finished evaluation at 2017-10-22-04:00:29
INFO 2017-10-21 21:00:29 -0700 master-replica-0 Saving dict for global step 8800: global_step = 8800, loss = 77.4176, rmse = 8.79873, training/hptuning/metric = 8.79873
INFO 2017-10-21 21:00:29 -0700 master-replica-0 Validation (step 7293): loss = 77.4176, rmse = 8.79873, global_step = 8800, training/hptuning/metric = 8.79873
INFO 2017-10-21 21:00:37 -0700 master-replica-0 global_step/sec: 7.05573
INFO 2017-10-21 21:00:37 -0700 master-replica-0 loss = 72.7523, step = 7368 (24.259 sec)
INFO 2017-10-21 21:00:48 -0700 master-replica-0 global_step/sec: 9.29562
INFO 2017-10-21 21:00:59 -0700 master-replica-0 global_step/sec: 9.42005
INFO 2017-10-21 21:00:59 -0700 master-replica-0 loss = 94.4278, step = 7568 (21.373 sec)
INFO 2017-10-21 21:01:09 -0700 master-replica-0 global_step/sec: 9.44944
INFO 2017-10-21 21:01:20 -0700 master-replica-0 global_step/sec: 9.48502
INFO 2017-10-21 21:01:20 -0700 master-replica-0 loss = 90.31, step = 7768 (21.385 sec)
INFO 2017-10-21 21:01:30 -0700 master-replica-0 global_step/sec: 9.47531
INFO 2017-10-21 21:01:41 -0700 master-replica-0 global_step/sec: 9.59307
INFO 2017-10-21 21:01:41 -0700 master-replica-0 loss = 87.4527, step = 7968 (20.718 sec)
INFO 2017-10-21 21:01:51 -0700 master-replica-0 global_step/sec: 9.56699
INFO 2017-10-21 21:02:02 -0700 master-replica-0 global_step/sec: 9.4193
INFO 2017-10-21 21:02:02 -0700 master-replica-0 loss = 45.8043, step = 8168 (21.069 sec)
INFO 2017-10-21 21:02:12 -0700 master-replica-0 global_step/sec: 9.51152
INFO 2017-10-21 21:02:23 -0700 master-replica-0 global_step/sec: 9.53822
INFO 2017-10-21 21:02:23 -0700 master-replica-0 loss = 89.4081, step = 8368 (20.997 sec)
INFO 2017-10-21 21:02:34 -0700 master-replica-0 global_step/sec: 9.13892
INFO 2017-10-21 21:02:45 -0700 master-replica-0 global_step/sec: 9.30457
INFO 2017-10-21 21:02:45 -0700 master-replica-0 loss = 92.906, step = 8568 (21.690 sec)
INFO 2017-10-21 21:02:55 -0700 master-replica-0 global_step/sec: 9.69158
INFO 2017-10-21 21:03:05 -0700 master-replica-0 global_step/sec: 9.65663
INFO 2017-10-21 21:03:05 -0700 master-replica-0 loss = 59.3322, step = 8768 (20.674 sec)
INFO 2017-10-21 21:03:16 -0700 master-replica-0 global_step/sec: 9.66961
INFO 2017-10-21 21:03:26 -0700 master-replica-0 global_step/sec: 9.63063
INFO 2017-10-21 21:03:26 -0700 master-replica-0 loss = 54.1047, step = 8968 (21.050 sec)
INFO 2017-10-21 21:03:37 -0700 master-replica-0 global_step/sec: 9.17513
INFO 2017-10-21 21:03:48 -0700 master-replica-0 global_step/sec: 9.32243
INFO 2017-10-21 21:03:48 -0700 master-replica-0 loss = 71.0439, step = 9168 (21.302 sec)
INFO 2017-10-21 21:03:58 -0700 master-replica-0 global_step/sec: 9.91924
INFO 2017-10-21 21:04:08 -0700 master-replica-0 global_step/sec: 9.60319
INFO 2017-10-21 21:04:08 -0700 master-replica-0 loss = 74.8709, step = 9368 (20.495 sec)
INFO 2017-10-21 21:04:18 -0700 master-replica-0 global_step/sec: 9.84729
INFO 2017-10-21 21:04:28 -0700 master-replica-0 global_step/sec: 9.83899
INFO 2017-10-21 21:04:28 -0700 master-replica-0 loss = 72.6779, step = 9568 (20.318 sec)
INFO 2017-10-21 21:04:39 -0700 master-replica-0 global_step/sec: 9.77917
INFO 2017-10-21 21:04:49 -0700 master-replica-0 global_step/sec: 10.0604
INFO 2017-10-21 21:04:49 -0700 master-replica-0 loss = 93.3378, step = 9768 (20.166 sec)
INFO 2017-10-21 21:04:58 -0700 master-replica-0 global_step/sec: 10.1103
INFO 2017-10-21 21:05:09 -0700 master-replica-0 global_step/sec: 9.54114
INFO 2017-10-21 21:05:09 -0700 master-replica-0 loss = 89.444, step = 9968 (20.372 sec)
INFO 2017-10-21 21:05:19 -0700 master-replica-0 global_step/sec: 9.79666
INFO 2017-10-21 21:05:29 -0700 master-replica-0 global_step/sec: 9.69717
INFO 2017-10-21 21:05:30 -0700 master-replica-0 loss = 87.2034, step = 10168 (20.773 sec)
INFO 2017-10-21 21:05:40 -0700 master-replica-0 global_step/sec: 9.65896
INFO 2017-10-21 21:05:50 -0700 master-replica-0 global_step/sec: 9.51121
INFO 2017-10-21 21:05:50 -0700 master-replica-0 loss = 45.8686, step = 10368 (20.614 sec)
INFO 2017-10-21 21:06:01 -0700 master-replica-0 global_step/sec: 9.12063
INFO 2017-10-21 21:06:11 -0700 master-replica-0 loss = 89.6852, step = 10567 (21.016 sec)
INFO 2017-10-21 21:06:11 -0700 master-replica-0 global_step/sec: 9.86323
INFO 2017-10-21 21:06:22 -0700 master-replica-0 global_step/sec: 9.23614
INFO 2017-10-21 21:06:32 -0700 master-replica-0 loss = 92.7284, step = 10767 (20.958 sec)
INFO 2017-10-21 21:06:32 -0700 master-replica-0 global_step/sec: 9.85408
INFO 2017-10-21 21:06:43 -0700 master-replica-0 global_step/sec: 9.54858
INFO 2017-10-21 21:06:53 -0700 master-replica-0 loss = 59.5421, step = 10967 (20.619 sec)
INFO 2017-10-21 21:06:53 -0700 master-replica-0 global_step/sec: 9.87805
INFO 2017-10-21 21:07:03 -0700 master-replica-0 global_step/sec: 9.64981
INFO 2017-10-21 21:07:13 -0700 master-replica-0 loss = 54.2085, step = 11167 (20.361 sec)
INFO 2017-10-21 21:07:13 -0700 master-replica-0 global_step/sec: 9.93814
INFO 2017-10-21 21:07:24 -0700 master-replica-0 global_step/sec: 9.60251
INFO 2017-10-21 21:07:34 -0700 master-replica-0 loss = 71.1363, step = 11367 (21.228 sec)
INFO 2017-10-21 21:07:35 -0700 master-replica-0 global_step/sec: 9.30009
INFO 2017-10-21 21:07:45 -0700 master-replica-0 global_step/sec: 9.68538
INFO 2017-10-21 21:07:55 -0700 master-replica-0 loss = 74.8589, step = 11567 (20.173 sec)
INFO 2017-10-21 21:07:55 -0700 master-replica-0 global_step/sec: 10.1645
INFO 2017-10-21 21:08:05 -0700 master-replica-0 global_step/sec: 9.67992
INFO 2017-10-21 21:08:16 -0700 master-replica-0 loss = 72.6526, step = 11767 (21.316 sec)
INFO 2017-10-21 21:08:16 -0700 master-replica-0 global_step/sec: 9.09138
INFO 2017-10-21 21:08:26 -0700 master-replica-0 global_step/sec: 9.73093
INFO 2017-10-21 21:08:37 -0700 master-replica-0 loss = 92.7126, step = 11967 (20.601 sec)
INFO 2017-10-21 21:08:37 -0700 master-replica-0 global_step/sec: 9.69434
INFO 2017-10-21 21:08:47 -0700 master-replica-0 global_step/sec: 9.49967
INFO 2017-10-21 21:08:57 -0700 master-replica-0 loss = 88.9892, step = 12167 (20.814 sec)
INFO 2017-10-21 21:08:58 -0700 master-replica-0 global_step/sec: 9.72281
INFO 2017-10-21 21:09:08 -0700 master-replica-0 global_step/sec: 9.76142
INFO 2017-10-21 21:09:18 -0700 master-replica-0 loss = 87.0694, step = 12367 (20.474 sec)
INFO 2017-10-21 21:09:18 -0700 master-replica-0 global_step/sec: 9.73585
INFO 2017-10-21 21:09:28 -0700 master-replica-0 global_step/sec: 9.60059
INFO 2017-10-21 21:09:39 -0700 master-replica-0 loss = 45.9486, step = 12567 (20.998 sec)
INFO 2017-10-21 21:09:39 -0700 master-replica-0 global_step/sec: 9.4731
INFO 2017-10-21 21:09:40 -0700 master-replica-0 Saving checkpoints for 12573 into gs://charlesreid1-ml/taxifare/ch4/taxi_trained/model.ckpt.
WARNING 2017-10-21 21:09:44 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20.
WARNING 2017-10-21 21:09:44 -0700 master-replica-0 Instructions for updating:
WARNING 2017-10-21 21:09:44 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default
WARNING 2017-10-21 21:09:44 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY.
WARNING 2017-10-21 21:09:44 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64
WARNING 2017-10-21 21:09:44 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying
WARNING 2017-10-21 21:09:44 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked
WARNING 2017-10-21 21:09:44 -0700 master-replica-0 as deprecated.
WARNING 2017-10-21 21:09:45 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/head.py:625: scalar_summary (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30.
WARNING 2017-10-21 21:09:45 -0700 master-replica-0 Instructions for updating:
WARNING 2017-10-21 21:09:45 -0700 master-replica-0 Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported.
INFO 2017-10-21 21:09:45 -0700 master-replica-0 Starting evaluation at 2017-10-22-04:09:45
INFO 2017-10-21 21:09:45 -0700 master-replica-0 Restoring parameters from gs://charlesreid1-ml/taxifare/ch4/taxi_trained/model.ckpt-12573
INFO 2017-10-21 21:09:47 -0700 master-replica-0 Evaluation [1/10]
INFO 2017-10-21 21:09:47 -0700 master-replica-0 Evaluation [2/10]
INFO 2017-10-21 21:09:47 -0700 master-replica-0 Evaluation [3/10]
INFO 2017-10-21 21:09:47 -0700 master-replica-0 Evaluation [4/10]
INFO 2017-10-21 21:09:47 -0700 master-replica-0 Evaluation [5/10]
INFO 2017-10-21 21:09:47 -0700 master-replica-0 Evaluation [6/10]
INFO 2017-10-21 21:09:47 -0700 master-replica-0 Evaluation [7/10]
INFO 2017-10-21 21:09:47 -0700 master-replica-0 Evaluation [8/10]
INFO 2017-10-21 21:09:47 -0700 master-replica-0 Evaluation [9/10]
INFO 2017-10-21 21:09:47 -0700 master-replica-0 Evaluation [10/10]
INFO 2017-10-21 21:09:47 -0700 master-replica-0 Finished evaluation at 2017-10-22-04:09:47
INFO 2017-10-21 21:09:47 -0700 master-replica-0 Saving dict for global step 12573: global_step = 12573, loss = 77.4343, rmse = 8.79968, training/hptuning/metric = 8.79968
INFO 2017-10-21 21:09:47 -0700 master-replica-0 Validation (step 12572): loss = 77.4343, rmse = 8.79968, global_step = 12573, training/hptuning/metric = 8.79968
INFO 2017-10-21 21:09:57 -0700 master-replica-0 global_step/sec: 5.57763
INFO 2017-10-21 21:10:08 -0700 master-replica-0 loss = 89.8612, step = 12767 (28.677 sec)
INFO 2017-10-21 21:10:08 -0700 master-replica-0 global_step/sec: 9.29792
INFO 2017-10-21 21:10:18 -0700 master-replica-0 global_step/sec: 9.38126
INFO 2017-10-21 21:10:29 -0700 master-replica-0 loss = 92.6565, step = 12967 (21.214 sec)
INFO 2017-10-21 21:10:29 -0700 master-replica-0 global_step/sec: 9.47322
INFO 2017-10-21 21:10:39 -0700 master-replica-0 global_step/sec: 9.50101
INFO 2017-10-21 21:10:50 -0700 master-replica-0 loss = 59.6791, step = 13167 (21.163 sec)
INFO 2017-10-21 21:10:50 -0700 master-replica-0 global_step/sec: 9.40882
INFO 2017-10-21 21:11:01 -0700 master-replica-0 global_step/sec: 9.40758
INFO 2017-10-21 21:11:11 -0700 master-replica-0 loss = 54.2877, step = 13367 (21.264 sec)
INFO 2017-10-21 21:11:11 -0700 master-replica-0 global_step/sec: 9.4115
INFO 2017-10-21 21:11:22 -0700 master-replica-0 global_step/sec: 9.53999
INFO 2017-10-21 21:11:32 -0700 master-replica-0 loss = 71.2122, step = 13567 (21.068 sec)
INFO 2017-10-21 21:11:32 -0700 master-replica-0 global_step/sec: 9.44938
INFO 2017-10-21 21:11:43 -0700 master-replica-0 global_step/sec: 9.22973
INFO 2017-10-21 21:11:53 -0700 master-replica-0 loss = 74.8419, step = 13767 (21.143 sec)
INFO 2017-10-21 21:11:54 -0700 master-replica-0 global_step/sec: 9.68982
INFO 2017-10-21 21:12:04 -0700 master-replica-0 global_step/sec: 9.63594
INFO 2017-10-21 21:12:14 -0700 master-replica-0 loss = 72.6368, step = 13967 (20.643 sec)
INFO 2017-10-21 21:12:14 -0700 master-replica-0 global_step/sec: 9.7084
INFO 2017-10-21 21:12:25 -0700 master-replica-0 global_step/sec: 9.67787
INFO 2017-10-21 21:12:35 -0700 master-replica-0 loss = 92.3282, step = 14167 (20.539 sec)
INFO 2017-10-21 21:12:35 -0700 master-replica-0 global_step/sec: 9.82631
INFO 2017-10-21 21:12:46 -0700 master-replica-0 global_step/sec: 9.01691
INFO 2017-10-21 21:12:57 -0700 master-replica-0 loss = 88.7376, step = 14367 (22.742 sec)
INFO 2017-10-21 21:12:57 -0700 master-replica-0 global_step/sec: 8.59312
INFO 2017-10-21 21:13:08 -0700 master-replica-0 global_step/sec: 9.81168
INFO 2017-10-21 21:13:18 -0700 master-replica-0 loss = 86.989, step = 14567 (20.255 sec)
INFO 2017-10-21 21:13:18 -0700 master-replica-0 global_step/sec: 9.90485
INFO 2017-10-21 21:13:28 -0700 master-replica-0 global_step/sec: 9.52851
INFO 2017-10-21 21:13:39 -0700 master-replica-0 loss = 46.0254, step = 14767 (21.124 sec)
INFO 2017-10-21 21:13:39 -0700 master-replica-0 global_step/sec: 9.43595
INFO 2017-10-21 21:13:50 -0700 master-replica-0 global_step/sec: 9.38907
INFO 2017-10-21 21:14:00 -0700 master-replica-0 loss = 89.9714, step = 14967 (20.917 sec)
INFO 2017-10-21 21:14:00 -0700 master-replica-0 global_step/sec: 9.71983
INFO 2017-10-21 21:14:11 -0700 master-replica-0 global_step/sec: 9.08081
INFO 2017-10-21 21:14:21 -0700 master-replica-0 loss = 92.6318, step = 15167 (21.460 sec)
INFO 2017-10-21 21:14:21 -0700 master-replica-0 global_step/sec: 9.58988
INFO 2017-10-21 21:14:31 -0700 master-replica-0 global_step/sec: 9.79417
INFO 2017-10-21 21:14:42 -0700 master-replica-0 loss = 59.7643, step = 15367 (20.698 sec)
INFO 2017-10-21 21:14:42 -0700 master-replica-0 global_step/sec: 9.54277
INFO 2017-10-21 21:14:53 -0700 master-replica-0 global_step/sec: 9.35891
INFO 2017-10-21 21:15:03 -0700 master-replica-0 loss = 54.3425, step = 15567 (20.964 sec)
INFO 2017-10-21 21:15:03 -0700 master-replica-0 global_step/sec: 9.70268
INFO 2017-10-21 21:15:13 -0700 master-replica-0 global_step/sec: 9.74716
INFO 2017-10-21 21:15:23 -0700 master-replica-0 Saving checkpoints for 15765 into gs://charlesreid1-ml/taxifare/ch4/taxi_trained/model.ckpt.
INFO 2017-10-21 21:15:27 -0700 master-replica-0 Loss for final step: 97.5639.
WARNING 2017-10-21 21:15:27 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20.
WARNING 2017-10-21 21:15:27 -0700 master-replica-0 Instructions for updating:
WARNING 2017-10-21 21:15:27 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default
WARNING 2017-10-21 21:15:27 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY.
WARNING 2017-10-21 21:15:27 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64
WARNING 2017-10-21 21:15:27 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying
WARNING 2017-10-21 21:15:27 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked
WARNING 2017-10-21 21:15:27 -0700 master-replica-0 as deprecated.
WARNING 2017-10-21 21:15:28 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/head.py:625: scalar_summary (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30.
WARNING 2017-10-21 21:15:28 -0700 master-replica-0 Instructions for updating:
WARNING 2017-10-21 21:15:28 -0700 master-replica-0 Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported.
INFO 2017-10-21 21:15:28 -0700 master-replica-0 Starting evaluation at 2017-10-22-04:15:28
INFO 2017-10-21 21:15:28 -0700 master-replica-0 Restoring parameters from gs://charlesreid1-ml/taxifare/ch4/taxi_trained/model.ckpt-15765
INFO 2017-10-21 21:15:30 -0700 master-replica-0 Evaluation [1/10]
INFO 2017-10-21 21:15:30 -0700 master-replica-0 Evaluation [2/10]
INFO 2017-10-21 21:15:30 -0700 master-replica-0 Evaluation [3/10]
INFO 2017-10-21 21:15:30 -0700 master-replica-0 Evaluation [4/10]
INFO 2017-10-21 21:15:30 -0700 master-replica-0 Evaluation [5/10]
INFO 2017-10-21 21:15:30 -0700 master-replica-0 Evaluation [6/10]
INFO 2017-10-21 21:15:30 -0700 master-replica-0 Evaluation [7/10]
INFO 2017-10-21 21:15:30 -0700 master-replica-0 Evaluation [8/10]
INFO 2017-10-21 21:15:30 -0700 master-replica-0 Evaluation [9/10]
INFO 2017-10-21 21:15:30 -0700 master-replica-0 Evaluation [10/10]
INFO 2017-10-21 21:15:30 -0700 master-replica-0 Finished evaluation at 2017-10-22-04:15:30
INFO 2017-10-21 21:15:30 -0700 master-replica-0 Saving dict for global step 15765: global_step = 15765, loss = 77.4633, rmse = 8.80133, training/hptuning/metric = 8.80133
WARNING 2017-10-21 21:15:32 -0700 master-replica-0 From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/feature_column.py:2306: calling sparse_feature_cross (from tensorflow.contrib.layers.python.ops.sparse_feature_cross_op) with hash_key=None is deprecated and will be removed after 2016-11-20.
WARNING 2017-10-21 21:15:32 -0700 master-replica-0 Instructions for updating: WARNING 2017-10-21 21:15:32 -0700 master-replica-0 The default behavior of sparse_feature_cross is changing, the default WARNING 2017-10-21 21:15:32 -0700 master-replica-0 value for hash_key will change to SPARSE_FEATURE_CROSS_DEFAULT_HASH_KEY. WARNING 2017-10-21 21:15:32 -0700 master-replica-0 From that point on sparse_feature_cross will always use FingerprintCat64 WARNING 2017-10-21 21:15:32 -0700 master-replica-0 to concatenate the feature fingerprints. And the underlying WARNING 2017-10-21 21:15:32 -0700 master-replica-0 _sparse_feature_cross_op.sparse_feature_cross operation will be marked WARNING 2017-10-21 21:15:32 -0700 master-replica-0 as deprecated. INFO 2017-10-21 21:15:33 -0700 master-replica-0 Restoring parameters from gs://charlesreid1-ml/taxifare/ch4/taxi_trained/model.ckpt-15765 INFO 2017-10-21 21:15:35 -0700 master-replica-0 Assets added to graph. INFO 2017-10-21 21:15:35 -0700 master-replica-0 No assets to write. INFO 2017-10-21 21:15:38 -0700 master-replica-0 SavedModel written to: gs://charlesreid1-ml/taxifare/ch4/taxi_trained/export/Servo/1508645733/saved_model.pb INFO 2017-10-21 21:15:38 -0700 master-replica-0 Module completed; cleaning up. INFO 2017-10-21 21:15:38 -0700 master-replica-0 Clean up finished. INFO 2017-10-21 21:15:38 -0700 master-replica-0 Task completed successfully. INFO 2017-10-21 21:15:53 -0700 service Tearing down TensorFlow. INFO 2017-10-21 21:17:08 -0700 service Finished tearing down TensorFlow. INFO 2017-10-21 21:18:06 -0700 service Job completed successfully.
Explanation of Model Used In Laboratory
The prior lab notes cover a complicated laboratory notebook, and the model that it trains is itself quite complicated. This section covers some of the details of that TensorFlow model.
The purpose of the model is to predict taxi fares from inputs such as the day and hour of the ride, the pickup and drop-off locations, and the number of passengers. The fare amounts in the data (the fare_amount column) are the values the model learns to predict.
Links:
- Link to the model package on github: https://github.com/GoogleCloudPlatform/training-data-analyst/tree/master/courses/machine_learning/feateng/taxifare
- Link to model.py on github: https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/machine_learning/feateng/taxifare/trainer/model.py
TensorFlow Model Inputs
The inputs to the model consist of several sparse columns (day of the week, hour of the day), which will be crossed to create a new input variable. We also have the real-valued columns of the pickup and drop-off latitude and longitude.
These input variables must ultimately come from the CSV file, so the CSV columns must be defined first:
CSV_COLUMNS = 'fare_amount,dayofweek,hourofday,pickuplon,pickuplat,dropofflon,dropofflat,passengers,key'.split(',')
When we run the model (either to train it or to make predictions with it), the input data must be a CSV file with columns in that specific order.
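To make the column-order requirement concrete, here is a minimal sketch in plain Python (no TensorFlow) that pairs each value in a header-less CSV row with its name from CSV_COLUMNS and splits off the fare as the label. The sample row, the LABEL_COLUMN name, and the parse_rows() helper are assumptions made up for this illustration; they are not taken from model.py.

```python
import csv
import io

CSV_COLUMNS = 'fare_amount,dayofweek,hourofday,pickuplon,pickuplat,dropofflon,dropofflat,passengers,key'.split(',')
LABEL_COLUMN = 'fare_amount'  # assumption: the fare is the value being predicted

# Hypothetical sample row, invented for illustration only.
sample = "12.5,Sun,17,-73.98,40.75,-73.95,40.78,2,row0\n"

def parse_rows(text):
    """Yield (features, label) pairs from header-less CSV text."""
    for values in csv.reader(io.StringIO(text)):
        # Values are matched to names purely by position,
        # which is why the column order in the file matters.
        row = dict(zip(CSV_COLUMNS, values))
        label = float(row.pop(LABEL_COLUMN))
        yield row, label

features, label = next(parse_rows(sample))
print(label, features['dayofweek'], features['hourofday'])
```

If the columns in the file were in a different order, the same code would silently assign values to the wrong names, which is the failure mode the ordering rule guards against.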
These will be transformed into the model input columns, so define these columns as TensorFlow feature columns:
from tensorflow.contrib import layers  # TensorFlow 1.x contrib API

INPUT_COLUMNS = [
    # define features
    layers.sparse_column_with_keys('dayofweek', keys=['Sun', 'Mon', 'Tues', 'Wed', 'Thu', 'Fri', 'Sat']),
    layers.sparse_column_with_integerized_feature('hourofday', bucket_size=24),

    # engineered features that are created in the input_fn
    layers.real_valued_column('latdiff'),
    layers.real_valued_column('londiff'),
    layers.real_valued_column('euclidean'),

    # real_valued_column
    layers.real_valued_column('pickuplon'),
    layers.real_valued_column('pickuplat'),
    layers.real_valued_column('dropofflat'),
    layers.real_valued_column('dropofflon'),
    layers.real_valued_column('passengers'),
]
Now, the TensorFlow model file has the following structure:
- build_estimator() - the heart of the model, this function actually builds the estimator or neural network model
- add_engineered() - this computes derived quantities that are not in the input file, and are not crossed features (e.g., latitude difference and longitude difference)
- serving_input_fn() - defines the inputs that the exported model accepts at prediction time; it is referenced from task.py (the call there is not entirely clear, but it appears to be part of assembling an export strategy)
- generate_csv_input_fn() - returns a function that creates a text file reader that will load the CSV data (that is, this is a function that generates a function that reads the CSV input file)
- generate_tfrecord_input_fn() - returns a function that assembles and returns the input features as TensorFlow Feature objects, and the target Feature objects (another function that generates a function)
Not all of these functions will be covered, but the important ones will be explained below.
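The "function that generates a function" pattern used by generate_csv_input_fn() and generate_tfrecord_input_fn() can be sketched without TensorFlow. The version below is a hypothetical stand-in: it borrows the name generate_csv_input_fn, but the batch_size parameter, the inner _input_fn name, and the use of the csv module are assumptions for illustration. The real function builds TensorFlow readers instead.

```python
import csv
import os
import tempfile

def generate_csv_input_fn(filename, batch_size=2):
    """Return a zero-argument function that reads `filename` in batches.

    Hypothetical stand-in: the outer call only captures the configuration;
    no file is opened until the returned function is called and iterated.
    """
    def _input_fn():
        with open(filename, newline='') as f:
            batch = []
            for row in csv.reader(f):
                batch.append(row)
                if len(batch) == batch_size:
                    yield batch
                    batch = []
            if batch:  # emit the final, possibly smaller, batch
                yield batch
    return _input_fn

# Usage with a throwaway file (contents invented for illustration).
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as tmp:
    tmp.write("1,a\n2,b\n3,c\n")

train_input_fn = generate_csv_input_fn(tmp.name, batch_size=2)
batches = list(train_input_fn())
os.remove(tmp.name)
print(batches)
```

The point of the indirection is that the training framework expects an input function it can call with no arguments, so the file name and batch size have to be bound in advance by the outer function.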
Build Estimator Function
The actual step of building an estimator (the neural network model itself) requires some preparation of the input columns. Specifically:
- Bucketizing the latitude and longitude values
- Feature-crossing the pickup latitude and longitude
- Feature-crossing the drop-off latitude and longitude
- Feature-crossing the (feature-crossed pickup lat/long) and (feature-crossed drop-off lat/long)
- Tagging the wide (sparse) columns
- Tagging the deep (dense) columns
import numpy as np

def build_estimator(model_dir, nbuckets, hidden_units):
    """
    Build an estimator starting from INPUT COLUMNS.
    These include feature transformations and synthetic features.
    The model is a wide-and-deep model.
    """

    # input columns, unpacked in the same order as INPUT_COLUMNS
    # (dropofflat comes before dropofflon in that list)
    (dayofweek, hourofday, latdiff, londiff, euclidean,
     plon, plat, dlat, dlon, pcount) = INPUT_COLUMNS

    # bucketize the lats & lons
    latbuckets = np.linspace(38.0, 42.0, nbuckets).tolist()
    lonbuckets = np.linspace(-76.0, -72.0, nbuckets).tolist()
    b_plat = layers.bucketized_column(plat, latbuckets)
    b_dlat = layers.bucketized_column(dlat, latbuckets)
    b_plon = layers.bucketized_column(plon, lonbuckets)
    b_dlon = layers.bucketized_column(dlon, lonbuckets)

    # feature cross
    ploc = layers.crossed_column([b_plat, b_plon], nbuckets*nbuckets)
    dloc = layers.crossed_column([b_dlat, b_dlon], nbuckets*nbuckets)
    pd_pair = layers.crossed_column([ploc, dloc], nbuckets ** 4)
    day_hr = layers.crossed_column([dayofweek, hourofday], 24*7)

    # Wide columns and deep columns.
    wide_columns = [
        # feature crosses
        dloc, ploc, pd_pair,
        day_hr,

        # sparse columns
        dayofweek, hourofday,

        # anything with a linear relationship
        pcount
    ]

    deep_columns = [
        # embedding_column to "group" together ...
        layers.embedding_column(pd_pair, 10),
        layers.embedding_column(day_hr, 10),

        # real_valued_column
        plat, plon, dlat, dlon,
        latdiff, londiff, euclidean
    ]
Note that some of these inputs, such as latdiff and londiff, are not columns in the CSV file. They are computed by the add_engineered() function. This, in turn, is called by serving_input_fn(), which is the entry point for data fed to the model at prediction time.
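As a rough illustration of what add_engineered() computes, the sketch below works on a plain dictionary of floats rather than on TensorFlow tensors. The sign convention (pickup minus drop-off) and the straight-line distance formula are assumptions for this example; the feature names latdiff, londiff, and euclidean are the ones listed in INPUT_COLUMNS.

```python
import math

def add_engineered(features):
    """Return a copy of `features` with latdiff, londiff and euclidean added.

    Assumption: differences are pickup minus drop-off, and 'euclidean' is the
    straight-line distance in degrees between the two points.
    """
    out = dict(features)
    latdiff = features['pickuplat'] - features['dropofflat']
    londiff = features['pickuplon'] - features['dropofflon']
    out['latdiff'] = latdiff
    out['londiff'] = londiff
    out['euclidean'] = math.sqrt(latdiff ** 2 + londiff ** 2)
    return out

# Hypothetical coordinates chosen so the arithmetic is easy to check by hand.
ride = {'pickuplat': 40.0, 'dropofflat': 37.0,
        'pickuplon': -70.0, 'dropofflon': -74.0}
engineered = add_engineered(ride)
print(engineered['latdiff'], engineered['londiff'], engineered['euclidean'])
```

Because these values are derived from columns that are available at prediction time, they can be computed in the same way for both training and serving inputs.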
The last step, which completes build_estimator(), is the actual creation of the wide-and-deep neural network from the specified wide and deep columns:
    return tf.contrib.learn.DNNLinearCombinedRegressor(
        model_dir=model_dir,
        linear_feature_columns=wide_columns,
        dnn_feature_columns=deep_columns,
        dnn_hidden_units=hidden_units or [128, 32, 4])
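The bucketizing and feature-crossing steps inside build_estimator() can be illustrated in plain Python. This is only a sketch of the idea: the linspace(), bucketize(), and cross() helpers are hypothetical, the MD5-based hash is a stand-in (TensorFlow uses its own fingerprint hash, so actual bucket ids will differ), and the coordinates are invented.

```python
import bisect
import hashlib

def linspace(start, stop, num):
    """Evenly spaced boundaries, like np.linspace(start, stop, num).tolist()."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]

def bucketize(value, boundaries):
    """Index of the bucket holding `value`.

    0 means below the first boundary; len(boundaries) means at or above the last.
    """
    return bisect.bisect_right(boundaries, value)

def cross(values, hash_bucket_size):
    """Combine several discrete values into one id in [0, hash_bucket_size)."""
    key = '_X_'.join(str(v) for v in values).encode('utf-8')
    return int(hashlib.md5(key).hexdigest(), 16) % hash_bucket_size

nbuckets = 5
latbuckets = linspace(38.0, 42.0, nbuckets)
lonbuckets = linspace(-76.0, -72.0, nbuckets)

# Hypothetical pickup point.
b_plat = bucketize(40.75, latbuckets)
b_plon = bucketize(-73.98, lonbuckets)

# Crossing the two bucket ids gives one categorical "grid cell" feature.
ploc = cross([b_plat, b_plon], nbuckets * nbuckets)
print(b_plat, b_plon, ploc)
```

With nbuckets = 5, the pickup and drop-off crosses each hash into 25 buckets, the pickup/drop-off pair cross into 5 ** 4 = 625 buckets, and the day/hour cross into 24 * 7 = 168 buckets, which mirrors how the hash bucket sizes in build_estimator() are chosen.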