|
|
| Adversarial neural networks use an architecture consisting of two separate neural networks - one network attempts to learn how to accomplish a task, and another network attempts to differentiate between the output of the first network and the "real" output. | | Adversarial neural networks use an architecture consisting of two separate neural networks - one network attempts to learn how to accomplish a task, and another network attempts to differentiate between the output of the first network and the "real" output. |
|
| |
|
| =TensorFlow Examples of Adversarial Neural Networks= | | =TensorFlow Adversarial Examples= |
|
| |
|
| ==Adversarial Crypto== | | ==Adversarial Crypto== |
| This adversarial crypto neural network attempts to learn how to protect communications using the adversarial architecture. | | This adversarial crypto neural network attempts to learn how to protect communications using the adversarial architecture. |
|
| |
|
| Link to paper: "Learning to Protect Communications with Adversarial Neural Cryptography": https://arxiv.org/abs/1610.06918
| | Paper: "Learning to Protect Communications with Adversarial Neural Cryptography" |
| | |
| | Link to paper: https://arxiv.org/abs/1610.06918 |
|
| |
|
| Link to code: https://github.com/tensorflow/models/tree/master/research/adversarial_crypto | | Link to code: https://github.com/tensorflow/models/tree/master/research/adversarial_crypto |
| ===The Model=== | | ===The Model=== |
|
| |
|
| We'll step through the code line-by-line again. Here's the link to the code: https://github.com/tensorflow/models/blob/master/research/adversarial_crypto/train_eval.py | | We'll step through the code line-by-line. Here's the link to the code: https://github.com/tensorflow/models/blob/master/research/adversarial_crypto/train_eval.py |
|
| |
|
| ====License====
| | Full model walkthrough is on the [[TensorFlow/Adversarial Crypto]] page. |
|
| |
|
| Obligatory license info:
| | The rundown is: |
| | * Create an AdversarialCrypto class that holds a training optimizer object for the Bob and Alice networks |
| | * Define a method that evaluates the networks as-is and prints the percent losses |
| | * Define a method that trains the network for a specified number of iterations, stopping early if the network reaches its target losses |
| | * Define a method that calls the training function (above), then re-trains Eve several more times from scratch |
|
| |
|
| <pre>
| | ==Adversarial Text== |
| # Copyright 2016 The TensorFlow Authors All Rights Reserved.
| |
| #
| |
| # Licensed under the Apache License, Version 2.0 (the "License");
| |
| # you may not use this file except in compliance with the License.
| |
| # You may obtain a copy of the License at
| |
| #
| |
| # http://www.apache.org/licenses/LICENSE-2.0
| |
| #
| |
| # Unless required by applicable law or agreed to in writing, software
| |
| # distributed under the License is distributed on an "AS IS" BASIS,
| |
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
| |
| # See the License for the specific language governing permissions and
| |
| # limitations under the License.
| |
| # ==============================================================================
| |
| </pre>
| |
| | |
| Some info about the network:
| |
| * There are actually 3 neural networks involved: Alice, Bob, and Eve
| |
| * Alice takes inputs in_m (message), in_k (key) and outputs the ciphertext
| |
| * Bob takes inputs in_k (key), ciphertext and attempts to output the plaintext
| |
| * Eve takes input ciphertext (no key) and also attempts to output the plaintext
| |
| | |
| The file starts with imports/declarations to be compatible with Python 3:
| |
|
| |
|
| <pre>
| | This trains a neural network model to detect the sentiment in IMDB text. This illustrates semi-supervised learning. |
| # TensorFlow Python 3 compatibility
| |
| from __future__ import absolute_import
| |
| from __future__ import division
| |
| from __future__ import print_function
| |
| import signal
| |
| import sys
| |
| from six.moves import xrange # pylint: disable=redefined-builtin
| |
| import tensorflow as tf
| |
| </pre>
| |
|
| |
|
| ====Input Argument Flags and Parameters====
| | Link to code: https://github.com/tensorflow/models/tree/master/research/adversarial_text |
|
| |
|
| Hyperparameter flags can be set on the command line:
| | ==Running== |
|
| |
|
| <pre>
| | Running this model is slightly more complicated than running the adversarial crypto network. |
| flags = tf.app.flags
| |
| flags.DEFINE_float('learning_rate', 0.0008, 'Constant learning rate')
| |
| flags.DEFINE_integer('batch_size', 4096, 'Batch size')
| |
| FLAGS = flags.FLAGS
| |
| </pre>
| |
|
| |
|
| The <code>tf.app.flags</code> module does not seem to be documented anywhere, so its usage is not obvious here. However, as an author on the TF project states [https://stackoverflow.com/questions/33932901/whats-the-purpose-of-tf-app-flags-in-tensorflow#33938519 here], it is intended to make demos more convenient, and essentially wraps argparse. | | The adversarial text network steps are as follows: |
| | * fetch data |
| | * generate vocab |
| | * generate training/validation/test data |
| | * pretrain language model |
| | * train classifier |
| | * evaluate classifier on test data |
|
| |
|
| Also see [[TensorFlow/Command Line Args]].
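Since the flags are described as essentially a wrapper around argparse, the two flags defined above can be sketched with the standard library alone. This is only an illustration of the idea, not how <code>tf.app.flags</code> is implemented. The function name <code>parse_flags</code> is hypothetical; the flag names, defaults, and help strings are copied from the snippet above.

```python
import argparse


def parse_flags(argv=None):
    """Parse the two hyperparameters with argparse (illustrative stand-in for tf.app.flags)."""
    parser = argparse.ArgumentParser(description="Adversarial crypto hyperparameters")
    # Same names, defaults, and help text as the flags.DEFINE_* calls above.
    parser.add_argument("--learning_rate", type=float, default=0.0008,
                        help="Constant learning rate")
    parser.add_argument("--batch_size", type=int, default=4096,
                        help="Batch size")
    return parser.parse_args(argv)


# Passing an explicit list mimics a command line such as:
#   python train_eval.py --batch_size 512
FLAGS = parse_flags(["--batch_size", "512"])
print(FLAGS.learning_rate, FLAGS.batch_size)
```

Any flag that is not given on the command line keeps its default, which is the same behavior the walkthrough relies on when it reads <code>FLAGS.learning_rate</code>.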
| | ===Get Vocabulary Data=== |
|
| |
|
| More parameter definitions follow:
| | Start by obtaining the data, which is an 80 MB tar file, and decompress it: |
|
| |
|
| <pre> | | <pre> |
| # Input and output configuration.
| | $ wget http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz -O /tmp/imdb.tar.gz |
| TEXT_SIZE = 16
| |
| KEY_SIZE = 16
| |
|
| |
|
| # Training parameters.
| | $ tar -xf /tmp/imdb.tar.gz -C /tmp |
| ITERS_PER_ACTOR = 1
| |
| EVE_MULTIPLIER = 2 # Train Eve 2x for every step of Alice/Bob
| |
| # Train until either max loops or Alice/Bob "good enough":
| |
| MAX_TRAINING_LOOPS = 850000
| |
| BOB_LOSS_THRESH = 0.02 # Exit when Bob loss < 0.02 and Eve > 7.7 bits
| |
| EVE_LOSS_THRESH = 7.7
| |
|
| |
|
| # Logging and evaluation.
| | $ du -hs /tmp/aclImdb |
| PRINT_EVERY = 200 # In training, log every 200 steps.
| | 487M /tmp/aclImdb |
| EVE_EXTRA_ROUNDS = 2000 # At end, train eve a bit more.
| |
| RETRAIN_EVE_ITERS = 10000 # Retrain eve up to ITERS*LOOPS times.
| |
| RETRAIN_EVE_LOOPS = 25 # With an evaluation each loop
| |
| NUMBER_OF_EVE_RESETS = 5 # And do this up to 5 times with a fresh eve.
| |
| # Use EVAL_BATCHES samples each time we check accuracy.
| |
| EVAL_BATCHES = 1
| |
| </pre> | | </pre> |
|
| |
|
| | ===Build the Vocabulary=== |
|
| |
|
| ====Batch of Random Booleans====
| | Use a Bazel job to build the vocabulary from the data: |
| | |
| This is a method to define an array of random booleans - this is used to create the message that Alice encrypts, and to define the key that Alice and Bob use to decrypt the message.
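A minimal NumPy sketch of such a helper is shown here. It assumes bits are encoded as -1.0 and 1.0, which is inferred from the <code>(x + 1.0) / 2.0</code> rescaling in the loss code rather than taken from the original helper. The function name <code>random_bool_batch</code> and the <code>rng</code> parameter are hypothetical.

```python
import numpy as np


def random_bool_batch(batch_size, n, rng=None):
    """Return a (batch_size, n) float32 array whose entries are -1.0 or 1.0."""
    rng = np.random.default_rng() if rng is None else rng
    bits = rng.integers(0, 2, size=(batch_size, n))  # random 0/1 bits
    return (bits * 2 - 1).astype(np.float32)         # map {0, 1} -> {-1, 1}


# One batch of messages and one batch of keys, 16 bits each.
message = random_bool_batch(4, 16)  # TEXT_SIZE = 16
key = random_bool_batch(4, 16)      # KEY_SIZE = 16
```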
| |
| | |
| ====AdversarialCrypto Class====
| |
| | |
| The AdversarialCrypto class defines the three neural networks that make up the adversarial setup. As part of the training and evaluation process <code>train_and_evaluate()</code>, an instance of this class is created and passed to the evaluation function <code>doeval()</code> in the main body of the code.
| |
| | |
| What does this class do?
| |
| * Creates the three networks for Alice, Bob, and Eve
| |
| * Creates connections from Alice to Bob and Alice to Eve to pass the correct info to the correct networks
| |
| * Defines the loss function for Eve and for Bob
| |
| * Defines the optimizers that the networks should use
| |
| * Manages the state of each network (i.e., allows you to reset the Eve network)
| |
| | |
| <pre>
| |
| class AdversarialCrypto(object):
| |
| """Primary model implementation class for Adversarial Neural Crypto.
| |
| This class contains the code for the model itself,
| |
| and when created, plumbs the pathways from Alice to Bob and
| |
| Eve, creates the optimizers and loss functions, etc.
| |
|
| |
| Attributes:
| |
| eve_loss: Eve's loss function.
| |
| bob_loss: Bob's loss function. Different units from eve_loss.
| |
| eve_optimizer: A tf op that runs Eve's optimizer.
| |
| bob_optimizer: A tf op that runs Bob's optimizer.
| |
| bob_reconstruction_loss: Bob's message reconstruction loss,
| |
| which is comparable to eve_loss.
| |
| reset_eve_vars: Execute this op to completely reset Eve.
| |
| """
| |
| </pre>
| |
| | |
| What does the constructor do?
| |
| * Creates the Alice, Bob, and Eve models by calling the <code>model()</code> method with the right parameters
| |
| * Creates the optimizer for Bob and for Eve
| |
| * Sets up the loss function for Eve, based on <code>tf.reduce_sum()</code> and <code>optimizer.minimize()</code>
| |
| * Sets up the loss function for Bob, based on <code>tf.reduce_sum()</code>
| |
|
| |
|
| <pre> | | <pre> |
| def __init__(self):
| | $ IMDB_DATA_DIR=/tmp/imdb |
| in_m, in_k = self.get_message_and_key()
| |
| encrypted = self.model('alice', in_m, in_k)
| |
| decrypted = self.model('bob', encrypted, in_k)
| |
| eve_out = self.model('eve', encrypted, None)
| |
| | |
| self.reset_eve_vars = tf.group(
| |
| *[w.initializer for w in tf.get_collection('eve')])
| |
| | |
| optimizer = tf.train.AdamOptimizer(learning_rate=FLAGS.learning_rate)
| |
| | |
| # Eve's goal is to decrypt the entire message:
| |
| eve_bits_wrong = tf.reduce_sum(
| |
| tf.abs((eve_out + 1.0) / 2.0 - (in_m + 1.0) / 2.0), [1])
| |
| self.eve_loss = tf.reduce_sum(eve_bits_wrong)
| |
| self.eve_optimizer = optimizer.minimize(
| |
| self.eve_loss, var_list=tf.get_collection('eve'))
| |
| | |
| # Alice and Bob want to be accurate...
| |
| self.bob_bits_wrong = tf.reduce_sum(
| |
| tf.abs((decrypted + 1.0) / 2.0 - (in_m + 1.0) / 2.0), [1])
| |
| # ... and to not let Eve do better than guessing.
| |
| self.bob_reconstruction_loss = tf.reduce_sum(self.bob_bits_wrong)
| |
| bob_eve_error_deviation = tf.abs(float(TEXT_SIZE) / 2.0 - eve_bits_wrong)
| |
| # 7-9 bits wrong is OK too, so we squish the error function a bit.
| |
| # Without doing this, we often tend to hang out at 0.25 / 7.5 error,
| |
| # and it seems bad to have continued, high communication error.
| |
| bob_eve_loss = tf.reduce_sum(
| |
| tf.square(bob_eve_error_deviation) / (TEXT_SIZE / 2)**2)
| |
|
| |
|
| # Rescale the losses to [0, 1] per example and combine. | | $ bazel run data:gen_vocab -- \ |
| self.bob_loss = (self.bob_reconstruction_loss / TEXT_SIZE + bob_eve_loss) | | --output_dir=$IMDB_DATA_DIR \ |
| | | --dataset=imdb \ |
| self.bob_optimizer = optimizer.minimize( | | --imdb_input_dir=/tmp/aclImdb \ |
| self.bob_loss,
| | --lowercase=False |
| var_list=(tf.get_collection('alice') + tf.get_collection('bob')))
| |
| </pre> | | </pre> |
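To make the loss arithmetic easier to follow, here is the same calculation in plain NumPy. It mirrors the formulas in the constructor above but is only a sketch: it assumes outputs and messages are arrays of shape (batch, TEXT_SIZE) with values in [-1, 1], and the function names <code>bits_wrong</code> and <code>bob_loss</code> are chosen here for illustration.

```python
import numpy as np

TEXT_SIZE = 16


def bits_wrong(output, message):
    """Per-example count of wrong bits, after rescaling -1/1 values to 0/1."""
    return np.sum(np.abs((output + 1.0) / 2.0 - (message + 1.0) / 2.0), axis=1)


def bob_loss(decrypted, eve_out, message):
    """Bob's reconstruction error plus a penalty for Eve deviating from chance."""
    bob_reconstruction_loss = np.sum(bits_wrong(decrypted, message))
    eve_bits_wrong = bits_wrong(eve_out, message)
    # Eve guessing at random gets TEXT_SIZE / 2 bits wrong; penalize any deviation.
    deviation = np.abs(TEXT_SIZE / 2.0 - eve_bits_wrong)
    bob_eve_loss = np.sum(np.square(deviation) / (TEXT_SIZE / 2) ** 2)
    # Rescale the reconstruction term and combine, as in the original code.
    return bob_reconstruction_loss / TEXT_SIZE + bob_eve_loss
```

With a single example, a perfect Bob and an Eve that gets exactly 8 of 16 bits wrong give a loss of 0, while a perfect Bob and a perfect Eve give a loss of 1. This shows that Bob is penalized whenever Eve does better (or worse) than guessing.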
|
| |
|
| ====AdversarialCrypto Class - Creation of Neural Network Model====
| | This uses a build rule called <code>gen_vocab</code> located in <code>data/BUILD</code>: |
| | |
| Now, the actual creation of the models for Alice, Bob, and Eve happens in the call to <code>model()</code>. What happens with the method header?
| |
| * We pass in the name of the collection ('alice', 'bob', or 'eve') that new model variables are added to
| |
| * We pass in the input message (either the plain text, to Alice, or the ciphertext, to Bob and Eve)
| |
| * We pass in the key (optional); if no key is passed in, the input to the neural network is just the message
| |
| | |
| Here's the model method definition:
| |
| | |
| <pre>
| |
| def model(self, collection, message, key=None):
| |
| """The model for Alice, Bob, and Eve. If key=None, the first FC layer
| |
| takes only the message as inputs. Otherwise, it uses both the key
| |
| and the message.
| |
| Args:
| |
| collection: The graph keys collection to add new vars to.
| |
| message: The input message to process.
| |
| key: The input key (if any) to use.
| |
| """
| |
| | |
| if key is not None:
| |
| combined_message = tf.concat(axis=1, values=[message, key])
| |
| else:
| |
| combined_message = message
| |
| | |
| </pre>
| |
| | |
| If we pass in both a message and a key, we concatenate the inputs using <code>tf.concat()</code>. Otherwise, the only input is the message.
| |
| | |
| The next step is to call <code>tf.contrib.framework.arg_scope()</code>. According to the [https://www.tensorflow.org/api_docs/python/tf/contrib/framework/arg_scope documentation], this context manager sets default arguments for each op in the list passed to it; each of those ops must be decorated with <code>@add_arg_scope</code>.
| |
| |
| In other words, every fully_connected or conv2d layer created inside the block adds its variables to the specified collection ('alice', 'bob', or 'eve'):
| |
|
| |
|
| <pre> | | <pre> |
| # Ensure that all variables created are in the specified collection. | | py_binary( |
| with tf.contrib.framework.arg_scope( | | name = "gen_vocab", |
| [tf.contrib.layers.fully_connected, tf.contrib.layers.conv2d], | | srcs = ["gen_vocab.py"], |
| variables_collections=[collection]): | | deps = [ |
| | ":data_utils", |
| | ":document_generators", |
| | # tensorflow dep, |
| | ], |
| | ) |
| </pre> | | </pre> |
|
| |
|
| Next, we create a fully connected neural network layer. Its input is the message (concatenated with the key, if one was given), its output size is <code>TEXT_SIZE + KEY_SIZE</code>, its bias is initialized to all zeros, and it has no activation function:
| | Unfortunately, this vocabulary-building step is failing. See GitHub issue 1917: https://github.com/tensorflow/models/issues/1917 |
|
| |
|
| <pre>
| | ==Adversarial Image Network== |
| fc = tf.contrib.layers.fully_connected(
| |
| combined_message,
| |
| TEXT_SIZE + KEY_SIZE,
| |
| biases_initializer=tf.constant_initializer(0.0),
| |
| activation_fn=None)
| |
| </pre>
| |
| | |
| Next, we assemble the layers of the neural network model.
| |
| | |
| The model architecture we use is:
| |
| | |
| <pre>
| |
| (Fully Connected) -> (Conv2D) -> (Conv2D) -> (Conv2D) -> (Squeeze)
| |
| </pre>
| |
| | |
| This performs a sequence of 1D convolutions (expands the message out, and squeezes it back down).
| |
| | |
| <pre>
| |
| fc = tf.contrib.layers.fully_connected(
| |
| combined_message,
| |
| TEXT_SIZE + KEY_SIZE,
| |
| biases_initializer=tf.constant_initializer(0.0),
| |
| activation_fn=None)
| |
| | |
| # Perform a sequence of 1D convolutions (by expanding the message out to 2D
| |
| # and then squeezing it back down).
| |
| fc = tf.expand_dims(fc, 2)
| |
| # 2,1 -> 1,2
| |
| conv = tf.contrib.layers.conv2d(
| |
| fc, 2, 2, 2, 'SAME', activation_fn=tf.nn.sigmoid)
| |
| # 1,2 -> 1, 2
| |
| conv = tf.contrib.layers.conv2d(
| |
| conv, 2, 1, 1, 'SAME', activation_fn=tf.nn.sigmoid)
| |
| # 1,2 -> 1, 1
| |
| conv = tf.contrib.layers.conv2d(
| |
| conv, 1, 1, 1, 'SAME', activation_fn=tf.nn.tanh)
| |
| conv = tf.squeeze(conv, 2)
| |
| return conv
| |
| </pre>
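The shape of the tensor at each stage can be traced without TensorFlow. The sketch below assumes the positional arguments of <code>conv2d</code> are (inputs, num_outputs, kernel_size, stride, padding) and that 'SAME' padding gives an output length of ceil(length / stride). The helper names <code>same_conv_length</code> and <code>trace_shapes</code> are hypothetical.

```python
import math

TEXT_SIZE = 16
KEY_SIZE = 16


def same_conv_length(length, stride):
    """Output length of a 1D convolution with 'SAME' padding."""
    return math.ceil(length / stride)


def trace_shapes(batch_size):
    """Return (stage name, shape) for each stage of model()."""
    width = TEXT_SIZE + KEY_SIZE
    shapes = [("fully_connected", (batch_size, width)),
              ("expand_dims", (batch_size, width, 1))]
    # (num_outputs, stride) for the three convolutions, read off the conv2d calls.
    for i, (channels, stride) in enumerate([(2, 2), (2, 1), (1, 1)], start=1):
        width = same_conv_length(width, stride)
        shapes.append(("conv%d" % i, (batch_size, width, channels)))
    shapes.append(("squeeze", (batch_size, width)))
    return shapes


for name, shape in trace_shapes(4096):
    print(name, shape)
```

Under these assumptions the first convolution, with stride 2, halves the length from 32 to 16, so the final output has TEXT_SIZE values per example, the same size as the message.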
| |
| | |
| ==Adversarial Text== | |
|
| |
|
| =Flags= | | =Flags= |
Adversarial Neural Networks
Adversarial neural networks use an architecture consisting of two separate neural networks - one network attempts to learn how to accomplish a task, and another network attempts to differentiate between the output of the first network and the "real" output.
TensorFlow Adversarial Examples
Adversarial Crypto
This adversarial crypto neural network attempts to learn how to protect communications using the adversarial architecture.
Paper: "Learning to Protect Communications with Adversarial Neural Cryptography"
Link to paper: https://arxiv.org/abs/1610.06918
Link to code: https://github.com/tensorflow/models/tree/master/research/adversarial_crypto
Part of the tensorflow models repository (https://github.com/tensorflow/models/tree/master/research).
Running
To train the network:
$ python train_eval.py
The approach used by the training is to train the "defender" network (representing the Alice-Bob channel) until it is sufficiently well-trained, then reset the "attacker" network (representing the eavesdropper Eve) from scratch to give the eavesdropper multiple opportunities to find weaknesses in the cryptosystem.
The Model
We'll step through the code line-by-line. Here's the link to the code: https://github.com/tensorflow/models/blob/master/research/adversarial_crypto/train_eval.py
Full model walkthrough is on the TensorFlow/Adversarial Crypto page.
The rundown is:
- Create an AdversarialCrypto class that holds a training optimizer object for the Bob and Alice networks
- Define a method that evaluates the networks as-is and prints the percent losses
- Define a method that trains the network for a specified number of iterations, stopping early if the network reaches its target losses
- Define a method that calls the training function (above), then re-trains Eve several more times from scratch
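The control flow of the last two steps can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the code in train_eval.py: the callables <code>step_alice_bob</code>, <code>step_eve</code>, <code>evaluate</code>, and <code>reset_eve</code> are hypothetical stand-ins for running the TensorFlow ops, <code>evaluate()</code> is assumed to return a (bob_loss, eve_loss) pair, and checking the losses only every <code>check_every</code> steps is an assumption. The default values are the constants defined in train_eval.py.

```python
def train_until_good_enough(step_alice_bob, step_eve, evaluate,
                            max_loops=850000, bob_thresh=0.02, eve_thresh=7.7,
                            eve_multiplier=2, check_every=200):
    """Alternate Alice/Bob and Eve updates; return True if the targets are met early."""
    for i in range(max_loops):
        step_alice_bob()
        for _ in range(eve_multiplier):  # Eve trains more often than Alice/Bob
            step_eve()
        if i % check_every == 0:
            bob_loss, eve_loss = evaluate()
            # "Good enough": Bob decrypts well while Eve stays near chance.
            if bob_loss < bob_thresh and eve_loss > eve_thresh:
                return True
    return False


def retrain_eve_from_scratch(reset_eve, step_eve, evaluate,
                             resets=5, loops=25, iters_per_loop=10000):
    """Give a fresh Eve several attempts; return the lowest Eve loss observed."""
    best = float("inf")
    for _ in range(resets):
        reset_eve()  # discard everything Eve has learned
        for _ in range(loops):
            for _ in range(iters_per_loop):
                step_eve()
            _, eve_loss = evaluate()
            best = min(best, eve_loss)
    return best
```

If the lowest Eve loss stays near 8 bits wrong out of 16 even after several resets, Eve is doing no better than guessing against the trained Alice-Bob channel.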
Adversarial Text
This trains a neural network model to detect the sentiment in IMDB text. This illustrates semi-supervised learning.
Link to code: https://github.com/tensorflow/models/tree/master/research/adversarial_text
Running
Running this model is slightly more complicated than running the adversarial crypto network.
The adversarial text network steps are as follows:
- fetch data
- generate vocab
- generate training/validation/test data
- pretrain language model
- train classifier
- evaluate classifier on test data
Get Vocabulary Data
Start by obtaining the data, which is an 80 MB tar file, and decompress it:
$ wget http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz -O /tmp/imdb.tar.gz
$ tar -xf /tmp/imdb.tar.gz -C /tmp
$ du -hs /tmp/aclImdb
487M /tmp/aclImdb
Build the Vocabulary
Use a Bazel job to build the vocabulary from the data:
$ IMDB_DATA_DIR=/tmp/imdb
$ bazel run data:gen_vocab -- \
--output_dir=$IMDB_DATA_DIR \
--dataset=imdb \
--imdb_input_dir=/tmp/aclImdb \
--lowercase=False
This uses a build rule called gen_vocab located in data/BUILD:
py_binary(
name = "gen_vocab",
srcs = ["gen_vocab.py"],
deps = [
":data_utils",
":document_generators",
# tensorflow dep,
],
)
Unfortunately, this vocabulary-building step is failing. See GitHub issue 1917: https://github.com/tensorflow/models/issues/1917
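As a rough picture of what a vocabulary-building step produces, the sketch below counts tokens across documents and orders them by frequency. It is not the implementation in gen_vocab.py: whitespace tokenization, the function name <code>build_vocab</code>, and the <code>min_count</code> parameter are assumptions made for illustration, while <code>lowercase</code> mirrors the --lowercase flag in the command above.

```python
from collections import Counter


def build_vocab(documents, lowercase=False, min_count=1):
    """Return tokens sorted by descending frequency, ties broken alphabetically."""
    counts = Counter()
    for doc in documents:
        text = doc.lower() if lowercase else doc
        counts.update(text.split())  # whitespace tokenization (simplifying assumption)
    kept = [(tok, c) for tok, c in counts.items() if c >= min_count]
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [tok for tok, _ in kept]


reviews = ["The movie was good", "the movie was bad"]
print(build_vocab(reviews))
print(build_vocab(reviews, lowercase=True))
```

With lowercase=False, "The" and "the" are kept as separate vocabulary entries, which is the setting used in the command above.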
Adversarial Image Network
Flags