From charlesreid1

Revision as of 07:09, 26 October 2017 by Admin

Adversarial Neural Networks

Adversarial neural networks use an architecture consisting of two separate neural networks: one network attempts to learn how to accomplish a task, while the other attempts to distinguish the first network's output from the "real" output.

TensorFlow Examples of Adversarial Neural Networks

Adversarial Crypto

This adversarial crypto neural network attempts to learn how to protect communications using the adversarial architecture.

Link to paper: "Learning to Protect Communications with Adversarial Neural Cryptography": https://arxiv.org/abs/1610.06918

Link to code: https://github.com/tensorflow/models/tree/master/research/adversarial_crypto

This code is part of the TensorFlow models repository (https://github.com/tensorflow/models/tree/master/research).

Running

To train the network:

$ python train_eval.py

The approach used by the training is to train the "defender" network (representing the Alice-Bob channel) until it is sufficiently well-trained, then reset the "attacker" network (representing the eavesdropper Eve) from scratch to give the eavesdropper multiple opportunities to find weaknesses in the cryptosystem.

The Model

We'll step through the code line by line. Here's the link to the code: https://github.com/tensorflow/models/blob/master/research/adversarial_crypto/train_eval.py

License

Obligatory license info:

# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

Some info about the network:

  • There are actually 3 neural networks involved: Alice, Bob, and Eve
  • Alice takes inputs in_m (message), in_k (key) and outputs the ciphertext
  • Bob takes inputs in_k (key), ciphertext and attempts to output the plaintext
  • Eve takes input ciphertext (no key) and also attempts to output the plaintext

The file starts with __future__ imports, which make the script behave the same way under Python 2 and Python 3, followed by the remaining imports:

# TensorFlow Python 3 compatibility
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import signal
import sys
from six.moves import xrange  # pylint: disable=redefined-builtin
import tensorflow as tf

Input Argument Flags and Parameters

Hyperparameter flags can be set on the command line:

flags = tf.app.flags
flags.DEFINE_float('learning_rate', 0.0008, 'Constant learning rate')
flags.DEFINE_integer('batch_size', 4096, 'Batch size')
FLAGS = flags.FLAGS

The tf.app.flags module does not appear to be covered in the TensorFlow documentation, so its intended usage is not obvious. According to one of the TensorFlow authors, it is intended to make demos more convenient, and it essentially wraps argparse.

Also see TensorFlow/Command Line Args.
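Since the flags are described as a thin wrapper around argparse, here is a rough argparse equivalent of the two flag definitions above. It is a sketch for comparison only: the function name parse_flags is hypothetical, and nothing here is how tf.app.flags is implemented internally.

```python
import argparse


def parse_flags(argv=None):
    """Parse the two hyperparameters with plain argparse (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="argparse stand-in for the tf.app.flags definitions")
    parser.add_argument("--learning_rate", type=float, default=0.0008,
                        help="Constant learning rate")
    parser.add_argument("--batch_size", type=int, default=4096,
                        help="Batch size")
    return parser.parse_args(argv)


# Passing an explicit list stands in for a command line such as
#   python train_eval.py --batch_size 512
flags = parse_flags(["--batch_size", "512"])
print(flags.learning_rate, flags.batch_size)
```

Flags that are not given on the command line keep their defaults, which matches how FLAGS.learning_rate and FLAGS.batch_size are read later in the script.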

More parameter definitions follow:

# Input and output configuration.
TEXT_SIZE = 16
KEY_SIZE = 16

# Training parameters.
ITERS_PER_ACTOR = 1
EVE_MULTIPLIER = 2  # Train Eve 2x for every step of Alice/Bob
# Train until either max loops or Alice/Bob "good enough":
MAX_TRAINING_LOOPS = 850000
BOB_LOSS_THRESH = 0.02  # Exit when Bob loss < 0.02 and Eve > 7.7 bits
EVE_LOSS_THRESH = 7.7

# Logging and evaluation.
PRINT_EVERY = 200  # In training, log every 200 steps.
EVE_EXTRA_ROUNDS = 2000  # At end, train eve a bit more.
RETRAIN_EVE_ITERS = 10000  # Retrain eve up to ITERS*LOOPS times.
RETRAIN_EVE_LOOPS = 25  # With an evaluation each loop
NUMBER_OF_EVE_RESETS = 5  # And do this up to 5 times with a fresh eve.
# Use EVAL_BATCHES samples each time we check accuracy.
EVAL_BATCHES = 1
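
The two thresholds combine into a single stopping condition, following the comment on BOB_LOSS_THRESH. The sketch below only restates that comment as code; the function name good_enough is hypothetical and the real script may organize the check differently.

```python
BOB_LOSS_THRESH = 0.02   # Bob must reconstruct the message almost perfectly
EVE_LOSS_THRESH = 7.7    # Eve must get at least this many bits wrong


def good_enough(bob_loss, eve_loss):
    """True when Bob is accurate AND Eve is close to random guessing."""
    return bob_loss < BOB_LOSS_THRESH and eve_loss > EVE_LOSS_THRESH


print(good_enough(0.01, 7.9))   # both conditions hold
print(good_enough(0.01, 5.0))   # Eve recovers too many bits
print(good_enough(0.30, 7.9))   # Bob is still too inaccurate
```

With TEXT_SIZE = 16, an eavesdropper that guesses at random gets 8 bits wrong on average, so requiring more than 7.7 wrong bits means Eve is doing barely better than chance.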


Batch of Random Booleans

This function creates an array of random booleans. It is used to generate the messages that Alice encrypts and the keys that Alice and Bob share (Alice to encrypt, Bob to decrypt).
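
The code for this step is not shown on this page, so here is a plain-Python sketch of the idea. It assumes each "boolean" is encoded as a float of -1.0 or +1.0, which is what the (x + 1.0) / 2.0 rescaling in the loss functions below suggests. The name random_bool_batch and the rng parameter are hypothetical, and the real script builds a TensorFlow tensor rather than nested lists.

```python
import random


def random_bool_batch(batch_size, n, rng=None):
    """Return batch_size rows of n random 'booleans' encoded as -1.0 / +1.0."""
    rng = rng or random.Random()
    # randint(0, 1) yields 0 or 1; scaling by 2 and subtracting 1 maps to -1 / +1.
    return [[float(2 * rng.randint(0, 1) - 1) for _ in range(n)]
            for _ in range(batch_size)]


rng = random.Random(0)                     # seeded only to make the demo repeatable
messages = random_bool_batch(4, 16, rng)   # plaintexts, TEXT_SIZE = 16
keys = random_bool_batch(4, 16, rng)       # shared keys, KEY_SIZE = 16
print(len(messages), len(messages[0]), len(keys[0]))
```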

Adversarial Crypto Class

The AdversarialCrypto class defines the three neural networks that make up the adversarial setup. During training and evaluation, train_and_evaluate() creates an instance of this class and passes it to the evaluation function doeval().

What does this class do?

  • Creates the three networks for Alice, Bob, and Eve
  • Creates connections from Alice to Bob and Alice to Eve to pass the correct info to the correct networks
  • Defines the loss function for Eve and for Bob
  • Defines the optimizers that the networks should use
  • Manages the state of each network (i.e., allows you to reset the Eve network)

class AdversarialCrypto(object):
  """Primary model implementation class for Adversarial Neural Crypto.
  This class contains the code for the model itself,
  and when created, plumbs the pathways from Alice to Bob and
  Eve, creates the optimizers and loss functions, etc.
  Attributes:
    eve_loss:  Eve's loss function.
    bob_loss:  Bob's loss function.  Different units from eve_loss.
    eve_optimizer:  A tf op that runs Eve's optimizer.
    bob_optimizer:  A tf op that runs Bob's optimizer.
    bob_reconstruction_loss:  Bob's message reconstruction loss,
      which is comparable to eve_loss.
    reset_eve_vars:  Execute this op to completely reset Eve.
  """

What does the constructor do?

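  • The constructor creates the Alice, Bob, and Eve models by calling the model() method with the right parameters
  • Creates a single Adam optimizer, which is shared by both training ops
  • Sets up the loss function for Eve, based on tf.reduce_sum(), and Eve's training op with optimizer.minimize() over the 'eve' variables
  • Sets up the loss function for Bob, based on tf.reduce_sum(), and the Alice/Bob training op with optimizer.minimize() over the 'alice' and 'bob' variables

  def __init__(self):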
    in_m, in_k = self.get_message_and_key()
    encrypted = self.model('alice', in_m, in_k)
    decrypted = self.model('bob', encrypted, in_k)
    eve_out = self.model('eve', encrypted, None)

    self.reset_eve_vars = tf.group(
        *[w.initializer for w in tf.get_collection('eve')])

    optimizer = tf.train.AdamOptimizer(learning_rate=FLAGS.learning_rate)

    # Eve's goal is to decrypt the entire message:
    eve_bits_wrong = tf.reduce_sum(
        tf.abs((eve_out + 1.0) / 2.0 - (in_m + 1.0) / 2.0), [1])
    self.eve_loss = tf.reduce_sum(eve_bits_wrong)
    self.eve_optimizer = optimizer.minimize(
        self.eve_loss, var_list=tf.get_collection('eve'))

    # Alice and Bob want to be accurate...
    self.bob_bits_wrong = tf.reduce_sum(
        tf.abs((decrypted + 1.0) / 2.0 - (in_m + 1.0) / 2.0), [1])
    # ... and to not let Eve do better than guessing.
    self.bob_reconstruction_loss = tf.reduce_sum(self.bob_bits_wrong)
    bob_eve_error_deviation = tf.abs(float(TEXT_SIZE) / 2.0 - eve_bits_wrong)
    # 7-9 bits wrong is OK too, so we squish the error function a bit.
    # Without doing this, we often tend to hang out at 0.25 / 7.5 error,
    # and it seems bad to have continued, high communication error.
    bob_eve_loss = tf.reduce_sum(
        tf.square(bob_eve_error_deviation) / (TEXT_SIZE / 2)**2)

    # Rescale the losses to [0, 1] per example and combine.
    self.bob_loss = (self.bob_reconstruction_loss / TEXT_SIZE + bob_eve_loss)

    self.bob_optimizer = optimizer.minimize(
        self.bob_loss,
        var_list=(tf.get_collection('alice') + tf.get_collection('bob')))
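
To make the loss arithmetic concrete, here is the same calculation for a single example in plain Python. It is a sketch that mirrors the expressions in the constructor; the function names bits_wrong and bob_loss_single are hypothetical, and the real code additionally sums over the whole batch.

```python
TEXT_SIZE = 16


def bits_wrong(output, message):
    """Map both vectors from [-1, 1] to [0, 1] and sum the absolute differences."""
    return sum(abs((o + 1.0) / 2.0 - (m + 1.0) / 2.0)
               for o, m in zip(output, message))


def bob_loss_single(bob_out, eve_out, message):
    """Per-example version of bob_loss: reconstruction term plus Eve term."""
    reconstruction = bits_wrong(bob_out, message)
    eve_wrong = bits_wrong(eve_out, message)
    # Zero when Eve gets exactly half the bits wrong, one when she is
    # completely right or completely wrong.
    eve_term = (TEXT_SIZE / 2.0 - eve_wrong) ** 2 / (TEXT_SIZE / 2.0) ** 2
    return reconstruction / TEXT_SIZE + eve_term


message = [1.0, -1.0] * 8
perfect = list(message)               # every bit correct
undecided = [0.0] * TEXT_SIZE         # halfway between -1 and +1 on every bit
inverted = [-m for m in message]      # every bit flipped

print(bits_wrong(undecided, message))                  # → 8.0
print(bob_loss_single(perfect, undecided, message))    # → 0.0
print(bob_loss_single(perfect, perfect, message))      # → 1.0
print(bob_loss_single(perfect, inverted, message))     # → 1.0
```

The last two lines show why the Eve term is symmetric: an eavesdropper that gets every bit wrong leaks as much as one that gets every bit right, since flipping her output recovers the message.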


Creation of Neural Network Model

Now, the actual creation of the models for Alice, Bob, and Eve happens in the call to model(). The method takes three arguments:

  • collection: the name of the graph collection ('alice', 'bob', or 'eve') that new variables are added to
  • message: the input message (the plaintext for Alice, the ciphertext for Bob and Eve)
  • key: the key (optional); if no key is passed in, the input to the neural network is just the message

Here's the model method definition:

  def model(self, collection, message, key=None):
    """The model for Alice, Bob, and Eve.  If key=None, the first FC layer
    takes only the message as inputs.  Otherwise, it uses both the key
    and the message.
    Args:
      collection:  The graph keys collection to add new vars to.
      message:  The input message to process.
      key:  The input key (if any) to use.
    """

    if key is not None:
      combined_message = tf.concat(axis=1, values=[message, key])
    else:
      combined_message = message

If we pass in both a message and a key, we concatenate the inputs using tf.concat(). Otherwise, the only input is the message.
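
The effect of the concatenation on the input width can be checked with nested lists. This is a plain-Python sketch, not TensorFlow code: combine is a hypothetical stand-in for the tf.concat(axis=1, ...) branch, with each inner list playing the role of one row of the batch.

```python
TEXT_SIZE = 16
KEY_SIZE = 16


def combine(message, key=None):
    """Join each message row with its key row, or pass the message through."""
    if key is None:
        return [list(row) for row in message]
    return [list(m_row) + list(k_row) for m_row, k_row in zip(message, key)]


batch = 3
message = [[1.0] * TEXT_SIZE for _ in range(batch)]
key = [[-1.0] * KEY_SIZE for _ in range(batch)]

with_key = combine(message, key)     # what Alice and Bob see
without_key = combine(message)       # what Eve sees
print(len(with_key[0]), len(without_key[0]))   # → 32 16
```

Alice and Bob therefore feed 32 values per example into the first layer, while Eve feeds only 16.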

The next step is to call tf.contrib.framework.arg_scope(). This context manager takes a list of layer functions (each decorated with @add_arg_scope) and sets default keyword arguments for every call to those functions inside the with block.

In other words, every fully_connected or conv2d layer created inside the block adds its variables to the specified collection ('alice', 'bob', or 'eve'):

    # Ensure that all variables created are in the specified collection.
    with tf.contrib.framework.arg_scope(
        [tf.contrib.layers.fully_connected, tf.contrib.layers.conv2d],
        variables_collections=[collection]):

Next, we create a fully connected layer. We pass in the message (concatenated with the key, if one was given), set the number of outputs to TEXT_SIZE + KEY_SIZE regardless of whether a key was passed, initialize the biases to zero, and set no activation function:

      fc = tf.contrib.layers.fully_connected(
          combined_message,
          TEXT_SIZE + KEY_SIZE,
          biases_initializer=tf.constant_initializer(0.0),
          activation_fn=None)

Next, we assemble the layers of the neural network model.

The model architecture we use is:

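(Fully Connected) -> (Expand Dims) -> (Conv2D) -> (Conv2D) -> (Conv2D) -> (Squeeze)

This performs a sequence of 1D convolutions by expanding the fully connected output out with an extra dimension and squeezing the result back down at the end. The complete body of the with block is: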

      fc = tf.contrib.layers.fully_connected(
          combined_message,
          TEXT_SIZE + KEY_SIZE,
          biases_initializer=tf.constant_initializer(0.0),
          activation_fn=None)

      # Perform a sequence of 1D convolutions (by expanding the message out to 2D
      # and then squeezing it back down).
      fc = tf.expand_dims(fc, 2)
      # 2,1 -> 1,2
      conv = tf.contrib.layers.conv2d(
          fc, 2, 2, 2, 'SAME', activation_fn=tf.nn.sigmoid)
      # 1,2 -> 1, 2
      conv = tf.contrib.layers.conv2d(
          conv, 2, 1, 1, 'SAME', activation_fn=tf.nn.sigmoid)
      # 1,2 -> 1, 1
      conv = tf.contrib.layers.conv2d(
          conv, 1, 1, 1, 'SAME', activation_fn=tf.nn.tanh)
      conv = tf.squeeze(conv, 2)
      return conv
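
To see how the tensor shape changes through these layers, the sketch below traces shapes only, with no actual convolution. It assumes that the positional arguments of tf.contrib.layers.conv2d are (inputs, num_outputs, kernel_size, stride, padding) and that 'SAME' padding gives an output width of ceil(width / stride). The helper names conv_same and trace_shapes are hypothetical.

```python
import math


def conv_same(shape, num_outputs, stride):
    """Output shape of a 1D convolution with 'SAME' padding (kernel size is irrelevant)."""
    batch, width, _channels = shape
    return (batch, math.ceil(width / stride), num_outputs)


def trace_shapes(batch, fc_width=32):
    """List (layer, shape) pairs for the layers in model()."""
    shapes = [("fully_connected", (batch, fc_width))]
    x = (batch, fc_width, 1)
    shapes.append(("expand_dims", x))
    x = conv_same(x, num_outputs=2, stride=2)   # conv2d(fc, 2, 2, 2, 'SAME')
    shapes.append(("conv_sigmoid_1", x))
    x = conv_same(x, num_outputs=2, stride=1)   # conv2d(conv, 2, 1, 1, 'SAME')
    shapes.append(("conv_sigmoid_2", x))
    x = conv_same(x, num_outputs=1, stride=1)   # conv2d(conv, 1, 1, 1, 'SAME')
    shapes.append(("conv_tanh", x))
    shapes.append(("squeeze", x[:2]))
    return shapes


for name, shape in trace_shapes(batch=4096):
    print(name, shape)
```

Under these assumptions the stride-2 convolution halves the width from 32 to 16, so the output has TEXT_SIZE values per example, and the final tanh keeps each value in [-1, 1], matching the encoding the loss functions expect.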

Adversarial Text

Flags