Notebook

Step 1: Importing TF and Datasets

Import TensorFlow and load the MNIST data.

In [1]:

import tensorflow as tf

In [2]:

# This tutorial module ships with TensorFlow 1.x; it was removed in TensorFlow 2.x
from tensorflow.examples.tutorials.mnist import input_data

In [ ]:

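# Downloads MNIST into MNIST_data/ if it is not already there;
# one_hot=True returns each label as a 10-element vector with a single 1
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)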
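As a rough illustration of what `one_hot=True` means for the labels, here is a minimal pure-Python sketch. The helper name `one_hot` is hypothetical and is not part of the MNIST loader; it only shows the encoding for a single digit label.

```python
def one_hot(label, num_classes=10):
    """Return a list of length num_classes with 1.0 at index `label` (hypothetical helper)."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

# The digit 3 becomes a vector with a single 1.0 in position 3
print(one_hot(3))
```

This is why `y_true` later has shape `[None, 10]`: one slot per digit class.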

Step 2: Create helper functions for Conv2D

Conv2D itself, plus the regular functions that come along with it: weight and bias initialization, max pooling, and the layer builders.

In [ ]:

def init_weights(shape):
    init_random_dist = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(init_random_dist)
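`tf.truncated_normal` draws from a normal distribution but re-picks any value more than two standard deviations from the mean. A minimal pure-Python sketch of that idea, assuming simple rejection sampling (the function name `truncated_normal_sample` and the `seed` parameter are illustrative, not TensorFlow's implementation):

```python
import random

def truncated_normal_sample(n, mean=0.0, stddev=0.1, seed=None):
    """Draw n values from N(mean, stddev), re-drawing any value beyond 2 stddev."""
    rng = random.Random(seed)
    values = []
    while len(values) < n:
        v = rng.gauss(mean, stddev)
        # keep only values within two standard deviations of the mean
        if abs(v - mean) <= 2 * stddev:
            values.append(v)
    return values

weights = truncated_normal_sample(5, stddev=0.1, seed=0)
print(all(abs(w) <= 0.2 for w in weights))
```

With `stddev=0.1`, every initial weight therefore lies in `[-0.2, 0.2]`.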

In [18]:

def init_bias(shape):
    init_bias_vals = tf.constant(0.1, shape=shape)
    return tf.Variable(init_bias_vals)

In [19]:

def conv2d(X,W):
    return tf.nn.conv2d(X, W, strides=[1,1,1,1], padding="SAME")

In [20]:

def max_pool_2x2(X):
    return tf.nn.max_pool(X, ksize=[1,2,2,1],
                         strides=[1,2,2,1],
                         padding="SAME")

Why 1x2x2x1? `[batch, height, width, channel or depth]`

Here we only want 2x2 max pooling over the height and width of the image, so both `ksize` and `strides` are `[1, 2, 2, 1]`.

The first 1 is the batch dimension: you don't usually want to skip over examples in your batch, or you shouldn't have included them in the first place. :)

The last 1 is the channel (depth) dimension: you don't usually want to skip input channels, for the same reason.

The conv2d operator is more general, so you could create convolutions that slide the window along other dimensions, but that's not a typical use in convnets. The typical use is to slide the window spatially, over height and width.
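To make the spatial 2x2 window concrete, here is a small pure-Python sketch of 2x2 max pooling with stride 2 on a single-channel grid. The name `max_pool_2x2_plain` is hypothetical; it only mirrors the height and width behaviour of `max_pool_2x2` for one example and one channel, and clips the window at the border the way "SAME" padding does for odd sizes.

```python
import math

def max_pool_2x2_plain(grid):
    """2x2 max pooling with stride 2 over a [height][width] list of numbers."""
    h, w = len(grid), len(grid[0])
    out_h, out_w = math.ceil(h / 2), math.ceil(w / 2)
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # collect the (up to) four values under the window, clipped at the border
            window = [grid[r][c]
                      for r in range(2 * i, min(2 * i + 2, h))
                      for c in range(2 * j, min(2 * j + 2, w))]
            row.append(max(window))
        pooled.append(row)
    return pooled

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
print(max_pool_2x2_plain(image))  # [[6, 8], [14, 16]]
```

The 4x4 grid shrinks to 2x2, while the batch and channel dimensions would be left untouched.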

Why reshape with -1? A -1 in the shape is a placeholder that says "adjust as necessary to match the size needed for the full tensor." It makes the code independent of the input batch size, so you can change your pipeline without adjusting the batch size everywhere in the code.
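The inference behind -1 is plain arithmetic: divide the total number of elements by the product of the known dimensions. A minimal sketch, assuming at most one -1 in the shape (the helper `resolve_shape` is hypothetical, not a TensorFlow function):

```python
def resolve_shape(num_elements, shape):
    """Replace a single -1 in `shape` so the product equals num_elements."""
    if shape.count(-1) > 1:
        raise ValueError("only one dimension can be -1")
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    if -1 not in shape:
        if known != num_elements:
            raise ValueError("shape does not match the number of elements")
        return list(shape)
    if known == 0 or num_elements % known != 0:
        raise ValueError("cannot infer the -1 dimension")
    return [num_elements // known if dim == -1 else dim for dim in shape]

# A batch of 50 flattened MNIST images holds 50 * 784 values
print(resolve_shape(50 * 784, [-1, 28, 28, 1]))  # [50, 28, 28, 1]
```

Feeding a batch of 100 instead of 50 needs no code change; the first dimension simply resolves to 100.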

In [21]:

def convolutional_layer(input_x, shape):
    # shape = [filter_height, filter_width, in_channels, out_channels]
    W = init_weights(shape)
    # one bias per output channel
    b = init_bias([shape[3]])
    return tf.nn.relu(conv2d(input_x, W) + b)
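As a quick sanity check on the `shape` argument, the number of trainable values in one convolutional layer can be worked out by hand. This is a small sketch with a hypothetical helper `conv_param_count`; the two shapes are the ones used for the layers later in this notebook.

```python
def conv_param_count(shape):
    """Count weights plus biases for a filter shape [height, width, in_channels, out_channels]."""
    filter_h, filter_w, in_ch, out_ch = shape
    weights = filter_h * filter_w * in_ch * out_ch
    biases = out_ch  # one bias per output channel, matching init_bias([shape[3]])
    return weights + biases

print(conv_param_count([5, 5, 1, 32]))   # 832
print(conv_param_count([5, 5, 32, 64]))  # 51264
```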

In [22]:

def normal_full_layer(input_layer, size):
    input_size = int(input_layer.get_shape()[1])
    W = init_weights([input_size, size])
    b = init_bias([size])
    return tf.matmul(input_layer, W) + b

Build out our CNN

placeholders
layers
loss function
optimizer
init, run

In [23]:

# placeholders
X = tf.placeholder(tf.float32, shape=[None, 784])
y_true = tf.placeholder(tf.float32, shape=[None, 10])

In [24]:

# reshape each flat 784-pixel row into a 28x28 image with 1 channel
x_image = tf.reshape(X, [-1, 28, 28, 1])

In [25]:

# layer 1: 5x5 filters, 1 input channel, 32 output channels
# alternative tried with 6x6 filters:
# conv_1 = convolutional_layer(x_image, shape=[6, 6, 1, 32])
# conv_1_pooling = max_pool_2x2(conv_1)
conv_1 = convolutional_layer(x_image, shape=[5, 5, 1, 32])
conv_1_pooling = max_pool_2x2(conv_1)

In [26]:

conv_2 = convolutional_layer(conv_1_pooling,shape=[5, 5, 32, 64])
conv_2_pooling = max_pool_2x2(conv_2)

In [27]:

conv_2_flat = tf.reshape(conv_2_pooling, [-1, 7*7*64])
full_layer_one = tf.nn.relu(normal_full_layer(conv_2_flat, 1024))

Why `7x7` and why `[-1, 7x7x64]`?

Each 2x2 max pool with stride 2 reduces height and width by 50%, while the convolutions use "SAME" padding with stride 1 and leave height and width unchanged. After the first pooling layer, the output tensor (conv_1_pooling) has a shape of [batch_size, 14, 14, 32]. After the second pooling layer, conv_2_pooling has a shape of [batch_size, 7, 7, 64]. Thus 28/2/2 = 7 for height and 28/2/2 = 7 for width.
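The shape arithmetic can be traced step by step. With "SAME" padding, the output size along a spatial dimension is the input size divided by the stride, rounded up. The sketch below applies that rule to the layers of this notebook; the function names `same_out` and `trace_shapes` are illustrative.

```python
import math

def same_out(size, stride):
    """Output size along one spatial dimension with "SAME" padding."""
    return math.ceil(size / stride)

def trace_shapes(height=28, width=28):
    """Return (name, height, width, channels) for each layer of the network."""
    shapes = [("x_image", height, width, 1)]
    h, w = same_out(height, 1), same_out(width, 1)   # conv, stride 1
    shapes.append(("conv_1", h, w, 32))
    h, w = same_out(h, 2), same_out(w, 2)            # 2x2 pool, stride 2
    shapes.append(("conv_1_pooling", h, w, 32))
    h, w = same_out(h, 1), same_out(w, 1)            # conv, stride 1
    shapes.append(("conv_2", h, w, 64))
    h, w = same_out(h, 2), same_out(w, 2)            # 2x2 pool, stride 2
    shapes.append(("conv_2_pooling", h, w, 64))
    return shapes

for name, h, w, c in trace_shapes():
    print(name, [h, w, c])
```

The last line printed is `conv_2_pooling [7, 7, 64]`, which is where the 7x7 comes from.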

Next, the flattening step.

In the reshape() operation above, the -1 signifies that the batch_size dimension will be calculated dynamically from the number of examples in our input data. Each example has 7 (height) * 7 (width) * 64 (channels) features after the second pooling layer, so we want the features dimension to have a value of 7 * 7 * 64 (3136 in total). The output tensor, conv_2_flat, has shape [batch_size, 3136].
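A minimal pure-Python sketch of the flattening for a single example, assuming the volume is stored as nested lists in `[height][width][channel]` order (the helper `flatten_volume` is hypothetical):

```python
def flatten_volume(volume):
    """Flatten a [height][width][channel] nested list into one row-major list."""
    return [value for row in volume for cell in row for value in cell]

# One example after the second pooling layer: 7 x 7 positions, 64 channels each
volume = [[[0.0] * 64 for _ in range(7)] for _ in range(7)]
flat = flatten_volume(volume)
print(len(flat))  # 3136
```

That length, 3136, is the input size the first fully connected layer sees.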

In [28]:

hold_prob = tf.placeholder(tf.float32)
full_one_dropout = tf.nn.dropout(full_layer_one, keep_prob=hold_prob)
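For intuition about `keep_prob`: `tf.nn.dropout` keeps each element with probability `keep_prob` and scales the kept elements by `1 / keep_prob`, so the expected sum stays the same. A rough pure-Python sketch of that behaviour (the function `dropout_plain` and the explicit `rng` argument are illustrative assumptions, not the TensorFlow implementation):

```python
import random

def dropout_plain(values, keep_prob, rng):
    """Zero each value with probability 1 - keep_prob; scale survivors by 1 / keep_prob."""
    if not 0 < keep_prob <= 1:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)
activations = [1.0, 2.0, 3.0, 4.0]
print(dropout_plain(activations, 0.5, rng))   # some entries zeroed, the rest doubled
print(dropout_plain(activations, 1.0, rng))   # keep_prob = 1.0 leaves the values unchanged
```

This is why feeding `hold_prob: 1.0` at evaluation time switches dropout off.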

In [29]:

y_pred = normal_full_layer(full_one_dropout, 10)

Loss function, optimizer, variable initialization, etc.

In [31]:

cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_pred))
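To show what this loss computes, here is a small pure-Python sketch of softmax cross-entropy for one example, averaged over a batch. It uses the log-sum-exp trick for numerical stability; the function names are hypothetical, and this is an illustration of the formula rather than TensorFlow's code.

```python
import math

def softmax_cross_entropy(logits, labels):
    """Cross-entropy between one-hot `labels` and softmax(`logits`) for one example."""
    m = max(logits)
    # log of the softmax denominator, shifted by the max logit for stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(y * (z - log_sum_exp) for z, y in zip(logits, labels))

def mean_cross_entropy(batch_logits, batch_labels):
    """Average the per-example losses, as tf.reduce_mean does over the batch."""
    losses = [softmax_cross_entropy(z, y) for z, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)

# Uniform logits over 10 classes give a loss of log(10) regardless of the true class
uniform = [0.0] * 10
label = [0.0] * 10
label[3] = 1.0
print(softmax_cross_entropy(uniform, label))
```

An untrained network that guesses uniformly therefore starts near a loss of log(10), about 2.3.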

In [32]:

optimizer = tf.train.AdamOptimizer(learning_rate=0.0001)
train = optimizer.minimize(cross_entropy)

In [33]:

init = tf.global_variables_initializer()

In [34]:

steps = 5000

# Build the accuracy ops once, outside the loop, so the graph does not grow
# every time accuracy is reported
matches = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_true, 1))
accuracy = tf.reduce_mean(tf.cast(matches, tf.float32))

with tf.Session() as sess:
    sess.run(init)

    for i in range(steps):
        batch_x, batch_y = mnist.train.next_batch(50)

        # train with dropout: keep each unit with probability 0.3
        sess.run(train, feed_dict={X: batch_x, y_true: batch_y, hold_prob: 0.3})

        # report test accuracy every 50 steps
        if i % 50 == 0:
            print("Currently on step{}".format(i))
            print("Accuracy i :")
            # evaluate on the test set with dropout disabled (hold_prob = 1.0)
            print(sess.run(accuracy, feed_dict={X: mnist.test.images,
                                                y_true: mnist.test.labels,
                                                hold_prob: 1.0}))

Currently on step0
Accuracy i :
0.0879
Currently on step50
Accuracy i :
0.6972
Currently on step100
Accuracy i :
0.8193
Currently on step150
Accuracy i :
0.8786
Currently on step200
Accuracy i :
0.8973
Currently on step250
Accuracy i :
0.9086
Currently on step300
Accuracy i :
0.9145
Currently on step350
Accuracy i :
0.9299
Currently on step400
Accuracy i :
0.9294
Currently on step450
Accuracy i :
0.9336
Currently on step500
Accuracy i :
0.9393
Currently on step550
Accuracy i :
0.9428
Currently on step600
Accuracy i :
0.9431
Currently on step650
Accuracy i :
0.9484
Currently on step700
Accuracy i :
0.9473
Currently on step750
Accuracy i :
0.9491
Currently on step800
Accuracy i :
0.9523
Currently on step850
Accuracy i :
0.9538
Currently on step900
Accuracy i :
0.9524
Currently on step950
Accuracy i :
0.954
Currently on step1000
Accuracy i :
0.9579
Currently on step1050
Accuracy i :
0.9601
Currently on step1100
Accuracy i :
0.9605
Currently on step1150
Accuracy i :
0.9617
Currently on step1200
Accuracy i :
0.9608
Currently on step1250
Accuracy i :
0.9653
Currently on step1300
Accuracy i :
0.9629
Currently on step1350
Accuracy i :
0.9642
Currently on step1400
Accuracy i :
0.9648
Currently on step1450
Accuracy i :
0.965
Currently on step1500
Accuracy i :
0.9668
Currently on step1550
Accuracy i :
0.9661
Currently on step1600
Accuracy i :
0.9673
Currently on step1650
Accuracy i :
0.9699
Currently on step1700
Accuracy i :
0.9689
Currently on step1750
Accuracy i :
0.9698
Currently on step1800
Accuracy i :
0.9698
Currently on step1850
Accuracy i :
0.97
Currently on step1900
Accuracy i :
0.9695
Currently on step1950
Accuracy i :
0.9714
Currently on step2000
Accuracy i :
0.9715
Currently on step2050
Accuracy i :
0.967
Currently on step2100
Accuracy i :
0.9721
Currently on step2150
Accuracy i :
0.9723
Currently on step2200
Accuracy i :
0.9733
Currently on step2250
Accuracy i :
0.9731
Currently on step2300
Accuracy i :
0.9746
Currently on step2350
Accuracy i :
0.9753
Currently on step2400
Accuracy i :
0.9746
Currently on step2450
Accuracy i :
0.9769
Currently on step2500
Accuracy i :
0.9742
Currently on step2550
Accuracy i :
0.9762
Currently on step2600
Accuracy i :
0.9751
Currently on step2650
Accuracy i :
0.9751
Currently on step2700
Accuracy i :
0.9749
Currently on step2750
Accuracy i :
0.9749
Currently on step2800
Accuracy i :
0.9759
Currently on step2850
Accuracy i :
0.9781
Currently on step2900
Accuracy i :
0.9775
Currently on step2950
Accuracy i :
0.9783
Currently on step3000
Accuracy i :
0.9773
Currently on step3050
Accuracy i :
0.9768
Currently on step3100
Accuracy i :
0.9794
Currently on step3150
Accuracy i :
0.9776
Currently on step3200
Accuracy i :
0.977
Currently on step3250
Accuracy i :
0.9781
Currently on step3300
Accuracy i :
0.9803
Currently on step3350
Accuracy i :
0.98
Currently on step3400
Accuracy i :
0.98
Currently on step3450
Accuracy i :
0.9804
Currently on step3500
Accuracy i :
0.9799
Currently on step3550
Accuracy i :
0.979
Currently on step3600
Accuracy i :
0.9802
Currently on step3650
Accuracy i :
0.9805
Currently on step3700
Accuracy i :
0.9773
Currently on step3750
Accuracy i :
0.9798
Currently on step3800
Accuracy i :
0.9797
Currently on step3850
Accuracy i :
0.9815
Currently on step3900
Accuracy i :
0.9811
Currently on step3950
Accuracy i :
0.9814
Currently on step4000
Accuracy i :
0.9815
Currently on step4050
Accuracy i :
0.9826
Currently on step4100
Accuracy i :
0.9812
Currently on step4150
Accuracy i :
0.9825
Currently on step4200
Accuracy i :
0.983
Currently on step4250
Accuracy i :
0.9838
Currently on step4300
Accuracy i :
0.9825
Currently on step4350
Accuracy i :
0.9825
Currently on step4400
Accuracy i :
0.9837
Currently on step4450
Accuracy i :
0.9838
Currently on step4500
Accuracy i :
0.9832
Currently on step4550
Accuracy i :
0.9818
Currently on step4600
Accuracy i :
0.9822
Currently on step4650
Accuracy i :
0.9826
Currently on step4700
Accuracy i :
0.9834
Currently on step4750
Accuracy i :
0.9846
Currently on step4800
Accuracy i :
0.9839
Currently on step4850
Accuracy i :
0.9842
Currently on step4900
Accuracy i :
0.9833
Currently on step4950
Accuracy i :
0.984
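Each accuracy value logged above is the fraction of test images whose highest-scoring class matches the true label. A minimal pure-Python sketch of that calculation, with hypothetical helper names standing in for the `tf.argmax`, `tf.equal`, and `tf.reduce_mean` steps:

```python
def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)

def accuracy_plain(pred_rows, label_rows):
    """Fraction of rows where the predicted class equals the one-hot label's class."""
    matches = [argmax(p) == argmax(t) for p, t in zip(pred_rows, label_rows)]
    return sum(matches) / len(matches)

# Three toy examples with three classes; the last prediction is wrong
predictions = [[2.0, 0.1, 0.3], [0.2, 1.5, 0.1], [0.9, 0.8, 0.1]]
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(accuracy_plain(predictions, labels))
```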

Results: RMSProp with a 6x6 filter reached 97.75%; AdamOptimizer with a 5x5 filter reached 99.25%.
