[Deep Learning in Practice 01] – Implementing a Binary Adder with an RNN


The network is mainly composed of an input layer (fed with one bit from each of the two numbers), an intermediate hidden layer made up of multiple neurons, and an output layer;
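As a sketch of the recurrence the listing below implements (same variable names as in the code), each time step computes:

    layer_1 = sigmoid(np.dot(X, synapse_0) + np.dot(prev_layer_1, synapse_h))   # hidden state from the current input bits and the previous hidden state
    layer_2 = sigmoid(np.dot(layer_1, synapse_1))                               # predicted bit of the sum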

import copy, numpy as np                                       
np.random.seed(0)

# compute sigmoid nonlinearity (define the sigmoid activation function)
def sigmoid(x):
    output = 1/(1+np.exp(-x))
    return output

# convert output of the sigmoid function to its derivative
def sigmoid_output_to_derivative(output):
    return output*(1-output)

# training dataset generation                             
int2binary = {}                      # maps each integer to its binary representation
binary_dim = 8                       # length of the binary numbers = 8

largest_number = pow(2,binary_dim)   # 2^8 = 256 different values, so the largest representable number is 255
binary = np.unpackbits(
    np.array([range(largest_number)],dtype=np.uint8).T,axis=1)            
for i in range(largest_number):      # build the one-to-one mapping between integers and their binary form
    int2binary[i] = binary[i]
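# For illustration: np.unpackbits stores the most significant bit first,
# so int2binary[3] is the 8-bit array [0 0 0 0 0 0 1 1]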

# input variables 
alpha = 0.1        # learning rate used when updating the weights during backpropagation
input_dim = 2      # dimension of the input: one bit from each of the two numbers being added
hidden_dim = 16    # number of hidden layer neurons = 16
output_dim = 1     # the output (the predicted sum bit) is 1-dimensional


# initialize the weight parameters of the neural network
synapse_0 = 2*np.random.random((input_dim,hidden_dim)) - 1   # input-to-hidden weights w0, shape 2x16, values constrained to [-1,1]
synapse_1 = 2*np.random.random((hidden_dim,output_dim)) - 1  # hidden-to-output weights w1, shape 16x1, values constrained to [-1,1]
synapse_h = 2*np.random.random((hidden_dim,hidden_dim)) - 1  # hidden-to-hidden weights wh (previous state to current state), shape 16x16, values constrained to [-1,1]

synapse_0_update = np.zeros_like(synapse_0)   # matrices with the same shape as the weights, initialized to all zeros, used to accumulate updates
synapse_1_update = np.zeros_like(synapse_1)
synapse_h_update = np.zeros_like(synapse_h)

# training logic 
for j in range(10000):   # number of training iterations; you can change it yourself

    # generate a simple addition problem (a + b = c)
    a_int = np.random.randint(largest_number//2) # int version; constrained so that a + b still fits in 8 bits (a < 128)
    a = int2binary[a_int] # binary encoding of addend a
    b_int = np.random.randint(largest_number//2) # int version
    b = int2binary[b_int] # binary encoding

    # true answer 
    c_int = a_int + b_int     # true and
    c = int2binary[c_int]    

    # where we'll store our best guess (binary encoded) 
    d = np.zeros_like(c)   # used to store the predicted sum

    overallError = 0       # accumulated error, used for printing progress

    layer_2_deltas = list()   # output-layer deltas, stored for backpropagation
    layer_1_values = list()
    layer_1_values.append(np.zeros(hidden_dim))   # initialize the previous hidden state to all zeros

    # moving along the positions in the binary encoding 
    for position in range(binary_dim):   # forward propagation; binary addition proceeds from the lowest (rightmost) bit to the highest (leftmost) bit

        # generate input and output 
        X = np.array([[a[binary_dim - position - 1],b[binary_dim - position - 1]]])   # input: the current bits of a and b
        y = np.array([[c[binary_dim - position - 1]]]).T   # true label: the corresponding bit of the sum

        # hidden layer (input + prev_hidden)
        layer_1 = sigmoid(np.dot(X,synapse_0) + np.dot(layer_1_values[-1],synapse_h))   # X*w0+RNN previous state value*wh

        # output layer (new binary representation)
        layer_2 = sigmoid(np.dot(layer_1,synapse_1))          #layer_1*w1

        # did we miss?... if so, by how much? 
        layer_2_error = y - layer_2                                                     # compute the error
        layer_2_deltas.append((layer_2_error)*sigmoid_output_to_derivative(layer_2))    # output delta: error scaled by the sigmoid derivative
        overallError += np.abs(layer_2_error[0])                                        # accumulate the absolute error for printing

        # decode estimate so we can print it out 
        d[binary_dim - position - 1] = np.round(layer_2[0][0])   # store the predicted bit of the sum

        # store hidden layer so we can use it in the next timestep 
        layer_1_values.append(copy.deepcopy(layer_1))   # deep copy: store the hidden state for use in backpropagation

    future_layer_1_delta = np.zeros(hidden_dim)

    for position in range(binary_dim):   # backpropagation, from left to right, i.e. from the highest bit to the lowest (reverse of the forward pass)

        X = np.array([[a[position],b[position]]])
        layer_1 = layer_1_values[-position-1]
        prev_layer_1 = layer_1_values[-position-2]

        # error at output layer
        layer_2_delta = layer_2_deltas[-position-1]
        # error at hidden layer
        layer_1_delta = (future_layer_1_delta.dot(synapse_h.T) + layer_2_delta.dot(synapse_1.T)) * sigmoid_output_to_derivative(layer_1)
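        # the hidden-layer delta combines the gradient flowing back from the next
        # time step (future_layer_1_delta through synapse_h) and from the output
        # layer (layer_2_delta through synapse_1)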

        # let's update all our weights so we can try again 
        synapse_1_update += np.atleast_2d(layer_1).T.dot(layer_2_delta)        # accumulate the update for w1
        synapse_h_update += np.atleast_2d(prev_layer_1).T.dot(layer_1_delta)   # accumulate the update for wh
        synapse_0_update += X.T.dot(layer_1_delta)                             # accumulate the update for w0

        future_layer_1_delta = layer_1_delta


    synapse_0 += synapse_0_update * alpha
    synapse_1 += synapse_1_update * alpha
    synapse_h += synapse_h_update * alpha    
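    # the updates above were accumulated over all time steps of this training example
    # and are applied once here; the accumulators are then reset to zero below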

    synapse_0_update *= 0
    synapse_1_update *= 0
    synapse_h_update *= 0

    # print out progress 
    if(j % 1000 == 0):   # print out the result every 1000 iterations
        print ( "Error:" + str(overallError))
        print ("Pred:" + str(d))
        print ("True:" + str(c))
        out = 0
        for index,x in enumerate(reversed(d)):
            out += x*pow(2,index)
        print (str(a_int) + " + " + str(b_int) + " = " + str(out))
        print ("------------")

Running the program in the Anaconda Prompt shows that, as the number of iterations increases, the accuracy of the predictions improves.

Since assignment in Python binds names to the same object (reference semantics), you need the copy module from the standard library if you want an actual copy of an object;

copy.copy performs a shallow copy: only the outer object is copied, not the child objects it contains;

copy.deepcopy performs a deep copy: it copies the object together with all of its sub-objects;
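A minimal illustration of the difference (a standalone example, not part of the adder code):

    import copy
    a = [[1, 2], [3, 4]]
    shallow = copy.copy(a)       # new outer list, but the inner lists are shared with a
    deep = copy.deepcopy(a)      # new outer list and new inner lists
    a[0][0] = 99
    print(shallow[0][0])         # 99 -- the shallow copy sees the change
    print(deep[0][0])            # 1  -- the deep copy does not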

  1. The copy module must be imported at the beginning of the program;

  2. The sigmoid function, a commonly used nonlinearity, is defined at the beginning of the program;

  3. The derivative of the sigmoid function is obtained as follows: let out = 1/(1+exp(-x)); differentiating with respect to x gives out' = exp(-x)/(1+exp(-x))^2 = out*(1-out) (a numerical check is given after this list);

  4. The RNN module has memory of the previous state: at each time step it takes the hidden-state value of the previous moment into account;

  5. np.zeros_like constructs a matrix with the same shape as its argument and fills it with zeros;

  6. At the initial time step the previous hidden state does not exist, so it is assigned a value of 0;

  7. Check your Python version: in Python 3, print is a function and requires parentheses.
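A quick numerical check of point 3 (a small standalone snippet, not part of the adder code): compare the analytic derivative out*(1-out) with a finite-difference approximation.

    import numpy as np

    def sigmoid(x):
        return 1/(1+np.exp(-x))

    x = 0.7
    eps = 1e-6
    analytic = sigmoid(x)*(1-sigmoid(x))                   # out*(1-out)
    numeric = (sigmoid(x+eps) - sigmoid(x-eps)) / (2*eps)  # finite difference
    print(analytic, numeric)                               # the two values agree to several decimal places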
