Wednesday, January 23, 2019

Forward propagation in deep learning


(Cover image: Deep Learning by John D. Kelleher, The MIT Press Essential Knowledge Series)



Forward propagation in deep learning. A simple neural network has three kinds of layers: an input layer, a hidden layer, and an output layer. I want to see how forward propagation works in such a network. This is my problem statement: each data point is a customer. The first input is how many accounts they have, and the second input is how many children they have. The model will predict how many transactions the user makes in the next year.

http://www.thirstydream.com/python/
http://www.thirstydream.com/wp-content/uploads/2018/08/Assignment.html
http://www.thirstydream.com/wp-content/uploads/2018/08/Deep-learning.html
In [1]:
import numpy as np
In [2]:
input_data = np.array([2,3])

node_weights = {'node_0': np.array([1,1]),
               'node_1': np.array([-1,1]),
               'output_node': np.array([2,-1])}
In [3]:
# Calculating the values of hidden node 0 and node 1

value_1 = (input_data * node_weights['node_0']).sum()
value_2 = (input_data * node_weights['node_1']).sum()
In [4]:
print(value_1)
print(value_2)
5
1
In [5]:
hidden_layer = np.array([value_1 , value_2])
print(hidden_layer)

output = (hidden_layer * node_weights['output_node']).sum()

print('Total Transactions :' ,output)
[5 1]
Total Transactions : 9
It means the network predicts a total of 9 transactions if the customer has 2 accounts and 3 children.
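Working through the arithmetic by hand confirms this: node_0 = 2*1 + 3*1 = 5, node_1 = 2*(-1) + 3*1 = 1, and the output = 5*2 + 1*(-1) = 9, matching the values printed above.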
Sometimes a node value or the output comes out negative, and a model that only applies weights stays purely linear, so it is difficult to capture anything beyond simple patterns. To overcome this problem, neural networks apply an activation function at each node to introduce non-linearity, such as tanh() or ReLU (the rectified linear activation function). ReLU returns 0 if x < 0 and x if x >= 0.
In [6]:
def relu(x):
    # ReLU: return x if it is positive, otherwise 0
    return max(0, x)


value_1_output = relu(value_1)
value_2_output = relu(value_2)
In [7]:
hidden_layer = np.array([value_1_output , value_2_output])

output = (hidden_layer * node_weights['output_node']).sum()
print(output)
9
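To consolidate the steps above, here is a minimal sketch that wraps the whole forward pass (weighted sums, ReLU at the hidden nodes, then the output weights) into one function; the name forward_pass is my own and not part of the original notebook:

def forward_pass(input_data, node_weights):
    # hidden layer: weighted sum of the inputs, passed through ReLU
    node_0 = relu((input_data * node_weights['node_0']).sum())
    node_1 = relu((input_data * node_weights['node_1']).sum())
    hidden_layer = np.array([node_0, node_1])
    # output layer: weighted sum of the hidden-layer values
    return (hidden_layer * node_weights['output_node']).sum()

print(forward_pass(input_data, node_weights))  # 9 for input_data = [2, 3]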
An important part of training a neural network is gradient descent. I am not going to describe in detail how gradient descent works, but the main idea behind it is to find the weights at which the loss function is at its minimum. When using the mean-squared-error loss, the slope with respect to the weights is 2 * x * (xb - y), or 2 * input_data * error. Note that x and b may hold multiple numbers (x is a vector for each data point, and b is a vector of weights); in that case the slope will also be a vector.
In [48]:
input_data = np.array([1,3,5])
weights = np.array([0,2,1])
target = 4
In [49]:
preds = (weights * input_data).sum()
print(preds)
11
In [50]:
error = preds - target
print(error)
slope = 2 * input_data * error
print(slope)
7
[14 42 70]
In [51]:
learning_rate = 0.01
# the learning rate scales the weight update so that the loss function decreases,
# in other words so that the new prediction moves closer to the target value

weight_update = weights - learning_rate * slope 
new_preds = (weight_update * input_data).sum()

new_error = new_preds - target
print(new_error)
2.0999999999999996
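The same update step can be repeated. Below is a minimal sketch (the loop and the variable w are my own additions) showing that the error keeps shrinking toward zero over a few gradient-descent iterations:

w = weights.copy()
for i in range(5):
    preds = (w * input_data).sum()
    error = preds - target
    slope = 2 * input_data * error
    w = w - learning_rate * slope   # one gradient-descent step
    print('iteration', i + 1, 'error:', (w * input_data).sum() - target)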
In [8]:
import numpy as np
import pandas as pd
wages = pd.read_csv('/Users/shivampandey/Downloads/python files/datasets/hourly_wages.csv')
wages.head()
Out[8]:
   wage_per_hour  union  education_yrs  experience_yrs  age  female  marr  south  manufacturing  construction
0           5.10      0              8              21   35       1     1      0              1             0
1           4.95      0              9              42   57       1     1      0              1             0
2           6.67      0             12               1   19       0     0      0              1             0
3           4.00      0             12               4   22       0     0      0              0             0
4           7.50      0             12              17   35       0     1      0              0             0
I have already preprocessed the data.
In [12]:
from keras.models import Sequential
from keras.layers import Dense
Using TensorFlow backend.
In [13]:
wages_target = wages['wage_per_hour']
In [14]:
wages = wages.drop(['wage_per_hour'],axis = 'columns')
In [15]:
n_cols = wages.shape[1] # this will give number of columns.
In [16]:
model = Sequential()

model.add(Dense(50 , activation = 'relu' , input_shape = (n_cols,)))
model.add(Dense(32 , activation = 'relu'))

model.add(Dense(1))
In [17]:
model.compile(optimizer = 'adam' , loss = 'mean_squared_error')
# Verify that model contains information from compiling
print("Loss function: " + model.loss)
Loss function: mean_squared_error
In [18]:
model.fit(wages,wages_target)
Epoch 1/1
534/534 [==============================] - 0s 782us/step - loss: 132.3741
Out[18]:
<keras.callbacks.History at 0x10d89de48>
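Once the model is fitted, it can produce wage predictions. A minimal sketch, assuming we just want predictions for the first few rows of the same wages feature data used for training (the variable name first_rows is mine):

# hourly-wage predictions for the first five customers in the feature data
first_rows = wages.values[:5]
print(model.predict(first_rows))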
Now for a classification problem. For this I am taking another dataset from Kaggle. In a classification problem we only have to change a few things, such as the activation in the output layer, and the loss becomes categorical_crossentropy.
In [19]:
titanic = pd.read_csv('/Users/shivampandey/Downloads/python files/datasets/titanic_all_numeric.csv')
titanic.head()
Out[19]:
   survived  pclass   age  sibsp  parch     fare  male  age_was_missing  embarked_from_cherbourg  embarked_from_queenstown  embarked_from_southampton
0         0       3  22.0      1      0   7.2500     1            False                        0                         0                          1
1         1       1  38.0      1      0  71.2833     0            False                        1                         0                          0
2         1       3  26.0      0      0   7.9250     0            False                        0                         0                          1
3         1       1  35.0      1      0  53.1000     0            False                        0                         0                          1
4         0       3  35.0      0      0   8.0500     1            False                        0                         0                          1
In [30]:
predictions = titanic.drop(['survived'], axis = 1).values
In [38]:
n_cols = predictions.shape[1]
n_cols
Out[38]:
10
In [39]:
from keras.utils import to_categorical
target = to_categorical(titanic.survived)
In [40]:
model = Sequential()

#hidden layer
model.add(Dense(50 , activation = 'relu' , input_shape = (n_cols,)))
In [41]:
# the output layer has a separate node for each possible outcome; softmax is used as the activation here
model.add(Dense(2, activation = 'softmax'))
In [45]:
model.compile(optimizer = 'sgd' , loss = 'categorical_crossentropy')
In [43]:
model.fit(predictions , target)
Epoch 1/1
891/891 [==============================] - 0s 426us/step - loss: 2.9001
Out[43]:
<keras.callbacks.History at 0x116257a58>
In [ ]:
# Specify, compile, and fit the model
model = Sequential()
model.add(Dense(32, activation='relu', input_shape = (n_cols,)))
model.add(Dense(2, activation='softmax'))
model.compile(optimizer='sgd', 
              loss='categorical_crossentropy', 
              metrics=['accuracy'])
model.fit(predictions, target)

# Calculate predicted class probabilities on the feature matrix
# (stored in a new variable so the feature matrix `predictions` is not overwritten)
probs = model.predict(predictions)

# Calculate predicted probability of survival: predicted_prob_true (second column)
predicted_prob_true = probs[:,1]

# print predicted_prob_true
print(predicted_prob_true)
Changing optimization parameters: I will try optimizing a model at a very low learning rate, a very high learning rate, and a "just right" learning rate, and then look at the results. A low value for the loss function is good.
In [46]:
from keras.optimizers import SGD
In [61]:
def get_new_model():
    model = Sequential()
    
    model.add(Dense(32 , activation = 'relu' , input_shape = (n_cols,)))
    model.add(Dense(2 , activation = 'softmax'))
    
    return(model)
In [62]:
lr_list = [0.000001 , 0.01 , 0.1]

for lr in lr_list:
    
    print('\n\nTesting model with learning rate: %f\n'%lr )
    
    #new model
    model = get_new_model()
    
    # SGD optimizer with specified learning rate: sgd_optimizer
    sgd_optimizer = SGD(lr = lr)
    
    model.compile(optimizer = sgd_optimizer , loss = 'categorical_crossentropy')
    
    model.fit(predictions,target)
Testing model with learning rate: 0.000001

Epoch 1/1
891/891 [==============================] - 0s 429us/step - loss: 1.7788


Testing model with learning rate: 0.010000

Epoch 1/1
891/891 [==============================] - 0s 407us/step - loss: 1.5069


Testing model with learning rate: 0.100000

Epoch 1/1
891/891 [==============================] - 1s 697us/step - loss: 2.4433
In [84]:
#Evaluating model accuracy on validation dataset
predictions = titanic.drop(['survived'],axis = 1)
n_cols = predictions.shape[1]

model =Sequential()

model.add(Dense(100 , activation = 'relu' , input_shape = (n_cols,)))
model.add(Dense(100 , activation = 'relu'))
model.add(Dense(2 , activation = 'softmax'))
In [85]:
model.compile(optimizer = 'adam' , loss = 'categorical_crossentropy')

model.fit(predictions , target , validation_split = 0.3)
Train on 623 samples, validate on 268 samples
Epoch 1/1
623/623 [==============================] - 1s 1ms/step - loss: 0.9406 - val_loss: 0.6215
Out[85]:
<keras.callbacks.History at 0x11c022f60>
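As a side note, the cell above only reports the loss. To also see the validation accuracy mentioned in the comment, the model can be compiled with an accuracy metric; a small sketch, not part of the original run:

model.compile(optimizer = 'adam' , loss = 'categorical_crossentropy' , metrics = ['accuracy'])
model.fit(predictions , target , validation_split = 0.3)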
Early stopping: optimizing the optimization. Early stopping is used to stop optimization when it isn't helping any more. Since the optimization stops automatically when it isn't helping, it is also possible to set a high value for epochs in the call to .fit().
In [86]:
from keras.callbacks import EarlyStopping
early_stopping = EarlyStopping(patience = 2)
#patience = 2 because if the monitored validation loss does not improve for 2 consecutive epochs, training stops automatically

model.fit(predictions , target , validation_split = 0.3 , epochs = 30 , callbacks = [early_stopping])
Train on 623 samples, validate on 268 samples
Epoch 1/30
623/623 [==============================] - 0s 154us/step - loss: 0.7454 - val_loss: 0.5439
Epoch 2/30
623/623 [==============================] - 0s 157us/step - loss: 0.6453 - val_loss: 0.5331
Epoch 3/30
623/623 [==============================] - 0s 156us/step - loss: 0.7245 - val_loss: 0.5422
Epoch 4/30
623/623 [==============================] - 0s 145us/step - loss: 0.6064 - val_loss: 0.5391
Out[86]:
<keras.callbacks.History at 0x11beb2ba8>
Now let's see how the validation loss changes with the wider network above versus a smaller one.
In [87]:
model_1 = Sequential()
model_1.add(Dense(10 , activation = 'relu' , input_shape = (n_cols , )))
model_1.add(Dense(10 , activation = 'relu'))

model_1.add(Dense(2 , activation = 'softmax'))
In [89]:
model.fit(predictions , target , validation_split = 0.3 , epochs = 30 , callbacks = [early_stopping] , 
          verbose = False)

model_1.compile(optimizer = 'adam' , loss = 'categorical_crossentropy')
model_1.fit(predictions , target , validation_split = 0.3 , epochs =30 , callbacks = [early_stopping] , 
            verbose = False)
Out[89]:
<keras.callbacks.History at 0x11b175518>
In [ ]:
import matplotlib.pyplot as plt
plt.plot(model.history.history['val_loss'] , 'r' , model_1.history.history['val_loss'] , 'b')
plt.xlabel('Epochs')
plt.ylabel('Validation loss')
plt.show()
