Specifying a model
Now you'll get to work with your first model in Keras, and will immediately be able to run more complex neural network models on larger datasets compared to the first two chapters.
To start, you'll take the skeleton of a neural network and add a hidden layer and an output layer. You'll then fit that model and see Keras do the optimization so your model continually gets better.
As a start, you'll predict workers' wages based on characteristics like their industry, education, and level of experience. You can find the dataset in a pandas DataFrame called df. For convenience, everything in df except for the target has been converted to a NumPy matrix called predictors. The target, wage_per_hour, is available as a NumPy matrix called target.
For all exercises in this chapter, we've imported the Sequential model constructor, the Dense layer constructor, and pandas.
- Store the number of columns in the predictors data to n_cols. This has been done for you.
- Start by creating a Sequential model called model.
- Use the .add() method on model to add a Dense layer.
- Add 50 units, specify activation='relu', and set the input_shape parameter to the tuple (n_cols,), which means each row of data has n_cols items, and any number of rows of data is acceptable as input.
- Add another Dense layer. This should have 32 units and a 'relu' activation.
- Finally, add an output layer, which is a Dense layer with a single node. Don't use any activation function here.
SOLUTION
# Import necessary modules
import keras
from keras.layers import Dense
from keras.models import Sequential
# Save the number of columns in predictors: n_cols
n_cols = predictors.shape[1]
# Set up the model: model
model = Sequential()
# Add the first layer
model.add(Dense(50, activation='relu', input_shape=(n_cols,)))
# Add the second layer
model.add(Dense(32, activation='relu'))
# Add the output layer
model.add(Dense(1))
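It can help to verify the architecture you just specified before moving on. A minimal sketch, assuming the model above has been built:
# Inspect layer output shapes and parameter counts
model.summary()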
Further reading:
- http://www.emergentmind.com/neural-network
- ADAM algorithm optimizer (video): https://www.youtube.com/watch?v=JXQT_vxqwIs
- http://ruder.io/optimizing-gradient-descent/
Compiling the model
You're now going to compile the model you specified earlier. To compile the model, you need to specify the optimizer and loss function to use. In the video, Dan mentioned that the Adam optimizer is an excellent choice. You can read more about it and other Keras optimizers in the Keras documentation, and if you're really curious, you can read the original paper that introduced the Adam optimizer.
In this exercise, you'll use the Adam optimizer and the mean squared error loss function. Go for it!
- Compile the model using model.compile(). Your optimizer should be 'adam' and the loss should be 'mean_squared_error'.
# Import necessary modules
import keras
from keras.layers import Dense
from keras.models import Sequential
# Specify the model
n_cols = predictors.shape[1]
model = Sequential()
model.add(Dense(50, activation='relu', input_shape=(n_cols,)))
model.add(Dense(32, activation='relu'))
model.add(Dense(1))
# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')
# Verify that model contains information from compiling
print("Loss function: " + model.loss)
Fitting the model
You're at the most fun part. You'll now fit the model. Recall that the data to be used as predictive features is loaded in a NumPy matrix called predictors, and the data to be predicted is stored in a NumPy matrix called target. Your model is pre-written, and it has been compiled with the code from the previous exercise.
- Fit the model. Remember that the first argument is the predictive features (predictors), and the data to be predicted (target) is the second argument.
# Import necessary modules
import keras
from keras.layers import Dense
from keras.models import Sequential
# Specify the model
n_cols = predictors.shape[1]
model = Sequential()
model.add(Dense(50, activation='relu', input_shape=(n_cols,)))
model.add(Dense(32, activation='relu'))
model.add(Dense(1))
# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')
# Fit the model
model.fit(predictors, target)
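The call above uses Keras's default number of epochs and no held-out data. For more control, fit() also accepts arguments for both; a minimal sketch, assuming Keras 2 argument names (older releases used nb_epoch):
# Train for 10 epochs, holding out 30% of the rows to track validation loss
model.fit(predictors, target, epochs=10, validation_split=0.3)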
Understanding your classification data
Now you will start modeling with a new dataset for a classification problem. This data includes information about passengers on the Titanic. You will use predictors such as age, fare, and where each passenger embarked from to predict who will survive. This data is from a tutorial on data science competitions; look there for descriptions of the features.
The data is pre-loaded in a pandas DataFrame called df.
It's smart to review the maximum and minimum values of each variable to ensure the data isn't misformatted or corrupted. What was the maximum age of passengers on the Titanic? Use the .describe() method in the IPython Shell to answer this question.
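For reference, a quick way to answer this question in the shell, assuming df has an age column as described above:
# Summary statistics (count, mean, min, max, ...) for every numeric column
print(df.describe())
# Or just the maximum age
print(df.age.max())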
- Convert df.survived to a categorical variable using the to_categorical() function.
- Specify a Sequential model called model.
- Add a Dense layer with 32 nodes. Use 'relu' as the activation and (n_cols,) as the input_shape.
- Add the Dense output layer. Because there are two outcomes, it should have 2 units, and because it is a classification model, the activation should be 'softmax'.
- Compile the model, using 'sgd' as the optimizer, 'categorical_crossentropy' as the loss function, and metrics=['accuracy'] to see the accuracy (what fraction of predictions were correct) at the end of each epoch.
- Fit the model using the predictors and the target.
# Import necessary modules
import keras
from keras.layers import Dense
from keras.models import Sequential
from keras.utils import to_categorical
# Convert the target to categorical: target
target = to_categorical(df.survived)
# Set up the model
model = Sequential()
# Add the first layer
model.add(Dense(32, activation='relu', input_shape=(n_cols,)))
# Add the output layer
model.add(Dense(2, activation='softmax'))
# Compile the model
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
# Fit the model
model.fit(predictors, target)
OUTPUT
<script.py> output:
Epoch 1/10
32/891 [>.............................] - ETA: 1s - loss: 4.7317 - acc: 0.4375
704/891 [======================>.......] - ETA: 0s - loss: 2.7083 - acc: 0.4517
891/891 [==============================] - 0s - loss: 2.3808 - acc: 0.4938
Epoch 2/10
32/891 [>.............................] - ETA: 0s - loss: 0.6341 - acc: 0.7188
704/891 [======================>.......] - ETA: 0s - loss: 0.7846 - acc: 0.6918
891/891 [==============================] - 0s - loss: 0.7650 - acc: 0.6846
Epoch 3/10
32/891 [>.............................] - ETA: 0s - loss: 0.6008 - acc: 0.7188
736/891 [=======================>......] - ETA: 0s - loss: 0.5787 - acc: 0.6984
891/891 [==============================] - 0s - loss: 0.5681 - acc: 0.7082
Epoch 4/10
32/891 [>.............................] - ETA: 0s - loss: 0.5486 - acc: 0.7188
704/891 [======================>.......] - ETA: 0s - loss: 0.5750 - acc: 0.7003
891/891 [==============================] - 0s - loss: 0.5588 - acc: 0.7127
Epoch 5/10
32/891 [>.............................] - ETA: 0s - loss: 0.5612 - acc: 0.6562
704/891 [======================>.......] - ETA: 0s - loss: 0.5492 - acc: 0.7244
891/891 [==============================] - 0s - loss: 0.5561 - acc: 0.7183
Epoch 6/10
32/891 [>.............................] - ETA: 0s - loss: 0.7451 - acc: 0.5312
480/891 [===============>..............] - ETA: 0s - loss: 0.5541 - acc: 0.7104
891/891 [==============================] - 0s - loss: 0.5496 - acc: 0.7104
Epoch 7/10
32/891 [>.............................] - ETA: 0s - loss: 0.6328 - acc: 0.6875
448/891 [==============>...............] - ETA: 0s - loss: 0.5502 - acc: 0.7098
891/891 [==============================] - 0s - loss: 0.5489 - acc: 0.7160
Epoch 8/10
32/891 [>.............................] - ETA: 0s - loss: 0.5035 - acc: 0.7188
704/891 [======================>.......] - ETA: 0s - loss: 0.5539 - acc: 0.7159
891/891 [==============================] - 0s - loss: 0.5528 - acc: 0.7138
Epoch 9/10
32/891 [>.............................] - ETA: 0s - loss: 0.6945 - acc: 0.5938
704/891 [======================>.......] - ETA: 0s - loss: 0.5567 - acc: 0.7216
891/891 [==============================] - 0s - loss: 0.5452 - acc: 0.7306
Epoch 10/10
32/891 [>.............................] - ETA: 0s - loss: 0.6644 - acc: 0.6250
736/891 [=======================>......] - ETA: 0s - loss: 0.5352 - acc: 0.7201
891/891 [==============================] - 0s - loss: 0.5292 - acc: 0.7284
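As an aside, to_categorical() one-hot encodes integer class labels, which is why the output layer above has 2 nodes; a tiny standalone sketch:
from keras.utils import to_categorical
import numpy as np
labels = np.array([0, 1, 1, 0])
# Each label becomes a one-hot row: 0 -> [1, 0], 1 -> [0, 1]
print(to_categorical(labels))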
<script.py> output:
Epoch 1/10
32/891 [>.............................] - ETA: 0s - loss: 4.7317 - acc: 0.4375
736/891 [=======================>......] - ETA: 0s - loss: 2.5428 - acc: 0.5870
891/891 [==============================] - 0s - loss: 2.5158 - acc: 0.5881
Epoch 2/10
32/891 [>.............................] - ETA: 0s - loss: 2.5521 - acc: 0.3438
736/891 [=======================>......] - ETA: 0s - loss: 1.0522 - acc: 0.6508
891/891 [==============================] - 0s - loss: 1.0600 - acc: 0.6476
Epoch 3/10
32/891 [>.............................] - ETA: 0s - loss: 0.6703 - acc: 0.7500
736/891 [=======================>......] - ETA: 0s - loss: 0.7885 - acc: 0.6250
891/891 [==============================] - 0s - loss: 0.7784 - acc: 0.6352
Epoch 4/10
32/891 [>.............................] - ETA: 0s - loss: 1.2284 - acc: 0.5312
736/891 [=======================>......] - ETA: 0s - loss: 0.7460 - acc: 0.6522
891/891 [==============================] - 0s - loss: 0.7141 - acc: 0.6667
Epoch 5/10
32/891 [>.............................] - ETA: 0s - loss: 0.5634 - acc: 0.7500
736/891 [=======================>......] - ETA: 0s - loss: 0.6167 - acc: 0.6984
891/891 [==============================] - 0s - loss: 0.6139 - acc: 0.6947
Epoch 6/10
32/891 [>.............................] - ETA: 0s - loss: 0.7388 - acc: 0.5625
736/891 [=======================>......] - ETA: 0s - loss: 0.6097 - acc: 0.6957
891/891 [==============================] - 0s - loss: 0.6145 - acc: 0.6857
Epoch 7/10
32/891 [>.............................] - ETA: 0s - loss: 0.7294 - acc: 0.6250
736/891 [=======================>......] - ETA: 0s - loss: 0.6105 - acc: 0.6875
891/891 [==============================] - 0s - loss: 0.5990 - acc: 0.7003
Epoch 8/10
32/891 [>.............................] - ETA: 0s - loss: 0.5101 - acc: 0.7188
736/891 [=======================>......] - ETA: 0s - loss: 0.5877 - acc: 0.7079
891/891 [==============================] - 0s - loss: 0.5933 - acc: 0.7104
Epoch 9/10
32/891 [>.............................] - ETA: 0s - loss: 0.8791 - acc: 0.4062
736/891 [=======================>......] - ETA: 0s - loss: 0.5938 - acc: 0.7052
891/891 [==============================] - 0s - loss: 0.5905 - acc: 0.7104
Epoch 10/10
32/891 [>.............................] - ETA: 0s - loss: 0.8245 - acc: 0.6250
736/891 [=======================>......] - ETA: 0s - loss: 0.6168 - acc: 0.6739
891/891 [==============================] - 0s - loss: 0.6047 - acc: 0.6857
In [1]:
Making predictions
The trained network from your previous coding exercise is now stored as model. New data to make predictions on is stored in a NumPy array called pred_data. Use model to make predictions on your new data.
In this exercise, your predictions will be probabilities, which is the most common way for data scientists to communicate their predictions to colleagues.
- Create your predictions using the model's .predict() method on pred_data.
- Use NumPy indexing to find the column corresponding to predicted probabilities of survival being True. This is the second column (index 1) of predictions. Store the result in predicted_prob_true and print it.
# Specify, compile, and fit the model
model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(n_cols,)))
model.add(Dense(2, activation='softmax'))
model.compile(optimizer='sgd',
loss='categorical_crossentropy',
metrics=['accuracy'])
model.fit(predictors, target)
# Calculate predictions: predictions
predictions = model.predict(pred_data)
# Calculate predicted probability of survival: predicted_prob_true
predicted_prob_true = predictions[:,1]
# Print the predicted probabilities of survival
print(predicted_prob_true)
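If you need hard class labels rather than probabilities, a common follow-up is to threshold or take the argmax; a minimal sketch (the 0.5 threshold is a conventional choice, not part of the exercise):
import numpy as np
# Threshold the survival probabilities at 0.5 to get 0/1 class labels
predicted_class = (predicted_prob_true > 0.5).astype(int)
print(predicted_class)
# Equivalent here: predicted_class = np.argmax(predictions, axis=1)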