.fit VS .fit_generator in Keras
For small, simple datasets it’s perfectly acceptable to use Keras’ .fit function.
These datasets are often not very challenging and do not require any data augmentation.
However, real-world datasets are rarely that simple:
- Real-world datasets are often too large to fit into memory.
- They also tend to be challenging, requiring us to perform data augmentation to avoid overfitting and increase the ability of our model to generalize.
In those situations we need to use Keras’ .fit_generator function:
# import the Keras class used for on-the-fly data augmentation
from keras.preprocessing.image import ImageDataGenerator

# initialize the number of epochs and batch size
EPOCHS = 100
BS = 32

# construct the training image generator for data augmentation
aug = ImageDataGenerator(rotation_range=20, zoom_range=0.15,
    width_shift_range=0.2, height_shift_range=0.2, shear_range=0.15,
    horizontal_flip=True, fill_mode="nearest")

# train the network (model, trainX, trainY, testX, and testY are
# assumed to have been created earlier)
H = model.fit_generator(aug.flow(trainX, trainY, batch_size=BS),
    validation_data=(testX, testY), steps_per_epoch=len(trainX) // BS,
    epochs=EPOCHS)
We start by initializing the number of epochs we are going to train our network for, along with the batch size.
We then initialize aug, a Keras ImageDataGenerator object that applies data augmentation, randomly shifting, rotating, zooming, shearing, and flipping images on the fly.
Performing data augmentation is a form of regularization, enabling our model to generalize better.
However, applying data augmentation implies that our training data is no longer “static” — the data is constantly changing.
Each new batch of data is randomly adjusted according to the parameters supplied to ImageDataGenerator.
Thus, we now need to use Keras’ .fit_generator function to train our model.
As the name suggests, the .fit_generator function assumes there is an underlying function that is generating the data for it.
The function itself is a Python generator.
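To make the idea of a data-generating function concrete, here is a minimal sketch of a Python generator that yields batches forever. It is not Keras code: batch_generator and the toy samples and labels lists are hypothetical names invented for this illustration, and plain Python lists stand in for image arrays.

```python
import itertools
import random


def batch_generator(samples, labels, batch_size, seed=0):
    """Yield (batch_samples, batch_labels) tuples forever (hypothetical helper)."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    while True:  # never return: the consumer decides when to stop
        rng.shuffle(indices)  # reshuffle at the start of every pass over the data
        for start in range(0, len(indices), batch_size):
            chosen = indices[start:start + batch_size]
            yield [samples[i] for i in chosen], [labels[i] for i in chosen]


# toy data standing in for images and their class labels
samples = list(range(10))
labels = [s % 2 for s in samples]

gen = batch_generator(samples, labels, batch_size=4)

# pull five batches; the generator itself would happily keep going
first_batches = list(itertools.islice(gen, 5))
batch_sizes = [len(batch_x) for batch_x, _ in first_batches]
```

Because the generator never stops on its own, the caller has to decide how many batches to draw, which is exactly the role steps_per_epoch plays.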
Internally, Keras uses the following process when training a model with .fit_generator:
- Keras calls the generator function supplied to .fit_generator (in this case, aug.flow).
- The generator function yields a batch of size BS to the .fit_generator function.
- The .fit_generator function accepts the batch of data, performs backpropagation, and updates the weights in our model.
This process is repeated until we have reached the desired number of epochs.
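The loop described above can be sketched in plain Python. This is a simplified stand-in and not Keras' actual implementation: ToyModel, endless_batches, and toy_fit_generator are hypothetical names, and the "model" is a single weight fitted to y = 2x with hand-written gradient descent so the example runs without any deep learning library.

```python
def endless_batches(xs, ys, batch_size):
    """Yield (batch_x, batch_y) forever, cycling over the data (hypothetical)."""
    while True:
        for start in range(0, len(xs), batch_size):
            yield xs[start:start + batch_size], ys[start:start + batch_size]


class ToyModel:
    """One-weight linear model y = w * x, trained with gradient descent."""

    def __init__(self, lr=0.01):
        self.w = 0.0
        self.lr = lr

    def train_on_batch(self, batch_x, batch_y):
        n = len(batch_x)
        # mean squared error on this batch, computed before the update
        loss = sum((self.w * x - y) ** 2 for x, y in zip(batch_x, batch_y)) / n
        # gradient of the loss with respect to w, then one weight update
        grad = sum(2 * (self.w * x - y) * x for x, y in zip(batch_x, batch_y)) / n
        self.w -= self.lr * grad
        return loss


def toy_fit_generator(model, generator, steps_per_epoch, epochs):
    """Mimic the request-batch / update-weights loop described above."""
    history = []
    for _ in range(epochs):
        losses = []
        for _ in range(steps_per_epoch):
            batch_x, batch_y = next(generator)  # ask the generator for a batch
            losses.append(model.train_on_batch(batch_x, batch_y))
        history.append(sum(losses) / len(losses))  # mean loss for this epoch
    return history


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
BS = 2

model = ToyModel()
history = toy_fit_generator(model, endless_batches(xs, ys, BS),
                            steps_per_epoch=len(xs) // BS, epochs=20)
```

Note that the inner loop is bounded only by steps_per_epoch; the generator never signals that a pass over the data has finished.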
You’ll notice we now need to supply a steps_per_epoch parameter when calling .fit_generator (a plain .fit call on in-memory arrays does not need it).
Why do we need steps_per_epoch?
Keep in mind that a Keras data generator is meant to loop infinitely — it should never return or exit.
Since the function is intended to loop infinitely, Keras has no ability to determine when one epoch ends and a new epoch begins.
Therefore, we compute the steps_per_epoch value as the total number of training data points divided by the batch size, using integer division (len(trainX) // BS). Once Keras hits this step count, it knows that a new epoch has begun.
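As a worked example of that calculation, the sketch below uses a hypothetical training set of 50,000 samples (the number is an assumption for illustration, not taken from the text above). It also shows that floor division leaves out a partial batch when the dataset size is not a multiple of the batch size; rounding up instead is one common alternative.

```python
import math

num_train = 50000  # hypothetical dataset size, stands in for len(trainX)
BS = 32

# floor division, as in the training call above
steps_floor = num_train // BS              # 1562 full batches
leftover = num_train - steps_floor * BS    # 16 samples not covered by full batches

# rounding up counts the final, smaller batch as a step too
steps_ceil = math.ceil(num_train / BS)     # 1563

print(steps_floor, leftover, steps_ceil)   # 1562 16 1563
```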