Keras tutorial – build a convolutional neural network in 11 lines


In a previous tutorial, I demonstrated how to create a convolutional neural network (CNN) using TensorFlow to classify the MNIST handwritten digit dataset.  TensorFlow is a brilliant tool, with lots of power and flexibility.  However, for quick prototyping work it can be a bit verbose.  Enter Keras and this Keras tutorial.  Keras is a higher-level library which operates over either TensorFlow or Theano, and is intended to streamline the process of building deep learning networks.  In fact, what was accomplished in the previous TensorFlow tutorial in around 42 lines* can be replicated in only 11 lines* in Keras.  This Keras tutorial will show you how to do this.

*excluding input data preparation and visualisation

This Keras tutorial will show you how to build a CNN to achieve >99% accuracy with the MNIST dataset.  It will be precisely the same structure as that built in my previous convolutional neural network tutorial and the figure below shows the architecture of the network:


Convolutional neural network that will be built

The full code of this Keras tutorial can be found here. If you’d like to check out more Keras awesomeness after reading this post, have a look at my Keras LSTM tutorial or my Keras Reinforcement Learning tutorial. Also check out my tutorial on Convolutional Neural Networks in PyTorch if you’re interested in the PyTorch library.

The main code in this Keras tutorial

The code below is the “guts” of the CNN structure that will be used in this Keras tutorial:

model = Sequential()
model.add(Conv2D(32, kernel_size=(5, 5), strides=(1, 1), activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Conv2D(64, (5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(1000, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

I’ll go through most of the lines in turn, explaining as we go.

model = Sequential()

Models in Keras can come in two forms – Sequential and via the Functional API.  For most deep learning networks that you build, the Sequential model is likely what you will use.  It allows you to easily stack sequential layers (and even recurrent layers) of the network in order from input to output.  The functional API allows you to build more complicated architectures, and it won’t be covered in this tutorial.

The first line declares the model type as Sequential().

model.add(Conv2D(32, kernel_size=(5, 5), strides=(1, 1), activation='relu', input_shape=input_shape))

Next, we add a 2D convolutional layer to process the 2D MNIST input images.  The first argument passed to the Conv2D() layer function is the number of output channels – in this case we have 32 output channels (as per the architecture shown at the beginning).  The next input is the kernel_size, which in this case we have chosen to be a 5×5 moving window, followed by the strides in the x and y directions (1, 1).  Next, the activation function is a rectified linear unit and finally we have to supply the model with the size of the input to the layer (which is declared in another part of the code – see here).  Declaring the input shape is only required of the first layer – Keras is good enough to work out the size of the tensors flowing through the model from there.

Also notice that we don’t have to declare any weights or bias variables like we do in TensorFlow – Keras sorts that out for us.
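Incidentally, the input_shape fed to that first layer comes from the data-preparation part of the code.  As a minimal sketch (the variable names and the random stand-in array here are illustrative, not taken from the full code), adding the channel axis that Conv2D expects looks like this:

```python
import numpy as np

# MNIST images arrive as (num_samples, 28, 28); Conv2D expects a trailing
# channel axis, so we reshape to (num_samples, 28, 28, 1).  The random
# array below is just a stand-in for the real MNIST data.
x_train = np.random.rand(60000, 28, 28)
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1).astype('float32')

input_shape = (28, 28, 1)   # what the first Conv2D layer is told
print(x_train.shape)        # (60000, 28, 28, 1)
```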

model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

Next we add a 2D max pooling layer.  The definition of the layer is dead easy.  We simply specify the size of the pooling in the x and y directions – (2, 2) in this case, and the strides.  That’s it.

model.add(Conv2D(64, (5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

Next we add another convolutional + max pooling layer, with 64 output channels.  The default strides argument in the Conv2D() function is (1, 1) in Keras, so we can leave it out.  The default strides argument in MaxPooling2D() is to make it equal to the pool size, so again, we can leave it out.

The input tensor for this layer is (batch_size, 12, 12, 32) – the 12 x 12 is the spatial size of the feature maps after the first convolution (28 → 24 with a 5 x 5 window and ‘valid’ padding) and pooling (24 → 12), and the 32 is the number of output channels from the previous layer.  However, notice we don’t have to explicitly detail what the shape of the input is – Keras will work it out for us.  This allows rapid assembling of network architectures without having to worry too much about the sizes of the tensors flowing around our networks.
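As a sanity check on that tensor-size bookkeeping, the ‘valid’-padding arithmetic that Keras performs for us can be reproduced in a few lines of plain Python (a quick sketch, no Keras required):

```python
def out_size(in_size, window, stride):
    """Spatial output size of a conv/pool layer with 'valid' padding."""
    return (in_size - window) // stride + 1

size = 28                    # MNIST images are 28 x 28
size = out_size(size, 5, 1)  # first Conv2D, 5x5 window, stride 1 -> 24
size = out_size(size, 2, 2)  # first MaxPooling2D, 2x2, stride 2 -> 12
size = out_size(size, 5, 1)  # second Conv2D, 5x5 window, stride 1 -> 8
size = out_size(size, 2, 2)  # second MaxPooling2D, 2x2, stride 2 -> 4

print(size)              # 4
print(size * size * 64)  # 1024 values flow into the flatten / Dense layers
```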

model.add(Flatten())
model.add(Dense(1000, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

Now that we’ve built our convolutional layers in this Keras tutorial, we want to flatten the output from these to enter our fully connected layers (all this is detailed in the convolutional neural network tutorial in TensorFlow).  In TensorFlow, we had to figure out what the size of our output tensor from the convolutional layers was in order to flatten it, and also to determine explicitly the size of our weight and bias variables.  Sure, this isn’t too difficult – but it just makes our life easier not to have to think about it too much.

The next two lines declare our fully connected layers – using the Dense() layer in Keras.  Again, it is very simple.  First we specify the size – in line with our architecture, we specify 1000 nodes, each activated by a ReLU function.  The second is our softmax classification, or output, layer, which is sized to the number of our classes (10 in this case, for our 10 possible hand-written digits).

That’s it – we have successfully developed the architecture of our CNN in only 8 lines.  Now let’s see what we have to do to train the model and perform predictions.

Training and evaluating our convolutional neural network

We have now developed the architecture of the CNN in Keras, but we haven’t specified the loss function, or told the framework what type of optimiser to use (i.e. gradient descent, Adam optimiser etc.).  In Keras, this can be performed in one command:

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])

Keras supplies many loss functions (or you can build your own) as can be seen here.  In this case, we will use the standard cross entropy for categorical classification (keras.losses.categorical_crossentropy).  Keras also supplies many optimisers – as can be seen here.  In this case, we’ll use the Adam optimiser (keras.optimizers.Adam) as we did in the CNN TensorFlow tutorial.  Finally, we can specify a metric that will be calculated when we run evaluate() on the model.  In TensorFlow we would have to define an accuracy-calculating operation which we would need to call in order to assess the accuracy.  In this case, Keras makes it easy for us.  See here for a list of metrics that can be used.

Next, we want to train our model.  This can be done by again running a single command in Keras:

model.fit(x_train, y_train,
          batch_size=128,
          epochs=10,
          verbose=1,
          validation_data=(x_test, y_test),
          callbacks=[history])

This command looks similar to the syntax used in the very popular scikit learn Python machine learning library.  We first pass in all of our training data – in this case x_train and y_train.  The next argument is the batch size – we don’t have to explicitly handle the batching up of our data during training in Keras, rather we just specify the batch size and it does it for us (I have a post on mini-batch gradient descent if this is unfamiliar to you).  In this case we are using a batch size of 128.  Next we pass the number of training epochs (10 in this case).  The verbose flag, set to 1 here, specifies if you want detailed information being printed in the console about the progress of the training.  During training, if verbose is set to 1, the following is output to the console:

3328/60000 [>.............................] - ETA: 87s - loss: 0.2180 - acc: 0.9336
3456/60000 [>.............................] - ETA: 87s - loss: 0.2158 - acc: 0.9349
3584/60000 [>.............................] - ETA: 87s - loss: 0.2145 - acc: 0.9350
3712/60000 [>.............................] - ETA: 86s - loss: 0.2150 - acc: 0.9348
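What Keras is doing with that batch size behind the scenes is easy to reason about in plain Python (a sketch; batch_indices is an invented helper, not a Keras function):

```python
import math

# With 60,000 training samples and a batch size of 128, Keras makes
# ceil(60000 / 128) = 469 weight updates per epoch.
num_samples, batch_size = 60000, 128
steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)  # 469

def batch_indices(n, size):
    """Yield (start, end) index pairs covering n samples in mini-batches."""
    for start in range(0, n, size):
        yield start, min(start + size, n)

first_three = list(batch_indices(num_samples, batch_size))[:3]
print(first_three)  # [(0, 128), (128, 256), (256, 384)]
```

The final batch of an epoch is simply smaller when the sample count doesn’t divide evenly by the batch size.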

Finally, we pass the validation or test data to the fit function so Keras knows what data to test the metric against when evaluate() is run on the model.  Ignore the callbacks argument for the moment – that will be discussed shortly.

Once the model is trained, we can then evaluate it and print the results:

score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

After 10 epochs of training the above model, we achieve an accuracy of 99.2%, which is the same as what we achieved in TensorFlow for the same network.  You can see the improvement in the accuracy for each epoch in the figure below:


Keras CNN MNIST training accuracy

Keras makes things pretty easy, don’t you think? I hope this Keras tutorial has demonstrated how it can be a useful framework for rapidly prototyping deep learning solutions.

As a kind of appendix I’ll show you how to keep track of the accuracy as we go through the training epochs, which enabled me to generate the graph above.

Logging metrics in Keras

Keras has a useful utility called “callbacks”, which can be utilised to track all sorts of variables during training.  You can also use it to create checkpoints, which save the model at different stages of training to help you avoid losing work if your poor overworked computer decides to crash.  Callbacks are passed to the .fit() function, as observed above.  I’ll only show you a fairly simple use case below, which logs the accuracy.

To create a callback, we define a class which inherits from keras.callbacks.Callback:

class AccuracyHistory(keras.callbacks.Callback):
    def on_train_begin(self, logs={}):
        self.acc = []

    def on_epoch_end(self, batch, logs={}):
        self.acc.append(logs.get('acc'))

The Callback super class that the code above inherits from has a number of methods that can be overridden in our callback definition, such as on_train_begin, on_epoch_end, on_batch_begin and on_batch_end.  The names of these methods are fairly self-explanatory, and represent moments in the training process where we can “do stuff”.  In the code above, at the beginning of training we initialise a list self.acc = [] to store our accuracy results.  Using the on_epoch_end() method, we can extract the variable we want from the logs, which is a dictionary that holds, as a default, the loss and accuracy during training.  We then instantiate this callback like so:

history = AccuracyHistory()

Now we can pass history to the .fit() function using the callbacks parameter name.  Note that .fit() takes a list for the callbacks parameter, so you have to pass it history like this: [history].  To access the accuracy list that we created after the training is complete, you can simply call history.acc, which I then also plotted:

plt.plot(range(1,11), history.acc)
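If you’d like to see how those callback hooks fire without running a full Keras training session, here is a toy, Keras-free sketch (ToyCallback, toy_fit and the fake accuracy numbers are all invented for illustration):

```python
# Mimics the contract Keras uses: on_train_begin fires once, then
# on_epoch_end fires after every epoch with a logs dictionary.
class ToyCallback:
    def on_train_begin(self, logs=None): pass
    def on_epoch_end(self, epoch, logs=None): pass

class ToyAccuracyHistory(ToyCallback):
    def on_train_begin(self, logs=None):
        self.acc = []
    def on_epoch_end(self, epoch, logs=None):
        self.acc.append(logs.get('acc'))

def toy_fit(epochs, callbacks):
    """Stand-in training loop that just drives the callback hooks."""
    for cb in callbacks:
        cb.on_train_begin()
    for epoch in range(epochs):
        fake_acc = round(0.9 + 0.01 * epoch, 2)  # stand-in for real accuracy
        for cb in callbacks:
            cb.on_epoch_end(epoch, logs={'acc': fake_acc})

toy_history = ToyAccuracyHistory()
toy_fit(3, [toy_history])
print(toy_history.acc)  # [0.9, 0.91, 0.92]
```

The real Keras loop passes richer logs dictionaries, but the sequence of calls is the same.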

Hope that helps.  Have fun using Keras. As I said at the beginning of the post, if you’d like to check out more Keras awesomeness after reading this post, have a look at my Keras LSTM tutorial.
