1.2 Most Famous Convolutional Neural Networks and Comparisons

You are probably already familiar with some neural network architectures. The AlexNet success story is the first that comes to mind: a Convolutional Neural Network that cut the top-5 error (don't worry about the technical detail for now) on the ImageNet classification problem from about 26% to 15.3%. The closest competitor, which was not a Convolutional Neural Network, had a 26.2% error.
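If you are curious about what "top-5" means, here is a minimal sketch in plain Python. A prediction counts as correct when the true label is anywhere among the five highest-scoring classes. The function name `top_k_error` and the toy scores are my own assumptions for illustration, not ImageNet data.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is NOT among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the first k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy data: 3 samples, 6 classes (hypothetical scores, not real results).
scores = [
    [0.05, 0.10, 0.40, 0.20, 0.15, 0.10],  # true class 2 is ranked first
    [0.30, 0.25, 0.20, 0.15, 0.09, 0.01],  # true class 5 is ranked last
    [0.10, 0.35, 0.05, 0.30, 0.15, 0.05],  # true class 3 is ranked second
]
labels = [2, 5, 3]

top1 = top_k_error(scores, labels, k=1)  # strict: only the best guess counts
top5 = top_k_error(scores, labels, k=5)  # lenient: any of the five best guesses counts
print(f"top-1 error: {top1:.2f}, top-5 error: {top5:.2f}")
```

On this toy data the top-5 error is lower than the top-1 error, which is why ImageNet results usually state which of the two they report.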

Makine Öğrenmesi (Machine Learning) Channel

If you’ve coded a Convolutional Neural Network for MNIST, you have already built the structure closest to AlexNet, only without dropout, with fewer layers, and so on.

Small Portion from Keras

One of the first applications of CNNs was built by Yann LeCun, as you can see here.

Smile of Success

Things started in 1993, got hyped in 2012 with AlexNet, and have become complicated since. Let’s dive in a little and see the basic differences between modern architectures. I hope I will not make you uncomfortable with technical details.


VGG-16: just add more convolutional layers to AlexNet. At ImageNet 2014, the VGG team won the localization task and took second place in classification.

I was going to leave VGG-16 at that one sentence. However, if you check the ImageNet competition, what changes from year to year is the number of layers. Nowadays we see winners with more than 100 layers, yet there has been no dramatic revolution. We need to wait for another change; we need to wait for a new LeNet.
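To make "more layers" concrete, here is a rough sketch that walks through the VGG-16 layout as it is commonly published (13 convolutional layers plus 3 fully connected ones, against 5 convolutional layers in AlexNet) and counts the weights. The constant and function names are my own; I am assuming 224x224 RGB inputs, 3x3 kernels everywhere, and 1000 output classes.

```python
# VGG-16 layout as commonly published: each number is the output channel count
# of a 3x3 convolution, "M" marks a 2x2 max-pooling step (which has no weights).
VGG16_CONV = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]
# Fully connected head, assuming 1000 ImageNet classes.
VGG16_DENSE = [4096, 4096, 1000]


def count_vgg16(input_size=224, input_channels=3):
    """Return (conv layer count, dense layer count, total parameter count)."""
    in_ch, size, params, conv_layers = input_channels, input_size, 0, 0
    for item in VGG16_CONV:
        if item == "M":
            size //= 2  # pooling halves height and width
        else:
            params += 3 * 3 * in_ch * item + item  # kernel weights + biases
            in_ch = item
            conv_layers += 1
    features = in_ch * size * size  # flattened feature map fed to the dense head
    for units in VGG16_DENSE:
        params += features * units + units
        features = units
    return conv_layers, len(VGG16_DENSE), params


conv_layers, dense_layers, total_params = count_vgg16()
print(conv_layers + dense_layers, "weight layers,", f"{total_params:,}", "parameters")
```

Most of the weights end up in the fully connected head rather than in the convolutions, which is worth remembering when you compare file sizes of saved models.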


In the Inception family, filter sizes get smaller and smaller, which reduces the number of parameters. You can think of the weights as the parameters. Fewer parameters means fewer weights, fewer things to adjust, and faster convergence.
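A quick back-of-the-envelope check of the "smaller filters, fewer parameters" idea: two stacked 3x3 convolutions (stride 1) look at the same 5x5 region as a single 5x5 convolution, but need fewer weights. The helper `conv_params` and the choice of 64 channels are assumptions for illustration only.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of one square 2D convolution layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


channels = 64  # hypothetical channel count, kept the same in and out

one_5x5 = conv_params(5, channels, channels)
two_3x3 = 2 * conv_params(3, channels, channels)  # same 5x5 receptive field

saving = 1 - two_3x3 / one_5x5
print(f"one 5x5: {one_5x5:,}  two 3x3: {two_3x3:,}  saving: {saving:.0%}")
```

With these numbers the stacked 3x3 version needs roughly 28% fewer parameters.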

As you can see, the filter sizes differ within the same layer, which is unusual compared to VGG and AlexNet. The saved weights also take up far fewer megabytes. I mean, really few. The Inception V3 model was a great success in its year, and it is sufficient for your current projects.
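Here is a simplified sketch of the "different filter sizes in the same layer" idea. Each branch convolves the same input with its own kernel size, and the results are stacked along the channel axis. Real Inception modules also include 1x1 reduction layers and a pooling branch, which I leave out; the function names and the branch sizes are hypothetical.

```python
def conv_output_size(size, kernel, stride=1):
    """Spatial output size of a convolution with 'same'-style padding (odd kernels)."""
    pad = (kernel - 1) // 2
    return (size + 2 * pad - kernel) // stride + 1


def inception_like_block(size, in_channels, branches):
    """branches maps kernel size -> number of filters (illustrative values).

    Returns (output spatial size, output channels, parameter count).
    """
    out_sizes = {conv_output_size(size, k) for k in branches}
    # Branches can only be concatenated if they agree on height and width.
    assert len(out_sizes) == 1, "branch outputs must have the same spatial size"
    out_channels = sum(branches.values())  # concatenation along channels
    params = sum(k * k * in_channels * f + f for k, f in branches.items())
    return out_sizes.pop(), out_channels, params


# Hypothetical block: 1x1, 3x3 and 5x5 filters side by side on a 28x28x192 input.
out_size, out_channels, block_params = inception_like_block(
    28, 192, {1: 64, 3: 128, 5: 32}
)
print(out_size, out_channels, f"{block_params:,}")
```

The padding keeps every branch at 28x28, so the three outputs can be stacked into one 224-channel feature map.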

Keras has pre-trained weights, which we’ll discuss; see VGG19 and InceptionV3, and don’t worry about the others for now.

Then we have capsule networks, proposed by Geoffrey Hinton. However, they require more technical explanation, and I’m not an expert on them.

Why are we learning these architectures?

  • To get familiar with them; we need to learn their strengths and weaknesses.
  • We’ll talk about transfer learning, and that’s the core idea.
  • On the practical side, we should give them a chance. You see lots of architectures in the Keras table, and some of them suit different problems. We’ll discuss them, and this is the little background information you need.
