Recent developments in artificial intelligence (AI) have ushered in an era of so-called “deep learning.” This technique is characterized by AI algorithms that learn complex patterns from large datasets through layers of computational functions or “neural networks.”
Deep neural networks are very powerful tools, but how they work may be difficult to understand for non-experts. Fortunately, there is a relatively straightforward way to visualize how deep learning models function!
In this article you will learn about one such model, the convolutional neural network (CNN). Once you understand how CNNs work, you can begin applying them to practical problems.
But first, let’s take a quick look at what makes CNNs special.
What sets CNNs apart?
There are several reasons why CNNs are often more effective than other machine learning techniques on image data. They include:
1. Depth
An important factor behind CNNs’ effectiveness is their depth, or the number of layers stacked between the input and the output. More layers and more neurons give the network greater capacity to represent complex patterns.
Deeper nets can often learn richer features, although they are also harder to train. When we say ‘net’ we refer to either the horizontal structure of a layer (the input, hidden, and output nodes), or the vertical structure of the whole architecture (layers connected with each other).
2.
Understanding neural network architectures
Neural networks are one of the most powerful machine learning algorithms available today. They have seen dramatic growth in popularity over the past few years, with applications ranging from image recognition to natural language processing (NLP).
By their very nature, neural networks can learn complex functions. This makes them effective at solving problems that require more than just identifying individual items or characters. For example, they can identify objects by combining small pieces of information about shape, position, and context.
Because of this, neural networks play an important role in computer vision, where they are often used to classify images and read the contents of documents. They are also widely used in speech recognition.
Deep convolutional neural networks (DCNNs) are one type of neural network that has become particularly popular due to its impressive performance in visual tasks like object classification. DCNNs are trained using what’s called backpropagation, the process whereby the error at the output is passed backward through the layers to work out how much each weight contributed to it, so that the weights can be adjusted to reduce that error.
In fact, it is common to use the verb ‘backpropagate’ when describing how error signals flow through deep models.
Backpropagating error correction
The way the neurons in each layer of a neural net receive input values and compute output values, which are then passed on to the next layer, is referred to as activation propagation or feedforward propagation. The opposite direction, in which the error is passed from the output back toward the input, is usually referred to as backward propagation.
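To make the two directions concrete, here is a minimal sketch using a single sigmoid neuron and a squared-error loss. The function names (forward, backward, train_step), the starting weights, and the learning rate are all made up for illustration; a real network repeats the same idea across many neurons and layers.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, b, x):
    """Feedforward pass: weighted sum of the inputs, then a sigmoid activation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

def backward(w, b, x, y_true):
    """Backward pass: gradients of the loss 0.5 * (a - y)^2 with respect to w and b."""
    a = forward(w, b, x)
    delta = (a - y_true) * a * (1.0 - a)  # chain rule: dL/da * da/dz
    grad_w = [delta * xi for xi in x]     # dz/dw_i = x_i
    grad_b = delta                        # dz/db = 1
    return grad_w, grad_b

def train_step(w, b, x, y_true, lr=0.5):
    """One gradient-descent update using the backpropagated gradients."""
    grad_w, grad_b = backward(w, b, x, y_true)
    new_w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
    return new_w, b - lr * grad_b

# Illustrative starting point (assumed values, not from any real model).
w, b = [0.1, -0.2], 0.0
x, y_true = [1.0, 2.0], 1.0

loss_before = 0.5 * (forward(w, b, x) - y_true) ** 2
for _ in range(100):
    w, b = train_step(w, b, x, y_true)
loss_after = 0.5 * (forward(w, b, x) - y_true) ** 2

print(loss_before, loss_after)  # the loss should shrink after training
```

The forward function only moves values from input to output. The backward function is where the error travels in the opposite direction, and train_step uses the result to nudge the weights.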
Activation functions
In deep learning, an activation function is a mathematical function applied to the output of a neuron. Its job is to introduce non-linearity, which is what allows the network to learn patterns that a purely linear model cannot.
One classic activation function is the sigmoid, which produces output values between 0 and 1. It is closely related to the softmax function, which is used for multi-class outputs: softmax turns a list of scores into probabilities that sum to 1, so higher scores become more likely than lower ones.
Other activation functions include the ReLU (Rectified Linear Unit), which is the most widely used choice in modern deep networks, as well as tanh, leaky ReLU, and hardtanh. While some work better than others depending on the task, all of them beat using no activation at all, because without one a stack of layers collapses into a single linear transformation.
Keras includes built-in versions of common activation functions such as sigmoid, tanh, and ReLU. To use one, either pass its name through a layer’s activation argument or add a separate Activation layer after the layer. You also have the option to mix and match different activations across layers if you would like.
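To get a feel for what these functions actually compute, here is a plain-Python sketch of each one. These are hand-written illustrations, not the Keras implementations, and the default slope alpha=0.01 for leaky ReLU is just a commonly used assumption. Softmax is included to show how it differs from sigmoid: it works on a whole list of scores rather than a single value.

```python
import math

def sigmoid(x):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squashes any real number into the range (-1, 1)."""
    return math.tanh(x)

def relu(x):
    """Passes positive values through and clips negatives to zero."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Like ReLU, but lets a small fraction of negative values through."""
    return x if x > 0 else alpha * x

def hard_tanh(x):
    """Piecewise-linear approximation of tanh: clips values to [-1, 1]."""
    return max(-1.0, min(1.0, x))

def softmax(scores):
    """Turns a list of scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

for name, fn in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu),
                 ("leaky_relu", leaky_relu), ("hard_tanh", hard_tanh)]:
    print(name, [round(fn(v), 3) for v in (-2.0, 0.0, 2.0)])

print("softmax", [round(p, 3) for p in softmax([1.0, 2.0, 3.0])])
```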
Loss functions
A loss function is a measure of how far your model’s predictions are from the true labels, in this case for classifying images. The smaller the loss value, the better the model fits the data it is evaluated on!
There are two main types of losses used for deep learning models: classification losses and regression losses.
Different loss functions compute this distance in different ways. Mean squared error (MSE), for example, is the average of the squared differences between each output and its true value, while mean absolute error (MAE) is the average of the absolute differences.
Some common classification losses include cross-entropy loss, categorical cross entropy loss, focal loss, softmax loss and multi-label sigmoid loss. Some regression losses include MSE, root mean square error (RMSE), mean absolute error (MAE) and sum of squares error (SSE).
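Several of the losses named above fit in a line or two of plain Python. The sketch below is for illustration only: the function names and sample numbers are made up, and the cross-entropy version assumes a one-hot true label and clips probabilities with a small eps to avoid taking log(0).

```python
import math

def sse(y_true, y_pred):
    """Sum of squared errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

def mse(y_true, y_pred):
    """Mean squared error: average of the squared differences."""
    return sse(y_true, y_pred) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean absolute error: average of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot label and predicted class probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

# Regression example (made-up numbers).
targets, outputs = [1.0, 2.0, 3.0], [1.0, 2.0, 5.0]
print(mse(targets, outputs), rmse(targets, outputs), mae(targets, outputs))

# Classification example: the true class is the second of three.
label, probs = [0, 1, 0], [0.1, 0.8, 0.1]
print(categorical_cross_entropy(label, probs))
```

Notice that the cross-entropy only looks at the probability assigned to the correct class: the more confident the model is in the right answer, the closer the loss gets to zero.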
How to choose a neural network architecture
Choosing an appropriate model architecture is one of the most important things you will do in AI research!
In this article, we’ll go over some common models such as convolutional networks and recurrent networks, before looking into how they are built using Theano (a Python deep learning library that served as one of the original Keras backends). By understanding these concepts, you’ll be well prepared for any new architectures that you want to try out!
We’ll also take a look at what makes up the numbers when training a neural net, why different layers work and where your network can fail. This way you’ll have a better idea of how to pick your layer types and number of neurons per layer.
Lastly, we’ll discuss activation functions and their effects on the final network.
Transfer learning
A common beginner’s mistake when it comes to deep neural networks is trying to train your model from scratch on a small dataset. This can be tricky because deep networks have millions of parameters and usually need a lot of labeled data and computing power to train well!
To get around this, we use what’s called transfer learning. With this technique, instead of starting from randomly initialized weights, you start from a model that has already been trained on a large dataset for a related task. The model then applies what it learned there to your new task, and only needs a smaller amount of additional training.
For example, if you were teaching someone how to bake cookies, you might start with recipes for chocolate chip cookies. You could then teach them how to make oatmeal raisin cookies by building on what they already know about baking chocolate chip cookies. Using this analogy, an AI that has learned to bake chocolate chip cookies will pick up oatmeal raisin cookies much faster by applying its previous lessons.
By transferring learned skills between tasks, the system does not need to re-learn everything as it would if it were starting from scratch every time. This saves computation time and energy, which can then be spent on more complex problems.
There are three main types of transfer learning in NN models: domain adaptation, fine tuning, and feature extraction. Let us look at each one in detail.
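As a toy illustration of feature extraction, the sketch below keeps a "pretrained" layer frozen and trains only a small new classifier on top of it. Everything here is hypothetical: PRETRAINED_W stands in for weights learned on a large source dataset, and the four data points are made up. In fine-tuning, by contrast, some of the pretrained weights would also be updated, usually with a small learning rate.

```python
import math

# Hypothetical stand-in for weights learned on a large source dataset.
PRETRAINED_W = [[1.0, -1.0], [0.5, 0.5]]

def frozen_features(x):
    """Frozen base: a fixed linear layer plus ReLU. These weights are never updated."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in PRETRAINED_W]

def predict(head_w, head_b, x):
    """New trainable head: logistic regression on top of the frozen features."""
    f = frozen_features(x)
    z = sum(w * fi for w, fi in zip(head_w, f)) + head_b
    return 1.0 / (1.0 + math.exp(-z))

def train_head(data, epochs=200, lr=0.5):
    """Train only the head with stochastic gradient descent on the cross-entropy loss."""
    head_w, head_b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            f = frozen_features(x)
            err = predict(head_w, head_b, x) - y  # gradient of the loss w.r.t. z
            head_w = [w - lr * err * fi for w, fi in zip(head_w, f)]
            head_b -= lr * err
    return head_w, head_b

# Small made-up target dataset: (input, label) pairs.
data = [([2.0, 0.0], 1), ([3.0, 1.0], 1), ([0.0, 2.0], 0), ([1.0, 3.0], 0)]

head_w, head_b = train_head(data)
predictions = [round(predict(head_w, head_b, x)) for x, _ in data]
print(predictions)
```

Only head_w and head_b change during training, which is why this approach is cheap: the expensive part of the model was learned once, somewhere else.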
How to create a neural network
In this article, you will learn how to use one of the most popular deep learning frameworks for creating neural networks: Keras. You will also learn about some important concepts in neural networks such as layers and parameters.
Keras is a high-level API written in Python. It was designed to make it easy to construct and evaluate state-of-the-art neural network architectures.
One of the things that makes Keras convenient is its Sequential model, which lets you build a network by chaining layers one after another. This allows users to build complex systems by connecting simple components together into more advanced ones.
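The chaining idea itself is simple enough to sketch in plain Python. The TinySequential class below is a hypothetical toy written for this article, not the Keras API: it just stores a list of functions and applies them in order, each one receiving the output of the one before it.

```python
class TinySequential:
    """Hypothetical toy model that applies its layers one after another."""

    def __init__(self, layers=None):
        self.layers = list(layers or [])

    def add(self, layer):
        """Append a layer to the end of the chain."""
        self.layers.append(layer)
        return self

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)  # the output of one layer is the input of the next
        return x

def scale(factor):
    """Returns a 'layer' that multiplies every value by a fixed factor."""
    return lambda xs: [factor * x for x in xs]

def relu(xs):
    """A 'layer' that clips negative values to zero."""
    return [max(0.0, x) for x in xs]

def total(xs):
    """A 'layer' that reduces the list to a single number."""
    return sum(xs)

model = TinySequential([scale(2.0), relu])
model.add(total)

output = model([-1.0, 0.5, 3.0])
print(output)  # 7.0
```

Real Keras layers also hold trainable weights and know their input and output shapes, but the way they are stacked follows the same pattern.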
This article will go through all the steps necessary to train and test a very basic image classification model using the VGG16 architecture, which has 16 weight layers: 13 convolutional layers followed by three fully connected (classification) layers.
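In the standard VGG16 configuration, the 16 weight layers break down into 13 convolutional layers (all using 3x3 filters) and 3 fully connected layers. The short tally below counts the layers and parameters from that configuration. It assumes the usual 224x224 RGB input and 1000 output classes (the ImageNet setup); the variable and function names are made up for this sketch.

```python
# VGG16 convolutional configuration: numbers are the output channels of
# 3x3 convolutions, "M" marks a 2x2 max-pooling step that halves the size.
VGG16_CONV = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]

# Fully connected layers; 1000 outputs assumes the ImageNet classes.
FC_SIZES = [4096, 4096, 1000]

def count_vgg16(input_size=224, in_channels=3):
    """Return (conv layer count, fully connected layer count, total parameters)."""
    conv_layers, params = 0, 0
    size, channels = input_size, in_channels
    for item in VGG16_CONV:
        if item == "M":
            size //= 2  # pooling has no parameters
        else:
            params += 3 * 3 * channels * item + item  # weights + biases
            channels = item
            conv_layers += 1
    features = size * size * channels  # flattened input to the first dense layer
    for units in FC_SIZES:
        params += features * units + units
        features = units
    return conv_layers, len(FC_SIZES), params

conv_layers, fc_layers, params = count_vgg16()
print(conv_layers, fc_layers, params)  # 13 3 138357544
```

Most of those roughly 138 million parameters sit in the first fully connected layer, which is one reason people often drop the top layers and reuse only the convolutional part.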
Understanding deep learning models
Recent developments in artificial intelligence have been characterized by increasingly complex architectures that try to take as much information as possible from our surroundings and use it to learn tasks for us.
Deep neural networks are one such architecture. They consist of several layers of computational units, or neurons, which are connected to each other. The way these connections are organized is called the network’s structure, or topology.
The best-known network topologies are convolutional nets and recurrent nets. Recurrent nets organize their neurons into loops so they can remember past inputs, while convnets look at small windows of an image at a time and find patterns within them.
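Here is a minimal sketch of the "small window" idea behind convnets. A tiny filter (also called a kernel) slides over the image, and at each position the overlapping values are multiplied and summed. The image and the vertical-edge filter are made up for illustration, and the function skips padding and strides to keep things short.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output

# A 4x4 "image" that is dark (0) on the left and bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# A 2x2 filter that responds where brightness increases from left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The output is large only in the middle column, exactly where the dark-to-bright edge sits. In a real CNN the filter values are not hand-picked like this; they are learned during training.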
By combining both types of structures, you get what’s known as a hybrid network. These hybrids work best when the data has both spatial and temporal structure, as in video or sound.
Because of how well they work, many companies and research groups now develop their own versions of these models and make them accessible to anyone through pre-trained weights that users can load and test. This is where Keras comes in!
We will go over how to implement your own version of a pretrained VGG16 (Visual Geometry Group) model here! If you would rather just watch a video instead, check out this YouTube playlist we made with all the steps explained.