Recent developments in artificial intelligence have ushered in an era of so-called “deep learning”. The term itself dates back decades, but the modern wave took off in the mid-2000s, when researchers showed that computers could learn complex patterns by stacking simple layers of processing and training them together. Since then, deep learning has exploded in popularity and applications!
Deep neural networks are powerful functions that take many inputs and transform them, layer by layer, into something meaningful. For example, they can recognize objects in images and assign them to categories, transcribe speech, and help predict diseases from symptoms.
Because of this, engineers and computer scientists now use them as the basis for many AI systems. This includes things like voice recognition software, chatbots, and self-driving cars. And while there is no clear winner between different architectures, most agree that larger models usually perform better than smaller ones.
In fact, some studies show that bigger networks often turn out to be more effective than thinner ones! It’s possible that these very large networks just need more time to train before they’re good enough to do their job, which makes sense since it takes longer to get through all the examples. Or maybe they’re simply easier to optimize!
There are several ways to get insight into the inner workings of a deep network, and the sections below walk through some of the most useful.
Look at the model graph
A very important part of understanding how a deep learning model works is looking at the model graph. This includes not only what layers exist in the model, but also where each layer’s output goes and whether those outputs are fed back into another layer as input or modified somehow (for example, by being passed through an activation function).
The more you know about this diagram, the better! Because here’s the thing: even if you don’t fully understand all the parts of a given model, you can still use it effectively.
That’s because most people starting out with AI already have the computational resources needed to run almost any kind of neural network. What we really need now are efficient ways to design new networks that perform well on specific tasks.
And knowing the internals of some pre-existing models can help with that. By taking time to study their architecture, you’ll get inspiration for your own designs.
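As a minimal sketch of what “looking at the model graph” means in practice, the snippet below represents a model as an adjacency list mapping each layer to the layers its output feeds, then walks the chain from input to output. The layer names and wiring are entirely hypothetical; real frameworks expose this same information through their own model-inspection utilities.

```python
# Toy model graph: each layer maps to the layers that consume its output.
# All layer names here are hypothetical examples, not a real architecture.
model_graph = {
    "input":   ["conv1"],
    "conv1":   ["relu1"],
    "relu1":   ["pool1"],
    "pool1":   ["fc1"],
    "fc1":     ["softmax"],
    "softmax": [],
}

def walk(graph, start):
    """Yield layer names in order, following each layer's output edges."""
    node = start
    while True:
        yield node
        successors = graph[node]
        if not successors:
            break
        node = successors[0]  # this toy graph is a simple chain

layers = list(walk(model_graph, "input"))
print(" -> ".join(layers))
```

Even a trivial traversal like this makes clear where each layer’s output goes, which is the first question to answer when studying an unfamiliar architecture.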
Analyze the loss function
The most important part of any machine learning model is its loss function, or objective. A loss function measures how far the model’s predictions are from the desired targets, and training works by driving that measure down.
In deep neural networks, one of the most common losses is the “cross-entropy” function. Cross entropy measures the difference between two probability distributions: the distribution the model predicts and the true distribution of the labels. The closer the predicted probabilities are to the true labels, the lower the loss.

Cross entropy is usually introduced for binary classification (i.e., whether an item belongs to one category or the other) but extends naturally to multiclass problems. For instance, if a classifier has five categories, the model outputs one probability per category (typically via a softmax), and the loss is the negative log of the probability it assigned to the correct one.
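A minimal sketch of the multiclass case, in pure Python: the loss is just the negative log-probability the model gave the correct class. The probability values below are hypothetical softmax outputs.

```python
import math

def cross_entropy(pred_probs, true_index):
    """Negative log-probability the model assigns to the correct class."""
    return -math.log(pred_probs[true_index])

# Five-class example: one probability per class (hypothetical softmax output).
probs = [0.05, 0.10, 0.70, 0.10, 0.05]

loss_good = cross_entropy(probs, 2)  # correct class got 0.70 -> small loss
loss_bad = cross_entropy(probs, 0)   # correct class got 0.05 -> large loss
print(loss_good, loss_bad)
```

Note how confidently getting the right answer yields a small loss, while assigning low probability to the true class is punished heavily; that asymmetry is what pushes the model toward the correct labels during training.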
The reason why this is so important comes down to the way the human brain works. As you know, humans have evolved over millions of years to recognize patterns and associate them with things. This process is done through repetition – when something becomes associated with something else, you learn about the second thing because you already learned about the first.
By using cross entropy functions, deep learning models rely on this principle to teach themselves new concepts. Because the algorithm learns how to identify certain patterns based on past experiences, it uses those same strategies to understand newly presented information.
Examine the accuracy of the model
A very important part in understanding how well your model is performing is looking at its accuracy. Accuracy is defined as the proportion of cases that the model correctly classifies.
The more instances the model gets right, the higher the accuracy!
However, a single accuracy number can hide which classes the model struggles with. An easy way to see whether predictions match the actual labels, class by class, is a confusion matrix.

A confusion matrix shows, for each actual class, how the model’s predictions were distributed across the possible classes. For example, say your model predicts “cat” for every image, no matter what it actually contains. Every prediction would land in the “cat” column of the matrix.

On a dataset that is mostly cats, the overall accuracy might still look respectable, but the matrix makes the problem obvious: every non-cat class is misclassified one hundred percent of the time.
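Here is a small sketch of that degenerate “always predict cat” model, with a hypothetical six-image dataset. The matrix is built with rows for the actual class and columns for the predicted class.

```python
# Hypothetical tiny dataset; the model predicts "cat" for every image.
classes = ["cat", "dog", "bird"]
actual = ["cat", "dog", "cat", "bird", "dog", "cat"]
predicted = ["cat"] * len(actual)

# Rows = actual class, columns = predicted class.
matrix = {a: {p: 0 for p in classes} for a in classes}
for a, p in zip(actual, predicted):
    matrix[a][p] += 1

correct = sum(matrix[c][c] for c in classes)
accuracy = correct / len(actual)
print(matrix)
print(accuracy)  # 3 of the 6 images really are cats -> 0.5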
Compare with human perception
When talking about how intelligent computers are, one of the most prominent features is their ability to recognize patterns. Computers can now perform tasks that require them to look at lots of examples or instances of something and be able to determine what the thing is by figuring out the pattern it follows.
This is very similar to how humans perceive things. We are born with an innate understanding of how objects relate to each other and how concepts such as colors develop in our brains. This process is referred to as perceptual learning.
Perceptual learning happens when we pay closer attention to certain stimuli and learn from those experiences. For example, if you watch enough action movies, your brain will get smarter at recognizing guns!
At the same time, there’s another type of perceptual learning that goes beyond just identifying discrete items — this is called conceptualization. Concepts like “gun” include not only knowledge of the item itself, but also what kind of gun it is (handgun vs rifle) and where it comes from (military equipment).
So instead of just knowing that something is a gun, your computer now knows what category “gun” falls into! All of these lessons are internalized through repeated exposure.
By incorporating ideas like these into your deep learning model, you’ll see improved performance. You may want to test this out by looking at some old data and seeing if your model works better than before.
Use visualization tools
A common beginner’s mistake when it comes to deep learning is trying to apply what they have learned directly onto a new task.
By this we mean using an image recognition model that was trained for detecting cats as if it were trained to recognize cars.
This won’t work! The reason is simple: the features the model learned for one task don’t automatically carry over to the other.

Cat images and car images share some low-level similarities, but that doesn’t make them equivalent! The models can’t be used interchangeably to achieve the same result.
That is why it is so important to use visualizations and sanity checks to determine whether your model has actually learned the task you care about.
There are many ways to do this, and the best one depends on how much time you have and how quickly you need to get results.
In this article, we will discuss three of the most useful methods for determining the quality of your model.
Create a confusion matrix
A common way to evaluate the performance of a classification model is by creating a *confusion matrix*. This gives you an idea of how well your model was able to classify all examples it had, and why some classes were more difficult for it than others.
The most basic form of a confusion matrix has two rows and two columns, one for each class in a binary problem. Rows represent the actual class and columns the predicted class, so the diagonal cells count correct predictions and the off-diagonal cells count misclassifications.

The idea extends directly to more classes. If our dataset contained three classes, the confusion matrix would be a 3×3 grid, with one row and one column each for Class A, Class B, and Class C.

To determine the accuracy of our model, we take the sum of the diagonal (all correct predictions) divided by the total number of examples.
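As a sketch, here is that accuracy computation on a hypothetical 3×3 confusion matrix. The counts are invented for illustration.

```python
# Hypothetical 3x3 confusion matrix: rows = actual class, columns = predicted.
# Order of classes: Class A, Class B, Class C.
confusion = [
    [50,  3,  2],   # actual Class A
    [ 4, 45,  6],   # actual Class B
    [ 1,  5, 44],   # actual Class C
]

diagonal = sum(confusion[i][i] for i in range(len(confusion)))  # correct predictions
total = sum(sum(row) for row in confusion)                      # all predictions
accuracy = diagonal / total
print(accuracy)
```

The same matrix also supports per-class metrics: dividing each diagonal entry by its row total gives recall for that class, and by its column total gives precision.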
Use gradient visualization
A very popular way to visualize how deep neural networks work is by looking at how gradients move across layers.
Gradient visualization comes in several forms. One is to inspect the gradients flowing into each layer’s parameters during training, which tells you which parts of the network are still learning. Another is the saliency map, which uses the gradient of a specific output with respect to the input to show which input features most influence that output.

Both are easy to interpret for linear models such as logistic regression, where each gradient maps directly onto a single input feature. With more advanced architectures like convolutional nets or recurrent networks, the raw gradients become noisier and harder to read.

That’s why techniques such as class activation mapping (and its gradient-weighted variants) were developed. They combine the activations of a late convolutional layer with gradient information to highlight which regions of an input image drive a particular prediction, so you can see what part of the network has learned about different features!
Here, we will look at some examples using VGGNet, one of the most common pre-trained CNN architectures for visual recognition tasks.
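Before reaching for a full CNN, the core saliency idea can be sketched with a toy stand-in model: rank each input feature by how much a small nudge to it changes the model’s score. The scoring function below is a made-up linear scorer, not VGGNet, and the finite-difference gradient here stands in for the exact gradients a framework’s autodiff would give you.

```python
# Toy saliency sketch: rank input features by how much a small change in each
# moves the model's score. The "model" is a hypothetical linear scorer.

def score(x):
    # Hypothetical scorer: feature 1 matters most, feature 2 not at all.
    weights = [0.5, 2.0, 0.0]
    return sum(w * v for w, v in zip(weights, x))

def saliency(f, x, eps=1e-6):
    """Finite-difference gradient of f at x, one feature at a time."""
    grads = []
    for i in range(len(x)):
        bumped = list(x)
        bumped[i] += eps
        grads.append((f(bumped) - f(x)) / eps)
    return grads

x = [1.0, 1.0, 1.0]
grads = saliency(score, x)
most_important = max(range(len(grads)), key=lambda i: abs(grads[i]))
print(grads, most_important)
```

For an image model the principle is identical, just at pixel scale: the gradient of the class score with respect to each pixel, rendered as an image, is the saliency map.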
Create a histogram of the model
A very important step in understanding how neural networks work is creating an internal representation or visualization of the network.
This process, often called “visualizing” the net, allows you to see what parts of the network are learning about the data.
The most common way to do this is with feature detectors: filters that take the input and produce maps of the different features present in the image or scene.

These features can be shape patterns, color distributions, texture gradients, and so on. In each map, the brightest pixels mark where the corresponding feature was detected most strongly.

By running such feature detectors over various layers of the net, you get far more detailed information than simply knowing which areas of the net were active during training.
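The snippet below is a minimal sketch of a single feature detector: a tiny vertical-edge filter slid across a hypothetical 4×4 grayscale “image”. The resulting map is brightest exactly where the edge sits, which is the basic mechanism behind the feature maps a convolutional layer produces.

```python
# Minimal feature-detector sketch: slide a vertical-edge filter over a tiny
# grayscale "image" and record the response at each position.

image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# 2x2 vertical-edge filter: responds when the right side is brighter.
kernel = [
    [-1, 1],
    [-1, 1],
]

def feature_map(img, ker):
    kh, kw = len(ker), len(ker[0])
    out = []
    for r in range(len(img) - kh + 1):
        row = []
        for c in range(len(img[0]) - kw + 1):
            val = sum(ker[i][j] * img[r + i][c + j]
                      for i in range(kh) for j in range(kw))
            row.append(val)
        out.append(row)
    return out

fmap = feature_map(image, kernel)
print(fmap)  # the middle column lights up where the dark-to-bright edge is
```

A real convolutional layer applies many such filters at once and learns their values during training, but the sliding-window computation is exactly this.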