When it comes down to it, deep learning is largely about tuning your hyperparameters. The term refers to the settings you choose for your model and its training process, as opposed to the weights the model learns on its own.
Some of these settings depend on the hardware or software you run on, such as the configuration needed to train on a GPU, whereas others (like batch size) apply to almost any model.
Hyperparameter settings can be difficult to figure out unless you use them frequently. That’s why there are many online resources with easy-to-follow tutorials on how to best tune yours!
Tutorials will tell you how to find good values for different parameters by testing across several datasets using either grid search or random search.
Grid search tries every combination of the values you define for each parameter within its range and keeps the best one, while random search simply samples values at random, reruns the training a fixed number of times, and keeps the best run.
This article will go over both types of tuning and some tips and tricks when choosing which one to use. However, before diving in, let us first take a look at an example.
In this section we will walk through the steps to tune the kernel size of a convolutional neural network (CNN).
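As a concrete sketch (assuming PyTorch, which this article doesn’t prescribe), here is a tiny CNN where the kernel size is exposed as a single argument so it can be varied during tuning; the channel widths and 10-class output are placeholder assumptions:

```python
import torch.nn as nn

def build_cnn(kernel_size: int = 3) -> nn.Module:
    """Tiny CNN whose kernel size is a tunable hyperparameter."""
    return nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=kernel_size, padding=kernel_size // 2),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=kernel_size, padding=kernel_size // 2),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(32, 10),  # placeholder: 10 output classes
    )

# Candidate kernel sizes we might compare during tuning.
models = {k: build_cnn(kernel_size=k) for k in (3, 5, 7)}
```

Each of these candidate models would then be trained and compared on a validation set, as the later sections describe.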
Know your model’s parameters
A deep neural network is built from many layers, each with settings such as its activation function, kernel size, and number of neurons, plus training settings like batch size and momentum. These are all considered hyperparameters because you choose them yourself, depending on how you want the model to behave.
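To make that concrete, here is a purely illustrative way to collect such settings in one place in Python; the specific values are assumptions, not recommendations:

```python
# Hypothetical hyperparameter settings, kept separate from the weights
# the network learns during training.
hyperparameters = {
    "kernel_size": 3,       # size of each convolutional filter
    "activation": "relu",   # non-linearity applied after each layer
    "batch_size": 64,       # samples processed per gradient update
    "momentum": 0.9,        # momentum term for the optimizer
    "learning_rate": 0.01,  # step size for each weight update
    "num_layers": 4,        # depth of the network
}
```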
Typically, people begin tuning these settings when their test accuracy is low and they’re looking to improve it. That’s not the best approach though!
Tuning hyperparameters at this stage could actually cause the algorithm to get worse instead of getting better. This is due to two reasons:
The algorithm may find a local optimum where performance looks good but there’s no generalization beyond the training set. For example, if the algorithm reaches perfect accuracy on every sample in the training dataset, it will probably perform poorly on new data, because it has memorized the trained patterns rather than learned how to apply them elsewhere.
Too much experimentation can result in overfitting, meaning poor generalization even on additional testing samples. Because the algorithm has adapted so closely to the training examples, it becomes very accurate on those cases but loses effectiveness on new ones.
How to tune neural network hyperparameters
When it comes down to it, deep learning is an extremely complex field that requires significant experimentation to achieve success. There are so many different variables involved with this technology that choosing good defaults for your settings can sometimes be tricky.
That’s why having strong general strategies for tuning your model’s performance is so important! In this article we will go over some easy ways to do just that.
What are sensitivity and precision?
A metric that helps determine whether a model is performing well is accuracy, which is simply the fraction of examples your model predicts correctly. Some studies also report specificity, or the true negative rate, which measures how often the model correctly identifies the cases that really are negative (no epidemic) rather than misclassifying them as positive (an epidemic).
Specificity can look better than plain accuracy when negatives dominate, but a model can score well on it simply by rarely predicting a positive at all, so on its own it may under-estimate the prevalence of epidemics!
Another important factor is deciding what constitutes a positive result. If we assume that having symptoms means someone has the disease, then checking how well the model identifies people with symptoms is one way to measure it. But if symptoms are not confirmed, telling everyone else that they have the disease creates false alarms that can cause real harm, such as unnecessary anxiety.
In such cases, precision, the fraction of samples flagged as positive that really are positive, is a better indicator of performance. Precision measures how often the model is right when it does raise a flag.
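As a small sketch of how these metrics differ, the function below computes accuracy, specificity, and precision from binary labels, with 1 standing in for “epidemic” and 0 for “no epidemic” (the labelling is just an assumption for illustration):

```python
import numpy as np

def accuracy_specificity_precision(y_true, y_pred):
    """Return (accuracy, specificity, precision) for binary labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
    tn = np.sum((y_pred == 0) & (y_true == 0))  # true negatives
    fp = np.sum((y_pred == 1) & (y_true == 0))  # false alarms
    fn = np.sum((y_pred == 0) & (y_true == 1))  # missed positives
    accuracy = (tp + tn) / len(y_true)
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # correctness of flags
    return accuracy, specificity, precision

print(accuracy_specificity_precision([1, 0, 1, 0, 0], [1, 0, 0, 1, 0]))
```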
Try different values of a parameter
A common way to tune hyperparameters is to try out several values and compare them on a held-out validation set, rather than on the training data or the final test set!
This is an important part of any machine learning model because it impacts both the performance and efficiency of the algorithm.
For example, when choosing how long to train, more data generally means training longer is better, but if you have very little data, a shorter run is better so the model does not simply memorize it!
There are two goals here. The first is to give the algorithm enough time to learn the structure of the data, and the second is to find the value of the parameter that gives the best validation performance.
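Here is a minimal sketch of such a sweep, assuming scikit-learn and a synthetic dataset as stand-ins for your own model and data; the swept parameter is the classifier’s alpha regularization strength:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Toy data standing in for a real problem.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

best_alpha, best_score = None, float("-inf")
for alpha in (1e-2, 1e-3, 1e-4, 1e-5):  # candidate values for one hyperparameter
    model = SGDClassifier(alpha=alpha, random_state=0).fit(X_train, y_train)
    score = model.score(X_val, y_val)   # accuracy on the held-out validation set
    if score > best_score:
        best_alpha, best_score = alpha, score

print(f"best alpha = {best_alpha}, validation accuracy = {best_score:.3f}")
```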
Perform validation
After you have trained your model, it is time to validate its accuracy! This means testing how well your model predicts data that was held out and never used for training. If everything goes well, the theory says that the model will also work well on genuinely new data because of how well it performed during this test.
Validation can be done using any dataset that is similar to the one you trained with. The most common way is a holdout split, and you can further distinguish internal from external validation. For external validation, you must find separate datasets that cover the same kinds of examples as the ones you trained with (within classes or across all classifications).
For example, if your model was designed to recognize cats, you could test it on a set of pictures of dogs and check that it does not label them as cats, or compare whether it rates a dog as more cat-like than other animals.
Internal validation means testing on held-out data from the same source as the training set. For instance, if your model tended to predict “car” for every picture, you would catch that by checking its predictions on held-out examples of other things, such as chairs or houses.
Tip: when performing internal or external validation, run the tests several times with different splits and average the results, so that the variance of a single split does not mislead you.
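A minimal sketch of that tip, again assuming scikit-learn and toy data: repeat the holdout evaluation over several random splits and report the mean and spread instead of trusting a single run:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

scores = []
for seed in range(5):  # five different random holdout splits
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = SGDClassifier(random_state=seed).fit(X_tr, y_tr)
    scores.append(model.score(X_val, y_val))

print(f"validation accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```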
Use cross-validation
A hyperparameter is a variable you set by hand in your model architecture or training procedure, rather than one the model learns. For instance, you may want to decide how large to make your network’s convolutional layers or whether to use batch normalization before each activation.
The most common way to tune these parameters is using what’s called “cross-validation.” This involves splitting your dataset into several folds, then repeatedly training on all but one fold and evaluating on the fold that was held out, so each fold takes a turn as the validation set.
You do this for different values of the parameter, and eventually settle on the setting that gives the best average score across the folds. How well that score carries over to data the model has never seen is referred to as its “generalization ability.” Generalization means predicting future data points based on past patterns.
Cross-validation is a powerful tool for finding good settings for deep learning algorithms because it gives a far more stable estimate of performance than a single train/test split. Because of this, there are some standard practices for doing it.
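For a concrete sketch, scikit-learn’s cross_val_score runs exactly this kind of loop; the toy dataset, the SGDClassifier stand-in, and the candidate alpha values below are assumptions for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Score each candidate setting with 5-fold cross-validation: the data is
# split into 5 folds and each fold takes a turn as the held-out set.
for alpha in (1e-3, 1e-4, 1e-5):
    scores = cross_val_score(SGDClassifier(alpha=alpha, random_state=0), X, y, cv=5)
    print(f"alpha={alpha}: mean accuracy {scores.mean():.3f} over {len(scores)} folds")
```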
Use a grid of values
There are two main types of hyperparameter tuning for neural networks, which we will discuss here. The first is optimizing the network’s architecture, or choosing how many layers there are, what type of layer each one is, and so on. This is usually referred to as architectural optimization.
The second kind of parameter tuning changes the settings of the training procedure rather than the architecture. These can be learning rates, momentum terms, regularization constants, and so forth.
By splitting up this process into separate modules, it becomes much easier to test different configurations quickly.
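As a sketch of a grid covering both kinds of settings, assuming scikit-learn’s MLPClassifier as a small stand-in network, GridSearchCV tries every combination of an architecture choice and a training parameter with cross-validation:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

param_grid = {
    "hidden_layer_sizes": [(32,), (64,), (64, 32)],  # architecture choices
    "learning_rate_init": [1e-2, 1e-3],              # training parameter
}
search = GridSearchCV(MLPClassifier(max_iter=300, random_state=0), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

With three architectures and two learning rates, this grid trains six configurations, each scored with 3-fold cross-validation.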
Use a random search
Recent developments in neural network architecture have led to so-called deep learning, which uses networks with many layers of neurons. Because there is a practically unlimited number of possible architectures, from very narrow or very wide layers to different numbers of hidden layers, it becomes easy to overfit the data.
By adding additional complexity to the model, this effect gets worse. Therefore, it is important to reduce the chance that your model will overfit by testing many different configurations of the network.
One method used to do this is called “hyperparameter optimization” (HO). With HO, you search for good hyperparameters for your CNN using a randomized search, sampling many different settings and keeping whichever performs best. This way, you can still reach robust performance even if your initial parameters were not working well.
There are various ways to apply HO to DNNs. One common approach is grid search, where every combination of the chosen parameter values is tried. Another strategy is sequential search, where only a few variables are tuned at a time until the best results are found.
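A minimal sketch of the randomized variant, assuming scikit-learn and SciPy: RandomizedSearchCV samples a fixed number of configurations instead of enumerating a full grid, which scales much better as the number of hyperparameters grows:

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

param_distributions = {
    "hidden_layer_sizes": [(32,), (64,), (128,), (64, 32)],  # architecture choices
    "learning_rate_init": loguniform(1e-4, 1e-1),  # sampled on a log scale
    "alpha": loguniform(1e-6, 1e-2),               # L2 regularization strength
}
search = RandomizedSearchCV(
    MLPClassifier(max_iter=300, random_state=0),
    param_distributions,
    n_iter=10,       # number of random configurations to try
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```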