Recent developments in artificial intelligence (AI) require large amounts of data to work. This is not surprising: most AI applications rely on algorithms that learn by analyzing past examples.
The more examples there are, the better these algorithms can learn how to perform their tasks.
In fact, some experts say we’re at an inflection point where access to powerful AI technology is becoming a significant factor in shaping our future. AI will play important roles in areas such as healthcare, finance, education and productivity, to name a few.
If you’re thinking about diving into this frontier area, you need to know what kind of hardware and bandwidth you’ll need to run your deep learning experiments. And it doesn’t matter if you’re running small scale tests or production level runs!
Luckily, we have some helpful information here to get you started.
History of deep learning
Neural networks are not new! The first mathematical model of an artificial neuron was proposed by Warren McCulloch and Walter Pitts back in 1943, and Frank Rosenblatt introduced the perceptron, an early trainable neural network, in his 1958 paper “The Perceptron”. Networks of such feed-forward artificial neurons became the foundation of the field in the decades that followed.
In 1989 Yann LeCun and colleagues developed convolutional networks, which use convolutions (layers of shared, weighted connections) to build deeper layers of perception. This style of net became hugely popular after 2012, when Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton’s deep convolutional network AlexNet achieved impressive test accuracy on the large ImageNet dataset.
Deep learning exploded into mainstream popularity after AlexNet’s 2012 ImageNet win, followed by architectures such as Google’s GoogLeNet in 2014. Since then, many variations of these architectures have emerged, making it possible to tackle even more advanced tasks like object detection and natural language processing.
Take some time to read through each section below. When you’re done, reread this introduction to make sure you understand how neural nets work. Then practice applying that knowledge by experimenting with pre-trained models available online.
Types of neural networks
Recent developments in deep learning have focused on exploring different types of neural networks. There are three main categories: convolutional, recurrent, and hybrid networks.
Convolutional nets learn spatial information by looking at small windows of the image to see how patterns fit together. For example, they look at little patches of pixels that contain repeated motifs and use this data to determine what larger areas of the picture mean.
Recurrent nets work similarly to humans: we think about things sequentially, and they likewise process sequential data (like conversations or stories) one step at a time.
Hybrid networks combine convolutional and recurrent structures to extract richer information. These typically start with a pretrained network as a feature extractor and then add additional layers designed to perform a specific task.
Deep learning models now almost always include some kind of convolutional net since it is one of the most common building blocks. Recurrent networks still play an important role in many applications like natural language processing (NLP) and speech recognition.
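To make the windowing idea behind convolutional nets concrete, here is a minimal sketch of a single convolution written in plain NumPy. The 3x3 vertical-edge kernel and the tiny half-dark image are illustrative choices, not taken from any particular library:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide a small kernel over the image and record how well each
    patch matches the kernel's pattern (valid padding, stride 1)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            patch = image[y:y + kh, x:x + kw]
            out[y, x] = np.sum(patch * kernel)
    return out

# A vertical-edge detector: it responds strongly wherever brightness
# changes from left to right within a 3x3 window.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

# Toy image: dark on the left, bright on the right.
image = np.zeros((5, 5))
image[:, 3:] = 1.0

response = conv2d(image, kernel)
print(response)
```

The output map lights up only around the dark-to-bright boundary, which is exactly the "little patches with repeated motifs" idea described above: each output value summarizes one small window of the input.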
The number of samples needed
A lot can be determined by looking at how many samples you have! Consider the classic Iris dataset: it contains three different categories or classes, the species Iris setosa, Iris versicolor, and Iris virginica, and the task is to determine which species a given flower belongs to.
If there are far more examples of one class than another, the model can score well simply by guessing the majority class, which hides how poorly it handles the rarer classes. A separate and equally common problem is over-fitting the data.
Overfitting will result in your model doing well on the training set but poorly when tested on new data. It happens when the model latches onto details that are specific to the training examples rather than patterns that generalize.
It also means that even though the model knows the training flowers very well, it doesn’t know what all the different irises it hasn’t seen look like, so it produces confusing results on them.
That isn’t very helpful! Luckily, you don’t need a vast amount of data to get good quality results with neural networks.
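A quick way to see overfitting in action, using nothing but NumPy: fit the same handful of noisy points with a simple line and with a needlessly flexible degree-7 polynomial. The data, noise level, and polynomial degrees here are illustrative assumptions, not from any real experiment:

```python
import numpy as np

rng = np.random.default_rng(0)

# A simple linear relationship, observed with a little noise.
x_train = np.linspace(0, 1, 8)
y_train = 2 * x_train + rng.normal(0, 0.1, size=8)
x_test = np.linspace(0.05, 0.95, 8)
y_test = 2 * x_test + rng.normal(0, 0.1, size=8)

def fit_and_errors(degree):
    """Fit a polynomial of the given degree to the training points
    and return mean squared error on train and test sets."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

simple_train, simple_test = fit_and_errors(1)
complex_train, complex_test = fit_and_errors(7)

# The degree-7 polynomial passes through all 8 training points, so
# its training error is essentially zero: it has memorized the noise.
print("line:      ", simple_train, simple_test)
print("degree 7:  ", complex_train, complex_test)
```

The flexible model looks perfect on the data it was trained on, yet its test error tells a very different story: exactly the training-set/new-data gap described above.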
The number of variables
Another important factor in deep learning is how many parameters there are in your model. These include things like weights for networks, or coefficients in equations you use to calculate results.
The more parameters your neural network has relative to your data, the greater the chance it will overfit: it will learn patterns that reflect noise in the training set rather than true correlations between features and targets.
Deep learning models with lots of parameters are also much harder to tune than ones with fewer, potentially leading to poorer performance overall. One practical way to keep the parameter count small is to run feature extraction before feeding the data into the classifier!
Feature extractors take raw information as inputs and create new features from them. For example, an image may be broken down into its colors, shapes, and textures, all of which can then be used to determine what category the picture belongs to (cat face, house, beach).
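As a rough illustration of why feature extraction shrinks the parameter count, here is a small helper that counts the weights and biases in a fully connected network. The layer widths (784 raw pixels versus 32 extracted features) are hypothetical numbers chosen for the example:

```python
def dense_param_count(layer_sizes):
    """Count weights and biases in a fully connected network whose
    layer widths are given in order (input, hidden..., output)."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # weight matrix between the two layers
        total += n_out         # one bias per output unit
    return total

# Feeding raw 28x28 pixels straight into the classifier: many parameters.
print(dense_param_count([784, 128, 10]))

# The same classifier on 32 extracted features: far fewer.
print(dense_param_count([32, 128, 10]))
```

Shrinking the input from 784 raw values to 32 features cuts the parameter count by more than an order of magnitude, which is exactly why a feature extractor makes the downstream model easier to train on limited data.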
The number of layers
A layer in deep learning is like an old-fashioned sheet of paper that you can add to or take away from your diagram.
A layer in a neural network typically comes with its own set of weights (learned variables) that determine how its inputs are combined and how strongly it connects to other parts of the network.
The more layers there are, the higher the potential accuracy of the model, but only if enough training data exists. More layers also mean longer training times, since the computer must adjust every weight many times during optimization.
There’s no hard and fast rule about the ideal number of layers for any problem domain, but most experts agree that at least two or three are necessary for almost all non-trivial applications.
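To show what stacking layers means mechanically, here is a sketch of a forward pass through a small three-layer network in NumPy. The layer sizes, random weights, and ReLU activation are purely illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

def forward(x, layers):
    """Pass an input through a stack of (weights, biases) pairs,
    applying a ReLU between layers: each layer is one more stage
    of processing, with its own set of weights."""
    for i, (w, b) in enumerate(layers):
        x = x @ w + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0)  # ReLU on hidden layers only
    return x

# Three layers of weights: 4 inputs -> 8 hidden -> 8 hidden -> 2 outputs.
sizes = [4, 8, 8, 2]
layers = [(rng.normal(size=(n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(sizes, sizes[1:])]

out = forward(rng.normal(size=(1, 4)), layers)
print(out.shape)  # one sample in, two output values out
```

Adding a layer just appends another (weights, biases) pair to the stack, which is also why deeper networks have more parameters to fit and take longer to train.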
The size of the network
A deep neural network (DNN) is not like simpler machine learning algorithms that can get by on small datasets; DNNs require large amounts of training data to work properly.
At the same time, some experts caution that simply piling on more data is not automatically a good thing!
There are several reasons why. First, big datasets are often unbalanced: there are far more plain, front-facing photos of faces than natural-looking images with varied expressions, so a model can score well on the common cases while learning little about the rare ones.
Second, most image-based tasks don’t require extremely high levels of accuracy. For example, if your computer vision app asked whether an object in an image is real or not, a low degree of accuracy will usually suffice.
A third reason has to do with edge cases. A model only learns the patterns present in its examples, however many of them there are.
For example, let’s say we trained our neural net to recognize dogs largely by looking at their noses. After enough exposure, the nose area will be learned really well, and then one day someone uploads a picture of a dog whose nose isn’t visible, and the model stumbles.
The accuracy of the model
Accuracy is a key factor in determining how well AI will perform a task. More data implies better accuracy, but not always!
Deep learning algorithms learn by looking at lots of examples to find patterns. As such, more data means higher accuracy because there are more potential examples to learn from.
However, this doesn’t mean that having more data is always beneficial. Overfitting occurs when the algorithm assumes something holds in general when it has only ever been observed in the training set.
Overfit models often use very complex strategies to achieve lower training error, and those strategies frequently fail on new data.
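One way to see the "more data usually helps" effect is to train the same simple classifier on growing sample sizes. This sketch uses synthetic two-class data and a nearest-centroid rule as a stand-in for a real network; every number here is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n_per_class):
    """Two Gaussian blobs in 2D, one blob per class."""
    x0 = rng.normal(loc=-1.0, size=(n_per_class, 2))
    x1 = rng.normal(loc=+1.0, size=(n_per_class, 2))
    x = np.vstack([x0, x1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return x, y

def centroid_accuracy(n_train):
    """Fit class centroids on n_train samples per class, then
    measure accuracy on a fresh test set of 500 per class."""
    x_tr, y_tr = make_data(n_train)
    x_te, y_te = make_data(500)
    c0 = x_tr[y_tr == 0].mean(axis=0)
    c1 = x_tr[y_tr == 1].mean(axis=0)
    d0 = np.linalg.norm(x_te - c0, axis=1)
    d1 = np.linalg.norm(x_te - c1, axis=1)
    pred = (d1 < d0).astype(int)
    return float((pred == y_te).mean())

results = {n: centroid_accuracy(n) for n in (2, 10, 100)}
print(results)
```

With only two training samples per class the centroids land almost anywhere, so accuracy is noisy; with a hundred per class the accuracy settles near the best this simple rule can do, and beyond that point extra data stops paying off, which is the plateau the paragraph above describes.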
Applications of deep learning
Recent developments in artificial intelligence (AI) have focused on what are called “deep neural networks” or, more simply, “neural networks.” Neural networks are computational models inspired by how our brains work.
Just as we have neurons within our brain that connect with other neurons to carry out simple functions, so too do neural network architectures contain neuron-like components that perform complex tasks via connected layers.
The key difference is that instead of firing off individual messages like thoughts, dreams and emotions, these new AI systems process large amounts of data to achieve their goals.
By having many connections between nodes, they can learn powerful rules and patterns from the data. That is why people use them to solve increasingly difficult problems: once they have seen enough examples, they pick up on trends that simpler methods miss.
Because they rely heavily on datasets for knowledge, it is important to know how much training material there should be available. This article will go into detail about how different types of neural networks require different levels of data to function properly.