
Wednesday, January 3, 2024

Machine Learning: An Introduction


Arthur Samuel, an American pioneer in the field of artificial intelligence, defined machine learning in 1959. His definition succinctly captures the essence of this transformative technology:
"Field of study that gives computers the ability to learn without being explicitly programmed."

Arthur Samuel's definition highlights the core concept of machine learning, emphasizing the capacity of computers to learn and improve their performance over time through experience and exposure to data, without the need for explicit programming for every possible scenario. Samuel's groundbreaking work laid the foundation for the development and evolution of machine learning as we know it today.

In 1962, Arthur Samuel wrote an essay, "Artificial Intelligence: A Frontier of Automation", in which he observed:

"Programming a computer for such computations is at best, a difficult task, not primarily because of any inherit complexity in the computer itself but, rather, because of the need to spell out every minute step of process in the most exasperating detail. Computers, as any programmer will tell you, are giant morons, not giant brains."

Machine learning is like regular programming in that it is a way to get computers to complete a specific task. But how would you go about using regular programming for a task that is hard to specify, such as distinguishing photos of cats from photos of dogs? Usually, when we build a program, it is not difficult to write down the steps needed to complete the job. We would normally construct a function that looks something like this:

Fig 1. Traditional Programming
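
To make Fig 1 concrete, here is a minimal sketch of a traditional program (the fruit-classifying rule is purely hypothetical): we, the programmers, spell out every step ourselves.

    # A traditional program: inputs go in, hand-written steps run, results come out.
    def classify_fruit(weight_grams, color):
        # Explicit, human-authored conditions stand in for the "program" box in Fig 1.
        if color == "yellow" and weight_grams < 200:
            return "banana"
        if color == "red":
            return "apple"
        return "unknown"

    print(classify_fruit(120, "yellow"))  # prints: banana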

His basic idea was this: instead of telling the computer the exact steps required to solve a problem, show it examples of the problem to solve, and let it figure out how to solve it itself! This turned out to be very effective: by 1961 his checkers-playing program had learned so much that it beat the Connecticut state champion!
Here's how he described his idea: "Suppose we arrange for some automatic means of testing the effectiveness of any current weight assignment in terms of actual performance and provide a mechanism for altering the weight assignment so as to maximize the performance. We need not go into the details of such a procedure to see that it could be made entirely automatic and to see that a machine so programmed would 'learn' from its experience."

Fig 2. A Program using Weight Assignment
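
As a sketch of Fig 2, the toy model below (a simple weighted sum, chosen only for illustration) shows how the same program can behave differently under different weight assignments:

    # A program whose behavior depends on a weight assignment:
    # the same code gives different results for different weights.
    def model(inputs, weights):
        # A weighted sum stands in for the "model" box in Fig 2.
        return sum(x * w for x, w in zip(inputs, weights))

    inputs = [1.0, 2.0, 3.0]
    print(model(inputs, [0.5, 0.5, 0.5]))   # one weight assignment: 3.0
    print(model(inputs, [1.0, -1.0, 0.0]))  # another assignment: -1.0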

A machine so programmed would "learn" from its experience. Learning would become entirely automatic when the adjustment of the weights was also automatic: when, instead of us improving a model by adjusting its weights manually, we relied on an automated mechanism that produced adjustments based on performance.
Fig 3 shows the full picture of Samuel's idea of training a machine learning model.

Fig 3. Training a Machine Learning Model
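
The loop below is a minimal sketch of that idea on a toy problem (fitting y = 2x with a single weight; the data and learning rate are made up for illustration): we measure performance with squared error and let an automatic rule, gradient descent, alter the weight.

    # Samuel's loop in miniature: test the current weight assignment,
    # then alter it automatically to improve performance.
    examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy data: y = 2x

    def predict(x, w):
        return w * x

    w = 0.0    # initial weight assignment
    lr = 0.01  # step size for the automatic adjustment
    for step in range(200):
        # Gradient of mean squared error with respect to w.
        grad = sum(2 * (predict(x, w) - y) * x for x, y in examples) / len(examples)
        w -= lr * grad  # adjust the weight to reduce the loss

    print(round(w, 3))  # approaches 2.0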

Also note that once the model is trained, that is, once we have chosen our final, best weight assignment, we can think of the weights as being part of the model, since we are not varying them anymore. Therefore, actually using a model after it is trained looks like this:

Fig 4. Using a Trained Model as a Program
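
Continuing the toy example above, once training has settled on a final weight, using the model is just ordinary programming; the weight is a constant baked into the code:

    # After training, the weight is fixed: the model is now just a program.
    TRAINED_WEIGHT = 2.0  # the value the training loop above converged to

    def trained_model(x):
        return TRAINED_WEIGHT * x

    print(trained_model(5.0))  # 10.0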


Samuel was working in the 1960s, but the terminology has since changed. Here is the modern deep learning terminology:
  • The functional form of the model is called its architecture (but be careful, sometimes people use model as a synonym of architecture, so this can get confusing).
  • The weights are called parameters.
  • The predictions are calculated from the independent variables, which is the data not including the labels. 
  • The results of the model are called predictions.
  • The measure of the performance is called the loss.
  • The loss depends not only on the predictions, but also on the correct labels (also known as targets or the dependent variable), e.g. "dog" or "cat".
Fig 5. Machine Learning
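
To tie the modern terms together, here is one way the pieces fit, sketched with PyTorch (the one-layer linear architecture and three-example dataset are purely illustrative):

    import torch
    from torch import nn

    # Architecture: the functional form of the model; its weight and
    # bias are the parameters.
    architecture = nn.Linear(1, 1)

    # Independent variables (inputs) and labels (targets).
    x = torch.tensor([[1.0], [2.0], [3.0]])
    labels = torch.tensor([[2.0], [4.0], [6.0]])

    loss_fn = nn.MSELoss()  # the measure of performance is the loss
    optimizer = torch.optim.SGD(architecture.parameters(), lr=0.05)

    for step in range(500):
        predictions = architecture(x)        # predictions from the independent variables
        loss = loss_fn(predictions, labels)  # loss depends on predictions and labels
        optimizer.zero_grad()
        loss.backward()                      # the automatic mechanism for adjustments
        optimizer.step()

    print(loss.item())  # shrinks toward zero as training proceeds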


Why Deep Learning for Image Classification

Deep learning has become a groundbreaking approach to image classification because of its exceptional capacity to learn complex patterns and features from data automatically. Deep learning excels at image classification for several key reasons:

Hierarchical Feature Representation

Deep learning models, specifically convolutional neural networks (CNNs), are designed to learn hierarchical representations of data automatically. These networks are composed of many layers, with each layer capturing a different degree of abstraction. This hierarchical approach enables the model to learn rudimentary features such as edges and textures in the early layers, gradually advancing toward intricate and abstract features in the later layers.
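
For example, here is a minimal (untuned, purely illustrative) CNN sketched in PyTorch; the stacked convolutional layers are what make the learned features hierarchical:

    import torch
    from torch import nn

    # A small CNN sketch: early layers capture low-level features such as
    # edges and textures; deeper layers compose them into abstract ones.
    cnn = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),   # low-level features
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=3, padding=1),  # higher-level features
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(32 * 8 * 8, 2),  # e.g. two classes: cat vs. dog
    )

    # One 3-channel 32x32 image in, two class scores out.
    scores = cnn(torch.randn(1, 3, 32, 32))
    print(scores.shape)  # torch.Size([1, 2])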

Flexibility with Varied Data 

Deep learning models can adjust to a broad range of visual data without laborious manual feature engineering. Conventional computer vision techniques often require human specialists to devise distinct features for different image categories, a laborious and less efficient process. Deep learning models, by contrast, learn significant features directly from the data, making them adaptable and proficient at handling diverse datasets.

Scale and Complexity 

Deep learning models excel when trained on extensive datasets. As the amount of labeled image data grows, deep learning algorithms improve their ability to identify patterns and make precise predictions. This capacity to scale is especially beneficial in image classification tasks, where vast datasets are often available, enabling deep learning models to reach exceptional performance.

Transfer Learning  

Deep learning enables the use of transfer learning, a method in which a model pre-trained on a vast dataset is fine-tuned for a particular task using a smaller dataset. This is very advantageous for image classification, particularly in situations where gathering extensive labeled datasets is difficult. Transfer learning allows knowledge acquired from one task to enhance performance on a related one, saving time and computing resources; a sketch of the pattern follows.
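
A common pattern looks roughly like this (a sketch assuming torchvision is installed; the two-class head is hypothetical): load a model pretrained on ImageNet, freeze its parameters, and train only a new final layer.

    import torch
    from torch import nn
    from torchvision import models

    # Load a model pretrained on a vast dataset (ImageNet).
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

    # Freeze the pretrained parameters so only the new head is trained.
    for param in model.parameters():
        param.requires_grad = False

    # Replace the final layer for the new, smaller task (here, 2 classes).
    model.fc = nn.Linear(model.fc.in_features, 2)

    # Only the new head's parameters go to the optimizer.
    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)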

End-to-End Learning 

Deep learning models facilitate end-to-end learning: they learn directly from raw input data to produce the intended output. In image classification, this obviates the need for manual extraction of relevant features, since the model learns to extract and combine features automatically during training. End-to-end learning streamlines the whole pipeline and often yields more accurate results.

State-of-the-Art Performance

Recent advancements in deep learning, particularly convolutional neural networks (CNNs), have repeatedly demonstrated state-of-the-art performance across a range of image classification benchmarks. Their exceptional capacity to capture detailed features and comprehend sophisticated patterns has set new standards for accuracy, making them the preferred option for many image recognition tasks.

Deep learning's success in image classification can be attributed to its ability to learn hierarchical representations automatically, adapt to diverse datasets, handle large amounts of data, facilitate transfer learning, enable end-to-end learning, and consistently achieve exceptional performance. Together, these features establish deep learning as a potent and adaptable technique in the realm of computer vision.