Support Vector Machine (SVM) Simply Explained

In this blog post, we will explain what a Support Vector Machine (SVM) is and how it works. We will also look at the main types of Machine Learning. The best part is that you won't need more than high school maths to understand the basics of SVM.

So let's begin!

In our previous blog post, we explored the difference between Artificial Intelligence, Machine Learning, Deep Learning, and Data Science and how they relate to each other. This post continues from there, so if you haven't read the previous one, you might want to check it out: Artificial Intelligence, Machine Learning, Deep Learning, and Data Science Explained.

The 4 main categories in Machine Learning are:

  1. Supervised Learning: In this type, the model is trained on a labeled dataset where the input data is paired with the correct output. The goal is to learn a mapping function that can accurately predict outputs for new, unseen inputs. Supervised learning has two types: Regression (prediction of continuous values) and Classification (prediction of classes).

  2. Unsupervised Learning: Here, the model is given input data without explicit labels. The objective is to find patterns, relationships, or structures within the data, often used for clustering and dimensionality reduction.

  3. Semi-Supervised Learning: A mix of supervised and unsupervised learning, this approach uses a small amount of labeled data along with a larger amount of unlabeled data. It combines elements of both approaches to improve learning accuracy.

  4. Reinforcement Learning: In reinforcement learning, an agent learns to interact with an environment to maximize a reward signal. The agent takes actions and learns from the consequences of those actions, aiming to make better decisions over time.

SVM falls under Supervised Learning.

Support Vector Machine

SVM is a machine learning algorithm used for both classification and regression. When we use SVM for classification, we call it a Support Vector Classifier, and when we use it for regression, we call it a Support Vector Regressor. SVM is arguably among the most effective machine learning algorithms for classification.

What is classification?

Classification means grouping objects into two or more classes. For example, suppose we have collected data about humidity, temperature, and whether or not it was raining. If we plot these values on a graph, we will find that the points for rainy days form a different group from the points for dry days. We will see this in detail in this article.

Suppose we have a data set that contains the values of Humidity, Temperature, and Rain, something like this:

You can think of each row as a day. For example, on 5 June 2023 the humidity was 1 unit, the temperature was also 1 unit, and it wasn't raining (1st row). On 11 June 2023, the humidity was 4 units, the temperature was also 4 units, and it was raining. Let's assume No = -1 and Yes = +1, since computers are more comfortable working with numerical values.
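As a rough illustration of that encoding step, here is a small Python sketch. Only the (1, 1, No) and (4, 4, Yes) rows come from the example above; the other rows and the names rows, label_map, X, and y are made up for illustration.

```python
# Each row is (humidity, temperature, rain).
# Only (1, 1, "No") and (4, 4, "Yes") come from the post; the rest are hypothetical.
rows = [
    (1, 1, "No"),
    (2, 1, "No"),
    (1, 2, "No"),
    (4, 4, "Yes"),
    (5, 4, "Yes"),
    (4, 5, "Yes"),
]

# Map the text labels to numbers: No = -1, Yes = +1
label_map = {"No": -1, "Yes": 1}

# X holds the inputs (humidity, temperature), y holds the numeric labels
X = [[humidity, temperature] for humidity, temperature, _ in rows]
y = [label_map[rain] for _, _, rain in rows]

print(X)
print(y)
```

Keeping the inputs (X) and the labels (y) in separate lists is the shape most machine learning libraries expect.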

This is how a machine learning model learns from past experience, or data, to make predictions.

Let's assume that there are many such rows in our data set. When we plot them, we will get something like this:

The dots (.) represent the days when it wasn't raining (the value of the Rain column was No), and the crosses (x) represent the days when it was raining (the value of the Rain column was Yes).

Now we draw a best-fit line that separates these two groups, so that we can easily tell the points apart. This is similar to logistic regression; if you don't know what logistic regression is, then I highly recommend you read this article:

After we have drawn a best-fit line that successfully separates these two groups (refer to the article linked above):

We then draw two more lines parallel to the hyperplane (our best-fit line), one on either side of it, each passing through the point nearest to the hyperplane on that side. These two lines are called marginal planes, and the points they pass through are called support vectors.
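To make "the nearest point on either side" concrete, here is a minimal Python sketch. It assumes a hypothetical separating line, humidity + temperature = 6, and a handful of made-up data points; none of these numbers come from a real data set.

```python
import math

# Hypothetical separating line: w[0]*humidity + w[1]*temperature + b = 0
w = (1.0, 1.0)
b = -6.0

# Hypothetical (humidity, temperature, rain) rows; rain is -1 (No) or +1 (Yes)
points = [
    (1, 1, -1), (1, 2, -1), (2, 1, -1), (2, 2, -1),
    (4, 4, 1), (4, 5, 1), (5, 4, 1), (5, 5, 1),
]

def distance_to_line(x1, x2):
    """Perpendicular distance from the point (x1, x2) to the line."""
    return abs(w[0] * x1 + w[1] * x2 + b) / math.hypot(w[0], w[1])

# For each class, remember the point closest to the line
nearest = {}
for x1, x2, label in points:
    d = distance_to_line(x1, x2)
    if label not in nearest or d < nearest[label][1]:
        nearest[label] = ((x1, x2), d)

for label, (point, d) in sorted(nearest.items()):
    print(f"class {label:+d}: nearest point {point}, distance {d:.3f}")
```

The closest point in each class is the one a line drawn parallel to the separating line would touch first, which is why those points play the role of support vectors in this toy setup.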

We call these points support vectors because they support the marginal planes. Our aim is to make the distance between the marginal planes (the margin) as large as possible: among all the lines that separate the two groups, SVM picks the one with the largest margin. The larger the margin, the better the classifier's ability to correctly classify future data points. Once we have found our optimal hyperplane and marginal planes, the training of the machine learning model is complete.
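If you would like to try this on a computer, the sketch below shows one possible way to do it with scikit-learn's SVC class and a linear kernel. The data points are hypothetical, and the large value of C is an assumption chosen so that the model behaves like the strict "no points inside the margin" picture described above.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical (humidity, temperature) points and their labels (-1 = No, +1 = Yes)
X = np.array([[1, 1], [1, 2], [2, 1], [2, 2],
              [4, 4], [4, 5], [5, 4], [5, 5]])
y = np.array([-1, -1, -1, -1, 1, 1, 1, 1])

# A linear kernel gives a straight-line boundary;
# a large C approximates a hard margin on separable data
model = SVC(kernel="linear", C=1000.0)
model.fit(X, y)

# The hyperplane is w[0]*humidity + w[1]*temperature + b = 0
w = model.coef_[0]
b = model.intercept_[0]

# The distance between the two marginal planes is 2 / ||w||
margin = 2.0 / np.linalg.norm(w)

print("support vectors:", model.support_vectors_.tolist())
print("w:", w, "b:", b)
print("margin width:", margin)
```

The support_vectors_ attribute lists the training points that ended up on the marginal planes, and 2 / ||w|| is the standard formula for the width of the margin of a linear SVM.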

Suppose we would like to predict whether it will rain when the humidity and temperature are both 5.

As we can see, the point (5, 5) lies in the raining group, so the model will output +1, or Yes.
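In code, making a prediction amounts to checking which side of the line a point falls on. The sketch below assumes a hypothetical trained line, 0.5 * humidity + 0.5 * temperature - 3 = 0; the weights and the function name predict_rain are made up for illustration.

```python
# Hypothetical line found during training:
# 0.5*humidity + 0.5*temperature - 3 = 0
w = (0.5, 0.5)
b = -3.0

def predict_rain(humidity, temperature):
    """Return +1 (Yes) if the point is on the raining side of the line, else -1 (No)."""
    score = w[0] * humidity + w[1] * temperature + b
    return 1 if score >= 0 else -1

label = predict_rain(5, 5)
print("Yes" if label == 1 else "No")
```

A positive score means the point lies on the raining side of the line, and a negative score means it lies on the not-raining side.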

So, this is how a basic SVM works. In this blog post, we have developed an intuition of how SVM works without maths or much technical jargon. In later blog posts, we will dive deeper and make things a little more complex. We will look at the kernel trick and how it lets SVM map data into a higher-dimensional space to find non-linear decision boundaries, handle complex relationships in data, and improve its results.

Comment down below and share your thoughts about the topics discussed in this blog post.