Support Vector Machine (SVM) is one of the most popular classification algorithms used in machine learning. We have already discussed different types of classification algorithms; if you want a quick revision, check that blog and refresh your knowledge. Let's dive deep into Support Vector Machine classification.
It is a supervised machine learning classification algorithm which assigns data points to distinct classes. This is done by finding a hyperplane that separates the data points of the different classes.
What is a hyperplane? A hyperplane is an (n-1)-dimensional subspace that divides an n-dimensional space into two regions. That means in a two-dimensional space the hyperplane is a line, and in a 3D space the hyperplane is a plane. The hyperplane is placed so as to maximize the margin: the distance between the hyperplane and the support vectors (the points closest to the hyperplane, which determine its position and orientation) should be as large as possible.
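To make the margin and support vectors concrete, here is a minimal sketch using scikit-learn on a tiny, made-up 2D dataset (the data values are illustrative, not from this post). For a linear kernel the learned hyperplane is w.x + b = 0, and the margin width is 2/||w||.

import numpy as np
from sklearn.svm import SVC

# Toy data: two small, linearly separable clusters (illustrative values only)
X = np.array([[1, 2], [2, 3], [2, 1], [6, 5], [7, 7], [8, 6]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel='linear', C=1.0)
clf.fit(X, y)

# The points returned here are the support vectors that fix the hyperplane
print(clf.support_vectors_)
# Margin width = 2 / ||w||, where w = clf.coef_[0]
print(2 / np.linalg.norm(clf.coef_[0]))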
If our data points are not linearly separable, we cannot separate them with a simple line as above, so we need a different technique. In this case we project the data points into a higher-dimensional space. In that higher dimension the data take a different shape and become linearly separable. After separating the classes, we can project the result back into the original dimension. In the following example, we use non-linearly separable data: we project it into 3D, separate it with a hyperplane, and then project it back to 2D.
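In practice we rarely do this projection by hand; a non-linear kernel (such as the RBF kernel) performs it implicitly. Here is a small sketch, assuming scikit-learn's make_circles dataset as a stand-in for the non-linearly separable data described above:

from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Concentric circles: not separable by a straight line in 2D
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# A linear kernel struggles here, while the RBF kernel separates the classes
# by implicitly mapping them into a higher-dimensional space
linear_svm = SVC(kernel='linear').fit(X_train, y_train)
rbf_svm = SVC(kernel='rbf', gamma='scale').fit(X_train, y_train)
print("Linear kernel accuracy:", linear_svm.score(X_test, y_test))
print("RBF kernel accuracy:", rbf_svm.score(X_test, y_test))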
Let's implement Support vector classifier using python
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import sklearn
# Importing the dataset
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, [1,2, 3]].values
y = dataset.iloc[:, -1].values
#encoding the data
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
X[:,0] = le.fit_transform(X[:,0])
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.20, random_state = 0)
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# Training the SVM model on the Training set
from sklearn.svm import SVC
classifier = SVC(kernel = 'linear', random_state = 0)
classifier.fit(X_train, y_train)
# Predicting the Test set results
y_pred = classifier.predict(X_test)
# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix,accuracy_score
cm = confusion_matrix(y_test, y_pred)
ac = accuracy_score(y_test,y_pred)
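To inspect the results, the confusion matrix and accuracy can simply be printed; the exact numbers will depend on the dataset and the train/test split.

# Inspect the evaluation results
print(cm)
print(ac)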