Kickstart ML with Python snippets

Explore the basic concepts of the Support Vector Machine (SVM) algorithm

Support Vector Machines (SVMs) are supervised learning models used for classification, regression, and outlier detection. SVMs are effective in high-dimensional spaces and remain effective even when the number of dimensions exceeds the number of samples.

  1. Hyperplane:

    • In SVM, the goal is to find the optimal hyperplane that separates the data points of different classes. In two dimensions this hyperplane is a line; in three dimensions it is a plane; in higher dimensions it is a hyperplane.
  2. Support Vectors:

    • Support vectors are the data points that are closest to the hyperplane and influence its position and orientation. These points are critical for defining the hyperplane.
  3. Margin:

    • The margin is the distance between the hyperplane and the nearest data points from either class. SVM aims to maximize this margin to make the model more robust (a short sketch after this list shows how to inspect the support vectors and the margin width).
  4. Kernel Trick:

    • SVMs use kernel functions to implicitly map the data into a higher-dimensional space where a separating hyperplane is easier to find. Common kernels include linear, polynomial, radial basis function (RBF), and sigmoid.
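
To make support vectors and the margin concrete, here is a minimal sketch (using a small made-up, linearly separable dataset, not the Iris data from the example below) that fits a linear SVM with scikit-learn and inspects both:

import numpy as np
from sklearn.svm import SVC

# Tiny linearly separable toy dataset: two features, two classes
X = np.array([[1, 2], [2, 3], [3, 3], [6, 5], [7, 7], [8, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel='linear', C=1.0)
clf.fit(X, y)

# The support vectors are the training points closest to the hyperplane
print("Support vectors:\n", clf.support_vectors_)

# For a linear kernel, the margin width is 2 / ||w||
w = clf.coef_[0]
print("Margin width:", 2 / np.linalg.norm(w))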

Steps in SVM Algorithm

  1. Choose a kernel function and its parameters.
  2. Implicitly map the data to a higher-dimensional space using the kernel function (a short sketch of what a kernel actually computes follows this list).
  3. Find the optimal hyperplane that maximizes the margin between classes in this higher-dimensional space.
  4. Classify new data points based on which side of the hyperplane they fall on.
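
As a side note on step 2, the kernel never computes the high-dimensional coordinates explicitly; it only computes inner products in that space (the "kernel trick"). A minimal sketch, using scikit-learn's rbf_kernel helper, comparing the library's kernel value with the closed-form RBF formula:

import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# Two sample points with three features each
a = np.array([[1.0, 2.0, 3.0]])
b = np.array([[2.0, 0.0, 1.0]])

gamma = 0.5

# Kernel value as computed by scikit-learn
k_sklearn = rbf_kernel(a, b, gamma=gamma)[0, 0]

# Same value from the definition: exp(-gamma * ||a - b||^2)
k_manual = np.exp(-gamma * np.sum((a - b) ** 2))

print(k_sklearn, k_manual)  # both ≈ 0.0111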

Practical Example in Python

Let's implement an SVM for a classification task using Python and the scikit-learn library.

Step-by-Step Example

  1. Import Libraries:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
  2. Load and Prepare Data:
# Load the Iris dataset
iris = datasets.load_iris()
X, y = iris.data, iris.target

# For simplicity, we will use only two classes and two features
X = X[y != 2, :2]
y = y[y != 2]

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
  3. Train the SVM Model:
# Initialize the SVM classifier with a linear kernel
svm = SVC(kernel='linear')

# Fit the model
svm.fit(X_train, y_train)
  4. Make Predictions:
# Predict on the test set
y_pred = svm.predict(X_test)
  5. Evaluate the Model:
# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

# Print confusion matrix
conf_matrix = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(conf_matrix)

# Print classification report
class_report = classification_report(y_test, y_pred)
print("Classification Report:")
print(class_report)
  6. Visualize the Decision Boundary:
# The model was trained on standardized features, so scale the full dataset too
X_scaled = scaler.transform(X)

# Create a mesh to plot the decision boundary
h = 0.02  # step size in the mesh
x_min, x_max = X_scaled[:, 0].min() - 1, X_scaled[:, 0].max() + 1
y_min, y_max = X_scaled[:, 1].min() - 1, X_scaled[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

# Plot the decision boundary
Z = svm.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, alpha=0.8, cmap=plt.cm.coolwarm)

# Plot the data points
plt.scatter(X_scaled[:, 0], X_scaled[:, 1], c=y, edgecolors='k', cmap=plt.cm.coolwarm)
plt.xlabel('Feature 1 (standardized)')
plt.ylabel('Feature 2 (standardized)')
plt.title('SVM Decision Boundary')
plt.show()
Code Explanation

  1. Data Preparation:

    • We load the Iris dataset and select only two classes and two features for simplicity.
    • We split the data into training and testing sets and standardize the features so that no single feature dominates because of its scale.
  2. Model Training:

    • We initialize the SVM classifier with a linear kernel using SVC(kernel='linear').
    • We fit the model to the training data using svm.fit(X_train, y_train).
  3. Making Predictions:

    • We use the trained model to predict the class labels for the test set using svm.predict(X_test).
  4. Model Evaluation:

    • We calculate the accuracy of the model, which is the proportion of correctly predicted instances.
    • We generate a confusion matrix to see how well the model performs for each class.
    • We print a classification report, which includes precision, recall, and F1-score for each class.
  5. Visualizing the Decision Boundary:

    • We create a mesh grid to plot the decision boundary of the SVM.
    • We use plt.contourf to plot the decision regions and plt.scatter to plot the data points (an alternative using scikit-learn's built-in plotting helper follows this list).
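
As an alternative to the manual mesh-grid code in step 6, scikit-learn 1.1 and later ships DecisionBoundaryDisplay, which builds the grid for you. A minimal sketch, reusing the fitted svm and the standardized features X_scaled from above:

from sklearn.inspection import DecisionBoundaryDisplay

# Plot the decision regions directly from the fitted estimator
disp = DecisionBoundaryDisplay.from_estimator(
    svm, X_scaled, response_method='predict',
    alpha=0.8, cmap=plt.cm.coolwarm,
    xlabel='Feature 1 (standardized)', ylabel='Feature 2 (standardized)',
)
disp.ax_.scatter(X_scaled[:, 0], X_scaled[:, 1], c=y, edgecolors='k', cmap=plt.cm.coolwarm)
plt.title('SVM Decision Boundary')
plt.show()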

Practical Tips

  1. Choosing the Kernel:

    • Experiment with different kernels (linear, polynomial, RBF) to see which one works best for your data.
    svm = SVC(kernel='rbf', gamma='scale')  # RBF kernel with default gamma
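    • A quick way to compare kernels is cross-validation; a minimal sketch, assuming the X_train and y_train from the example above:
    from sklearn.model_selection import cross_val_score
    
    for kernel in ['linear', 'poly', 'rbf']:
        scores = cross_val_score(SVC(kernel=kernel), X_train, y_train, cv=5)
        print(f"{kernel}: mean accuracy {scores.mean():.2f}")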
  2. Hyperparameter Tuning:

    • Use grid search or cross-validation to find the best hyperparameters for your SVM model.
    from sklearn.model_selection import GridSearchCV
    
    param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [1, 0.1, 0.01, 0.001], 'kernel': ['rbf']}
    grid = GridSearchCV(SVC(), param_grid, refit=True, verbose=2)
    grid.fit(X_train, y_train)
    print(grid.best_params_)
  3. Scaling Features:

    • Always scale your features before applying SVM, as the algorithm is sensitive to feature scales. A Pipeline keeps the scaler and the classifier together and avoids leaking test-set statistics; a minimal sketch, assuming raw (unscaled) X_train and X_test:
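    from sklearn.pipeline import make_pipeline
    
    # The scaler is fit on the training data only, inside the pipeline
    model = make_pipeline(StandardScaler(), SVC(kernel='rbf', gamma='scale'))
    model.fit(X_train, y_train)
    print(model.score(X_test, y_test))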
  4. Handling Imbalanced Data:

    • If your data is imbalanced, consider using techniques like SMOTE (Synthetic Minority Over-sampling Technique, sketched below) or adjusting class weights in the SVM.
    svm = SVC(kernel='linear', class_weight='balanced')
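    • A minimal SMOTE sketch, assuming the separate imbalanced-learn package is installed (pip install imbalanced-learn) and the training split from the example above:
    from imblearn.over_sampling import SMOTE
    
    # Synthesize new minority-class samples, then train on the balanced set
    X_resampled, y_resampled = SMOTE(random_state=42).fit_resample(X_train, y_train)
    svm = SVC(kernel='linear')
    svm.fit(X_resampled, y_resampled)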
