AI/ML Demystified – Part 2: Classification Vs. Regression

In Part 1 of this series, we went through the three core types of machine learning: Supervised, Unsupervised, and Reinforcement Learning.

Now, let’s zoom in on the two most common types of supervised learning: Classification and Regression.

Both are used to make predictions, but they answer different kinds of questions.

Classification

Classification is about predicting a category. The output is discrete – it falls into classes or labels.

Sample Python Code Snippet:

from sklearn.linear_model import LogisticRegression

# Classify whether an email is spam or not.
# X_train holds the email features, y_train holds the spam / not-spam labels.
model = LogisticRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)  # discrete class labels

Real-world use cases:

  • Spam vs. Not Spam (Email filters).
  • Will a loan default? Yes or No.
  • Medical diagnosis: Cancer or No Cancer.
  • Fraud detection: Fraudulent or Legitimate.

Output Example:

['spam', 'not spam', 'spam']
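
For readers who want to run the idea end to end, here is a minimal, self-contained sketch. The email texts, labels, and bag-of-words features are made up purely for illustration:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy dataset: short email texts and their labels (invented for this example)
emails = [
    "win a free prize now", "claim your reward today",
    "meeting agenda for monday", "lunch tomorrow?",
    "free money click here", "project status update",
]
labels = ["spam", "spam", "not spam", "not spam", "spam", "not spam"]

# Turn raw text into word-count features
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)

# Train the classifier and predict categories for unseen emails
model = LogisticRegression()
model.fit(X, labels)

new_emails = ["free prize inside", "monday meeting notes"]
print(model.predict(vectorizer.transform(new_emails)))  # e.g. ['spam' 'not spam']

Notice that the output is always one of the known labels – that is what makes it classification.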

Regression

Regression is used when you want to predict a number, i.e., a continuous value.

Sample Python Code Snippet:

from sklearn.linear_model import LinearRegression

# Predict house price based on size and location.
# X_train holds the house features, y_train holds the sale prices.
model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)  # continuous price estimates

Real-world use cases:

  • Predicting house or stock prices.
  • Forecasting sales.
  • Estimating delivery time.
  • Predicting customer lifetime value.

Output Example:

[232000, 189500, 211750]
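
Here too, a small self-contained sketch makes the idea concrete. The house sizes, location scores, and prices below are invented purely for illustration:

from sklearn.linear_model import LinearRegression

# Toy dataset (invented for this example): [size in square feet, location score]
X_train = [[1500, 3], [2100, 4], [1200, 2], [1800, 5], [2500, 4]]
y_train = [232000, 310000, 180000, 295000, 360000]  # sale prices

model = LinearRegression()
model.fit(X_train, y_train)

# Predict prices for houses the model has never seen
X_new = [[1600, 3], [2200, 5]]
print(model.predict(X_new))  # continuous price estimates, not categories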


Summary:
Aspect               | Classification                      | Regression
Output Type          | Discrete (categories)               | Continuous (numbers)
Example Target       | Spam / Not Spam                     | House price
Common Algorithms    | Logistic Regression, SVM            | Linear Regression, XGBoost Regressor
Real-world use cases | Fraud Detection, Disease Diagnosis  | Price Prediction, Sales Forecasting

Bonus Concepts: Overfitting & Underfitting

These two issues can affect both classification and regression models.

Overfitting:

The model learns the training data too well, including its noise. As a result, it performs poorly on new, unseen data.

For example, think of a student who memorizes practice questions but fails in the real exam.

Underfitting:

The model is too simple and misses the underlying patterns in the data. As a result, it performs poorly even on the training data.

For example, think of a student who did not study enough and does not understand the subject at all.

The solution is to balance model complexity against train/test performance using techniques such as cross-validation, regularization, and hyperparameter tuning, as sketched below.
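
As a rough sketch of what that looks like in practice (using synthetic data generated on the spot, so the numbers themselves are not meaningful), cross-validation can show whether extra model complexity is actually helping or just fitting noise:

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data, generated here only so the example is self-contained
X, y = make_regression(n_samples=100, n_features=1, noise=20, random_state=0)

# Compare models of increasing complexity with 5-fold cross-validation.
# Ridge adds regularization; cross_val_score reports R^2 on held-out folds.
for degree in (1, 5, 15):
    model = make_pipeline(PolynomialFeatures(degree), Ridge(alpha=1.0))
    scores = cross_val_score(model, X, y, cv=5)
    print(f"degree={degree:>2}  mean CV R^2 = {scores.mean():.3f}")

A model that scores well on its own training data but poorly in cross-validation is overfitting; one that scores poorly on both is underfitting.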

What’s Next?

In Part 3, we will dig into Feature Engineering, Feature Selection, and Cross-validation – the secret weapons that help your model perform better.
