In Part 1 of this series, we went through the three core types of machine learning: Supervised, Unsupervised, and Reinforcement Learning.
Now, let’s zoom in on the two most common types of supervised learning: Classification and Regression.
Both are used to make predictions, but they answer different kinds of questions.
Classification
Classification is about predicting a category. The output is discrete – it falls into classes or labels.
Sample Python Code Snippet:
from sklearn.linear_model import LogisticRegression

# Classify whether an email is spam or not.
# X_train / X_test are feature matrices (e.g. word counts per email);
# y_train holds the known labels ('spam' / 'not spam').
model = LogisticRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)  # labels for unseen emails
Real-world use cases:
- Spam vs. not spam (email filters).
- Loan default prediction: yes or no.
- Medical diagnosis: cancer or no cancer.
- Fraud detection: fraudulent or legitimate.
Output Example:
['spam', 'not spam', 'spam']
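The snippet above assumes X_train, y_train, and X_test already exist. For something you can run end to end, here is a minimal sketch that borrows scikit-learn's built-in breast cancer dataset as a stand-in for labelled emails (the dataset choice is purely an illustrative assumption):

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Built-in binary dataset (benign vs. malignant) standing in for spam data
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scaling the features first helps the logistic regression solver converge
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)
print(model.predict(X_test[:3]))    # discrete class labels, e.g. [1 0 1]
print(model.score(X_test, y_test))  # accuracy on held-out data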
Regression
Regression is used when you want to predict a number – a continuous value.
Sample Python Code Snippet:
from sklearn.linear_model import LinearRegression

# Predict house price based on size and location.
# X_train / X_test are feature matrices (size, location, ...);
# y_train holds the known sale prices.
model = LinearRegression()
model.fit(X_train, y_train)
predicted_prices = model.predict(X_test)  # prices for unseen houses
Real-world use cases:
- Predicting house or stock prices.
- Forecasting sales.
- Estimating delivery time.
- Predicting customer lifetime value.
Output Example:
[232000, 189500, 211750]
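As before, the snippet assumes the training data already exists. A self-contained sketch, substituting scikit-learn's built-in diabetes dataset for housing data (an illustrative assumption, not a recommendation), could look like this:

from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Built-in regression dataset: predict disease progression from patient features
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LinearRegression()
model.fit(X_train, y_train)
print(model.predict(X_test[:3]))    # continuous values, one per sample
print(model.score(X_test, y_test))  # R^2 on held-out data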
Summary:
Aspect | Classification | Regression
--- | --- | ---
Output type | Discrete (categories) | Continuous (numbers)
Example target | Spam / not spam | House price
Common algorithms | Logistic Regression, SVM | Linear Regression, XGBoost Regressor
Real-world use cases | Fraud detection, disease diagnosis | Price prediction, sales forecasting
Bonus Concepts: Overfitting & Underfitting
These two issues can affect both classification and regression models.
Overfitting:
The model learns the training data too well, including its noise, so it performs poorly on new, unseen data.
For example, think of a student who memorizes the practice questions but fails the real exam.
Underfitting:
The model is too simple to capture the patterns in the data, so it performs poorly even on the training data.
For example, think of a student who did not study enough and does not understand the subject at all.
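To see both failure modes in code, here is a minimal sketch (the noisy quadratic dataset and the polynomial degrees are illustrative assumptions): fitting polynomials of increasing degree to the same data, a degree that is too low underfits and a degree that is too high overfits.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data: a noisy quadratic relationship
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = X.ravel() ** 2 + rng.normal(scale=1.0, size=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Degree 1 underfits, degree 2 fits well, degree 15 overfits
for degree in (1, 2, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(degree, model.score(X_train, y_train), model.score(X_test, y_test))

A large gap between training and test scores signals overfitting; low scores on both signal underfitting.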
The solution is to balance model complexity against training and test performance, using cross-validation, regularization, or hyperparameter tuning.
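As a small preview of Part 3, here is one hedged sketch of the cross-validation idea (reusing the same kind of assumed synthetic data): each candidate model is scored only on folds it never trained on, so the overfit model stops looking good.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Noisy quadratic data, as in the previous sketch
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = X.ravel() ** 2 + rng.normal(scale=1.0, size=100)

# 5-fold cross-validation: average score over held-out folds per degree
for degree in (1, 2, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    print(degree, cross_val_score(model, X, y, cv=5).mean())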
What’s Next?
In Part 3, we will dig into Feature Engineering, Feature Selection, and Cross-validation – the secret weapons that help your model perform better.