Machine Learning

WHAT IS MACHINE LEARNING?
Machine learning is a field of AI that enables systems to learn patterns from data and improve performance on a task without being explicitly programmed.

WHAT ARE THE MAIN TYPES OF MACHINE LEARNING?
Supervised learning (labeled data)
Unsupervised learning (unlabeled data)
Reinforcement learning (learning via rewards and penalties)

WHAT IS THE DIFFERENCE BETWEEN AI, ML, AND DEEP LEARNING?
AI is the broad concept of intelligent systems, ML is a subset of AI focused on learning from data, and deep learning is a subset of ML using neural networks with many layers.

WHAT IS SUPERVISED LEARNING?
Supervised learning uses labeled input-output pairs to train models for prediction or classification.

WHAT IS UNSUPERVISED LEARNING?
Unsupervised learning identifies hidden patterns or structure in unlabeled data; common tasks include clustering and dimensionality reduction.

WHAT IS OVERFITTING?
Overfitting occurs when a model learns noise in the training data and performs poorly on unseen data.

HOW DO YOU PREVENT OVERFITTING?
By using regularization (for example, L1 or L2 penalties), cross-validation for model selection, early stopping, pruning of tree-based models, and gathering more training data.
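
As one illustration of the techniques listed above, early stopping can be sketched in a few lines. The function name `early_stopping_epoch` and the `patience` parameter are assumptions made for this example and do not come from any particular library.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch whose weights would be kept.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_epoch = 0
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop here; later epochs are never run
    return best_epoch


# Hypothetical validation losses, one per epoch.
kept_epoch = early_stopping_epoch([1.0, 0.8, 0.7, 0.75, 0.9, 0.6], patience=2)
```

With `patience=2` the loop stops after two non-improving epochs, so the later drop to 0.6 is never seen; a larger patience trades extra training time for a chance to find it.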

WHAT IS UNDERFITTING?
Underfitting occurs when a model is too simple to capture underlying patterns in the data.

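EXPLAIN THE BIAS–VARIANCE TRADE-OFF.
Bias is the error introduced by overly simple modeling assumptions, while variance is the error caused by sensitivity to the particular training set. Reducing one tends to increase the other, so the trade-off is to pick a model complexity that minimizes total error (bias squared plus variance plus irreducible noise).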
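
For squared-error loss, the trade-off is commonly written as the decomposition below. The notation is assumed here for illustration: the data follow y = f(x) + ε with zero-mean noise of variance σ², and \hat{f} is the model learned from a random training set.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```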

WHAT IS A TRAIN-TEST SPLIT?
A train-test split divides the data into a training set used to fit the model and a held-out test set used to evaluate performance on unseen data.
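
A minimal sketch of such a split, using only the standard library. The helper name `split_train_test`, the 80/20 ratio, and the fixed seed are assumptions for this example; libraries such as scikit-learn provide a similar utility.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train, test = split_train_test(range(10), test_fraction=0.2)
```

Shuffling before splitting avoids accidentally testing on a block of rows that share an ordering artifact; for time series, a chronological split would usually be more appropriate.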

WHAT IS CROSS-VALIDATION?
Cross-validation repeatedly splits data into training and validation sets to obtain a reliable performance estimate.
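
The repeated splitting can be sketched as k-fold cross-validation, one common variant. The generator name `k_fold_indices` is hypothetical, and the sketch assumes the data have already been shuffled if needed.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


folds = list(k_fold_indices(n_samples=7, k=3))
```

Each sample lands in exactly one validation fold, so averaging the k validation scores uses every row for evaluation once.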

WHAT IS THE DIFFERENCE BETWEEN REGRESSION AND CLASSIFICATION?
Regression predicts continuous outcomes, while classification predicts discrete classes.

WHAT IS LINEAR REGRESSION?
A model that predicts a continuous variable using a linear relationship between inputs and output.
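
For a single input feature, the least-squares line has a closed-form solution. The sketch below assumes ordinary least squares and uses a hypothetical helper name, `fit_simple_linear_regression`.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) minimizing squared error for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)  # assumes xs are not all equal
    slope = covariance / variance_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data generated from y = 2x + 1.
slope, intercept = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
```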

WHAT IS LOGISTIC REGRESSION?
A classification algorithm that models the probability of a binary outcome using a logistic function.
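
The prediction step can be sketched as follows. The weights and bias are made-up values for illustration; in practice they would be learned from data, and this simple form is not hardened against overflow for very large negative inputs.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, bias):
    """P(y = 1 | features) under a logistic regression model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical parameters; here the linear score z is exactly 0.
p = predict_probability([2.0, 1.0], weights=[0.5, -1.0], bias=0.0)
```

A linear score of zero maps to a probability of 0.5, which is why the decision boundary of logistic regression is the set of points where the linear score is zero.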

WHAT EVALUATION METRICS ARE USED FOR REGRESSION?
MAE, MSE, RMSE, and R².
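
These four metrics can be computed directly from their definitions. The function name `regression_metrics` is an assumption for this sketch, and R² is undefined when all true values are equal.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, and R² for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
```

In this toy case a single error of 2 gives MAE 0.5 but MSE 1.0, showing how squared-error metrics penalize large misses more heavily.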

WHAT EVALUATION METRICS ARE USED FOR CLASSIFICATION?
Accuracy, Precision, Recall, F1-score, ROC-AUC.
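
The threshold-based metrics can be sketched for the binary case as below. The function name and the convention of returning 0.0 when a denominator is zero are assumptions of this example; ROC-AUC is omitted because it needs predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


scores = classification_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0])
```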

WHAT IS A DECISION TREE?
A tree-based model that splits data based on feature values to make predictions.
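
A single split, the building block of a tree, can be sketched as a search for the threshold that gives the purest children. Gini impurity is one common criterion and is an assumption here; the helper names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_threshold(values, labels):
    """Find the threshold on one feature that minimizes weighted Gini impurity."""
    best = (None, float("inf"))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


threshold, impurity = best_threshold(
    [1, 2, 3, 10, 11, 12], ["a", "a", "a", "b", "b", "b"]
)
```

A full tree would apply this search over every feature and then recurse on the two resulting subsets until a stopping rule is met.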

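WHAT IS A RANDOM FOREST?
An ensemble method that trains many decision trees on bootstrap samples of the data, typically with a random subset of features considered at each split, and aggregates their predictions (averaging for regression, majority vote for classification) to reduce overfitting.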

WHAT IS GRADIENT BOOSTING?
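An ensemble technique where weak models, usually shallow decision trees, are added sequentially, each one fitted to the residual errors (more generally, the negative gradient of the loss) of the ensemble built so far.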

WHAT IS THE DIFFERENCE BETWEEN BAGGING AND BOOSTING?
Bagging builds models independently to reduce variance, while boosting builds models sequentially to reduce bias.

WHAT IS FEATURE ENGINEERING?
The process of transforming raw data into meaningful features that improve model performance.

HOW DO YOU HANDLE MISSING DATA?
By removing records, imputing values, or using models that handle missing values automatically.
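
As a sketch of the imputation option, the example below fills gaps with the column mean. Representing missing values as `None` and choosing the mean (rather than the median or a model-based estimate) are assumptions of this example.

```python
def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


filled = impute_mean([1.0, None, 3.0, None])
```

To avoid leaking information, the mean would normally be computed on the training set only and then reused for validation and test data.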

HOW DO YOU HANDLE CATEGORICAL VARIABLES?
Using one-hot encoding, label encoding, or target encoding.
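
Label encoding and one-hot encoding can be sketched as below. The helper names and the choice to order categories alphabetically are assumptions of this example; target encoding is omitted because it also requires the target values.

```python
def label_encode(categories):
    """Map each distinct category to an integer, in sorted order."""
    mapping = {c: i for i, c in enumerate(sorted(set(categories)))}
    return [mapping[c] for c in categories], mapping


def one_hot_encode(categories):
    """Return one binary indicator vector per input value, plus the column order."""
    levels = sorted(set(categories))
    vectors = [[1 if c == level else 0 for level in levels] for c in categories]
    return vectors, levels


colors = ["red", "green", "blue", "green"]
codes, mapping = label_encode(colors)
vectors, levels = one_hot_encode(colors)
```

Label encoding implies an ordering between categories, which suits tree-based models better than linear ones; one-hot encoding avoids that implied order at the cost of more columns.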

WHAT IS THE DIFFERENCE BETWEEN NORMALIZATION AND STANDARDIZATION?
Normalization (min-max scaling) rescales values to a fixed range, typically 0 to 1, while standardization (z-score scaling) shifts and scales data to zero mean and unit variance.
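
The two rescalings can be sketched side by side. The function names are hypothetical, the population standard deviation is used, and both helpers assume the input values are not all identical.

```python
import math


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift and scale values to zero mean and unit variance (z-scores)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)  # population std
    return [(v - mean) / std for v in values]


data = [2.0, 4.0, 6.0, 8.0]
normalized = min_max_normalize(data)
standardized = standardize(data)
```

Min-max scaling is sensitive to outliers because a single extreme value sets the range; standardization is less affected but does not bound the output.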

WHAT IS CLASS IMBALANCE?
When one class appears much more frequently than others in the dataset.

HOW DO YOU HANDLE IMBALANCED DATA?
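Using resampling techniques (oversampling the minority class or undersampling the majority class), class weighting, or evaluation metrics suited to imbalance, such as precision, recall, F1-score, or ROC-AUC rather than accuracy alone.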
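
Class weighting can be sketched with an inverse-frequency heuristic, n_samples / (n_classes * class_count). The function name is hypothetical, and this particular formula is one common choice rather than the only option.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency in `labels`."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * count) for c, count in counts.items()}


# Hypothetical dataset: 90 negatives, 10 positives.
weights = balanced_class_weights([0] * 90 + [1] * 10)
```

The rare class receives the larger weight, so each misclassified minority example contributes more to a weighted loss than a majority example does.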

WHAT IS MODEL INTERPRETABILITY?
The ability to understand how a model makes predictions.

WHAT ARE FEATURE IMPORTANCE METHODS?
Techniques like permutation importance, SHAP, and LIME that explain feature contributions.
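
Permutation importance, the simplest of these, can be sketched without any library: shuffle one feature column at a time and measure how much the error grows. The model here is a hypothetical function that only uses feature 0, and the function names and seed are assumptions of this example.

```python
import random


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, seed=0):
    """Increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(targets, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the target
        permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        score = mean_squared_error(targets, [predict(r) for r in permuted])
        importances.append(score - baseline)
    return importances


# Hypothetical data: the target depends on feature 0 only.
rows = [[float(i), float(i % 3)] for i in range(20)]
targets = [2.0 * r[0] for r in rows]
importances = permutation_importance(lambda r: 2.0 * r[0], rows, targets)
```

Shuffling the unused feature leaves the error unchanged, so its importance is zero, while shuffling the feature the model relies on raises the error.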

WHAT IS DATA DRIFT?
When the statistical properties of input data change over time, affecting model performance.
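
One possible per-feature drift check is to compare a reference sample with recent data using the two-sample Kolmogorov-Smirnov statistic. This choice of statistic and the helper name `ks_statistic` are assumptions made for illustration; the appropriate alert threshold depends on the application.

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)

    def ecdf(sorted_sample, x):
        # Fraction of the sample that is less than or equal to x.
        return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)

    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


# Hypothetical feature values at training time and in production.
reference = [1.0, 2.0, 3.0, 4.0]
no_drift = ks_statistic(reference, [4.0, 3.0, 2.0, 1.0])
full_drift = ks_statistic(reference, [10.0, 11.0, 12.0, 13.0])
```

The statistic ranges from 0 (identical empirical distributions) to 1 (no overlap), so it can be tracked over time as a simple monitoring signal.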

WHAT STEPS ARE INVOLVED IN BUILDING AN ML MODEL?
Problem definition, data collection, preprocessing, feature engineering, model training, evaluation, deployment, and monitoring.

 
