Applied Machine Learning
Implementing fundamental machine learning algorithms
Below is a summary of six small projects completed as part of a machine learning course at EPFL. Each project focuses on implementing a fundamental machine learning algorithm or technique, and all implementations for this course were written and evaluated in MATLAB.
1. Principal Component Analysis (PCA)
- Objective: Implement PCA for dimensionality reduction and apply it to real-world datasets.
- Applications:
  - Dimensionality reduction on the Fisher-Iris dataset.
  - Image compression.
- Key Features:
  - Covariance matrix calculation and eigenvalue decomposition.
  - Projection to lower-dimensional spaces and data reconstruction.
  - Explained variance analysis for optimal component selection.
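The course implementation is in MATLAB; as an illustration only, a minimal NumPy sketch of the same pipeline (centering, covariance, eigendecomposition, projection, reconstruction, explained variance) might look like this. The function name `pca` and its return signature are choices made here, not the course API:

```python
import numpy as np

def pca(X, k):
    """Project X (n_samples x n_features) onto its top-k principal components."""
    mu = X.mean(axis=0)
    Xc = X - mu                                     # center the data
    C = np.cov(Xc, rowvar=False)                    # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)            # eigendecomposition (ascending)
    order = np.argsort(eigvals)[::-1]               # sort descending by eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    W = eigvecs[:, :k]                              # top-k eigenvectors as basis
    Z = Xc @ W                                      # projection to k dimensions
    explained = eigvals[:k].sum() / eigvals.sum()   # explained variance ratio
    X_rec = Z @ W.T + mu                            # reconstruction in original space
    return Z, X_rec, explained
```

Plotting `explained` as a function of `k` is the usual way to pick the number of components to keep.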
2. K-Means Clustering
- Objective: Implement and evaluate the K-Means clustering algorithm.
- Applications:
  - Clustering a high-dimensional dataset.
  - Developing a recommendation system.
- Key Features:
  - Centroid initialization and assignment.
  - Metrics such as RSS, AIC, and BIC for model evaluation.
  - Visualization of cluster boundaries and optimization of the number of clusters.
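Again, the course code is MATLAB; a compact NumPy sketch of the core loop (initialize centroids from data points, alternate assignment and update steps, report RSS) could look like this. The function name `kmeans` and the simple random initialization are illustrative assumptions:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Basic K-Means: random centroid init, alternating assign/update steps."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]  # init from data
    for _ in range(n_iter):
        # assignment step: each point goes to its nearest centroid (squared L2)
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # update step: centroid becomes the mean of its assigned points
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    rss = ((X - centroids[labels]) ** 2).sum()  # residual sum of squares
    return labels, centroids, rss
```

The returned RSS is the quantity that AIC- and BIC-style criteria penalize with a model-complexity term when choosing the number of clusters.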
3. K-Nearest Neighbors (KNN)
- Objective: Implement KNN for classification.
- Applications:
  - Testing on synthetic and real-world datasets.
- Key Features:
  - Distance computation using L1, L2, and Linf norms.
  - Majority voting mechanism for classification.
  - Evaluation of classifier performance with accuracy metrics.
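As a language-neutral illustration of the ideas above (the course version is in MATLAB), a small NumPy sketch with selectable L1, L2, or Linf distance and majority voting might read as follows; the name `knn_predict` and the `norm` parameter are assumptions made here:

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3, norm=2):
    """KNN classifier with L1, L2, or Linf distance and majority voting."""
    preds = []
    for x in X_test:
        diff = X_train - x
        if norm == 1:
            d = np.abs(diff).sum(axis=1)          # L1 (Manhattan)
        elif norm == 2:
            d = np.sqrt((diff ** 2).sum(axis=1))  # L2 (Euclidean)
        else:
            d = np.abs(diff).max(axis=1)          # Linf (Chebyshev)
        nn = y_train[np.argsort(d)[:k]]           # labels of the k nearest points
        vals, counts = np.unique(nn, return_counts=True)
        preds.append(vals[counts.argmax()])       # majority vote
    return np.array(preds)
```

Accuracy is then just the fraction of test labels the vote gets right, which makes comparing the three norms straightforward.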
4. Gaussian Mixture Models (GMM)
- Objective: Develop the GMM algorithm using Expectation-Maximization (EM) for parameter estimation.
- Applications:
  - Clustering datasets with multi-modal distributions.
- Key Features:
  - Implementation of full, diagonal, and isotropic covariance matrices.
  - Model evaluation using AIC and BIC metrics.
  - Visualization of Gaussian components.
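For orientation, here is a minimal NumPy sketch of EM for a GMM with full covariances (the diagonal and isotropic variants just restrict the covariance update). This is an illustration, not the MATLAB course code; the function name `gmm_em` and the small regularization term are assumptions:

```python
import numpy as np

def gmm_em(X, k, n_iter=100, seed=0):
    """EM for a GMM with full covariances; returns weights, means, covariances."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    means = X[rng.choice(n, size=k, replace=False)]            # init from data
    covs = np.stack([np.cov(X, rowvar=False) + 1e-6 * np.eye(d)] * k)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each point
        resp = np.empty((n, k))
        for j in range(k):
            diff = X - means[j]
            inv = np.linalg.inv(covs[j])
            norm = np.sqrt(((2 * np.pi) ** d) * np.linalg.det(covs[j]))
            resp[:, j] = weights[j] * np.exp(
                -0.5 * (diff @ inv * diff).sum(axis=1)) / norm
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights, means, covariances from responsibilities
        Nk = resp.sum(axis=0)
        weights = Nk / n
        means = (resp.T @ X) / Nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = ((resp[:, j, None] * diff).T @ diff / Nk[j]
                       + 1e-6 * np.eye(d))                     # regularize
    return weights, means, covs
```

AIC and BIC for model selection combine the resulting log-likelihood with a parameter count that depends on which covariance structure (full, diagonal, isotropic) was fitted.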
5. GMM Applications
- Objective: Apply GMMs to classification, resampling, and regression tasks.
- Applications:
  - Multi-class classification using GMMs.
  - Generating synthetic data points via resampling.
  - Gaussian Mixture Regression (GMR) for non-linear regression problems.
- Key Features:
  - Supervised classification with GMMs.
  - Data augmentation through GMM-based resampling.
  - Regression using the conditional density of GMMs.
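The GMR idea — fit a GMM on the joint input/output space, then predict via the conditional mean E[y | x] — can be sketched in NumPy as below. This is illustrative only (the course uses MATLAB); the function name `gmr_predict` and the convention that the first `in_dim` coordinates are inputs are assumptions made here:

```python
import numpy as np

def gmr_predict(weights, means, covs, x_query, in_dim):
    """Gaussian Mixture Regression: E[y | x] under a fitted joint GMM.
    means: (k, d); covs: (k, d, d); inputs are the first in_dim coordinates."""
    k = len(weights)
    out_dim = means.shape[1] - in_dim
    h = np.empty(k)
    cond_means = np.empty((k, out_dim))
    for j in range(k):
        mu_x, mu_y = means[j][:in_dim], means[j][in_dim:]
        Sxx = covs[j][:in_dim, :in_dim]
        Syx = covs[j][in_dim:, :in_dim]
        diff = x_query - mu_x
        inv = np.linalg.inv(Sxx)
        # weight of component j given x (Gaussian density of the input marginal)
        norm = np.sqrt(((2 * np.pi) ** in_dim) * np.linalg.det(Sxx))
        h[j] = weights[j] * np.exp(-0.5 * diff @ inv @ diff) / norm
        # conditional mean of y given x under component j
        cond_means[j] = mu_y + Syx @ inv @ diff
    h /= h.sum()
    return h @ cond_means   # responsibility-weighted mix of conditional means
```

With a single joint Gaussian of correlation 0.9, a query at x = 1 yields the familiar linear-regression answer y = 0.9, which is a quick sanity check on the conditional-mean formula.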
6. Neural Networks (NN)
- Objective: Build a foundational neural network for binary classification.
- Applications:
  - Training on synthetic datasets with visualization of decision boundaries.
- Key Features:
  - Implementation of activation functions (Sigmoid, ReLU, Leaky ReLU).
  - Weight initialization strategies for stable convergence.
  - Mini-batch learning and visualization of training progress.
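To make the training loop concrete, here is a NumPy sketch of a one-hidden-layer binary classifier trained with mini-batch gradient descent, using sigmoid activations as one example (the course also covers ReLU and Leaky ReLU). The function name `train_nn` and the hyperparameter defaults are illustrative assumptions, not the course's MATLAB API:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_nn(X, y, hidden=8, lr=0.5, epochs=500, batch=16, seed=0):
    """One-hidden-layer network (sigmoid activations), mini-batch SGD."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # small random init keeps early activations in the sigmoid's active range
    W1 = rng.normal(0, 0.5, size=(d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, size=(hidden, 1)); b2 = np.zeros(1)
    for _ in range(epochs):
        perm = rng.permutation(n)
        for s in range(0, n, batch):
            idx = perm[s:s + batch]
            Xb, yb = X[idx], y[idx, None]
            # forward pass
            H = sigmoid(Xb @ W1 + b1)
            out = sigmoid(H @ W2 + b2)
            # backward pass (cross-entropy loss with sigmoid output)
            d_out = out - yb
            dW2 = H.T @ d_out / len(idx); db2 = d_out.mean(axis=0)
            d_h = (d_out @ W2.T) * H * (1 - H)
            dW1 = Xb.T @ d_h / len(idx); db1 = d_h.mean(axis=0)
            W2 -= lr * dW2; b2 -= lr * db2
            W1 -= lr * dW1; b1 -= lr * db1
    def predict(Xq):
        return (sigmoid(sigmoid(Xq @ W1 + b1) @ W2 + b2) > 0.5).ravel()
    return predict
```

Evaluating `predict` on a grid of input points is the standard way to visualize the learned decision boundary.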
How to Use
- GitHub Repository here.
- Each project folder contains:
  - MATLAB scripts for implementation (.m files).
  - Dataset files for testing and evaluation.
  - Instructions for running the code and visualizing results.