← Home
Wine KNN
📅 2023 · Tech: Statistics, Machine Learning, Python
In this data science project, I developed a classification model using
the K-Nearest Neighbors (KNN) algorithm—entirely from scratch. This
was a deep dive into the core mechanics of distance-based
classification, focused on clarity, control, and precision at each
step of the ML pipeline.
Highlights
-
Algorithm Implementation: Crafted the full KNN
algorithm manually without libraries to build conceptual
understanding and flexibility.
-
Data Analysis: Conducted EDA, cleaned the dataset,
and engineered features for better decision boundaries.
-
Nested Cross-Validation: Employed nested CV for
robust model evaluation and hyperparameter tuning.
-
Noise Resilience: Introduced controlled noise to
test model robustness, simulating real-world scenarios with
imperfect data.
Outcomes
This project delivered a fully custom KNN classifier with structured
evaluation, optimized performance, and real-world robustness
considerations. It served as a strong foundation for future
classification projects and applied statistical learning.