1 Learning Path
1.1 Phase 1: Foundations (2-3 months; establish the mathematical foundations and core ML concepts)
Mathematical Prerequisites
- Linear Algebra ⭐
- Calculus and Optimization ⭐
- Probability and Statistics ⭐
ML Fundamentals
- ML concepts and types of learning
- Bias-Variance Tradeoff ⭐
- Data Preprocessing
- Feature Engineering ⭐
Basic Algorithms
- Linear Regression ⭐
- Logistic Regression ⭐
- Decision Trees ⭐
1.2 Historical Methods
- 2022-12-09, PCA (Principal Component Analysis)
- 2023-02-03, PyTorch Introduction
- 2023-02-03, TensorFlow Introduction
- 2023-02-06, Naive Bayes
- 2023-03-14, An Intuitive Understanding of the Arithmetic, Geometric, and Harmonic Means
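As a quick companion to the note on the three means listed above, here is a minimal sketch using Python's standard `statistics` module. The speed values are a made-up example (two equal-distance legs of a trip), not data from the notes.

```python
from statistics import fmean, geometric_mean, harmonic_mean

# Hypothetical example: speeds (km/h) over two equal-distance legs of a trip.
speeds = [40.0, 60.0]

arithmetic = fmean(speeds)          # (40 + 60) / 2
geometric = geometric_mean(speeds)  # sqrt(40 * 60)
harmonic = harmonic_mean(speeds)    # 2 / (1/40 + 1/60)

# For positive values the ordering harmonic <= geometric <= arithmetic holds.
print(f"arithmetic={arithmetic:.2f} geometric={geometric:.2f} harmonic={harmonic:.2f}")
```

For equal distances the harmonic mean (48 km/h here) is the true average speed, which is why it sits below the arithmetic mean of 50 km/h.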
1.3 Phase 2: Core Algorithms (2-3 months; major algorithms)
Classification & Regression
- SVM ⭐
- KNN
- Naive Bayes
- Regularization ⭐
Ensemble Methods
- Random Forest ⭐
- Gradient Boosting ⭐
- XGBoost ⭐
- LightGBM ⭐
Unsupervised Learning
- K-Means ⭐
- Hierarchical Clustering
- PCA ⭐
- DBSCAN
1.4 Phase 3: Advanced Methods (2-3 months; advanced techniques)
Model Evaluation
- Cross-Validation ⭐
- Performance Metrics ⭐
- Hyperparameter Tuning ⭐
Advanced Topics
- Imbalanced Learning ⭐
- Gaussian Processes
- Model Interpretability (SHAP) ⭐
Special Applications
- Time Series
- NLP Basics
- Recommender Systems
1.5 Phase 4: ML Engineering (2 months; production deployment and operations)
Production ML
- Model Serving
- Model Monitoring ⭐
- A/B Testing for Models
Scalability
- Distributed Training
- Model Compression
- Feature Stores
2 Implementation Notes
2.1 Python Libraries
Interpretability
- import shap ⭐
- import lime
Production
- import mlflow ⭐
- import bentoml
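Before relying on the imports listed above, it can help to check which packages are present in the current environment. The sketch below uses only the standard library; the `LIBRARIES` grouping and the `availability` helper are illustrative names, and the result depends entirely on what is installed locally.

```python
from importlib.util import find_spec

# Package names taken from the lists above; whether they are installed
# depends on the local environment.
LIBRARIES = {
    "Interpretability": ["shap", "lime"],
    "Production": ["mlflow", "bentoml"],
}

def availability(groups):
    """Map each top-level package name to True if it can be located."""
    return {
        name: find_spec(name) is not None
        for names in groups.values()
        for name in names
    }

for name, installed in availability(LIBRARIES).items():
    print(f"{name}: {'installed' if installed else 'missing'}")
```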
2.2 Recommended Project Structure
ml_project/
├── data/
│   ├── raw/
│   ├── processed/
│   └── features/
├── notebooks/
│   ├── eda.ipynb
│   ├── modeling.ipynb
│   └── evaluation.ipynb
├── src/
│   ├── preprocessing/
│   ├── features/
│   ├── models/
│   └── evaluation/
├── tests/
├── configs/
└── models/
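One possible way to create this layout is a small `pathlib` script. This is a sketch, not a prescribed tool: the `scaffold` helper is a hypothetical name, the notebook files are skipped because they are created while working, and the demo writes into a temporary directory so nothing touches the current working tree.

```python
from pathlib import Path
from tempfile import TemporaryDirectory

# Directory names copied from the tree above.
DIRECTORIES = [
    "data/raw", "data/processed", "data/features",
    "notebooks",
    "src/preprocessing", "src/features", "src/models", "src/evaluation",
    "tests", "configs", "models",
]

def scaffold(root):
    """Create the project skeleton under `root` and return the created paths."""
    root = Path(root)
    created = []
    for relative in DIRECTORIES:
        path = root / relative
        path.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(path)
    return created

# Demonstrate in a throwaway directory.
with TemporaryDirectory() as tmp:
    paths = scaffold(Path(tmp) / "ml_project")
    created_count = sum(p.is_dir() for p in paths)
    print(f"created {created_count} directories")
```

To use it for a real project, call `scaffold("ml_project")` from the directory where the project should live.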
3 Foundations
3.1 Mathematical Prerequisites
- Linear Algebra for ML
  - 1111-11-11, Vectors and Matrices
  - 1111-11-11, Matrix Operations
  - 1111-11-11, Eigenvalues and Eigenvectors
  - 1111-11-11, Singular Value Decomposition
- Calculus and Optimization
  - 2026-03-23, Sum of Functions vs. Function Composition: Sequential vs. Joint Estimation
  - 1111-11-11, Derivatives and Gradients
  - 1111-11-11, Chain Rule and Backpropagation
  - 1111-11-11, Convex Optimization
  - 1111-11-11, Gradient Descent Methods
- Probability and Statistics
  - 1111-11-11, Probability Distributions
  - 1111-11-11, Maximum Likelihood Estimation
  - 1111-11-11, Bayesian Inference
  - 1111-11-11, Statistical Testing
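To make the optimization topics above concrete, here is a minimal sketch of plain gradient descent on a one-dimensional convex function. The objective f(x) = (x - 3)^2, the learning rate, and the step count are all illustrative assumptions.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

def grad_f(x):
    """Gradient of the assumed objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)

x_min = gradient_descent(grad_f, x0=0.0)
print(f"estimated minimizer: {x_min:.6f}")
```

Each step shrinks the distance to the minimizer at x = 3 by a factor of 1 - 2 * learning_rate = 0.8, so the iterate converges geometrically. A learning rate above 1.0 would make this particular iteration diverge.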
3.2 ML Fundamentals
- Introduction to Machine Learning
  - 1111-11-11, What is Machine Learning?
  - 1111-11-11, Types of Learning
    - Supervised Learning
    - Unsupervised Learning
    - Semi-supervised Learning
    - Reinforcement Learning
  - 1111-11-11, Bias-Variance Tradeoff ⭐
  - 1111-11-11, Overfitting and Underfitting
  - 1111-11-11, No Free Lunch Theorem
- Data Preprocessing
  - 1111-11-11, Data Cleaning
    - Missing Value Handling
    - Outlier Detection
    - Data Quality Assessment
  - 1111-11-11, Feature Scaling
    - Standardization
    - Normalization
    - Robust Scaling
  - 1111-11-11, Encoding Categorical Variables
    - One-Hot Encoding
    - Label Encoding
    - Target Encoding
    - Embeddings
- Feature Engineering ⭐
  - 1111-11-11, Feature Creation
    - Polynomial Features
    - Interaction Features
    - Domain-specific Features
  - 1111-11-11, Feature Selection
    - Filter Methods
    - Wrapper Methods
    - Embedded Methods
  - 1111-11-11, Dimensionality Reduction
    - Principal Component Analysis
    - Linear Discriminant Analysis
    - t-SNE and UMAP
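The three feature-scaling approaches listed under Data Preprocessing can be compared on a tiny example. This is a standard-library sketch with made-up data containing one outlier. The quartiles use the "inclusive" method of `statistics.quantiles`, which is one of several quantile conventions.

```python
from statistics import fmean, median, pstdev, quantiles

def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu, sigma = fmean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def min_max(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def robust_scale(values):
    """Subtract the median and divide by the interquartile range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    med = median(values)
    return [(v - med) / (q3 - q1) for v in values]

# Hypothetical feature with a single large outlier.
# Constant features would divide by zero and need separate handling.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

print([round(v, 3) for v in standardize(data)])
print([round(v, 3) for v in min_max(data)])
print([round(v, 3) for v in robust_scale(data)])
```

With min-max scaling the outlier squeezes the four ordinary values into the bottom 3% of the range. Robust scaling keeps them spread between -1 and 0.5 because the median and IQR are barely affected by the outlier.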
4 Supervised Learning
4.1 Regression
- Regression Analysis
  - 1111-11-11, Linear Regression ⭐
    - Ordinary Least Squares
    - Geometric Interpretation
    - Assumptions and Diagnostics
    - Coefficient Interpretation
  - 1111-11-11, Regularization ⭐
    - Ridge Regression (L2 regularization)
    - Lasso Regression (L1 regularization)
    - Elastic Net
    - Regularization Path
  - 1111-11-11, Polynomial Regression
  - 1111-11-11, Generalized Linear Models
    - Logistic Regression
    - Poisson Regression
    - Link Functions
4.2 Classification
- Classification Methods
  - 1111-11-11, Logistic Regression ⭐
    - Binary Classification
    - Multinomial Logistic Regression
    - Odds Ratio Interpretation
  - 1111-11-11, Naive Bayes
    - Gaussian Naive Bayes
    - Multinomial Naive Bayes
    - Bernoulli Naive Bayes
  - 1111-11-11, K-Nearest Neighbors
    - Distance Metrics
    - Curse of Dimensionality
    - Weighted KNN
  - 1111-11-11, Support Vector Machines ⭐
    - Linear SVM
    - Kernel Trick
    - Soft Margin
    - Multi-class SVM
  - 1111-11-11, Decision Trees ⭐
    - Splitting Criteria
      - Gini Impurity
      - Information Gain
      - Gain Ratio
    - Pruning
    - CART Algorithm
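The decision-tree splitting criteria listed above (Gini impurity and information gain) reduce to a few lines of arithmetic. The sketch below uses made-up labels and one candidate binary split; the helper names are illustrative.

```python
from collections import Counter
from math import log2

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

# Hypothetical node with 5 "yes" and 5 "no", split into two children.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

print(f"gini(parent)={gini(parent):.3f}  gini(left)={gini(left):.3f}")
print(f"information gain={information_gain(parent, left, right):.4f}")
```

A perfectly mixed binary node has Gini impurity 0.5 and entropy 1 bit. The split here lowers both, and a tree learner would prefer whichever candidate split lowers its chosen criterion the most.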
4.3 Ensemble Methods
- Ensemble Learning ⭐
  - 1111-11-11, Bagging Methods
    - Random Forest ⭐
      - Bootstrap Aggregating
      - Feature Randomness
      - Out-of-Bag (OOB) Error
      - Feature Importance
    - Extra Trees (Extremely Randomized Trees)
  - 1111-11-11, Boosting Methods ⭐
    - AdaBoost (Adaptive Boosting)
      - Weak Learner Weighting
      - Sample Reweighting
    - Gradient Boosting ⭐
      - Gradient Boosting Machines (GBM)
      - Loss Function Optimization
      - Learning Rate and Number of Trees
    - XGBoost ⭐
      - Regularization Terms
      - System Optimization
      - Handling Missing Values
    - LightGBM ⭐
      - Histogram-based Algorithm
      - Leaf-wise Growth
      - Categorical Feature Support
    - CatBoost
      - Ordered Boosting
      - Native Categorical Handling
  - 1111-11-11, Stacking
    - Meta-learner
    - Blending
  - 1111-11-11, Voting Classifiers
    - Hard Voting
    - Soft Voting
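Hard and soft voting, listed above, can disagree on the same set of model outputs. The following sketch uses three hypothetical classifiers and invented probabilities to show one such case.

```python
from collections import Counter

def hard_vote(predictions):
    """Majority vote over the class labels predicted by each model."""
    return Counter(predictions).most_common(1)[0][0]

def soft_vote(probabilities):
    """Average per-class probabilities across models and pick the largest."""
    classes = probabilities[0].keys()
    averaged = {
        c: sum(p[c] for p in probabilities) / len(probabilities) for c in classes
    }
    return max(averaged, key=averaged.get)

# Hypothetical class probabilities from three models for one sample.
probs = [
    {"A": 0.55, "B": 0.45},
    {"A": 0.60, "B": 0.40},
    {"A": 0.10, "B": 0.90},
]
labels = [max(p, key=p.get) for p in probs]  # each model's top class

print("hard vote:", hard_vote(labels))
print("soft vote:", soft_vote(probs))
```

Two models lean weakly toward A while one is very confident in B, so the majority vote picks A but the averaged probabilities favor B. Soft voting only makes sense when the models output reasonably calibrated probabilities.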
5 Unsupervised Learning
5.1 Clustering
- Clustering Methods
  - 1111-11-11, K-Means Clustering ⭐
    - Lloyd’s Algorithm
    - Elbow Method
    - K-Means++ Initialization
    - Mini-Batch K-Means
  - 1111-11-11, Hierarchical Clustering
    - Agglomerative Clustering
    - Divisive Clustering
    - Linkage Methods
    - Dendrogram
  - 1111-11-11, DBSCAN (density-based clustering)
    - Density-Based Clustering
    - Core Points and Border Points
    - Handling Noise
  - 1111-11-11, Gaussian Mixture Models
    - Expectation-Maximization (EM algorithm)
    - Soft Clustering
    - Model Selection
  - 1111-11-11, Other Clustering Methods
    - OPTICS
    - Mean Shift
    - Spectral Clustering
5.2 Dimensionality Reduction
- Dimensionality Reduction Methods
  - 1111-11-11, Principal Component Analysis ⭐
    - Eigenvalue Decomposition
    - Variance Explained
    - Scree Plot
    - Kernel PCA
  - 1111-11-11, Linear Discriminant Analysis
    - Fisher’s Linear Discriminant
    - Between-class vs. Within-class Variance
  - 1111-11-11, t-SNE (t-distributed Stochastic Neighbor Embedding)
    - Perplexity Parameter
    - KL Divergence Optimization
    - Visualization Considerations
  - 1111-11-11, UMAP (Uniform Manifold Approximation and Projection)
    - Topological Data Analysis
    - Comparison with t-SNE
  - 1111-11-11, Autoencoders
    - Vanilla Autoencoder
    - Denoising Autoencoder
    - Variational Autoencoder (VAE)
5.3 Anomaly Detection
- Anomaly Detection Methods
  - 1111-11-11, Statistical Methods
    - Z-score Method
    - Interquartile Range (IQR)
    - Mahalanobis Distance
  - 1111-11-11, Isolation Forest
    - Random Partitioning
    - Anomaly Score
  - 1111-11-11, One-Class SVM
  - 1111-11-11, Local Outlier Factor (LOF)
    - Local Density Estimation
5.4 Association Rule Learning
- Association Rules
  - 1111-11-11, Apriori Algorithm
    - Support, Confidence, Lift
    - Frequent Itemsets
  - 1111-11-11, FP-Growth
  - 1111-11-11, Eclat Algorithm
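Support, confidence, and lift, the three rule measures named above, can be computed directly from a list of transactions. The basket data below is invented for illustration, and this sketch does not implement the Apriori candidate search itself.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    """Confidence relative to the consequent's baseline support."""
    return confidence(antecedent, consequent) / support(consequent)

rule = ({"bread"}, {"milk"})
print(f"support={support(rule[0] | rule[1]):.2f}")
print(f"confidence={confidence(*rule):.2f}")
print(f"lift={lift(*rule):.4f}")
```

A lift below 1, as in this example, means bread buyers are slightly less likely than average to buy milk, even though the rule's confidence of 0.75 looks high in isolation.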
6 Model Evaluation and Selection
6.1 Performance Metrics
- Evaluation Metrics ⭐
  - 1111-11-11, Classification Metrics
    - Confusion Matrix
    - Accuracy, Precision, Recall
    - F1-Score and F-beta Score
    - ROC Curve and AUC
    - Precision-Recall Curve
    - Matthews Correlation Coefficient (MCC)
    - Cohen’s Kappa
  - 1111-11-11, Regression Metrics
    - Mean Squared Error (MSE)
    - Root Mean Squared Error (RMSE)
    - Mean Absolute Error (MAE)
    - R-squared and Adjusted R-squared
    - Mean Absolute Percentage Error (MAPE)
  - 1111-11-11, Ranking Metrics
    - Mean Average Precision (MAP)
    - Normalized Discounted Cumulative Gain (NDCG)
  - 1111-11-11, Clustering Metrics
    - Silhouette Score
    - Davies-Bouldin Index
    - Calinski-Harabasz Index
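Several of the classification metrics listed above follow directly from the four confusion-matrix counts. The sketch below uses invented binary labels; `confusion_counts` and `binary_metrics` are illustrative helper names, and the code assumes none of the denominators is zero.

```python
from math import sqrt

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1, and MCC (assumes non-zero denominators)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Hypothetical labels: 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```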
6.2 Model Validation
- Validation Strategies ⭐
  - 1111-11-11, Train-Test Split
  - 1111-11-11, Cross-Validation ⭐
    - K-Fold Cross-Validation
    - Stratified K-Fold
    - Leave-One-Out (LOOCV)
    - Time Series Split
  - 1111-11-11, Bootstrap Methods
    - Bootstrap Sampling
    - Out-of-Bag Validation
    - .632 Bootstrap
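The index bookkeeping behind K-fold cross-validation, listed above, is short enough to write out. This is a sketch without shuffling or stratification; `k_fold_indices` is an illustrative name rather than a library function.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    When n_samples is not divisible by k, the first n_samples % k folds
    get one extra sample.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop

folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print(f"train={train} test={test}")
```

Every sample lands in exactly one test fold. For ordered data such as time series, contiguous folds like these leak future information into training, which is why a dedicated time-series split is listed as a separate strategy.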
6.3 Hyperparameter Tuning
- Hyperparameter Optimization ⭐
  - 1111-11-11, Grid Search
  - 1111-11-11, Random Search
  - 1111-11-11, Bayesian Optimization
    - Gaussian Process
    - Acquisition Functions
  - 1111-11-11, Hyperband and ASHA
  - 1111-11-11, Automated Machine Learning (AutoML)
    - Auto-sklearn
    - TPOT
    - H2O AutoML
6.4 Model Selection
- Model Selection Strategies
  - 1111-11-11, Information Criteria
    - AIC (Akaike Information Criterion)
    - BIC (Bayesian Information Criterion)
    - MDL (Minimum Description Length)
  - 1111-11-11, Model Comparison
    - Statistical Tests for Model Comparison
    - McNemar’s Test
    - Paired t-test
  - 1111-11-11, Baseline Models
7 Advanced Topics
7.1 Imbalanced Learning
- Handling Imbalanced Data ⭐
  - 1111-11-11, Resampling Techniques
    - Oversampling
      - Random Oversampling
      - SMOTE (Synthetic Minority Over-sampling)
      - ADASYN
    - Undersampling
      - Random Undersampling
      - Tomek Links
      - NearMiss
    - Combined Methods
  - 1111-11-11, Cost-Sensitive Learning
    - Class Weights
    - Cost Matrix
  - 1111-11-11, Ensemble Methods for Imbalanced Data
    - Balanced Random Forest
    - EasyEnsemble
    - RUSBoost
7.2 Probabilistic Models
- Probabilistic Machine Learning
  - 1111-11-11, Bayesian Methods
    - Bayesian Linear Regression
    - Bayesian Logistic Regression
    - Prior Selection
  - 1111-11-11, Gaussian Processes ⭐
    - Kernel Functions
    - Predictive Distribution
    - Uncertainty Quantification
  - 1111-11-11, Hidden Markov Models
    - Forward-Backward Algorithm
    - Viterbi Algorithm
  - 1111-11-11, Probabilistic Graphical Models
    - Bayesian Networks
    - Markov Random Fields
7.3 Online Learning
- Online and Incremental Learning
  - 1111-11-11, Stochastic Gradient Descent
  - 1111-11-11, Online Learning Algorithms
    - Perceptron
    - Passive-Aggressive Algorithms
    - Online Gradient Descent
  - 1111-11-11, Concept Drift Detection
    - ADWIN
    - DDM (Drift Detection Method)
  - 1111-11-11, Streaming Algorithms
7.4 Semi-Supervised Learning
- Semi-Supervised Methods
  - 1111-11-11, Self-Training
  - 1111-11-11, Co-Training
  - 1111-11-11, Label Propagation
  - 1111-11-11, Pseudo-Labeling
7.5 Transfer Learning
- Transfer Learning Methods
  - 1111-11-11, Domain Adaptation
  - 1111-11-11, Fine-tuning
  - 1111-11-11, Multi-task Learning
7.6 Interpretability and Explainability
- Model Interpretability ⭐
  - 1111-11-11, Feature Importance
    - Permutation Importance
    - Drop-Column Importance
  - 1111-11-11, Partial Dependence Plots
  - 1111-11-11, SHAP (SHapley Additive exPlanations) ⭐
    - Shapley Values
    - TreeSHAP
    - KernelSHAP
  - 1111-11-11, LIME (Local Interpretable Model-agnostic Explanations)
  - 1111-11-11, Anchors and Counterfactuals
7.7 Fairness and Bias
- Fairness in Machine Learning
  - 1111-11-11, Bias Detection
    - Demographic Parity
    - Equal Opportunity
    - Disparate Impact
  - 1111-11-11, Bias Mitigation
    - Pre-processing Methods
    - In-processing Methods
    - Post-processing Methods
  - 1111-11-11, Fairness Metrics
8 ML Engineering
8.1 Production ML
- ML in Production
  - 1111-11-11, Model Serialization
    - Pickle and Joblib
    - ONNX
    - PMML
  - 1111-11-11, Model Serving
    - REST API
    - gRPC
    - Batch Prediction
  - 1111-11-11, Model Monitoring ⭐
    - Performance Monitoring
    - Data Drift Detection
    - Model Drift Detection
    - Feature Drift
  - 1111-11-11, Model Versioning
  - 1111-11-11, A/B Testing for Models
8.2 Scalability
- Scaling Machine Learning
  - 1111-11-11, Distributed Training
    - Data Parallelism
    - Model Parallelism
  - 1111-11-11, Large-scale ML Frameworks
    - Apache Spark MLlib
    - Dask-ML
    - Ray
  - 1111-11-11, Feature Stores
  - 1111-11-11, Model Compression
    - Quantization
    - Pruning
    - Knowledge Distillation
9 Special Applications
9.1 ML for Longitudinal Data
- 2026-03-07, ML Overview: RSF, XGBoost, HMM, Lasso
- 2026-03-08, Random Survival Forest (RSF)
- 2026-03-08, XGBoost + Temporal Feature Engineering
- 2026-03-08, Hidden Markov Model (HMM) for Longitudinal Data
- 2026-03-08, Lasso / Elastic Net / glmmLasso: Variable Selection for Longitudinal Data
- Statistical foundations (LMM, GLMM, GEE): Statistics - Mixed Models & Longitudinal Analysis
- Deep learning and reinforcement learning approaches: Deep Learning - DL/RL for Longitudinal Data
9.2 Time Series Analysis
- Time Series Methods
  - 1111-11-11, Classical Methods
    - ARIMA Models
    - Exponential Smoothing
    - Seasonal Decomposition
  - 1111-11-11, ML for Time Series
    - Feature Engineering for Time Series
    - Lagged Features
    - Rolling Statistics
  - 1111-11-11, Forecasting
    - Prophet
    - LightGBM for Time Series
    - XGBoost for Time Series
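Lagged features and rolling statistics, listed above under ML for Time Series, turn a raw series into tabular inputs. The sketch below uses an invented sales series and plain lists; `lagged` and `rolling_mean` are illustrative helpers, and positions without enough history are filled with `None`.

```python
def lagged(series, lag):
    """Shift the series back by `lag` steps (lag >= 1); early rows get None."""
    return [None] * lag + series[:-lag]

def rolling_mean(series, window):
    """Mean over the current value and the previous `window - 1` values."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            chunk = series[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out

# Hypothetical daily sales.
sales = [10, 12, 13, 15, 18, 21]

lag1 = lagged(sales, 1)
roll3 = rolling_mean(sales, 3)
# To predict sales[t] without leakage, compute the rolling mean on the
# lagged series so that it only uses values up to t - 1.
roll3_safe = rolling_mean([0 if v is None else v for v in lag1], 3)[3:]

print("lag1: ", lag1)
print("roll3:", roll3)
```

The `roll3` column includes the current value, so using it directly as a feature for predicting that same value would leak the target. Applying the rolling window to the lagged series avoids this.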
9.3 Natural Language Processing
- 2026-03-27, Estimating the Number of Training Samples for Fine-tuning: A KoBERT 14-class Classification Case Study
- 2026-04-15, Choosing Train/Val/Test Split Ratios: A Practical Guide by Dataset Size
- Classical NLP with ML
  - 1111-11-11, Text Preprocessing
    - Tokenization
    - Stemming and Lemmatization
    - Stop Words Removal
  - 1111-11-11, Text Representation
    - Bag of Words
    - TF-IDF
    - Word Embeddings
      - Word2Vec
      - GloVe
      - FastText
  - 1111-11-11, Text Classification
    - Naive Bayes for Text
    - SVM for Text
  - 1111-11-11, Topic Modeling
    - Latent Dirichlet Allocation (LDA)
    - Non-negative Matrix Factorization (NMF)
9.4 Computer Vision
- Classical Computer Vision with ML
  - 1111-11-11, Image Features
    - HOG (Histogram of Oriented Gradients)
    - SIFT (Scale-Invariant Feature Transform)
    - SURF
  - 1111-11-11, Image Classification
    - SVM for Images
    - Random Forest for Images
  - 1111-11-11, Object Detection Basics
    - Sliding Window
    - Image Pyramids
9.5 Recommender Systems
- Recommendation Algorithms
  - 1111-11-11, Collaborative Filtering
    - User-based CF
    - Item-based CF
    - Matrix Factorization
      - SVD (Singular Value Decomposition)
      - ALS (Alternating Least Squares)
  - 1111-11-11, Content-Based Filtering
  - 1111-11-11, Hybrid Methods
  - 1111-11-11, Evaluation Metrics
    - Precision@K and Recall@K
    - MAP and NDCG
    - Coverage and Diversity
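Precision@K and Recall@K, listed above under Evaluation Metrics, compare the top of a ranked recommendation list against the set of items a user actually found relevant. The item IDs below are invented for illustration.

```python
def precision_at_k(recommended, relevant, k):
    """Share of the top-k recommendations that are relevant."""
    hits = sum(item in relevant for item in recommended[:k])
    return hits / k

def recall_at_k(recommended, relevant, k):
    """Share of all relevant items that appear in the top-k recommendations."""
    hits = sum(item in relevant for item in recommended[:k])
    return hits / len(relevant)

# Hypothetical ranked list for one user and that user's relevant items.
recommended = ["a", "b", "c", "d", "e"]
relevant = {"a", "c", "f"}

for k in (1, 3, 5):
    p = precision_at_k(recommended, relevant, k)
    r = recall_at_k(recommended, relevant, k)
    print(f"k={k}: precision={p:.3f} recall={r:.3f}")
```

Precision divides by K and recall by the number of relevant items, so growing K past the last hit lowers precision while recall stays flat. Item "f" is never recommended, which caps recall at 2/3 in this example.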
10 Key Resources
10.1 Books
- Foundations:
  - Bishop (2006). “Pattern Recognition and Machine Learning”
  - Murphy (2012). “Machine Learning: A Probabilistic Perspective”
  - Hastie, Tibshirani, Friedman (2009). “The Elements of Statistical Learning” ⭐
- Practical:
  - Géron (2019). “Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow”
  - Kuhn and Johnson (2013). “Applied Predictive Modeling”
- Advanced:
  - Raschka and Mirjalili (2019). “Python Machine Learning”
  - Chollet (2021). “Deep Learning with Python”
10.2 Online Courses
- Andrew Ng - Machine Learning (Coursera) ⭐
- Fast.ai - Practical Deep Learning for Coders
- Stanford CS229 - Machine Learning
10.3 Papers
- Random Forests: Breiman (2001). “Random Forests”
- XGBoost: Chen and Guestrin (2016). “XGBoost: A Scalable Tree Boosting System”
- SHAP: Lundberg and Lee (2017). “A Unified Approach to Interpreting Model Predictions”