Semi-Supervised Support Vector Machines (S3VM)
Python implementation of semi-supervised SVM algorithms for binary classification using both labeled and unlabeled data.
Algorithms
- NewtonUTSVM: Newton-based method using two parallel planes
- S3VM Constrained: Constrained formulation solved as a mixed-integer program (MIP)
- S3VM Unconstrained: Unconstrained smooth optimization
- MC_NDCC: Synthetic dataset generator for the NDCC datasets (a utility, not a classifier)
- SRMSVM: 1-norm SVM for sparse solutions
Requirements
- Python 3.6 or later (the example below uses f-strings)
- numpy
- pandas
- scipy
- scikit-learn
Installation
pip install numpy pandas scipy scikit-learn
Usage
Run Experiments
cd code
python main.py
Basic Example
from NewtonUTSVM import NewtonUTSVM
from utils import load_dataset, calculate_accuracy
# Load dataset
X, y, x_test, y_test, U = load_dataset("data/diabetes.csv")
y = y.reshape(-1, 1)  # labels as a column vector of shape (n_samples, 1)
# Train (NewtonUTSVM expects six C parameters)
C = [0.1, 0.1, 0.1, 0.1, 0.3, 0.3]
model = NewtonUTSVM(X, y, U, C=C, eps=1e-4)
model.fit()
# Predict
model.predict(x_test=x_test)
predictions = model.get_preds()
accuracy = calculate_accuracy(y_test, predictions)
print(f"Accuracy: {accuracy}%")
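The example above reports accuracy as a percentage. For readers who want to see what such a metric computes, here is a minimal sketch. The function name `accuracy_percent` is hypothetical and is not the repository's `calculate_accuracy`, whose exact behavior is not documented here. Both inputs are flattened first, because comparing a column vector of shape (n, 1) with a flat array of shape (n,) would otherwise broadcast to an (n, n) matrix.

```python
import numpy as np

def accuracy_percent(y_true, y_pred):
    """Return the percentage of predictions that match the true labels.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    # Flatten so that (n, 1) and (n,) inputs compare element by element.
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must contain the same number of labels")
    return 100.0 * float(np.mean(y_true == y_pred))

y_true = np.array([[1], [-1], [1], [-1]])  # column vector, as in the example above
y_pred = np.array([1, -1, -1, -1])         # flat array of predictions
print(f"Accuracy: {accuracy_percent(y_true, y_pred)}%")  # prints "Accuracy: 75.0%"
```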
Datasets
Real-world: diabetes, ionosphere, musk, sonar, gender, wpbc
Synthetic: NDCC datasets (100_10, 100_100, 500_10, 1000_10)
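Semi-supervised training needs each dataset divided into a labeled set, an unlabeled set, and a held-out test set, matching the five arrays returned in the basic example (`X, y, x_test, y_test, U`). The sketch below shows one way such a split could be made. The function `split_semi_supervised`, its parameters, and the default fractions are assumptions for illustration; the repository's `load_dataset` may split the data differently.

```python
import numpy as np

def split_semi_supervised(features, labels, labeled_frac=0.3, test_frac=0.3, seed=0):
    """Split data into labeled, test, and unlabeled parts.

    Hypothetical helper: names and default fractions are illustrative only.
    """
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    order = rng.permutation(n)  # shuffle row indices reproducibly

    n_test = int(round(test_frac * n))
    n_labeled = int(round(labeled_frac * n))

    test_idx = order[:n_test]
    labeled_idx = order[n_test:n_test + n_labeled]
    unlabeled_idx = order[n_test + n_labeled:]  # labels of these rows are discarded

    X = features[labeled_idx]
    y = labels[labeled_idx].reshape(-1, 1)  # column vector, as in the basic example
    x_test = features[test_idx]
    y_test = labels[test_idx]
    U = features[unlabeled_idx]
    return X, y, x_test, y_test, U

# Toy data: 20 samples, 2 features, alternating +1 / -1 labels.
features = np.arange(40, dtype=float).reshape(20, 2)
labels = np.where(np.arange(20) % 2 == 0, 1, -1)
X, y, x_test, y_test, U = split_semi_supervised(features, labels)
print(X.shape, y.shape, x_test.shape, U.shape)
```

Fixing the seed keeps the split reproducible across runs, which matters when comparing several algorithms on the same data.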
Project Structure
code/
├── main.py # Main script
├── NewtonUTSVM.py # NewtonUTSVM
├── S3VM_constrained.py # Constrained S3VM
├── S3VM_unconstrained.py # Unconstrained S3VM
├── SRMSVM.py # SRMSVM
├── MC_NDCC.py # Dataset generator
├── utils.py # Utilities
└── data/ # Datasets
Notes
- Binary classification only (labels: +1, -1)
- Datasets are automatically normalized
- NewtonUTSVM requires 6 C parameters
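The first two notes can be illustrated with a short preprocessing sketch for readers preparing their own data. It maps arbitrary binary labels to +1 and -1 and rescales features. Min-max scaling is an assumption here, since the notes do not say which normalization the repository applies, and both function names are hypothetical.

```python
import numpy as np

def to_signed_labels(labels, positive_class):
    """Map binary labels to +1 (positive_class) and -1 (everything else)."""
    labels = np.asarray(labels)
    return np.where(labels == positive_class, 1, -1)

def min_max_normalize(train, *others):
    """Scale each feature to [0, 1] using statistics from `train` only.

    Assumed scheme for illustration; the repository may normalize differently.
    """
    train = np.asarray(train, dtype=float)
    lo = train.min(axis=0)
    span = train.max(axis=0) - lo
    span[span == 0] = 1.0  # constant features: avoid division by zero
    return tuple((np.asarray(a, dtype=float) - lo) / span for a in (train, *others))

y_raw = np.array([0, 1, 1, 0])
print(to_signed_labels(y_raw, positive_class=1))  # prints [-1  1  1 -1]

X_train = np.array([[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]])
X_other = np.array([[2.0, 10.0]])
X_train_n, X_other_n = min_max_normalize(X_train, X_other)
print(X_train_n[:, 0])  # prints [0.  0.5 1. ]
```

Computing the minimum and range on the training rows alone, then reusing them for the unlabeled and test rows, keeps all three sets on the same scale.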