editing readme.md

2025-11-30 23:45:59 +01:00
parent 3ffa2524a3
commit 036f107a59


@@ -2,7 +2,7 @@
We are dealing with an extremely imbalanced dataset of electrocardiogram signals with two classes, labeled good (0) and bad (1).
## STEP 1: Fill missing values
All columns in our data contain missing values, ranging from 25 to 70 per column. We fill them by using `from sklearn.impute import KNNImputer`:
@@ -16,7 +16,7 @@ We are dealing with an extremely imbalanced dataset of electrocardiogram signals
    return data_frame_imputed
```
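The snippet above only shows the tail of the imputation helper. A minimal sketch of what the full function could look like, assuming the helper name `impute_missing` and a default `n_neighbors` (both illustrative, not taken from the repo):

```python
import pandas as pd
from sklearn.impute import KNNImputer

def impute_missing(data_frame: pd.DataFrame, n_neighbors: int = 5) -> pd.DataFrame:
    # Each missing cell is filled from the n_neighbors most similar rows,
    # measured over the non-missing features.
    imputer = KNNImputer(n_neighbors=n_neighbors)
    data_frame_imputed = pd.DataFrame(
        imputer.fit_transform(data_frame),
        columns=data_frame.columns,
        index=data_frame.index,
    )
    return data_frame_imputed
```

If the label column is part of the frame, it would normally be dropped before imputing so the target does not influence the filled values.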
## STEP 2: Scaling
We used `from sklearn.preprocessing import RobustScaler` to handle scaling.
@@ -28,7 +28,7 @@ We are dealing with an extremely imbalanced dataset of electrocardiogram signals
data_frame_scaled["label"] = labels.values
```
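A sketch of how this scaling step could be wired up so that only the feature columns pass through the scaler and the label is re-attached afterwards (the `scale_features` name and the `label` column name are assumptions):

```python
import pandas as pd
from sklearn.preprocessing import RobustScaler

def scale_features(data_frame: pd.DataFrame, label_column: str = "label") -> pd.DataFrame:
    # Separate the label so only the signal features are scaled.
    labels = data_frame[label_column]
    features = data_frame.drop(columns=[label_column])

    # RobustScaler centers on the median and scales by the IQR, so extreme
    # ECG outliers influence the scaling less than with standard scaling.
    scaler = RobustScaler()
    data_frame_scaled = pd.DataFrame(
        scaler.fit_transform(features),
        columns=features.columns,
        index=features.index,
    )

    # Re-attach the untouched labels, mirroring the snippet above.
    data_frame_scaled[label_column] = labels.values
    return data_frame_scaled
```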
## STEP 3: k-fold cross validation + stratify classes + balancing training data
First of all, we split the dataset into two parts: train (85%) and test (15%). To make sure the majority and minority classes are distributed fairly across both splits, we passed `stratify=y` (a short sketch follows).
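Here is what that split could look like; the variable names and `random_state` are illustrative:

```python
from sklearn.model_selection import train_test_split

# Features and labels from the scaled frame; "label" matches the column used above.
X = data_frame_scaled.drop(columns=["label"])
y = data_frame_scaled["label"]

# 85% / 15% split; stratify=y keeps the good/bad class ratio identical in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=42
)
```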
@@ -79,16 +79,18 @@ We are dealing with an extremely imbalanced dataset of electrocardiogram signals
model.fit(X_train, y_train)
```
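The `model.fit` line above is the tail of the cross-validation loop. A hedged sketch of how the stratified folds and the oversampling could fit together, using `StratifiedKFold` plus `KMeansSMOTE` from imbalanced-learn and an LGBM classifier (fold count, `random_state`, and model hyperparameters are assumptions; `KMeansSMOTE` can be sensitive to how the minority class clusters):

```python
from lightgbm import LGBMClassifier
from imblearn.over_sampling import KMeansSMOTE
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import recall_score, precision_score

# Stratified K-fold is applied to the training split only,
# so the held-out 15% test set never leaks into the resampling.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

for fold, (train_idx, val_idx) in enumerate(skf.split(X_train, y_train)):
    X_tr, y_tr = X_train.iloc[train_idx], y_train.iloc[train_idx]
    X_val, y_val = X_train.iloc[val_idx], y_train.iloc[val_idx]

    # Oversample only the fold's training part; the validation part stays imbalanced.
    sampler = KMeansSMOTE(k_neighbors=10, random_state=42)
    X_bal, y_bal = sampler.fit_resample(X_tr, y_tr)

    model = LGBMClassifier(random_state=42)
    model.fit(X_bal, y_bal)

    preds = model.predict(X_val)
    print(f"fold {fold}: recall={recall_score(y_val, preds):.3f} "
          f"precision={precision_score(y_val, preds):.3f}")
```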
## STEP 4: Train different models to find the best possible approach
#### What we are looking for:
#### Dangerous: Sick → predicted healthy: high recall score (low FN)
#### Costly: Healthy → predicted sick: high precision score (low FP)
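A small sketch of how those two error costs map onto the metrics on the held-out test set (it assumes `model`, `X_test`, and `y_test` from the steps above, with 1 as the positive/bad class):

```python
from sklearn.metrics import confusion_matrix, recall_score, precision_score

y_pred = model.predict(X_test)

# For binary labels {0, 1}, ravel() returns tn, fp, fn, tp in that order.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

# Dangerous case (sick predicted healthy) = FN, so track recall = tp / (tp + fn).
print(f"FN={fn}  recall={recall_score(y_test, y_pred):.3f}")

# Costly case (healthy predicted sick) = FP, so track precision = tp / (tp + fp).
print(f"FP={fp}  precision={precision_score(y_test, y_pred):.3f}")
```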
## Next steps:
```
✅ 1. Stratified K-fold only applied on train.
🗹 2. Train LGBM model using KMEANS_SMOTE with k_neighbors=10.
🗹 3. Train Cat_boost using KMEANS_SMOTE with k_neighbors=10.