editing readme.md
18
README.md
@@ -2,7 +2,7 @@
We are dealing with an extremely imbalanced dataset related to electrocardiogram signals, with two classes labeled good (0) and bad (1).
-### STEP 1: Fill missing values
+## STEP 1: Fill missing values
All the columns in our data contain missing values, in a range from 25 to 70. By using `from sklearn.impute import KNNImputer`
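The imputation step can be sketched roughly as follows. This is a minimal sketch, not the README's actual function: the helper name `impute_missing`, the `n_neighbors=2` setting, and the toy frame are assumptions made for illustration, and it assumes every column passed in is numeric.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer


def impute_missing(data_frame: pd.DataFrame, n_neighbors: int = 5) -> pd.DataFrame:
    """Fill NaNs using the mean of the nearest rows (hypothetical helper)."""
    imputer = KNNImputer(n_neighbors=n_neighbors)
    imputed = imputer.fit_transform(data_frame)
    # fit_transform returns a NumPy array, so restore column names and index.
    return pd.DataFrame(imputed, columns=data_frame.columns, index=data_frame.index)


# Toy frame standing in for the ECG feature table (values are made up).
toy = pd.DataFrame(
    {
        "f1": [1.0, 2.0, np.nan, 4.0],
        "f2": [10.0, np.nan, 30.0, 40.0],
    }
)
toy_imputed = impute_missing(toy, n_neighbors=2)
```

Observed values are left unchanged; only the missing cells are filled. The label column would normally be set aside before imputing so that it does not influence the neighbor search.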
@@ -16,7 +16,7 @@ We are dealing with an extremely imbalanced dataset related to electrocardiogram
return data_frame_imputed
```
-### STEP 2: Scaling
+## STEP 2: Scaling
We used `from sklearn.preprocessing import RobustScaler` to handle scaling.
@@ -28,7 +28,7 @@ We are dealing with an extremely imbalanced dataset related to electrocardiogram
data_frame_scaled["label"] = labels.values
```
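For readers who only see the tail of the scaling block in this diff, the whole step might look something like the sketch below. The helper name `scale_features` and the toy data are assumptions; only `RobustScaler` and the `data_frame_scaled["label"] = labels.values` line come from the README.

```python
import pandas as pd
from sklearn.preprocessing import RobustScaler


def scale_features(data_frame: pd.DataFrame, label_col: str = "label") -> pd.DataFrame:
    """Scale every feature column and re-attach the label (hypothetical helper)."""
    labels = data_frame[label_col]
    features = data_frame.drop(columns=[label_col])
    # RobustScaler centers on the median and divides by the interquartile range,
    # so a few extreme values do not dominate the scale.
    scaler = RobustScaler()
    scaled = scaler.fit_transform(features)
    data_frame_scaled = pd.DataFrame(scaled, columns=features.columns, index=features.index)
    data_frame_scaled[label_col] = labels.values
    return data_frame_scaled


# Toy data with one outlier (values are made up).
toy = pd.DataFrame({"f1": [1.0, 2.0, 3.0, 4.0, 100.0], "label": [0, 0, 0, 1, 1]})
toy_scaled = scale_features(toy)
```

The median of each scaled feature is 0, and the outlier stays large instead of compressing the other values, which is the usual reason to prefer `RobustScaler` over min-max scaling on noisy signals.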
-### STEP 3: k-fold cross validation + stratify classes + balancing training data
+## STEP 3: k-fold cross validation + stratify classes + balancing training data
First, we split the dataset into two parts: train (85%) and test (15%). To make sure the majority and minority classes
are distributed proportionally in both parts, we passed `stratify=y`
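A rough sketch of this split, followed by stratified k-fold on the training portion only, is shown here. The synthetic data, `n_splits=5`, and the `random_state` values are assumptions for illustration; the 85/15 ratio and `stratify=y` are taken from the text above.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

# Synthetic stand-in for the ECG features: 180 good (0) and 20 bad (1) signals.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = np.array([0] * 180 + [1] * 20)

# stratify=y keeps the 9:1 class ratio in both the train and the test part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=42
)

# Cross-validation is applied to the training part only; the test part stays untouched.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
positives_per_fold = []
for train_idx, val_idx in skf.split(X_train, y_train):
    X_fold_train, y_fold_train = X_train[train_idx], y_train[train_idx]
    X_fold_val, y_fold_val = X_train[val_idx], y_train[val_idx]
    # Any balancing (oversampling) would be applied to X_fold_train only,
    # never to the validation fold or the test set.
    positives_per_fold.append(int(y_fold_val.sum()))
```

Balancing inside the loop, after the fold split, keeps synthetic samples out of the validation data, so the validation scores reflect the real class distribution.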
@@ -79,16 +79,18 @@ We are dealing with an extremely imbalanced dataset related to electrocardiogram
model.fit(X_train, y_train)
```
-### STEP 4: Train different models to find the best possible approach
+## STEP 4: Train different models to find the best possible approach
-What we are looking for:
-Dangerous: Sick → predicted healthy : high recall score or low FN
-Costly: Healthy → predicted sick : high precision score or low FP
+#### What we are looking for:

+#### Dangerous: Sick → predicted healthy : high recall score or low FN

+#### Costly: Healthy → predicted sick : high precision score or low FP
+## next steps:
```
next steps:
✅ 1. Apply stratified K-fold to the training set only.
🗹 2. Train an LGBM model using KMEANS_SMOTE with k_neighbors=10
🗹 3. Train CatBoost using KMEANS_SMOTE with k_neighbors=10