Elements Reference - Deployed Models by HTML5 UP

Predicting patients with Brain stroke
using the Logistic Regression classification Algorithm.

Data Source: Kaggle

Libraries and Packages: Scikit-learn, Python-Numpy, Pandas

Outline

Understanding the objective

Methodology

1. Data Setup

2. Data Cleaning

3. Defining a feature matrix

4. Building the model

5. Model Evaluation

6. Saving and Loading

7. Summary

Objective

Patients and medical attendants face a lot of uncertainity during, and after hospital encounters. Uncertainities like the possibilitty of a patient being diagnosed of an ailment after measuring certain vital signs. Machine learning models are capable of predicting a patient's outcome with up to a 90-95% accuracy. In this project, we will be building a model to predict and classify patients with the possibility of Brain stroke using the Logistic Regression Algorithm.

Features

Categorical features

Gender
Age
Marital status(Ever married)
Residence type
smoking status
Work type

Numerical features

Hypertension
heart_disease
avg_glucose_level
bmi

Prediction Variables

Age

Hypertension

heart disease

avg glucose level

bmi- body mass index

Output Variables

The target shows the diagnosis of brain stroke: Displays whether the individual is suffering from brain stroke or not :
0 = absence
1 = present The score is being used as a conditional statement for classification as viewed in the deployed model section.

1.Data Setup

Pip installing the dependencies.
Loading the dataset and required libraries

2. Data Cleaning

The data was inspected for null values to remove abnormality. However, the dataset showed no missing values.

3. Defining a feature matrix

Data was split into x and y to separate features to be trained from the response vector y

The data was then split into training and testing.

4. Building the model

Check link

The Logistic Regression was imported from scikitlearn and fit into the split data for training. Predict the Target variable

5. Model Evaluation

Model performance and Accuracy measurements was evalated using a number of performance metrics as shown below

The model shows a classification accuracy score of 95%
Comparing the Actual and Predicted response value for the first 30 predictions

Confusion matrix

Using the confusion matrix to calculate the classification metrics

Printing a classification report

The Area under the curve (AUC) score shows 85% accuracy. This is ususally adopted when the class is imbalanced.

6. Saving and Loading the Model

The trained model was saved and loaded with the pickle library for deployment.

7. Summary of results

A perfect model has an f1-score of 1.00. The model gave an f1-score of 0.98, perfect accuracy is equal to 1.0, we recorded an accuracy of 0.95 and AUC score of 85%. The actual and predicted value in comparison showed that the model is an excellent one for the classification is ready for deployment.

scroll to top

Data Source: Kaggle

Libraries and Packages: Scikit-learn, Python-Numpy, Pandas

Outline

Understanding the objective

Methodology

1. Data Setup

2. Data Cleaning

3. Defining a feature matrix

4. Building the model

5. Model Evaluation

6. Saving and Loading

7. Summary

Objective

Features

Categorical features

Numerical features

Prediction Variables

Output Variables

1.Data Setup

2. Data Cleaning

3. Defining a feature matrix

4. Building the model

5. Model Evaluation

6. Saving and Loading the Model

7. Summary of results

Social