Mar 2025 - Jun 2025

Employee Absenteeism Prediction Model

This project uses a logistic regression model to predict employee absenteeism, identifying high-risk individuals and uncovering key factors influencing absentee behavior. The model is packaged into a reusable Python module, with insights visualized in an interactive Power BI dashboard for HR teams.

Overview


This project builds a logistic regression model to address a common HR challenge: predicting and understanding absenteeism. The model has three main objectives:


  • Identify high-risk employees likely to take unscheduled leaves.

  • Predict future absenteeism probability based on employee attributes.

  • Uncover key factors that influence absenteeism behavior.


The model uses a range of employee features, including medical conditions, family size, and commute costs. The result? A reusable Python module that can be applied to new employee data, plus a Power BI dashboard for clear, interactive visualizations of predicted absenteeism probabilities.


Approach


Before jumping into the model-building phase, I focused on data preparation. I started by engineering a binary classification target to define absenteeism, then applied custom scaling to a selected set of the most relevant features. With the data prepped, I split it into training and test sets.
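
As an illustration of these preparation steps, here is a minimal sketch rather than the project's actual code. It assumes the binary target is "absence hours above the median", that only two numeric columns are standardized, and that a quarter of the rows are held out; the column names, sample values, cutoff rule, and helper functions are all hypothetical. It uses only the Python standard library to stay self-contained, although pandas and scikit-learn would be the usual tools.

```python
import random
import statistics

# Hypothetical records; column names and values are illustrative only.
records = [
    {"commute_cost": 118, "age": 33, "children": 2, "absence_hours": 4},
    {"commute_cost": 179, "age": 38, "children": 0, "absence_hours": 0},
    {"commute_cost": 279, "age": 43, "children": 1, "absence_hours": 8},
    {"commute_cost": 235, "age": 28, "children": 3, "absence_hours": 2},
    {"commute_cost": 361, "age": 50, "children": 1, "absence_hours": 40},
    {"commute_cost": 155, "age": 30, "children": 0, "absence_hours": 1},
    {"commute_cost": 260, "age": 41, "children": 2, "absence_hours": 3},
    {"commute_cost": 225, "age": 36, "children": 1, "absence_hours": 16},
]


def add_binary_target(rows, hours_key="absence_hours"):
    """Label a row 1 when its absence hours exceed the median, else 0 (assumed rule)."""
    cutoff = statistics.median(r[hours_key] for r in rows)
    labeled = [{**r, "excessive_absence": int(r[hours_key] > cutoff)} for r in rows]
    return labeled, cutoff


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows reproducibly and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(rows, columns):
    """Compute the mean and standard deviation of the chosen columns only."""
    stats = {}
    for col in columns:
        values = [r[col] for r in rows]
        stats[col] = (statistics.mean(values), statistics.pstdev(values))
    return stats


def apply_scaler(rows, stats):
    """Standardize the chosen columns; all other columns pass through unchanged."""
    scaled = []
    for r in rows:
        out = dict(r)
        for col, (mean, std) in stats.items():
            out[col] = (r[col] - mean) / std if std else 0.0
        scaled.append(out)
    return scaled


labeled, cutoff = add_binary_target(records)
train, test = train_test_split(labeled)

# Fit the scaler on the training rows, then reuse the same statistics on the test rows.
scaler_stats = fit_scaler(train, ["commute_cost", "age"])
train_scaled = apply_scaler(train, scaler_stats)
test_scaled = apply_scaler(test, scaler_stats)

print("cutoff hours:", cutoff)
print("train rows:", len(train_scaled), "test rows:", len(test_scaled))
```

In this sketch the scaler is fitted after the split so that test-set statistics do not influence training. That ordering is a design choice of the example and may differ from the order used in the project.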


Using logistic regression, I analyzed the feature weights and odds ratios to pinpoint the factors most strongly associated with absenteeism. I refined the model with statistical techniques such as backward elimination to improve its accuracy. After validating the model with performance metrics and a confusion matrix, I serialized it with Python's pickle module so it can be reloaded to make predictions on new data.
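
The link between weights, odds ratios, the confusion matrix, and serialization can be sketched as follows. This is an illustrative example and not the project's code: it fits a tiny logistic regression by plain gradient descent on made-up data, whereas a library such as scikit-learn would normally be used. The feature names, data, learning rate, and helper functions are assumptions, and backward elimination is not shown.

```python
import math
import pickle

FEATURES = ["medical_reason", "commute_cost_scaled"]  # hypothetical feature names

# Made-up training data: one row per employee, one label per row.
X = [[1, 0.5], [1, -0.2], [1, 1.1], [1, -0.9], [1, 0.4],
     [0, 0.3], [0, -1.0], [0, -0.6], [0, 1.2], [0, -0.4]]
y = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=5000):
    """Fit weights and an intercept by batch gradient descent on the log loss."""
    n, k = len(X), len(X[0])
    weights, bias = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, xi))) - yi
            grad_b += err
            for j in range(k):
                grad_w[j] += err * xi[j]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def predict_proba(model, rows):
    """Return the predicted probability of the positive class for each row."""
    return [
        sigmoid(model["bias"] + sum(w * v for w, v in zip(model["weights"], r)))
        for r in rows
    ]


def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


weights, bias = fit_logistic(X, y)
model = {"features": FEATURES, "weights": weights, "bias": bias}

# An odds ratio is exp(weight): the multiplicative change in the odds of
# absenteeism for a one-unit increase in that feature, other features held fixed.
odds_ratios = {name: math.exp(w) for name, w in zip(FEATURES, weights)}

# Evaluated on the training rows only to keep the sketch short;
# a held-out test set should be used in practice.
y_pred = [int(p >= 0.5) for p in predict_proba(model, X)]
matrix = confusion_matrix(y, y_pred)

# Serialize and restore the fitted parameters with pickle.
blob = pickle.dumps(model)
restored = pickle.loads(blob)

print("odds ratios:", odds_ratios)
print("confusion matrix [[TN, FP], [FN, TP]]:", matrix)
```

An odds ratio above 1 means the feature raises the odds of absenteeism, and one below 1 means it lowers them. Writing to disk works the same way with `pickle.dump` and `pickle.load` on a file opened in binary mode.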


Integration & Visualization


I built the model to be reusable across different datasets. I packaged it into a Python module that automates the entire preprocessing and prediction workflow for new employee records. To make the insights actionable, I created an intuitive Power BI dashboard that visualizes absenteeism patterns across attributes such as medical conditions, commute costs, and age groups.
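
One way such a module could be organized is sketched below. The class name `AbsenteeismModel`, its methods, the artifact layout, and every number in the example are hypothetical and are not taken from the project. The idea is that one object loads the pickled parameters, applies the same scaling used in training, and returns probabilities for new records.

```python
import math
import pickle


class AbsenteeismModel:
    """Hypothetical wrapper that loads saved artifacts, preprocesses, and predicts."""

    def __init__(self, artifact_bytes):
        artifacts = pickle.loads(artifact_bytes)
        self.features = artifacts["features"]
        self.weights = artifacts["weights"]
        self.bias = artifacts["bias"]
        self.scaler_stats = artifacts["scaler_stats"]  # {column: (mean, std)}

    def preprocess(self, record):
        """Order the features and standardize only the columns that were scaled in training."""
        row = []
        for name in self.features:
            value = float(record[name])
            if name in self.scaler_stats:
                mean, std = self.scaler_stats[name]
                value = (value - mean) / std
            row.append(value)
        return row

    def predict_proba(self, records):
        """Return the predicted absenteeism probability for each record."""
        probabilities = []
        for record in records:
            row = self.preprocess(record)
            z = self.bias + sum(w * v for w, v in zip(self.weights, row))
            probabilities.append(1.0 / (1.0 + math.exp(-z)))
        return probabilities

    def predict(self, records, threshold=0.5):
        """Return 1 for records at or above the probability threshold, else 0."""
        return [int(p >= threshold) for p in self.predict_proba(records)]


# Made-up artifacts standing in for a trained model saved earlier.
artifact_bytes = pickle.dumps({
    "features": ["medical_reason", "commute_cost", "children"],
    "weights": [2.0, 0.5, 0.3],
    "bias": -1.5,
    "scaler_stats": {"commute_cost": (220.0, 60.0)},
})

new_employees = [
    {"medical_reason": 1, "commute_cost": 280, "children": 2},
    {"medical_reason": 0, "commute_cost": 160, "children": 0},
]

model = AbsenteeismModel(artifact_bytes)
probabilities = model.predict_proba(new_employees)
labels = model.predict(new_employees)
print(list(zip(labels, probabilities)))
```

The probabilities returned this way could be written to a file or table for a dashboard to read. Pickled artifacts should only be loaded from a trusted source, since unpickling can execute arbitrary code.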





linkedin.com/in/shreeyasha-pandey/

© 2025 by Shreeyasha Pandey