Hello, I'm Tan Bo Sheng
Data-driven Materials Engineering Undergraduate
Introduction
Hi, I'm Tan Bo Sheng, a Materials Engineering student at Nanyang Technological University (NTU), Singapore.
I'm passionate about exploring the intersection of engineering, data analytics, and technology to solve real-world challenges.
Achievements
Cybersecurity, Coursera | Google
View Credential
Google Business Intelligence, Coursera | Google
View Credential
Google IT Automation With Python, Coursera | Google
View Credential
Google Advanced Data Analytics, Coursera | Google
View Credential
Google AI Essentials, Coursera | Google
View Credential
Google Project Management, Coursera | Google
View Credential
Projects
Chatbot Database Design (Group Project)
As part of my Designing & Developing Databases module project, I tackled an open-ended question by proposing a chatbot concept designed to handle exceptional cases, such as the turbulence compensation scenarios described in our project brief.
The video showcases:
- The design process and key assumptions made to maximise customer satisfaction and experience.
- My approach to solving an abstract problem through structured analysis and creative thinking.
This project highlights my problem-solving skills, analytical thinking, and ability to design solutions for real-world cases.
Note: The original video snippet from the submission is used for authenticity purposes.
Credit Default Risk (Group Project)
Worked in a team of 5 from multiple backgrounds to build an end-to-end ML pipeline that predicts customer credit default. The project covered EDA, data cleaning, categorical encoding, scaling, class-imbalance handling, model selection with cross-validation, and justification of design choices. Technologies used include Python, Jupyter Notebook, pandas, NumPy, Matplotlib, Seaborn, Plotly (Express and Graph Objects), scikit-learn, XGBoost, imbalanced-learn, and KMeans clustering.
How It Was Done
- Performed data cleaning, handling missing values, encoding categorical variables, and scaling features.
- Conducted correlation analysis and visualization to identify key predictors.
- Addressed severe class imbalance using SMOTE oversampling, random undersampling, and class-weight adjustments.
- Trained and compared multiple models including Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, XGBoost, and an ensemble Voting Classifier.
- Optimized models using cross-validation and hyperparameter tuning with GridSearchCV (see the sketch after this list).
- Evaluated models using Accuracy, Precision, Recall, F1-score, and ROC-AUC, with additional explainability provided through SHAP values.
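A minimal sketch of the imbalance-handling and tuning steps above, assuming an imbalanced-learn pipeline wrapped around XGBoost; the file name, column names, and parameter grid are illustrative placeholders, not the original project code.

```python
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV, StratifiedKFold
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from xgboost import XGBClassifier

# Hypothetical dataset; assumes categorical features are already encoded
# and scaled as described in the steps above.
df = pd.read_csv("credit_default.csv")
X, y = df.drop(columns=["default"]), df["default"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# imblearn's Pipeline applies SMOTE only to the training folds during
# cross-validation, so synthetic samples never leak into the validation folds.
pipe = Pipeline([
    ("smote", SMOTE(random_state=42)),
    ("clf", XGBClassifier(eval_metric="logloss", random_state=42)),
])

param_grid = {
    "clf__max_depth": [3, 5, 7],
    "clf__n_estimators": [200, 400],
    "clf__learning_rate": [0.05, 0.1],
}

# Recall is the scoring target because missed defaulters are the costly error.
search = GridSearchCV(
    pipe,
    param_grid,
    scoring="recall",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
    n_jobs=-1,
)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)
```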
Result Achieved
- Baseline models achieved high accuracy (~92%) but failed to identify defaulters effectively.
- Class-weighted and SMOTE-based models significantly improved recall to around 65–71%, making the system far more effective for risk flagging.
- Ensemble approaches, particularly weighted XGBoost and Voting Classifier, offered the best balance between recall and precision.
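To illustrate the ensemble comparison, here is a minimal sketch of a class-weighted XGBoost model combined with other estimators in a soft Voting Classifier. It continues from the train/test split in the sketch above, and the estimator choices and weighting are illustrative assumptions rather than the tuned values from the project.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, roc_auc_score
from xgboost import XGBClassifier

# scale_pos_weight ~ (negatives / positives) is XGBoost's built-in
# class-weighting knob for imbalanced targets.
ratio = (y_train == 0).sum() / (y_train == 1).sum()
xgb = XGBClassifier(scale_pos_weight=ratio, eval_metric="logloss", random_state=42)

# Soft voting averages predicted probabilities across the three estimators.
voting = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(class_weight="balanced", max_iter=1000)),
        ("rf", RandomForestClassifier(class_weight="balanced", random_state=42)),
        ("xgb", xgb),
    ],
    voting="soft",
)
voting.fit(X_train, y_train)

y_pred = voting.predict(X_test)
print(classification_report(y_test, y_pred))
print("ROC-AUC:", roc_auc_score(y_test, voting.predict_proba(X_test)[:, 1]))
```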
Technologies Used
- Python, Jupyter Notebook
- pandas, NumPy, Matplotlib, Seaborn, Plotly
- scikit-learn, XGBoost, imbalanced-learn, SHAP