I'm Kai Shiun
I build explainable machine learning systems that drive real business decisions.

Testimonials
If Kai got a job offer elsewhere, I would give an excellent recommendation for a Data Science internship position, commend his understanding of statistics and data science techniques, commend his work ethic, commitment, and work culture.
Camilo Lagos
Head of Data Science
CONXAI Technologies GmbH

Since joining our team, Kai Shiun has been a vocal and outspoken individual, constantly operating with the team's best interest at heart. A critical thinker, he is unafraid to clarify doubts and probe for further details during team discussions. Regarding his assigned work, he is capable of taking the initiative to direct his own projects and return promptly with his assigned tasks.
Veronica Low
Founder & President
ASEAN Business Youth Association

From the outset, he displayed a high level of technical proficiency, independently developing robust data pipelines and feature engineering workflows to consolidate complex, multi-year SKU-level financial data... He approached challenges methodically, showing maturity in identifying data quality issues, proposing structured solutions, and ensuring reproducibility in his work.
Peter Condron
Group Technology Strategy, Architecture, & Governance Director
iNova Pharmaceuticals

Experiences
Graduation
Group Technology Data Scientist
Machine Learning Intern
Data Science Intern
Chief Operations Officer
Business Analyst Intern
Featured Projects

HDB Resale Price Prediction
An end-to-end data science pipeline to model resale flat prices using public and external datasets.
Problem
HDB resale prices are influenced by numerous factors including location, property attributes, market conditions, and external economic indicators. Traditional valuation methods struggle to capture the complex, non-linear relationships between these variables, making accurate price prediction challenging for both buyers and sellers.
Technical Approach
- Built a scalable ETL pipeline with Apache Spark to ingest and preprocess large-scale structured and semi-structured data into a cloud-based data lake (AWS S3)
- Enriched the dataset with external sources including SORA interest rates, BTO launch timelines, and geospatial data on top primary schools to capture market dynamics
- Engineered market-relevant features including distance to amenities, school proximity scores, and temporal market indicators
- Trained and compared multiple predictive models (Random Forest, XGBoost, CatBoost) using cross-validation to identify the best-performing ensemble approach
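As a rough illustration of the distance-based features mentioned above, here is a minimal sketch using the haversine formula. The function names, the 1 km radius, and the coordinates are assumptions made up for this example; they are not taken from the project's code or data.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_amenity_km(flat, amenities):
    """Distance from a flat to the closest amenity in the list."""
    return min(haversine_km(flat[0], flat[1], lat, lon) for lat, lon in amenities)


def count_within_km(flat, amenities, radius_km=1.0):
    """Number of amenities within radius_km of the flat (a simple proximity score)."""
    return sum(
        1 for lat, lon in amenities
        if haversine_km(flat[0], flat[1], lat, lon) <= radius_km
    )


# Hypothetical coordinates, for illustration only.
flat = (1.3000, 103.8000)
schools = [(1.3000, 103.8000), (1.4000, 103.9000)]

features = {
    "nearest_school_km": nearest_amenity_km(flat, schools),
    "schools_within_1km": count_within_km(flat, schools),
}
```

Each flat would get one such feature row per amenity type, which can then be joined to the transaction records before model training.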
Results
Incorporating external data sources and engineered features improved prediction accuracy, supporting better-informed decisions for buyers and sellers.

Customer Churn & Marketing Analytics
A project to improve customer retention and optimize marketing spend through predictive modeling and analytics.
Problem
The business was experiencing high customer churn rates without a clear understanding of which customers were at risk. Additionally, marketing campaigns were not optimized, leading to inefficient spend and missed opportunities for cross-selling. There was a need to proactively identify at-risk customers and optimize marketing strategies.
Technical Approach
- Built a churn prediction ensemble model combining Random Forest and Logistic Regression to leverage both tree-based and linear approaches
- Analyzed historical discount impact on customer spend patterns to understand price sensitivity and optimize year-end promotion strategies
- Developed a recommendation engine using Word2Vec embeddings and cosine similarity to identify product associations and boost cross-sell conversions
- Created comprehensive analytics dashboards to visualize churn risk segments and marketing campaign effectiveness
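The cosine-similarity step of the recommendation engine can be sketched as below. In the project the vectors would come from a trained Word2Vec model; the three-dimensional vectors and product names here are hand-written placeholders, and `recommend` is a hypothetical helper rather than the project's actual function.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(product, embeddings, top_n=2):
    """Return the top_n other products ranked by similarity to `product`."""
    query = embeddings[product]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in embeddings.items()
        if other != product
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy embeddings, invented for illustration; real ones would be learned from purchase data.
toy_embeddings = {
    "shampoo": [0.9, 0.1, 0.0],
    "conditioner": [0.8, 0.2, 0.1],
    "toothpaste": [0.1, 0.9, 0.2],
    "notebook": [0.0, 0.1, 0.9],
}

top = recommend("shampoo", toy_embeddings)
print([name for name, _ in top])  # ['conditioner', 'toothpaste']
```

Products that tend to appear in similar purchase contexts end up with similar vectors, so the highest-scoring neighbours are natural cross-sell candidates.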
Results
Achieved 79% precision and 95% recall on churn prediction, enabling targeted retention campaigns. Marketing optimization led to improved ROI on promotional spend.
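To make the two metrics concrete, the sketch below computes precision and recall from confusion-matrix counts. The counts are illustrative numbers chosen to land near the reported figures; they are not the project's actual confusion matrix.

```python
def precision(tp, fp):
    """Share of customers flagged as churners who actually churned."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of actual churners that the model flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Illustrative counts only: 190 churners caught, 50 false alarms, 10 churners missed.
tp, fp, fn = 190, 50, 10

p = precision(tp, fp)  # 190 / 240
r = recall(tp, fn)     # 190 / 200
print(f"precision={p:.2f} recall={r:.2f} f1={f1_score(p, r):.2f}")
# precision=0.79 recall=0.95 f1=0.86
```

High recall with moderate precision means few at-risk customers are missed, at the cost of some retention offers going to customers who would have stayed anyway.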

LLM Powered Marketing Dashboard
An AI-driven analytics tool that turns raw marketing data into actionable business insights.
Problem
Marketing teams were spending excessive time manually analyzing CSV files and creating reports. Decision-makers needed quick access to insights but lacked technical skills to query data. There was a gap between raw marketing data and actionable business intelligence, slowing down strategic decision-making.
Technical Approach
- Developed a full-stack dashboard with CSV upload functionality and natural language chat interface using LangChain and OpenAI API for on-demand chart generation
- Integrated Prophet time series forecasting to project revenue, ad spend, and new account creation up to 4 months ahead with confidence intervals
- Built an automated PDF report generation system with descriptive analytics, trend analysis, and forecast visualizations
- Created a business-focused React frontend with intuitive navigation to surface key marketing insights for non-technical decision-makers
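As a hedged sketch of the data preparation that sits between a CSV upload and a monthly forecast, the snippet below aggregates rows into one value per month and lists the four future months to predict. Prophet expects its input columns to be named `ds` and `y`, which is why the output uses those keys; the CSV column names (`date`, `revenue`) and both helper functions are assumptions for illustration, and the forecasting call itself is left out.

```python
import csv
import io
from collections import defaultdict
from datetime import date


def monthly_series(csv_text, date_col="date", value_col="revenue"):
    """Sum a numeric column per calendar month; returns rows sorted by month."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        day = date.fromisoformat(row[date_col])
        totals[date(day.year, day.month, 1)] += float(row[value_col])
    return [{"ds": month.isoformat(), "y": total} for month, total in sorted(totals.items())]


def future_months(last_ds, periods=4):
    """First-of-month dates for the next `periods` months after last_ds."""
    start = date.fromisoformat(last_ds)
    year, month = start.year, start.month
    out = []
    for _ in range(periods):
        month += 1
        if month > 12:
            year, month = year + 1, 1
        out.append(date(year, month, 1).isoformat())
    return out


# Made-up sample upload, for illustration only.
sample_csv = "date,revenue\n2024-01-05,100\n2024-01-20,50\n2024-02-03,75\n"

history = monthly_series(sample_csv)
horizon = future_months(history[-1]["ds"], periods=4)
```

The `history` rows are the shape a forecaster would be fitted on, and `horizon` holds the dates it would be asked to predict.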
Results
Reduced report generation time from hours to minutes. Enabled real-time data exploration through natural language queries, improving decision-making speed and accessibility.