Xuange (Alex) Liang

梁轩阁

Hong Kong · New York
Ph.D. @ HKUST · CSE
MPH @ Columbia · Biostatistics

About

Incoming Ph.D. student in Computer Science and Engineering at HKUST (Advisor: Prof. Qian Zhang), and MPH graduate in Biostatistics (Outstanding Student) from Columbia University. My research focuses on applying foundation models and causal inference to clinical EHR data, with publications at AAAI workshops. I also bring engineering depth from internships in AI, back-end systems, and large-scale health databases.

Publications

Foundation Models for Heterogeneous Treatment Effect Estimation: Adapting Pretrained EHR Embeddings in Small Clinical Cohorts
Liang, X., Yang, B., & Wang, Y. — AAAI-26 Workshop on Health Intelligence (W3PHIAI-26), 2026

Experience

🏢
AI Intern
Shanda Group · Shanghai
Jun 2025 – Present
  • Built a health analysis system using LLM agents on wearable and physical examination data.
  • Reorganized a PostgreSQL database with 3,000+ health indicators and 200,000+ time-series rows.
  • Synthesized virtual health data with C-GAN, CPAR, and Gaussian Copula models.
LLM Agents · PostgreSQL · C-GAN · Python
🛒
Back-End Engineer Intern
JD Tech (京东科技) · Beijing
Aug 2023 – Sep 2023
  • Developed 10+ APIs in Golang for database management and migration scripts.
  • Deployed applications on Docker and Kubernetes; optimized MySQL query performance.
Golang · Java · Docker · Kubernetes · MySQL
🤖
Data Scientist Intern
CloudWalk Technology (云从科技) · Guangzhou
Jul 2022 – Sep 2022
  • Cleaned a health database of 140+ tables using SQL; reduced query runtime by 70% via optimized joins.
  • Predicted hotel health risk with Random Forest and XGBoost (cross-validated).
SQL · XGBoost · Random Forest · Python

Education

🎓
Ph.D. · Computer Science and Engineering
HKUST · Hong Kong SAR · Advisor: Prof. Qian Zhang
Sep 2026 – Present
🗽
M.P.H. · Biostatistics · Outstanding Student · CEOR
Columbia University · New York
Sep 2024 – May 2026
🏛️
B.S. · Intelligence Science and Technology
Peking University · Beijing
Sep 2020 – Jul 2024

Selected Projects

Heterogeneous Treatment Effect via EHR
Adapted CLMBR EHR embeddings with PCA + R-learner on Stanford Medicine data. Ranked #1 across 4 clinical outcomes, +7.9% over best baseline.
Columbia · 2025–2026 · Python · Causal ML
3D Object Detection · Multimodality
Lift-Splat-Shoot + ZoeDepth framework for continuous depth prediction. Built the detection head and data pipeline on MMDetection3D.
Peking University · 2022–2024 · PyTorch
SDV-theta
Synthetic tabular data generation with C-GAN, CPAR, Gaussian Copula — extended fork of SDV.
GitHub · Python
IEEE-CIS Fraud Detection
XGBoost + Random Forest on 590K transactions. PR_AUC 0.95 — top 5% on Kaggle leaderboard.
Kaggle · 2024 · Python

Skills

Python · C++ · Golang · Java · SQL · R · SAS · Git · PyTorch · XGBoost · Docker · Kubernetes
