Zeyu Ding - Personal Website

About Me

I am a Ph.D. candidate at TU Dortmund specializing in Bayesian Statistics, Machine Learning, and Data Compression techniques for high-dimensional models. My research aims to develop efficient statistical methods for analyzing and modeling complex, high-dimensional data, with applications in scientific computing and large-scale simulation. I am particularly interested in how Bayesian methods can advance generative modeling.

Education

Ph.D. in Statistics, TU Dortmund, Germany (2021 - 2025)
M.Sc. in Quantitative Economics, Georg-August-University Göttingen, Germany (2017 - 2020)
B.Sc. in Quantitative Economics, Xi’an Jiaotong University, China (2011 - 2015)

Research Interests

Bayesian Statistics and Machine Learning
High-dimensional Data Compression and Approximation
Generative Models: GANs, Diffusion Models, and Normalizing Flows
Exploring Transformer Architectures for High-dimensional Generative Modeling
Integrating Bayesian Methods with Deep Learning Frameworks

Projects

Big Data for Advanced Classification Models (2021 - 2023)
Developed scalable Bayesian algorithms for high-dimensional probit and logistic regression models.
AI for Physics (KISS Project) (2023 - Present)
Applied advanced Bayesian and Monte Carlo methods for particle physics simulation.
Big Data for Copula Models (2023 - Present)
Created efficient data compression algorithms for multivariate conditional transformation models.

Working Experience

Scientific Researcher, The Lamarr Institute for Machine Learning and Artificial Intelligence, Germany (2024 - Present)
Scientific Researcher, TU Dortmund, Germany (2021 - Present)
Intern in Quantitative Risk Management, Daimler Mobility AG, Germany (2019 - 2020)
Intern in Risk Management, China Construction Bank, Frankfurt Branch (2019)

Academic Activities

Publications

Scalable Bayesian p-Generalized Probit and Logistic Regression
Advances in Data Analysis and Classification, 2024
Developed scalable Bayesian algorithms for high-dimensional classification problems.
Bayesian Analysis for Dimensionality and Complexity Reduction
Machine Learning under Resource Constraints, De Gruyter, Berlin, 2023
Unified Bayesian approaches for dimensionality reduction in resource-constrained environments.
Efficiency Coresets Techniques for Multivariate Conditional Transformation Models
submitted, 2024
Proposed innovative coreset methods for high-dimensional data compression in generative models.
A Benchmark Suite for Monte Carlo Sampling Algorithms
submitted, 2024
Developed new Monte Carlo sampling test metrics for academic and non-academic users.

Talks

A Benchmark Suite for Monte Carlo Sampling Algorithms
18th International Conference on Computational and Methodological Statistics (CMStatistics), KCL, London, Dec. 2024
Poster presentation
Artificial Intelligence for Large-Scale Scientific Simulations
KISS Project Workshop, University of Hamburg, Feb. 2024
Explored AI techniques in high-energy physics simulations with CERN’s LHC data.
Efficiency Coresets Techniques for Multivariate Conditional Transformation Models
17th International Conference on Computational and Methodological Statistics (CMStatistics), Berlin, Dec. 2023
Presented data compression techniques for multivariate conditional transformations.
Scalable Bayesian p-Generalized Probit and Logistic Regression via Coresets
16th International Conference on Computational and Methodological Statistics (CMStatistics), KCL, London, Dec. 2022
Discussed computational efficiency in Bayesian high-dimensional classification.
6th International Summer School 2022 on Machine Learning under Resource Constraints
Poster, TU Dortmund, Sep. 2022
Topics: Bayesian models and coreset approaches.

Ongoing Research

Adaptive Sliced Maximum Mean Discrepancy with Generalized Kernels and Random Fourier Features
Enhancing Score Matching with P-Normalized Kernels: Theory and Langevin Dynamics Implementation
Regularization and Prior Choice for the Bayesian Generalized Probit Model

Skills

Programming Languages

Advanced: Python, R
Proficient: SAS, Julia, SQL
Intermediate: VBA, PySpark, PyTorch

Statistical and Machine Learning Expertise

Bayesian Methods: MCMC, Prior Design, Model Selection
Machine Learning: Gradient Boosting, Random Forests, SVMs
Deep Learning: Neural Networks, RNNs, LSTMs, Bayesian Neural Networks
Statistical Modeling: GLMs, Time Series (ARIMA, GARCH, etc.)

Contact

Email: zeyu.ding@tu-dortmund.de
GitHub: zeyudsai
LinkedIn: Zeyu Ding