People

Institute of Data Science and Engineering

Wu, Yinjun

Title: Assistant Professor

Institute: Institute of Data Science and Engineering

Research Interests: Database, Data Science, AI

E-mail: wuyinjun@pku.edu.cn

URL: https://wuyinjun-1993.github.io/


Yinjun Wu is an Assistant Professor in the School of Computer Science at Peking University. He obtained his PhD from the University of Pennsylvania in August 2021 under the supervision of Prof. Susan Davidson. He then remained at the University of Pennsylvania for 2.5 years as a postdoctoral researcher.


Dr. Wu’s research interests lie primarily at the intersection of database systems, data science, and AI. His work spans three interdependent research lines. The first addresses data management challenges, such as data wrangling and data integration, that arise within the development pipeline of machine learning models. The overarching objective is to enhance model performance by improving data quality, ultimately reducing the cost and human effort associated with the broader use of AI. The second line focuses on interpreting and debugging the behavior of machine learning models from a data-centric perspective. This involves pivotal questions such as identifying the significance of individual training samples and developing solutions to capture and rectify critical model errors across a large dataset. Tackling these challenges is imperative for achieving trustworthy AI in high-stakes areas such as health care and self-driving. The third line applies artificial intelligence to database systems, with a particular focus on harnessing machine learning techniques, such as large language models, to automate the optimization and debugging of database system performance. This aims to alleviate the workload of database administrators (DBAs) in real-world database products.


Dr. Wu has published over 20 papers in top-tier academic conferences and journals spanning databases, systems, and machine learning. Notably, 10 of these publications appear in CCF-A conferences such as SIGMOD, VLDB, ICML, and AAAI. He also actively contributes to the academic community by serving as a reviewer or PC member for a series of top-tier conferences and journals, including SIGMOD, VLDBJ, ICDE, NeurIPS, and AAAI. His PhD thesis received the best thesis award from the Department of Computer and Information Science at the University of Pennsylvania.