"Data Management in Machine Learning Systems" is the title of a new book to which Matthias Böhm contributed as an author. Böhm heads the new Know-Center research area Data Management and holds a BMVIT Endowed Chair for Data Management at Graz University of Technology.

English Abstract

In this book, we follow this data-centric view of ML systems and aim to provide a comprehensive overview of data management in ML systems for the end-to-end data science or ML lifecycle. We review multiple interconnected lines of work: (1) ML support in database (DB) systems, (2) DB-inspired ML systems, and (3) ML lifecycle systems. Covered topics include: in-database analytics via query generation and user-defined functions, factorized and statistical-relational learning; optimizing compilers for ML workloads; execution strategies and hardware accelerators; data access methods such as compression, partitioning and indexing; resource elasticity and cloud markets; as well as systems for data preparation for ML, model selection, model management, model debugging, and model serving. Given the rapidly evolving field, we strive for a balance between an up-to-date survey of ML systems, an overview of the underlying concepts and techniques, as well as pointers to open research questions. Hence, this book might serve as a starting point for both systems researchers and developers.

More Information


Matthias Boehm, Graz University of Technology

Arun Kumar, University of California, San Diego

Jun Yang, Duke University


For further information, please contact marketing [at] know-center.at

Matthias Böhm (center) at the European Big Data Value Forum in Vienna.