"Data Management in Machine Learning Systems" is the title of a new book co-authored by Matthias Böhm, who heads the Know-Center's new Data Management research area and holds a BMVIT Endowed Chair for Data Management at Graz University of Technology.

English Abstract

In this book, we follow this data-centric view of ML systems and aim to provide a comprehensive overview of data management in ML systems for the end-to-end data science or ML lifecycle. We review multiple interconnected lines of work: (1) ML support in database (DB) systems, (2) DB-inspired ML systems, and (3) ML lifecycle systems. Covered topics include: in-database analytics via query generation and user-defined functions, factorized and statistical-relational learning; optimizing compilers for ML workloads; execution strategies and hardware accelerators; data access methods such as compression, partitioning and indexing; resource elasticity and cloud markets; as well as systems for data preparation for ML, model selection, model management, model debugging, and model serving. Given the rapidly evolving field, we strive for a balance between an up-to-date survey of ML systems, an overview of the underlying concepts and techniques, as well as pointers to open research questions. Hence, this book might serve as a starting point for both systems researchers and developers.

More Information

Authors:

Matthias Boehm, Graz University of Technology

Arun Kumar, University of California, San Diego

Jun Yang, Duke University

For further information, please contact marketing [at] know-center.at

Matthias Böhm (center) at the European Big Data Value Forum in Vienna.