ePrints@IISc

Treebeard: An Optimizing Compiler for Decision Tree Based ML Inference

Prasad, A and Rajendra, S and Rajan, K and Govindarajan, R and Bondhugula, U (2022) Treebeard: An Optimizing Compiler for Decision Tree Based ML Inference. In: 55th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2022, 1 October 2022 - 5 October 2022, Chicago, pp. 494-511.

PDF: IEEE_ACM_2022.pdf - Published Version (1MB; restricted to registered users only)
Official URL: https://doi.org/10.1109/MICRO56248.2022.00043

Abstract

Decision tree ensembles are among the most commonly used machine learning models. These models are used in a wide range of applications and are deployed at scale. Decision tree ensemble inference is usually performed with libraries such as XGBoost, LightGBM, and Sklearn. These libraries incorporate a fixed set of optimizations for the hardware targets they support. However, maintaining these optimizations is prohibitively expensive with the evolution of hardware. Further, they do not specialize the inference code to the model being used, leaving significant performance on the table. This paper presents TREEBEARD, an optimizing compiler that progressively lowers the inference computation to optimized CPU code through multiple intermediate abstractions. By applying model-specific optimizations at the higher levels, tree walk optimizations at the middle level, and machine-specific optimizations lower down, TREEBEARD can specialize inference code for each model on each supported CPU target. TREEBEARD combines several novel optimizations at various abstraction levels to mitigate architectural bottlenecks and enable SIMD vectorization of tree walks. We implement TREEBEARD using the MLIR compiler infrastructure and demonstrate its utility by evaluating it on a diverse set of benchmarks. TREEBEARD is significantly faster than state-of-the-art systems, XGBoost, Treelite and Hummingbird, by 2.6×, 4.7× and 5.4× respectively in a single-core execution setting, and by 2.3×, 2.7× and 14× respectively in multi-core settings.
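To make the optimization target concrete, the following is an illustrative sketch (not taken from the paper) of the naive, scalar tree-walk loop that ensemble inference libraries execute and that a compiler like TREEBEARD specializes per model. The array-based tree layout, names, and example values here are hypothetical.

```python
def predict_tree(features, feature_idx, threshold, left, right, value):
    """Walk one decision tree stored as parallel per-node arrays.

    Internal node n tests features[feature_idx[n]] against threshold[n];
    left[n] == -1 marks node n as a leaf carrying value[n].
    """
    n = 0
    while left[n] != -1:
        if features[feature_idx[n]] <= threshold[n]:
            n = left[n]
        else:
            n = right[n]
    return value[n]


def predict_ensemble(features, trees):
    # Ensemble prediction is the sum of per-tree leaf values
    # (as in gradient-boosted trees); each tree walk is a
    # data-dependent pointer chase, which is what makes SIMD
    # vectorization of this loop non-trivial.
    return sum(predict_tree(features, *t) for t in trees)


# Example: two tiny depth-1 trees over a 2-feature input.
t1 = ([0, -1, -1], [5.0, 0.0, 0.0], [1, -1, -1], [2, -1, -1], [0.0, 0.4, -0.1])
t2 = ([1, -1, -1], [1.0, 0.0, 0.0], [1, -1, -1], [2, -1, -1], [0.0, 0.1, 0.3])
result = predict_ensemble([3.0, 2.0], [t1, t2])  # 0.4 + 0.3
```

Because every branch outcome depends on the input row, such walks defeat straightforward vectorization; the paper's contribution is a set of model- and machine-specific transformations that restructure them into SIMD-friendly code.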

Item Type: Conference Paper
Publication: Proceedings of the Annual International Symposium on Microarchitecture, MICRO
Publisher: IEEE Computer Society
Additional Information: The copyright for this article belongs to IEEE Computer Society.
Keywords: Abstracting; Libraries; Machine learning; Program compilers; Decision tree ensemble; Decision tree inference; Fixed sets; Machine learning models; Machine-learning; Optimisations; Optimizing compilers; Tree ensembles; Tree-based; Vectorization; Decision trees
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 04 Jan 2023 07:05
Last Modified: 04 Jan 2023 07:05
URI: https://eprints.iisc.ac.in/id/eprint/78725
