General Tree Inference Library (GTIL)
GTIL is a reference implementation of a prediction runtime for all Treelite models. It has the following goals:
Universal coverage: GTIL shall support all tree ensemble models that can be represented as Treelite objects.
Accessible code: GTIL should be written in an easy-to-read style that can be understood by a first-time contributor. We prefer code legibility over performance optimization.
Correct output: As a reference implementation, GTIL should produce correct prediction outputs.
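The spirit of these goals can be illustrated with a minimal, hypothetical sketch of reference-style tree inference for a single decision tree. This is not GTIL's actual code, and the node layout below is an assumption chosen for readability:

```python
# Hypothetical sketch of reference-style tree inference (not GTIL's actual code).
# Internal nodes hold a feature index and threshold; leaf nodes hold an output value.

def predict_tree(node, row):
    """Traverse one decision tree for a single input row."""
    while "leaf_value" not in node:
        # Numerical split: go left if the feature value is below the threshold.
        if row[node["feature"]] < node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["leaf_value"]

# Example tree: a single split on feature 0 at threshold 0.5
tree = {
    "feature": 0,
    "threshold": 0.5,
    "left": {"leaf_value": -1.0},
    "right": {"leaf_value": 2.0},
}

print(predict_tree(tree, [0.3]))  # -> -1.0
print(predict_tree(tree, [0.9]))  # -> 2.0
```

A straightforward loop like this trades speed for legibility, which matches the stated design priorities.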
Functions:
- predict – Predict with a Treelite model using the General Tree Inference Library (GTIL).
- predict_leaf – Predict with a Treelite model, outputting the leaf node's ID for each row.
- predict_per_tree – Predict with a Treelite model and output the prediction of each tree.
- treelite.gtil.predict(model, data, *, nthread=-1, pred_margin=None)
Predict with a Treelite model using the General Tree Inference Library (GTIL).
- Parameters:
  - model (Model) – Treelite model object
  - data (numpy.ndarray) – 2D NumPy array with which to run prediction
  - nthread (int, optional) – Number of CPU cores to use in prediction. If <= 0, use all CPU cores.
  - pred_margin (bool, optional, defaults to False) – Whether to produce raw margin scores. If pred_margin=True, post-processing is not applied and raw margin scores are produced.
- Returns:
prediction – Prediction output. Expected output dimensions:
(num_row,) for regressors and binary classifiers
(num_row, num_class) for multi-class classifiers (See Notes for a special case.)
- Return type:
  numpy.ndarray
Notes
The output has shape (num_row,) if the model is a multi-class classifier with task_type="MultiClfGrovePerClass" and pred_transform="max_index".
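The effect of pred_margin can be sketched with a toy binary classifier. With pred_margin=False (the default), the raw margin score accumulated from the trees is passed through a post-processing function, commonly a sigmoid for binary classification; with pred_margin=True, the raw margin is returned as-is. A hand-rolled illustration that does not use treelite itself:

```python
import math

def sigmoid(x):
    """Common post-processing step for binary classifiers."""
    return 1.0 / (1.0 + math.exp(-x))

# Suppose the trees of a binary classifier produce this raw margin for one row:
raw_margin = 0.0

# pred_margin=True: the raw score is returned unchanged
print(raw_margin)            # -> 0.0
# pred_margin=False: post-processing maps the margin to a probability
print(sigmoid(raw_margin))   # -> 0.5
```

The actual post-processing applied by treelite depends on the model's configured pred_transform, so the sigmoid here is only one representative case.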
- treelite.gtil.predict_leaf(model, data, *, nthread=-1)
Predict with a Treelite model, outputting the leaf node’s ID for each row.
- Parameters:
  - model (Model) – Treelite model object
  - data (numpy.ndarray) – 2D NumPy array with which to run prediction
  - nthread (int, optional) – Number of CPU cores to use in prediction. If <= 0, use all CPU cores.
- Returns:
prediction – Prediction output. Expected output dimensions: (num_row, num_tree)
- Return type:
  numpy.ndarray
Notes
Treelite assigns a unique integer ID to every node in the tree, including both leaf nodes and internal nodes, by traversing the tree breadth-first. For example, the root node is assigned ID 0, and the two nodes at depth 1 are assigned IDs 1 and 2, respectively. Call treelite.Model.dump_as_json() to obtain the ID of every tree node.
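The breadth-first numbering described above can be reproduced with a short sketch. The node structure here is hypothetical, not Treelite's internal representation:

```python
from collections import deque

def assign_bfs_ids(root):
    """Assign integer IDs to tree nodes in breadth-first order,
    mirroring the numbering scheme described above."""
    queue = deque([root])
    next_id = 0
    while queue:
        node = queue.popleft()
        node["id"] = next_id
        next_id += 1
        for child in ("left", "right"):
            if child in node:
                queue.append(node[child])
    return root

# Depth-2 tree: a root with two children; the left child has two leaves.
tree = {"left": {"left": {}, "right": {}}, "right": {}}
assign_bfs_ids(tree)
print(tree["id"])                  # -> 0 (root)
print(tree["left"]["id"])          # -> 1
print(tree["right"]["id"])         # -> 2
print(tree["left"]["left"]["id"])  # -> 3
```

Note that all nodes at a given depth receive consecutive IDs before any node at the next depth, which is why the root is 0 and its children are 1 and 2.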
- treelite.gtil.predict_per_tree(model, data, *, nthread=-1)
Predict with a Treelite model and output the prediction of each tree. This function computes one or more margin scores per tree.
- Parameters:
  - model (Model) – Treelite model object
  - data (numpy.ndarray) – 2D NumPy array with which to run prediction
  - nthread (int, optional) – Number of CPU cores to use in prediction. If <= 0, use all CPU cores.
- Returns:
  prediction – Prediction output. Expected output dimensions:
  (num_row, num_tree) for regressors, binary classifiers, and multi-class classifiers with task_type="MultiClfGrovePerClass"
  (num_row, num_tree, num_class) for multi-class classifiers with task_type="kMultiClfProbDistLeaf"
- Return type:
  numpy.ndarray
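The relationship between predict_per_tree and predict can be sketched for a simple regressor: conceptually, the ensemble margin for each row is the sum of that row's per-tree scores plus the model's base score. A toy illustration that does not use treelite itself (the base score value is hypothetical):

```python
# Toy per-tree margin scores for 2 rows and 3 trees, shape (num_row, num_tree),
# as predict_per_tree would return for a regressor.
per_tree = [
    [1.0, -0.5, 0.25],  # row 0
    [2.0, 0.5, -0.25],  # row 1
]
base_score = 0.5  # hypothetical base score of the model

# Conceptually, predict() for a regressor aggregates the per-tree scores:
ensemble = [base_score + sum(row) for row in per_tree]
print(ensemble)  # -> [1.25, 2.75]
```

Inspecting per-tree outputs this way is useful for debugging an ensemble or attributing a prediction to individual trees.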