API of treelite Python package.
treelite: a framework to optimize decision tree ensembles for fast prediction
treelite.DMatrix(data, data_format=None, missing=None, feature_names=None, feature_types=None, verbose=False, nthread=None)
Data matrix used in treelite.
Parameters: |
|
---|
treelite.Model(handle=None)
Decision tree ensemble model
Parameters: handle (ctypes.c_void_p, optional) – Initial value of model handle
compile(dirpath, params=None, compiler='recursive', verbose=False)
Generate prediction code from a tree ensemble model. The code will be C99 compliant. One header file (.h) will be generated, along with one or more source files (.c). Use the create_shared() function to package the prediction code as a dynamic shared library (.so/.dll/.dylib).
Parameters: |
---|
Example
The following populates the directory ./my/model with source and header files:
model.compile(dirpath='./my/model', params={}, verbose=True)
If parallel compilation is enabled (parameter parallel_comp), the files are in the form of ./my/model/model.h, ./my/model/model0.c, ./my/model/model1.c, ./my/model/model2.c and so forth, depending on the value of parallel_comp. Otherwise, there will be exactly two files: ./my/model/model.h and ./my/model/model.c
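The naming scheme described above can be sketched in plain Python. This is only an illustration: expected_files is a hypothetical helper, not part of treelite, and it assumes that parallel_comp = k yields sources numbered model0.c through model(k-1).c, which the text implies ("and so forth") but does not state outright.

```python
import posixpath


def expected_files(dirpath, parallel_comp=0):
    """List the files compile() is described as writing into dirpath.

    Assumption: with parallel_comp = k > 0 the sources are numbered
    model0.c .. model(k-1).c; otherwise there is a single model.c.
    """
    files = [posixpath.join(dirpath, "model.h")]
    if parallel_comp > 0:
        files += [posixpath.join(dirpath, "model%d.c" % i)
                  for i in range(parallel_comp)]
    else:
        files.append(posixpath.join(dirpath, "model.c"))
    return files
```

Such a helper could be used to check that a directory contains every file before handing it to a build step.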
export_lib(toolchain, libpath, params=None, compiler='recursive', verbose=False, nthread=None, options=None)
Convenience function: Generate prediction code and immediately turn it into a dynamic shared library. A temporary directory will be created to hold the source files.
Parameters: |
|
---|
Example
The one-line command
model.export_lib(toolchain='msvc', libpath='./mymodel.dll',
                 params={}, verbose=True)
is equivalent to the following sequence of commands:
model.compile(dirpath='/temporary/directory', params={}, verbose=True)
create_shared(toolchain='msvc', dirpath='/temporary/directory',
              verbose=True)
# move the library out of the temporary directory
shutil.move('/temporary/directory/mymodel.dll', './mymodel.dll')
export_srcpkg(platform, toolchain, pkgpath, libname, params=None, compiler='recursive', verbose=False, options=None)
Convenience function: Generate prediction code and create a zipped source package for deployment. The resulting zip file will also contain a Makefile.
Parameters: |
|
---|
Example
The one-line command
model.export_srcpkg(platform='unix', toolchain='gcc',
                    pkgpath='./mymodel_pkg.zip', libname='mymodel.so',
                    params={}, verbose=True)
is equivalent to the following sequence of commands:
model.compile(dirpath='/temporary/directory/mymodel',
              params={}, verbose=True)
generate_makefile(dirpath='/temporary/directory/mymodel',
                  platform='unix', toolchain='gcc')
# zip the directory containing C code and Makefile
shutil.make_archive(base_name=pkgpath, format='zip',
                    root_dir='/temporary/directory',
                    base_dir='mymodel/')
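The zipping step can be tried on its own with the standard library. The sketch below is not treelite code: make_source_zip and archive_members are hypothetical helpers, and the file contents are placeholders. One detail worth knowing is that shutil.make_archive appends the format's extension to base_name, so the sketch strips a trailing .zip from the target path first.

```python
import os
import shutil
import tempfile
import zipfile


def make_source_zip(pkgpath, dirname, files):
    """Zip a directory named `dirname` holding `files` (name -> text).

    Hypothetical helper: it stands in for the compile + generate_makefile
    steps by writing placeholder files into a temporary directory.
    """
    # make_archive adds '.zip' itself, so drop it from the target path
    base_name = pkgpath[:-4] if pkgpath.endswith(".zip") else pkgpath
    with tempfile.TemporaryDirectory() as tmpdir:
        srcdir = os.path.join(tmpdir, dirname)
        os.mkdir(srcdir)
        for name, text in files.items():
            with open(os.path.join(srcdir, name), "w") as f:
                f.write(text)
        # root_dir is the parent; base_dir is the folder stored in the zip
        return shutil.make_archive(base_name=base_name, format="zip",
                                   root_dir=tmpdir, base_dir=dirname)


def archive_members(archive_path):
    """Return the sorted member names of a zip archive."""
    with zipfile.ZipFile(archive_path) as zf:
        return sorted(zf.namelist())
```

Because base_dir is the folder name rather than '.', every member of the archive is stored under mymodel/, so unpacking the package yields a single directory.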
from_xgboost(booster)
Load a tree ensemble model from an XGBoost Booster object
Parameters: booster (object of type xgboost.Booster) – Python handle to XGBoost model
Returns: model – loaded model
Return type: Model object
Example
bst = xgboost.train(params, dtrain, 10, [(dtrain, 'train')])
xgb_model = Model.from_xgboost(bst)
treelite.ModelBuilder(num_feature, num_output_group=1, random_forest=False, **kwargs)
Builder class for tree ensemble model: provides tools to iteratively build an ensemble of decision trees
Parameters: |
|
---|
Node
Handle to a node in a tree
set_categorical_test_node(feature_id, left_categories, default_left, left_child_key, right_child_key)
Set the node as a test node with categorical split. A list defines all categories that would be classified as the left side. Categories are integers ranging from 0 to n-1, where n is the number of categories in that particular feature. The number of categories is assumed to satisfy n <= 64.
Parameters: |
|
---|
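To make the categorical split concrete, here is a small stand-alone sketch. It does not use treelite. It rests on two assumptions that the text above does not confirm: that the n <= 64 limit exists because the left-side categories fit in a 64-bit bitmap, and that default_left decides the direction when the feature value is missing. Both function names are hypothetical.

```python
def categories_to_bitmap(left_categories):
    """Pack category indices (0..63) into one integer used as a bitmap."""
    bitmap = 0
    for cat in left_categories:
        if not 0 <= cat < 64:
            raise ValueError("category %d is outside the range 0..63" % cat)
        bitmap |= 1 << cat  # set the bit for this category
    return bitmap


def goes_left(category, left_categories, default_left):
    """Return True if an instance descends to the left child.

    `category` is None when the feature value is missing; in that case
    the (assumed) default direction is used.
    """
    if category is None:
        return default_left
    return bool((categories_to_bitmap(left_categories) >> category) & 1)
```

For example, with left_categories=[0, 2, 5], an instance whose category is 2 goes left and one whose category is 1 goes right.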
set_leaf_node(leaf_value)
Set the node as a leaf node
Parameters: leaf_value (float / list of float) – Usually a single leaf value (weight) of the leaf node. For a multiclass random forest classifier, leaf_value should be a list of leaf weights.
set_numerical_test_node(feature_id, opname, threshold, default_left, left_child_key, right_child_key)
Set the node as a test node with numerical split. The test is in the form [feature value] OP [threshold]. Depending on the result of the test, either left or right child would be taken.
Parameters: |
|
---|
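The test [feature value] OP [threshold] can be illustrated without treelite. The sketch below makes three assumptions that are not spelled out above: the set of operator names, that a true test selects the left child, and that default_left picks the direction for a missing (None or NaN) value. next_child is a hypothetical helper.

```python
import math
import operator

# Assumed operator names; the accepted values of `opname` are not listed above
OPS = {
    "==": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def next_child(feature_value, opname, threshold, default_left,
               left_child_key, right_child_key):
    """Pick the child key to visit next for one feature value."""
    # Missing value: follow the default direction (assumption)
    if feature_value is None or math.isnan(feature_value):
        return left_child_key if default_left else right_child_key
    # Assumption: a true test result leads to the left child
    if OPS[opname](feature_value, threshold):
        return left_child_key
    return right_child_key
```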
set_root()
Set the node as the root
Tree
Handle to a decision tree in a tree ensemble Builder
append(tree)
Add a tree at the end of the ensemble
Parameters: tree (Tree object) – tree to be added
Example
builder = ModelBuilder(num_feature=4227)
tree = ... # build tree somehow
builder.append(tree) # add tree at the end of the ensemble
commit()
Finalize the ensemble model
Returns: model – finished model
Return type: Model object
Example
builder = ModelBuilder(num_feature=4227)
for i in range(100):
    tree = ...                   # build tree somehow
    builder.append(tree)         # add one tree at a time
model = builder.commit()         # now get a Model object
model.compile(dirpath='test')    # compile model into C code
insert(tree, index)
Insert a tree at specified location in the ensemble
Parameters: |
---|
Example
builder = ModelBuilder(num_feature=4227)
tree = ... # build tree somehow
builder.insert(tree, 0) # insert tree at index 0
treelite.Annotator(path=None)
Branch annotator class: annotate branches in a given model using frequency patterns in the training data
Parameters: path (str, optional) – if given, the annotator will load branch frequency information from the path
annotate_branch(model, dmat, nthread=None, verbose=False)
Annotate branches in a given model using frequency patterns in the training data. Each node gets the count of the instances that belong to it. Any prior annotation information stored in the annotator will be replaced with the new annotation returned by this method.
Parameters: |
|
---|
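The idea that each node gets the count of the instances that belong to it can be shown on a toy tree. This is a conceptual sketch only, not how treelite stores or traverses models: TOY_TREE, its dictionary layout, and count_visits are all hypothetical, and the split rule (go left when the feature value is below the threshold) is chosen for illustration.

```python
from collections import Counter

# Hypothetical toy tree: keys are node ids. Internal nodes test
# row[feature] < threshold and go left when the test is true.
TOY_TREE = {
    0: {"feature": 0, "threshold": 0.5, "left": 1, "right": 2},
    1: {"leaf": -1.0},
    2: {"feature": 1, "threshold": 2.0, "left": 3, "right": 4},
    3: {"leaf": 0.5},
    4: {"leaf": 1.5},
}


def count_visits(tree, rows, root=0):
    """Count, for every node, how many rows pass through it."""
    counts = Counter()
    for row in rows:
        key = root
        while True:
            counts[key] += 1          # this row belongs to this node
            node = tree[key]
            if "leaf" in node:
                break                 # reached a leaf, stop descending
            if row[node["feature"]] < node["threshold"]:
                key = node["left"]
            else:
                key = node["right"]
    return counts
```

Counts like these indicate which branch of each test node is taken more often in the training data.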
treelite.create_shared
Create a shared library from the generated prediction code.
Parameters: |
|
---|---|
Returns: | libpath – absolute path of created shared library |
Return type: |
Example
The following command uses the Visual C++ toolchain to generate ./my/model/model.dll:
model.compile(dirpath='./my/model', params={}, verbose=True)
create_shared(toolchain='msvc', dirpath='./my/model', verbose=True)
Later, the shared library can be referred to by its directory name:
predictor = Predictor(libpath='./my/model', verbose=True)
# looks for ./my/model/model.dll
Alternatively, one may specify the library down to its file name:
predictor = Predictor(libpath='./my/model/model.dll', verbose=True)
treelite.save_runtime_package(destdir, include_binary=False)
Save a copy of the (zipped) runtime package, containing all glue code necessary to deploy compiled models into the wild
Parameters: |
|
---|
treelite.generate_makefile(dirpath, platform, toolchain, options=None)
Generate a Makefile for a given directory of headers and sources. The resulting Makefile will be stored in the directory. This function is useful for deploying a model on a different machine.
Parameters: |
|
---|
treelite.gallery.sklearn.import_model(sklearn_model)
Load a tree ensemble model from a scikit-learn model object
Parameters: sklearn_model (object of type RandomForestRegressor / RandomForestClassifier / GradientBoostingRegressor / GradientBoostingClassifier) – Python handle to scikit-learn model
Returns: model – loaded model
Return type: Model object
Example
import sklearn.datasets
import sklearn.ensemble
import treelite.gallery.sklearn

# load_boston was removed in scikit-learn 1.2; load_diabetes is another regression dataset
X, y = sklearn.datasets.load_diabetes(return_X_y=True)
clf = sklearn.ensemble.RandomForestRegressor(n_estimators=10)
clf.fit(X, y)
model = treelite.gallery.sklearn.import_model(clf)