obliquetree.Regressor
- class obliquetree.Regressor(use_oblique=True, max_depth=-1, min_samples_leaf=1, min_samples_split=2, min_impurity_decrease=0.0, ccp_alpha=0.0, categories=None, random_state=None, n_pair=2, top_k=None, gamma=1.0, max_iter=100, relative_change=0.001, linear_leaf=False, leaf_l2=1e-06, leaf_max_iter=100)
- __init__(use_oblique=True, max_depth=-1, min_samples_leaf=1, min_samples_split=2, min_impurity_decrease=0.0, ccp_alpha=0.0, categories=None, random_state=None, n_pair=2, top_k=None, gamma=1.0, max_iter=100, relative_change=0.001, linear_leaf=False, leaf_l2=1e-06, leaf_max_iter=100)
A decision tree regressor supporting both traditional axis-aligned and oblique splits.
It extends conventional regression trees with oblique splits, which test a linear combination of features, in addition to axis-aligned splits, which test a single feature. This adds flexibility for modeling continuous targets while keeping the tree structure interpretable.
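As a rough illustration of the difference between the two split types, the sketch below routes samples with an axis-aligned test on one feature and with an oblique test on a weighted sum of two features (two features mirrors the default n_pair=2). It is not obliquetree's internal code: the helper names, weights, and thresholds are made up for the example.

```python
# Illustrative only: hypothetical helpers, not part of the obliquetree API.

def axis_aligned_goes_left(x, feature, threshold):
    """Axis-aligned test: compare a single feature against a threshold."""
    return x[feature] <= threshold


def oblique_goes_left(x, weights, threshold):
    """Oblique test: compare a linear combination of features against a threshold.

    `weights` maps feature index -> coefficient; unlisted features are ignored.
    """
    projection = sum(coef * x[idx] for idx, coef in weights.items())
    return projection <= threshold


samples = [(1.0, 4.0), (3.0, 5.0), (2.0, 1.0)]

# Split on feature 0 alone: x0 <= 2.0
axis_side = [axis_aligned_goes_left(x, feature=0, threshold=2.0) for x in samples]

# Split on a combination of two features: 1.0 * x0 - 1.0 * x1 <= 0.0
oblique_side = [
    oblique_goes_left(x, weights={0: 1.0, 1: -1.0}, threshold=0.0) for x in samples
]
```

The two tests send the second and third samples to different sides, because the oblique boundary is a slanted line in the (x0, x1) plane rather than a line parallel to an axis.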
- Parameters:
use_oblique (bool, default True) – If True, enables oblique splits using linear combinations of features.
If False, uses traditional axis-aligned splits only.
max_depth (int, default -1) – Maximum depth of the tree. Controls model complexity and prevents overfitting.
If -1: Expands until leaves are pure or contain fewer than min_samples_split samples.
If int > 0: Limits the tree to the specified depth.
min_samples_leaf (int, default 1) – Minimum number of samples required at leaf nodes.
min_samples_split (int, default 2) – Minimum number of samples required to split an internal node.
min_impurity_decrease (float, default 0.0) – Minimum required decrease in impurity to create a split.
ccp_alpha (float, default 0.0) – Complexity parameter for Minimal Cost-Complexity Pruning.
categories (List[int], default None) – Indices of categorical features in the dataset.
random_state (int, default None) – Seed for random number generation in oblique splits.
Only used when use_oblique=True.
n_pair (int, default 2) – Number of features to combine in oblique splits.
Only used when use_oblique=True.
top_k (int or None, default None) – Number of numeric features kept after cheap oblique feature screening.
If None, an internal heuristic is used.
Only used when use_oblique=True.
gamma (float, default 1.0) – Separation strength parameter for oblique splits.
Only used when use_oblique=True.
max_iter (int, default 100) – Maximum iterations for L-BFGS optimization in oblique splits.
Only used when use_oblique=True.
relative_change (float, default 0.001) – Early stopping threshold for L-BFGS optimization.
Only used when use_oblique=True.
linear_leaf (bool, default False) – If True, replace the constant leaf (the weighted mean of y) with a weighted ordinary-least-squares model fit on the leaf samples. predict then returns intercept + sum(coef * x_numeric).
Only numeric features participate in the leaf coefficients (categorical features in categories are excluded; the tree splits already capture their structure). See Notes on BaseTree for the full mechanism and fallback rules.
leaf_l2 (float, default 1e-6) – L2 (ridge) penalty on the leaf coefficients. 0.0 is honored exactly (true unregularized OLS); on rank-deficient numeric features the affected leaf falls back to the mean instead. Pass a small positive value (e.g. 1e-8) to keep the fit on correlated features.
leaf_max_iter (int, default 100) – Unused for regression (OLS is closed-form); kept for API symmetry with Classifier so that Regressor and Classifier share the same constructor surface.
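The interaction between linear_leaf and leaf_l2 can be pictured with a one-feature sketch of a weighted ridge fit. This is a simplified stand-in, not the library's implementation. It assumes that the intercept is not penalized and that a rank-deficient leaf shows up as zero weighted variance in the feature; the function names are hypothetical.

```python
# Illustrative only: a one-feature version of a weighted ridge leaf fit.
# Assumptions (not confirmed by the reference above): the intercept is not
# penalized, and rank deficiency is detected as zero weighted variance in x.

def fit_linear_leaf(x, y, w, leaf_l2=1e-6):
    """Return (intercept, coef) for a weighted least-squares line in one leaf."""
    total_w = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w

    # Weighted (co)variance sums around the weighted means.
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))

    denom = sxx + leaf_l2
    if denom == 0.0:
        # Nothing to fit on: fall back to the constant leaf (weighted mean of y).
        return y_mean, 0.0

    coef = sxy / denom
    return y_mean - coef * x_mean, coef


def predict_linear_leaf(intercept, coef, x_value):
    """Leaf prediction: intercept + coef * x."""
    return intercept + coef * x_value


# Samples lying on y = 1 + 2x: leaf_l2=0.0 recovers the line exactly.
line = fit_linear_leaf([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], [1.0] * 4, leaf_l2=0.0)

# A constant feature with leaf_l2=0.0: the sketch falls back to the mean of y.
flat = fit_linear_leaf([2.0, 2.0, 2.0], [1.0, 2.0, 3.0], [1.0] * 3, leaf_l2=0.0)
```

With a small positive leaf_l2 the denominator never reaches zero, so the second call would return a zero slope through the ridge fit itself rather than through the fallback branch.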
- apply(X)
Return the index of the leaf that each sample ends up in.
- Parameters:
X (array-like of shape (n_samples, n_features)) – The input samples.
- Returns:
X_leaves – For each datapoint x in X, return the index of the leaf x ends up in. Nodes are numbered using pre-order (depth-first) traversal.
- Return type:
numpy.ndarray of shape (n_samples,)
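To make the pre-order numbering concrete, here is a small sketch on a hand-built toy tree. The nested-dict layout and helper names are invented for the example and say nothing about obliquetree's internal representation. Starting the numbering at 0 for the root is also an assumption.

```python
# Illustrative only: a toy tree, not obliquetree's internal representation.
# Assumption: numbering starts at 0 at the root.

def number_preorder(node, next_id=0):
    """Assign ids in pre-order: the node, then its left subtree, then its right."""
    node["id"] = next_id
    next_id += 1
    if "left" in node:  # internal node
        next_id = number_preorder(node["left"], next_id)
        next_id = number_preorder(node["right"], next_id)
    return next_id


def apply_one(node, x):
    """Follow the splits down to a leaf and return that leaf's id."""
    while "left" in node:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["id"]


tree = {
    "feature": 0, "threshold": 5.0,
    "left": {
        "feature": 1, "threshold": 2.0,
        "left": {},   # leaf
        "right": {},  # leaf
    },
    "right": {},      # leaf
}
number_preorder(tree)

leaf_ids = [apply_one(tree, x) for x in [(1.0, 1.0), (1.0, 3.0), (9.0, 0.0)]]
```

In this toy tree the internal nodes take ids 0 and 1, so the leaves are numbered 2, 3, and 4. Leaf indices from a pre-order scheme are therefore not a contiguous range starting at zero.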
- fit(X, y, sample_weight=None)
Build a decision tree regressor from the training set (X, y).
- Parameters:
X (array-like of shape (n_samples, n_features)) – The training input samples.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (array-like of shape (n_samples,), optional, default None) – Sample weights.
- Returns:
self – Fitted estimator.
- Return type:
- predict(X)
Predict regression target for X.
- Parameters:
X (array-like of shape (n_samples, n_features)) – The input samples to predict.
- Returns:
y – The predicted values.
- Return type:
numpy.ndarray of shape (n_samples,)