obliquetree.Regressor

class obliquetree.Regressor(use_oblique=True, max_depth=-1, min_samples_leaf=1, min_samples_split=2, min_impurity_decrease=0.0, ccp_alpha=0.0, categories=None, random_state=None, n_pair=2, top_k=None, gamma=1.0, max_iter=100, relative_change=0.001, linear_leaf=False, leaf_l2=1e-06, leaf_max_iter=100)

A decision tree regressor supporting both traditional axis-aligned and oblique splits.

It extends traditional regression trees with oblique splits, which test a linear combination of features, alongside conventional axis-aligned splits, which test a single feature. Oblique splits add flexibility when modeling continuous outputs while keeping the interpretability of a decision tree.
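
As a minimal sketch of the difference between the two split types (not the library's internal code), the snippet below contrasts an axis-aligned test on one feature with an oblique test on a weighted sum of two features. The helper names, the weights, and the convention that samples satisfying `<=` go to the left child are assumptions made for illustration.

```python
from typing import Mapping, Sequence


def axis_aligned_goes_left(x: Sequence[float], feature: int, threshold: float) -> bool:
    """Axis-aligned rule: compare one feature against a threshold."""
    return x[feature] <= threshold


def oblique_goes_left(x: Sequence[float], weights: Mapping[int, float], threshold: float) -> bool:
    """Oblique rule: compare a weighted sum of several features against a threshold."""
    projection = sum(w * x[j] for j, w in weights.items())
    return projection <= threshold


sample = [2.0, 3.0, 10.0]

# Axis-aligned: only feature 0 is inspected (2.0 <= 1.5 is False).
axis_result = axis_aligned_goes_left(sample, feature=0, threshold=1.5)

# Oblique: features 0 and 1 are combined (1.0 * 2.0 - 1.0 * 3.0 = -1.0 <= 0.0 is True).
oblique_result = oblique_goes_left(sample, weights={0: 1.0, 1: -1.0}, threshold=0.0)
```

The oblique rule here combines two features, mirroring the default n_pair=2. The weights a fitted tree actually uses come from its own optimization and are not reproduced here.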

Parameters:
  • use_oblique (bool, default True) –

    • If True, enables oblique splits using linear combinations of features.

    • If False, uses traditional axis-aligned splits only.

  • max_depth (int, default -1) –

    Maximum depth of the tree. Limits model complexity and helps reduce overfitting.

    • If -1: Expands until leaves are pure or contain fewer than min_samples_split samples.

    • If int > 0: Limits the tree to the specified depth.

  • min_samples_leaf (int, default 1) – Minimum number of samples required at leaf nodes.

  • min_samples_split (int, default 2) – Minimum number of samples required to split an internal node.

  • min_impurity_decrease (float, default 0.0) – Minimum required decrease in impurity to create a split.

  • ccp_alpha (float, default 0.0) – Complexity parameter for Minimal Cost-Complexity Pruning.

  • categories (List[int] or None, default None) – Indices of categorical features in the dataset.

  • random_state (int or None, default None) –

    Seed for random number generation in oblique splits.

    • Only used when use_oblique=True.

  • n_pair (int, default 2) –

    Number of features to combine in oblique splits.

    • Only used when use_oblique=True.

  • top_k (int or None, default None) –

    Number of numeric features kept after cheap oblique feature screening.

    • If None, an internal heuristic is used.

    • Only used when use_oblique=True.

  • gamma (float, default 1.0) –

    Separation strength parameter for oblique splits.

    • Only used when use_oblique=True.

  • max_iter (int, default 100) –

    Maximum iterations for L-BFGS optimization in oblique splits.

    • Only used when use_oblique=True.

  • relative_change (float, default 0.001) –

    Early stopping threshold for L-BFGS optimization.

    • Only used when use_oblique=True.

  • linear_leaf (bool, default False) –

    If True, replace the constant leaf (the weighted mean of y) with a weighted ordinary-least-squares model fit on the leaf samples. In that case, predict returns intercept + sum(coef * x_numeric).

    Only numeric features participate in the leaf coefficients (categorical features in categories are excluded; the tree splits already capture their structure). See Notes on BaseTree for the full mechanism and fallback rules.

  • leaf_l2 (float, default 1e-6) – L2 (ridge) penalty on the leaf coefficients. A value of 0.0 is honored exactly (true unregularized OLS); in that case, a leaf whose numeric features are rank-deficient falls back to the mean. Pass a small positive value (e.g. 1e-8) to keep the linear fit when features are correlated.

  • leaf_max_iter (int, default 100) – Unused for regression, because OLS has a closed-form solution. Kept so that Regressor and Classifier share the same constructor signature.
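
The linear_leaf and leaf_l2 descriptions above can be illustrated with a small sketch of a weighted ridge fit on a single numeric feature. This is not the library's implementation: the function names are hypothetical, the intercept is assumed to be unpenalized, and the fallback check is a simplified stand-in for the rules described in the Notes on BaseTree.

```python
from typing import Sequence, Tuple


def fit_linear_leaf(
    x: Sequence[float],
    y: Sequence[float],
    w: Sequence[float],
    leaf_l2: float = 1e-6,
) -> Tuple[float, float]:
    """Weighted ridge fit of y on one feature; returns (intercept, coef)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    denom = s_xx + leaf_l2
    if denom == 0.0:
        # Rank-deficient feature with no penalty: fall back to the weighted mean.
        return y_bar, 0.0
    coef = s_xy / denom
    return y_bar - coef * x_bar, coef


def predict_linear_leaf(intercept: float, coef: float, x_value: float) -> float:
    """Leaf prediction: intercept + coef * x_numeric."""
    return intercept + coef * x_value


# Data that follows y = 1 + 2x exactly, with uniform weights and no penalty.
intercept, coef = fit_linear_leaf(
    [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], [1.0] * 4, leaf_l2=0.0
)

# A constant feature with leaf_l2=0.0 cannot support a slope, so the mean is used.
fallback = fit_linear_leaf([2.0, 2.0, 2.0], [1.0, 2.0, 3.0], [1.0] * 3, leaf_l2=0.0)
```

With a positive leaf_l2 the denominator can never reach zero, which is why a small penalty keeps the linear fit in place of the mean fallback.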

apply(X)

Return the index of the leaf that each sample ends up in.

Parameters:

X (array-like of shape (n_samples, n_features)) – The input samples.

Returns:

X_leaves – For each sample x in X, the index of the leaf that x ends up in. Nodes are numbered in pre-order (depth-first) traversal order, so leaf indices are not necessarily consecutive.

Return type:

numpy.ndarray of shape (n_samples,)
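
To make the pre-order numbering concrete, here is a minimal sketch on a hand-built toy tree. The Node class, the axis-aligned routing rule, the root index of 0, and visiting the left child before the right one are all assumptions made for illustration; they are not taken from the library's internals.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Node:
    feature: int = -1
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    node_id: int = -1

    @property
    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def number_preorder(node: Node, next_id: int = 0) -> int:
    """Assign ids in pre-order: the node itself, then its left subtree, then its right."""
    node.node_id = next_id
    next_id += 1
    if not node.is_leaf:
        next_id = number_preorder(node.left, next_id)
        next_id = number_preorder(node.right, next_id)
    return next_id


def apply_one(root: Node, x: Sequence[float]) -> int:
    """Route one sample down the tree and return the id of the leaf it reaches."""
    node = root
    while not node.is_leaf:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.node_id


# Root (id 0) splits on feature 0; its left child (id 1) splits on feature 1.
root = Node(
    feature=0,
    threshold=5.0,
    left=Node(feature=1, threshold=1.0, left=Node(), right=Node()),
    right=Node(),
)
number_preorder(root)

samples: List[List[float]] = [[0.0, 0.0], [0.0, 2.0], [9.0, 0.0]]
leaf_ids = [apply_one(root, x) for x in samples]
```

In this toy tree the internal nodes take ids 0 and 1, so the three leaves are numbered 2, 3, and 4.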

fit(X, y, sample_weight=None)

Build a decision tree regressor from the training set (X, y).

Parameters:
  • X (array-like of shape (n_samples, n_features)) – The training input samples.

  • y (array-like of shape (n_samples,)) – Target values.

  • sample_weight (array-like of shape (n_samples,), optional, default None) – Sample weights.

Returns:

self – Fitted estimator.

Return type:

Regressor
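
The linear_leaf description states that a constant leaf predicts the weighted mean of y. The sketch below shows how sample weights shift that value. The function name is hypothetical, and treating sample_weight=None as uniform weights is an assumption rather than documented behavior.

```python
from typing import Optional, Sequence


def constant_leaf_value(
    y: Sequence[float], sample_weight: Optional[Sequence[float]] = None
) -> float:
    """Weighted mean of the targets that fall into one leaf."""
    if sample_weight is None:
        # Assumption: no weights means every sample counts equally.
        sample_weight = [1.0] * len(y)
    total = sum(sample_weight)
    return sum(w * yi for w, yi in zip(sample_weight, y)) / total


y_leaf = [1.0, 2.0, 6.0]
unweighted = constant_leaf_value(y_leaf)                 # (1 + 2 + 6) / 3
weighted = constant_leaf_value(y_leaf, [1.0, 1.0, 2.0])  # (1 + 2 + 12) / 4
```

Doubling the weight of the last sample pulls the leaf value toward its target, from 3.0 to 3.75.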

predict(X)

Predict regression targets for X.

Parameters:

X (array-like of shape (n_samples, n_features)) – The input samples to predict.

Returns:

y – The predicted values.

Return type:

numpy.ndarray of shape (n_samples,)