| Property | Value |
|---|---|
| URL | https://catboost.ai/docs/en/references/training-parameters/common |
| Last Crawled | 2026-04-15 22:20:28 (1 day ago) |
| First Indexed | 2024-11-18 17:11:24 (1 year ago) |
| HTTP Status Code | 200 |
| Meta Title | Common parameters | CatBoost |
| Meta Description | loss_function. Command-line: --loss-function. Alias: objective. Description. |
| Meta Canonical | null |
| Boilerpipe Text | loss_function
Command-line:
--loss-function
Alias:
objective
Description
The metric to use in training. The specified value also determines the machine learning problem to solve. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).
Format:
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
Supported metrics
RMSE
Logloss
MAE
CrossEntropy
Quantile
LogLinQuantile
Lq
MultiRMSE
MultiClass
MultiClassOneVsAll
MultiLogloss
MultiCrossEntropy
MAPE
Poisson
PairLogit
PairLogitPairwise
QueryRMSE
QuerySoftMax
GroupQuantile
Tweedie
YetiRank
YetiRankPairwise
StochasticFilter
StochasticRank
A custom Python object can also be set as the value of this parameter (see an example).
For example, use the following construction to calculate the value of Quantile with the coefficient $\alpha = 0.1$:
Quantile:alpha=0.1
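The format above can be assembled programmatically. The following is a minimal sketch; `format_metric` is a hypothetical helper written for illustration and is not part of the CatBoost API.

```python
def format_metric(name, **params):
    """Build '<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]'."""
    if not params:
        return name
    joined = ";".join(f"{key}={value}" for key, value in params.items())
    return f"{name}:{joined}"


quantile_loss = format_metric("Quantile", alpha=0.1)
print(quantile_loss)  # Quantile:alpha=0.1
```

The resulting string has the same shape as the value expected by loss_function, custom_metric and eval_metric.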
Type
string
object
Default value
Python package
Depends on the class:
CatBoostClassifier: Logloss if the target_border parameter value differs from None. Otherwise, the default loss function depends on the number of unique target values and is either set to Logloss or MultiClass.
CatBoost and CatBoostRegressor: RMSE
R package, Command-line
RMSE
Supported processing units
CPU and GPU
custom_metric
Command-line:
--custom-metric
Description
Metric values to output during training. These functions are not optimized and are displayed for informational purposes only. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).
Format:
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
Supported metrics
Examples
Calculate the value of CrossEntropy:
CrossEntropy
Calculate the value of Quantile with the coefficient $\alpha = 0.1$:
Quantile:alpha=0.1
Calculate the values of Logloss and AUC:
['Logloss', 'AUC']
Values of all custom metrics for learn and validation datasets are saved to the Metric output files (learn_error.tsv and test_error.tsv respectively). The directory for these files is specified in the --train-dir (train_dir) parameter.
Use the visualization tools to see a live chart with the dynamics of the specified metrics.
Type
string
list of strings
Default value
Python package
None
R package
None
Command-line
None (do not output additional metric values)
Supported processing units
CPU and GPU
eval_metric
Command-line:
--eval-metric
Description
The metric used for overfitting detection (if enabled) and best model selection (if enabled). Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).
Format:
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
Supported metrics
A user-defined function can also be set as the value (see an example).
Examples:
R2
Type
string
object
Default value
Optimized objective is used
Supported processing units
CPU and GPU
iterations
Command-line:
-i, --iterations
Aliases:
num_boost_round, n_estimators, num_trees
Description
The maximum number of trees that can be built when solving machine learning problems.
When using other parameters that limit the number of iterations, the final number of trees may be less than the number specified in this parameter.
Type
int
Default value
1000
Supported processing units
CPU and GPU
learning_rate
Command-line:
-w, --learning-rate
Alias:
eta
Description
The learning rate.
Used for reducing the gradient step.
Type
float
Default value
The default value is defined automatically for the Logloss, MultiClass and RMSE loss functions depending on the number of iterations if none of the parameters leaf_estimation_iterations, leaf_estimation_method, l2_leaf_reg is set. In this case, the selected learning rate is printed to stdout and saved in the model.
In other cases, the default value is 0.03.
Supported processing units
CPU and GPU
random_seed
Command-line:
-r, --random-seed
Alias:
random_state
Description
The random seed used for training.
Type
int
Default value
Python package
None (0)
R package, Command-line
0
Supported processing units
CPU and GPU
l2_leaf_reg
Command-line:
--l2-leaf-reg, --l2-leaf-regularizer
Alias:
reg_lambda
Description
Coefficient at the L2 regularization term of the cost function.
Any positive value is allowed.
Type
float
Default value
3.0
Supported processing units
CPU and GPU
bootstrap_type
Command-line:
--bootstrap-type
Description
Bootstrap type. Defines the method for sampling the weights of objects.
Supported methods:
Bayesian
Bernoulli
MVS
Poisson (supported for GPU only)
No
Type
string
Default value
The default value depends on objective, task_type, bagging_temperature and sampling_unit:
When the objective parameter is QueryCrossEntropy, YetiRankPairwise, or PairLogitPairwise and the bagging_temperature parameter is not set: Bernoulli with the subsample parameter set to 0.5.
When the objective is neither MultiClass nor MultiClassOneVsAll, task_type = CPU and sampling_unit = Object: MVS with the subsample parameter set to 0.8.
Otherwise: Bayesian.
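The three rules above can be read as a small decision function. The sketch below is illustrative only: `default_bootstrap` is a hypothetical helper, and it assumes the rules are checked in the order in which they are listed.

```python
PAIRWISE_OBJECTIVES = {"QueryCrossEntropy", "YetiRankPairwise", "PairLogitPairwise"}
MULTICLASS_OBJECTIVES = {"MultiClass", "MultiClassOneVsAll"}


def default_bootstrap(objective, task_type="CPU", sampling_unit="Object",
                      bagging_temperature=None):
    """Return (bootstrap_type, subsample) following the three listed rules."""
    # Rule 1: pairwise objectives without an explicit bagging_temperature.
    if objective in PAIRWISE_OBJECTIVES and bagging_temperature is None:
        return "Bernoulli", 0.5
    # Rule 2: non-multiclass objectives on CPU with per-object sampling.
    if (objective not in MULTICLASS_OBJECTIVES
            and task_type == "CPU" and sampling_unit == "Object"):
        return "MVS", 0.8
    # Rule 3: everything else (subsample does not apply to Bayesian).
    return "Bayesian", None


print(default_bootstrap("RMSE"))
print(default_bootstrap("MultiClass"))
```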
Supported processing units
CPU and GPU
bagging_temperature
Command-line:
--bagging-temperature
Description
Defines the settings of the Bayesian bootstrap. It is used by default in classification and regression modes.
Use the Bayesian bootstrap to assign random weights to objects.
The weights are sampled from an exponential distribution if the value of this parameter is set to 1. All weights are equal to 1 if the value of this parameter is set to 0.
Possible values are in the range $[0; \infty)$. The higher the value, the more aggressive the bagging is.
This parameter can be used if the selected bootstrap type is Bayesian.
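The two endpoints described above (exponential weights at 1, all weights equal to 1 at 0) are consistent with raising a standard exponential sample to the power of bagging_temperature. The sketch below assumes that form, $w = (-\ln u)^{t}$ with $u$ uniform on $(0, 1]$. This page does not state the exact formula, so treat it as an illustration rather than the library's implementation; `bayesian_bootstrap_weights` is a hypothetical name.

```python
import math
import random


def bayesian_bootstrap_weights(n_objects, bagging_temperature=1.0, seed=0):
    """Sample one weight per object as (-ln u) ** bagging_temperature (assumed form)."""
    rng = random.Random(seed)
    weights = []
    for _ in range(n_objects):
        u = 1.0 - rng.random()  # u lies in (0, 1], so log(u) is finite
        weights.append((-math.log(u)) ** bagging_temperature)
    return weights


print(bayesian_bootstrap_weights(3, bagging_temperature=0.0))  # all weights are 1.0
print(bayesian_bootstrap_weights(3, bagging_temperature=1.0))  # exponential samples
```

Under this assumption, values above 1 stretch the spread of the weights, which matches the statement that higher values make the bagging more aggressive.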
Type
float
Default value
1
Supported processing units
CPU and GPU
subsample
Command-line:
--subsample
Description
Sample rate for bagging.
This parameter can be used if one of the following bootstrap types is selected:
Poisson
Bernoulli
MVS
Type
float
Default value
The default value depends on the dataset size and the bootstrap type:
Datasets with fewer than 100 objects: 1
Datasets with 100 objects or more:
Poisson, Bernoulli: 0.66
MVS: 0.8
Supported processing units
CPU and GPU
sampling_frequency
Command-line:
--sampling-frequency
Description
Frequency to sample weights and objects when building trees.
Supported values:
PerTree: Before constructing each new tree
PerTreeLevel: Before choosing each new split of a tree
Type
string
Default value
PerTreeLevel
Supported processing units
CPU
sampling_unit
Command-line:
--sampling-unit
Description
The sampling scheme.
Possible values:
Object: The weight $w_{i}$ of the i-th object $o_{i}$ is used for sampling the corresponding object.
Group: The weight $w_{j}$ of the group $g_{j}$ is used for sampling each object $o_{i_{j}}$ from the group $g_{j}$.
Type
string
Default value
Object
Supported processing units
CPU and GPU
mvs_reg
Command-line:
--mvs-reg
Description
Affects the weight of the denominator and can be used for balancing between importance sampling and Bernoulli sampling (setting it to 0 implies importance sampling, and setting it to $\infty$ implies Bernoulli sampling).
Note
This parameter is supported only for the MVS sampling method (the bootstrap_type parameter must be set to MVS).
Type
float
Default value
The value is set based on the gradient distribution on the current iteration
Supported processing units
CPU
random_strength
Command-line:
--random-strength
Description
The amount of randomness to use for scoring splits when the tree structure is selected. Use this parameter to avoid overfitting the model.
The value of this parameter is used when selecting splits. On every iteration each possible split gets a score (for example, the score indicates how much adding this split will improve the loss function for the training dataset). The split with the highest score is selected.
By themselves, the scores have no randomness. To introduce it, a normally distributed random variable is added to the score of the feature. It has zero mean and a variance that decreases during training. The value of this parameter is the multiplier of the variance.
Note
This parameter is not supported for the following loss functions:
QueryCrossEntropy
YetiRankPairwise
PairLogitPairwise
Type
float
Default value
1
Supported processing units
CPU
use_best_model
Command-line:
--use-best-model
Description
If this parameter is set, the number of trees that are saved in the resulting model is defined as follows:
Build the number of trees defined by the training parameters.
Use the validation dataset to identify the iteration with the optimal value of the metric specified in --eval-metric (eval_metric).
No trees are saved after this iteration.
This option requires a validation dataset to be provided.
Type
bool
Default value
True if a validation set is input (the eval_set parameter is defined) and at least one of the label values of objects in this set differs from the others. False otherwise.
Supported processing units
CPU and GPU
best_model_min_trees
Command-line:
--best-model-min-trees
Description
The minimal number of trees that the best model should have. If set, the output model contains at least the given number of trees even if the optimal value of the evaluation metric on the validation dataset is achieved with a smaller number of trees.
Should be used with the --use-best-model parameter.
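The interaction between use_best_model and best_model_min_trees can be sketched as follows. `trees_in_best_model` is a hypothetical helper, and the sketch assumes that the best iteration is searched only among iterations that already contain the required minimum number of trees. This page guarantees only the "at least" property, not the exact search procedure.

```python
def trees_in_best_model(metric_values, minimize=True, best_model_min_trees=None):
    """Return how many trees to keep, given one validation metric value per iteration."""
    # Index of the first iteration that already has the required number of trees.
    first = max((best_model_min_trees or 1) - 1, 0)
    if first >= len(metric_values):
        raise ValueError("best_model_min_trees exceeds the number of built trees")
    pick = min if minimize else max
    best_iteration = pick(range(first, len(metric_values)),
                          key=lambda i: metric_values[i])
    return best_iteration + 1  # iterations are zero-based, tree counts are not


validation_loss = [0.9, 0.5, 0.6, 0.55]
print(trees_in_best_model(validation_loss))                          # 2
print(trees_in_best_model(validation_loss, best_model_min_trees=3))  # 4
```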
Type
int
Default value
Python package, R package
None (The minimal number of trees for the best model is not set)
Command-line
The minimal number of trees for the best model is not set
Supported processing units
CPU and GPU
depth
Command-line:
-n, --depth
Alias:
max_depth
Description
Depth of the trees.
The range of supported values depends on the processing unit type and the type of the selected loss function:
CPU: Any integer up to 16.
GPU: Any integer up to 8 for pairwise modes (YetiRank, PairLogitPairwise, and QueryCrossEntropy), and up to 16 for all other loss functions.
Type
int
Default value
6 (16 if the growing policy is set to Lossguide)
Supported processing units
CPU and GPU
grow_policy
Command-line:
--grow-policy
Description
The tree growing policy. Defines how to perform greedy tree construction.
Possible values:
SymmetricTree: A tree is built level by level until the specified depth is reached. On each iteration, all leaves from the last tree level are split with the same condition. The resulting tree structure is always symmetric.
Depthwise: A tree is built level by level until the specified depth is reached. On each iteration, all non-terminal leaves from the last tree level are split. Each leaf is split by the condition with the best loss improvement.
Note
Models with this growing policy cannot be analyzed using the PredictionDiff feature importance and can be exported only to json and cbm.
Lossguide: A tree is built leaf by leaf until the specified maximum number of leaves is reached. On each iteration, the non-terminal leaf with the best loss improvement is split.
Note
Models with this growing policy cannot be analyzed using the PredictionDiff feature importance and can be exported only to json and cbm.
Type
string
Default value
SymmetricTree
Supported processing units
CPU and GPU
min_data_in_leaf
Command-line:
--min-data-in-leaf
Alias:
min_child_samples
Description
The minimum number of training samples in a leaf. CatBoost does not search for new splits in leaves with a sample count less than the specified value.
Can be used only with the Lossguide and Depthwise growing policies.
Type
int
Default value
1
Supported processing units
CPU and GPU
max_leaves
Command-line:
--max-leaves
Alias:
num_leaves
Description
The maximum number of leaves in the resulting tree. Can be used only with the Lossguide growing policy.
Note
It is not recommended to use values greater than 64, since it can significantly slow down the training process.
Type
int
Default value
31
Supported processing units
CPU and GPU
ignored_features
Command-line:
-I, --ignore-features
Description
Feature indices to exclude from the training.
Python package
It is assumed that all passed values are feature names if at least one of the passed values can not be converted to a number or a range of numbers. Otherwise, it is assumed that all passed values are feature indices.
Specifics:
Non-negative indices that do not match any features are successfully ignored. For example, if five features are defined for the objects in the dataset and this parameter is set to 42, the corresponding non-existing feature is successfully ignored.
The identifier corresponds to the feature's index. Feature indices used in train and feature importance are numbered from 0 to featureCount - 1. If a file is used as input data then any non-feature column types are ignored when calculating these indices. For example, each row in the input file contains data in the following order: cat feature<\t>label value<\t>num feature. So for the row rock<\t>0<\t>42, the identifier for the rock feature is 0, and for the 42 feature it's 1.
For example, use the following construction if features indexed 1, 2, 7, 42, 43, 44, 45 should be ignored:
[1,2,7,42,43,44,45]
R package
Specifics:
Non-negative indices that do not match any features are successfully ignored. For example, if five features are defined for the objects in the dataset and this parameter is set to 42, the corresponding non-existing feature is successfully ignored.
The identifier corresponds to the feature's index. Feature indices used in train and feature importance are numbered from 0 to featureCount - 1. If a file is used as input data then any non-feature column types are ignored when calculating these indices. For example, each row in the input file contains data in the following order: cat feature<\t>label value<\t>num feature. So for the row rock<\t>0<\t>42, the identifier for the rock feature is 0, and for the 42 feature it's 1.
For example, if training should exclude features with the identifiers 1, 2, 7, 42, 43, 44, 45, the value of this parameter should be set to c(1,2,7,42,43,44,45).
Command-line
It is assumed that all passed values are feature names if at least one of the passed values can not be converted to a number or a range of numbers. Otherwise, it is assumed that all passed values are feature indices.
Specifics:
Non-negative indices that do not match any features are successfully ignored. For example, if five features are defined for the objects in the dataset and this parameter is set to 42, the corresponding non-existing feature is successfully ignored.
The identifier corresponds to the feature's index. Feature indices used in train and feature importance are numbered from 0 to featureCount - 1. If a file is used as input data then any non-feature column types are ignored when calculating these indices. For example, each row in the input file contains data in the following order: cat feature<\t>label value<\t>num feature. So for the row rock<\t>0<\t>42, the identifier for the rock feature is 0, and for the 42 feature it's 1.
For example, if training should exclude features with the identifiers 1, 2, 7, 42, 43, 44, 45, use the following construction: 1:2:7:42-45.
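The command-line construction 1:2:7:42-45 is equivalent to the explicit index list used in the Python package. The sketch below expands one form into the other; `parse_ignored_features` is a hypothetical helper that handles numeric indices and inclusive ranges only, not feature names.

```python
def parse_ignored_features(spec):
    """Expand a string such as '1:2:7:42-45' into a list of feature indices."""
    indices = []
    for part in spec.split(":"):
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
            indices.extend(range(start, end + 1))  # ranges are inclusive
        else:
            indices.append(int(part))
    return indices


print(parse_ignored_features("1:2:7:42-45"))  # [1, 2, 7, 42, 43, 44, 45]
```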
Default value
Python package, R package
None
Command-line
Omitted
Supported processing units
CPU and GPU
one_hot_max_size
Command-line:
--one-hot-max-size
Description
Use one-hot encoding for all categorical features with a number of different values less than or equal to the given parameter value. Ctrs are not calculated for such features.
See details.
Type
int
Default value
The default value depends on various conditions:
N/A if training is performed on CPU in Pairwise scoring mode
Read more about Pairwise scoring
The following loss functions use Pairwise scoring:
YetiRankPairwise
PairLogitPairwise
QueryCrossEntropy
Pairwise scoring is slightly different from regular training on pairs, since pairs are generated only internally during the training for the corresponding metrics. One-hot encoding is not available for these loss functions.
255 if training is performed on GPU and the selected Ctr types require target data that is not available during the training
10 if training is performed in Ranking mode
2 if none of the conditions above is met
Supported processing units
CPU and GPU
has_time
Command-line:
--has-time
Description
Use the order of objects in the input data (do not perform random permutations during the Transforming categorical features to numerical features and Choosing the tree structure stages).
The Timestamp column type is used to determine the order of objects if specified in the input data.
Type
bool
Default value
False (not used; generates random permutations)
Supported processing units
CPU and GPU
rsm
Command-line:
--rsm
Alias:
colsample_bylevel
Description
Random subspace method. The percentage of features to use at each split selection, when features are selected over again at random.
The value must be in the range (0;1].
Type
float (0;1]
Default value
None (set to 1)
Supported processing units
CPU; GPU for pairwise ranking
nan_mode
Command-line:
--nan-mode
Description
The method for processing missing values in the input dataset.
Possible values:
"Forbidden": Missing values are not supported, their presence is interpreted as an error.
"Min": Missing values are processed as the minimum value (less than all other values) for the feature. It is guaranteed that a split that separates missing values from all other values is considered when selecting trees.
"Max": Missing values are processed as the maximum value (greater than all other values) for the feature. It is guaranteed that a split that separates missing values from all other values is considered when selecting trees.
Using the Min or Max value of this parameter guarantees that a split between missing values and other values is considered when selecting a new split in the tree.
Type
string
Default value
Min
Supported processing units
CPU and GPU
input_borders
Command-line:
--input-borders-file
Description
Load Custom quantization borders and missing value modes from a file (do not generate them).
Borders are automatically generated before training if this parameter is not set.
Type
string
Default value
Python package
None
Command-line
The file is not loaded, the values are generated
Supported processing units
CPU and GPU
output_borders
Command-line:
--output-borders-file
Description
Save quantization borders for the current dataset to a file.
Refer to the file format description.
Type
string
Default value
Python package
None
Command-line
The file is not saved
Supported processing units
CPU and GPU
fold_permutation_block
Command-line:
--fold-permutation-block
Description
Objects in the dataset are grouped in blocks before the random permutations. This parameter defines the size of the blocks. The smaller the value, the slower the training. Large values may result in quality degradation.
Type
int
Default value
Python package
1
R package, Command-line
Default value differs depending on the dataset size and ranges from 1 to 256 inclusively
Supported processing units
CPU and GPU
leaf_estimation_method
Command-line:
--leaf-estimation-method
Description
The method used to calculate the values in leaves.
Possible values:
Newton
Gradient
Exact
Type
string
Default value
Depends on the mode and the selected loss function:
Regression with Quantile or MAE loss functions: One Exact iteration.
Regression with any loss function but Quantile or MAE: One Gradient iteration.
Classification mode: Ten Newton iterations.
Multiclassification mode: One Newton iteration.
Supported processing units
CPU and GPU
leaf_estimation_iterations
Command-line:
--leaf-estimation-iterations
Description
CatBoost can calculate leaf values using several gradient or Newton steps instead of a single one.
This parameter regulates how many steps are done in every tree when calculating leaf values.
Type
int
Default value
Python package
None (Depends on the training objective)
R package, Command-line
Depends on the training objective
Supported processing units
CPU and GPU
leaf_estimation_backtracking
Command-line:
--leaf-estimation-backtracking
Description
When the value of the leaf_estimation_iterations parameter is greater than 1, CatBoost makes several gradient or Newton steps when calculating the resulting leaf values of a tree.
The behaviour differs depending on the value of this parameter:
No: Every next step is a regular gradient or Newton step: the gradient step is calculated and added to the leaf.
Any other value: Backtracking is used.
In this case, before adding a step, a condition is checked. If the condition is not met, then the step size is reduced (divided by 2), otherwise the step is added to the leaf.
When leaf_estimation_iterations for the Command-line version is set to n, the leaf estimation iterations are calculated as follows: each iteration is either an addition of the next step to the leaf value, or it's a scaling of the leaf value. Scaling counts as a separate iteration. Thus, it is possible that instead of having n gradient steps, the algorithm makes a single gradient step that is reduced n times, which means that it is divided by $2\cdot n$ times.
Possible values:
No: Do not use backtracking. Supported on CPU and GPU.
AnyImprovement: Reduce the descent step up to the point when the loss function value is smaller than it was on the previous step. The trial reduction factors are 2, 4, 8, and so on. Supported on CPU and GPU.
Armijo: Reduce the descent step until the Armijo condition is met. Supported only on GPU.
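The AnyImprovement behaviour can be illustrated on a one-dimensional toy problem. This is a sketch of the idea only, not CatBoost's implementation: `backtracking_step`, `squared_error` and the `max_reductions` limit are hypothetical names introduced for the example.

```python
def backtracking_step(loss, value, step, max_reductions=10):
    """Try step, step/2, step/4, ... and apply the first one that lowers the loss."""
    current = loss(value)
    for _ in range(max_reductions + 1):
        if loss(value + step) < current:
            return value + step  # condition met: add the step to the leaf value
        step /= 2.0              # condition not met: halve the step and retry
    return value                 # no improving step was found


def squared_error(x):
    # Toy loss with its minimum at x = 1.
    return (x - 1.0) ** 2


new_value = backtracking_step(squared_error, value=0.0, step=4.0)
print(new_value)  # 1.0: steps of 4 and 2 are rejected, the step of 1 is accepted
```

With the value No, the first step of 4 would be applied directly, moving the toy leaf value from 0 to 4 and increasing the loss from 1 to 9.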
Type
string
Default value
AnyImprovement
Supported processing units
Depends on the selected value
fold_len_multiplier
Command-line:
--fold-len-multiplier
Description
Coefficient for changing the length of folds.
The value must be greater than 1. The best validation result is achieved with minimum values.
With values close to 1 (for example, $1+\epsilon$), each iteration takes a quadratic amount of memory and time for the number of objects in the iteration. Thus, low values are possible only when there is a small number of objects.
Type
float
Default value
2
Supported processing units
CPU and GPU
approx_on_full_history
Command-line:
--approx-on-full-history
Description
The principles for calculating the approximated values.
Possible values:
False: Use only a fraction of the fold for calculating the approximated values. The size of the fraction is calculated as follows: $\frac{1}{X}$, where $X$ is the specified coefficient for changing the length of folds. This mode is faster and in rare cases slightly less accurate.
True: Use all the preceding rows in the fold for calculating the approximated values. This mode is slower and in rare cases slightly more accurate.
Type
bool
Default value
Python package, Command-line
False
R package
True
Supported processing units
CPU
class_weights
Command-line:
--class-weights
Description
Class weights. The values are used as multipliers for the object weights. This parameter can be used for solving binary classification and multiclassification problems.
Python package
Note
For imbalanced datasets with binary classification the weight multiplier can be set to 1 for class 0 and to $\left(\frac{sum\_negative}{sum\_positive}\right)$ for class 1.
For example, class_weights=[0.1, 4] multiplies the weights of objects from class 0 by 0.1 and the weights of objects from class 1 by 4.
If class labels are not standard consecutive integers [0, 1 ... class_count-1], use the dict or collections.OrderedDict type with label to weight mapping.
For example, class_weights={'a': 1.0, 'b': 0.5, 'c': 2.0} multiplies the weights of objects with class label a by 1.0, the weights of objects with class label b by 0.5 and the weights of objects with class label c by 2.0.
The dictionary form can also be used with standard consecutive integer class labels for additional readability. For example: class_weights={0: 1.0, 1: 0.5, 2: 2.0}.
Note
Class labels are extracted from dictionary keys for the following types of class_weights:
dict
collections.OrderedDict (when the order of classes in the model is important)
The class_names parameter can be skipped when using these types.
Alert
Do not use this parameter with auto_class_weights and scale_pos_weight.
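The effect of the list and dictionary forms on per-object weights can be sketched in plain Python. `apply_class_weights` is a hypothetical helper that only mirrors the multiplication described above; it does not call CatBoost.

```python
def apply_class_weights(labels, class_weights, object_weights=None):
    """Multiply each object's weight by the weight of its class."""
    if object_weights is None:
        object_weights = [1.0] * len(labels)
    if isinstance(class_weights, dict):
        lookup = class_weights                   # label -> weight mapping
    else:
        lookup = dict(enumerate(class_weights))  # list position == integer label
    return [w * lookup[label] for label, w in zip(labels, object_weights)]


print(apply_class_weights([0, 1, 1], [0.1, 4]))
print(apply_class_weights(["a", "c"], {"a": 1.0, "b": 0.5, "c": 2.0}, [2.0, 3.0]))
```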
R package
For example, class_weights <- c(0.1, 4) multiplies the weights of objects from class 0 by 0.1 and the weights of objects from class 1 by 4.
Alert
Do not use this parameter with auto_class_weights.
Command-line
Note
The quantity of class weights must match the quantity of class names specified in the --class-names parameter and the number of classes specified in the --classes-count parameter.
For imbalanced datasets with binary classification the weight multiplier can be set to 1 for class 0 and to $\left(\frac{sum\_negative}{sum\_positive}\right)$ for class 1.
Format:
<value for class 1>,..,<value for class N>
For example:
0.85,1.2,1
Alert
Do not use this parameter with auto_class_weights.
Type
list
dict
collections.OrderedDict
Default value
None (the weight for all classes is set to 1)
Supported processing units
CPU and GPU
class_names
Description
Class names. Allows redefining the default values when using the MultiClass and Logloss metrics.
If the upper limit for the numeric class label is specified, the number of class names should match this value.
Warning
The quantity of class names must match the quantity of class weights specified in the --class-weights parameter and the number of classes specified in the --classes-count parameter.
Format:
<name for class 1>,..,<name for class N>
For example:
smartphone,touchphone,tablet
Type
list of strings
Default value
None
Supported processing units
CPU and GPU
auto_class_weights
Command-line:
--auto-class-weights
Description
Automatically calculate class weights based either on the total weight or the total number of objects in each class. The values are used as multipliers for the object weights.
Supported values:
None: All class weights are set to 1
Balanced:
$CW_k=\displaystyle\frac{\max_{c=1}^K(\sum_{t_{i}=c}{w_i})}{\sum_{t_{i}=k}{w_{i}}}$
SqrtBalanced:
$CW_k=\sqrt{\displaystyle\frac{\max_{c=1}^K(\sum_{t_i=c}{w_i})}{\sum_{t_i=k}{w_i}}}$
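Both formulas divide the largest per-class total weight by the total weight of class $k$; SqrtBalanced additionally takes the square root. A minimal sketch of that arithmetic follows. `auto_class_weights` here is a hypothetical function written for illustration, and objects without explicit weights are assumed to have $w_i = 1$.

```python
import math
from collections import defaultdict


def auto_class_weights(labels, object_weights=None, mode="Balanced"):
    """Compute CW_k for every class k from the per-object weights w_i."""
    if object_weights is None:
        object_weights = [1.0] * len(labels)
    totals = defaultdict(float)
    for label, weight in zip(labels, object_weights):
        totals[label] += weight          # sum of w_i over objects with t_i == label
    largest = max(totals.values())       # max over classes c of the class totals
    ratios = {label: largest / total for label, total in totals.items()}
    if mode == "Balanced":
        return ratios
    if mode == "SqrtBalanced":
        return {label: math.sqrt(ratio) for label, ratio in ratios.items()}
    raise ValueError(f"unsupported mode: {mode}")


labels = [0, 0, 0, 0, 1]
print(auto_class_weights(labels, mode="Balanced"))      # {0: 1.0, 1: 4.0}
print(auto_class_weights(labels, mode="SqrtBalanced"))  # {0: 1.0, 1: 2.0}
```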
Alert
Do not use this parameter with class_weights and scale_pos_weight.
Type
string
Default value
None: All class weights are set to 1
Supported processing units
CPU and GPU
scale_pos_weight
Description
The weight for class 1 in binary classification. The value is used as a multiplier for the weights of objects from class 1.
Note
For imbalanced datasets, the weight multiplier can be set to $\left(\frac{sum\_negative}{sum\_positive}\right)$.
Alert
Do not use this parameter with auto_class_weights and class_weights.
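The suggested multiplier can be computed directly from the training labels. In the sketch below, `suggested_scale_pos_weight` is a hypothetical helper, and sum_negative and sum_positive are assumed to be the total weights of the objects in class 0 and class 1 (plain object counts when no weights are given).

```python
def suggested_scale_pos_weight(labels, object_weights=None):
    """Return sum_negative / sum_positive for binary labels 0 and 1."""
    if object_weights is None:
        object_weights = [1.0] * len(labels)
    sum_negative = sum(w for y, w in zip(labels, object_weights) if y == 0)
    sum_positive = sum(w for y, w in zip(labels, object_weights) if y == 1)
    if sum_positive == 0:
        raise ValueError("the dataset contains no objects of class 1")
    return sum_negative / sum_positive


print(suggested_scale_pos_weight([0, 0, 0, 1]))  # 3.0
```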
Type
float
Default value
1.0
Supported processing units
CPU and GPU
boosting_type
Command-line:
--boosting-type
Description
Boosting scheme.
Possible values:
Ordered: Usually provides better quality on small datasets, but it may be slower than the Plain scheme.
Plain: The classic gradient boosting scheme.
Type
string
Default value
Depends on the processing unit type, the number of objects in the training dataset and the selected learning mode:
CPU: Plain
GPU:
Any number of objects, MultiClass or MultiClassOneVsAll mode: Plain
More than 50 thousand objects, any mode: Plain
Less than or equal to 50 thousand objects, any mode but MultiClass or MultiClassOneVsAll: Ordered
Supported processing units
CPU and GPU
Only the Plain mode is supported for the MultiClass loss on GPU
boost_from_average
Command-line:
--boost-from-average
Description
Initialize approximate values with the best constant value for the specified loss function. Sets the value of bias to the initial best constant value.
Available for the following loss functions:
RMSE
Logloss
CrossEntropy
Quantile
MAE
MAPE
Type
bool
Default value
Depends on the selected loss function:
True for RMSE, Quantile, MAE, MAPE
False for all other loss functions
Supported processing units
CPU and GPU
langevin
Command-line:
--langevin
Description
Enables the Stochastic Gradient Langevin Boosting mode.
Refer to the SGLB: Stochastic Gradient Langevin Boosting paper for details.
Type
bool
Default value
False
Supported processing units
CPU
diffusion_temperature
Command-line:
--diffusion-temperature
Description
The diffusion temperature of the Stochastic Gradient Langevin Boosting mode.
Only non-negative values are supported.
Type
float
Default value
10000
Supported processing units
CPU
posterior_sampling
Command-line:
--posterior-sampling
Description
If this parameter is set, several options are specified as follows and model parameters are checked to obtain uncertainty predictions with good theoretical properties.
Specifies options:
Langevin: true,
DiffusionTemperature: objects in learn pool count,
ModelShrinkRate: 1 / (2. * objects in learn pool count).
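The arithmetic behind these options is simple enough to spell out. The sketch below only reproduces the three values listed above for a given learn pool size; `posterior_sampling_options` is a hypothetical helper, not a CatBoost function.

```python
def posterior_sampling_options(learn_object_count):
    """Return the options implied by posterior_sampling for a learn pool of this size."""
    return {
        "Langevin": True,
        "DiffusionTemperature": learn_object_count,
        "ModelShrinkRate": 1 / (2.0 * learn_object_count),
    }


print(posterior_sampling_options(1000))
```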
Type
bool
Default value
False
Supported processing units
CPU only
allow_const_label
Command-line:
--allow-const-label
Description
Use it to train models with datasets that have equal label values for all objects.
Type
bool
Default value
False
Supported processing units
CPU and GPU
score_function
Command-line:
--score-function
Description
The score type used to select the next split during the tree construction.
Possible values:
Cosine (do not use this score type with the Lossguide tree growing policy)
L2
NewtonCosine (do not use this score type with the Lossguide tree growing policy)
NewtonL2
Type
string
Default value
Cosine
Supported processing units
The supported score functions vary depending on the processing unit type:
GPU: All score types
CPU: Cosine, L2
monotone_constraints
Command-line:
--monotone-constraints
Description
Impose monotonic constraints on numerical features.
Possible values:
1: Increasing constraint on the feature. The algorithm forces the model to be a non-decreasing function of this feature.
-1: Decreasing constraint on the feature. The algorithm forces the model to be a non-increasing function of this feature.
0: Constraints are disabled.
Supported formats for setting the value of this parameter (all feature indices are zero-based):
Set constraints individually for each feature as a string (the number of features is n).
Format
"(<constraint_0>, <constraint_1>, .., <constraint_n-1>)"
Zero constraints for features at the end of the list may be dropped.
In monotone_constraints = "(1,0,-1)" an increasing constraint is set on the first feature and a decreasing one on the third. Constraints are disabled for all other features.
Set constraints individually for each explicitly specified feature as a string (the number of features is n).
Format
"<feature index or name>:<constraint>, .., <feature index or name>:<constraint>"
These examples
monotone-constraints = "2:1,4:-1"
monotone-constraints = "Feature2:1,Feature4:-1"
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
Set constraints individually for each required feature as an array or a dictionary (the number of features is n).
Format
[<constraint_0>, <constraint_1>, .., <constraint_n-1>]
{"<feature index or name>":<constraint>, .., "<feature index or name>":<constraint>}
Array examples
monotone_constraints = [1, 0, -1]
These dictionary examples
monotone_constraints = {"Feature2": 1, "Feature4": -1}
monotone_constraints = {"2": 1, "4": -1}
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
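The string, array and dictionary formats for monotone_constraints describe the same information. A minimal sketch of that equivalence follows, assuming a hypothetical `parse_constraints` helper written for this illustration; CatBoost performs its own parsing, and this code is not taken from the library.

```python
def parse_constraints(value):
    """Normalize a monotone_constraints value to {feature index or name: constraint}.

    Hypothetical helper for illustration only. Disabled (zero) constraints
    are omitted from the result.
    """
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, (list, tuple)):
        items = enumerate(value)
    elif value.startswith("("):
        # "(1,0,-1)": positional form, one constraint per zero-based feature index
        items = enumerate(value.strip("()").split(","))
    else:
        # "2:1,4:-1" or "Feature2:1,Feature4:-1": explicit form
        items = (pair.rsplit(":", 1) for pair in value.split(","))

    result = {}
    for key, constraint in items:
        key = str(key).strip()
        constraint = int(constraint)
        if constraint not in (-1, 0, 1):
            raise ValueError(f"constraint must be -1, 0 or 1, got {constraint}")
        if constraint != 0:
            # Numeric keys are treated as feature indices, anything else as a name.
            result[int(key) if key.isdigit() else key] = constraint
    return result


print(parse_constraints("(1,0,-1)"))
print(parse_constraints("2:1,4:-1"))
print(parse_constraints({"Feature2": 1, "Feature4": -1}))
```

Mapping indices to feature names (for example, index 2 to Feature2) needs the dataset's column list, so the sketch keeps the two key kinds separate.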
Type
list of strings
string
dict
list
Default value
Python package, R package
None
Command-line
Omitted
Supported processing units
CPU
feature_weights
Command-line:
--feature-weights
Description
Per-feature multiplication weights used when choosing the best split. The score of each candidate is multiplied by the weights of features from the current split.
Non-negative float values are supported for each weight.
Supported formats for setting the value of this parameter:
Set the multiplication weight for each feature as a string (the number of features is n).
Format
"(<feature-weight_0>,<feature-weight_1>,..,<feature-weight_n-1>)"
Note
Spaces between values are not allowed.
Values should be passed as a parenthesized string of comma-separated values. Multiplication weights equal to 1 at the end of the list may be dropped.
In this example
feature_weights = "(0.1,1,3)"
the multiplication weight is set to 0.1, 1 and 3 for the first, second and third features respectively. The multiplication weight for all other features is set to 1.
Set the multiplication weight individually for each explicitly specified feature as a string (the number of features is n).
Format
"<feature index or name>:<weight>,..,<feature index or name>:<weight>"
Note
Spaces between values are not allowed.
These examples
feature_weights = "2:1.1,4:0.1"
feature_weights = "Feature2:1.1,Feature4:0.1"
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
Set the multiplication weight individually for each required feature as an array or a dictionary (the number of features is n).
Format
[<feature-weight_0>, <feature-weight_1>, .., <feature-weight_n-1>]
{"<feature index or name>":<weight>, .., "<feature index or name>":<weight>}
Array examples
feature_weights = [0.1, 1, 3]
These dictionary examples
feature_weights = {"Feature2": 1.1, "Feature4": 0.3}
feature_weights = {"2": 1.1, "4": 0.3}
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
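To make the feature_weights description concrete, here is a small sketch of the two rules it states: weights dropped from the end of the list default to 1, and a candidate's score is multiplied by the weight of each feature in the split. `expand_weights` and `weighted_score` are hypothetical names used only for this illustration, and the score values are made up.

```python
def expand_weights(weights, n_features, default=1.0):
    """Pad a positional weight list to n_features; dropped trailing weights default to 1."""
    if len(weights) > n_features:
        raise ValueError("more weights than features")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return [float(w) for w in weights] + [default] * (n_features - len(weights))


def weighted_score(score, split_features, weights):
    """Multiply a candidate's score by the weight of every feature used in the split."""
    for feature_index in split_features:
        score *= weights[feature_index]
    return score


weights = expand_weights([0.1, 1, 3], n_features=5)
print(weights)                            # [0.1, 1.0, 3.0, 1.0, 1.0]
print(weighted_score(2.0, [2], weights))  # 6.0
```

A weight below 1 makes splits on that feature less attractive, and a weight of 0 removes its score entirely.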
Type
list
numpy.ndarray
string
dict
Default value
1 for all features
Supported processing units
CPU
first_feature_use_penalties
Command-line:
--first-feature-use-penalties
Description
Per-feature penalties for the first occurrence of the feature in the model. The given value is subtracted from the score if the current candidate is the first one to include the feature in the model.
Refer to the Per-object and per-feature penalties section for details on applying different score penalties.
Non-negative float values are supported for each penalty.
Set the penalty for each feature as a string (the number of features is n).
Format
"(<feature-penalty_0>,<feature-penalty_1>,..,<feature-penalty_n-1>)"
Note
Spaces between values are not allowed.
Values should be passed as a parenthesized string of comma-separated values. Penalties equal to 0 at the end of the list may be dropped.
In this example
first_feature_use_penalties parameter:
first_feature_use_penalties = "(0.1,1,3)"
per_object_feature_penalties parameter:
per_object_feature_penalties = "(0.1,1,3)"
the penalty is set to 0.1, 1 and 3 for the first, second and third features respectively. The penalty for all other features is set to 0.
Set the penalty individually for each explicitly specified feature as a string (the number of features is n).
Format
"<feature index or name>:<penalty>,..,<feature index or name>:<penalty>"
Note
Spaces between values are not allowed.
These examples
first_feature_use_penalties parameter:
first_feature_use_penalties = "2:1.1,4:0.1"
first_feature_use_penalties = "Feature2:1.1,Feature4:0.1"
per_object_feature_penalties parameter:
per_object_feature_penalties = "2:1.1,4:0.1"
per_object_feature_penalties = "Feature2:1.1,Feature4:0.1"
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
Set the penalty individually for each required feature as an array or a dictionary (the number of features is n).
Format
[<feature-penalty_0>, <feature-penalty_1>, .., <feature-penalty_n-1>]
{"<feature index or name>":<penalty>, .., "<feature index or name>":<penalty>}
Array examples
first_feature_use_penalties parameter:
first_feature_use_penalties = [0.1, 1, 3]
per_object_feature_penalties parameter:
per_object_feature_penalties = [0.1, 1, 3]
These dictionary examples
first_feature_use_penalties parameter:
first_feature_use_penalties = {"Feature2": 1.1, "Feature4": 0.1}
first_feature_use_penalties = {"2": 1.1, "4": 0.1}
per_object_feature_penalties parameter:
per_object_feature_penalties = {"Feature2": 1.1, "Feature4": 0.1}
per_object_feature_penalties = {"2": 1.1, "4": 0.1}
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
Type
list
numpy.ndarray
string
dict
Default value
0 for all features
Supported processing units
CPU
fixed_binary_splits
Command-line:
--fixed-binary-splits
Description
A list of indices of binary features to put at the top of each tree; ignored if grow_policy is Symmetric.
Type
list
Default value
None
Supported processing units
GPU
penalties_coefficient
Command-line:
--penalties-coefficient
Description
A single-value common coefficient to multiply all penalties.
Non-negative values are supported.
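As a rough illustration of how penalties_coefficient interacts with first_feature_use_penalties, the sketch below combines only the two statements on this page: the penalty is subtracted from the score when a feature is used for the first time, and the coefficient multiplies all penalties. The exact formula is given in the Per-object and per-feature penalties section; `penalized_score` is a hypothetical function, not CatBoost's internal code, and the scores are made up.

```python
def penalized_score(score, feature, used_features, first_use_penalties,
                    penalties_coefficient=1.0):
    """Subtract the scaled first-use penalty when `feature` is new to the model.

    Illustration only: an assumed combination of the first_feature_use_penalties
    and penalties_coefficient descriptions.
    """
    if feature in used_features:
        return score  # the feature is already in the model, so no penalty applies
    penalty = first_use_penalties.get(feature, 0.0)  # the default penalty is 0
    return score - penalties_coefficient * penalty


penalties = {2: 1.1, 4: 0.1}
print(penalized_score(5.0, 2, used_features=set(), first_use_penalties=penalties))
print(penalized_score(5.0, 2, used_features={2}, first_use_penalties=penalties))
print(penalized_score(5.0, 4, set(), penalties, penalties_coefficient=2.0))
```

Setting penalties_coefficient to 0 switches every penalty off without touching the per-feature values.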
Type
float
Default value
1
Supported processing units
CPU
per_object_feature_penalties
Command-line:
--per-object-feature-penalties
Description
Per-object penalties for the first use of the feature for the object. The given value is multiplied by the number of objects that are divided by the current split and use the feature for the first time.
Refer to the Per-object and per-feature penalties section for details on applying different score penalties.
Non-negative float values are supported for each penalty.
Python package
Set the penalty for each feature as a string (the number of features is n).
Format
"(<feature-penalty_0>,<feature-penalty_1>,..,<feature-penalty_n-1>)"
Note
Spaces between values are not allowed.
Values should be passed as a parenthesized string of comma-separated values. Penalties equal to 0 at the end of the list may be dropped.
In this example
first_feature_use_penalties parameter:
first_feature_use_penalties = "(0.1,1,3)"
per_object_feature_penalties parameter:
per_object_feature_penalties = "(0.1,1,3)"
the penalty is set to 0.1, 1 and 3 for the first, second and third features respectively. The penalty for all other features is set to 0.
Set the penalty individually for each explicitly specified feature as a string (the number of features is n).
Format
"<feature index or name>:<penalty>,..,<feature index or name>:<penalty>"
Note
Spaces between values are not allowed.
These examples
first_feature_use_penalties parameter:
first_feature_use_penalties = "2:1.1,4:0.1"
first_feature_use_penalties = "Feature2:1.1,Feature4:0.1"
per_object_feature_penalties parameter:
per_object_feature_penalties = "2:1.1,4:0.1"
per_object_feature_penalties = "Feature2:1.1,Feature4:0.1"
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
Set the penalty individually for each required feature as an array or a dictionary (the number of features is n).
Format
[<feature-penalty_0>, <feature-penalty_1>, .., <feature-penalty_n-1>]
{"<feature index or name>":<penalty>, .., "<feature index or name>":<penalty>}
Array examples
first_feature_use_penalties parameter:
first_feature_use_penalties = [0.1, 1, 3]
per_object_feature_penalties parameter:
per_object_feature_penalties = [0.1, 1, 3]
These dictionary examples
first_feature_use_penalties parameter:
first_feature_use_penalties = {"Feature2": 1.1, "Feature4": 0.1}
first_feature_use_penalties = {"2": 1.1, "4": 0.1}
per_object_feature_penalties parameter:
per_object_feature_penalties = {"Feature2": 1.1, "Feature4": 0.1}
per_object_feature_penalties = {"2": 1.1, "4": 0.1}
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
R package
Set the penalty for each feature as a string (the number of features is n).
Format
"(<feature-penalty_0>,<feature-penalty_1>,..,<feature-penalty_n-1>)"
Note
Spaces between values are not allowed.
Values should be passed as a parenthesized string of comma-separated values. Penalties equal to 0 at the end of the list may be dropped.
In this example
first_feature_use_penalties parameter:
first_feature_use_penalties = "(0.1,1,3)"
per_object_feature_penalties parameter:
per_object_feature_penalties = "(0.1,1,3)"
the penalty is set to 0.1, 1 and 3 for the first, second and third features respectively. The penalty for all other features is set to 0.
Set the penalty individually for each explicitly specified feature as a string (the number of features is n).
Format
"<feature index or name>:<penalty>,..,<feature index or name>:<penalty>"
Note
Spaces between values are not allowed.
These examples
first_feature_use_penalties parameter:
first_feature_use_penalties = "2:1.1,4:0.1"
first_feature_use_penalties = "Feature2:1.1,Feature4:0.1"
per_object_feature_penalties parameter:
per_object_feature_penalties = "2:1.1,4:0.1"
per_object_feature_penalties = "Feature2:1.1,Feature4:0.1"
are identical, given that the name of the feature indexed 2 is Feature2 and the name of the feature indexed 4 is Feature4.
Type
list
numpy.ndarray
string
dict
Default value
0 for all objects
Supported processing units
CPU
model_shrink_rate
Command-line:
--model-shrink-rate
Description
The constant used to calculate the coefficient for multiplying the model on each iteration.
The actual model shrinkage coefficient calculated at each iteration depends on the value of the --model-shrink-mode parameter (command-line version). The resulting value of the coefficient should always be in the range (0, 1].
Type
float
Default value
The default value depends on the values of the following parameters:
--model-shrink-mode (command-line version)
--monotone-constraints (command-line version)
Supported processing units
CPU
model_shrink_mode
Command-line:
--model-shrink-mode
Description
Determines how the actual model shrinkage coefficient is calculated at each iteration.
Possible values:
Constant: $1 - model\_shrink\_rate \cdot learning\_rate$, where
$model\_shrink\_rate$ is the value of the --model-shrink-rate parameter (command-line version).
$learning\_rate$ is the value of the --learning-rate parameter (command-line version).
Decreasing: $1 - \frac{model\_shrink\_rate}{i}$, where
$model\_shrink\_rate$ is the value of the --model-shrink-rate parameter (command-line version).
$i$ is the identifier of the iteration.
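The two shrinkage modes can be written out directly. The following is a minimal sketch of the Constant and Decreasing formulas, with the (0, 1] range check mentioned for model_shrink_rate; `shrink_coefficient` is a hypothetical function, and the sketch assumes that iteration identifiers start at 1, which this page does not state.

```python
def shrink_coefficient(mode, model_shrink_rate, learning_rate=None, iteration=None):
    """Model shrinkage coefficient for one iteration, following the two formulas above.

    Assumption: `iteration` is a positive identifier (1, 2, ...).
    """
    if mode == "Constant":
        coefficient = 1.0 - model_shrink_rate * learning_rate
    elif mode == "Decreasing":
        coefficient = 1.0 - model_shrink_rate / iteration
    else:
        raise ValueError(f"unknown model_shrink_mode: {mode}")
    if not 0.0 < coefficient <= 1.0:
        raise ValueError(f"coefficient {coefficient} is outside the range (0, 1]")
    return coefficient


print(shrink_coefficient("Constant", 0.5, learning_rate=0.5))  # 0.75
print(shrink_coefficient("Decreasing", 0.5, iteration=2))      # 0.75
```

In Constant mode the coefficient is the same on every iteration, while in Decreasing mode it approaches 1 as the iteration identifier grows.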
Type
string
Default value
Constant
Supported processing units
CPU |
| Markdown | [](https://catboost.ai/ "CatBoost")
# Common parameters
- [loss\_function](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#loss_function)
- [custom\_metric](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#custom_metric)
- [eval\_metric](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#eval_metric)
- [iterations](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#iterations)
- [learning\_rate](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#learning_rate)
- [random\_seed](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#random_seed)
- [l2\_leaf\_reg](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#l2_leaf_reg)
- [bootstrap\_type](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#bootstrap_type)
- [bagging\_temperature](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#bagging_temperature)
- [subsample](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#subsample)
- [sampling\_frequency](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#sampling_frequency)
- [sampling\_unit](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#sampling_unit)
- [mvs\_reg](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#mvs_reg)
- [random\_strength](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#random_strength)
- [use\_best\_model](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#use_best_model)
- [best\_model\_min\_trees](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#best_model_min_trees)
- [depth](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#depth)
- [grow\_policy](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#grow_policy)
- [min\_data\_in\_leaf](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#min_data_in_leaf)
- [max\_leaves](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#max_leaves)
- [ignored\_features](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#ignored_features)
- [one\_hot\_max\_size](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#one_hot_max_size)
- [has\_time](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#has_time)
- [rsm](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#rsm)
- [nan\_mode](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#nan_mode)
- [input\_borders](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#input_borders)
- [output\_borders](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#output_borders)
- [fold\_permutation\_block](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#fold_permutation_block)
- [leaf\_estimation\_method](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#leaf_estimation_method)
- [leaf\_estimation\_iterations](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#leaf_estimation_iterations)
- [leaf\_estimation\_backtracking](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#leaf_estimation_backtracking)
- [fold\_len\_multiplier](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#fold_len_multiplier)
- [approx\_on\_full\_history](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#approx_on_full_history)
- [class\_weights](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#class_weights)
- [class\_names](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#class_names)
- [auto\_class\_weights](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#auto_class_weights)
- [scale\_pos\_weight](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#scale_pos_weight)
- [boosting\_type](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#boosting_type)
- [boost\_from\_average](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#boost_from_average)
- [langevin](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#langevin)
- [diffusion\_temperature](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#diffusion_temperature)
- [posterior\_sampling](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#posterior_sampling)
- [allow\_const\_label](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#allow_const_label)
- [score\_function](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#score_function)
- [monotone\_constraints](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#monotone_constraints)
- [feature\_weights](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#feature_weights)
- [first\_feature\_use\_penalties](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#first_feature_use_penalties)
- [fixed\_binary\_splits](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#fixed_binary_splits)
- [penalties\_coefficient](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#penalties_coefficient)
- [per\_object\_feature\_penalties](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#per_object_feature_penalties)
- [model\_shrink\_rate](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#model_shrink_rate)
- [model\_shrink\_mode](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#model_shrink_mode)
## loss\_function
Command-line: `--loss-function`
*Alias:* `objective`
#### Description
The [metric](https://catboost.ai/docs/en/references/training-parameters/en/concepts/loss-functions) to use in training. The specified value also determines the machine learning problem to solve. Some metrics support optional parameters (see the [Objectives and metrics](https://catboost.ai/docs/en/references/training-parameters/en/concepts/loss-functions) section for details on each metric).
Format:
```
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
```
Supported metrics
- RMSE
- Logloss
- MAE
- CrossEntropy
- Quantile
- LogLinQuantile
- Lq
- MultiRMSE
- MultiClass
- MultiClassOneVsAll
- MultiLogloss
- MultiCrossEntropy
- MAPE
- Poisson
- PairLogit
- PairLogitPairwise
- QueryRMSE
- QuerySoftMax
- GroupQuantile
- Tweedie
- YetiRank
- YetiRankPairwise
- StochasticFilter
- StochasticRank
A custom Python object can also be set as the value of this parameter (see an [example](https://catboost.ai/docs/en/references/training-parameters/en/concepts/python-usages-examples)).
For example, use the following construction to calculate the value of Quantile with the coefficient $\alpha = 0.1$:
```
Quantile:alpha=0.1
```
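The `<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]` format is simple enough to build and split programmatically. A minimal sketch follows, assuming two hypothetical helpers, `format_metric` and `parse_metric`, that are not part of CatBoost and do not check whether a metric or parameter name is actually supported.

```python
def format_metric(name, **params):
    """Build '<Metric>[:<param>=<value>;..]' from a metric name and optional parameters."""
    if not params:
        return name
    return name + ":" + ";".join(f"{key}={value}" for key, value in params.items())


def parse_metric(spec):
    """Split a metric description into its name and a dict of string-valued parameters."""
    name, _, raw_params = spec.partition(":")
    if not raw_params:
        return name, {}
    return name, dict(item.split("=", 1) for item in raw_params.split(";"))


print(format_metric("Quantile", alpha=0.1))  # Quantile:alpha=0.1
print(parse_metric("Quantile:alpha=0.1"))    # ('Quantile', {'alpha': '0.1'})
print(parse_metric("RMSE"))                  # ('RMSE', {})
```

The resulting string is what would be passed as the value of `loss_function`.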
**Type**
- string
- object
**Default value**
Python package
Depends on the class:
- [CatBoostClassifier](https://catboost.ai/docs/en/references/training-parameters/en/concepts/python-reference_catboostclassifier): Logloss if the `target_border` parameter value differs from None. Otherwise, the default loss function depends on the number of unique target values and is either set to Logloss or MultiClass.
- [CatBoost](https://catboost.ai/docs/en/references/training-parameters/en/concepts/python-reference_catboost) and [CatBoostRegressor](https://catboost.ai/docs/en/references/training-parameters/en/concepts/python-reference_catboostregressor): RMSE
R package, Command-line
RMSE
**Supported processing units**
CPU and GPU
## custom\_metric
Command-line: `--custom-metric`
#### Description
[Metric](https://catboost.ai/docs/en/references/training-parameters/en/concepts/loss-functions) values to output during training. These functions are not optimized and are displayed for informational purposes only. Some metrics support optional parameters (see the [Objectives and metrics](https://catboost.ai/docs/en/references/training-parameters/en/concepts/loss-functions) section for details on each metric).
Format:
```
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
```
[Supported metrics](https://catboost.ai/docs/en/references/training-parameters/en/references/custom-metric__supported-metrics)
Examples
- Calculate the value of CrossEntropy:
```
CrossEntropy
```
- Calculate the value of Quantile with the coefficient $\alpha = 0.1$:
```
Quantile:alpha=0.1
```
- Calculate the values of Logloss and AUC:
```
['Logloss', 'AUC']
```
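The format shown above can be assembled programmatically. The following is a minimal sketch; `format_metric` is a hypothetical helper written for illustration and is not part of the CatBoost API.

```python
def format_metric(name, **params):
    """Build a metric description string in the
    <Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>] format."""
    if not params:
        return name
    joined = ";".join(f"{key}={value}" for key, value in params.items())
    return f"{name}:{joined}"


# A single metric with one optional parameter.
quantile_metric = format_metric("Quantile", alpha=0.1)

# A list of metrics, as accepted by custom_metric.
custom_metric = [format_metric("Logloss"), format_metric("AUC")]
```

The resulting string or list can then be passed as the value of `custom_metric`.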
Values of all custom metrics for learn and validation datasets are saved to the [Metric](https://catboost.ai/docs/en/references/training-parameters/en/concepts/output-data_loss-function) output files (`learn_error.tsv` and `test_error.tsv` respectively). The directory for these files is specified in the `--train-dir` (`train_dir`) parameter.
Use the [visualization tools](https://catboost.ai/docs/en/references/training-parameters/en/features/visualization) to see a live chart with the dynamics of the specified metrics.
**Type**
- string
- list of strings
**Default value**
Python package
None
R package
None
Command-line
None (do not output additional metric values)
**Supported processing units**
CPU and GPU
## eval\_metric
Command-line: `--eval-metric`
#### Description
The metric used for overfitting detection (if enabled) and best model selection (if enabled). Some metrics support optional parameters (see the [Objectives and metrics](https://catboost.ai/docs/en/references/training-parameters/en/concepts/loss-functions) section for details on each metric).
Format:
```
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
```
[Supported metrics](https://catboost.ai/docs/en/references/training-parameters/en/references/eval-metric__supported-metrics)
A user-defined function can also be set as the value (see an [example](https://catboost.ai/docs/en/references/training-parameters/en/concepts/python-usages-examples)).
Examples:
```
R2
```
**Type**
- string
- object
**Default value**
Optimized objective is used
**Supported processing units**
CPU and GPU
## iterations
Command-line: `-i`, `--iterations`
*Aliases:* `num_boost_round`, `n_estimators`, `num_trees`
#### Description
The maximum number of trees that can be built when solving machine learning problems.
When using other parameters that limit the number of iterations, the final number of trees may be less than the number specified in this parameter.
**Type**
int
**Default value**
1000
**Supported processing units**
CPU and GPU
## learning\_rate
Command-line: `-w`, `--learning-rate`
*Alias:* `eta`
#### Description
The learning rate.
Used for reducing the gradient step.
**Type**
float
**Default value**
The default value is defined automatically for [`Logloss`](https://catboost.ai/docs/en/references/training-parameters/en/concepts/loss-functions-classification#Logit), [`MultiClass`](https://catboost.ai/docs/en/references/training-parameters/en/concepts/loss-functions-multiclassification#MultiClass) and [`RMSE`](https://catboost.ai/docs/en/references/training-parameters/en/concepts/loss-functions-regression#RMSE) loss functions depending on the number of iterations if none of parameters [`leaf_estimation_iterations`](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#leaf_estimation_iterations), [`leaf_estimation_method`](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#leaf_estimation_method), [`l2_leaf_reg`](https://catboost.ai/docs/en/references/training-parameters/en/references/training-parameters/common#l2_leaf_reg) is set. In this case, the selected learning rate is printed to stdout and saved in the model.
In other cases, the default value is 0.03.
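The rule above can be sketched as a small decision function. This is only an illustration of the documented conditions: `default_learning_rate` is a hypothetical helper, and the string `"auto"` stands in for the automatically selected value, whose exact formula is not given here.

```python
AUTO_LR_LOSSES = {"Logloss", "MultiClass", "RMSE"}
BLOCKING_PARAMS = (
    "leaf_estimation_iterations",
    "leaf_estimation_method",
    "l2_leaf_reg",
)


def default_learning_rate(params):
    """Return "auto" when the rate is defined automatically, else 0.03."""
    loss = params.get("loss_function")
    any_blocking_set = any(params.get(name) is not None for name in BLOCKING_PARAMS)
    if loss in AUTO_LR_LOSSES and not any_blocking_set:
        return "auto"
    return 0.03
```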
**Supported processing units**
CPU and GPU
## random\_seed
Command-line: `-r`, `--random-seed`
*Alias:* `random_state`
#### Description
The random seed used for training.
**Type**
int
**Default value**
Python package
None (0)
R package, Command-line
0
**Supported processing units**
CPU and GPU
## l2\_leaf\_reg
Command-line: `--l2-leaf-reg`, `--l2-leaf-regularizer`
*Alias:* `reg_lambda`
#### Description
Coefficient at the L2 regularization term of the cost function.
Any positive value is allowed.
**Type**
float
**Default value**
3\.0
**Supported processing units**
CPU and GPU
## bootstrap\_type
Command-line: `--bootstrap-type`
#### Description
[Bootstrap type](https://catboost.ai/docs/en/references/training-parameters/en/concepts/algorithm-main-stages_bootstrap-options). Defines the method for sampling the weights of objects.
Supported methods:
- Bayesian
- Bernoulli
- MVS
- Poisson (supported for GPU only)
- No
**Type**
string
**Default value**
The default value depends on `objective`, `task_type`, `bagging_temperature` and `sampling_unit`:
- The `objective` parameter is QueryCrossEntropy, YetiRankPairwise, or PairLogitPairwise and the `bagging_temperature` parameter is not set: Bernoulli with the `subsample` parameter set to 0.5.
- The objective is neither MultiClass nor MultiClassOneVsAll, `task_type` = CPU and `sampling_unit` = Object: MVS with the `subsample` parameter set to 0.8.
- Otherwise: Bayesian.
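The selection rules above can be read as an ordered sequence of checks. The sketch below mirrors them; `default_bootstrap` is a hypothetical helper for illustration, and it returns the bootstrap type together with the implied `subsample` value (or `None` when no value is stated).

```python
PAIRWISE_OBJECTIVES = {"QueryCrossEntropy", "YetiRankPairwise", "PairLogitPairwise"}
MULTICLASS_OBJECTIVES = {"MultiClass", "MultiClassOneVsAll"}


def default_bootstrap(objective, task_type="CPU",
                      bagging_temperature=None, sampling_unit="Object"):
    """Return (bootstrap_type, subsample) following the documented defaults."""
    if objective in PAIRWISE_OBJECTIVES and bagging_temperature is None:
        return ("Bernoulli", 0.5)
    if (objective not in MULTICLASS_OBJECTIVES
            and task_type == "CPU"
            and sampling_unit == "Object"):
        return ("MVS", 0.8)
    return ("Bayesian", None)
```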
**Supported processing units**
CPU and GPU
## bagging\_temperature
Command-line: `--bagging-temperature`
#### Description
Defines the settings of the Bayesian bootstrap. It is used by default in classification and regression modes.
Use the Bayesian bootstrap to assign random weights to objects.
The weights are sampled from exponential distribution if the value of this parameter is set to "1". All weights are equal to 1 if the value of this parameter is set to "0".
Possible values are in the range $[0; \infty)$. The higher the value, the more aggressive the bagging is.
This parameter can be used if the selected bootstrap type is Bayesian.
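To make the two special cases above concrete, here is a minimal sketch of Bayesian bootstrap weights. It assumes the weight formula $w = (-\log \psi)^{t}$ with $\psi$ uniform on $(0, 1]$ and $t$ equal to `bagging_temperature`; this formula is an assumption that reproduces the stated behaviour for the values 0 and 1, and `bayesian_bootstrap_weights` is a hypothetical helper, not CatBoost code.

```python
import math
import random


def bayesian_bootstrap_weights(n_objects, bagging_temperature=1.0, seed=0):
    """Sample one weight per object (assumed formula: (-log(u)) ** t)."""
    rng = random.Random(seed)
    weights = []
    for _ in range(n_objects):
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is defined
        weights.append((-math.log(u)) ** bagging_temperature)
    return weights


equal_weights = bayesian_bootstrap_weights(5, bagging_temperature=0.0)
exponential_weights = bayesian_bootstrap_weights(5, bagging_temperature=1.0)
```

With a temperature of 0 every weight equals 1. With a temperature of 1 the weights follow an exponential distribution.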
**Type**
float
**Default value**
1
**Supported processing units**
CPU and GPU
## subsample
Command-line: `--subsample`
#### Description
Sample rate for bagging.
This parameter can be used if one of the following bootstrap types is selected:
- Poisson
- Bernoulli
- MVS
**Type**
float
**Default value**
The default value depends on the dataset size and the bootstrap type:
- Datasets with fewer than 100 objects: 1
- Datasets with 100 objects or more:
  - Poisson, Bernoulli: 0.66
  - MVS: 0.8
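The default can be expressed as a short lookup. This is a sketch of the rules listed above; `default_subsample` is a hypothetical helper name.

```python
def default_subsample(n_objects, bootstrap_type):
    """Return the documented default sample rate for bagging."""
    if bootstrap_type not in {"Poisson", "Bernoulli", "MVS"}:
        raise ValueError("subsample applies only to Poisson, Bernoulli and MVS")
    if n_objects < 100:
        return 1.0
    return 0.8 if bootstrap_type == "MVS" else 0.66
```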
**Supported processing units**
CPU and GPU
## sampling\_frequency
Command-line: `--sampling-frequency`
#### Description
Frequency to sample weights and objects when building trees.
Supported values:
- PerTree: Before constructing each new tree
- PerTreeLevel: Before choosing each new split of a tree
**Type**
string
**Default value**
PerTreeLevel
**Supported processing units**
CPU
## sampling\_unit
Command-line: `--sampling-unit`
#### Description
The sampling scheme.
Possible values:
- Object: The weight $w_{i}$ of the i-th object $o_{i}$ is used for sampling the corresponding object.
- Group: The weight $w_{j}$ of the group $g_{j}$ is used for sampling each object $o_{i_{j}}$ from the group $g_{j}$.
**Type**
string
**Default value**
Object
**Supported processing units**
CPU and GPU
## mvs\_reg
Command-line: `--mvs-reg`
#### Description
Affects the weight of the denominator and can be used for balancing between the importance and Bernoulli sampling (setting it to 0 implies importance sampling and setting it to $\infty$ implies Bernoulli sampling).
Note
This parameter is supported only for the MVS sampling method (the `bootstrap_type` parameter must be set to MVS).
**Type**
float
**Default value**
The value is set based on the gradient distribution on the current iteration
**Supported processing units**
CPU
## random\_strength
Command-line: `--random-strength`
#### Description
The amount of randomness to use for scoring splits when the tree structure is selected. Use this parameter to avoid overfitting the model.
The value of this parameter is used when selecting splits. On every iteration each possible split gets a score (for example, the score indicates how much adding this split will improve the loss function for the training dataset). The split with the highest score is selected.
The scores have no randomness. A normally distributed random variable is added to the score of the feature. It has a zero mean and a variance that decreases during the training. The value of this parameter is the multiplier of the variance.
Note
This parameter is not supported for the following loss functions:
- QueryCrossEntropy
- YetiRankPairwise
- PairLogitPairwise
**Type**
float
**Default value**
1
**Supported processing units**
CPU
## use\_best\_model
Command-line: `--use-best-model`
#### Description
If this parameter is set, the number of trees that are saved in the resulting model is defined as follows:
1. Build the number of trees defined by the training parameters.
2. Use the validation dataset to identify the iteration with the optimal value of the metric specified in `--eval-metric` (`eval_metric`).
No trees are saved after this iteration.
This option requires a validation dataset to be provided.
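The two steps above amount to finding the best iteration on the validation set and keeping only the trees up to it. The sketch below illustrates this with a plain list of metric values; `trees_in_best_model` is a hypothetical helper. Its `min_trees` argument models the `best_model_min_trees` parameter under the assumption that the best iteration is searched only among models with at least that many trees.

```python
def trees_in_best_model(metric_values, higher_is_better=False, min_trees=1):
    """Return how many trees are kept.

    metric_values[i] is the evaluation metric on the validation
    dataset after tree i + 1 has been added.
    """
    if not metric_values:
        raise ValueError("at least one iteration is required")
    # Index of the first iteration that satisfies the minimum tree count.
    start = min(max(min_trees, 1), len(metric_values)) - 1
    candidates = range(start, len(metric_values))
    pick = max if higher_is_better else min
    best_iteration = pick(candidates, key=lambda i: metric_values[i])
    return best_iteration + 1


validation_loss = [0.9, 0.5, 0.4, 0.45, 0.6]
kept = trees_in_best_model(validation_loss)
kept_with_minimum = trees_in_best_model(validation_loss, min_trees=4)
```

Here the loss is lowest after the third tree, so three trees are kept. With `min_trees=4` the search starts at the fourth tree and four trees are kept.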
**Type**
bool
**Default value**
True if a validation set is input (the eval\_set parameter is defined) and at least one of the label values of objects in this set differs from the others. False otherwise.
**Supported processing units**
CPU and GPU
## best\_model\_min\_trees
Command-line: `--best-model-min-trees`
#### Description
The minimal number of trees that the best model should have. If set, the output model contains at least the given number of trees even if the optimal value of the evaluation metric on the validation dataset is achieved with a smaller number of trees.
Should be used with the `--use-best-model` parameter.
**Type**
int
**Default value**
Python package, R package
None (The minimal number of trees for the best model is not set)
Command-line
The minimal number of trees for the best model is not set
**Supported processing units**
CPU and GPU
## depth
Command-line: `-n`, `--depth`
*Alias:* `max_depth`
#### Description
Depth of the trees.
The range of supported values depends on the processing unit type and the type of the selected loss function:
- CPU: Any integer up to 16.
- GPU: Any integer up to 8 for pairwise modes (YetiRank, PairLogitPairwise, and QueryCrossEntropy), and up to 16 for all other loss functions.
**Type**
int
**Default value**
6 (16 if the growing policy is set to Lossguide)
**Supported processing units**
CPU and GPU
## grow\_policy
Command-line: `--grow-policy`
#### Description
The tree growing policy. Defines how to perform greedy tree construction.
Possible values:
- SymmetricTree: A tree is built level by level until the specified depth is reached. On each iteration, all leaves from the last tree level are split with the same condition. The resulting tree structure is always symmetric.
- Depthwise: A tree is built level by level until the specified depth is reached. On each iteration, all non-terminal leaves from the last tree level are split. Each leaf is split by the condition with the best loss improvement.
Note
Models with this growing policy cannot be analyzed using the PredictionDiff feature importance and can be exported only to json and cbm.
- Lossguide: A tree is built leaf by leaf until the specified maximum number of leaves is reached. On each iteration, the non-terminal leaf with the best loss improvement is split.
Note
Models with this growing policy cannot be analyzed using the PredictionDiff feature importance and can be exported only to json and cbm.
**Type**
string
**Default value**
SymmetricTree
**Supported processing units**
CPU and GPU
## min\_data\_in\_leaf
Command-line: `--min-data-in-leaf`
*Alias:* `min_child_samples`
#### Description
The minimum number of training samples in a leaf. CatBoost does not search for new splits in leaves with samples count less than the specified value.
Can be used only with the Lossguide and Depthwise growing policies.
**Type**
int
**Default value**
1
**Supported processing units**
CPU and GPU
## max\_leaves
Command-line: `--max-leaves`
*Alias:* `num_leaves`
#### Description
The maximum number of leaves in the resulting tree. Can be used only with the Lossguide growing policy.
Note
It is not recommended to use values greater than 64, since it can significantly slow down the training process.
**Type**
int
**Default value**
31
**Supported processing units**
CPU and GPU
## ignored\_features
Command-line: `-I`, `--ignore-features`
#### Description
Feature indices to exclude from the training.
Python package
It is assumed that all passed values are feature names if at least one of the passed values can not be converted to a number or a range of numbers. Otherwise, it is assumed that all passed values are feature indices.
Specifics:
- Non-negative indices that do not match any features are successfully ignored. For example, if five features are defined for the objects in the dataset and this parameter is set to "42", the corresponding non-existing feature is successfully ignored.
- The identifier corresponds to the feature's index. Feature indices used in train and feature importance are numbered from 0 to `featureCount - 1`. If a file is used as [input data](https://catboost.ai/docs/en/references/training-parameters/en/concepts/input-data) then any non-feature column types are ignored when calculating these indices. For example, each row in the input file contains data in the following order: `cat feature<\t>label value<\t>num feature`. So for the row `rock<\t>0<\t>42`, the identifier for the "rock" feature is 0, and for the "42" feature it's 1.
For example, use the following construction if features indexed 1, 2, 7, 42, 43, 44, 45, should be ignored: `[1,2,7,42,43,44,45]`
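The index rule described above (non-feature columns are skipped when numbering features) can be sketched as follows. `feature_indices` and the `(name, is_feature)` column representation are hypothetical and chosen only for illustration.

```python
def feature_indices(columns):
    """Map feature column names to zero-based feature indices.

    columns is a list of (name, is_feature) pairs in file order.
    Non-feature columns (for example, the label) do not consume an index.
    """
    indices = {}
    next_index = 0
    for name, is_feature in columns:
        if is_feature:
            indices[name] = next_index
            next_index += 1
    return indices


columns = [("cat feature", True), ("label value", False), ("num feature", True)]
mapping = feature_indices(columns)
```

For the column order from the example above, the categorical feature gets index 0 and the numerical feature gets index 1, because the label column is skipped.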
R package
Specifics:
- Non-negative indices that do not match any features are successfully ignored. For example, if five features are defined for the objects in the dataset and this parameter is set to "42", the corresponding non-existing feature is successfully ignored.
- The identifier corresponds to the feature's index. Feature indices used in train and feature importance are numbered from 0 to `featureCount - 1`. If a file is used as [input data](https://catboost.ai/docs/en/references/training-parameters/en/concepts/input-data) then any non-feature column types are ignored when calculating these indices. For example, each row in the input file contains data in the following order: `cat feature<\t>label value<\t>num feature`. So for the row `rock<\t>0<\t>42`, the identifier for the "rock" feature is 0, and for the "42" feature it's 1.
For example, if training should exclude features with the identifiers 1, 2, 7, 42, 43, 44, 45, the value of this parameter should be set to c(1,2,7,42,43,44,45).
Command-line
It is assumed that all passed values are feature names if at least one of the passed values can not be converted to a number or a range of numbers. Otherwise, it is assumed that all passed values are feature indices.
Specifics:
- Non-negative indices that do not match any features are successfully ignored. For example, if five features are defined for the objects in the dataset and this parameter is set to "42", the corresponding non-existing feature is successfully ignored.
- The identifier corresponds to the feature's index. Feature indices used in train and feature importance are numbered from 0 to `featureCount - 1`. If a file is used as [input data](https://catboost.ai/docs/en/references/training-parameters/en/concepts/input-data) then any non-feature column types are ignored when calculating these indices. For example, each row in the input file contains data in the following order: `cat feature<\t>label value<\t>num feature`. So for the row `rock<\t>0<\t>42`, the identifier for the "rock" feature is 0, and for the "42" feature it's 1.
For example, if training should exclude features with the identifiers 1, 2, 7, 42, 43, 44, 45, use the following construction: `1:2:7:42-45`.
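The command-line construction `1:2:7:42-45` can be expanded into explicit indices as shown in this sketch. `parse_ignored_features` is a hypothetical helper that handles only numeric indices and ranges, not feature names.

```python
def parse_ignored_features(spec):
    """Expand a spec such as "1:2:7:42-45" into a list of feature indices."""
    indices = []
    for part in spec.split(":"):
        if "-" in part:
            first, last = part.split("-")
            # Ranges are inclusive on both ends.
            indices.extend(range(int(first), int(last) + 1))
        else:
            indices.append(int(part))
    return indices


expanded = parse_ignored_features("1:2:7:42-45")
```

The result matches the list form used in the Python package, `[1,2,7,42,43,44,45]`.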
**Default value**
Python package, R package
None
Command-line
Omitted
**Supported processing units**
CPU and GPU
## one\_hot\_max\_size
Command-line: `--one-hot-max-size`
#### Description
Use one-hot encoding for all categorical features with a number of different values less than or equal to the given parameter value. Ctrs are not calculated for such features.
See [details](https://catboost.ai/docs/en/references/training-parameters/en/features/categorical-features).
**Type**
int
**Default value**
The default value depends on various conditions:
- N/A if training is performed on CPU in Pairwise scoring mode
  The following loss functions use Pairwise scoring:
  - YetiRankPairwise
  - PairLogitPairwise
  - QueryCrossEntropy
  Pairwise scoring is slightly different from regular training on pairs, since pairs are generated only internally during the training for the corresponding metrics. One-hot encoding is not available for these loss functions.
- 255 if training is performed on GPU and the selected Ctr types require target data that is not available during the training
- 10 if training is performed in [Ranking](https://catboost.ai/docs/en/references/training-parameters/en/concepts/loss-functions-ranking) mode
- 2 if none of the conditions above is met
**Supported processing units**
CPU and GPU
## has\_time
Command-line: `--has-time`
#### Description
Use the order of objects in the input data (do not perform random permutations during the [Transforming categorical features to numerical features](https://catboost.ai/docs/en/references/training-parameters/en/concepts/algorithm-main-stages_cat-to-numberic) and [Choosing the tree structure](https://catboost.ai/docs/en/references/training-parameters/en/concepts/algorithm-main-stages_choose-tree-structure) stages).
The Timestamp column type is used to determine the order of objects if specified in the [input data](https://catboost.ai/docs/en/references/training-parameters/en/concepts/input-data).
**Type**
bool
**Default value**
False (not used; generates random permutations)
**Supported processing units**
CPU and GPU
## rsm
Command-line: `--rsm`
*Alias:* `colsample_bylevel`
#### Description
Random subspace method. The percentage of features to use at each split selection, when features are selected over again at random.
The value must be in the range (0;1\].
**Type**
float (0;1\]
**Default value**
None (set to 1)
**Supported processing units**
CPU; GPU for pairwise ranking
## nan\_mode
Command-line: `--nan-mode`
#### Description
The method for [processing missing values](https://catboost.ai/docs/en/references/training-parameters/en/concepts/algorithm-missing-values-processing) in the input dataset.
Possible values:
- "Forbidden": Missing values are not supported, their presence is interpreted as an error.
- "Min": Missing values are processed as the minimum value (less than all other values) for the feature. It is guaranteed that a split that separates missing values from all other values is considered when selecting trees.
- "Max": Missing values are processed as the maximum value (greater than all other values) for the feature. It is guaranteed that a split that separates missing values from all other values is considered when selecting trees.
Using the Min or Max value of this parameter guarantees that a split between missing values and other values is considered when selecting a new split in the tree.
Note
The method for processing missing values can be set individually for each feature in the [Custom quantization borders and missing value modes](https://catboost.ai/docs/en/references/training-parameters/en/concepts/input-data_custom-borders) input file. Such values override the ones specified in this parameter.
**Type**
string
**Default value**
Min
**Supported processing units**
CPU and GPU
## input\_borders
Command-line: `--input-borders-file`
#### Description
Load [Custom quantization borders and missing value modes](https://catboost.ai/docs/en/references/training-parameters/en/concepts/input-data_custom-borders) from a file (do not generate them).
Borders are automatically generated before training if this parameter is not set.
**Type**
string
**Default value**
Python package
None
Command-line
The file is not loaded, the values are generated
**Supported processing units**
CPU and GPU
## output\_borders
Command-line: `--output-borders-file`
#### Description
Save quantization borders for the current dataset to a file.
Refer to the [file format description](https://catboost.ai/docs/en/references/training-parameters/en/concepts/output-data_custom-borders).
**Type**
string
**Default value**
Python package
None
Command-line
The file is not saved
**Supported processing units**
CPU and GPU
## fold\_permutation\_block
Command-line: `--fold-permutation-block`
#### Description
Objects in the dataset are grouped in blocks before the random permutations. This parameter defines the size of the blocks. The smaller the value, the slower the training. Large values may result in quality degradation.
**Type**
int
**Default value**
Python package
1
R package, Command-line
Default value differs depending on the dataset size and ranges from 1 to 256 inclusively
**Supported processing units**
CPU and GPU
## leaf\_estimation\_method
Command-line: `--leaf-estimation-method`
#### Description
The method used to calculate the values in leaves.
Possible values:
- Newton
- Gradient
- Exact
**Type**
string
**Default value**
Depends on the mode and the selected loss function:
- Regression with Quantile or MAE loss functions: One Exact iteration.
- Regression with any loss function but Quantile or MAE: One Gradient iteration.
- Classification mode: Ten Newton iterations.
- Multiclassification mode: One Newton iteration.
**Supported processing units**
CPU and GPU
## leaf\_estimation\_iterations
Command-line: `--leaf-estimation-iterations`
#### Description
CatBoost might calculate leaf values using several gradient or Newton steps instead of a single one.
This parameter regulates how many steps are done in every tree when calculating leaf values.
**Type**
int
**Default value**
Python package
None (Depends on the training objective)
R package, Command-line
Depends on the training objective
**Supported processing units**
CPU and GPU
## leaf\_estimation\_backtracking
Command-line: `--leaf-estimation-backtracking`
#### Description
When the value of the `leaf_estimation_iterations` parameter is greater than 1, CatBoost makes several gradient or Newton steps when calculating the resulting leaf values of a tree.
The behaviour differs depending on the value of this parameter:
- No: Every next step is a regular gradient or Newton step: the gradient step is calculated and added to the leaf.
- Any other value: Backtracking is used.
In this case, before adding a step, a condition is checked. If the condition is not met, then the step size is reduced (divided by 2), otherwise the step is added to the leaf.
When `leaf_estimation_iterations` for the Command-line version is set to `n`, the leaf estimation iterations are calculated as follows: each iteration is either an addition of the next step to the leaf value, or it's a scaling of the leaf value. Scaling counts as a separate iteration. Thus, it is possible that instead of having `n` gradient steps, the algorithm makes a single gradient step that is reduced `n` times, which means that it is divided by $2^{n}$.
Possible values:
- No: Do not use backtracking. Supported on CPU and GPU.
- AnyImprovement: Reduce the descent step up to the point when the loss function value is smaller than it was on the previous step. The trial reduction factors are 2, 4, 8, and so on. Supported on CPU and GPU.
- Armijo: Reduce the descent step until the Armijo condition is met. Supported only on GPU.
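The AnyImprovement rule can be illustrated on a one-dimensional loss. The sketch below is not CatBoost's implementation: `any_improvement_step` and its `max_halvings` cap are hypothetical, and the leaf value is modelled as a single float.

```python
def any_improvement_step(loss, value, step, max_halvings=10):
    """Add `step` to `value`, shrinking it by factors 2, 4, 8, ...
    until the loss is smaller than it was before the step."""
    previous_loss = loss(value)
    factor = 1.0
    for _ in range(max_halvings + 1):
        candidate = value + step / factor
        if loss(candidate) < previous_loss:
            return candidate
        factor *= 2.0  # next trial reduction factor
    return value  # no improving step found; keep the old value


def squared_error(x):
    return (x - 1.0) ** 2


# The full step (4.0) and the halved step (2.0) do not improve the loss;
# the step divided by 4 lands exactly on the minimum.
new_value = any_improvement_step(squared_error, value=0.0, step=4.0)
```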
**Type**
string
**Default value**
AnyImprovement
**Supported processing units**
Depends on the selected value
## fold\_len\_multiplier
Command-line: `--fold-len-multiplier`
#### Description
Coefficient for changing the length of folds.
The value must be greater than 1. The best validation result is achieved with minimum values.
With values close to 1 (for example, $1 + \epsilon$), each iteration takes a quadratic amount of memory and time for the number of objects in the iteration. Thus, low values are possible only when there is a small number of objects.
**Type**
float
**Default value**
2
**Supported processing units**
CPU and GPU
## approx\_on\_full\_history
Command-line: `--approx-on-full-history`
#### Description
The principles for calculating the approximated values.
Possible values:
- "False": Use only a fraction of the fold for calculating the approximated values. The size of the fraction is calculated as $\frac{1}{X}$, where `X` is the specified coefficient for changing the length of folds. This mode is faster and in rare cases slightly less accurate.
- "True": Use all the preceding rows in the fold for calculating the approximated values. This mode is slower and in rare cases slightly more accurate.
**Type**
bool
**Default value**
Python package, Command-line
False
R package
True
**Supported processing units**
CPU
## class\_weights
Command-line: `--class-weights`
#### Description
Class weights. The values are used as multipliers for the object weights. This parameter can be used for solving binary classification and multiclassification problems.
Python package
Note
For imbalanced datasets with binary classification the weight multiplier can be set to 1 for class 0 and to $\left(\frac{sum\_negative}{sum\_positive}\right)$ for class 1.
For example, `class_weights=[0.1, 4]` multiplies the weights of objects from class 0 by 0.1 and the weights of objects from class 1 by 4.
If class labels are not standard consecutive integers \[0, 1 ... class\_count-1\], use the dict or collections.OrderedDict type with label to weight mapping.
For example, `class_weights={'a': 1.0, 'b': 0.5, 'c': 2.0}` multiplies the weights of objects with class label `a` by 1.0, the weights of objects with class label `b` by 0.5 and the weights of objects with class label `c` by 2.0.
The dictionary form can also be used with standard consecutive integers class labels for additional readability. For example: `class_weights={0: 1.0, 1: 0.5, 2: 2.0}`.
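The effect of both forms on object weights can be sketched in plain Python. `apply_class_weights` is a hypothetical helper that only illustrates the multiplication described above; it is not how CatBoost is called.

```python
def apply_class_weights(labels, object_weights, class_weights):
    """Multiply each object's weight by the weight of its class.

    class_weights is either a list indexed by consecutive integer labels
    or a dict that maps class labels to weights.
    """
    if isinstance(class_weights, dict):
        lookup = class_weights
    else:
        lookup = dict(enumerate(class_weights))
    return [weight * lookup[label] for label, weight in zip(labels, object_weights)]


from_list = apply_class_weights([0, 1, 1], [1.0, 1.0, 2.0], [0.1, 4])
from_dict = apply_class_weights(
    ["a", "c", "b"], [2.0, 1.0, 4.0], {"a": 1.0, "b": 0.5, "c": 2.0}
)
```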
Note
Class labels are extracted from dictionary keys for the following types of class\_weights:
- dict
- collections.OrderedDict (when the order of classes in the model is important)
The class\_names parameter can be skipped when using these types.
Alert
Do not use this parameter with auto\_class\_weights and scale\_pos\_weight.
R package
For example, `class_weights <- c(0.1, 4)` multiplies the weights of objects from class 0 by 0.1 and the weights of objects from class 1 by 4.
Alert
Do not use this parameter with auto\_class\_weights.
Command-line
Note
The quantity of class weights must match the quantity of class names specified in the `--class-names` parameter and the number of classes specified in the `--classes-count` parameter.
For imbalanced datasets with binary classification the weight multiplier can be set to 1 for class 0 and to $\left(\frac{sum\_negative}{sum\_positive}\right)$ for class 1.
Format:
```
<value for class 1>,..,<value for class N>
```
For example:
```
0.85,1.2,1
```
Alert
Do not use this parameter with auto\_class\_weights.
**Type**
- list
- dict
- collections.OrderedDict
**Default value**
None (the weight for all classes is set to 1)
**Supported processing units**
CPU and GPU
## class\_names
#### Description
Class names. Allows redefining the default values when using the MultiClass and Logloss metrics.
If the upper limit for the numeric class label is specified, the number of class names should match this value.
Warning
The number of class names must match the number of class weights specified in the `--class-weights` parameter and the number of classes specified in the `--classes-count` parameter.
Format:
```
<name for class 1>,..,<name for class N>
```
For example:
```
smartphone,touchphone,tablet
```
**Type**
list of strings
**Default value**
None
**Supported processing units**
CPU and GPU
## auto\_class\_weights
Command-line: `--auto-class-weights`
#### Description
Automatically calculate class weights based either on the total weight or the total number of objects in each class. The values are used as multipliers for the object weights.
Supported values:
- None: All class weights are set to 1
- Balanced:
$CW_k=\displaystyle\frac{\max_{c=1}^K(\sum_{t_{i}=c}{w_i})}{\sum_{t_{i}=k}{w_{i}}}$
- SqrtBalanced:
$CW_k=\sqrt{\displaystyle\frac{\max_{c=1}^K(\sum_{t_{i}=c}{w_i})}{\sum_{t_{i}=k}{w_{i}}}}$
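The two formulas can be checked with a few lines of Python. This is a sketch for illustration: `auto_class_weights` is a hypothetical helper, and it assumes every object has weight 1 when no object weights are given.

```python
import math
from collections import defaultdict


def auto_class_weights(labels, object_weights=None, mode="Balanced"):
    """Compute CW_k per class from the total object weight in each class."""
    if object_weights is None:
        object_weights = [1.0] * len(labels)
    totals = defaultdict(float)
    for label, weight in zip(labels, object_weights):
        totals[label] += weight
    largest = max(totals.values())
    if mode == "Balanced":
        return {k: largest / total for k, total in totals.items()}
    if mode == "SqrtBalanced":
        return {k: math.sqrt(largest / total) for k, total in totals.items()}
    raise ValueError(f"unsupported mode: {mode}")


labels = [0, 0, 0, 0, 1]
balanced = auto_class_weights(labels, mode="Balanced")
sqrt_balanced = auto_class_weights(labels, mode="SqrtBalanced")
```

With four objects of class 0 and one object of class 1, the largest class always gets the weight 1, and the minority class gets 4 (Balanced) or 2 (SqrtBalanced).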
Alert
Do not use this parameter with `class_weights` and `scale_pos_weight`.
**Type**
string
**Default value**
None (all class weights are set to 1)
**Supported processing units**
CPU and GPU
## scale\_pos\_weight
#### Description
The weight for class 1 in binary classification. The value is used as a multiplier for the weights of objects from class 1.
Note
For imbalanced datasets, the weight multiplier can be set to $\left(\frac{sum\_negative}{sum\_positive}\right)$.
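A minimal sketch of this ratio follows. `suggested_scale_pos_weight` is a hypothetical helper; it assumes that `sum_negative` and `sum_positive` are the total weights of objects with labels 0 and 1, which equals the object counts when all weights are 1.

```python
def suggested_scale_pos_weight(labels, weights=None):
    """Return sum_negative / sum_positive for binary labels 0 and 1."""
    if weights is None:
        weights = [1.0] * len(labels)
    sum_negative = sum(w for y, w in zip(labels, weights) if y == 0)
    sum_positive = sum(w for y, w in zip(labels, weights) if y == 1)
    if sum_positive == 0:
        raise ValueError("no positive objects in the dataset")
    return sum_negative / sum_positive


ratio = suggested_scale_pos_weight([0] * 9 + [1] * 3)
```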
Alert
Do not use this parameter with `auto_class_weights` and `class_weights`.
**Type**
float
**Default value**
1\.0
**Supported processing units**
CPU and GPU
## boosting\_type
Command-line: `--boosting-type`
#### Description
Boosting scheme.
Possible values:
- Ordered: Usually provides better quality on small datasets, but it may be slower than the Plain scheme.
- Plain: The classic gradient boosting scheme.
**Type**
string
**Default value**
Depends on the processing unit type, the number of objects in the training dataset and the selected learning mode:
- CPU: Plain
- GPU:
  - Any number of objects, MultiClass or MultiClassOneVsAll mode: Plain
  - More than 50 thousand objects, any mode: Plain
  - Less than or equal to 50 thousand objects, any mode but MultiClass or MultiClassOneVsAll: Ordered
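The default selection above can be sketched as a function. `default_boosting_type` is a hypothetical helper that mirrors the listed rules and nothing more.

```python
def default_boosting_type(task_type, n_objects, loss_function):
    """Return the documented default boosting scheme."""
    if task_type == "CPU":
        return "Plain"
    if loss_function in {"MultiClass", "MultiClassOneVsAll"}:
        return "Plain"
    if n_objects > 50_000:
        return "Plain"
    return "Ordered"
```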
**Supported processing units**
CPU and GPU
Only the Plain mode is supported for the MultiClass loss on GPU
## boost\_from\_average
Command-line: `--boost-from-average`
#### Description
Initialize approximate values by best constant value for the specified loss function. Sets the value of bias to the initial best constant value.
Available for the following loss functions:
- RMSE
- Logloss
- CrossEntropy
- Quantile
- MAE
- MAPE
**Type**
bool
**Default value**
Depends on the selected loss function:
- True for RMSE, Quantile, MAE, MAPE
- False for all other loss functions
**Supported processing units**
CPU and GPU
## langevin
Command-line: `--langevin`
#### Description
Enables the Stochastic Gradient Langevin Boosting mode.
Refer to the [SGLB: Stochastic Gradient Langevin Boosting](https://arxiv.org/abs/2001.07248) paper for details.
**Type**
bool
**Default value**
False
**Supported processing units**
CPU
## diffusion\_temperature
Command-line: `--diffusion-temperature`
#### Description
The diffusion temperature of the Stochastic Gradient Langevin Boosting mode.
Only non-negative values are supported.
**Type**
float
**Default value**
10000
**Supported processing units**
CPU
## posterior\_sampling
Command-line: `--posterior-sampling`
#### Description
If this parameter is set, the options listed below are set automatically and the model parameters are checked so that uncertainty predictions with good theoretical properties can be obtained.
Specified options:
- `Langevin`: true
- `DiffusionTemperature`: the number of objects in the learn pool
- `ModelShrinkRate`: 1 / (2 \* the number of objects in the learn pool)
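As a worked example of the derived values, the sketch below computes them for a given learn pool size. `posterior_sampling_options` is a hypothetical helper; the dictionary keys simply repeat the option names listed above.

```python
def posterior_sampling_options(learn_pool_size):
    """Return the options implied by posterior_sampling for a pool size."""
    if learn_pool_size <= 0:
        raise ValueError("the learn pool must contain at least one object")
    return {
        "Langevin": True,
        "DiffusionTemperature": learn_pool_size,
        "ModelShrinkRate": 1 / (2.0 * learn_pool_size),
    }


options = posterior_sampling_options(1000)
```

For a learn pool with 1000 objects this gives a diffusion temperature of 1000 and a model shrink rate of 0.0005.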
**Type**
bool
**Default value**
False
**Supported processing units**
CPU only
## allow\_const\_label
Command-line: `--allow-const-label`
#### Description
Use it to train models with datasets that have equal label values for all objects.
**Type**
bool
**Default value**
False
**Supported processing units**
CPU and GPU
## score\_function
Command-line: `--score-function`
#### Description
The [score type](https://catboost.ai/docs/en/references/training-parameters/en/concepts/algorithm-score-functions) used to select the next split during the tree construction.
Possible values:
- Cosine (do not use this score type with the Lossguide tree growing policy)
- L2
- NewtonCosine (do not use this score type with the Lossguide tree growing policy)
- NewtonL2
**Type**
string
**Default value**
Cosine
**Supported processing units**
The supported score functions vary depending on the processing unit type:
- GPU: All score types
- CPU: Cosine, L2
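The restrictions listed for this parameter can be collected into a small pre-flight check. The sketch below is only an illustration of the rules stated above; the function and constant names are hypothetical and it does not call CatBoost.

```python
CPU_SCORE_FUNCTIONS = {"Cosine", "L2"}
ALL_SCORE_FUNCTIONS = {"Cosine", "L2", "NewtonCosine", "NewtonL2"}
NOT_FOR_LOSSGUIDE = {"Cosine", "NewtonCosine"}

def check_score_function(score_function, task_type="CPU", grow_policy="SymmetricTree"):
    """Return a list of problems with the requested combination (empty if none)."""
    problems = []
    if score_function not in ALL_SCORE_FUNCTIONS:
        problems.append(f"unknown score function: {score_function}")
    elif task_type == "CPU" and score_function not in CPU_SCORE_FUNCTIONS:
        problems.append(f"{score_function} is listed as GPU-only")
    if grow_policy == "Lossguide" and score_function in NOT_FOR_LOSSGUIDE:
        problems.append(f"{score_function} should not be used with Lossguide")
    return problems

print(check_score_function("Cosine", grow_policy="Lossguide"))
```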
## monotone\_constraints
Command-line: `--monotone-constraints`
#### Description
Impose monotonic constraints on numerical features.
Possible values:
- "1": Increasing constraint on the feature. The algorithm forces the model to be a non-decreasing function of this feature.
- "-1": Decreasing constraint on the feature. The algorithm forces the model to be a non-increasing function of this feature.
- "0": Constraints are disabled.
Supported formats for setting the value of this parameter (all feature indices are zero-based):
- Set constraints individually for each feature as a string (the number of features is n).
Format
```
"(<constraint_0>, <constraint_1>, .., <constraint_n-1>)"
```
Zero constraints for features at the end of the list may be dropped.
In `monotone_constraints = "(1,0,-1)"`, an increasing constraint is set on the first feature and a decreasing one on the third. Constraints are disabled for all other features.
- Set constraints individually for each explicitly specified feature as a string (the number of features is n).
```
"<feature index or name>:<constraint>, .., <feature index or name>:<constraint>"
```
These examples
```
monotone-constraints = "2:1,4:-1"
```
```
monotone-constraints = "Feature2:1,Feature4:-1"
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
- Set constraints individually for each required feature as an array or a dictionary (the number of features is n).
Format
```
[<constraint_0>, <constraint_1>, .., <constraint_n-1>]
```
```
{"<feature index or name>":<constraint>, .., "<feature index or name>":<constraint>}
```
Array examples
```
monotone_constraints = [1, 0, -1]
```
These dictionary examples
```
monotone_constraints = {"Feature2":1,"Feature4":-1}
```
```
monotone_constraints = {"2":1, "4":-1}
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
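The equivalence of the formats above can be checked by reducing each of them to the same mapping from feature index to constraint. The following is a sketch only: `normalize_constraints` and the feature names `Feature0` to `Feature4` are hypothetical, and this is not the parser that CatBoost uses.

```python
def normalize_constraints(spec, feature_names):
    """Convert any documented monotone_constraints form to {feature index: constraint}."""
    def to_index(key):
        key = str(key).strip()
        # A key is either a zero-based index or a feature name.
        return int(key) if key.isdigit() else feature_names.index(key)

    if isinstance(spec, dict):
        pairs = [(to_index(k), int(v)) for k, v in spec.items()]
    elif isinstance(spec, (list, tuple)):
        pairs = list(enumerate(int(v) for v in spec))
    elif spec.strip().startswith("("):
        values = spec.strip().strip("()").split(",")
        pairs = list(enumerate(int(v) for v in values))
    else:
        items = (item.split(":") for item in spec.split(","))
        pairs = [(to_index(k), int(v)) for k, v in items]
    # Zero means "no constraint", so it is dropped from the result.
    return {index: value for index, value in pairs if value != 0}

names = ["Feature0", "Feature1", "Feature2", "Feature3", "Feature4"]
by_index = normalize_constraints("2:1,4:-1", names)
by_name = normalize_constraints({"Feature2": 1, "Feature4": -1}, names)
positional = normalize_constraints("(1,0,-1)", names)
print(by_index, by_name, positional)
```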
**Type**
- list of strings
- string
- dict
- list
**Default value**
Python package, R package
None
Command-line
Omitted
**Supported processing units**
CPU
## feature\_weights
Command-line: `--feature-weights`
#### Description
Per-feature multiplication weights used when choosing the best split. The score of each candidate is multiplied by the weights of features from the current split.
Non-negative float values are supported for each weight.
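The effect of the weights on split selection can be pictured as follows. This is a simplified sketch that assumes every candidate split uses a single feature; the function name and the scores are made up for illustration and do not come from CatBoost.

```python
def pick_best_split(candidates, feature_weights):
    """candidates: list of (feature_index, raw_score); returns the winning candidate."""
    def weighted(candidate):
        feature_index, raw_score = candidate
        # Features without an explicit weight keep the default weight of 1.
        return raw_score * feature_weights.get(feature_index, 1.0)
    return max(candidates, key=weighted)

candidates = [(0, 10.0), (1, 4.0), (2, 2.0)]
unweighted_choice = pick_best_split(candidates, {})
weighted_choice = pick_best_split(candidates, {0: 0.1, 2: 3.0})
print(unweighted_choice, weighted_choice)
```

Here the weight 0.1 pushes feature 0 below the others, and the weight 3 lets feature 2 win even though its raw score is the lowest.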
Supported formats for setting the value of this parameter:
- Set the multiplication weight for each feature as a string (the number of features is n).
Format
```
"(<feature-weight_0>,<feature-weight_1>,..,<feature-weight_n-1>)"
```
Note
Spaces between values are not allowed.
Values should be passed as a parenthesized string of comma-separated values. Multiplication weights equal to 1 at the end of the list may be dropped.
In this example
```
feature_weights = "(0.1,1,3)"
```
the multiplication weight is set to 0.1, 1 and 3 for the first, second and third features respectively. The multiplication weight for all other features is set to 1.
- Set the multiplication weight individually for each explicitly specified feature as a string (the number of features is n).
Format
```
"<feature index or name>:<weight>,..,<feature index or name>:<weight>"
```
Note
Spaces between values are not allowed.
These examples
```
feature_weights = "2:1.1,4:0.1"
```
```
feature_weights = "Feature2:1.1,Feature4:0.1"
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
- Set the multiplication weight individually for each required feature as an array or a dictionary (the number of features is n).
Format
```
[<feature-weight_0>, <feature-weight_1>, .., <feature-weight_n-1>]
```
```
{"<feature index or name>":<weight>, .., "<feature index or name>":<weight>}
```
Array examples
```
feature_weights = [0.1, 1, 3]
```
These dictionary examples
```
feature_weights = {"Feature2":1.1,"Feature4":0.3}
```
```
feature_weights = {"2":1.1, "4":0.3}
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
**Type**
- list
- numpy.ndarray
- string
- dict
**Default value**
1 for all features
**Supported processing units**
CPU
## first\_feature\_use\_penalties
Command-line: `--first-feature-use-penalties`
#### Description
Per-feature penalties for the first occurrence of the feature in the model. The given value is subtracted from the score if the current candidate is the first one to include the feature in the model.
Refer to the [Per-object and per-feature penalties](https://catboost.ai/docs/en/references/training-parameters/en/concepts/algorithm-score-functions) section for details on applying different score penalties.
Non-negative float values are supported for each penalty.
- Set the penalty for each feature as a string (the number of features is n).
Format
```
"(<feature-penalty_0>,<feature-penalty_1>,..,<feature-penalty_n-1>)"
```
Note
Spaces between values are not allowed.
Values should be passed as a parenthesized string of comma-separated values. Penalties equal to 0 at the end of the list may be dropped.
In this example
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = "(0.1,1,3)"
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = "(0.1,1,3)"
```
the penalty is set to 0.1, 1 and 3 for the first, second and third features respectively. The penalty for all other features is set to 0.
- Set the penalty individually for each explicitly specified feature as a string (the number of features is n).
Format
```
"<feature index or name>:<penalty>,..,<feature index or name>:<penalty>"
```
Note
Spaces between values are not allowed.
These examples `first_feature_use_penalties` parameter:
```
first_feature_use_penalties = "2:1.1,4:0.1"
```
```
first_feature_use_penalties = "Feature2:1.1,Feature4:0.1"
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = "2:1.1,4:0.1"
```
```
per_object_feature_penalties = "Feature2:1.1,Feature4:0.1"
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
- Set the penalty individually for each required feature as an array or a dictionary (the number of features is n).
Format
```
[<feature-penalty_0>, <feature-penalty_1>, .., <feature-penalty_n-1>]
```
```
{"<feature index or name>":<penalty>, .., "<feature index or name>":<penalty>}
```
Array examples.
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = [0.1, 1, 3]
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = [0.1, 1, 3]
```
These dictionary examples
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = {"Feature2":1.1,"Feature4":0.1}
```
```
first_feature_use_penalties = {"2":1.1, "4":0.1}
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = {"Feature2":1.1,"Feature4":0.1}
```
```
per_object_feature_penalties = {"2":1.1, "4":0.1}
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
**Type**
- list
- numpy.ndarray
- string
- dict
**Default value**
0 for all features
**Supported processing units**
CPU
## fixed\_binary\_splits
Command-line: `--fixed-binary-splits`
#### Description
A list of indices of binary features to put at the top of each tree; ignored if `grow_policy` is `SymmetricTree`.
**Type**
list
**Default value**
None
**Supported processing units**
GPU
## penalties\_coefficient
Command-line: `--penalties-coefficient`
#### Description
A single-value common coefficient to multiply all penalties.
Non-negative values are supported.
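As a rough picture of how this coefficient could interact with `first_feature_use_penalties`, the sketch below subtracts the scaled penalty from a candidate's score when the feature has not been used in the model yet. The function name, the scores and the exact way the two values are combined are assumptions based on the parameter descriptions, not CatBoost's actual scoring code.

```python
def penalized_score(raw_score, feature_index, used_features,
                    first_use_penalties, penalties_coefficient=1.0):
    """Subtract the first-use penalty when the feature is not in the model yet."""
    if feature_index in used_features:
        return raw_score
    penalty = first_use_penalties.get(feature_index, 0.0)  # default penalty is 0
    return raw_score - penalties_coefficient * penalty

penalties = {2: 1.1, 4: 0.1}
new_feature = penalized_score(5.0, 2, used_features={0},
                              first_use_penalties=penalties,
                              penalties_coefficient=2.0)
seen_feature = penalized_score(5.0, 2, used_features={2},
                               first_use_penalties=penalties)
print(new_feature, seen_feature)
```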
**Type**
float
**Default value**
1
**Supported processing units**
CPU
## per\_object\_feature\_penalties
Command-line: `--per-object-feature-penalties`
#### Description
Per-object penalties for the first use of the feature for the object. The given value is multiplied by the number of objects that are divided by the current split and use the feature for the first time.
Refer to the [Per-object and per-feature penalties](https://catboost.ai/docs/en/references/training-parameters/en/concepts/algorithm-score-functions) section for details on applying different score penalties.
Non-negative float values are supported for each penalty.
Python package
- Set the penalty for each feature as a string (the number of features is n).
Format
```
"(<feature-penalty_0>,<feature-penalty_1>,..,<feature-penalty_n-1>)"
```
Note
Spaces between values are not allowed.
Values should be passed as a parenthesized string of comma-separated values. Penalties equal to 0 at the end of the list may be dropped.
In this example
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = "(0.1,1,3)"
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = "(0.1,1,3)"
```
the penalty is set to 0.1, 1 and 3 for the first, second and third features respectively. The penalty for all other features is set to 0.
- Set the penalty individually for each explicitly specified feature as a string (the number of features is n).
Format
```
"<feature index or name>:<penalty>,..,<feature index or name>:<penalty>"
```
Note
Spaces between values are not allowed.
These examples `first_feature_use_penalties` parameter:
```
first_feature_use_penalties = "2:1.1,4:0.1"
```
```
first_feature_use_penalties = "Feature2:1.1,Feature4:0.1"
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = "2:1.1,4:0.1"
```
```
per_object_feature_penalties = "Feature2:1.1,Feature4:0.1"
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
- Set the penalty individually for each required feature as an array or a dictionary (the number of features is n).
Format
```
[<feature-penalty_0>, <feature-penalty_1>, .., <feature-penalty_n-1>]
```
```
{"<feature index or name>":<penalty>, .., "<feature index or name>":<penalty>}
```
Array examples.
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = [0.1, 1, 3]
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = [0.1, 1, 3]
```
These dictionary examples
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = {"Feature2":1.1,"Feature4":0.1}
```
```
first_feature_use_penalties = {"2":1.1, "4":0.1}
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = {"Feature2":1.1,"Feature4":0.1}
```
```
per_object_feature_penalties = {"2":1.1, "4":0.1}
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
R package
- Set the penalty for each feature as a string (the number of features is n).
Format
```
"(<feature-penalty_0>,<feature-penalty_1>,..,<feature-penalty_n-1>)"
```
Note
Spaces between values are not allowed.
Values should be passed as a parenthesized string of comma-separated values. Penalties equal to 0 at the end of the list may be dropped.
In this example
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = "(0.1,1,3)"
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = "(0.1,1,3)"
```
the penalty is set to 0.1, 1 and 3 for the first, second and third features respectively. The penalty for all other features is set to 0.
- Set the penalty individually for each explicitly specified feature as a string (the number of features is n).
Format
```
"<feature index or name>:<penalty>,..,<feature index or name>:<penalty>"
```
Note
Spaces between values are not allowed.
These examples
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = "2:1.1,4:0.1"
```
```
first_feature_use_penalties = "Feature2:1.1,Feature4:0.1"
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = "2:1.1,4:0.1"
```
```
per_object_feature_penalties = "Feature2:1.1,Feature4:0.1"
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
**Type**
- list
- numpy.ndarray
- string
- dict
**Default value**
0 for all objects
**Supported processing units**
CPU
## model\_shrink\_rate
Command-line: `--model-shrink-rate`
#### Description
The constant used to calculate the coefficient for multiplying the model on each iteration.
The actual model shrinkage coefficient calculated at each iteration depends on the value of the `--model-shrink-mode` parameter (Command-line version). The resulting value of the coefficient should always be in the range (0, 1].
**Type**
float
**Default value**
The default value depends on the values of the following parameters:
- `--model-shrink-mode` for the Command-line version
- `--monotone-constraints` for the Command-line version
**Supported processing units**
CPU
## model\_shrink\_mode
Command-line: `--model-shrink-mode`
#### Description
Determines how the actual model shrinkage coefficient is calculated at each iteration.
Possible values:
- Constant:
  $1 - model\_shrink\_rate \cdot learning\_rate$, where:
  - $model\_shrink\_rate$ is the value of the `--model-shrink-rate` parameter (Command-line version).
  - $learning\_rate$ is the value of the `--learning-rate` parameter (Command-line version).
- Decreasing:
  $1 - \frac{model\_shrink\_rate}{i}$, where:
  - $model\_shrink\_rate$ is the value of the `--model-shrink-rate` parameter (Command-line version).
  - $i$ is the identifier of the iteration.
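The two modes can be compared numerically with a short sketch. The function name is hypothetical, and the iteration identifier is assumed to start at 1, because the Decreasing formula divides by it.

```python
def shrink_coefficient(mode, model_shrink_rate, learning_rate, iteration):
    """Model shrinkage coefficient for one iteration (iteration assumed to start at 1)."""
    if mode == "Constant":
        coefficient = 1.0 - model_shrink_rate * learning_rate
    elif mode == "Decreasing":
        coefficient = 1.0 - model_shrink_rate / iteration
    else:
        raise ValueError(f"unknown model_shrink_mode: {mode}")
    # The documentation requires the coefficient to lie in (0, 1].
    if not 0.0 < coefficient <= 1.0:
        raise ValueError(f"coefficient {coefficient} is outside (0, 1]")
    return coefficient

constant = shrink_coefficient("Constant", 0.1, 0.03, iteration=1)
decreasing = [shrink_coefficient("Decreasing", 0.5, 0.03, i) for i in (1, 2, 4)]
print(constant, decreasing)
```

In Constant mode the coefficient is the same on every iteration, while in Decreasing mode it approaches 1 as the iteration identifier grows.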
**Type**
string
**Default value**
Constant
**Supported processing units**
CPU
 |
| Readable Markdown | ## loss\_function
Command-line: `--loss-function`
*Alias:* `objective`
#### Description
The [metric](https://catboost.ai/docs/en/concepts/loss-functions) to use in training. The specified value also determines the machine learning problem to solve. Some metrics support optional parameters (see the [Objectives and metrics](https://catboost.ai/docs/en/concepts/loss-functions) section for details on each metric).
Format:
```
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
```
Supported metrics
- RMSE
- Logloss
- MAE
- CrossEntropy
- Quantile
- LogLinQuantile
- Lq
- MultiRMSE
- MultiClass
- MultiClassOneVsAll
- MultiLogloss
- MultiCrossEntropy
- MAPE
- Poisson
- PairLogit
- PairLogitPairwise
- QueryRMSE
- QuerySoftMax
- GroupQuantile
- Tweedie
- YetiRank
- YetiRankPairwise
- StochasticFilter
- StochasticRank
A custom python object can also be set as the value of this parameter (see an [example](https://catboost.ai/docs/en/concepts/python-usages-examples)).
For example, use the following construction to calculate the value of Quantile with the coefficient $\alpha = 0.1$:
```
Quantile:alpha=0.1
```
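When metric strings are assembled programmatically, the format above can be produced by a small helper. This is a sketch with a hypothetical function name; it only builds the string and does not check that the metric or its parameters exist.

```python
def format_metric(name, **params):
    """Build a '<Metric>[:<parameter>=<value>;..]' string."""
    if not params:
        return name
    joined = ";".join(f"{key}={value}" for key, value in params.items())
    return f"{name}:{joined}"

quantile = format_metric("Quantile", alpha=0.1)
plain = format_metric("RMSE")
print(quantile, plain)
```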
**Type**
- string
- object
**Default value**
Python package
Depends on the class:
- [CatBoostClassifier](https://catboost.ai/docs/en/concepts/python-reference_catboostclassifier): Logloss if the `target_border` parameter value differs from None. Otherwise, the default loss function depends on the number of unique target values and is either set to Logloss or MultiClass.
- [CatBoost](https://catboost.ai/docs/en/concepts/python-reference_catboost) and [CatBoostRegressor](https://catboost.ai/docs/en/concepts/python-reference_catboostregressor): RMSE
R package, Command-line
RMSE
**Supported processing units**
CPU and GPU
## custom\_metric
Command-line: `--custom-metric`
#### Description
[Metric](https://catboost.ai/docs/en/concepts/loss-functions) values to output during training. These functions are not optimized and are displayed for informational purposes only. Some metrics support optional parameters (see the [Objectives and metrics](https://catboost.ai/docs/en/concepts/loss-functions) section for details on each metric).
Format:
```
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
```
[Supported metrics](https://catboost.ai/docs/en/references/custom-metric__supported-metrics)
Examples
- Calculate the value of CrossEntropy:
```
CrossEntropy
```
- Calculate the value of Quantile with the coefficient $\alpha = 0.1$:
```
Quantile:alpha=0.1
```
- Calculate the values of Logloss and AUC:
```
['Logloss', 'AUC']
```
Values of all custom metrics for learn and validation datasets are saved to the [Metric](https://catboost.ai/docs/en/concepts/output-data_loss-function) output files (`learn_error.tsv` and `test_error.tsv` respectively). The directory for these files is specified in the `--train-dir` (`train_dir`) parameter.
Use the [visualization tools](https://catboost.ai/docs/en/features/visualization) to see a live chart with the dynamics of the specified metrics.
**Type**
- string
- list of strings
**Default value**
Python package
None
R package
None
Command-line
None (do not output additional metric values)
**Supported processing units**
CPU and GPU
## eval\_metric
Command-line: `--eval-metric`
#### Description
The metric used for overfitting detection (if enabled) and best model selection (if enabled). Some metrics support optional parameters (see the [Objectives and metrics](https://catboost.ai/docs/en/concepts/loss-functions) section for details on each metric).
Format:
```
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
```
[Supported metrics](https://catboost.ai/docs/en/references/eval-metric__supported-metrics)
A user-defined function can also be set as the value (see an [example](https://catboost.ai/docs/en/concepts/python-usages-examples)).
Examples:
```
R2
```
**Type**
- string
- object
**Default value**
Optimized objective is used
**Supported processing units**
CPU and GPU
## iterations
Command-line: `-i`, `--iterations`
*Aliases:* `num_boost_round`, `n_estimators`, `num_trees`
#### Description
The maximum number of trees that can be built when solving machine learning problems.
When using other parameters that limit the number of iterations, the final number of trees may be less than the number specified in this parameter.
**Type**
int
**Default value**
1000
**Supported processing units**
CPU and GPU
## learning\_rate
Command-line: `-w`, `--learning-rate`
*Alias:* `eta`
#### Description
The learning rate.
Used for reducing the gradient step.
**Type**
float
**Default value**
The default value is defined automatically for [`Logloss`](https://catboost.ai/docs/en/concepts/loss-functions-classification#Logit), [`MultiClass`](https://catboost.ai/docs/en/concepts/loss-functions-multiclassification#MultiClass) and [`RMSE`](https://catboost.ai/docs/en/concepts/loss-functions-regression#RMSE) loss functions depending on the number of iterations if none of parameters [`leaf_estimation_iterations`](https://catboost.ai/docs/en/references/training-parameters/common#leaf_estimation_iterations), [`leaf_estimation_method`](https://catboost.ai/docs/en/references/training-parameters/common#leaf_estimation_method), [`l2_leaf_reg`](https://catboost.ai/docs/en/references/training-parameters/common#l2_leaf_reg) is set. In this case, the selected learning rate is printed to stdout and saved in the model.
In other cases, the default value is 0.03.
**Supported processing units**
CPU and GPU
## random\_seed
Command-line: `-r`, `--random-seed`
*Alias:* `random_state`
#### Description
The random seed used for training.
**Type**
int
**Default value**
Python package
None (0)
R package, Command-line
0
**Supported processing units**
CPU and GPU
## l2\_leaf\_reg
Command-line: `--l2-leaf-reg`, `--l2-leaf-regularizer`
*Alias:* `reg_lambda`
#### Description
Coefficient at the L2 regularization term of the cost function.
Any positive value is allowed.
**Type**
float
**Default value**
3.0
**Supported processing units**
CPU and GPU
## bootstrap\_type
Command-line: `--bootstrap-type`
#### Description
[Bootstrap type](https://catboost.ai/docs/en/concepts/algorithm-main-stages_bootstrap-options). Defines the method for sampling the weights of objects.
Supported methods:
- Bayesian
- Bernoulli
- MVS
- Poisson (supported for GPU only)
- No
**Type**
string
**Default value**
The default value depends on `objective`, `task_type`, `bagging_temperature` and `sampling_unit`:
- The objective is QueryCrossEntropy, YetiRankPairwise or PairLogitPairwise and the `bagging_temperature` parameter is not set: Bernoulli with the `subsample` parameter set to 0.5.
- The objective is neither MultiClass nor MultiClassOneVsAll, `task_type` = CPU and `sampling_unit` = Object: MVS with the `subsample` parameter set to 0.8.
- Otherwise: Bayesian.
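Read top to bottom, these rules amount to a short decision function. The sketch below assumes the rules are checked in the order listed; the function and constant names are hypothetical, and `bagging_temperature=None` stands for "not set".

```python
PAIRWISE_OBJECTIVES = {"QueryCrossEntropy", "YetiRankPairwise", "PairLogitPairwise"}
MULTICLASS_OBJECTIVES = {"MultiClass", "MultiClassOneVsAll"}

def default_bootstrap(objective, task_type="CPU", sampling_unit="Object",
                      bagging_temperature=None):
    """Return (bootstrap_type, subsample) following the three documented rules."""
    if objective in PAIRWISE_OBJECTIVES and bagging_temperature is None:
        return "Bernoulli", 0.5
    if (objective not in MULTICLASS_OBJECTIVES and task_type == "CPU"
            and sampling_unit == "Object"):
        return "MVS", 0.8
    return "Bayesian", None

print(default_bootstrap("RMSE"))
print(default_bootstrap("MultiClass"))
```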
**Supported processing units**
CPU and GPU
## bagging\_temperature
Command-line: `--bagging-temperature`
#### Description
Defines the settings of the Bayesian bootstrap. It is used by default in classification and regression modes.
Use the Bayesian bootstrap to assign random weights to objects.
The weights are sampled from exponential distribution if the value of this parameter is set to "1". All weights are equal to 1 if the value of this parameter is set to "0".
Possible values are in the range $[0; \infty)$. The higher the value, the more aggressive the bagging is.
This parameter can be used if the selected bootstrap type is Bayesian.
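The two special cases mentioned above (exponential weights at 1, unit weights at 0) are both reproduced by weights of the form $(-\ln u)^{t}$ with $u$ uniform on $(0, 1]$ and $t$ the bagging temperature. The sketch below assumes that form; the function name is hypothetical and CatBoost's internal sampler may differ.

```python
import math
import random

def bayesian_weights(object_count, bagging_temperature, seed=0):
    """Sample one weight per object as (-ln u) ** bagging_temperature."""
    rng = random.Random(seed)
    weights = []
    for _ in range(object_count):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        weights.append((-math.log(u)) ** bagging_temperature)
    return weights

no_bagging = bayesian_weights(5, bagging_temperature=0)
exponential = bayesian_weights(5, bagging_temperature=1)
print(no_bagging)
print(exponential)
```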
**Type**
float
**Default value**
1
**Supported processing units**
CPU and GPU
## subsample
Command-line: `--subsample`
#### Description
Sample rate for bagging.
This parameter can be used if one of the following bootstrap types is selected:
- Poisson
- Bernoulli
- MVS
**Type**
float
**Default value**
The default value depends on the dataset size and the bootstrap type:
- Datasets with fewer than 100 objects: 1
- Datasets with 100 objects or more:
  - Poisson, Bernoulli: 0.66
  - MVS: 0.8
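The default can be expressed as a function of the dataset size and the bootstrap type. This is a sketch of the rules listed above with a hypothetical function name; it returns `None` for bootstrap types that do not use `subsample`.

```python
def default_subsample(object_count, bootstrap_type):
    """Return the documented default sample rate, or None if subsample does not apply."""
    if bootstrap_type not in {"Poisson", "Bernoulli", "MVS"}:
        return None  # subsample is only used with these bootstrap types
    if object_count < 100:
        return 1.0
    return 0.8 if bootstrap_type == "MVS" else 0.66

print(default_subsample(50, "MVS"), default_subsample(5000, "Bernoulli"))
```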
**Supported processing units**
CPU and GPU
## sampling\_frequency
Command-line: `--sampling-frequency`
#### Description
Frequency to sample weights and objects when building trees.
Supported values:
- PerTree: Before constructing each new tree
- PerTreeLevel: Before choosing each new split of a tree
**Type**
string
**Default value**
PerTreeLevel
**Supported processing units**
CPU
## sampling\_unit
Command-line: `--sampling-unit`
#### Description
The sampling scheme.
Possible values:
- Object: The weight $w_{i}$ of the i-th object $o_{i}$ is used for sampling the corresponding object.
- Group: The weight $w_{j}$ of the group $g_{j}$ is used for sampling each object $o_{i_{j}}$ from the group $g_{j}$.
**Type**
string
**Default value**
Object
**Supported processing units**
CPU and GPU
## mvs\_reg
Command-line: `--mvs-reg`
#### Description
Affects the weight of the denominator and can be used for balancing between importance sampling and Bernoulli sampling (setting it to 0 implies importance sampling and setting it to $\infty$ implies Bernoulli sampling).
Note
This parameter is supported only for the MVS sampling method (the `bootstrap_type` parameter must be set to MVS).
**Type**
float
**Default value**
The value is set based on the gradient distribution on the current iteration
**Supported processing units**
CPU
## random\_strength
Command-line: `--random-strength`
#### Description
The amount of randomness to use for scoring splits when the tree structure is selected. Use this parameter to avoid overfitting the model.
The value of this parameter is used when selecting splits. On every iteration each possible split gets a score (for example, the score indicates how much adding this split will improve the loss function for the training dataset). The split with the highest score is selected.
On their own, the scores have no randomness. A normally distributed random variable is therefore added to the score of the feature. It has a zero mean and a variance that decreases during training. The value of this parameter is the multiplier of the variance.
Note
This parameter is not supported for the following loss functions:
- QueryCrossEntropy
- YetiRankPairwise
- PairLogitPairwise
**Type**
float
**Default value**
1
**Supported processing units**
CPU
## use\_best\_model
Command-line: `--use-best-model`
#### Description
If this parameter is set, the number of trees that are saved in the resulting model is defined as follows:
1. Build the number of trees defined by the training parameters.
2. Use the validation dataset to identify the iteration with the optimal value of the metric specified in `--eval-metric` (`eval_metric`).
No trees are saved after this iteration.
This option requires a validation dataset to be provided.
**Type**
bool
**Default value**
True if a validation set is input (the eval\_set parameter is defined) and at least one of the label values of objects in this set differs from the others. False otherwise.
**Supported processing units**
CPU and GPU
## best\_model\_min\_trees
Command-line: `--best-model-min-trees`
#### Description
The minimal number of trees that the best model should have. If set, the output model contains at least the given number of trees even if the optimal value of the evaluation metric on the validation dataset is achieved with a smaller number of trees.
Should be used with the `--use-best-model` parameter.
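The interaction between `use_best_model` and `best_model_min_trees` can be sketched on a list of per-iteration validation metric values. The function name is hypothetical, and the sketch assumes that only models with at least `best_model_min_trees` trees are considered when looking for the optimum; the library may resolve this differently.

```python
def trees_to_keep(metric_values, lower_is_better=True, best_model_min_trees=0):
    """metric_values[k] is the validation metric after k + 1 trees."""
    # Only models with at least best_model_min_trees trees are candidates (assumption).
    first = max(best_model_min_trees - 1, 0)
    candidates = range(first, len(metric_values))
    pick = min if lower_is_better else max
    best_iteration = pick(candidates, key=lambda k: metric_values[k])
    return best_iteration + 1  # trees after the best iteration are dropped

validation_rmse = [0.50, 0.30, 0.40, 0.35]
kept = trees_to_keep(validation_rmse)
kept_min3 = trees_to_keep(validation_rmse, best_model_min_trees=3)
print(kept, kept_min3)
```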
**Type**
int
**Default value**
Python package, R package
None (The minimal number of trees for the best model is not set)
Command-line
The minimal number of trees for the best model is not set
**Supported processing units**
CPU and GPU
## depth
Command-line: `-n`, `--depth`
*Alias:* `max_depth`
#### Description
Depth of the trees.
The range of supported values depends on the processing unit type and the type of the selected loss function:
- CPU: Any integer up to 16.
- GPU: Any integer up to 8 for pairwise modes (YetiRank, PairLogitPairwise, and QueryCrossEntropy), and up to 16 for all other loss functions.
**Type**
int
**Default value**
6 (16 if the growing policy is set to Lossguide)
**Supported processing units**
CPU and GPU
## grow\_policy
Command-line: `--grow-policy`
#### Description
The tree growing policy. Defines how to perform greedy tree construction.
Possible values:
- SymmetricTree: A tree is built level by level until the specified depth is reached. On each iteration, all leaves from the last tree level are split with the same condition. The resulting tree structure is always symmetric.
- Depthwise: A tree is built level by level until the specified depth is reached. On each iteration, all non-terminal leaves from the last tree level are split. Each leaf is split by the condition with the best loss improvement.
Note
Models with this growing policy cannot be analyzed using the PredictionDiff feature importance and can be exported only to json and cbm.
- Lossguide: A tree is built leaf by leaf until the specified maximum number of leaves is reached. On each iteration, the non-terminal leaf with the best loss improvement is split.
Note
Models with this growing policy cannot be analyzed using the PredictionDiff feature importance and can be exported only to json and cbm.
**Type**
string
**Default value**
SymmetricTree
**Supported processing units**
CPU and GPU
## min\_data\_in\_leaf
Command-line: `--min-data-in-leaf`
*Alias:* `min_child_samples`
#### Description
The minimum number of training samples in a leaf. CatBoost does not search for new splits in leaves with samples count less than the specified value.
Can be used only with the Lossguide and Depthwise growing policies.
**Type**
int
**Default value**
1
**Supported processing units**
CPU and GPU
## max\_leaves
Command-line: `--max-leaves`
*Alias:* `num_leaves`
#### Description
The maximum number of leaves in the resulting tree. Can be used only with the Lossguide growing policy.
Note
It is not recommended to use values greater than 64, since it can significantly slow down the training process.
**Type**
int
**Default value**
31
**Supported processing units**
CPU and GPU
## ignored\_features
Command-line: `-I`, `--ignore-features`
#### Description
Feature indices to exclude from the training.
Python package
It is assumed that all passed values are feature names if at least one of the passed values cannot be converted to a number or a range of numbers. Otherwise, it is assumed that all passed values are feature indices.
Specifics:
- Non-negative indices that do not match any features are successfully ignored. For example, if five features are defined for the objects in the dataset and this parameter is set to "42", the corresponding non-existing feature is successfully ignored.
- The identifier corresponds to the feature's index. Feature indices used in training and feature importance are numbered from 0 to `featureCount - 1`. If a file is used as [input data](https://catboost.ai/docs/en/concepts/input-data), any non-feature column types are ignored when calculating these indices. For example, suppose each row in the input file contains data in the following order: `cat feature<\t>label value<\t>num feature`. Then for the row `rock<\t>0<\t>42`, the identifier for the "rock" feature is 0, and for the "42" feature it is 1.
For example, use the following construction if features indexed 1, 2, 7, 42, 43, 44, 45, should be ignored: `[1,2,7,42,43,44,45]`
R package
Specifics:
- Non-negative indices that do not match any features are successfully ignored. For example, if five features are defined for the objects in the dataset and this parameter is set to "42", the corresponding non-existing feature is successfully ignored.
- The identifier corresponds to the feature's index. Feature indices used in training and feature importance are numbered from 0 to `featureCount - 1`. If a file is used as [input data](https://catboost.ai/docs/en/concepts/input-data), any non-feature column types are ignored when calculating these indices. For example, suppose each row in the input file contains data in the following order: `cat feature<\t>label value<\t>num feature`. Then for the row `rock<\t>0<\t>42`, the identifier for the "rock" feature is 0, and for the "42" feature it is 1.
For example, if training should exclude features with the identifiers 1, 2, 7, 42, 43, 44, 45, the value of this parameter should be set to c(1,2,7,42,43,44,45).
Command-line
It is assumed that all passed values are feature names if at least one of the passed values can not be converted to a number or a range of numbers. Otherwise, it is assumed that all passed values are feature indices.
Specifics:
- Non-negative indices that do not match any features are successfully ignored. For example, if five features are defined for the objects in the dataset and this parameter is set to "42", the corresponding non-existing feature is successfully ignored.
- The identifier corresponds to the feature's index. Feature indices used in training and feature importance are numbered from 0 to `featureCount - 1`. If a file is used as [input data](https://catboost.ai/docs/en/concepts/input-data), any non-feature column types are ignored when calculating these indices. For example, suppose each row in the input file contains data in the following order: `cat feature<\t>label value<\t>num feature`. Then for the row `rock<\t>0<\t>42`, the identifier for the "rock" feature is 0, and for the "42" feature it is 1.
For example, if training should exclude features with the identifiers 1, 2, 7, 42, 43, 44, 45, use the following construction: `1:2:7:42-45`.
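The colon-and-range notation above can be expanded into explicit indices. The following is a minimal sketch, assuming every token is either a single index or an inclusive `start-end` range; `parse_ignored_features` is a hypothetical helper written for illustration, not CatBoost's own parser, and it does not handle feature names.

```python
def parse_ignored_features(spec):
    """Expand a colon-separated index spec such as '1:2:7:42-45'.

    Hypothetical helper for illustration only; handles indices, not names.
    """
    indices = []
    for token in spec.split(":"):
        if "-" in token:
            # A range such as '42-45' is treated as inclusive on both ends.
            start, end = (int(part) for part in token.split("-"))
            indices.extend(range(start, end + 1))
        else:
            indices.append(int(token))
    return indices


print(parse_ignored_features("1:2:7:42-45"))  # [1, 2, 7, 42, 43, 44, 45]
```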
**Default value**
Python package, R package
None
Command-line
Omitted
**Supported processing units**
CPU and GPU
## one\_hot\_max\_size
Command-line: `--one-hot-max-size`
#### Description
Use one-hot encoding for all categorical features with a number of different values less than or equal to the given parameter value. Ctrs are not calculated for such features.
See [details](https://catboost.ai/docs/en/features/categorical-features).
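As a rough illustration of the rule above, the sketch below splits categorical columns into those that would be one-hot encoded and those left for Ctr calculation. The function name and the dictionary-based data layout are assumptions made for this example; the actual encoding happens inside CatBoost.

```python
def split_by_encoding(columns, one_hot_max_size):
    """Return (one_hot, ctr) lists of categorical column names.

    `columns` maps a column name to its list of raw category values.
    """
    one_hot, ctr = [], []
    for name, values in columns.items():
        # At most `one_hot_max_size` distinct values: one-hot encode.
        if len(set(values)) <= one_hot_max_size:
            one_hot.append(name)
        else:
            ctr.append(name)
    return one_hot, ctr


data = {
    "is_member": ["yes", "no", "yes", "no"],   # 2 distinct values
    "city": ["Oslo", "Rome", "Lima", "Oslo"],  # 3 distinct values
}
print(split_by_encoding(data, one_hot_max_size=2))  # (['is_member'], ['city'])
```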
**Type**
int
**Default value**
The default value depends on various conditions:
- N/A if training is performed on CPU in Pairwise scoring mode
Read more about Pairwise scoring
The following loss functions use Pairwise scoring:
- YetiRankPairwise
- PairLogitPairwise
- QueryCrossEntropy
Pairwise scoring is slightly different from regular training on pairs, since pairs are generated only internally during the training for the corresponding metrics. One-hot encoding is not available for these loss functions.
- 255 if training is performed on GPU and the selected Ctr types require target data that is not available during the training
- 10 if training is performed in [Ranking](https://catboost.ai/docs/en/concepts/loss-functions-ranking) mode
- 2 if none of the conditions above is met
**Supported processing units**
CPU and GPU
## has\_time
Command-line: `--has-time`
#### Description
Use the order of objects in the input data (do not perform random permutations during the [Transforming categorical features to numerical features](https://catboost.ai/docs/en/concepts/algorithm-main-stages_cat-to-numberic) and [Choosing the tree structure](https://catboost.ai/docs/en/concepts/algorithm-main-stages_choose-tree-structure) stages).
The Timestamp column type is used to determine the order of objects if specified in the [input data](https://catboost.ai/docs/en/concepts/input-data).
**Type**
bool
**Default value**
False (not used; generates random permutations)
**Supported processing units**
CPU and GPU
## rsm
Command-line: `--rsm`
*Alias:* `colsample_bylevel`
#### Description
Random subspace method. The percentage of features to use at each split selection, when features are selected over again at random.
The value must be in the range (0;1\].
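A minimal sketch of the idea, assuming the number of candidate features is obtained by rounding `rsm * n_features` and never drops below one; the exact rounding used by CatBoost is not stated here, and `sample_features` is a hypothetical helper.

```python
import random


def sample_features(n_features, rsm, rng):
    """Pick a random subset of feature indices for one split selection."""
    if not 0 < rsm <= 1:
        raise ValueError("rsm must be in the range (0; 1]")
    # Assumption: round to the nearest integer, keep at least one feature.
    count = max(1, round(rsm * n_features))
    return sorted(rng.sample(range(n_features), count))


rng = random.Random(0)
# A new subset is drawn each time a split is selected.
print(sample_features(10, 0.3, rng))
print(sample_features(10, 0.3, rng))
```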
**Type**
float (0;1\]
**Default value**
None (set to 1)
**Supported processing units**
CPU; GPU for pairwise ranking
## nan\_mode
Command-line: `--nan-mode`
#### Description
The method for [processing missing values](https://catboost.ai/docs/en/concepts/algorithm-missing-values-processing) in the input dataset.
Possible values:
- "Forbidden": Missing values are not supported, their presence is interpreted as an error.
- "Min": Missing values are processed as the minimum value (less than all other values) for the feature. It is guaranteed that a split that separates missing values from all other values is considered when selecting trees.
- "Max": Missing values are processed as the maximum value (greater than all other values) for the feature. It is guaranteed that a split that separates missing values from all other values is considered when selecting trees.
Using the Min or Max value of this parameter guarantees that a split between missing values and other values is considered when selecting a new split in the tree.
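The ordering implied by the three modes can be sketched as a sort key. This is an illustration only: it assumes missing values are represented as `None` or NaN, and `sort_key` is a hypothetical helper rather than part of CatBoost.

```python
import math


def sort_key(value, nan_mode="Min"):
    """Order a feature value, placing missing values according to nan_mode."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        if nan_mode == "Forbidden":
            raise ValueError("missing values are not allowed in Forbidden mode")
        # "Min": below every real value; "Max": above every real value.
        return -math.inf if nan_mode == "Min" else math.inf
    return value


values = [3.0, float("nan"), 1.5, 7.0]
print(sorted(values, key=lambda v: sort_key(v, "Min")))  # NaN comes first
print(sorted(values, key=lambda v: sort_key(v, "Max")))  # NaN comes last
```

Because missing values end up at one end of the ordering, a single threshold can separate them from all other values.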
**Type**
string
**Default value**
Min
**Supported processing units**
CPU and GPU
## input\_borders
Command-line: `--input-borders-file`
#### Description
Load [Custom quantization borders and missing value modes](https://catboost.ai/docs/en/concepts/input-data_custom-borders) from a file (do not generate them).
Borders are automatically generated before training if this parameter is not set.
**Type**
string
**Default value**
Python package
None
Command-line
The file is not loaded, the values are generated
**Supported processing units**
CPU and GPU
## output\_borders
Command-line: `--output-borders-file`
#### Description
Save quantization borders for the current dataset to a file.
Refer to the [file format description](https://catboost.ai/docs/en/concepts/output-data_custom-borders).
**Type**
string
**Default value**
Python package
None
Command-line
The file is not saved
**Supported processing units**
CPU and GPU
## fold\_permutation\_block
Command-line: `--fold-permutation-block`
#### Description
Objects in the dataset are grouped in blocks before the random permutations. This parameter defines the size of the blocks. The smaller the value, the slower the training. Large values may result in quality degradation.
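One way to picture block-wise permutation is sketched below: consecutive objects are grouped into blocks, and only the order of the blocks is shuffled. This is an assumption-laden illustration (the function name and the choice to keep the order inside each block are ours), not a description of CatBoost internals.

```python
import random


def block_permutation(n_objects, block_size, seed=0):
    """Shuffle object indices block by block, keeping order inside a block."""
    blocks = [
        list(range(start, min(start + block_size, n_objects)))
        for start in range(0, n_objects, block_size)
    ]
    random.Random(seed).shuffle(blocks)
    return [index for block in blocks for index in block]


order = block_permutation(n_objects=10, block_size=4)
print(order)  # blocks [0..3], [4..7], [8, 9] in a shuffled order
```

With `block_size=1` this reduces to an ordinary random permutation of the objects.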
**Type**
int
**Default value**
Python package
1
R package, Command-line
Default value differs depending on the dataset size and ranges from 1 to 256 inclusively
**Supported processing units**
CPU and GPU
## leaf\_estimation\_method
Command-line: `--leaf-estimation-method`
#### Description
The method used to calculate the values in leaves.
Possible values:
- Newton
- Gradient
- Exact
**Type**
string
**Default value**
Depends on the mode and the selected loss function:
- Regression with Quantile or MAE loss functions: One Exact iteration.
- Regression with any loss function but Quantile or MAE: One Gradient iteration.
- Classification mode: Ten Newton iterations.
- Multiclassification mode: One Newton iteration.
**Supported processing units**
CPU and GPU
## leaf\_estimation\_iterations
Command-line: `--leaf-estimation-iterations`
#### Description
CatBoost can calculate leaf values using several gradient or Newton steps instead of a single one.
This parameter regulates how many steps are done in every tree when calculating leaf values.
**Type**
int
**Default value**
Python package
None (Depends on the training objective)
R package, Command-line
Depends on the training objective
**Supported processing units**
CPU and GPU
## leaf\_estimation\_backtracking
Command-line: `--leaf-estimation-backtracking`
#### Description
When the value of the `leaf_estimation_iterations` parameter is greater than 1, CatBoost makes several gradient or Newton steps when calculating the resulting leaf values of a tree.
The behaviour differs depending on the value of this parameter:
- No: Every next step is a regular gradient or Newton step: the gradient step is calculated and added to the leaf.
- Any other value: Backtracking is used.
In this case, before adding a step, a condition is checked. If the condition is not met, the step size is reduced (divided by 2); otherwise, the step is added to the leaf.
When `leaf_estimation_iterations` for the Command-line version is set to `n`, the leaf estimation iterations are calculated as follows: each iteration is either an addition of the next step to the leaf value, or a scaling of the leaf value. Scaling counts as a separate iteration. Thus, it is possible that instead of having `n` gradient steps, the algorithm makes a single gradient step that is reduced `n` times, which means that it is divided by $2 \cdot n$.
Possible values:
- No: Do not use backtracking. Supported on CPU and GPU.
- AnyImprovement: Reduce the descent step up to the point when the loss function value is smaller than it was on the previous step. The trial reduction factors are 2, 4, 8, and so on. Supported on CPU and GPU.
- Armijo: Reduce the descent step until the Armijo condition is met. Supported only on GPU.
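The AnyImprovement rule can be sketched for a single leaf as follows. This is a simplified illustration under stated assumptions: the loss is a plain Python callable of the leaf value, and the `max_halvings` cap is introduced here only to guarantee termination; neither is part of the CatBoost API.

```python
def backtracking_step(loss, leaf_value, step, max_halvings=10):
    """Apply one AnyImprovement-style step to a leaf value.

    The step is halved until the loss decreases. If no improvement is found
    within `max_halvings` reductions, the leaf value is left unchanged.
    """
    current_loss = loss(leaf_value)
    for _ in range(max_halvings + 1):
        candidate = leaf_value + step
        if loss(candidate) < current_loss:
            return candidate
        step /= 2  # trial reduction factors 2, 4, 8, ...
    return leaf_value


def quadratic_loss(value):
    """Toy loss with its minimum at 1.0."""
    return (value - 1.0) ** 2


# The full step (4.0) and the halved step (2.0) do not improve the loss;
# the step divided by 4 (1.0) does.
print(backtracking_step(quadratic_loss, leaf_value=0.0, step=4.0))  # 1.0
```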
**Type**
string
**Default value**
AnyImprovement
**Supported processing units**
Depends on the selected value
## fold\_len\_multiplier
Command-line: `--fold-len-multiplier`
#### Description
Coefficient for changing the length of folds.
The value must be greater than 1. The best validation result is achieved with minimum values.
With values close to 1 (for example, $1+\epsilon$), each iteration takes a quadratic amount of memory and time for the number of objects in the iteration. Thus, low values are possible only when there is a small number of objects.
**Type**
float
**Default value**
2
**Supported processing units**
CPU and GPU
## approx\_on\_full\_history
Command-line: `--approx-on-full-history`
#### Description
The principles for calculating the approximated values.
Possible values:
- "False": Use only a fraction of the fold for calculating the approximated values. The size of the fraction is calculated as $\frac{1}{X}$, where `X` is the specified coefficient for changing the length of folds. This mode is faster and in rare cases slightly less accurate.
- "True": Use all the preceding rows in the fold for calculating the approximated values. This mode is slower and in rare cases slightly more accurate.
**Type**
bool
**Default value**
Python package, Command-line
False
R package
True
**Supported processing units**
CPU
## class\_weights
Command-line: `--class-weights`
#### Description
Class weights. The values are used as multipliers for the object weights. This parameter can be used for solving binary classification and multiclassification problems.
Python package
Note
For imbalanced datasets with binary classification, the weight multiplier can be set to 1 for class 0 and to $\left(\frac{sum\_negative}{sum\_positive}\right)$ for class 1.
For example, `class_weights=[0.1, 4]` multiplies the weights of objects from class 0 by 0.1 and the weights of objects from class 1 by 4.
If class labels are not standard consecutive integers \[0, 1 ... class\_count-1\], use the dict or collections.OrderedDict type with label to weight mapping.
For example, `class_weights={'a': 1.0, 'b': 0.5, 'c': 2.0}` multiplies the weights of objects with class label `a` by 1.0, the weights of objects with class label `b` by 0.5 and the weights of objects with class label `c` by 2.0.
The dictionary form can also be used with standard consecutive integers class labels for additional readability. For example: `class_weights={0: 1.0, 1: 0.5, 2: 2.0}`.
Note
Class labels are extracted from dictionary keys for the following types of class\_weights:
- dict
- collections.OrderedDict (when the order of classes in the model is important)
The class\_names parameter can be skipped when using these types.
Alert
Do not use this parameter with auto\_class\_weights and scale\_pos\_weight.
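The imbalance ratio from the note above can be computed before training. The sketch below assumes binary labels 0 and 1 and interprets `sum_negative` and `sum_positive` as the total object weight per class (equal to the object counts when all weights are 1); both helper functions are hypothetical.

```python
def imbalance_class_weights(labels, object_weights=None):
    """Return [1.0, sum_negative / sum_positive] for binary labels 0 and 1."""
    if object_weights is None:
        object_weights = [1.0] * len(labels)
    sum_negative = sum(w for y, w in zip(labels, object_weights) if y == 0)
    sum_positive = sum(w for y, w in zip(labels, object_weights) if y == 1)
    return [1.0, sum_negative / sum_positive]


def apply_class_weights(labels, object_weights, class_weights):
    """Multiply each object weight by the weight of the object's class."""
    return [w * class_weights[y] for y, w in zip(labels, object_weights)]


labels = [0, 0, 0, 0, 1]
weights = imbalance_class_weights(labels)
print(weights)                                          # [1.0, 4.0]
print(apply_class_weights(labels, [1.0] * 5, weights))  # [1.0, 1.0, 1.0, 1.0, 4.0]
```

The resulting list has the same shape as the `class_weights=[0.1, 4]` example above.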
R package
For example, `class_weights <- c(0.1, 4)` multiplies the weights of objects from class 0 by 0.1 and the weights of objects from class 1 by 4.
Alert
Do not use this parameter with auto\_class\_weights.
Command-line
Note
The number of class weights must match the number of class names specified in the `--class-names` parameter and the number of classes specified in the `--classes-count` parameter.
For imbalanced datasets with binary classification, the weight multiplier can be set to 1 for class 0 and to $\left(\frac{sum\_negative}{sum\_positive}\right)$ for class 1.
Format:
```
<value for class 1>,..,<value for class N>
```
For example:
```
0.85,1.2,1
```
Alert
Do not use this parameter with auto\_class\_weights.
**Type**
- list
- dict
- collections.OrderedDict
**Default value**
None (the weight for all classes is set to 1)
**Supported processing units**
CPU and GPU
## class\_names
#### Description
Class names. Allows redefining the default values when using the MultiClass and Logloss metrics.
If the upper limit for the numeric class label is specified, the number of class names should match this value.
Warning
The number of class names must match the number of class weights specified in the `--class-weights` parameter and the number of classes specified in the `--classes-count` parameter.
Format:
```
<name for class 1>,..,<name for class N>
```
For example:
```
smartphone,touchphone,tablet
```
**Type**
list of strings
**Default value**
None
**Supported processing units**
CPU and GPU
## auto\_class\_weights
Command-line: `--auto-class-weights`
#### Description
Automatically calculate class weights based either on the total weight or the total number of objects in each class. The values are used as multipliers for the object weights.
Supported values:
- None: All class weights are set to 1
- Balanced:
$CW_k=\displaystyle\frac{\max_{c=1}^K\left(\sum_{t_i=c}{w_i}\right)}{\sum_{t_i=k}{w_i}}$
- SqrtBalanced:
$CW_k=\sqrt{\displaystyle\frac{\max_{c=1}^K\left(\sum_{t_i=c}{w_i}\right)}{\sum_{t_i=k}{w_i}}}$
Alert
Do not use this parameter with `class_weights` and `scale_pos_weight`.
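The two weighting formulas can be evaluated directly. The sketch below reads $t_i$ as the class label of object $i$ and $w_i$ as its weight (an interpretation, since the symbols are not defined in this section); `auto_class_weights` here is a plain Python function written for illustration, not the library implementation.

```python
import math
from collections import defaultdict


def auto_class_weights(labels, object_weights=None, mode="Balanced"):
    """Compute CW_k per class following the Balanced / SqrtBalanced formulas."""
    if object_weights is None:
        object_weights = [1.0] * len(labels)
    totals = defaultdict(float)
    for label, weight in zip(labels, object_weights):
        totals[label] += weight  # total weight of each class
    largest = max(totals.values())
    if mode == "Balanced":
        return {k: largest / total for k, total in totals.items()}
    if mode == "SqrtBalanced":
        return {k: math.sqrt(largest / total) for k, total in totals.items()}
    raise ValueError(f"unsupported mode: {mode}")


labels = ["a"] * 8 + ["b"] * 2
print(auto_class_weights(labels, mode="Balanced"))      # {'a': 1.0, 'b': 4.0}
print(auto_class_weights(labels, mode="SqrtBalanced"))  # {'a': 1.0, 'b': 2.0}
```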
**Type**
string
**Default value**
None (all class weights are set to 1)
**Supported processing units**
CPU and GPU
## scale\_pos\_weight
#### Description
The weight for class 1 in binary classification. The value is used as a multiplier for the weights of objects from class 1.
Note
For imbalanced datasets, the weight multiplier can be set to $\left(\frac{sum\_negative}{sum\_positive}\right)$.
Alert
Do not use this parameter with `auto_class_weights` and `class_weights`.
**Type**
float
**Default value**
1\.0
**Supported processing units**
CPU and GPU
## boosting\_type
Command-line: `--boosting-type`
#### Description
Boosting scheme.
Possible values:
- Ordered ā Usually provides better quality on small datasets, but it may be slower than the Plain scheme.
- Plain ā The classic gradient boosting scheme.
**Type**
string
**Default value**
Depends on the processing unit type, the number of objects in the training dataset and the selected learning mode
- CPU
Plain
- GPU
- Any number of objects, MultiClass or MultiClassOneVsAll mode: Plain
- More than 50 thousand objects, any mode: Plain
- Less than or equal to 50 thousand objects, any mode but MultiClass or MultiClassOneVsAll: Ordered
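The default-value rules listed above can be written as a small decision function. This is only a restatement of the list for readability; `default_boosting_type` is a hypothetical helper, and the library resolves the default internally.

```python
def default_boosting_type(task_type, n_objects, loss_function):
    """Mirror the documented default rules for boosting_type."""
    if task_type == "CPU":
        return "Plain"
    # GPU rules follow.
    if loss_function in ("MultiClass", "MultiClassOneVsAll"):
        return "Plain"
    return "Plain" if n_objects > 50_000 else "Ordered"


print(default_boosting_type("GPU", 10_000, "Logloss"))     # Ordered
print(default_boosting_type("GPU", 10_000, "MultiClass"))  # Plain
print(default_boosting_type("CPU", 10_000, "Logloss"))     # Plain
```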
**Supported processing units**
CPU and GPU
Only the Plain mode is supported for the MultiClass loss on GPU
## boost\_from\_average
Command-line: `--boost-from-average`
#### Description
Initialize approximate values by best constant value for the specified loss function. Sets the value of bias to the initial best constant value.
Available for the following loss functions:
- RMSE
- Logloss
- CrossEntropy
- Quantile
- MAE
- MAPE
**Type**
bool
**Default value**
Depends on the selected loss function:
- True for RMSE, Quantile, MAE, MAPE
- False for all other loss functions
**Supported processing units**
CPU and GPU
## langevin
Command-line: `--langevin`
#### Description
Enables the Stochastic Gradient Langevin Boosting mode.
Refer to the [SGLB: Stochastic Gradient Langevin Boosting](https://arxiv.org/abs/2001.07248) paper for details.
**Type**
bool
**Default value**
False
**Supported processing units**
CPU
## diffusion\_temperature
Command-line: `--diffusion-temperature`
#### Description
The diffusion temperature of the Stochastic Gradient Langevin Boosting mode.
Only non-negative values are supported.
**Type**
float
**Default value**
10000
**Supported processing units**
CPU
## posterior\_sampling
Command-line: `--posterior-sampling`
#### Description
If this parameter is set, the options listed below are specified automatically, and the model parameters are checked so that the uncertainty predictions have good theoretical properties.
Specified options:
- `Langevin`: true,
- `DiffusionTemperature`: objects in learn pool count,
- `ModelShrinkRate`: 1 / (2. \* objects in learn pool count).
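For a concrete sense of scale, the derived values can be computed from the number of objects in the learn pool. The function below is a hypothetical illustration that simply evaluates the expressions listed above.

```python
def posterior_sampling_options(learn_pool_size):
    """Return the options implied by posterior_sampling for a given pool size."""
    return {
        "Langevin": True,
        "DiffusionTemperature": learn_pool_size,
        "ModelShrinkRate": 1 / (2.0 * learn_pool_size),
    }


# For 1000 objects: DiffusionTemperature = 1000, ModelShrinkRate = 1 / 2000.
print(posterior_sampling_options(1000))
```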
**Type**
bool
**Default value**
False
**Supported processing units**
CPU only
## allow\_const\_label
Command-line: `--allow-const-label`
#### Description
Use it to train models with datasets that have equal label values for all objects.
**Type**
bool
**Default value**
False
**Supported processing units**
CPU and GPU
## score\_function
Command-line: `--score-function`
#### Description
The [score type](https://catboost.ai/docs/en/concepts/algorithm-score-functions) used to select the next split during the tree construction.
Possible values:
- Cosine (do not use this score type with the Lossguide tree growing policy)
- L2
- NewtonCosine (do not use this score type with the Lossguide tree growing policy)
- NewtonL2
**Type**
string
**Default value**
Cosine
**Supported processing units**
The supported score functions vary depending on the processing unit type:
- GPU: All score types
- CPU: Cosine, L2
## monotone\_constraints
Command-line: `--monotone-constraints`
#### Description
Impose monotonic constraints on numerical features.
Possible values:
- "1": Increasing constraint on the feature. The algorithm forces the model to be a non-decreasing function of this feature.
- "\-1": Decreasing constraint on the feature. The algorithm forces the model to be a non-increasing function of this feature.
- "0": Constraints are disabled.
Supported formats for setting the value of this parameter (all feature indices are zero-based):
- Set constraints individually for each feature as a string (the number of features is n).
Format
```
"(<constraint_0>, <constraint_1>, .., <constraint_n-1>)"
```
Zero constraints for features at the end of the list may be dropped.
In `monotone_constraints = "(1,0,-1)"`, an increasing constraint is set on the first feature and a decreasing one on the third. Constraints are disabled for all other features.
- Set constraints individually for each explicitly specified feature as a string (the number of features is n).
```
"<feature index or name>:<constraint>, .., <feature index or name>:<constraint>"
```
These examples
```
monotone-constraints = "2:1,4:-1"
```
```
monotone-constraints = "Feature2:1,Feature4:-1"
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
- Set constraints individually for each required feature as an array or a dictionary (the number of features is n).
Format
```
[<constraint_0>, <constraint_1>, .., <constraint_n-1>]
```
```
{"<feature index or name>":<constraint>, .., "<feature index or name>":<constraint>}
```
Array examples
```
monotone_constraints = [1, 0, -1]
```
These dictionary examples
```
monotone_constraints = {"Feature2":1,"Feature4":-1}
```
```
monotone_constraints = {"2":1, "4":-1}
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
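To see how the formats above relate to each other, the following sketch converts any of them into a single mapping from feature index (or name) to constraint. `normalize_constraints` is a hypothetical helper for illustration; it is not how CatBoost parses the parameter, and it assumes feature names contain no `:` or `,` characters.

```python
def normalize_constraints(value):
    """Convert a constraints value into {feature index or name: constraint}.

    Zero (disabled) constraints are omitted from the result.
    """
    if isinstance(value, dict):
        pairs = value.items()
    elif isinstance(value, (list, tuple)):
        pairs = enumerate(value)
    elif value.strip().startswith("("):
        # "(1,0,-1)": positional constraints, one per feature index.
        pairs = enumerate(value.strip().strip("()").split(","))
    else:
        # "2:1,4:-1" or "Feature2:1,Feature4:-1": explicit features.
        pairs = (item.split(":") for item in value.split(","))

    result = {}
    for feature, constraint in pairs:
        key = str(feature).strip()
        key = int(key) if key.isdigit() else key
        if int(constraint) != 0:
            result[key] = int(constraint)
    return result


print(normalize_constraints("(1,0,-1)"))        # {0: 1, 2: -1}
print(normalize_constraints([1, 0, -1]))        # {0: 1, 2: -1}
print(normalize_constraints("2:1,4:-1"))        # {2: 1, 4: -1}
print(normalize_constraints({"2": 1, "4": -1}))  # {2: 1, 4: -1}
```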
**Type**
- list of strings
- string
- dict
- list
**Default value**
Python package, R package
None
Command-line
Omitted
**Supported processing units**
CPU
## feature\_weights
Command-line: `--feature-weights`
#### Description
Per-feature multiplication weights used when choosing the best split. The score of each candidate is multiplied by the weights of features from the current split.
Non-negative float values are supported for each weight.
Supported formats for setting the value of this parameter:
- Set the multiplication weight for each feature as a string (the number of features is n).
Format
```
"(<feature-weight_0>,<feature-weight_1>,..,<feature-weight_n-1>)"
```
Note
Spaces between values are not allowed.
Values should be passed as a parenthesized string of comma-separated values. Multiplication weights equal to 1 at the end of the list may be dropped.
In this example
```
feature_weights = "(0.1,1,3)"
```
the multiplication weight is set to 0.1, 1 and 3 for the first, second and third features respectively. The multiplication weight for all other features is set to 1.
- Set the multiplication weight individually for each explicitly specified feature as a string (the number of features is n).
Format
```
"<feature index or name>:<weight>, .., <feature index or name>:<weight>"
```
Note
Spaces between values are not allowed.
These examples
```
feature_weights = "2:1.1,4:0.1"
```
```
feature_weights = "Feature2:1.1,Feature4:0.1"
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
- Set the multiplication weight individually for each required feature as an array or a dictionary (the number of features is n).
Format
```
[<feature-weight_0>, <feature-weight_1>, .., <feature-weight_n-1>]
```
```
{"<feature index or name>":<weight>, .., "<feature index or name>":<weight>}
```
Array examples
```
feature_weights = [0.1, 1, 3]
```
These dictionary examples
```
feature_weights = {"Feature2":1.1,"Feature4":0.3}
```
```
feature_weights = {"2":1.1, "4":0.3}
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
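The effect of the weights on split selection can be sketched as follows, assuming the array form and a candidate split that lists the indices of the features it uses. Both helpers are hypothetical; the real score computation is internal to CatBoost.

```python
def expand_weights(weights, n_features):
    """Pad a shortened weight list with the default weight 1."""
    return list(weights) + [1.0] * (n_features - len(weights))


def weighted_score(score, split_features, weights):
    """Multiply a candidate's score by the weight of each feature in the split."""
    for feature in split_features:
        score *= weights[feature]
    return score


weights = expand_weights([0.1, 1, 3], n_features=5)
print(weights)                                                   # [0.1, 1, 3, 1.0, 1.0]
print(weighted_score(2.0, split_features=[2], weights=weights))  # 6.0
```

A weight below 1 makes splits on that feature less attractive, and a weight above 1 makes them more attractive.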
**Type**
- list
- numpy.ndarray
- string
- dict
**Default value**
1 for all features
**Supported processing units**
CPU
## first\_feature\_use\_penalties
Command-line: `--first-feature-use-penalties`
#### Description
Per-feature penalties for the first occurrence of the feature in the model. The given value is subtracted from the score if the current candidate is the first one to include the feature in the model.
Refer to the [Per-object and per-feature penalties](https://catboost.ai/docs/en/concepts/algorithm-score-functions) section for details on applying different score penalties.
Non-negative float values are supported for each penalty.
- Set the penalty for each feature as a string (the number of features is n).
Format
```
"(<feature-penalty_0>, <feature-penalty_1>, .., <feature-penalty_n-1>)"
```
Note
Spaces between values are not allowed.
Values should be passed as a parenthesized string of comma-separated values. Penalties equal to 0 at the end of the list may be dropped.
In this example
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = "(0.1,1,3)"
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = "(0.1,1,3)"
```
the penalty is set to 0.1, 1 and 3 for the first, second and third features respectively. The penalty for all other features is set to 0.
- Set the penalty individually for each explicitly specified feature as a string (the number of features is n).
Format
```
"<feature index or name>:<penalty>,..,<feature index or name>:<penalty>"
```
Note
Spaces between values are not allowed.
These examples `first_feature_use_penalties` parameter:
```
first_feature_use_penalties = "2:1.1,4:0.1"
```
```
first_feature_use_penalties = "Feature2:1.1,Feature4:0.1"
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = "2:1.1,4:0.1"
```
```
per_object_feature_penalties = "Feature2:1.1,Feature4:0.1"
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
- Set the penalty individually for each required feature as an array or a dictionary (the number of features is n).
Format
```
[<feature-penalty_0>, <feature-penalty_1>, .., <feature-penalty_n-1>]
```
```
{"<feature index or name>":<penalty>, .., "<feature index or name>":<penalty>}
```
Array examples.
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = [0.1, 1, 3]
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = [0.1, 1, 3]
```
These dictionary examples
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = {"Feature2":1.1,"Feature4":0.1}
```
```
first_feature_use_penalties = {"2":1.1, "4":0.1}
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = {"Feature2":1.1,"Feature4":0.1}
```
```
per_object_feature_penalties = {"2":1.1, "4":0.1}
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
**Type**
- list
- numpy.ndarray
- string
- dict
**Default value**
0 for all features
**Supported processing units**
CPU
## fixed\_binary\_splits
Command-line: `--fixed-binary-splits`
#### Description
A list of indices of binary features to put at the top of each tree; ignored if `grow_policy` is `Symmetric`.
**Type**
list
**Default value**
None
**Supported processing units**
GPU
## penalties\_coefficient
Command-line: `--penalties-coefficient`
#### Description
A single-value common coefficient to multiply all penalties.
Non-negative values are supported.
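A simplified sketch of how the common coefficient could combine with a first-use penalty is shown below. It assumes the penalty is scaled by the coefficient and then subtracted from the candidate's score only when the feature has not yet been used in the model; the exact formula is given in the score functions documentation, and `penalized_score` is a hypothetical helper.

```python
def penalized_score(score, feature, used_features, penalties,
                    penalties_coefficient=1.0):
    """Subtract the scaled first-use penalty when `feature` is new to the model."""
    if feature in used_features:
        return score  # the feature is already in the model: no penalty
    return score - penalties_coefficient * penalties.get(feature, 0.0)


penalties = {2: 1.5}
print(penalized_score(5.0, 2, set(), penalties, penalties_coefficient=2.0))  # 2.0
print(penalized_score(5.0, 2, {2}, penalties, penalties_coefficient=2.0))    # 5.0
```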
**Type**
float
**Default value**
1
**Supported processing units**
CPU
## per\_object\_feature\_penalties
Command-line: `--per-object-feature-penalties`
#### Description
Per-object penalties for the first use of the feature for the object. The given value is multiplied by the number of objects that are divided by the current split and use the feature for the first time.
Refer to the [Per-object and per-feature penalties](https://catboost.ai/docs/en/concepts/algorithm-score-functions) section for details on applying different score penalties.
Non-negative float values are supported for each penalty.
Python package
- Set the penalty for each feature as a string (the number of features is n).
Format
```
"(<feature-penalty_0>, <feature-penalty_1>, .., <feature-penalty_n-1>)"
```
Note
Spaces between values are not allowed.
Values should be passed as a parenthesized string of comma-separated values. Penalties equal to 0 at the end of the list may be dropped.
In this example
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = "(0.1,1,3)"
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = "(0.1,1,3)"
```
the penalty is set to 0.1, 1 and 3 for the first, second and third features respectively. The penalty for all other features is set to 0.
- Set the penalty individually for each explicitly specified feature as a string (the number of features is n).
Format
```
"<feature index or name>:<penalty>,..,<feature index or name>:<penalty>"
```
Note
Spaces between values are not allowed.
These examples `first_feature_use_penalties` parameter:
```
first_feature_use_penalties = "2:1.1,4:0.1"
```
```
first_feature_use_penalties = "Feature2:1.1,Feature4:0.1"
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = "2:1.1,4:0.1"
```
```
per_object_feature_penalties = "Feature2:1.1,Feature4:0.1"
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
- Set the penalty individually for each required feature as an array or a dictionary (the number of features is n).
Format
```
[<feature-penalty_0>, <feature-penalty_1>, .., <feature-penalty_n-1>]
```
```
{"<feature index or name>":<penalty>, .., "<feature index or name>":<penalty>}
```
Array examples.
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = [0.1, 1, 3]
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = [0.1, 1, 3]
```
These dictionary examples
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = {"Feature2":1.1,"Feature4":0.1}
```
```
first_feature_use_penalties = {"2":1.1, "4":0.1}
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = {"Feature2":1.1,"Feature4":0.1}
```
```
per_object_feature_penalties = {"2":1.1, "4":0.1}
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
R package
- Set the penalty for each feature as a string (the number of features is n).
Format
```
"(<feature-penalty_0>, <feature-penalty_1>, .., <feature-penalty_n-1>)"
```
Note
Spaces between values are not allowed.
Values should be passed as a parenthesized string of comma-separated values. Penalties equal to 0 at the end of the list may be dropped.
In this example
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = "(0.1,1,3)"
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = "(0.1,1,3)"
```
the penalty is set to 0.1, 1 and 3 for the first, second and third features respectively. The penalty for all other features is set to 0.
- Set the penalty individually for each explicitly specified feature as a string (the number of features is n).
Format
```
"<feature index or name>:<penalty>,..,<feature index or name>:<penalty>"
```
Note
Spaces between values are not allowed.
These examples
`first_feature_use_penalties` parameter:
```
first_feature_use_penalties = "2:1.1,4:0.1"
```
```
first_feature_use_penalties = "Feature2:1.1,Feature4:0.1"
```
`per_object_feature_penalties` parameter:
```
per_object_feature_penalties = "2:1.1,4:0.1"
```
```
per_object_feature_penalties = "Feature2:1.1,Feature4:0.1"
```
are identical, given that the name of the feature indexed 2 is "Feature2" and the name of the feature indexed 4 is "Feature4".
**Type**
- list
- numpy.ndarray
- string
- dict
**Default value**
0 for all objects
**Supported processing units**
CPU
## model\_shrink\_rate
Command-line: `--model-shrink-rate`
#### Description
The constant used to calculate the coefficient for multiplying the model on each iteration.
The actual model shrinkage coefficient calculated at each iteration depends on the value of the `--model-shrink-mode` parameter (Command-line version). The resulting value of the coefficient should always be in the range (0, 1\].
**Type**
float
**Default value**
The default value depends on the values of the following parameters:
- `--model-shrink-mode` for the Command-line version
- `--monotone-constraints` for the Command-line version
**Supported processing units**
CPU
## model\_shrink\_mode
Command-line: `--model-shrink-mode`
#### Description
Determines how the actual model shrinkage coefficient is calculated at each iteration.
Possible values:
- Constant:
$1 - model\_shrink\_rate \cdot learning\_rate$, where
  - $model\_shrink\_rate$ is the value of the `--model-shrink-rate` parameter (Command-line version).
  - $learning\_rate$ is the value of the `--learning-rate` parameter (Command-line version).
- Decreasing:
$1 - \frac{model\_shrink\_rate}{i}$, where
  - $model\_shrink\_rate$ is the value of the `--model-shrink-rate` parameter (Command-line version).
  - $i$ is the identifier of the iteration.
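Both modes can be evaluated with a few lines of Python. The sketch below assumes that iteration identifiers start at 1 (an assumption made here to avoid division by zero) and enforces the (0, 1] range stated for the `model_shrink_rate` parameter; `shrink_coefficient` is a hypothetical helper.

```python
def shrink_coefficient(mode, model_shrink_rate, learning_rate=None, iteration=None):
    """Compute the per-iteration model shrinkage coefficient."""
    if mode == "Constant":
        coefficient = 1 - model_shrink_rate * learning_rate
    elif mode == "Decreasing":
        # Assumption: `iteration` starts at 1.
        coefficient = 1 - model_shrink_rate / iteration
    else:
        raise ValueError(f"unsupported mode: {mode}")
    if not 0 < coefficient <= 1:
        raise ValueError("the coefficient must lie in the range (0, 1]")
    return coefficient


print(shrink_coefficient("Constant", 0.5, learning_rate=0.5))  # 0.75
print(shrink_coefficient("Decreasing", 0.5, iteration=2))      # 0.75
```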
**Type**
string
**Default value**
Constant
**Supported processing units**
CPU |
| Shard | 169 (laksa) |
| Root Hash | 17435841955170310369 |
| Unparsed URL | ai,catboost!/docs/en/references/training-parameters/common s443 |