skmultilearn.base package

The skmultilearn.base module implements base classifier classes for scikit-multilearn’s multi-label classification.

scikit-multilearn currently uses two base classifier classes:

class skmultilearn.base.MLClassifierBase

Bases: BaseEstimator, ClassifierMixin

Base class providing API and common functions for all multi-label classifiers.

Implements base functionality for ML classifiers, especially the get/set params for scikit-learn compatibility.

Attributes

copyable_attrs : List[str]

list of attribute names that should be copied when class is cloned

fit(X, y)

Abstract method to fit classifier with training data

It must return a fitted instance of self.

Parameters

X : numpy.ndarray or scipy.sparse

input features, can be a dense or sparse matrix of size (n_samples, n_features)

y : numpy.ndarray or scipy.sparse {0,1}

binary indicator matrix with label assignments.

Returns

object

fitted instance of self

Raises

NotImplementedError

this is just an abstract method

get_params(deep=True)

Get parameters to sub-objects

Introspection of classifier for search models like cross-validation and grid search.

Parameters

deep : bool

if True, the parameters of sub-objects are introspected as well and added to the output dictionary.

Returns

out : dict

dictionary of all parameters and their values. If deep=True, the dictionary also holds the parameters of any parameter that is itself an estimator.
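The deep introspection described above can be sketched in plain Python. ToyEstimator is a hypothetical stand-in, not the library class, and the "name__subname" key format is an assumption borrowed from the scikit-learn convention; the sketch only illustrates how copyable_attrs could drive the output dictionary.

```python
class ToyEstimator:
    """Hypothetical stand-in for illustration; not the library's implementation."""

    def __init__(self, classifier=None, threshold=0.5):
        self.classifier = classifier
        self.threshold = threshold
        # Only the attribute names listed here are reported by get_params.
        self.copyable_attrs = ["classifier", "threshold"]

    def get_params(self, deep=True):
        out = {}
        for name in self.copyable_attrs:
            value = getattr(self, name)
            out[name] = value
            # With deep=True, also report the parameters of nested estimators,
            # prefixed with the attribute name and a double underscore.
            if deep and hasattr(value, "get_params"):
                for sub_name, sub_value in value.get_params(deep=True).items():
                    out[name + "__" + sub_name] = sub_value
        return out


inner = ToyEstimator(threshold=0.1)
outer = ToyEstimator(classifier=inner, threshold=0.9)

shallow = outer.get_params(deep=False)  # top-level parameters only
deep = outer.get_params(deep=True)      # adds the classifier__* entries
```

Keys of the form classifier__threshold are what grid search tools use to address a nested parameter.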

predict(X)

Abstract method to predict labels

Parameters

X : numpy.ndarray or scipy.sparse.csc_matrix

input features of shape (n_samples, n_features)

Returns

scipy.sparse of int

binary indicator matrix with label assignments with shape (n_samples, n_labels)

Raises

NotImplementedError

this is just an abstract method
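As a rough illustration of the fit/predict contract, the sketch below uses plain Python lists in place of numpy or scipy.sparse matrices. SketchBase and MostFrequentLabels are hypothetical names; SketchBase only mimics the abstract methods and is not MLClassifierBase itself.

```python
class SketchBase:
    """Hypothetical stand-in mimicking the abstract fit/predict methods."""

    def fit(self, X, y):
        raise NotImplementedError("fit must be implemented by subclasses")

    def predict(self, X):
        raise NotImplementedError("predict must be implemented by subclasses")


class MostFrequentLabels(SketchBase):
    """Toy classifier that always predicts the k most frequent training labels."""

    def __init__(self, k=1):
        self.k = k

    def fit(self, X, y):
        # y is a binary indicator matrix: y[i][j] == 1 if sample i has label j.
        n_labels = len(y[0])
        counts = [sum(row[j] for row in y) for j in range(n_labels)]
        # Rank labels by descending frequency, breaking ties by label index.
        ranked = sorted(range(n_labels), key=lambda j: (-counts[j], j))
        self.n_labels_ = n_labels
        self.top_labels_ = set(ranked[: self.k])
        return self  # the contract requires returning the fitted instance

    def predict(self, X):
        # One indicator row per input sample, shape (n_samples, n_labels).
        row = [1 if j in self.top_labels_ else 0 for j in range(self.n_labels_)]
        return [list(row) for _ in X]


X_train = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
y_train = [[1, 0, 1], [1, 0, 0], [1, 1, 1]]

clf = MostFrequentLabels(k=2).fit(X_train, y_train)
predictions = clf.predict([[0.0, 0.0], [1.0, 1.0]])
```

Because fit returns the instance, construction and fitting can be chained as in the last lines.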

set_params(**parameters)

Propagate parameters to sub-objects

Set parameters as returned by get_params.
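One way to picture the propagation to sub-objects is the sketch below. It assumes the scikit-learn "name__subname" key convention; ToyEstimator is a hypothetical class and this is not the library's implementation.

```python
class ToyEstimator:
    """Hypothetical stand-in for illustration only."""

    def __init__(self, classifier=None, threshold=0.5):
        self.classifier = classifier
        self.threshold = threshold

    def set_params(self, **parameters):
        nested = {}
        for key, value in parameters.items():
            name, sep, sub_key = key.partition("__")
            if sep:
                # "classifier__threshold" is deferred to the sub-object.
                nested.setdefault(name, {})[sub_key] = value
            else:
                setattr(self, name, value)
        # Nested keys are applied last, so they reach a sub-object that was
        # replaced in the same call.
        for name, sub_params in nested.items():
            getattr(self, name).set_params(**sub_params)
        return self


model = ToyEstimator(classifier=ToyEstimator(threshold=0.1), threshold=0.9)
model.set_params(threshold=0.7, classifier__threshold=0.3)
```

After the call, the outer threshold is 0.7 and the nested classifier's threshold is 0.3.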

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> MLClassifierBase

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). See the scikit-learn User Guide for how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns

self : object

The updated object.

class skmultilearn.base.ProblemTransformationBase(classifier=None, require_dense=None)

Bases: MLClassifierBase

Base class providing common functions for multi-label classifiers that follow the problem transformation approach.

Problem transformation is the approach in which the original multi-label classification problem is transformed into one or more single-label problems, which are then solved by single-class or multi-class classifiers.

Scikit-multilearn provides a number of such methods:

  • BinaryRelevance - trains one independent binary classifier per label and combines their outputs into the label assignment.

  • ClassificationHeterogeneousFeature - performs augmentation of the feature set with extra features derived from label probabilities, iteratively resolving cyclic dependencies between features and labels.

  • ClassifierChain - trains one binary classifier per label in a chain, where each classifier also receives the labels earlier in the chain as additional input features.

  • LabelPowerset - transforms the problem into a single multi-class problem in which every distinct label combination found in the training data is one class.

  • InstanceBasedLogisticRegression - performs a combination of instance-based learning and logistic regression, using a K-Nearest Neighbor layer followed by Logistic Regression classifiers.

  • StructuredGridSearchCV - performs hyperparameter tuning for each label classifier, considering structural properties and optimizing classifiers for each label.
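The two simplest transformations in the list can be sketched on a toy label matrix. The function names below are hypothetical and only show how the targets are reshaped; the library classes additionally train and apply the base classifier.

```python
def binary_relevance_targets(Y):
    """Split an indicator matrix into one binary target vector per label."""
    n_labels = len(Y[0])
    return [[row[j] for row in Y] for j in range(n_labels)]


def label_powerset_targets(Y):
    """Map each distinct label combination to a single multi-class id."""
    class_ids = {}
    targets = []
    for row in Y:
        key = tuple(row)
        if key not in class_ids:
            class_ids[key] = len(class_ids)  # next unused class id
        targets.append(class_ids[key])
    return targets, class_ids


# Four samples, three labels.
Y = [[1, 0, 1], [0, 1, 0], [1, 0, 1], [1, 1, 0]]

per_label = binary_relevance_targets(Y)          # three binary problems
multi_class, mapping = label_powerset_targets(Y)  # one multi-class problem
```

Here the binary relevance view yields three target vectors of length four, while the label powerset view yields one target vector with three distinct classes.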

Parameters

classifier : scikit classifier type

The base classifier to use; it is stored as self.classifier for later access.

require_dense : boolean (default is False)

Whether the base classifier requires input as dense arrays.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> ProblemTransformationBase

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). See the scikit-learn User Guide for how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns

self : object

The updated object.


Cite us

If you use scikit-multilearn-ng in your research and publish it, please consider citing scikit-multilearn:

@ARTICLE{2017arXiv170201460S,
    author = {{Szyma{\'n}ski}, P. and {Kajdanowicz}, T.},
    title = "{A scikit-based Python environment for performing multi-label classification}",
    journal = {ArXiv e-prints},
    archivePrefix = "arXiv",
    eprint = {1702.01460},
    primaryClass = "cs.LG",
    keywords = {Computer Science - Learning, Computer Science - Mathematical Software},
    year = 2017,
    month = feb,
}