skmultilearn.missing package¶

The skmultilearn.missing module provides classifiers and methods for dealing with missing labels in multi-label classification problems.

Currently the following classification scheme for data with missing labels is available in scikit-multilearn:

Classifier	Description
`SMiLE`	Semi-supervised multi-label classification using incomplete label information.

class skmultilearn.missing.SMiLE(s=0.5, alpha=0.35, k=5)¶

Bases: object

SMiLE algorithm for multi-label classification with missing labels (Semi-supervised multi-label classification using incomplete label information).

Parameters¶

s (float, optional, default 0.5): Smoothness parameter for class imbalance.
alpha (float, optional, default 0.35): Smoothness assumption parameter, which ensures that similar instances have similar predicted outputs. It balances the importance of the two terms of the equation to optimize.
k (int, optional, default 5): Neighbours parameter: the number of nearest neighbours (the k in kNN) used when building the weighted instance matrix W.

Attributes¶

L (array, [n_labels, n_labels]): Correlation matrix between labels.
W (array, [n_samples, n_samples]): Weighted matrix created by kNN for instances.
estimate_matrix (array-like, (n_samples, n_labels)): Label estimation matrix, with $\tilde{y}_{ic} = y_i^T L(\cdot, c)$ if $y_{ic} = 0$ and $\tilde{y}_{ic} = 1$ otherwise.
H (array-like, (n_samples, n_samples)): Diagonal matrix indicating whether each element of X is labeled or not.
diagonal_lambda (array-like, (n_samples, n_samples)): Diagonal matrix holding the sums of the weights of the weighted matrix W.
M (array-like, (n_samples, n_samples)): Graph Laplacian matrix.
Hc (array-like, (n_samples, n_samples)): $H_c = H - \frac{H \mathbf{1} \mathbf{1}^T H^T}{N}$.
P (array-like, (n_features, n_labels)): $P = (X H_c X^T + \alpha X M X^T)^{-1} X H_c Y_{Pred}$, with $P \in \mathbb{R}^{d \times c}$.
b (array-like, (n_labels,)): Label bias, the second term of the equation: $b = \frac{(\text{estimate\_matrix} - P^T X) H \mathbf{1}}{N}$.
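The attribute formulas above can be traced in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the library's implementation: the function name `smile_closed_form_sketch`, the conditional co-occurrence form of L, the unweighted symmetric kNN graph for W, the choice of N as the number of labeled instances, and the use of a pseudo-inverse instead of a plain inverse are all assumptions made for illustration. The parameter `s` is left out because its exact role is not spelled out here. Following the formula for P, X is transposed to shape (n_features, n_samples) before the matrix products.

```python
import numpy as np


def smile_closed_form_sketch(X, Y, labeled, alpha=0.35, k=2):
    """Illustrative only. X: (n_samples, n_features); Y: (n_samples, n_labels)
    with 0 for absent or missing labels; labeled: 0/1 flag per instance."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[0]

    # L: label correlation (assumed form: co-occurrence count / label count).
    co = Y.T @ Y
    counts = np.maximum(np.diag(co), 1.0)
    L = co / counts[:, None]
    np.fill_diagonal(L, 0.0)

    # estimate_matrix: y~_ic = y_i^T L(., c) where y_ic == 0, and 1 otherwise.
    Y_est = np.where(Y == 0, Y @ L, 1.0)

    # W: symmetric, unweighted kNN graph over instances (assumed weighting).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nn = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nn.ravel()] = 1.0
    W = np.maximum(W, W.T)

    # diagonal_lambda and the graph Laplacian M = Lambda - W.
    Lam = np.diag(W.sum(axis=1))
    M = Lam - W

    # H marks labeled instances; Hc = H - (H 1 1^T H^T) / N.
    h = np.asarray(labeled, dtype=float)
    H = np.diag(h)
    N = h.sum()  # assumption: N is the number of labeled instances
    ones = np.ones((n, 1))
    Hc = H - (H @ ones @ ones.T @ H.T) / N

    # P = (X Hc X^T + alpha X M X^T)^-1 X Hc Y~, with X as (d, n).
    Xd = X.T
    A = Xd @ Hc @ Xd.T + alpha * (Xd @ M @ Xd.T)
    P = np.linalg.pinv(A) @ Xd @ Hc @ Y_est  # pinv guards against singular A

    # b = ((Y~^T - P^T X) H 1) / N
    b = ((Y_est.T - P.T @ Xd) @ H @ ones).ravel() / N
    return P, b


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 3))
Y_demo = rng.integers(0, 2, size=(6, 2))
Y_demo[4:] = 0                      # last two instances are unlabeled
labeled_demo = [1, 1, 1, 1, 0, 0]

P_demo, b_demo = smile_closed_form_sketch(X_demo, Y_demo, labeled_demo)
scores_demo = P_demo.T @ X_demo.T + b_demo[:, None]  # (n_labels, n_samples)
```

In this sketch the scores come out with shape (n_labels, n_samples), which matches the shape documented for the predictions returned by `predict`.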

References¶

If used, please cite the scikit-multilearn library and the relevant paper:

@article{TAN2017192,
  title = {Semi-supervised multi-label classification using incomplete label information},
  author = {Qiaoyu Tan and Yanming Yu and Guoxian Yu and Jun Wang},
  journal = {Neurocomputing},
  volume = {260},
  pages = {192-202},
  year = {2017},
  issn = {0925-2312},
  doi = {10.1016/j.neucom.2017.04.033},
  url = {https://www.sciencedirect.com/science/article/pii/S092523121730704X},
}

Examples¶

An example use case for the SMiLE algorithm:

from skmultilearn.missing import SMiLE

# initialize SMiLE algorithm with parameters
classifier = SMiLE(s=0.6, alpha=0.4, k=8)

# train; X and y are assumed to be the training feature and label matrices described under fit(X, y)
classifier.fit(X, y)

# predict
prediction = classifier.predict(X)

fit(X, y)¶

Fits the model to training data

Parameters¶

X (array-like or sparse matrix, shape=(n_samples, n_features)): Training instances.
y (array-like, shape=(n_samples, n_labels)): Training labels.
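To experiment with `fit`, it can help to simulate incomplete label information from a fully labeled matrix. The sketch below is one way to do that with NumPy. The helper name `hide_positive_labels` and the `missing_rate` parameter are hypothetical, and encoding a missing label as 0 is an assumption suggested by the estimate_matrix definition, which fills in entries where the label equals 0.

```python
import numpy as np


def hide_positive_labels(y, missing_rate=0.3, seed=0):
    """Return a copy of y with a fraction of its positive entries set to 0.

    Hypothetical helper for illustration; it is not part of scikit-multilearn.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y).copy()
    rows, cols = np.nonzero(y)                      # positions of known positives
    n_hide = int(round(missing_rate * rows.size))   # how many positives to drop
    picked = rng.choice(rows.size, size=n_hide, replace=False)
    y[rows[picked], cols[picked]] = 0               # mark them as missing
    return y


y_full = np.array([[1, 0, 1],
                   [0, 1, 1],
                   [1, 1, 0],
                   [1, 0, 0]])
y_observed = hide_positive_labels(y_full, missing_rate=0.3)
```

The resulting `y_observed` has the same shape as `y_full` and could then be passed as the `y` argument of `fit`, keeping `y_full` aside for evaluation.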

getParams()¶

Returns the parameters of this model

Returns¶

s (float): Smoothness parameter for class imbalance.
alpha (float): Smoothness assumption parameter, which ensures that similar instances have similar predicted outputs. It balances the importance of the two terms of the equation to optimize.
k (int): Neighbours parameter: the number of nearest neighbours (the k in kNN) used when building the weighted instance matrix W.

predict(X)¶

Predicts using the model

Parameters¶

X (array-like or sparse matrix, shape=(n_samples, n_features)): Test instances.

Returns¶

predictions (array-like, shape=(n_labels, n_samples)): Label predictions for the test instances, given as continuous scores in the range [0, 1], as if this were a regression problem.
predictionsNormalized (array-like, shape=(n_labels, n_samples)): Normalized label predictions.
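Because the documented output shape is (n_labels, n_samples) rather than the more common (n_samples, n_labels), downstream code usually needs a transpose, and continuous scores need a cut-off before they can be compared with binary ground truth. The sketch below shows both steps on a hand-written score matrix. The helper `binarize_scores` and the 0.5 threshold are assumptions for illustration; how the library derives predictionsNormalized is not stated here.

```python
import numpy as np

# Hand-written scores shaped (n_labels=2, n_samples=3), as documented for predict.
scores = np.array([[0.9, 0.2, 0.55],
                   [0.1, 0.7, 0.40]])


def binarize_scores(scores, threshold=0.5):
    """Map continuous scores in [0, 1] to 0/1 labels (assumed cut-off)."""
    return (np.asarray(scores) >= threshold).astype(int)


binary = binarize_scores(scores)   # still (n_labels, n_samples)
per_sample = binary.T              # (n_samples, n_labels), one row per instance
```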

setParams(s, alpha, k)¶

Sets the parameters of this model

Parameters¶

s (float, optional, default 0.5): Smoothness parameter for class imbalance.
alpha (float, optional, default 0.35): Smoothness assumption parameter, which ensures that similar instances have similar predicted outputs. It balances the importance of the two terms of the equation to optimize.
k (int, optional, default 5): Neighbours parameter: the number of nearest neighbours (the k in kNN) used when building the weighted instance matrix W.

Cite us

If you use scikit-multilearn-ng in your research and publish it, please consider citing scikit-multilearn:

@ARTICLE{2017arXiv170201460S,
    author = {{Szyma{\'n}ski}, P. and {Kajdanowicz}, T.},
    title = "{A scikit-based Python environment for performing multi-label classification}",
    journal = {ArXiv e-prints},
    archivePrefix = "arXiv",
    eprint = {1702.01460},
    primaryClass = "cs.LG",
    keywords = {Computer Science - Learning, Computer Science - Mathematical Software},
    year = 2017,
    month = feb,
}
