MTS

`mts`

The `mts` module provides several methods of the MT (Mahalanobis-Taguchi) system.

`MSR(*, delta=0.0001, esp=1e-16)`

Bases: RegressorMixin, BaseEstimator

MSR: Multiple Single Regression.

Parameters:

Name	Type	Description	Default
`delta`	`float`	Threshold for stopping repeated computations.	`1e-4`
`esp`	`float`	A constant to avoid zero division. It is used in the calculation as `1 / (x + esp)`.	`1e-16`

Attributes:

Name	Type	Description
`mean_X_`	`ndarray of shape (n_features, )`	Mean values of each feature of the training data.
`mean_y_`	`float`	Mean value of target values.
`coef_`	`ndarray of shape (n_features, )`	Estimated coefficients for the MSR.
`n_features_in_`	`int`	Number of features seen during fit.
`feature_names_in_`	`ndarray of shape (n_features_in_, )`	Names of features seen during the fit. Defined only if X has feature names that are all strings.

References

前田誠. (2017). T 法 (1) の考え方を利用した新しい回帰手法の提案 [Proposal of a new regression method using the idea of the T method (1)]. 品質 [Quality], 47(2), 185-194.

Methods:

Name	Description
`fit`	Fit the model.
`predict`	Predict using the fitted model.

Source code in src/mts/_msr.py

def __init__(self, *, delta: float = 1e-4, esp: float = 1e-16):
    """
    Initialize the instance.

    Parameters
    ----------
    delta : float, default=1e-4
        Threshold for stopping repeated computations.

    esp : float, default=1e-16
        A constant to avoid zero division. It is used in the calculation as
        `1 / (x + esp)`.

    Attributes
    ----------
    mean_X_ : ndarray of shape (n_features, )
        Mean values of each feature of the training data.

    mean_y_ : float
        Mean value of target values.

    coef_ : ndarray of shape (n_features, )
        Estimated coefficients for the MSR.

    n_features_in_ : int
        Number of features seen during fit.

    feature_names_in_ : ndarray of shape (n_features_in_, )
        Names of features seen during the fit. Defined only if X has feature
        names that are all strings.

    References
    ----------
    前田誠. (2017). T 法 (1) の考え方を利用した新しい回帰手法の提案
    [Proposal of a new regression method using the idea of the T method
    (1)]. 品質 [Quality], 47(2), 185-194.
    """
    self.delta = delta
    self.esp = esp

`fit(X, y)`

Fit the model.

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Training data.	required
`y`	`ndarray of shape (n_samples, )`	Target values. Will be cast to X's dtype if necessary.	required

Returns:

Name	Type	Description
`self`	`object`	Fitted model.

Source code in src/mts/_msr.py

def fit(self, X, y):
    """
    Fit the model.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Training data.

    y : ndarray of shape (n_samples, )
        Target values. Will be cast to X's dtype if necessary.

    Returns
    -------
    self : object
        Fitted model.
    """
    self._validate_params()  # type: ignore

    X, y = self._validate_data(  # type: ignore
        X=X,
        y=y,
        reset=True,
        y_numeric=True,
        ensure_min_samples=2,
        estimator=self,
    )

    n_samples, n_features = X.shape

    if n_samples <= 50:
        n_splits = n_samples
    else:
        n_splits = int(2250 / n_samples) + 5

    kf = KFold(n_splits=n_splits)

    self.coef_ = np.zeros(n_features)
    coef_kf = np.zeros((n_splits, n_features))

    self.mean_X_ = np.mean(X, axis=0)
    self.mean_y_ = np.mean(y)

    std_X = X - self.mean_X_[None, :]
    std_y = y - self.mean_y_

    zz_before = None
    skip_kf = []
    while True:
        y_ = np.dot(std_X, self.coef_)

        z = std_y - y_

        st, sb, n, b = self._compute_sn_ratio_and_sensitivity(std_X, z)

        if st == 0 or np.all(sb == 0):
            break

        self.coef_ += b * n / np.sum(n)

        z = np.empty(n_samples)
        for kf_idx, (train_idx, test_idx) in enumerate(kf.split(std_X)):
            train_X, train_y = std_X[train_idx], std_y[train_idx]
            test_X, test_y = std_X[test_idx], std_y[test_idx]

            if kf_idx in skip_kf:
                y_kf = np.dot(test_X, coef_kf[kf_idx])
                z[test_idx] = test_y - y_kf
                continue

            y_kf = np.dot(train_X, coef_kf[kf_idx])
            z_kf = train_y - y_kf

            st, sb, n, b = self._compute_sn_ratio_and_sensitivity(train_X, z_kf)

            if st == 0 or np.all(sb == 0):
                skip_kf.append(kf_idx)
                y_kf = np.dot(test_X, coef_kf[kf_idx])
                z[test_idx] = test_y - y_kf
                continue

            coef_kf[kf_idx] += b * n / np.sum(n)

            y_kf = np.dot(test_X, coef_kf[kf_idx])
            z[test_idx] = test_y - y_kf

        zz_after = np.dot(z, z)

        if zz_before is None:
            zz_before = zz_after * 2

        if (zz_before - zz_after) <= (self.delta * zz_before):
            break
        else:
            zz_before = zz_after

    return self
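
The two control rules in the `fit` loop above can be read in isolation. The sketch below restates them as standalone functions; the names `choose_n_splits` and `has_converged` are hypothetical helpers introduced here for illustration and are not part of the library.

```python
def choose_n_splits(n_samples: int) -> int:
    """Number of K-fold splits, following the rule in the listing above.

    Up to 50 samples this is leave-one-out; beyond that the number of
    folds shrinks as the sample count grows.
    """
    if n_samples <= 50:
        return n_samples
    return int(2250 / n_samples) + 5


def has_converged(zz_before: float, zz_after: float, delta: float = 1e-4) -> bool:
    """Stop when the relative drop in the squared CV residual is at most delta."""
    return (zz_before - zz_after) <= delta * zz_before


print(choose_n_splits(30))    # leave-one-out for small data
print(choose_n_splits(100))   # int(2250 / 100) + 5
print(has_converged(1.0, 0.5))
print(has_converged(1.0, 0.99995))
```

Because `zz_before` is initialised to twice `zz_after` on the first pass, the relative drop is 0.5 at that point, so with the default `delta` the loop always runs at least a second iteration unless the residual is already zero.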

`predict(X, y=None)`

Predict using the fitted model.

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Samples.	required
`y`	`None`	Ignored.	`None`

Returns:

Name	Type	Description
`y_pred`	`ndarray of shape (n_samples, )`	Predicted values.

Source code in src/mts/_msr.py

def predict(self, X, y=None):
    """
    Predict using the fitted model.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Samples.

    y : None
        Ignored.

    Returns
    -------
    y_pred : ndarray of shape (n_samples, )
        Predicted values.
    """
    check_is_fitted(self)

    X = self._validate_data(X=X, reset=False)  # type: ignore

    std_X = X - self.mean_X_[None, :]

    return np.dot(std_X, self.coef_) + self.mean_y_
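
As a small worked example of the prediction step above, the snippet below applies the same centering and dot product with hand-picked values. The numbers for `mean_X`, `mean_y` and `coef` are made up for illustration; in practice they would come from a fitted estimator.

```python
import numpy as np

# Hypothetical fitted values (stand-ins for mean_X_, mean_y_ and coef_).
mean_X = np.array([1.0, 2.0])
mean_y = 10.0
coef = np.array([0.5, -1.0])

X_new = np.array([[1.0, 2.0],   # equal to the training mean
                  [3.0, 1.0]])

# Center the features, apply the coefficients, then add back the target mean.
y_pred = (X_new - mean_X[None, :]) @ coef + mean_y
print(y_pred)
```

A sample equal to the training mean is predicted as `mean_y`, since its centered features are all zero.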

`MT(*, method='mt', ddof=1, esp=1e-16, kind='specify', a=0.05, threshold=4.0, return_sqrt=False)`

Bases: BaseEstimator

MT, MTA and Standardized-Variation-Pressure methods.

The MT, MTA and SVP methods are unsupervised learning methods used for pattern recognition in quality engineering. These methods learn the mean and standard deviation of each feature and the inverse correlation matrix of the training data, and compute MD values based on these values. The training data is called the unit space and usually contains only normal data. The MTA method learns an adjoint matrix instead of an inverse matrix to deal with multicollinearity. The SVP method does not require a correlation matrix.
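
The following is a minimal NumPy sketch of the plain MT computation described above: standardize with the unit-space mean and standard deviation, invert the correlation matrix, and evaluate the quadratic form per sample. The function names are hypothetical, and the private `_mahalanobis` helper is not shown in this section, so details such as whether the library additionally divides by the number of features are assumptions. The unnormalized form is used here because it matches thresholds such as `4 * k` and `chi2.isf(a, k)`.

```python
import numpy as np


def fit_unit_space(X, ddof=1, esp=1e-16):
    """Learn mean, scale and inverse correlation matrix from the unit space."""
    mean = X.mean(axis=0)
    scale = X.std(axis=0, ddof=ddof)
    Z = (X - mean) / (scale + esp)
    corr = np.corrcoef(Z, rowvar=False)
    precision = np.linalg.inv(corr)
    return mean, scale, precision


def md_values(X, mean, scale, precision, esp=1e-16):
    """Squared Mahalanobis distance of each row of X (not divided by k)."""
    Z = (X - mean) / (scale + esp)
    # Row-wise quadratic form z @ precision @ z.T
    return np.einsum("ij,jk,ik->i", Z, precision, Z)


rng = np.random.default_rng(0)
unit_space = rng.normal(size=(50, 3))          # "normal" training data
mean, scale, precision = fit_unit_space(unit_space)

md_train = md_values(unit_space, mean, scale, precision)
md_outlier = md_values(np.array([[8.0, -8.0, 8.0]]), mean, scale, precision)

print(md_train.mean())   # equals k * (n - 1) / n for the unit space
print(md_outlier)        # far larger than any unit-space value
```

With `ddof=1`, the unit-space MD values average exactly `k * (n - 1) / n`, which is a convenient sanity check for any implementation of this step.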

Parameters:

Name	Type	Description	Default
`method`	`(mt, mta, svp)`	Computation method.	`"mt"`
`ddof`	`int`	Delta degrees of freedom. The divisor used in the calculation is `N - ddof`, where `N` is the number of samples.	`1`
`esp`	`float`	A constant to avoid zero division. It is used in the calculation as `1 / (x + esp)`.	`1e-16`
`kind`	`(k, f, chi2, specify)`	The distribution used to determine normal and abnormal thresholds.	`"specify"`
`a`	`float`	Right side significance level. Used to set the threshold when `kind` is set to `f` or `chi2`.	`0.05`
`threshold`	`float`	Threshold to use when `kind` is set to `specify`.	`4.0`
`return_sqrt`	`bool`	Return the square root of the MD value or not.	`False`

Attributes:

Name	Type	Description
`mean_`	`ndarray of shape (n_features, )`	Means of each feature of the training data.
`scale_`	`ndarray of shape (n_features, )`	Standard deviation values of each feature of the training data.
`covariance_`	`ndarray of shape (n_features, n_features)`	Correlation matrix, variance-covariance matrix, or identity matrix of the training data; correlation matrix if "method" is "mt", variance-covariance matrix if "method" is "mta", or identity matrix if "method" is "svp".
`precision_`	`ndarray of shape (n_features, n_features)`	The inverse matrix or adjoint matrix of covariance_; if method is svp, then identity matrix.
`dist_`	`ndarray of shape (n_samples, )`	Mahalanobis distances of the training set (on which the fit is called) observations.
`n_features_in_`	`int`	Number of features seen during fit.
`feature_names_in_`	`ndarray of shape (n_features_in_, )`	Names of features seen during the fit. Defined only if X has feature names that are all strings.

Methods:

Name	Description
`fit`	Fit the model.
`predict`	Predict the labels of X according to the fitted model.
`fit_predict`	Fit the model to X and return labels for X.
`mahalanobis`	Compute the Mahalanobis distances (MD values).
`score`	Return the ROC AUC for the given test data and labels.
`score_samples`	Compute the Mahalanobis distances (MD values).

Source code in src/mts/_mt.py

def __init__(
    self,
    *,
    method: str = "mt",
    ddof: int = 1,
    esp: float = 1e-16,
    kind: str = "specify",
    a: float = 0.05,
    threshold: float = 4.0,
    return_sqrt: bool = False,
):
    """
    Initialize the instance.

    Parameters
    ----------
    method : {"mt", "mta", "svp"}, default="mt"
        Computation method.

    ddof : int, default=1
        Delta degrees of freedom. The divisor used in the calculation is
        `N - ddof`, where `N` is the number of samples.

    esp : float, default=1e-16
        A constant to avoid zero division. It is used in the calculation as
        `1 / (x + esp)`.

    kind : {"k", "f", "chi2", "specify"}, default="specify"
        The distribution used to determine normal and abnormal thresholds.

    a : float, default=0.05
        Right side significance level. Used to set the threshold when `kind`
        is set to `f` or `chi2`.

    threshold : float, default=4.0
        Threshold to use when `kind` is set to `specify`.

    return_sqrt : bool, default=False
        Return the square root of the MD value or not.

    Attributes
    ----------
    mean_ : ndarray of shape (n_features, )
        Means of each feature of the training data.

    scale_ : ndarray of shape (n_features, )
        Standard deviation values of each feature of the training data.

    covariance_ : ndarray of shape (n_features, n_features)
        Correlation matrix, variance-covariance matrix, or identity matrix
        of the training data; correlation matrix if "method" is "mt",
        variance-covariance matrix if "method" is "mta", or identity matrix
        if "method" is "svp".

    precision_ : ndarray of shape (n_features, n_features)
        The inverse matrix or adjoint matrix of covariance_; if method is
        svp, then identity matrix.

    dist_ : ndarray of shape (n_samples, )
        Mahalanobis distances of the training set (on which the fit is
        called) observations.

    n_features_in_ : int
        Number of features seen during fit.

    feature_names_in_ : ndarray of shape (n_features_in_, )
        Names of features seen during the fit. Defined only if X has feature
        names that are all strings.
    """
    self.method = method
    self.ddof = ddof
    self.esp = esp
    self.kind = kind
    self.a = a
    self.threshold = threshold
    self.return_sqrt = return_sqrt

`fit(X, y=None)`

Fit the model.

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Training data.	required
`y`	`None`	Ignored.	`None`

Returns:

Name	Type	Description
`self`	`object`	Fitted model.

Source code in src/mts/_mt.py

def fit(self, X, y=None):
    """
    Fit the model.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Training data.

    y : None
        Ignored.

    Returns
    -------
    self : object
        Fitted model.
    """
    self._validate_params()  # type: ignore

    X = self._validate_data(  # type: ignore
        X=X,
        reset=True,
        ensure_min_samples=2,
        ensure_min_features=2,
        estimator=self,
    )

    n, k = X.shape  # type: ignore

    self.mean_ = np.mean(X, axis=0)
    self.scale_ = np.std(X, ddof=self.ddof, axis=0)

    if self.method == "mt":
        std_X = (X - self.mean_[None, :]) / (self.scale_[None, :] + self.esp)
        self.covariance_ = np.corrcoef(std_X, rowvar=False)
    elif self.method == "mta":
        std_X = X - self.mean_[None, :]
        self.covariance_ = np.cov(std_X, rowvar=False)
    else:
        self.covariance_ = np.eye(k)

    self.precision_ = self._get_precision(self.covariance_)

    self.dist_ = self._mahalanobis(X, self.mean_, self.scale_, self.precision_)

    if self.kind == "k":
        self.threshold_ = 4 * k
    elif self.kind == "f":
        self.threshold_ = (
            (k * (n - 1) * (n + 1)) / (n * (n - k)) * f.isf(self.a, k, n - k)
        )
    elif self.kind == "chi2":
        self.threshold_ = chi2.isf(self.a, k)
    else:
        self.threshold_ = self.threshold

    return self

`predict(X, y=None)`

Predict the labels of X according to the fitted model.

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Samples.	required
`y`	`None`	Ignored.	`None`

Returns:

Name	Type	Description
`labels`	`ndarray of shape (n_samples, )`	Returns 1 for anomalies/outliers and 0 for inliers.

Source code in src/mts/_mt.py

def predict(self, X, y=None):
    """
    Predict the labels of X according to the fitted model.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Samples.

    y : None
        Ignored.

    Returns
    -------
    labels : ndarray of shape (n_samples, )
        Returns 1 for anomalies/outliers and 0 for inliers.
    """
    check_is_fitted(self)

    return np.where(self.mahalanobis(X=X) >= self.threshold_, 1, 0)
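
A tiny illustration of the labeling rule used above, with made-up MD values and the default `threshold` of 4.0. Note that the comparison is `>=`, so a value exactly on the threshold is labeled as an anomaly.

```python
import numpy as np

md = np.array([0.5, 3.9, 4.0, 10.0])   # hypothetical MD values
threshold = 4.0

# 1 for anomalies/outliers, 0 for inliers.
labels = np.where(md >= threshold, 1, 0)
print(labels)
```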

`fit_predict(X, y=None)`

Fit the model to X and return labels for X.

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Input data.	required
`y`	`None`	Ignored.	`None`

Returns:

Name	Type	Description
`labels`	`ndarray of shape (n_samples, )`	Returns 1 for anomalies/outliers and 0 for inliers.

Source code in src/mts/_mt.py

def fit_predict(self, X, y=None):
    """
    Fit the model to X and return labels for X.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Input data.

    y : None
        Ignored.

    Returns
    -------
    labels : ndarray of shape (n_samples, )
        Returns 1 for anomalies/outliers and 0 for inliers.
    """
    return self.fit(X).predict(X)

`mahalanobis(X)`

Compute the Mahalanobis distances (MD values).

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Samples.	required

Returns:

Name	Type	Description
`MD`	`ndarray of shape (n_samples, )`	Mahalanobis distances (MD values).

Source code in src/mts/_mt.py

def mahalanobis(self, X):
    """
    Compute the Mahalanobis distances (MD values).

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Samples.

    Returns
    -------
    MD : ndarray of shape (n_samples, )
        Mahalanobis distances (MD values).
    """
    check_is_fitted(self)

    X = self._validate_data(X=X, reset=False)  # type: ignore

    MD = self._mahalanobis(X, self.mean_, self.scale_, self.precision_)

    return MD

`score(X, y)`

Return the ROC AUC for the given test data and labels.

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Test samples.	required
`y`	`ndarray of shape (n_samples, )`	True labels for X. 1 for anomalies/outliers and 0 for inliers.	required

Returns:

Name	Type	Description
`score`	`float`	ROC AUC.

Source code in src/mts/_mt.py

def score(self, X, y):
    """
    Return the ROC AUC for the given test data and labels.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Test samples.

    y : ndarray of shape (n_samples, )
        True labels for X. 1 for anomalies/outliers and 0 for inliers.

    Returns
    -------
    score : float
        ROC AUC.
    """
    check_is_fitted(self)

    X, y = self._validate_data(X=X, y=y, reset=False)  # type: ignore

    return roc_auc_score(y, self.mahalanobis(X=X))
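
To make the meaning of this score concrete, the sketch below computes the ROC AUC directly from its pairwise definition: the fraction of (anomaly, inlier) pairs in which the anomaly receives the larger MD value, with ties counted as one half. This is an illustrative stand-in written for this page, not the `roc_auc_score` implementation that the listing above calls, and `roc_auc_from_scores` is a hypothetical name.

```python
import numpy as np


def roc_auc_from_scores(y_true, scores):
    """Pairwise (rank-based) ROC AUC; labels are 1 for anomalies, 0 for inliers."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # Compare every anomaly score with every inlier score.
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size


auc = roc_auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 3 of the 4 anomaly/inlier pairs are ordered correctly
```

Because only the ordering of the scores matters, the result does not depend on the chosen `threshold`.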

`score_samples(X)`

Compute the Mahalanobis distances (MD values).

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Samples.	required

Returns:

Name	Type	Description
`MD`	`ndarray of shape (n_samples, )`	MD values.

Source code in src/mts/_mt.py

def score_samples(self, X):
    """
    Compute the Mahalanobis distances (MD values).

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Samples.

    Returns
    -------
    MD : ndarray of shape (n_samples, )
        MD values.
    """
    check_is_fitted(self)

    return self.mahalanobis(X=X)

`RT(*, ddof=1, esp=1e-16, threshold=4.0, return_sqrt=False)`

Bases: BaseEstimator

RT method.

The RT method is an unsupervised learning method used for pattern recognition in quality engineering. The method learns the mean of each feature in unit space, the sensitivity and SN ratio of each sample, and the associated covariance matrix of the sensitivity and SN ratio, and computes MD values based on these values.
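
The private `_compute_Y` helper is not shown in this section, so the sketch below is only an assumption about what it computes, based on the textbook RT formulation and on the attribute descriptions (sensitivity and error variance reciprocal). For each sample it regresses the sample on the unit-space mean vector through the origin. The function name `compute_sensitivity_and_sn` is hypothetical.

```python
import numpy as np


def compute_sensitivity_and_sn(X, mean_X, esp=1e-16):
    """Return an (n_samples, 2) array: [sensitivity, 1 / error variance]."""
    k = X.shape[1]
    r = np.dot(mean_X, mean_X)          # effective divisor
    L = X @ mean_X                      # linear form of each sample
    beta = L / (r + esp)                # sensitivity (slope through the origin)
    st = np.sum(X ** 2, axis=1)         # total variation
    sb = L ** 2 / (r + esp)             # variation explained by the slope
    ve = (st - sb) / (k - 1)            # error variance
    eta = 1.0 / (ve + esp)              # error variance reciprocal
    return np.column_stack([beta, eta])


mean_X = np.array([1.0, 2.0, 3.0])
X = np.array([[3.0, 2.0, 1.0],
              [2.0, 2.0, 2.0]])
Y = compute_sensitivity_and_sn(X, mean_X)
print(Y)
```

Each sample is thus reduced to two numbers regardless of the number of features, which is why `covariance_` and `precision_` have shape `(2, 2)`.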

Parameters:

Name	Type	Description	Default
`ddof`	`int`	Delta degrees of freedom. The divisor used in the calculation is `N - ddof`, where `N` is the number of samples.	`1`
`esp`	`float`	A constant to avoid zero division. It is used in the calculation as `1 / (x + esp)`.	`1e-16`
`threshold`	`float`	Threshold. A multiple of the standard deviation of the MD values in the unit space. If 4, threshold is 4 sigma.	`4.0`
`return_sqrt`	`bool`	Return the square root of the MD values or not.	`False`

Attributes:

Name	Type	Description
`mean_X_`	`ndarray of shape (n_features, )`	Mean values of each feature of the training data.
`mean_Y_`	`ndarray of shape (2, )`	Means of sensitivity and error variance reciprocals. `mean_Y_[0]` is the mean sensitivity, and `mean_Y_[1]` is the mean error variance reciprocal.
`covariance_`	`ndarray of shape (2, 2)`	Variance-covariance matrix of sensitivity and error variance reciprocal.
`precision_`	`ndarray of shape (2, 2)`	Adjoint matrix of `covariance_`.
`dist_`	`ndarray of shape (n_samples, )`	Mahalanobis distances of the training set (on which the fit is called) observations.
`n_features_in_`	`int`	Number of features seen during fit.
`feature_names_in_`	`ndarray of shape (n_features_in_, )`	Names of features seen during the fit. Defined only if X has feature names that are all strings.

Methods:

Name	Description
`fit`	Fit the model.
`predict`	Predict the labels of X according to the fitted model.
`fit_predict`	Fit the model to X and return labels for X.
`mahalanobis`	Compute the Mahalanobis distances (MD values).
`score`	Return the ROC AUC for the given test data and labels.
`score_samples`	Compute the Mahalanobis distances (MD values).

Source code in src/mts/_rt.py

def __init__(
    self,
    *,
    ddof: int = 1,
    esp: float = 1e-16,
    threshold: float = 4.0,
    return_sqrt: bool = False,
):
    """
    Initialize the instance.

    Parameters
    ----------
    ddof : int, default=1
        Delta degrees of freedom. The divisor used in the calculation is
        `N - ddof`, where `N` is the number of samples.

    esp : float, default=1e-16
        A constant to avoid zero division. It is used in the calculation as
        `1 / (x + esp)`.

    threshold : float, default=4.0
        Threshold. A multiple of the standard deviation of the MD values in
        the unit space. If 4, threshold is 4 sigma.

    return_sqrt : bool, default=False
        Return the square root of the MD values or not.

    Attributes
    ----------
    mean_X_ : ndarray of shape (n_features, )
        Mean values of each feature of the training data.

    mean_Y_ : ndarray of shape (2, )
        Means of sensitivity and error variance reciprocals. `mean_Y_[0]` is
        the mean sensitivity, and `mean_Y_[1]` is the mean error variance
        reciprocal.

    covariance_ : ndarray of shape (2, 2)
        Variance-covariance matrix of sensitivity and error variance
        reciprocal.

    precision_ : ndarray of shape (2, 2)
        Adjoint matrix of `covariance_`.

    dist_ : ndarray of shape (n_samples, )
        Mahalanobis distances of the training set (on which the fit is
        called) observations.

    n_features_in_ : int
        Number of features seen during fit.

    feature_names_in_ : ndarray of shape (n_features_in_, )
        Names of features seen during the fit. Defined only if X has feature
        names that are all strings.
    """
    self.ddof = ddof
    self.esp = esp
    self.threshold = threshold
    self.return_sqrt = return_sqrt

`fit(X, y=None)`

Fit the model.

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Training data.	required
`y`	`None`	Ignored.	`None`

Returns:

Name	Type	Description
`self`	`object`	Fitted model.

Source code in src/mts/_rt.py

def fit(self, X, y=None):
    """
    Fit the model.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Training data.

    y : None
        Ignored.

    Returns
    -------
    self : object
        Fitted model.
    """
    self._validate_params()  # type: ignore

    X = self._validate_data(  # type: ignore
        X=X,
        reset=True,
        ensure_min_samples=2,
        ensure_min_features=2,
        estimator=self,
    )

    self.mean_X_ = np.mean(X, axis=0)

    Y = self._compute_Y(X, self.mean_X_)

    self.mean_Y_ = np.mean(Y, axis=0)

    std_Y = Y - self.mean_Y_[None, :]

    self.covariance_ = np.cov(std_Y, rowvar=False, ddof=self.ddof)

    self.precision_ = self._get_precision(self.covariance_)

    self.dist_ = self._mahalanobis(Y, self.mean_Y_, self.precision_)

    self.sigma_ = np.sqrt(np.mean(self.dist_))

    return self
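
The attribute table describes `precision_` as the adjoint matrix of the 2x2 `covariance_`. The private `_get_precision` helper is not shown here, so the following is only a sketch of the classical adjugate of a 2x2 matrix, with `adjugate_2x2` as a hypothetical name.

```python
import numpy as np


def adjugate_2x2(m):
    """Classical adjugate: swap the diagonal, negate the off-diagonal."""
    m = np.asarray(m, dtype=float)
    return np.array([[m[1, 1], -m[0, 1]],
                     [-m[1, 0], m[0, 0]]])


cov = np.array([[2.0, 1.0],
                [1.0, 3.0]])
adj = adjugate_2x2(cov)

print(adj)
print(adj @ cov)   # determinant times the identity matrix
```

Unlike the inverse, the adjugate is defined even when the covariance matrix is singular; for an invertible matrix it equals the inverse multiplied by the determinant.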

`predict(X, y=None)`

Predict the labels of X according to the fitted model.

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Samples.	required
`y`	`None`	Ignored.	`None`

Returns:

Name	Type	Description
`labels`	`ndarray of shape (n_samples, )`	Returns 1 for anomalies/outliers and 0 for inliers.

Source code in src/mts/_rt.py

def predict(self, X, y=None):
    """
    Predict the labels of X according to the fitted model.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Samples.

    y : None
        Ignored.

    Returns
    -------
    labels : ndarray of shape (n_samples, )
        Returns 1 for anomalies/outliers and 0 for inliers.
    """
    check_is_fitted(self)

    threshold = self.threshold * self.sigma_

    return np.where(self.mahalanobis(X=X) >= threshold, 1, 0)
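
The decision rule above scales `threshold` by `sigma_`, which `fit` sets to the square root of the mean unit-space MD value. A short numeric sketch with made-up values:

```python
import numpy as np

dist_train = np.array([0.5, 1.5, 1.0, 1.0])   # hypothetical unit-space MD values
sigma = np.sqrt(np.mean(dist_train))          # as computed at the end of fit
threshold = 4.0 * sigma                       # default multiplier of 4

md_new = np.array([3.9, 4.0, 16.0])           # hypothetical MD values of new samples
labels = np.where(md_new >= threshold, 1, 0)

print(sigma, threshold)
print(labels)
```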

`fit_predict(X, y=None)`

Fit the model to X and return labels for X.

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Input data.	required
`y`	`None`	Ignored.	`None`

Returns:

Name	Type	Description
`labels`	`ndarray of shape (n_samples, )`	Returns 1 for anomalies/outliers and 0 for inliers.

Source code in src/mts/_rt.py

def fit_predict(self, X, y=None):
    """
    Fit the model to X and return labels for X.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Input data.

    y : None
        Ignored.

    Returns
    -------
    labels : ndarray of shape (n_samples, )
        Returns 1 for anomalies/outliers and 0 for inliers.
    """
    return self.fit(X).predict(X)

`mahalanobis(X)`

Compute the Mahalanobis distances (MD values).

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Samples.	required

Returns:

Name	Type	Description
`MD`	`ndarray of shape (n_samples, )`	Mahalanobis distances (MD values).

Source code in src/mts/_rt.py

def mahalanobis(self, X):
    """
    Compute the Mahalanobis distances (MD values).

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Samples.

    Returns
    -------
    MD : ndarray of shape (n_samples, )
        Mahalanobis distances (MD values).
    """
    check_is_fitted(self)

    X = self._validate_data(X=X, reset=False)  # type: ignore

    Y = self._compute_Y(X, self.mean_X_)

    MD = self._mahalanobis(Y, self.mean_Y_, self.precision_)

    if self.return_sqrt:
        MD = np.sqrt(MD)

    return MD

`score(X, y)`

Return the ROC AUC for the given test data and labels.

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Test samples.	required
`y`	`ndarray of shape (n_samples, )`	True labels for X. 1 for anomalies/outliers and 0 for inliers.	required

Returns:

Name	Type	Description
`score`	`float`	ROC AUC.

Source code in src/mts/_rt.py

def score(self, X, y):
    """
    Return the ROC AUC for the given test data and labels.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Test samples.

    y : ndarray of shape (n_samples, )
        True labels for X. 1 for anomalies/outliers and 0 for inliers.

    Returns
    -------
    score : float
        ROC AUC.
    """
    check_is_fitted(self)

    X, y = self._validate_data(X=X, y=y, reset=False)  # type: ignore

    return roc_auc_score(y, self.mahalanobis(X=X))

`score_samples(X)`

Compute the Mahalanobis distances (MD values).

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Samples.	required

Returns:

Name	Type	Description
`MD`	`ndarray of shape (n_samples, )`	MD values.

Source code in src/mts/_rt.py

def score_samples(self, X):
    """
    Compute the Mahalanobis distances (MD values).

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Samples.

    Returns
    -------
    MD : ndarray of shape (n_samples, )
        MD values.
    """
    check_is_fitted(self)

    return self.mahalanobis(X=X)

`T(*, tb=False, esp=1e-16, is_simplified=False)`

Bases: RegressorMixin, BaseEstimator

T(1), T(2), Ta and Tb methods.

The T(1), T(2), Ta and Tb methods are supervised learning methods used for regression in quality engineering. The T(1) and T(2) methods divide the training data into unit space and signal data, and learn the mean from the unit space and the sensitivity and SN ratio from the signal data. The Ta method does not divide the training data into unit space and signal data, and learns the mean, sensitivity, and SN ratio from all the training data. The Tb method also learns from all training data, but for each element, the sample with the largest SN ratio is used as the mean.
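
A rough sketch of the Ta variant may help make this concrete. The private `_compute_sn_ratio_and_sensitivity` helper is not shown in this section, so the per-feature formulas below follow the textbook T-method (non-simplified SN ratio) and are an assumption about the library's behavior. The prediction step mirrors the `predict` listing on this page. All function names are hypothetical.

```python
import numpy as np


def sn_ratio_and_sensitivity(std_X, std_y, esp=1e-16):
    """Per-feature SN ratio and sensitivity from centered data."""
    n_samples = std_X.shape[0]
    r = np.dot(std_y, std_y)                 # effective divisor
    L = np.dot(std_y, std_X)                 # linear form per feature
    b = L / (r + esp)                        # sensitivity
    st = np.sum(std_X ** 2, axis=0)          # total variation
    sb = L ** 2 / (r + esp)                  # variation of the proportional term
    ve = (st - sb) / (n_samples - 1)         # error variance
    # Features whose signal does not exceed the noise get zero weight.
    n = np.where(sb > ve, (sb - ve) / (r * ve + esp), 0.0)
    return n, b


def fit_ta(X, y):
    """Ta method: mean, SN ratio and sensitivity from all training data."""
    mean_X, mean_y = X.mean(axis=0), y.mean()
    n, b = sn_ratio_and_sensitivity(X - mean_X, y - mean_y)
    return mean_X, mean_y, n, b


def predict_t(X, mean_X, mean_y, n, b, esp=1e-16):
    """Integrated estimate: SN-ratio-weighted mean of per-feature estimates."""
    M = (X - mean_X) / (b + esp)[None, :]
    return np.dot(M, n) / (np.sum(n) + esp) + mean_y


rng = np.random.default_rng(0)
M_true = rng.normal(size=200)
X = np.column_stack([2.0 * M_true + 0.1 * rng.normal(size=200),
                     -1.0 * M_true + 0.1 * rng.normal(size=200)])
y = 5.0 + M_true

mean_X, mean_y, n, b = fit_ta(X, y)
y_pred = predict_t(X, mean_X, mean_y, n, b)
print(b)   # close to the true slopes 2 and -1
```

Each feature produces its own estimate of the target, and features with a higher SN ratio contribute more to the integrated estimate.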

Parameters:

Name	Type	Description	Default
`tb`	`bool`	Whether to compute as Tb method. If False, compute as T(1), T(2), and Ta methods.	`False`
`esp`	`float`	A constant to avoid zero division. It is used in the calculation as `1 / (x + esp)`.	`1e-16`
`is_simplified`	`bool`	Compute the SN ratio using the simplified formula or not. The simplified formula computes with `b**2 / ve`.	`False`

Attributes:

Name	Type	Description
`mean_X_`	`ndarray of shape (n_features, )`	Mean values of each feature of the training data.
`mean_y_`	`float or ndarray of shape (n_features, )`	Mean value of target values.
`n_`	`ndarray of shape (n_features, )`	SN ratio between each feature and the target values.
`b_`	`ndarray of shape (n_features, )`	Sensitivity between each feature and target values.
`n_features_in_`	`int`	Number of features seen during fit.
`feature_names_in_`	`ndarray of shape (n_features_in_, )`	Names of features seen during the fit. Defined only if X has feature names that are all strings.

Methods:

Name	Description
`fit`	Fit the model.
`predict`	Predict using the fitted model.
`score`	Return the SN ratio of the integrated estimate.

Source code in src/mts/_t.py

def __init__(
    self, *, tb: bool = False, esp: float = 1e-16, is_simplified: bool = False
):
    """
    Initialize the instance.

    Parameters
    ----------
    tb : bool, default=False
        Whether to compute as Tb method. If False, compute as T(1), T(2),
        and Ta methods.

    esp : float, default=1e-16
        A constant to avoid zero division. It is used in the calculation as
        `1 / (x + esp)`.

    is_simplified : bool, default=False
        Compute the SN ratio using the simplified formula or not. The
        simplified formula computes with `b**2 / ve`.

    Attributes
    ----------
    mean_X_ : ndarray of shape (n_features, )
        Mean values of each feature of the training data.

    mean_y_ : float or ndarray of shape (n_features, )
        Mean value of target values.

    n_ : ndarray of shape (n_features, )
        SN ratio between each feature and the target values.

    b_ : ndarray of shape (n_features, )
        Sensitivity between each feature and target values.

    n_features_in_ : int
        Number of features seen during fit.

    feature_names_in_ : ndarray of shape (n_features_in_, )
        Names of features seen during the fit. Defined only if X has feature
        names that are all strings.
    """
    self.tb = tb
    self.esp = esp
    self.is_simplified = is_simplified

`fit(X, y, *, us_idx=None)`

Fit the model.

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Training data. Includes unit space and signal data.	required
`y`	`ndarray of shape (n_samples, )`	Target values. Will be cast to X's dtype if necessary.	required
`us_idx`	`array_like of shape (n_samples, ) or None`	A binary array indicating which sample of the training data is the unit space (0 for the unit space, 1 for the signal data); if None, the training data is not divided into the unit space and the signal data, but is computed as the Ta method. It is ignored when the Tb method is computed.	`None`

Returns:

Name	Type	Description
`self`	`object`	Fitted model.

Source code in src/mts/_t.py

def fit(self, X, y, *, us_idx=None):
    """
    Fit the model.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Training data. Includes unit space and signal data.

    y : ndarray of shape (n_samples, )
        Target values. Will be cast to X's dtype if necessary.

    us_idx : array_like of shape (n_samples, ) or None, default=None
        A binary array indicating which sample of the training data is the
        unit space (0 for the unit space, 1 for the signal data); if None,
        the training data is not divided into the unit space and the signal
        data, but is computed as the Ta method. It is ignored when the Tb
        method is computed.

    Returns
    -------
    self : object
        Fitted model.
    """
    self._validate_params()  # type: ignore

    X, y = self._validate_data(  # type: ignore
        X=X,
        y=y,
        reset=True,
        y_numeric=True,
        estimator=self,
    )

    if self.tb:
        n = np.empty_like(X)
        b = np.empty_like(X)
        for i, (x_i, y_i) in enumerate(zip(X, y)):
            std_X = X - x_i[None, :]
            std_y = y - y_i

            n[i], b[i] = self._compute_sn_ratio_and_sensitivity(std_X, std_y)

        idx_row = np.argmax(n, axis=0)
        idx_col = np.arange(X.shape[1])

        self.mean_X_ = X[idx_row, idx_col]
        self.mean_y_ = y[idx_row]

        self.b_ = b[idx_row, idx_col]
        self.n_ = n[idx_row, idx_col]
    else:
        if us_idx is None:
            self.mean_X_ = np.mean(X, axis=0)
            self.mean_y_ = np.mean(y)

            std_X = X - self.mean_X_[None, :]
            std_y = y - self.mean_y_
        else:
            # Convert first so that plain lists work as array_like input.
            unit_space_mask = np.asarray(us_idx) == 0

            self.mean_X_ = np.mean(X[unit_space_mask], axis=0)
            self.mean_y_ = np.mean(y[unit_space_mask])

            std_X = X[~unit_space_mask] - self.mean_X_[None, :]
            std_y = y[~unit_space_mask] - self.mean_y_

        self.n_, self.b_ = self._compute_sn_ratio_and_sensitivity(std_X, std_y)

    return self
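
The role of `us_idx` in the T(1)/T(2) branch can be illustrated with a few rows of made-up data: samples marked 0 form the unit space and supply the means, while samples marked 1 are the signal data that get centered by those means. This is a sketch of the intended split, not the library code itself.

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [10.0, 20.0],
              [20.0, 40.0]])
y = np.array([1.0, 3.0, 10.0, 20.0])
us_idx = [0, 0, 1, 1]            # 0: unit space, 1: signal data

unit_space_mask = np.asarray(us_idx) == 0

# Means are learned from the unit space only.
mean_X = X[unit_space_mask].mean(axis=0)
mean_y = y[unit_space_mask].mean()

# Signal data are centered by the unit-space means.
std_X = X[~unit_space_mask] - mean_X[None, :]
std_y = y[~unit_space_mask] - mean_y

print(mean_X, mean_y)
print(std_X)
print(std_y)
```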

`predict(X, y=None)`

Predict using the fitted model.

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Samples.	required
`y`	`None`	Ignored.	`None`

Returns:

Name	Type	Description
`y_pred`	`ndarray of shape (n_samples, )`	Predicted values.

Source code in src/mts/_t.py

def predict(self, X, y=None):
    """
    Predict using the fitted model.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Samples.

    y : None
        Ignored.

    Returns
    -------
    y_pred : ndarray of shape (n_samples, )
        Predicted values.
    """
    check_is_fitted(self)

    X = self._validate_data(X=X, reset=False, estimator=self)  # type: ignore

    std_X = X - self.mean_X_

    M_pred = std_X / (self.b_ + self.esp)[None, :]

    if self.tb:
        y_pred = M_pred + self.mean_y_[None, :]  # type: ignore
        y_pred = np.dot(y_pred, self.n_) / (np.sum(self.n_) + self.esp)
    else:
        M_pred = np.dot(M_pred, self.n_) / (np.sum(self.n_) + self.esp)
        y_pred = M_pred + self.mean_y_

    return y_pred

`score(X, y)`

Return the SN ratio of the integrated estimate.

Parameters:

Name	Type	Description	Default
`X`	`ndarray of shape (n_samples, n_features)`	Test samples.	required
`y`	`ndarray of shape (n_samples, )`	True values for X.	required

Returns:

Name	Type	Description
`n`	`float`	SN ratio of the integrated estimate. It is computed from M_True and M_Pred for the T(1), T(2) and Ta methods, and from y_True and y_Pred for the Tb method.

Source code in src/mts/_t.py

def score(self, X, y):
    """
    Return the SN ratio of the integrated estimate.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Test samples.

    y : ndarray of shape (n_samples, )
        True values for X.

    Returns
    -------
    n : float
        SN ratio of the integrated estimate. It is computed from M_True and
        M_Pred for the T(1), T(2) and Ta methods, and from y_True and y_Pred
        for the Tb method.
    """
    check_is_fitted(self)

    X, y = self._validate_data(X=X, y=y, reset=False)  # type: ignore

    if self.tb:
        M_true = y
        M_pred = self.predict(X)
    else:
        M_true = y - self.mean_y_
        M_pred = self.predict(X) - self.mean_y_

    n, _ = self._compute_sn_ratio_and_sensitivity(M_pred[:, None], M_true)
    n = 10 * np.log10(n)

    return n
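
As a rough illustration of this score, the sketch below computes an SN ratio between true and predicted signals and converts it to decibels with `10 * log10`, as in the listing above. The SN ratio formula itself is the textbook non-simplified form and is an assumption, since `_compute_sn_ratio_and_sensitivity` is not shown here; `integrated_sn_ratio_db` is a hypothetical name.

```python
import numpy as np


def integrated_sn_ratio_db(m_true, m_pred, esp=1e-16):
    """SN ratio (in dB) of predictions m_pred against true signals m_true."""
    m_true = np.asarray(m_true, dtype=float)
    m_pred = np.asarray(m_pred, dtype=float)
    r = np.dot(m_true, m_true)               # effective divisor
    L = np.dot(m_true, m_pred)               # linear form
    st = np.dot(m_pred, m_pred)              # total variation
    sb = L ** 2 / (r + esp)                  # variation of the proportional term
    ve = (st - sb) / (len(m_true) - 1)       # error variance
    eta = (sb - ve) / (r * ve + esp)         # SN ratio
    return 10 * np.log10(eta)


m_true = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
good = np.array([-1.9, -1.1, 0.0, 1.1, 1.9])   # small errors
bad = np.array([-1.0, -2.0, 0.5, 0.0, 3.0])    # large errors

good_db = integrated_sn_ratio_db(m_true, good)
bad_db = integrated_sn_ratio_db(m_true, bad)
print(good_db, bad_db)
```

A higher value means the predictions track the true signal with less scatter, so unlike an error metric this score is better when larger.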