
Test the heteroscedasticity of a given Time-Series Dataset🔗

Introduction🔗

Summary

From Wikipedia:

In statistics, a sequence of random variables is heteroscedastic if its random variables have different variances. The term means "differing variance" and comes from the Greek "hetero" (different) and "skedasis" (dispersion). In the context of regression analysis, heteroscedasticity refers to the case where the variability of a variable is unequal across the range of values of a second variable that predicts it.


For more info, see: Heteroscedasticity on Wikipedia.

Info

The library currently wraps four widely used heteroscedasticity tests from statsmodels, providing a unified interface for diagnostic checking:

| library | category | algorithm | short | import script | url |
| --- | --- | --- | --- | --- | --- |
| statsmodels | Heteroscedasticity | ARCH Test | ARCH | `from statsmodels.stats.diagnostic import het_arch` | https://www.statsmodels.org/stable/generated/statsmodels.stats.diagnostic.het_arch.html |
| statsmodels | Heteroscedasticity | Breusch-Pagan Test | BP | `from statsmodels.stats.diagnostic import het_breuschpagan` | https://www.statsmodels.org/stable/generated/statsmodels.stats.diagnostic.het_breuschpagan.html |
| statsmodels | Heteroscedasticity | Goldfeld-Quandt Test | GQ | `from statsmodels.stats.diagnostic import het_goldfeldquandt` | https://www.statsmodels.org/stable/generated/statsmodels.stats.diagnostic.het_goldfeldquandt.html |
| statsmodels | Heteroscedasticity | White's Test | WHITE | `from statsmodels.stats.diagnostic import het_white` | https://www.statsmodels.org/stable/generated/statsmodels.stats.diagnostic.het_white.html |

For more info, see: Statsmodels Diagnostics.

Source Library

statsmodels was chosen as the primary engine for these tests because it provides comprehensive, well-documented, and numerically stable implementations of the most common heteroscedasticity diagnostics. These tests allow for robust verification of OLS assumptions across various scenarios, including autoregressive clusters (ARCH) and general non-constant variance (White's).

Source Module

All of the source code can be found within these modules:

Modules🔗

ts_stat_tests.heteroscedasticity.tests 🔗

Summary

This module provides wrapper functions for various heteroscedasticity tests, including:

- ARCH Test
- Breusch-Pagan Test
- Goldfeld-Quandt Test
- White's Test

heteroscedasticity 🔗

heteroscedasticity(
    res: Union[RegressionResults, RegressionResultsWrapper],
    algorithm: str = "bp",
    **kwargs: Union[float, int, str, bool, ArrayLike, None]
) -> tuple[Any, ...]

Summary

Perform a heteroscedasticity test on a fitted regression model.

Details

This function is a convenience wrapper around four underlying algorithms:
- arch()
- bp()
- gq()
- white()

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| res | Union[RegressionResults, RegressionResultsWrapper] | The fitted regression model to be checked. | required |
| algorithm | str | Which heteroscedasticity algorithm to use: arch(): ["arch", "engle"]; bp(): ["bp", "breusch-pagan", "breusch-pagan-lagrange-multiplier"]; gq(): ["gq", "goldfeld-quandt"]; white(): ["white"]. | 'bp' |
| kwargs | Union[float, int, str, bool, ArrayLike, None] | Additional keyword arguments passed to the underlying test function. | {} |

Raises:

| Type | Description |
| --- | --- |
| ValueError | When the given value for algorithm is not valid. |

Returns:

| Type | Description |
| --- | --- |
| Union[tuple[float, float, float, float], tuple[float, float, str], ResultsStore] | The results of the heteroscedasticity test. The return type depends on the chosen algorithm and kwargs. |

Credit

Calculations are performed by statsmodels.

Examples
Setup
>>> import statsmodels.api as sm
>>> from ts_stat_tests.heteroscedasticity.tests import heteroscedasticity
>>> from ts_stat_tests.utils.data import data_line, data_random
>>> X = sm.add_constant(data_line)
>>> y = 2 * data_line + data_random
>>> res = sm.OLS(y, X).fit()
Example 1: Breusch-Pagan test
>>> result = heteroscedasticity(res, algorithm="bp")
>>> print(f"p-value: {result[1]:.4f}")
p-value: 0.2461
Example 2: ARCH test
>>> lm, lmp, f, fp = heteroscedasticity(res, algorithm="arch")
>>> print(f"ARCH p-value: {lmp:.4f}")
ARCH p-value: 0.9124
Source code in src/ts_stat_tests/heteroscedasticity/tests.py
@typechecked
def heteroscedasticity(
    res: Union[RegressionResults, RegressionResultsWrapper],
    algorithm: str = "bp",
    **kwargs: Union[float, int, str, bool, ArrayLike, None],
) -> tuple[Any, ...]:
    """
    !!! note "Summary"
        Perform a heteroscedasticity test on a fitted regression model.

    ???+ abstract "Details"
        This function is a convenience wrapper around four underlying algorithms:<br>
        - [`arch()`][ts_stat_tests.heteroscedasticity.algorithms.arch]<br>
        - [`bp()`][ts_stat_tests.heteroscedasticity.algorithms.bpl]<br>
        - [`gq()`][ts_stat_tests.heteroscedasticity.algorithms.gq]<br>
        - [`white()`][ts_stat_tests.heteroscedasticity.algorithms.wlm]

    Params:
        res (Union[RegressionResults, RegressionResultsWrapper]):
            The fitted regression model to be checked.
        algorithm (str):
            Which heteroscedasticity algorithm to use.<br>
            - `arch()`: `["arch", "engle"]`<br>
            - `bp()`: `["bp", "breusch-pagan", "breusch-pagan-lagrange-multiplier"]`<br>
            - `gq()`: `["gq", "goldfeld-quandt"]`<br>
            - `white()`: `["white"]`<br>
            Default: `"bp"`
        kwargs (Union[float, int, str, bool, ArrayLike, None]):
            Additional keyword arguments passed to the underlying test function.

    Raises:
        (ValueError):
            When the given value for `algorithm` is not valid.

    Returns:
        (Union[tuple[float, float, float, float], tuple[float, float, str], ResultsStore]):
            The results of the heteroscedasticity test. The return type depends on the chosen algorithm and `kwargs`.

    !!! success "Credit"
        Calculations are performed by `statsmodels`.

    ???+ example "Examples"

        ```pycon {.py .python linenums="1" title="Setup"}
        >>> import statsmodels.api as sm
        >>> from ts_stat_tests.heteroscedasticity.tests import heteroscedasticity
        >>> from ts_stat_tests.utils.data import data_line, data_random
        >>> X = sm.add_constant(data_line)
        >>> y = 2 * data_line + data_random
        >>> res = sm.OLS(y, X).fit()

        ```

        ```pycon {.py .python linenums="1" title="Example 1: Breusch-Pagan test"}
        >>> result = heteroscedasticity(res, algorithm="bp")
        >>> print(f"p-value: {result[1]:.4f}")
        p-value: 0.2461

        ```

        ```pycon {.py .python linenums="1" title="Example 2: ARCH test"}
        >>> lm, lmp, f, fp = heteroscedasticity(res, algorithm="arch")
        >>> print(f"ARCH p-value: {lmp:.4f}")
        ARCH p-value: 0.9124

        ```
    """
    options: dict[str, tuple[str, ...]] = {
        "arch": ("arch", "engle"),
        "bp": ("bp", "breusch-pagan", "breusch-pagan-lagrange-multiplier"),
        "gq": ("gq", "goldfeld-quandt"),
        "white": ("white",),
    }

    # Internal helper to handle kwargs casting for ty
    def _call(
        func: Callable[..., Any],
        **args: Any,
    ) -> tuple[Any, ...]:
        """
        !!! note "Summary"
            Internal helper to handle keyword arguments types.

        Params:
            func (Callable[..., Any]):
                The function to call.
            args (Any):
                The keyword arguments to pass.

        Returns:
            (tuple[Any, ...]):
                The function output.

        ???+ example "Examples"
            This is an internal function and is not intended to be called directly.
        """
        return func(**args)

    if algorithm in options["arch"]:
        return _call(arch, resid=res.resid, **kwargs)

    if algorithm in options["bp"]:
        return _call(bpl, resid=res.resid, exog_het=res.model.exog, **kwargs)

    if algorithm in options["gq"]:
        return _call(gq, y=res.model.endog, x=res.model.exog, **kwargs)

    if algorithm in options["white"]:
        return _call(wlm, resid=res.resid, exog_het=res.model.exog, **kwargs)

    raise ValueError(
        generate_error_message(
            parameter_name="algorithm",
            value_parsed=algorithm,
            options=options,
        )
    )

is_heteroscedastic 🔗

is_heteroscedastic(
    res: Union[RegressionResults, RegressionResultsWrapper],
    algorithm: str = "bp",
    alpha: float = 0.05,
    **kwargs: Union[float, int, str, bool, ArrayLike, None]
) -> dict[str, Union[str, float, bool, None]]

Summary

Test whether a given model's residuals exhibit heteroscedasticity or not.

Details

This function checks the results of a heteroscedasticity test against a significance level alpha. The null hypothesis (\(H_0\)) for all supported tests is homoscedasticity (constant variance). If the p-value is less than alpha, the null hypothesis is rejected in favor of heteroscedasticity.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| res | Union[RegressionResults, RegressionResultsWrapper] | The fitted regression model to be checked. | required |
| algorithm | str | Which heteroscedasticity algorithm to use. See heteroscedasticity() for options. | 'bp' |
| alpha | float | The significance level for the test. | 0.05 |
| kwargs | Union[float, int, str, bool, ArrayLike, None] | Additional keyword arguments passed to the underlying test function. | {} |

Returns:

| Type | Description |
| --- | --- |
| dict[str, Union[str, float, bool, None]] | A dictionary containing: "result" (bool), whether the residuals are heteroscedastic (i.e., p-value < alpha); "statistic" (float), the test statistic; "pvalue" (float), the p-value of the test; "alpha" (float), the significance level used; "algorithm" (str), the algorithm used. |

Examples
Setup
>>> import statsmodels.api as sm
>>> from ts_stat_tests.heteroscedasticity.tests import is_heteroscedastic
>>> from ts_stat_tests.utils.data import data_line, data_random
>>> X = sm.add_constant(data_line)
>>> y = 2 * data_line + data_random
>>> res = sm.OLS(y, X).fit()
Example 1: Check heteroscedasticity with Breusch-Pagan
>>> res_check = is_heteroscedastic(res, algorithm="bp")
>>> print(res_check["result"])
False
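
The decision rule (reject \(H_0\) when the p-value falls below alpha) can also be sketched from first principles. The following is a hypothetical, stdlib-only reimplementation of the Koenker form of the Breusch-Pagan test for a single regressor, not the library's own code; `simple_ols` and `bp_decision` are illustrative names:

```python
import math
import random

def simple_ols(x, y):
    """Fit y = a + b*x by least squares; return intercept, slope, residuals, R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    ss_res = sum(e * e for e in resid)
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, b, resid, 1.0 - ss_res / ss_tot

def bp_decision(x, y, alpha=0.05):
    """Koenker-form Breusch-Pagan check for one regressor: LM = n * R^2."""
    _, _, resid, _ = simple_ols(x, y)
    e2 = [e * e for e in resid]          # squared residuals
    _, _, _, r2 = simple_ols(x, e2)       # auxiliary regression of e^2 on x
    lm = len(x) * r2
    # chi-square survival function with 1 degree of freedom
    pvalue = math.erfc(math.sqrt(lm / 2.0))
    return {"statistic": lm, "pvalue": pvalue, "alpha": alpha, "result": pvalue < alpha}

# Error standard deviation grows with x, so the test should reject H0
random.seed(42)
xs = [float(i) for i in range(1, 201)]
ys = [2.0 * xi + random.gauss(0.0, 0.5 * xi) for xi in xs]
check = bp_decision(xs, ys)
```

With errors whose spread grows with `x`, `check["result"]` comes out `True`, mirroring the dictionary shape that `is_heteroscedastic()` returns.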
Source code in src/ts_stat_tests/heteroscedasticity/tests.py
@typechecked
def is_heteroscedastic(
    res: Union[RegressionResults, RegressionResultsWrapper],
    algorithm: str = "bp",
    alpha: float = 0.05,
    **kwargs: Union[float, int, str, bool, ArrayLike, None],
) -> dict[str, Union[str, float, bool, None]]:
    """
    !!! note "Summary"
        Test whether a given model's residuals exhibit `heteroscedasticity` or not.

    ???+ abstract "Details"
        This function checks the results of a heteroscedasticity test against a significance level `alpha`. The null hypothesis ($H_0$) for all supported tests is homoscedasticity (constant variance). If the p-value is less than `alpha`, the null hypothesis is rejected in favor of heteroscedasticity.

    Params:
        res (Union[RegressionResults, RegressionResultsWrapper]):
            The fitted regression model to be checked.
        algorithm (str):
            Which heteroscedasticity algorithm to use. See [`heteroscedasticity()`][ts_stat_tests.heteroscedasticity.tests.heteroscedasticity] for options.
            Default: `"bp"`
        alpha (float):
            The significance level for the test.
            Default: `0.05`
        kwargs (Union[float, int, str, bool, ArrayLike, None]):
            Additional keyword arguments passed to the underlying test function.

    Returns:
        (dict[str, Union[str, float, bool, None]]):
            A dictionary containing:
            - `"result"` (bool): Indicator if the residuals are heteroscedastic (i.e., p-value < alpha).
            - `"statistic"` (float): The test statistic.
            - `"pvalue"` (float): The p-value of the test.
            - `"alpha"` (float): The significance level used.
            - `"algorithm"` (str): The algorithm used.

    ???+ example "Examples"

        ```pycon {.py .python linenums="1" title="Setup"}
        >>> import statsmodels.api as sm
        >>> from ts_stat_tests.heteroscedasticity.tests import is_heteroscedastic
        >>> from ts_stat_tests.utils.data import data_line, data_random
        >>> X = sm.add_constant(data_line)
        >>> y = 2 * data_line + data_random
        >>> res = sm.OLS(y, X).fit()

        ```

        ```pycon {.py .python linenums="1" title="Example 1: Check heteroscedasticity with Breusch-Pagan"}
        >>> res_check = is_heteroscedastic(res, algorithm="bp")
        >>> print(res_check["result"])
        False

        ```
    """
    raw_res = heteroscedasticity(res=res, algorithm=algorithm, **kwargs)

    # All heteroscedasticity algorithms return a tuple
    # (lm, lmpval, fval, fpval) or (fval, pval, ...)
    stat = float(raw_res[0])
    pvalue = float(raw_res[1])

    return {
        "algorithm": algorithm,
        "statistic": float(stat),
        "pvalue": float(pvalue),
        "alpha": alpha,
        "result": bool(pvalue < alpha),
    }

ts_stat_tests.heteroscedasticity.algorithms 🔗

Summary

This module implements various heteroscedasticity tests, including:

- ARCH Test
- Breusch-Pagan Test
- Goldfeld-Quandt Test
- White's Test

VALID_GQ_ALTERNATIVES_OPTIONS module-attribute 🔗

VALID_GQ_ALTERNATIVES_OPTIONS = Literal[
    "two-sided", "increasing", "decreasing"
]

arch 🔗

arch(
    resid: ArrayLike,
    nlags: Optional[int] = None,
    ddof: int = 0,
    *,
    store: Literal[False] = False
) -> tuple[float, float, float, float]
arch(
    resid: ArrayLike,
    nlags: Optional[int] = None,
    ddof: int = 0,
    *,
    store: Literal[True]
) -> tuple[float, float, float, float, ResultsStore]
arch(
    resid: ArrayLike,
    nlags: Optional[int] = None,
    ddof: int = 0,
    *,
    store: bool = False
) -> Union[
    tuple[float, float, float, float],
    tuple[float, float, float, float, ResultsStore],
]

Summary

Engle's Test for Autoregressive Conditional Heteroscedasticity (ARCH).

Details

This test is used to determine whether the residuals of a time-series model exhibit ARCH effects. ARCH effects are characterized by volatility clustering, where periods of high volatility tend to be followed by further high volatility, and periods of low volatility by further low volatility. The test is essentially a Lagrange Multiplier (LM) test for autocorrelation in the squared residuals.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| resid | ArrayLike | The residuals from a linear regression model. | required |
| nlags | Optional[int] | The number of lags to include in the test regression. If None, the number of lags is determined based on the number of observations. | None |
| ddof | int | Degrees of freedom to adjust for in the calculation of the F-statistic. | 0 |
| store | bool | Whether to return a ResultsStore object containing additional test results. | False |

Returns:

| Type | Description |
| --- | --- |
| Union[tuple[float, float, float, float], tuple[float, float, float, float, ResultsStore]] | A tuple containing: lmstat (float), the Lagrange Multiplier statistic; lmpval (float), the p-value for the LM statistic; fstat (float), the F-statistic; fpval (float), the p-value for the F-statistic; resstore (ResultsStore, optional), returned only if store is True. |

Examples
Setup
>>> import statsmodels.api as sm
>>> from ts_stat_tests.heteroscedasticity.algorithms import arch
>>> from ts_stat_tests.utils.data import data_line, data_random
>>> X = sm.add_constant(data_line)
>>> y = 2 * data_line + data_random
>>> res = sm.OLS(y, X).fit()
>>> resid = res.resid
Example 1: Basic ARCH test
>>> lm, lmp, f, fp = arch(resid)
>>> print(f"LM p-value: {lmp:.4f}")
LM p-value: 0.9124
Calculation

The test is performed by regressing the squared residuals \(e_t^2\) on a constant and \(q\) lags of the squared residuals:

\[ e_t^2 = \gamma_0 + \gamma_1 e_{t-1}^2 + \gamma_2 e_{t-2}^2 + \dots + \gamma_q e_{t-q}^2 + \nu_t \]

The null hypothesis of no ARCH effects is:

\[ H_0: \gamma_1 = \gamma_2 = \dots = \gamma_q = 0 \]

The LM statistic is calculated as \(T \times R^2\) from this regression, where \(T\) is the number of observations and \(R^2\) is the coefficient of determination.
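
The calculation above can be sketched in a few lines. This is a hypothetical one-lag reimplementation for illustration only, not the statsmodels code; with \(q = 1\) the chi-square survival function reduces to `math.erfc`:

```python
import math
import random

def arch_lm_one_lag(resid):
    """Engle's ARCH LM statistic with q = 1: regress e_t^2 on e_{t-1}^2."""
    e2 = [e * e for e in resid]
    x, y = e2[:-1], e2[1:]            # lagged and current squared residuals
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    lm = n * (1.0 - ss_res / ss_tot)   # LM = T * R^2
    pvalue = math.erfc(math.sqrt(lm / 2.0))  # chi-square sf, 1 df
    return lm, pvalue

# Simulate an ARCH(1) process: the shock variance depends on the previous shock
random.seed(1)
shocks, prev = [], 0.0
for _ in range(500):
    prev = math.sqrt(0.2 + 0.5 * prev ** 2) * random.gauss(0.0, 1.0)
    shocks.append(prev)

lm, p = arch_lm_one_lag(shocks)
```

For a series with genuine ARCH effects like this one, the squared residuals are strongly autocorrelated, so the LM statistic is large and the p-value is small.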

Credit

Calculations are performed by statsmodels.

References
  • Engle, R. F. (1982). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica, 50(4), 987-1007.
Source code in src/ts_stat_tests/heteroscedasticity/algorithms.py
@typechecked
def arch(resid: ArrayLike, nlags: Optional[int] = None, ddof: int = 0, *, store: bool = False) -> Union[
    tuple[float, float, float, float],
    tuple[float, float, float, float, ResultsStore],
]:
    r"""
    !!! note "Summary"
        Engle's Test for Autoregressive Conditional Heteroscedasticity (ARCH).

    ???+ abstract "Details"
        This test is used to determine whether the residuals of a time-series model exhibit ARCH effects. ARCH effects are characterized by volatility clustering, where periods of high volatility tend to be followed by further high volatility, and periods of low volatility by further low volatility. The test is essentially a Lagrange Multiplier (LM) test for autocorrelation in the squared residuals.

    Params:
        resid (ArrayLike):
            The residuals from a linear regression model.
        nlags (Optional[int]):
            The number of lags to include in the test regression. If `None`, the number of lags is determined based on the number of observations.
            Default: `None`
        ddof (int):
            Degrees of freedom to adjust for in the calculation of the F-statistic.
            Default: `0`
        store (bool):
            Whether to return a `ResultsStore` object containing additional test results.
            Default: `False`

    Returns:
        (Union[tuple[float, float, float, float], tuple[float, float, float, float, ResultsStore]]):
            A tuple containing:
            - `lmstat` (float): The Lagrange Multiplier statistic.
            - `lmpval` (float): The p-value for the LM statistic.
            - `fstat` (float): The F-statistic.
            - `fpval` (float): The p-value for the F-statistic.
            - `resstore` (ResultsStore, optional): Returned only if `store` is `True`.

    ???+ example "Examples"

        ```pycon {.py .python linenums="1" title="Setup"}
        >>> import statsmodels.api as sm
        >>> from ts_stat_tests.heteroscedasticity.algorithms import arch
        >>> from ts_stat_tests.utils.data import data_line, data_random
        >>> X = sm.add_constant(data_line)
        >>> y = 2 * data_line + data_random
        >>> res = sm.OLS(y, X).fit()
        >>> resid = res.resid

        ```

        ```pycon {.py .python linenums="1" title="Example 1: Basic ARCH test"}
        >>> lm, lmp, f, fp = arch(resid)
        >>> print(f"LM p-value: {lmp:.4f}")
        LM p-value: 0.9124

        ```

    ??? equation "Calculation"
        The test is performed by regressing the squared residuals $e_t^2$ on a constant and $q$ lags of the squared residuals:

        $$
        e_t^2 = \gamma_0 + \gamma_1 e_{t-1}^2 + \gamma_2 e_{t-2}^2 + \dots + \gamma_q e_{t-q}^2 + \nu_t
        $$

        The null hypothesis of no ARCH effects is:

        $$
        H_0: \gamma_1 = \gamma_2 = \dots = \gamma_q = 0
        $$

        The LM statistic is calculated as $T \times R^2$ from this regression, where $T$ is the number of observations and $R^2$ is the coefficient of determination.

    ??? success "Credit"
        Calculations are performed by `statsmodels`.

    ??? question "References"
        - Engle, R. F. (1982). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica, 50(4), 987-1007.
    """
    if store:
        res_5 = cast(
            tuple[float, float, float, float, ResultsStore],
            het_arch(resid=resid, nlags=nlags, store=True, ddof=ddof),
        )
        return (
            float(res_5[0]),
            float(res_5[1]),
            float(res_5[2]),
            float(res_5[3]),
            res_5[4],
        )

    res_4 = cast(
        tuple[float, float, float, float],
        het_arch(resid=resid, nlags=nlags, store=False, ddof=ddof),
    )
    return (float(res_4[0]), float(res_4[1]), float(res_4[2]), float(res_4[3]))

bpl 🔗

bpl(
    resid: ArrayLike,
    exog_het: ArrayLike,
    robust: bool = True,
) -> tuple[float, float, float, float]

Summary

Breusch-Pagan Lagrange Multiplier Test for Heteroscedasticity.

Details

This test checks whether the variance of the errors in a regression model depends on the values of the independent variables. If it does, the errors are heteroscedastic. The null hypothesis assumes homoscedasticity (constant variance).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| resid | ArrayLike | The residuals from a linear regression model. | required |
| exog_het | ArrayLike | The explanatory variables for the variance (heteroscedasticity). Usually, these are the same as the original regression's exogenous variables. | required |
| robust | bool | Whether to use a robust version of the test that does not assume the errors are normally distributed (Koenker's version). | True |

Returns:

| Type | Description |
| --- | --- |
| tuple[float, float, float, float] | A tuple containing: lmstat (float), the Lagrange Multiplier statistic; lmpval (float), the p-value for the LM statistic; fstat (float), the F-statistic; fpval (float), the p-value for the F-statistic. |

Examples
Setup
>>> import statsmodels.api as sm
>>> from ts_stat_tests.heteroscedasticity.algorithms import bpl
>>> from ts_stat_tests.utils.data import data_line, data_random
>>> X = sm.add_constant(data_line)
>>> y = 2 * data_line + data_random
>>> res = sm.OLS(y, X).fit()
>>> resid, exog = res.resid, X
Example 1: Basic Breusch-Pagan test
>>> lm, lmp, f, fp = bpl(resid, exog)
>>> print(f"LM p-value: {lmp:.4f}")
LM p-value: 0.2461
Calculation

The test first fits a regression of squared residuals (or standardized version) on the specified exogenous variables:

\[ e_t^2 = \delta_0 + \delta_1 z_{t1} + \dots + \delta_k z_{tk} + u_t \]

The null hypothesis is:

\[ H_0: \delta_1 = \dots = \delta_k = 0 \]

Koenker's robust version uses the scores of the likelihood function and does not require the normality assumption.

Credit

Calculations are performed by statsmodels.

References
  • Breusch, T. S., & Pagan, A. R. (1979). A Simple Test for Heteroscedasticity and Random Coefficient Variation. Econometrica, 47(5), 1287-1294.
  • Koenker, R. (1981). A Note on Studentizing a Test for Heteroscedasticity. Journal of Econometrics, 17(1), 107-112.
Source code in src/ts_stat_tests/heteroscedasticity/algorithms.py
@typechecked
def bpl(resid: ArrayLike, exog_het: ArrayLike, robust: bool = True) -> tuple[float, float, float, float]:
    r"""
    !!! note "Summary"
        Breusch-Pagan Lagrange Multiplier Test for Heteroscedasticity.

    ???+ abstract "Details"
        This test checks whether the variance of the errors in a regression model depends on the values of the independent variables. If it does, the errors are heteroscedastic. The null hypothesis assumes homoscedasticity (constant variance).

    Params:
        resid (ArrayLike):
            The residuals from a linear regression model.
        exog_het (ArrayLike):
            The explanatory variables for the variance (heteroscedasticity). Usually, these are the same as the original regression's exogenous variables.
        robust (bool):
            Whether to use a robust version of the test that does not assume the errors are normally distributed (Koenker's version).
            Default: `True`

    Returns:
        (tuple[float, float, float, float]):
            A tuple containing:
            - `lmstat` (float): The Lagrange Multiplier statistic.
            - `lmpval` (float): The p-value for the LM statistic.
            - `fstat` (float): The F-statistic.
            - `fpval` (float): The p-value for the F-statistic.

    ???+ example "Examples"

        ```pycon {.py .python linenums="1" title="Setup"}
        >>> import statsmodels.api as sm
        >>> from ts_stat_tests.heteroscedasticity.algorithms import bpl
        >>> from ts_stat_tests.utils.data import data_line, data_random
        >>> X = sm.add_constant(data_line)
        >>> y = 2 * data_line + data_random
        >>> res = sm.OLS(y, X).fit()
        >>> resid, exog = res.resid, X

        ```

        ```pycon {.py .python linenums="1" title="Example 1: Basic Breusch-Pagan test"}
        >>> lm, lmp, f, fp = bpl(resid, exog)
        >>> print(f"LM p-value: {lmp:.4f}")
        LM p-value: 0.2461

        ```

    ??? equation "Calculation"
        The test first fits a regression of squared residuals (or standardized version) on the specified exogenous variables:

        $$
        e_t^2 = \delta_0 + \delta_1 z_{t1} + \dots + \delta_k z_{tk} + u_t
        $$

        The null hypothesis is:

        $$
        H_0: \delta_1 = \dots = \delta_k = 0
        $$

        Koenker's robust version uses the scores of the likelihood function and does not require the normality assumption.

    ??? success "Credit"
        Calculations are performed by `statsmodels`.

    ??? question "References"
        - Breusch, T. S., & Pagan, A. R. (1979). A Simple Test for Heteroscedasticity and Random Coefficient Variation. Econometrica, 47(5), 1287-1294.
        - Koenker, R. (1981). A Note on Studentizing a Test for Heteroscedasticity. Journal of Econometrics, 17(1), 107-112.
    """
    res = het_breuschpagan(resid=resid, exog_het=exog_het, robust=robust)
    return (float(res[0]), float(res[1]), float(res[2]), float(res[3]))

gq 🔗

gq(
    y: ArrayLike,
    x: ArrayLike,
    idx: Optional[int] = None,
    split: Optional[Union[int, float]] = None,
    drop: Optional[Union[int, float]] = None,
    alternative: VALID_GQ_ALTERNATIVES_OPTIONS = "increasing",
    *,
    store: Literal[False] = False
) -> tuple[float, float, str]
gq(
    y: ArrayLike,
    x: ArrayLike,
    idx: Optional[int] = None,
    split: Optional[Union[int, float]] = None,
    drop: Optional[Union[int, float]] = None,
    alternative: VALID_GQ_ALTERNATIVES_OPTIONS = "increasing",
    *,
    store: Literal[True]
) -> tuple[float, float, str, ResultsStore]
gq(
    y: ArrayLike,
    x: ArrayLike,
    idx: Optional[int] = None,
    split: Optional[Union[int, float]] = None,
    drop: Optional[Union[int, float]] = None,
    alternative: VALID_GQ_ALTERNATIVES_OPTIONS = "increasing",
    *,
    store: bool = False
) -> Union[
    tuple[float, float, str],
    tuple[float, float, str, ResultsStore],
]

Summary

Goldfeld-Quandt Test for Heteroscedasticity.

Details

The Goldfeld-Quandt test checks for heteroscedasticity by dividing the dataset into two subsets (usually at the beginning and end of the sample) and comparing the variance of the residuals in each subset using an F-test.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| y | ArrayLike | The dependent variable (endogenous). | required |
| x | ArrayLike | The independent variables (exogenous). | required |
| idx | Optional[int] | The column index of the variable to sort by. If None, the data is assumed to be ordered. | None |
| split | Optional[Union[int, float]] | The index at which to split the sample. If a float between 0 and 1, it represents the fraction of observations. | None |
| drop | Optional[Union[int, float]] | The number of observations to drop in the middle. If a float between 0 and 1, it represents the fraction of observations. | None |
| alternative | VALID_GQ_ALTERNATIVES_OPTIONS | The alternative hypothesis. Options are "increasing", "decreasing", or "two-sided". | 'increasing' |
| store | bool | Whether to return a ResultsStore object. | False |

Returns:

| Type | Description |
| --- | --- |
| Union[tuple[float, float, str], tuple[float, float, str, ResultsStore]] | A tuple containing: fstat (float), the F-statistic; fpval (float), the p-value for the F-statistic; alternative (str), the alternative hypothesis used; resstore (ResultsStore, optional), returned only if store is True. |

Examples
Setup
>>> import statsmodels.api as sm
>>> from ts_stat_tests.utils.data import data_line, data_random
>>> from ts_stat_tests.heteroscedasticity.algorithms import gq
>>> X = sm.add_constant(data_line)
>>> y = 2 * data_line + data_random
Example 1: Basic Goldfeld-Quandt test
>>> f, p, alt = gq(y, X)
>>> print(f"F p-value: {p:.4f}")
F p-value: 0.2269
Calculation

The dataset is split into two samples after sorting by an independent variable (or using the natural order). Separate regressions are run on each sample:

\[ RSS_1 = \sum e_{1,t}^2, \quad RSS_2 = \sum e_{2,t}^2 \]

The test statistic is the ratio of variances:

\[ F = \frac{RSS_2 / df_2}{RSS_1 / df_1} \]

where \(RSS_i\) are the residual sum of squares and \(df_i\) are the degrees of freedom.
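
The ratio above can be computed directly. This is a hypothetical, stdlib-only illustration (not the statsmodels implementation): the sample is split in half in its natural order, and the final p-value step, which requires the F distribution, is omitted; `simple_ols_rss` is an illustrative helper name:

```python
import random

def simple_ols_rss(x, y):
    """Fit y = a + b*x by least squares and return the residual sum of squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Error standard deviation grows with x, so the second half is noisier
random.seed(7)
xs = [float(i) for i in range(1, 201)]
ys = [2.0 * xi + random.gauss(0.0, 0.1 * xi) for xi in xs]

half = len(xs) // 2
rss1 = simple_ols_rss(xs[:half], ys[:half])
rss2 = simple_ols_rss(xs[half:], ys[half:])

# Equal subsample sizes, so the degrees of freedom cancel in the ratio
f_stat = rss2 / rss1
```

For data whose error variance increases along the sort order, `f_stat` comes out well above 1, consistent with the "increasing" alternative.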

Credit

Calculations are performed by statsmodels.

References
  • Goldfeld, S. M., & Quandt, R. E. (1965). Some Tests for Homoscedasticity. Journal of the American Statistical Association, 60(310), 539-547.
Source code in src/ts_stat_tests/heteroscedasticity/algorithms.py
@typechecked
def gq(
    y: ArrayLike,
    x: ArrayLike,
    idx: Optional[int] = None,
    split: Optional[Union[int, float]] = None,
    drop: Optional[Union[int, float]] = None,
    alternative: VALID_GQ_ALTERNATIVES_OPTIONS = "increasing",
    *,
    store: bool = False,
) -> Union[
    tuple[float, float, str],
    tuple[float, float, str, ResultsStore],
]:
    r"""
    !!! note "Summary"
        Goldfeld-Quandt Test for Heteroscedasticity.

    ???+ abstract "Details"
        The Goldfeld-Quandt test checks for heteroscedasticity by dividing the dataset into two subsets (usually at the beginning and end of the sample) and comparing the variance of the residuals in each subset using an F-test.

    Params:
        y (ArrayLike):
            The dependent variable (endogenous).
        x (ArrayLike):
            The independent variables (exogenous).
        idx (Optional[int]):
            The column index of the variable to sort by. If `None`, the data is assumed to be ordered.
            Default: `None`
        split (Optional[Union[int, float]]):
            The index at which to split the sample. If a float between 0 and 1, it represents the fraction of observations.
            Default: `None`
        drop (Optional[Union[int, float]]):
            The number of observations to drop in the middle. If a float between 0 and 1, it represents the fraction of observations.
            Default: `None`
        alternative (VALID_GQ_ALTERNATIVES_OPTIONS):
            The alternative hypothesis. Options are `"increasing"`, `"decreasing"`, or `"two-sided"`.
            Default: `"increasing"`
        store (bool):
            Whether to return a `ResultsStore` object.
            Default: `False`

    Returns:
        (Union[tuple[float, float, str], tuple[float, float, str, ResultsStore]]):
            A tuple containing:
            - `fstat` (float): The F-statistic.
            - `fpval` (float): The p-value for the F-statistic.
            - `alternative` (str): The alternative hypothesis used.
            - `resstore` (ResultsStore, optional): Returned only if `store` is `True`.

    ???+ example "Examples"

        ```pycon {.py .python linenums="1" title="Setup"}
        >>> import statsmodels.api as sm
        >>> from ts_stat_tests.utils.data import data_line, data_random
        >>> from ts_stat_tests.heteroscedasticity.algorithms import gq
        >>> X = sm.add_constant(data_line)
        >>> y = 2 * data_line + data_random

        ```

        ```pycon {.py .python linenums="1" title="Example 1: Basic Goldfeld-Quandt test"}
        >>> f, p, alt = gq(y, X)
        >>> print(f"F p-value: {p:.4f}")
        F p-value: 0.2269

        ```

    ??? equation "Calculation"
        The dataset is split into two samples after sorting by an independent variable (or using the natural order). Separate regressions are run on each sample:

        $$
        RSS_1 = \sum e_{1,t}^2, \quad RSS_2 = \sum e_{2,t}^2
        $$

        The test statistic is the ratio of variances:

        $$
        F = \frac{RSS_2 / df_2}{RSS_1 / df_1}
        $$

        where $RSS_i$ are the residual sum of squares and $df_i$ are the degrees of freedom.

    ??? success "Credit"
        Calculations are performed by `statsmodels`.

    ??? question "References"
        - Goldfeld, S. M., & Quandt, R. E. (1965). Some Tests for Homoscedasticity. Journal of the American Statistical Association, 60(310), 539-547.
    """
    if store:
        res_4 = cast(
            tuple[float, float, str, ResultsStore],
            het_goldfeldquandt(
                y=y,
                x=x,
                idx=idx,
                split=split,
                drop=drop,
                alternative=alternative,
                store=True,
            ),
        )
        return (float(res_4[0]), float(res_4[1]), str(res_4[2]), res_4[3])

    res_3 = cast(
        tuple[float, float, str],
        het_goldfeldquandt(
            y=y,
            x=x,
            idx=idx,
            split=split,
            drop=drop,
            alternative=alternative,
            store=False,
        ),
    )
    return (float(res_3[0]), float(res_3[1]), str(res_3[2]))

wlm 🔗

wlm(
    resid: ArrayLike, exog_het: ArrayLike
) -> tuple[float, float, float, float]

Summary

White's Test for Heteroscedasticity.

Details

White's test is a general test for heteroscedasticity that does not require a specific functional form for the variance of the error terms. It is essentially a test of whether the squared residuals can be explained by the levels, squares, and cross-products of the independent variables.

Parameters:

  • resid (ArrayLike): The residuals from a linear regression model. Required.
  • exog_het (ArrayLike): The explanatory variables for the variance. Usually, these are the original exogenous variables; the test internally handles adding their squares and cross-products. Required.

Returns:

  • tuple[float, float, float, float]: A tuple containing:
      - lmstat (float): The Lagrange Multiplier statistic.
      - lmpval (float): The p-value for the LM statistic.
      - fstat (float): The F-statistic.
      - fpval (float): The p-value for the F-statistic.

Examples
Setup
>>> import statsmodels.api as sm
>>> from ts_stat_tests.heteroscedasticity.algorithms import wlm
>>> from ts_stat_tests.utils.data import data_line, data_random
>>> X = sm.add_constant(data_line)
>>> y = 2 * data_line + data_random
>>> res = sm.OLS(y, X).fit()
>>> resid, exog = res.resid, X
Example 1: Basic White's test
>>> lm, lmp, f, fp = wlm(resid, exog)
>>> print(f"White p-value: {lmp:.4f}")
White p-value: 0.4558
Calculation

Squared residuals are regressed on all distinct variables in the cross-product of the original exogenous variables (including constant, linear terms, squares, and interactions):

\[ e_t^2 = \delta_0 + \sum \delta_i z_{it} + \sum \delta_{ij} z_{it} z_{jt} + u_t \]

The LM statistic is \(T \times R^2\) from this auxiliary regression, where \(T\) is the number of observations.

Credit

Calculations are performed by statsmodels.

References
  • White, H. (1980). A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity. Econometrica, 48(4), 817-838.
Source code in src/ts_stat_tests/heteroscedasticity/algorithms.py
@typechecked
def wlm(resid: ArrayLike, exog_het: ArrayLike) -> tuple[float, float, float, float]:
    r"""
    !!! note "Summary"
        White's Test for Heteroscedasticity.

    ???+ abstract "Details"
        White's test is a general test for heteroscedasticity that does not require a specific functional form for the variance of the error terms. It is essentially a test of whether the squared residuals can be explained by the levels, squares, and cross-products of the independent variables.

    Params:
        resid (ArrayLike):
            The residuals from a linear regression model.
        exog_het (ArrayLike):
            The explanatory variables for the variance. Usually, these are the original exogenous variables; the test internally handles adding their squares and cross-products.

    Returns:
        (tuple[float, float, float, float]):
            A tuple containing:
            - `lmstat` (float): The Lagrange Multiplier statistic.
            - `lmpval` (float): The p-value for the LM statistic.
            - `fstat` (float): The F-statistic.
            - `fpval` (float): The p-value for the F-statistic.

    ???+ example "Examples"

        ```pycon {.py .python linenums="1" title="Setup"}
        >>> import statsmodels.api as sm
        >>> from ts_stat_tests.heteroscedasticity.algorithms import wlm
        >>> from ts_stat_tests.utils.data import data_line, data_random
        >>> X = sm.add_constant(data_line)
        >>> y = 2 * data_line + data_random
        >>> res = sm.OLS(y, X).fit()
        >>> resid, exog = res.resid, X

        ```

        ```pycon {.py .python linenums="1" title="Example 1: Basic White's test"}
        >>> lm, lmp, f, fp = wlm(resid, exog)
        >>> print(f"White p-value: {lmp:.4f}")
        White p-value: 0.4558

        ```

    ??? equation "Calculation"
        Squared residuals are regressed on all distinct variables in the cross-product of the original exogenous variables (including constant, linear terms, squares, and interactions):

        $$
        e_t^2 = \delta_0 + \sum \delta_i z_{it} + \sum \delta_{ij} z_{it} z_{jt} + u_t
        $$

        The LM statistic is $T \times R^2$ from this auxiliary regression, where $T$ is the number of observations.

    ??? success "Credit"
        Calculations are performed by `statsmodels`.

    ??? question "References"
        - White, H. (1980). A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity. Econometrica, 48(4), 817-838.
    """
    res = het_white(resid=resid, exog=exog_het)
    return (float(res[0]), float(res[1]), float(res[2]), float(res[3]))