Coverage for src / ts_stat_tests / stability / algorithms.py: 100%

13 statements  

coverage.py v7.13.2, created at 2026-02-01 09:48 +0000

# ============================================================================ #
#                                                                              #
#     Title: Stability Algorithms                                              #
#     Purpose: Algorithms for measuring time series stability and lumpiness.   #
#                                                                              #
# ============================================================================ #


# ---------------------------------------------------------------------------- #
#                                                                              #
#     Overview                                                              ####
#                                                                              #
# ---------------------------------------------------------------------------- #


# ---------------------------------------------------------------------------- #
# Description                                                               ####
# ---------------------------------------------------------------------------- #


"""
!!! note "Summary"
    This module provides algorithms to measure the stability and lumpiness of time series data.
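
As a rough illustration of what these two measures compute, the sketch below tiles a series into non-overlapping windows, then takes the variance of the per-window means (stability) and of the per-window variances (lumpiness). It is a standard-library approximation, not the `tsfeatures` implementation that this module wraps. The helper names (`tile`, `stability_sketch`, `lumpiness_sketch`) are hypothetical, and both the sample variance (n - 1 denominator) and the dropping of a trailing incomplete window are assumptions.

```python
from statistics import fmean, variance


def tile(series, width):
    # Split the series into consecutive non-overlapping windows of `width`
    # observations; a trailing incomplete window is dropped (assumption).
    n_tiles = len(series) // width
    return [series[i * width:(i + 1) * width] for i in range(n_tiles)]


def stability_sketch(series, width):
    # Variance of the per-window means; needs at least two complete windows.
    means = [fmean(window) for window in tile(series, width)]
    return variance(means)


def lumpiness_sketch(series, width):
    # Variance of the per-window variances; needs width >= 2 and at least
    # two complete windows, because the sample variance is undefined otherwise.
    variances = [variance(window) for window in tile(series, width)]
    return variance(variances)


# The level shifts between windows while the spread inside each stays the same.
level_shift = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
# The level stays at 5 while the spread inside each window grows.
growing_spread = [5.0, 5.0, 5.0, 4.0, 5.0, 6.0, 3.0, 5.0, 7.0]

print(stability_sketch(level_shift, 3), lumpiness_sketch(level_shift, 3))
print(stability_sketch(growing_spread, 3), lumpiness_sketch(growing_spread, 3))
```

With a window of 3, the first series gives a stability of 9 and a lumpiness of 0, while the second gives a stability of 0 and a lumpiness of 13/3, so each measure responds to only one kind of change.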

"""


# ---------------------------------------------------------------------------- #
#                                                                              #
#     Setup                                                                 ####
#                                                                              #
# ---------------------------------------------------------------------------- #


# ---------------------------------------------------------------------------- #
# Imports                                                                   ####
# ---------------------------------------------------------------------------- #


# ## Python StdLib Imports ----
from typing import Union

# ## Python Third Party Imports ----
import numpy as np
import pandas as pd
from numpy.typing import NDArray
from tsfeatures import lumpiness as ts_lumpiness, stability as ts_stability
from typeguard import typechecked


# ---------------------------------------------------------------------------- #
# Exports                                                                   ####
# ---------------------------------------------------------------------------- #


__all__: list[str] = ["stability", "lumpiness"]


# ---------------------------------------------------------------------------- #
#                                                                              #
#     Algorithms                                                            ####
#                                                                              #
# ---------------------------------------------------------------------------- #


# -----------------------------------------------------------------------------#
# Stability                                                                 ####
# -----------------------------------------------------------------------------#


@typechecked
def stability(data: Union[NDArray[np.float64], pd.DataFrame, pd.Series], freq: int = 1) -> float:
    r"""
    !!! note "Summary"
        Measure the stability of a time series by calculating the variance of the means across non-overlapping windows.

    ???+ abstract "Details"
        Stability is a feature extracted from time series data that quantifies how much the mean level of the series changes over time. It is particularly useful for identifying series with structural breaks or varying levels.

        The series is divided into non-overlapping "tiles" (windows) of length equal to the specified frequency. The mean of each tile is computed, and the stability is defined as the variance of these means. A higher value indicates lower stability (greater changes in the mean level).

    Params:
        data (Union[NDArray[np.float64], pd.DataFrame, pd.Series]):
            The time series data to analyse.
        freq (int):
            The number of observations per seasonal period or the desired window size for tiling.
            Default: `1`

    Returns:
        (float):
            The calculated stability value.

    ???+ example "Examples"

        ```pycon {.py .python linenums="1" title="Setup"}
        >>> import numpy as np
        >>> from ts_stat_tests.stability.algorithms import stability
        >>> from ts_stat_tests.utils.data import load_airline

        ```

        ```pycon {.py .python linenums="1" title="Example 1: Measure stability of airline data"}
        >>> data = load_airline().values
        >>> res = stability(data, freq=12)
        >>> print(f"{res:.2f}")
        13428.67

        ```

        ```pycon {.py .python linenums="1" title="Example 2: Measure stability of random noise"}
        >>> rng = np.random.RandomState(42)
        >>> data_random = rng.normal(0, 1, 144)
        >>> res = stability(data_random, freq=12)
        >>> print(f"{res:.4f}")
        0.0547

        ```

    ??? equation "Calculation"
        The stability $S$ is calculated by:

        1. Dividing the time series $X$ into $k$ non-overlapping windows $W_1, W_2, \dots, W_k$ of size $\text{freq}$.
        2. Computing the mean $\mu_i$ for each window $W_i$.
        3. Calculating the variance of these means:
            $$
            S = \text{Var}(\mu_1, \mu_2, \dots, \mu_k)
            $$

    ??? question "References"
        - Hyndman, R.J., Wang, X., & Laptev, N. (2015). Large-scale unusual time series detection. In Proceedings of the IEEE International Conference on Data Mining (ICDM 2015).

    ??? tip "See Also"
        - [`lumpiness()`][ts_stat_tests.stability.algorithms.lumpiness]
    """
    return ts_stability(x=data, freq=freq)["stability"]


# ------------------------------------------------------------------------------#
# Lumpiness                                                                 ####
# ------------------------------------------------------------------------------#


@typechecked
def lumpiness(data: Union[NDArray[np.float64], pd.DataFrame, pd.Series], freq: int = 1) -> float:
    r"""
    !!! note "Summary"
        Measure the lumpiness of a time series by calculating the variance of the variances across non-overlapping windows.

    ???+ abstract "Details"
        Lumpiness quantifies the extent to which the variance of a time series changes over time. It is useful for detecting series with "lumpy" patterns, where volatility is concentrated in certain periods.

        Similar to stability, the series is divided into non-overlapping tiles of length `freq`. Instead of means, the variance of each tile is computed. The lumpiness is defined as the variance of these tile variances. A higher value indicates greater "lumpiness" or inconsistent volatility.

    Params:
        data (Union[NDArray[np.float64], pd.DataFrame, pd.Series]):
            The time series data to analyse.
        freq (int):
            The number of observations per seasonal period or the desired window size for tiling.
            Default: `1`

    Returns:
        (float):
            The calculated lumpiness value.

    ???+ example "Examples"

        ```pycon {.py .python linenums="1" title="Setup"}
        >>> import numpy as np
        >>> from ts_stat_tests.stability.algorithms import lumpiness
        >>> from ts_stat_tests.utils.data import load_airline

        ```

        ```pycon {.py .python linenums="1" title="Example 1: Measure lumpiness of airline data"}
        >>> data = load_airline().values
        >>> res = lumpiness(data, freq=12)
        >>> print(f"{res:.2f}")
        3986791.94

        ```

        ```pycon {.py .python linenums="1" title="Example 2: Measure lumpiness of random noise"}
        >>> rng = np.random.RandomState(42)
        >>> data_random = rng.normal(0, 1, 144)
        >>> res = lumpiness(data_random, freq=12)
        >>> print(f"{res:.4f}")
        0.0925

        ```

    ??? equation "Calculation"
        The lumpiness $L$ is calculated by:

        1. Dividing the time series $X$ into $k$ non-overlapping windows $W_1, W_2, \dots, W_k$ of size $\text{freq}$.
        2. Computing the variance $\sigma^2_i$ for each window $W_i$.
        3. Calculating the variance of these variances:
            $$
            L = \text{Var}(\sigma^2_1, \sigma^2_2, \dots, \sigma^2_k)
            $$

    ??? question "References"
        - Hyndman, R.J., Wang, X., & Laptev, N. (2015). Large-scale unusual time series detection. In Proceedings of the IEEE International Conference on Data Mining (ICDM 2015).

    ??? tip "See Also"
        - [`stability()`][ts_stat_tests.stability.algorithms.stability]
    """
    return ts_lumpiness(x=data, freq=freq)["lumpiness"]