# Test the regularity of a given Time-Series Dataset
## Introduction
Summary
As stated by Selva Prabhakaran:
> The more regular and repeatable patterns a time series has, the easier it is to forecast. The 'Approximate Entropy' algorithm can be used to quantify the regularity and unpredictability of fluctuations in a time series. The higher the approximate entropy, the more difficult it is to forecast it.
>
> Another better alternate is the 'Sample Entropy'. Sample Entropy is similar to approximate entropy but is more consistent in estimating the complexity even for smaller time series. For example, a random time series with fewer data points can have a lower 'approximate entropy' than a more 'regular' time series, whereas a longer random time series will have a higher 'approximate entropy'.
For more info, see: Time Series Analysis in Python: A Comprehensive Guide with Examples.
Info
To say that data is 'regular' is to say that the data points are evenly spaced, regularly collected, and not missing (i.e. the series does not contain excessive NA values). It is not always necessary to conduct the test for regularity on automatically collected data (such as energy prices or daily temperature); however, if the data was collected manually, then it is highly recommended. If the data does not meet the requirements of regularity, it is necessary to return to the data collection plan and revise the methodology used.
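For instance, such a check can be sketched in a few lines of pandas. This snippet is illustrative only and not part of the package; the DataFrame `df` and the 5% NA threshold are hypothetical:

```python
import pandas as pd

# Hypothetical example: a datetime-indexed frame with a single `value` column
df = pd.DataFrame(
    {"value": [1.0, 2.0, None, 4.0]},
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)

# Evenly spaced? All consecutive time deltas should be identical.
deltas = df.index.to_series().diff().dropna()
evenly_spaced = deltas.nunique() == 1

# Excessive missing values? Flag if more than (say) 5% of points are NA.
na_ratio = df["value"].isna().mean()

print(f"evenly spaced: {evenly_spaced}, NA ratio: {na_ratio:.1%}")
```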
| library | category | algorithm | short | import script | url |
|---|---|---|---|---|---|
| antropy | Regularity | Approximate Entropy | AppEn | `from antropy import app_entropy` | https://raphaelvallat.com/antropy/build/html/generated/antropy.app_entropy.html |
| antropy | Regularity | Sample Entropy | SampEn | `from antropy import sample_entropy` | https://raphaelvallat.com/antropy/build/html/generated/antropy.sample_entropy.html |
| antropy | Regularity | Permutation Entropy | PermEn | `from antropy import perm_entropy` | https://raphaelvallat.com/antropy/build/html/generated/antropy.perm_entropy.html |
| antropy | Regularity | Spectral Entropy | SpecEn | `from antropy import spectral_entropy` | https://raphaelvallat.com/antropy/build/html/generated/antropy.spectral_entropy.html |
| antropy | Regularity | SVD Entropy | SvdEn | `from antropy import svd_entropy` | https://raphaelvallat.com/antropy/build/html/generated/antropy.svd_entropy.html |
For more info, see: The Future of Australian Energy Prices: Time-Series Analysis of Historic Prices and Forecast for Future Prices.
Source Library
The AntroPy package was chosen because it provides well-tested and efficient implementations of approximate entropy, sample entropy, and related complexity measures for time-series data, is built on top of the scientific Python stack (NumPy/SciPy), and is actively maintained and open source, making it a reliable choice for reproducible statistical analysis.
Source Module
All of the source code can be found within the modules:
## Modules

### ts_stat_tests.regularity.tests
Summary
This module contains convenience functions and tests for regularity measures, allowing for easy access to different entropy algorithms.
#### entropy

```python
entropy(
    x: ArrayLike,
    algorithm: str = "sample",
    order: int = 2,
    metric: VALID_KDTREE_METRIC_OPTIONS = "chebyshev",
    sf: float = 1,
    normalize: bool = True,
) -> Union[float, NDArray[np.float64]]
```
Summary
Test for the entropy of a given data set.
Details
This function is a convenience wrapper around the five underlying algorithms:
- approx_entropy()
- sample_entropy()
- spectral_entropy()
- permutation_entropy()
- svd_entropy()
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `ArrayLike` | The data to be checked. Should be a one-dimensional time series. | *required* |
| `algorithm` | `str` | Which entropy algorithm to use. | `'sample'` |
| `order` | `int` | Embedding dimension. | `2` |
| `metric` | `VALID_KDTREE_METRIC_OPTIONS` | Name of the distance metric function used with `KDTree`. | `'chebyshev'` |
| `sf` | `float` | Sampling frequency, in Hz. | `1` |
| `normalize` | `bool` | If `True`, normalize the returned entropy value. | `True` |

Raises:

| Type | Description |
|---|---|
| `ValueError` | When the given value for `algorithm` is not a valid option. |

Returns:

| Type | Description |
|---|---|
| `Union[float, NDArray[float64]]` | The calculated entropy value. |
Credit
All credit goes to the AntroPy library.
Examples
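The following is a minimal sketch of these examples against a synthetic series; the non-default `algorithm` strings (`'approximate'`, `'spectral'`) are assumptions inferred from the names of the wrapped functions:

```python
import numpy as np

from ts_stat_tests.regularity.tests import entropy

# Setup: a noisy sine wave as a stand-in for a real time series
rng = np.random.default_rng(42)
x = np.sin(np.linspace(0, 20 * np.pi, 500)) + rng.normal(0, 0.1, 500)

# Example 1: Sample Entropy (the default algorithm)
print(entropy(x, algorithm="sample"))

# Example 2: Approximate Entropy
print(entropy(x, algorithm="approximate"))

# Example 3: Spectral Entropy (sf and normalize are relevant here)
print(entropy(x, algorithm="spectral", sf=1, normalize=True))
```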
References
- Richman, J. S. et al. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039-H2049.
- https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.DistanceMetric.html
- Inouye, T. et al. (1991). Quantification of EEG irregularity by use of the entropy of the power spectrum. Electroencephalography and clinical neurophysiology, 79(3), 204-210.
- https://en.wikipedia.org/wiki/Spectral_density
- https://en.wikipedia.org/wiki/Welch%27s_method
Source code in src/ts_stat_tests/regularity/tests.py
#### regularity

```python
regularity(
    x: ArrayLike,
    algorithm: str = "sample",
    order: int = 2,
    metric: VALID_KDTREE_METRIC_OPTIONS = "chebyshev",
    sf: float = 1,
    normalize: bool = True,
) -> Union[float, NDArray[np.float64]]
```
Summary
Test for the regularity of a given data set.
Details
This is a pass-through convenience wrapper around the `entropy()` function.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `ArrayLike` | The data to be checked. Should be a one-dimensional time series. | *required* |
| `algorithm` | `str` | Which entropy algorithm to use. | `'sample'` |
| `order` | `int` | Embedding dimension. | `2` |
| `metric` | `VALID_KDTREE_METRIC_OPTIONS` | Name of the distance metric function used with `KDTree`. | `'chebyshev'` |
| `sf` | `float` | Sampling frequency, in Hz. | `1` |
| `normalize` | `bool` | If `True`, normalize the returned entropy value. | `True` |

Returns:

| Type | Description |
|---|---|
| `Union[float, NDArray[float64]]` | The calculated regularity (entropy) value. |
Credit
All credit goes to the AntroPy library.
Examples
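Because `regularity()` simply forwards to `entropy()`, a minimal sketch mirrors the calls above (again, the non-default `algorithm` strings are assumptions):

```python
import numpy as np

from ts_stat_tests.regularity.tests import regularity

rng = np.random.default_rng(42)
x = np.sin(np.linspace(0, 20 * np.pi, 500)) + rng.normal(0, 0.1, 500)

# Example 1: Sample Entropy (the default)
print(regularity(x))

# Example 2: Approximate Entropy
print(regularity(x, algorithm="approximate"))

# Example 3: Spectral Entropy
print(regularity(x, algorithm="spectral", sf=1, normalize=True))
```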
References
- Richman, J. S. et al. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039-H2049.
- https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.DistanceMetric.html
- Inouye, T. et al. (1991). Quantification of EEG irregularity by use of the entropy of the power spectrum. Electroencephalography and clinical neurophysiology, 79(3), 204-210.
- https://en.wikipedia.org/wiki/Spectral_density
- https://en.wikipedia.org/wiki/Welch%27s_method
Source code in src/ts_stat_tests/regularity/tests.py
#### is_regular

```python
is_regular(
    x: ArrayLike,
    algorithm: str = "sample",
    order: int = 2,
    sf: float = 1,
    metric: VALID_KDTREE_METRIC_OPTIONS = "chebyshev",
    normalize: bool = True,
    tolerance: Union[str, float, int, None] = "default",
) -> dict[str, Union[str, float, bool]]
```
Summary
Test whether a given data set is regular or not.
Details
This function implements the given algorithm (defined in the `algorithm` parameter) and returns a dictionary containing the relevant data:

```python
{
    "result": ...,     # The result of the test: `True` if `entropy < tolerance`, `False` otherwise
    "entropy": ...,    # A `float` value: the result of the `entropy()` function
    "tolerance": ...,  # A `float` value: the tolerance used to determine whether the `entropy` is regular
}
```
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `ArrayLike` | The data to be checked. Should be a one-dimensional time series. | *required* |
| `algorithm` | `str` | Which entropy algorithm to use. | `'sample'` |
| `order` | `int` | Embedding dimension. | `2` |
| `metric` | `VALID_KDTREE_METRIC_OPTIONS` | Name of the distance metric function used with `KDTree`. | `'chebyshev'` |
| `sf` | `float` | Sampling frequency, in Hz. | `1` |
| `normalize` | `bool` | If `True`, normalize the returned entropy value. | `True` |
| `tolerance` | `Union[str, float, int, None]` | The tolerance value used to determine whether or not the result is regular. If an invalid value is given, a `ValueError` will be raised. Defaults to `"default"`. | `'default'` |

Raises:

| Type | Description |
|---|---|
| `ValueError` | When the given value for `tolerance` is not a valid option. |

Returns:

| Type | Description |
|---|---|
| `dict[str, Union[str, float, bool]]` | A dictionary containing the test results: `result`, `entropy`, and `tolerance`. |
Credit
All credit goes to the AntroPy library.
Examples
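A minimal sketch of these examples, using a synthetic series; the returned dictionary keys follow the structure documented in the Details section above:

```python
import numpy as np

from ts_stat_tests.regularity.tests import is_regular

rng = np.random.default_rng(42)
x = np.sin(np.linspace(0, 20 * np.pi, 500)) + rng.normal(0, 0.1, 500)

# Example 1: Sample Entropy with the default tolerance
result = is_regular(x, algorithm="sample")
print(result["result"], result["entropy"], result["tolerance"])

# Example 2: Approximate Entropy with an explicit numeric tolerance
# (the "approximate" string is an assumption inferred from the wrapped function)
print(is_regular(x, algorithm="approximate", tolerance=0.5))
```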
References
- Richman, J. S. et al. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039-H2049.
- https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.DistanceMetric.html
- Inouye, T. et al. (1991). Quantification of EEG irregularity by use of the entropy of the power spectrum. Electroencephalography and clinical neurophysiology, 79(3), 204-210.
- https://en.wikipedia.org/wiki/Spectral_density
- https://en.wikipedia.org/wiki/Welch%27s_method
Source code in src/ts_stat_tests/regularity/tests.py
### ts_stat_tests.regularity.algorithms
Summary
This module contains algorithms to compute regularity measures for time series data, including approximate entropy, sample entropy, spectral entropy, and permutation entropy.
#### VALID_KDTREE_METRIC_OPTIONS (module attribute)

```python
VALID_KDTREE_METRIC_OPTIONS = Literal[
    "euclidean",
    "l2",
    "minkowski",
    "p",
    "manhattan",
    "cityblock",
    "l1",
    "chebyshev",
    "infinity",
]
```
#### VALID_SPECTRAL_ENTROPY_METHOD_OPTIONS (module attribute)

```python
VALID_SPECTRAL_ENTROPY_METHOD_OPTIONS = Literal["fft", "welch"]
```
#### approx_entropy

```python
approx_entropy(
    x: ArrayLike,
    order: int = 2,
    tolerance: Optional[float] = None,
    metric: VALID_KDTREE_METRIC_OPTIONS = "chebyshev",
) -> float
```
Summary
Approximate entropy is a measure of the amount of regularity or predictability in a time series. It is used to quantify the degree of self-similarity of a signal over different time scales, and can be useful for detecting underlying patterns or trends in data.
This function implements the app_entropy() function from the AntroPy library.
Details
Approximate entropy is a technique used to quantify the amount of regularity and the unpredictability of fluctuations over time-series data. Smaller values indicate that the data is more regular and predictable.
To calculate approximate entropy, we first need to define a window size or scale factor, which determines the length of the subsequences that are used to compare the similarity of the time series. We then compare all possible pairs of subsequences within the time series and calculate the probability that two subsequences are within a certain tolerance level of each other, where the tolerance level is usually expressed as a percentage of the standard deviation of the time series.
The approximate entropy is then defined as the negative natural logarithm of the average probability of similarity across all possible pairs of subsequences, normalized by the length of the time series and the scale factor.
The approximate entropy measure is useful in a variety of applications, such as the analysis of physiological signals, financial time series, and climate data. It can be used to detect changes in the regularity or predictability of a time series over time, and can provide insights into the underlying dynamics or mechanisms that generate the signal.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `ArrayLike` | One-dimensional time series of shape `(n_times,)`. | *required* |
| `order` | `int` | Embedding dimension. | `2` |
| `tolerance` | `Optional[float]` | Tolerance level or similarity criterion. If `None`, a default proportional to the standard deviation of `x` is used. | `None` |
| `metric` | `VALID_KDTREE_METRIC_OPTIONS` | Name of the distance metric function used with `KDTree`. | `'chebyshev'` |

Returns:

| Type | Description |
|---|---|
| `float` | The approximate entropy score. |
Examples
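A minimal sketch of these examples; a synthetic trend-plus-seasonality series stands in for the airline-passengers data:

```python
import numpy as np

from ts_stat_tests.regularity.algorithms import approx_entropy

# Example 1: a regular, seasonal series (stand-in for the monthly
# airline-passengers data: 144 points with yearly seasonality)
months = np.arange(144)
airline_like = 100 + 2 * months + 20 * np.sin(2 * np.pi * months / 12)
print(approx_entropy(airline_like, order=2))  # regular -> low ApEn

# Example 2: random data
rng = np.random.default_rng(123)
print(approx_entropy(rng.normal(size=144), order=2))  # irregular -> higher ApEn
```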
Calculation
The equation for ApEn is:

$$ApEn(m, r, N) = \phi_m(r) - \phi_{m+1}(r)$$

where:

- \(m\) is the embedding dimension,
- \(r\) is the tolerance or similarity criterion,
- \(N\) is the length of the time series, and
- \(\phi_m(r)\) and \(\phi_{m+1}(r)\) are the logarithms of the probabilities that two sequences of \(m\) data points in the time series that are similar to each other within a tolerance \(r\) remain similar for the next data point, for \(m\) and \(m+1\), respectively.
Notes
- Inputs: `x` is a 1-dimensional array. It represents time-series data, ideally with each element in the array being a measurement or value taken at regular time intervals.
- Settings: `order` determines the length of the subsequences that are compared. If the embedding dimension is too small, we may miss important patterns; if it is too large, we may overfit noise.
- Metric: the Chebyshev metric is often used because it is a robust and computationally efficient way to measure the distance between two time series.
Credit
All credit goes to the AntroPy library.
Source code in src/ts_stat_tests/regularity/algorithms.py
#### sample_entropy

```python
sample_entropy(
    x: ArrayLike,
    order: int = 2,
    tolerance: Optional[float] = None,
    metric: VALID_KDTREE_METRIC_OPTIONS = "chebyshev",
) -> float
```
Summary
Sample entropy is a measure of the amount of regularity or predictability in a time series. It is used to quantify the degree of self-similarity of a signal over different time scales, and can be useful for detecting underlying patterns or trends in data.
This function implements the sample_entropy() function from the AntroPy library.
Details
Sample entropy is a modification of approximate entropy, used for assessing the complexity of physiological time-series signals. It has two advantages over approximate entropy: data length independence and a relatively trouble-free implementation. Large values indicate high complexity whereas smaller values characterize more self-similar and regular signals.
The value of SampEn ranges from zero (\(0\)) to infinity (\(\infty\)), with lower values indicating higher regularity or predictability in the time series. A time series with high \(SampEn\) is more unpredictable or irregular, whereas a time series with low \(SampEn\) is more regular or predictable.
Sample entropy is often used in time series forecasting to assess the complexity of the data and to determine whether a time series is suitable for modeling with a particular forecasting method, such as ARIMA or neural networks.
Choosing an appropriate embedding dimension is crucial in ensuring that the sample entropy calculation is robust and reliable, and captures the essential features of the time series in a meaningful way. This allows us to make more accurate and informative inferences about the behavior of the system that generated the data, and can be useful in a wide range of applications, from signal processing to data analysis and beyond.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `ArrayLike` | One-dimensional time series of shape `(n_times,)`. | *required* |
| `order` | `int` | Embedding dimension. | `2` |
| `tolerance` | `Optional[float]` | Tolerance level or similarity criterion. If `None`, a default proportional to the standard deviation of `x` is used. | `None` |
| `metric` | `VALID_KDTREE_METRIC_OPTIONS` | Name of the distance metric function used with `KDTree`. | `'chebyshev'` |

Returns:

| Type | Description |
|---|---|
| `float` | The sample entropy score. |
Examples
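As above, a minimal sketch with a synthetic stand-in for the airline-passengers data:

```python
import numpy as np

from ts_stat_tests.regularity.algorithms import sample_entropy

# Example 1: a regular, seasonal series -> low SampEn
months = np.arange(144)
airline_like = 100 + 2 * months + 20 * np.sin(2 * np.pi * months / 12)
print(sample_entropy(airline_like, order=2))

# Example 2: random data -> high SampEn
rng = np.random.default_rng(123)
print(sample_entropy(rng.normal(size=144), order=2))
```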
Calculation
The equation for sample entropy (SampEn) is as follows:

$$SampEn(m, r, N) = -\ln \frac{C_{m+1}(r)}{C_m(r)}$$

where:

- \(m\) is the embedding dimension,
- \(r\) is the tolerance or similarity criterion,
- \(N\) is the length of the time series, and
- \(C_m(r)\) and \(C_{m+1}(r)\) are the number of \(m\)-tuples (vectors of \(m\) consecutive data points) that have a distance less than or equal to \(r\), and \((m+1)\)-tuples with the same property, respectively.
The calculation of sample entropy involves the following steps:
- Choose the values of \(m\) and \(r\).
- Construct \(m\)-tuples from the time series data.
- Compute the number of \(m\)-tuples that are within a distance \(r\) of each other (\(C_m(r)\)).
- Compute the number of \((m+1)\)-tuples that are within a distance \(r\) of each other (\(C_{m+1}(r)\)).
- Compute the value of \(SampEn\) using the formula above.
Notes
- Note that if `metric == 'chebyshev'` and `len(x) < 5000` points, then the sample entropy is computed using a fast custom Numba script. For other distance metrics or longer time series, the sample entropy is computed using code from the `mne-features` package by Jean-Baptiste Schiratti and Alexandre Gramfort (requires `sklearn`).
- The embedding dimension is important in the calculation of sample entropy because it affects the sensitivity of the measure to different patterns in the data. If the embedding dimension is too small, we may miss important patterns or variations. If it is too large, we may overfit the data.
Credit
All credit goes to the AntroPy library.
Source code in src/ts_stat_tests/regularity/algorithms.py
#### permutation_entropy

```python
permutation_entropy(
    x: ArrayLike,
    order: int = 3,
    delay: Union[int, list, NDArray[int64]] = 1,
    normalize: bool = False,
) -> float
```
Summary
Permutation entropy is a measure of the complexity or randomness of a time series. It is based on the idea of permuting the order of the values in the time series and calculating the entropy of the resulting permutation patterns.
This function implements the perm_entropy() function from the AntroPy library.
Details
The permutation entropy is a complexity measure for time-series first introduced by Bandt and Pompe in 2002.
It is particularly useful for detecting nonlinear dynamics and nonstationarity in the data. The value of permutation entropy ranges from \(0\) to \(\log_2(\text{order}!)\), where the lower bound is attained for an increasing or decreasing sequence of values, and the upper bound for a completely random system where all possible permutations appear with the same probability.
Choosing an appropriate embedding dimension is crucial in ensuring that the permutation entropy calculation is robust and reliable, and captures the essential features of the time series in a meaningful way.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `ArrayLike` | One-dimensional time series of shape `(n_times,)`. | *required* |
| `order` | `int` | Order of permutation entropy. | `3` |
| `delay` | `Union[int, list, NDArray[int64]]` | Time delay (lag). If multiple values are passed, the average permutation entropy across all these delays is calculated. | `1` |
| `normalize` | `bool` | If `True`, divide by `log2(order!)` to normalize the entropy between 0 and 1. | `False` |

Returns:

| Type | Description |
|---|---|
| `Union[float, NDArray[float64]]` | The permutation entropy of the data set. |
Examples
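A minimal sketch of these examples, again substituting a synthetic series for the airline-passengers data:

```python
import numpy as np

from ts_stat_tests.regularity.algorithms import permutation_entropy

# Example 1: a regular, seasonal series (stand-in for the airline-passengers data)
months = np.arange(144)
airline_like = 100 + 2 * months + 20 * np.sin(2 * np.pi * months / 12)
print(permutation_entropy(airline_like, order=3, delay=1))

# Example 2: random data, normalized so the value lies in [0, 1];
# a purely random series should come out close to 1
rng = np.random.default_rng(0)
print(permutation_entropy(rng.normal(size=1000), order=3, normalize=True))
```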
Calculation
The formula for permutation entropy (\(PE\)) is as follows:

$$PE = -\sum_{i=1}^{n!} p(i) \log_2 p(i)$$

where:

- \(n\) is the embedding dimension (`order`),
- \(p(i)\) is the probability of the \(i\)-th ordinal pattern.

The embedded matrix \(Y\) is created by stacking the delay vectors

$$y_j = \left[ x_j,\; x_{j+\tau},\; \ldots,\; x_{j+(n-1)\tau} \right]$$

where \(\tau\) is the `delay`.
Notes
- The embedding dimension (`order`) determines the number of values used to construct each permutation pattern. If too small, patterns may be missed. If too large, overfitting to noise may occur.
Credit
All credit goes to the AntroPy library.
Source code in src/ts_stat_tests/regularity/algorithms.py
#### spectral_entropy

```python
spectral_entropy(
    x: ArrayLike,
    sf: float = 1,
    method: VALID_SPECTRAL_ENTROPY_METHOD_OPTIONS = "fft",
    nperseg: Optional[int] = None,
    normalize: bool = False,
    axis: int = -1,
) -> Union[float, NDArray[np.float64]]
```
Summary
Spectral entropy is a measure of the amount of complexity or unpredictability in a signal's frequency domain representation. It is used to quantify the degree of randomness or regularity in the power spectrum of a signal.
This function implements the spectral_entropy() function from the AntroPy library.
Details
Spectral Entropy is defined to be the Shannon entropy of the power spectral density (PSD) of the data. It is based on the Shannon entropy, which is a measure of the uncertainty or information content of a probability distribution.
The value of spectral entropy ranges from \(0\) to \(\log_2(N)\), where \(N\) is the number of frequency bands. Lower values indicate a more concentrated or regular distribution of power, while higher values indicate a more spread-out or irregular distribution.
Spectral entropy is particularly useful for detecting periodicity and cyclical patterns, as well as changes in the frequency distribution over time.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `ArrayLike` | One-dimensional or N-dimensional data array. | *required* |
| `sf` | `float` | Sampling frequency, in Hz. | `1` |
| `method` | `VALID_SPECTRAL_ENTROPY_METHOD_OPTIONS` | Spectral estimation method: `'fft'` or `'welch'`. | `'fft'` |
| `nperseg` | `Optional[int]` | Length of each FFT segment for the Welch method. If `None`, the underlying default segment length is used. | `None` |
| `normalize` | `bool` | If `True`, normalize the spectral entropy to lie between 0 and 1. | `False` |
| `axis` | `int` | The axis along which the entropy is calculated. Default is the last axis. | `-1` |

Returns:

| Type | Description |
|---|---|
| `Union[float, NDArray[float64]]` | The spectral entropy score. Returned as a float for 1D input, or a numpy array for N-dimensional input. |
Examples
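A minimal sketch of these examples with a synthetic seasonal series; note that `nperseg` must not exceed the series length when using Welch's method:

```python
import numpy as np

from ts_stat_tests.regularity.algorithms import spectral_entropy

# Example 1: a strongly seasonal series (stand-in for the airline-passengers
# data); power concentrates at the seasonal frequency, so the normalized
# spectral entropy is comparatively low
months = np.arange(144)
airline_like = 100 + 2 * months + 20 * np.sin(2 * np.pi * months / 12)
print(spectral_entropy(airline_like, sf=1, method="fft", normalize=True))

# Example 2: Welch method for spectral entropy, with an explicit segment length
print(spectral_entropy(airline_like, sf=1, method="welch", nperseg=64, normalize=True))
```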
Calculation
The spectral entropy (\(SE\)) is defined as:

$$SE = -\sum_{i} P(i) \log_2 P(i)$$

where:

- \(P(i)\) is the normalized power spectral density (PSD) at the \(i\)-th frequency band,
- \(f_s\) is the sampling frequency; the sum runs over all frequency bands up to the Nyquist frequency \(f_s/2\).
Notes
- The power spectrum represents the energy of the signal at different frequencies. High spectral entropy indicates multiple sources or processes with different frequencies, while low spectral entropy suggests a dominant frequency or periodicity.
Credit
All credit goes to the AntroPy library.
Source code in src/ts_stat_tests/regularity/algorithms.py
#### svd_entropy

```python
svd_entropy(
    x: ArrayLike,
    order: int = 3,
    delay: int = 1,
    normalize: bool = False,
) -> float
```
Summary
SVD entropy is a measure of the complexity or randomness of a time series based on Singular Value Decomposition (SVD).
This function implements the svd_entropy() function from the AntroPy library.
Details
SVD entropy is calculated by first embedding the time series into a matrix, then performing SVD on that matrix to obtain the singular values. The entropy is then calculated from the normalized singular values.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `ArrayLike` | One-dimensional time series of shape `(n_times,)`. | *required* |
| `order` | `int` | Order of the SVD entropy (embedding dimension). | `3` |
| `delay` | `int` | Time delay (lag). | `1` |
| `normalize` | `bool` | If `True`, normalize the entropy to lie between 0 and 1. | `False` |

Returns:

| Type | Description |
|---|---|
| `float` | The SVD entropy of the data set. |
Examples
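A minimal sketch of the example, contrasting a random series with a smooth deterministic one:

```python
import numpy as np

from ts_stat_tests.regularity.algorithms import svd_entropy

# Example 1: normalized SVD entropy of a purely random series (close to 1)
rng = np.random.default_rng(7)
print(svd_entropy(rng.normal(size=500), order=3, delay=1, normalize=True))

# For comparison: a smooth deterministic series scores much lower
x = np.sin(np.linspace(0, 10 * np.pi, 500))
print(svd_entropy(x, order=3, delay=1, normalize=True))
```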
Calculation
The SVD entropy is calculated as the Shannon entropy of the singular values of the embedded matrix:

$$H_{SVD} = -\sum_{i} \bar{\sigma}_i \log_2 \bar{\sigma}_i$$

where \(\bar{\sigma}_i\) are the singular values normalized to sum to \(1\).
Notes
- Singular Value Decomposition (SVD) is a factorization of a real or complex matrix. It is the generalization of the eigendecomposition of a positive semidefinite normal matrix.
Credit
All credit goes to the AntroPy library.
Source code in src/ts_stat_tests/regularity/algorithms.py