A new complexity measure for time series analysis and classification
Publication Type:Journal Articles
Source:European Physical Journal - Special Topics, EDP Sciences/Springer-Verlag, Volume 222, p.847–860 (2013)
Complexity measures are used in a number of applications such as extraction of information from data such as ecological time series, detection of non-random structure in biomedical signals, testing of random number generators, language recognition and authorship attribution etc. Different complexity measures proposed in the literature like Shannon entropy, Relative entropy, Lempel-Ziv, Kolmogrov and Algorithmic complexity are mostly ineffective in analyzing short sequences that are further corrupted with noise. To address this problem, we propose a new complexity measure ETC and define it as the ?Effort To Compress? the input sequence by a lossless compression algorithm. Here, we employ the lossless compression algorithm known as Non-Sequential Recursive Pair Substitution (NSRPS) and define ETC as the number of iterations needed for NSRPS to transform the input sequence to a constant sequence. We demonstrate the utility of ETC in two applications. ETC is shown to have better correlation with Lyapunov exponent than Shannon entropy even with relatively short and noisy time series. The measure also has a greater rate of success in automatic identification and classification of short noisy sequences, compared to entropy and a popular measure based on Lempel-Ziv compression (implemented by Gzip).