This is a supporting page for the paper:

Linear Time Motif Discovery in Time Series

Introduction:

Discovering motifs (repeated patterns) is an important task in time series data mining. The task can be formulated as finding the most similar non-overlapping pair of subsequences in a given time series. Existing exact motif discovery methods have quadratic time complexities in the length of the time series. In this work, we present an algorithm that can find the exact motif of a given time series in a linear expected time complexity. The algorithm is further modified to find all pairs of subsequences whose distances are below a given threshold value. In practice if true motifs exist in the data or the threshold is set to a small value, the algorithms are very fast. The proof of correctness and time complexity are detailed and experiments are conducted to verify the analysis. The proposed method is also applied to analyze the real-world bird sound and electrical consumption data to demonstrate its effectiveness.

Supplement Material (source code, experiment data):

GitHub