Replies: 11 comments
-
@Jacks349 Thank you for your question and for your kind words. Firstly, since you are working with market data, I feel obliged to explicitly state that I am not a financial consultant. What I write here only pertains to the general usage of STUMPY for analyzing time series data and should never be taken or interpreted as financial advice. Past performance is not indicative of future results and you should consult with a registered investment advisor.
In short, this is the ideal use case for STUMPY. In the most general case where you have a time series and do not know if a pattern/motif even exists then you should use In this latter case, when you already have a pattern (a subsequence with a particular shape) in hand, then you'll want to use a different function called
Here, the If you haven't already, I strongly recommend reading this tutorial, which explains the differences between a "matrix profile" and a "distance profile". Let me know if this makes sense.
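For anyone who wants a feel for what a distance profile is before opening the tutorial, here is a minimal, naive sketch in pure Python. It is not STUMPY's implementation (which is much faster); the helper names `z_normalize` and `distance_profile` and the toy numbers are made up for illustration, and the handling of constant windows is a simplification.

```python
import math


def z_normalize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        # Simplification for this sketch: a constant window has no shape to compare.
        return [0.0] * n
    return [(v - mean) / std for v in values]


def distance_profile(query, series):
    """Naive O(n*m) z-normalized Euclidean distance from query to every window."""
    m = len(query)
    q = z_normalize(query)
    profile = []
    for start in range(len(series) - m + 1):
        window = z_normalize(series[start:start + m])
        profile.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(q, window))))
    return profile


# Toy data (invented): the query's shape occurs at index 3, scaled by 10,
# which z-normalization ignores.
query = [1.0, 3.0, 2.0]
series = [5.0, 5.5, 4.0, 10.0, 30.0, 20.0, 7.0, 6.0]

profile = distance_profile(query, series)
best = min(range(len(profile)), key=profile.__getitem__)
print(len(profile), best)  # prints: 6 3
```

The profile has one entry per window of the series, so its length is `len(series) - len(query) + 1`, and the smallest entry marks the window whose shape is closest to the query.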
-
Also, note that we have a PR (work-in-progress) that should make this type of search easier.
-
@seanlaw I'm sorry for sounding naive again, but I'm just getting started with these matters and I'm having some trouble understanding the output of the distance profile: Printing
I understand the concept of z-normalized distances, and I also understand that the lowest values are what I'm looking for; the trouble I'm having is understanding how to go back to the "target" dataset. I mean, once I've computed the distance profile, how do I tell, from it, which parts of my target dataset (that in this case is
-
So, conceptually, the
So, the
Given that the smallest value in your
is the "closest" (by z-normalized Euclidean distance) to your Does that make sense? If not, feel free to ask for more clarification where necessary. I'd be happy to elaborate more.
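The index lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the `target` values and the `profile` numbers are invented, and `m` stands for the length of the query pattern.

```python
# Hypothetical distance profile for a query of length m = 3 against a
# target of length 8 (all numbers invented for illustration).
target = [5.0, 5.5, 4.0, 10.0, 30.0, 20.0, 7.0, 6.0]
m = 3
profile = [2.1, 2.9, 1.7, 0.0, 3.2, 2.4]

# There is one distance per window, so len(profile) == len(target) - m + 1.
assert len(profile) == len(target) - m + 1

# Position of the smallest distance = start of the best-matching window.
start = min(range(len(profile)), key=profile.__getitem__)
stop = start + m  # exclusive end of the window
match = target[start:stop]
print(f"best match: target[{start}:{stop}] = {match}")
# prints: best match: target[3:6] = [10.0, 30.0, 20.0]
```

In other words, entry `i` of the distance profile always refers to the window of the target that starts at index `i` and is as long as the query.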
-
Of course this makes sense! This is exactly what I was looking for. Even though I'm a complete newbie to all of this, I'm amazed at how much you can do with this library. I'll make sure to ask any other noob questions I might have! A little feedback about the docs: the visualizations in the article you linked are very helpful; they make everything easier to understand.
-
@Jacks349 Thank you for the feedback. The tutorials take quite a bit of time and effort to make but we try very hard to communicate the concepts as simply as possible. We are always open to being better. I'd love to learn more about how you are planning to leverage STUMPY. Perhaps, you'd like to connect on LinkedIn. Happy data mining!
-
Basically my idea was to have a series of patterns specified by myself and a target dataset. What I'm trying to do is make a script that checks for each of the patterns specified by me in the "target" dataset, and for each it needs to find where the "shape" is most similar and give an output like "detected x pattern from y1 to y2". So what I did is:
So for now I'll use the function you mentioned and in the meantime I'll keep going through the docs to see if there are/will be other functions that can help me with this. Cheers! EDIT: I'm sorry if I'm a bit slow on this, but I have one more doubt:
And the smallest value is 2.53, which is at index 14. So that means, if I understand correctly, that the most similar subsequence is from index 14 to index 21 of my And the target is Shouldn't the most similar subsequence be at index 8, since at that index there will be exactly the same values of
-
Sorry, you are absolutely right! In my haste, I misread the values from your
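To double-check the indexing, here is a small sketch that runs the `Pattern` and `Data` lists from the original question through a naive z-normalized distance profile and prints a message in the "detected x pattern from y1 to y2" style. It is not STUMPY's implementation; the helper names and the `patterns` dictionary are assumptions made for this example, and `math.dist` requires Python 3.8 or later.

```python
import math


def z_normalize(values):
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    # Simplification: a constant window is mapped to all zeros.
    return [(v - mean) / std if std else 0.0 for v in values]


def distance_profile(query, series):
    """Naive z-normalized Euclidean distance from query to every window."""
    m = len(query)
    q = z_normalize(query)
    return [
        math.dist(q, z_normalize(series[i:i + m]))
        for i in range(len(series) - m + 1)
    ]


# Values copied from the original question.
patterns = {
    "Pattern": [7.602339181286544, 3.5054347826086927, -5.198214754528746,
                4.7078371642204315, -2.9357312880190425, 2.098092643051778,
                -0.5337603416066172],
}
data = [2.1502119927316805, -2.282834272161288, -3.00364077669902,
        2.533625273694082, -2.2574740695546116, 3.027465667915112,
        6.4222962738564, -2.647309991460278, 7.602339181286544,
        3.5054347826086927, -5.198214754528746, 4.7078371642204315,
        -2.9357312880190425, 2.098092643051778, -0.5337603416066172,
        4.212503353903944, -2.600411946446969, 8.511763150938416,
        -3.775883069427527, 1.8227848101265856, 3.6300348085529524,
        -1.4635316698656395, 5.527148770392016, -1.476695892939546,
        12.248243559718961, -4.443980805341117, 1.9213973799126631,
        -9.061696658097686, 5.347467608951697, -2.8622540250447197,
        1.3121546961326038]

detections = {}
for name, pattern in patterns.items():
    profile = distance_profile(pattern, data)
    start = min(range(len(profile)), key=profile.__getitem__)
    end = start + len(pattern) - 1  # inclusive last index of the match
    detections[name] = (start, end, profile[start])
    print(f"detected {name} from {start} to {end} (distance {profile[start]:.2f})")
# prints: detected Pattern from 8 to 14 (distance 0.00)
```

Because `Data[8:15]` contains exactly the values of `Pattern`, the distance at index 8 is zero and the match covers indices 8 through 14. This sketch only reports the single best match per pattern; a real script would probably also want a distance threshold so that poor matches are not reported as detections.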
-
Thank you a lot! Everything is clearer now!
-
@Jacks349 I also highly recommend this resource from the original matrix profile authors that outlines different things that you can do with the matrix profile. I think this will inspire other ideas for your exploration.
-
@seanlaw
-
I was doing some research on pattern recognition with Python and I found this library. First of all, I think it's very interesting, so good job! I'm just getting started with this, so I apologize if this question is naive.
Second, I wanted to ask if STUMPY is the right choice for what I want to do:
Basically, I'm trying to detect patterns in OHLC trading data. It works like this: I have a specific pattern with a specific shape defined by myself and an external dataset. What I want to do is check if and where, in the dataset, there is a shape which is similar to the pattern I specified.
Here is how I did it:
The pattern specified by myself is a series of local minima and maxima normalized in terms of percentage that, when charted, has a certain shape:
Pattern = [7.602339181286544, 3.5054347826086927, -5.198214754528746, 4.7078371642204315, -2.9357312880190425, 2.098092643051778, -0.5337603416066172]
Then I have the total dataset, which is a set of OHLC data normalized into maxima and minima too:
Data = [2.1502119927316805, -2.282834272161288, -3.00364077669902, 2.533625273694082, -2.2574740695546116, 3.027465667915112, 6.4222962738564, -2.647309991460278, 7.602339181286544, 3.5054347826086927, -5.198214754528746, 4.7078371642204315, -2.9357312880190425, 2.098092643051778, -0.5337603416066172, 4.212503353903944, -2.600411946446969, 8.511763150938416, -3.775883069427527, 1.8227848101265856, 3.6300348085529524, -1.4635316698656395, 5.527148770392016, -1.476695892939546, 12.248243559718961, -4.443980805341117, 1.9213973799126631, -9.061696658097686, 5.347467608951697, -2.8622540250447197, 1.3121546961326038]
So basically, what I need to do is check in which parts Data takes a similar shape to Pattern. That said, my questions are: is this a use case for STUMPY? And if it is, which function should I use? In the examples, the dataset is checked against itself, while in this case I'm checking two different datasets. I tried the following:
mp = stumpy.stump(Data, len(Pattern), Pattern, ignore_trivial=False)
But got the following output: