Rule discovery from time series

Content

Introduction

Rule Discovery Process

Review on Rule Discovery from Time Series

Problem Statement

Conclusion

 

ppt64 trang | Chia sẻ: tranluankk2 | Lượt xem: 17 | Lượt tải: 0download
Bạn đang xem trước 20 trang tài liệu Rule discovery from time series, để xem tài liệu hoàn chỉnh bạn click vào nút TẢI VỀ ở trên
m time series, 2007 
34 
Review on [Sac07] - Data mining with temporal abstractions: learning rules from time series, 2007 
Complex temporal abstractions 
Used to detect patterns characterized by behaviors which can’t be represented by basic TAs 
Corresponding to patterns whose time intervals are based on temporal relationships defined in Allen’s algebra (13 temporal operators) 
Originating from the same and from different time series 
35 
Review on [Sac07] - Data mining with temporal abstractions: learning rules from time series, 2007 
36 
Review on [Sac07] - Data mining with temporal abstractions: learning rules from time series, 2007 
Allen’s temporal operators on two periods of time 
BEFORE (period1, period2)	1111 1 2 2222 
EQUAL (period1, period2)	11111 
	22222 
MEETS (period1, period2)	1111 12 2222 
OVERLAPS (period1, period2)	11111 
	 22222 
DURING (period1, period2)	 	 11111 
	22222222 
STARTS (period1, period2)	 1 1111 
	 2 2222222 
FINISHES (period1, period2)	 1111 1 
	2222222 2 
TRUE 
TRUE 
TRUE 
TRUE 
TRUE 
TRUE 
TRUE 
37 
Review on [Sac07] - Data mining with temporal abstractions: learning rules from time series, 2007 
Temporal association rule 
Temporal relationships (OVERLAPS, FINISHED-BY, MEETS, BEFORE, EQUALS, STARTS  PRECEDES) between the complex temporal patterns 
Rule format: A  p C 
A: the set {a1, a2, a3, , an}  AoI (set of abstractions of interest) 
p : a triple (LS-left shift, G-gap, RS-right shift) 
C: an abstraction  AoI 
Interpretation: a temporal association rule holds when all the patterns in the antecedent intersect and when the relation PRECEDES holds between the intervals of this intersection and an episode of the pattern in the consequent. 
38 
Review on [Sac07] - Data mining with temporal abstractions: learning rules from time series, 2007 
Definition of the set AoI of abstractions of interest 
39 
Review on [Sac07] - Data mining with temporal abstractions: learning rules from time series, 2007 
40 
Review on [Sac07] - Data mining with temporal abstractions: learning rules from time series, 2007 
41 
Review on [Sac07] - Data mining with temporal abstractions: learning rules from time series, 2007 
42 
Support = RTS / TSO 
RTS: the rule time span corresponding to the union of the episodes in which both the antecedent and the consequent of the rule occur 
TSO: the time span, i.e. the total duration, of the observation period over which the rule is derived 
Interpretation: a measure of how the rule is ‘spread’ over the observation time span 
Confidence = NARTS / NAT 
NARTS: the number of times (episodes) the antecedent occurs during RTS 
NAT: the number of time (episodes) the antecedent occurs during TSO 
Interpretation: the frequency of the rule over the total amount of episodes of the antecedent 
Review on [Sac07] - Data mining with temporal abstractions: learning rules from time series, 2007 
43 
Review on [Sac07] - Data mining with temporal abstractions: learning rules from time series, 2007 
44 
Review on [Sac07] - Data mining with temporal abstractions: learning rules from time series, 2007 
Algorithm for extracting temporal association rules 
45 
Review on [Nam08] - Identification of temporal association rules from time-series microarray data set, 2008 
46 
Discrete values: up, down, and none 
Temporal item: a discrete value + its position in rules + time stamp 
Example: g 1L  is an “up” discrete value at the time point “1” expected to appear on the left hand side of temporal association rules. 
Temporal item set: a non-empty set of temporal items 
Temporal frequent item set: a temporal item set whose support is greater than or equal to a support threshold ( MinSup ) 
Review on [Nam08] - Identification of temporal association rules from time-series microarray data set, 2008 
47 
Rule format: LHS  ( ∆ ) RHS 
LHS & RHS: disjoint temporal frequent item sets. 
∆ (association interval): the interval of different two time stamps of the LHS and RHS. ∆ = 0..n. 
Interpretation: RHS follows LHS after ∆. 
Temporal transaction set: a combination of the transaction at the time t (t=0..n-∆) and the transaction at the time t + ∆. 
Review on [Nam08] - Identification of temporal association rules from time-series microarray data set, 2008 
48 
Support of a temporal item set = the number of occurrences of the temporal item set / the number of temporal transaction sets 
Confidence of a temporal association rule = support (LHS  RHS)/support (LHS) 
Precision = (# of matched rules)/(# of extracted rules) 
A measure of correctness of the extracted rules 
Precision  [0, 1] 
The KEGG cell cycle regulation path (known) 
Review on [Nam08] - Identification of temporal association rules from time-series microarray data set, 2008 
49 
Review on [Nam08] - Identification of temporal association rules from time-series microarray data set, 2008 
a temporal transaction set 
# of temporal transaction sets = 5 
Support ({g 1L  }) = 3/5 = 60% 
Support ({g 2R  }) = 3/5 = 60% 
Support ({g 3R  }) = 3/5 = 60% 
Support ({g 2L  }) = 3/5 = 60% 
Support ({g 1L  , g 2R  }) = 3/5 = 60% 
Support ({g 1L  , g 3R  }) = 1/5 = 20% 
Confidence (g 1   (2) g 2  ) = 
Support ({g 1L  , g 2R  }) / support({g 1L  }) 
Confidence (g 1   (2) g 2  ) = 60%/60% Confidence (g 1   (2) g 2  ) = 1 
50 
Review on [Ha09] - A stock recommendation system exploiting rule discovery in stock databases, 2009 
Rule format: 
H: an event corresponding to an appearance of a pattern P within the time range of the rule head 
B: an event corresponding to the characteristics of stock prices within the time range of the rule body. Investor-specified rule body conditions decide the characteristics of stock prices (INCREASE, DECREASE, UNCHANGED). 
s = support (H) = (number of occurrences of a pattern P corresponding to H)*100/(number of occurrences of patterns whose lengths are same as that of the pattern P corresponding to H) 
c = confidence (H, B) = (number of occurrences of patterns that are matched to H and also satisfied with the condition on B)*100/(number of occurrences of patterns that are matched to H) 
Interpretation: B happens after time t since H has occurred. 
51 
Review on [Ha09] - A stock recommendation system exploiting rule discovery in stock databases, 2009 
Changing patterns of stock prices 
52 
Review on [Ha09] - A stock recommendation system exploiting rule discovery in stock databases, 2009 
Process flow for finding frequent patterns 
53 
Review on [Ha09] - A stock recommendation system exploiting rule discovery in stock databases, 2009 
54 
Review on [Ha09] - A stock recommendation system exploiting rule discovery in stock databases, 2009 
Categorization 
 A symbol sequence S’’ = 
55 
Review on [Ha09] - A stock recommendation system exploiting rule discovery in stock databases, 2009 
The Apriori algorithm to discover frequent patterns 
Discover a set of frequent patterns F 1 of length 1 by scanning S’’ 
Perform the self-join of F k-1 (2<=k<= length(S ’’)) to produce a set of candidate patterns Ck of length k 
For each pattern  in C k , scan S’’ to decide if the support of  is greater than or equal to MinSup 
If yes, then  is inserted into F k . 
56 
Review on [Ha09] - A stock recommendation system exploiting rule discovery in stock databases, 2009 
57 
Review – Summary on Patterns 
Refs 
Meaning 
Length 
Shape 
Overlapping 
Temporal aspect 
Extraction 
[Das98] 
Not specified 
Fixed 
Fixed 
Yes 
No 
Subsequence clustering 
[Au04] 
Not specified 
Fixed 
Fixed 
No 
No 
Fuzzifying 
[Daf05] 
Pre-specified 
Arbitrary 
Fixed 
No (almost) 
No 
Subsequence clustering 
[Sac07] 
Pre-specified 
Arbitrary 
Not fixed 
No 
Yes 
A sliding window algorithm that recursively expand segments by adding new points until some error bound is exceeded 
[Nam08] 
Pre-specified 
Arbitrary 
Not fixed 
No 
Yes 
A discrete value conversion algorithm 
[Ha09] 
Not specified 
Arbitrary 
Not fixed 
Yes 
No 
Categorization and the Apriori algorithm 
58 
Review – Summary on Rules 
Refs 
Point/interval-based rules 
Inter/intra-transaction rules 
Predictive rules 
Single/multiple time series rules 
Association (dependence, correlation, casual, classification) rules 
[Das98] 
Point 
Inter 
Yes (not always) 
Both 
correlation 
[Au04] 
None 
None 
None 
Single 
classification 
[Daf05] 
Point 
Inter 
Yes (not always) 
Both 
correlation 
[Sac07] 
Interval 
Both 
Yes (not always) 
Both 
correlation 
[Nam08] 
Point 
Both 
Yes (not always) 
Both 
correlation 
[Ha09] 
Point 
Inter 
Yes 
Single 
classification 
59 
Review – Summary on Algorithms 
Refs 
In-memory 
Index structures 
Measures 
Algorithm analysis 
[Das98] 
Yes 
No 
Frequency, confidence, J-measure 
[Au04] 
Yes 
No 
Interestingness, weight of evidence 
[Daf05] 
Yes 
No 
Frequency, confidence, J-measure 
[Sac07] 
Yes 
No 
Support, confidence, precision, recall 
[Nam08] 
Yes 
No 
Support, confidence, precision 
[Ha09] 
Yes 
Yes (B-tree) 
Support, confidence 
60 
Review – Summary on User Support 
Refs 
Rule interpretation 
Rule visualization 
End-users/data mining application developers 
[Das98] 
Yes 
No 
None 
[Au04] 
No 
No 
None 
[Daf05] 
Yes 
No 
None 
[Sac07] 
Yes 
No 
None 
[Nam08] 
Yes 
No 
None 
[Ha09] 
Yes 
No 
End-users 
61 
Problem Statement 
Temporal aspect investigation for rules 
Items (Patterns) of interest 
Rule model 
Rule discovery technique evaluation (analysis) 
Choice of measures (support, confidence, j-measure, weight of evidence, ) 
User support 
62 
Conclusion 
Rule discovery from time series 
Rule discovery process 
Several issues regarding time series rule discovery 
Patterns 
Rules 
Rule discovery algorithms 
User support 
Problem statement 
63 
Thank you. 
64 

File đính kèm:

  • pptrule_discovery_from_time_series.ppt
Bài giảng liên quan