Higher Order Learning
William M. Pottenger, Ph.D., Rutgers University
ARO Workshop
Outline
• Introduction
• Overview
  – IID Assumption in Machine Learning
  – Statistical Relational Learning (SRL)
  – Higher-order Co-occurrence Relations
• Approach
  – Supervised Higher Order Learning
  – Unsupervised Higher Order Learning
• Conclusion
IID Assumption in Machine Learning
• Data mining tasks such as association rule mining, cluster analysis, and classification aim to find patterns or form a model from a collection of instances.
• Traditionally, instances are assumed to be independent and identically distributed (IID).
  – In classification, a model is applied to a single instance, and the decision is based on the feature vector of that instance in a "context-free" manner, independent of the other instances in the test set.
• This context-free approach does not exploit the available information about relationships between instances in the dataset (Angelova & Weikum, 2006).
Statistical Relational Learning (SRL)
• Underlying assumption: linked instances are often correlated
• SRL operates on relational data with explicit links between instances
  – Explicitly leverages correlations between related instances
• Collective inference / classification
  – Simultaneously label all test instances together
  – Exploit the correlations between class labels of related instances
  – Learn from one network (a set of labeled training instances with links)
  – Apply the model to a separate network (a set of unlabeled test instances with links)
• Iterative algorithms
  – First assign initial class labels (content-only traditional classifier)
  – Then adjust each class label using the class labels of linked instances
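The two-step iterative scheme can be sketched in a few lines (hypothetical toy data; neighbor-majority voting stands in for a real relational classifier):

```python
# Sketch of the iterative scheme (hypothetical toy data):
# 1) a content-only classifier assigns initial labels,
# 2) each label is repeatedly adjusted from the labels of linked instances.
from collections import Counter

def iterative_classify(initial, links, iterations=5):
    """initial: {node: label} from a content-only classifier.
    links: {node: list of linked nodes}. Returns adjusted labels."""
    labels = dict(initial)
    for _ in range(iterations):
        updated = {}
        for node, neighbors in links.items():
            votes = Counter(labels[n] for n in neighbors)
            if votes:
                best, count = votes.most_common(1)[0]
            else:
                best, count = labels[node], 0
            # switch only when neighbors strictly outvote the current label
            updated[node] = best if count > votes[labels[node]] else labels[node]
        labels = updated
    return labels

initial = {"a": "spam", "b": "ham", "c": "ham", "d": "spam"}
links = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
print(iterative_classify(initial, links))
```

On this toy graph the neighbor votes pull the two initial "spam" labels over to "ham" within a few iterations.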
Statistical Relational Learning (SRL)
• Several tasks
  – Collective classification / link-based classification
  – Link prediction
  – Link-based clustering
  – Social network modeling
  – Object identification
  – Bibliometrics
  – …
  – Ref: P. Domingos and M. Richardson, Markov Logic: A Unifying Framework for Statistical Relational Learning. Proceedings of the ICML-2004 Workshop on Statistical Relational Learning and its Connections to Other Fields (pp. 49-54), 2004. Banff, Canada: IMLS.
Some Related Work in SRL
• Relational Markov Networks (Taskar et al., 2002)
  – Extend Markov networks to relational data
  – Discriminatively train an undirected graphical model: for every link between two pages, there is an edge between the labels of those pages
  – Significant improvement over the flat model (logistic regression)
• Link-based Classification (Lu & Getoor, 2003)
  – Structured logistic regression
  – Iterative classification algorithm
  – Outperforms a content-only classifier on the WebKB, Cora, and CiteSeer datasets
• Relational Dependency Networks (Neville & Jensen, 2004)
  – Extend Dependency Networks (DNs) to relational data
  – Experiments on the IMDb, Cora, WebKB, and Gene datasets
  – Results: the RDN model is superior to an IID classifier
• Graph-based Text Classification (Angelova & Weikum, 2006)
  – Uses a graph in which nodes are instances and edges are the relationships between instances in the dataset
  – Increased performance on the DBLP, IMDb, and Wikipedia datasets
  – Interesting observation: gains are most prominent for small training sets
Reasoning by Abductive Inference
• Need for reasoning from evidence, even when information is incomplete, inexact, inaccurate, or drawn from diverse sources
• Evidence is provided by sets of diverse, distributed, and noisy sensors and information sources
• Goal: build a quantitative theoretical framework for reasoning by abduction in the face of real-world uncertainties
• Approach: reasoning by leveraging higher order relations…
Gathering Evidence
[Diagram: co-occurrence evidence gathered across separate documents: stress–migraine, CCB–magnesium, PA–magnesium, SCD–magnesium]
Slide reused with permission of Marti Hearst @ UCB
A Higher Order Co-Occurrence Relation!
[Diagram: migraine connected to magnesium only indirectly, via stress, CCB, PA, and SCD]
Slide reused with permission of Marti Hearst @ UCB
No single author knew/wrote about this connection… this distinguishes Text Mining from Information Retrieval.
Uses of Higher-order Co-occurrence Relations
Higher-order co-occurrences play a key role in the effectiveness of systems used for information retrieval and text mining:
• Literature-Based Discovery (LBD) (Swanson, 1988)
  – Migraine ↔ (stress, calcium channel blockers) ↔ magnesium
• Improving the runtime performance of LSI (Zhang et al., 2000)
  – Explicitly use 2nd-order co-occurrence to reduce MT×D
• Word sense disambiguation (Schütze, 1998)
  – Similarity in word space is based on 2nd-order co-occurrence
• Identifying synonyms in a given context (Edmonds, 1997)
  – Precision of system using 3rd order > 2nd order > 1st order
• Stemming algorithm (Xu & Croft, 1998)
  – Implicitly uses higher orders of co-occurrence
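Second-order co-occurrence, as used in the word-space work above, can be illustrated directly: squaring a 0/1 first-order co-occurrence matrix counts two-step paths (the terms and matrix below are a hypothetical illustration):

```python
# Illustration with hypothetical terms: C is a 0/1 first-order co-occurrence
# matrix; a nonzero off-diagonal entry of C*C means a 2nd-order path exists
# (two terms connected through a shared neighbor term).
terms = ["migraine", "stress", "magnesium"]
C = [[0, 1, 0],   # migraine co-occurs only with stress
     [1, 0, 1],   # stress co-occurs with migraine and magnesium
     [0, 1, 0]]   # magnesium co-occurs only with stress

def second_order(C):
    n = len(C)
    return [[sum(C[i][k] * C[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

C2 = second_order(C)
print(C2[0][2])  # 1: one 2nd-order path from migraine to magnesium
```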
Is there a theoretical basis for the use of higher order co-occurrence relations?
• Research agenda: study machine learning algorithms in search of a theoretical foundation for the use of higher order relations
• First algorithm: Latent Semantic Indexing (LSI)
  – A widely used technique in text mining and IR, based on the Singular Value Decomposition (SVD) matrix factoring algorithm
  – Semantically similar terms lie closer in the LSI vector space even though they don't co-occur; LSI reveals hidden or latent relationships
  – Research question: does LSI leverage higher order term co-occurrence?
Is there a theoretical basis for the use of higher order co-occurrence relations in LSI?
• Yes! The answer is in the following theorem, which we proved: if the ijth element of the truncated term-by-term matrix Y is non-zero, then there exists a co-occurrence path of order 1 between terms i and j.
  – Kontostathis, A. and Pottenger, W. M. (2006). A Framework for Understanding LSI Performance. Information Processing & Management.
• We have both proven mathematically and demonstrated empirically that LSI is based on the use of higher order co-occurrence relations.
• Next step? Extend the theoretical foundation by studying characteristics of higher-order relations in other machine learning datasets/algorithms such as association rule mining, supervised learning, etc.
  – Start by analyzing higher-order relations in labeled training data used in supervised machine learning
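A minimal numpy sketch of the phenomenon the theorem describes (a hypothetical 3-term, 2-document collection, not the paper's experiments):

```python
# Hypothetical collection: t1 and t3 never co-occur in any document, yet the
# truncated term-by-term matrix links them through t2, i.e. a 2nd-order path.
import numpy as np

A = np.array([[1.0, 0.0],   # t1 appears only in d1
              [1.0, 1.0],   # t2 appears in d1 and d2
              [0.0, 1.0]])  # t3 appears only in d2

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 1
A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]   # rank-k LSI approximation
Y = A_k @ A_k.T                        # truncated term-by-term matrix

print(float(Y[0, 2]))  # nonzero (0.5): a latent t1-t3 association
```

The first-order term-by-term matrix A @ A.T has a zero in position (t1, t3); truncation fills it in via the shared neighbor t2.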
What role do higher-order relations play in supervised machine learning?
• Goal: discover patterns in higher-order paths useful in separating the classes
• Co-occurrence relations in a record or instance set can be represented as an undirected graph G = (V, E)
  – V: a finite set of vertices (e.g., entities in a record)
  – E: the set of edges representing co-occurrence relations (edges are labeled with the record(s) in which the entities co-occur)
• Path definition from graph theory: two vertices x_i and x_k are linked by a path P (all vertices distinct), where the number of edges in P is its length.
• Higher-order path: not only the vertices (entities) but also the edges (records) must be distinct.
[Diagram: entities e1–e5 in a chain with edges labeled by records r1–r4, plus additional edges r5 and r6 — an example of a fourth-order path between e1 and e5, as well as several shorter paths]
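Enumerating paths under these two distinctness constraints can be sketched as follows (hypothetical data mirroring the chain in the figure):

```python
# Sketch: enumerate higher-order paths in which both the entity vertices and
# the record labels on the edges are distinct.
def higher_order_paths(edges, start, end, order):
    """edges: {(u, v): set of records}. Returns paths of the given order
    (edge count) as alternating [entity, record, entity, ...] lists."""
    adj = {}
    for (u, v), recs in edges.items():
        adj.setdefault(u, []).append((v, recs))
        adj.setdefault(v, []).append((u, recs))
    paths = []

    def dfs(node, seen_nodes, seen_recs, trail):
        if node == end:
            if len(seen_recs) == order:
                paths.append(trail)
            return
        if len(seen_recs) >= order:
            return
        for nxt, recs in adj.get(node, []):
            if nxt in seen_nodes:
                continue
            for r in recs - seen_recs:
                dfs(nxt, seen_nodes | {nxt}, seen_recs | {r}, trail + [r, nxt])

    dfs(start, {start}, set(), [start])
    return paths

# A chain like the figure's: one record per edge.
edges = {("e1", "e2"): {"r1"}, ("e2", "e3"): {"r2"},
         ("e3", "e4"): {"r3"}, ("e4", "e5"): {"r4"}}
print(higher_order_paths(edges, "e1", "e5", 4))
```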
What role do higher-order relations play in supervised machine learning?
• Path group: a path (length ≥ 2) is extracted per the definition of a path from graph theory. In the example, a 2nd-order path group comprises two sets of records: S1 = {1, 2, 5} and S2 = {1, 2, 3, 4}. A path group may be composed of several higher-order paths.
• A bipartite graph G = (V1 ∪ V2, E) is formed, where V1 is the collection of record sets and V2 is the set of records. Enumerating all maximum matchings in this graph yields all higher-order paths in the path group. Another approach is to discover a system of distinct representatives (SDR) of these sets.
[Diagram: bipartite graph with record sets S1, S2 on one side and records R1–R5 on the other]
[Diagram: an example co-occurrence graph over entities e1–e5]
[Diagram: an example 2nd-order path group — e1 e2 e3 with record sets {R1, R2, R5} and {R1, R2, R3, R4}]
[Diagram: a valid 2nd-order path — e1 –R1– e2 –R3– e3]
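The SDR route can be sketched by brute force (fine at slide scale; a real implementation would use a bipartite matching algorithm):

```python
# Choose one record per set such that all chosen records are distinct;
# each such choice is a system of distinct representatives (SDR) and
# corresponds to one valid higher-order path in the path group.
from itertools import product

def distinct_representatives(record_sets):
    """record_sets: list of sets. Yields tuples with one record per set,
    all records distinct."""
    for choice in product(*record_sets):
        if len(set(choice)) == len(choice):
            yield choice

# The slide's 2nd-order path group: S1 = {1, 2, 5}, S2 = {1, 2, 3, 4}.
S1, S2 = {1, 2, 5}, {1, 2, 3, 4}
sdrs = sorted(distinct_representatives([S1, S2]))
print(sdrs)  # e.g. (1, 3) matches the valid path e1 -R1- e2 -R3- e3
```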
What role do higher-order relations play in supervised machine learning?
• Approach: discover frequent itemsets in higher-order paths
  – For labeled datasets, divide instances by class and enumerate k-itemsets (initially for k in {3, 4})
  – This results in a distribution of k-itemset frequencies for a given class
  – Compare distributions using a simple statistical measure such as the t-test to determine independence
  – If two distributions are statistically significantly different, we conclude that the higher-order path patterns (i.e., itemset frequencies) distinguish the classes
• Labeled training data analyzed
  – Mushroom dataset: performs well with a decision tree
  – Border Gateway Protocol updates: relevant to cybersecurity
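The compare-distributions step can be sketched on toy data (the instances below are invented, and a hand-rolled Welch t statistic stands in for a statistics package's t-test):

```python
# Count k-itemset frequencies per class, then compare the two frequency
# distributions with a two-sample (Welch) t statistic.
from itertools import combinations
from collections import Counter
from math import sqrt

def itemset_frequencies(instances, k=3):
    """Count every k-itemset across a class's instances (sets of entities)."""
    counts = Counter()
    for inst in instances:
        counts.update(combinations(sorted(inst), k))
    return counts

def welch_t(xs, ys):
    """Two-sample t statistic with unequal variances (Welch)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    vx = sum((x - mx) ** 2 for x in xs) / (len(xs) - 1)
    vy = sum((y - my) ** 2 for y in ys) / (len(ys) - 1)
    return (mx - my) / sqrt(vx / len(xs) + vy / len(ys))

class_e = [{"a", "b", "c"}, {"a", "b", "c", "d"}]
class_p = [{"w", "x", "y", "z"}, {"v", "w", "x", "y", "z"}]
freq_e = list(itemset_frequencies(class_e).values())
freq_p = list(itemset_frequencies(class_p).values())
print(welch_t(freq_e, freq_p))  # negative here: the distributions differ
```

A large |t| across folds is what the next slide's table reports for the real data.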
Preliminary Results – Supervised ML Dataset
• For each fold, we compared the 3-itemset frequencies of the E set vs. the P set
• Interesting result: six of the 10 folds had a confidence of 95% or greater that the E and P instances are statistically significantly different
  – The other folds were between 80% and 95% (see below)

| Fold | t Stat | P(T<=t) one-tail | t Critical one-tail | P(T<=t) two-tail | t Critical two-tail |
|------|--------|------------------|---------------------|------------------|---------------------|
| 0 | -2.684 | 0.0037 | 1.6471 | 0.0074 | 1.9634 |
| 1 | -1.357 | 0.0875 | 1.6467 | 0.1751 | 1.9629 |
| 2 | -1.554 | 0.0603 | 1.6468 | 0.1205 | 1.9629 |
| 3 | -2.924 | 0.0018 | 1.6472 | 0.0036 | 1.9636 |
| 4 | -1.908 | 0.0284 | 1.6469 | 0.0568 | 1.9631 |
| 5 | -2.047 | 0.0205 | 1.6469 | 0.0410 | 1.9631 |
| 6 | -1.455 | 0.0730 | 1.6467 | 0.1460 | 1.9629 |
| 7 | -2.023 | 0.0217 | 1.6469 | 0.0434 | 1.9631 |
| 8 | -2.795 | 0.0027 | 1.6471 | 0.0053 | 1.9635 |
| 9 | -2.710 | 0.0034 | 1.6470 | 0.0069 | 1.9633 |

Ganiz, M., Pottenger, W. M. and Yang, X. (2006). Link Analysis of Higher-Order Paths in Supervised Learning Datasets. In the Proceedings of the Workshop on Link Analysis, Counterterrorism and Security, 2006 SIAM Conference on Data Mining, Bethesda, MD, April.
What role do higher-order relations play in supervised machine learning?
• Detection of interdomain routing anomalies based on higher-order path analysis
  – The Border Gateway Protocol (BGP) is the de facto interdomain routing protocol for the Internet.
  – Anomalous BGP events (misconfigurations, attacks, and large-scale power failures) often affect the global routing infrastructure:
    • Slammer worm attack (January 25, 2003)
    • Witty worm attack (March 19, 2004)
    • 2003 East Coast Blackout (i.e., power failure)
  – Goal: detect and categorize such events
What role do higher-order relations play in supervised machine learning?
• Detection of interdomain routing anomalies based on higher-order path analysis
  – The data are divided into three-second bins
  – Each bin is a single instance in our training data

| ID | Attribute | Definition |
|----|-----------|------------|
| 1 | Announce | # of BGP announcements |
| 2 | Withdrawal | # of BGP withdrawals |
| 3 | Update | # of BGP updates (= Announce + Withdrawal) |
| 4 | Announce Prefix | # of announced prefixes |
| 5 | Withdraw Prefix | # of withdrawn prefixes |
| 6 | Updated Prefix | # of updated prefixes (= Announce Prefix + Withdraw Prefix) |
Preliminary Results – BGP Dataset
• Border Gateway Protocol (BGP) routing data
  – BGP messages generated during interdomain routing
  – Relevant to cybersecurity
• Detect abnormal BGP events
  – Internet worm attacks (Slammer, Witty, …), power failures, etc.
  – Data from a period of time surrounding/including worm propagation
  – Instance: a three-second sample of BGP traffic
  – Six numeric attributes (Li et al., 2005)
• Previously, a decision tree was applied successfully for two classes: worm vs. normal (Li et al., 2005)
  – But it cannot distinguish different worms!
Preliminary Results – BGP Dataset
[Plots: two-tail P values over 37 sliding windows for the Slammer, Witty, and Blackout events, with the 5% significance level marked]

| Event 1 | Event 2 | t-test result |
|---------|---------|---------------|
| Slammer | Witty | 0.00023 |
| Blackout | Witty | 0.00016 |
| Slammer | Blackout | 0.018 |

• 240 instances are used to characterize a particular abnormal event
• Sliding window approach for detection
  – Window size: 120 instances (360 seconds)
  – Slide by 10 instances (sampling every 30 seconds)

Ganiz, M., Pottenger, W. M., Kanitkar, S., Chuah, M. C. (2006b). Detection of Interdomain Routing Anomalies Based on Higher-Order Path Analysis. Proceedings of the Sixth IEEE International Conference on Data Mining (ICDM'06), December 2006, Hong Kong, China.
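The sliding-window mechanics can be sketched as follows (hypothetical stream; the per-window statistical comparison against stored event profiles is omitted):

```python
# 120-instance windows (360 seconds of three-second bins) advanced by
# 10 instances at a time.
def sliding_windows(instances, size=120, step=10):
    for start in range(0, len(instances) - size + 1, step):
        yield start, instances[start:start + size]

# 240 instances: the length used to characterize one abnormal event.
stream = list(range(240))
windows = list(sliding_windows(stream))
print(len(windows))  # 13 windows, starting at instances 0, 10, ..., 120
```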
Preliminary Results – Naïve Bayes on Higher-order Paths
• Cora (McCallum et al., 2000)
  – Scientific paper dataset
  – Several classes: case based, neural networks, etc.
  – 2708 documents, 1433 terms, 5429 links
  – Terms are ordered most sparse first
• Instead of links, we used higher order paths in a Naïve Bayes framework
• E.g., when 2nd-order paths are used, F-beta (beta = 1) is higher starting from dictionary size 400
[Chart: Cora dataset, macro-averaged F1 (Fb1 vs. Fb2) as dictionary size increases from 200 to 1433]
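One way to read "higher order paths in a Naïve Bayes framework" — not necessarily the paper's exact construction — is to expand each document's features with 2nd-order terms before counting; the expansion step looks like:

```python
# Hedged sketch (hypothetical mini-corpus): a document's 2nd-order terms are
# the terms that share some other document with one of its own terms. The
# expanded counts could then feed a standard Naive Bayes classifier.
from collections import defaultdict

def second_order_terms(docs):
    """docs: list of term sets. Returns {doc index: set of 2nd-order terms}."""
    cooc = defaultdict(set)          # term -> terms it co-occurs with
    for terms in docs:
        for t in terms:
            cooc[t] |= terms
    return {i: set().union(*(cooc[t] for t in terms)) - terms
            for i, terms in enumerate(docs)}

# "neural" and "graph" never co-occur directly, but are 2nd-order related.
docs = [{"neural", "network"}, {"network", "graph"}, {"graph", "theory"}]
print(second_order_terms(docs)[0])  # {'graph'}: linked via "network"
```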
What role do higher-order relations play in unsupervised machine learning?
• Next step? Consider unsupervised learning…
  – Association Rule Mining (ARM)
    • ARM is one of the most widely used algorithms in data mining
  – Extend ARM to higher order: Higher Order Apriori
• Experiments confirm the value of Higher Order Apriori on real-world e-marketplace data
Higher Order Apriori: Approach
• First we extend the itemset definition to incorporate k-itemsets up to nth order:
  – Definition 1: items a and b are nth-order associated if a and b can be associated across n distinct records: a ~ r1 ~ i1 ~ r2 ~ … ~ i(n-1) ~ rn ~ b
  – Definition 2: an nth-order k-itemset is a k-itemset for which each pair of its items is nth-order associated.
  – Definition 3: two records are nth-order linked if they can be linked through n-2 distinct records.
  – Definition 4: an nth-order itemset i1 i2 … in is supported by an nth-order recordset r1 r2 … rn if no two items come from the same record: i1 ~ r1, i2 ~ r2, …, in ~ rn
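Definition 1 can be sketched as a search for chains through distinct records (hypothetical data; `~` in the definition is read as "co-occurs in"):

```python
# Items a and b are nth-order associated when they can be chained through
# n distinct records, a ~ r1 ~ i1 ~ r2 ~ ... ~ rn ~ b, with distinct
# intermediate items.
def association_orders(records, a, b, max_order=4):
    """records: {record id: set of items}. Returns the sorted orders
    (record counts) at which a and b are associated, up to max_order."""
    orders = set()

    def extend(item, used_recs, used_items):
        for rid, items in records.items():
            if rid in used_recs or item not in items:
                continue
            if b in items:
                orders.add(len(used_recs) + 1)
            if len(used_recs) + 1 < max_order:
                for nxt in items - used_items - {b}:
                    extend(nxt, used_recs | {rid}, used_items | {nxt})

    extend(a, set(), {a})
    return sorted(orders)

records = {"r1": {"a", "x"}, "r2": {"x", "y"}, "r3": {"y", "b"}}
print(association_orders(records, "a", "b"))  # [3]: chained via r1, r2, r3
```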
Higher Order Apriori: Approach
• Given j instances of an nth-order k-recordset rs, its size is defined as:

  $\mathrm{size}_{n\_k}(rs) = \sum_{t=1}^{j} \prod_{v=1}^{k(k-1)/2} |I_v|$

• Since the same k-itemset can be generated at different orders, the global support for a given k-itemset must include the local support at each order u, giving:

  $\sup(i^k) = \sum_{u=1}^{\mathrm{max\_order}} \log_{10}\bigl(\mathrm{size}_u(rs^k) + 1\bigr)$
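Assuming the support formula reads sup(i^k) = Σ_u log10(size_u + 1), the combination step is tiny:

```python
# Log damping lets lower orders dominate the global support while higher
# orders still contribute evidence.
from math import log10

def global_support(sizes_by_order):
    """sizes_by_order: {order u: size of the itemset's recordsets at u}."""
    return sum(log10(size + 1) for size in sizes_by_order.values())

# Hypothetical itemset: 9 recordset instances at 1st order, 99 at 2nd.
print(global_support({1: 9, 2: 99}))  # log10(10) + log10(100) = 3.0
```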
Higher Order Apriori: Approach
• Higher Order Apriori is structured in a level-wise, order-first manner:
  – Level-wise: the size of the k-itemsets increases in each iteration (as is the case for Apriori)
  – Order-first: at each level, itemsets are generated across all orders
Higher Order Apriori: Results
• Our algorithm was tested on real-world e-commerce data from the KDD Cup 2000 competition. There are 530 transactions involving 244 products in the dataset.
• We compared the itemsets generated by Higher Order Apriori (limited to 6th order) with two other algorithms:
  – Apriori (1st order)
  – Indirect (our algorithm limited to 2nd order)
• We conducted experiments on multiple systems, including at the National Center for Supercomputing Applications (NCSA)
Higher Order Apriori: Results
• Higher Order Apriori mines significantly more final itemsets than Apriori and Indirect
[Chart: log(number of itemsets) vs. number of records (50, 75, 100, 200, 530) for Apriori, Indirect, and HOApriori; k = 2 for HOApriori and Indirect]
• Next we show that high-support itemsets are discovered using smaller datasets than required by Apriori or Indirect
Higher Order Apriori: Results
• {CU, DQ} is the top-ranked 2-itemset using Apriori on all 530 transactions
  – Neither Apriori nor Indirect leverages the latent higher order information in runs of 75, 100, and 200 random transactions
  – Higher Order Apriori, in contrast, discovered this itemset as top ranked using only 75 transactions
  – In addition, the gap between the supports increases as the transaction sets get larger
[Charts: ranking of {CU, DQ} vs. number of records for Indirect and HOApriori, and its support vs. number of records for Apriori, Indirect, and HOApriori]
Higher Order Apriori: Results
• Discovering novel itemsets

| | Apriori | Indirect | Higher Order Apriori |
|---|---------|----------|----------------------|
| Itemsets discovered | {AY, X}, {X, K}, {K, Q} | {AY, K}, {X, Q} + Apriori itemsets | {AY, Q} + Indirect itemsets + Apriori itemsets |
| Itemsets undiscovered | {AY, K}, {X, Q}, {AY, Q} | {AY, Q} | |

• AY: Girdle-at-the-top Classic Sheer Pantyhose
• Q: Men's City Rib Socks - 3 Pack

Shaver : Women's Pantyhose relationship

| Algorithm | Itemset |
|-----------|---------|
| Apriori | (Donna Karan's Extra Thin Pantyhose, Wet/Dry Shaver) |
| Indirect | (Berkshire's Ultra Nudes Pantyhose, Epilady Wet/Dry Shaver) |
| Higher-order Apriori | (Donna Karan's Pantyhose, Epilady Wet/Dry Shaver) |

• This relationship is also discovered by Apriori and Indirect, but Higher Order Apriori discovered a new nugget, which provides extra evidence for the relationship
Higher Order Apriori: Results
• Discovering novel relationships
  – Higher Order Apriori discovers itemsets that demonstrate novel relationships not discovered by lower order methods.
  – For example, the following are reasonable relationships. While Apriori and Indirect failed to discover itemsets representing such relationships in the SIGKDD dataset, they might discover such links given a larger dataset.

Shaver : Lotion/Cream

| Algorithm | Itemsets |
|-----------|----------|
| Apriori | (none) |
| Indirect | (none) |
| Higher-order Apriori | (Pedicure Care Kit, Toning Lotion), (Wet/Dry Shaver, Herb Lotion), (Pedicure Care Kit, Leg Cream) |

Foot cream : Women's socks

| Algorithm | Itemsets |
|-----------|----------|
| Apriori | (none) |
| Indirect | (none) |
| Higher-order Apriori | (Foot Cream, Women's Ultra Sheer Knee High), (Foot Cream, Women's Cotton Dog Sock) |
Conclusions
• Many traditional machine learning algorithms assume instances are independent and identically distributed (IID)
  – Apply the model to a single instance (the decision is based on its feature vector) in a "context-free" manner
  – Independent of the other instances in the test set
• Statistical Relational Learning (SRL)
  – Classifies a set of instances simultaneously (collective classification)
  – Utilizes relations (links) between instances in the dataset
  – Usually considers immediate neighbors
  – Violates the "independence" assumption
• Our approach utilizes the latent information in higher order paths
  – Utilizes higher order paths of order greater than or equal to two
  – Higher-order paths are implicit; they are based on co-occurrences of entities
  – We do not use the explicit links in the dataset!
  – Captures "latent semantics" (cf. Latent Semantic Indexing)
Thanks
Q&A