Succinct Ordinal Trees Based on Tree Covering
Meng He, J. Ian Munro, University of Waterloo
S. Srinivasa Rao, IT University of Copenhagen
Background: Succinct Data Structures
The problem: modern applications often process huge amounts of data
Examples: Web search engines (Google, Altavista, etc.), bioinformatics applications, XML databases, spatial databases, …
The Solution: Succinct Data Structures
What are succinct data structures?
Representing data structures using preferably the information-theoretic minimum space
Supporting efficient navigational operations
History of Succinct Data Structures
Jacobson 1989
Trees
The number of different ordinal trees of n nodes: $\binom{2n}{n}/(n+1) \approx 4^n/(\sqrt{\pi}\, n^{3/2})$
Information-theoretic minimum: 2n-O(lg n) bits
Explicit, pointer-based representation: Θ(n lg n) bits
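The gap between the counting bound and 2n bits can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and it uses the count C(2n, n)/(n+1) exactly as stated on the slide.

```python
import math

def ordinal_tree_count(n: int) -> int:
    """Count used on the slide: C(2n, n) / (n + 1) (a Catalan number)."""
    return math.comb(2 * n, n) // (n + 1)

def min_bits(n: int) -> int:
    """Smallest number of bits that can distinguish that many trees,
    i.e. ceil(lg(count))."""
    return (ordinal_tree_count(n) - 1).bit_length()

# Compare the information-theoretic minimum with 2n for a few sizes.
for n in (10, 100, 1000):
    print(n, min_bits(n), 2 * n)
```

The difference 2n - min_bits(n) grows only logarithmically in n, which is the O(lg n) term in the bound above.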
Succinct Ordinal Trees
Level order unary degree sequence (LOUDS): Jacobson 1989
Balanced parentheses (BP): Munro & Raman 1997
Depth first unary degree sequence (DFUDS): Benoit et al. 1999
Tree covering (TC): Geary et al. 2004
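For readers unfamiliar with the three sequence-based encodings, the following sketch builds them for a small tree. It follows the usual textbook definitions, not anything specific to these slides. The nested-list tree format (a node is the list of its children) and the function names are assumptions made for illustration.

```python
from collections import deque

# Hypothetical tree format: a node is the list of its children.

def bp(node):
    """Balanced parentheses: '(' on entering a node, ')' on leaving it."""
    return "(" + "".join(bp(child) for child in node) + ")"

def dfuds(root):
    """Depth-first unary degree sequence: visit nodes in preorder and
    write each degree d as d '(' followed by one ')'."""
    out = ["("]  # extra leading parenthesis keeps the sequence balanced
    stack = [root]
    while stack:
        node = stack.pop()
        out.append("(" * len(node) + ")")
        stack.extend(reversed(node))  # leftmost child is visited first
    return "".join(out)

def louds(root):
    """Level-order unary degree sequence: visit nodes level by level and
    write each degree d as d '1' bits followed by one '0'."""
    out = ["10"]  # super-root with a single child (the real root)
    queue = deque([root])
    while queue:
        node = queue.popleft()
        out.append("1" * len(node) + "0")
        queue.extend(node)
    return "".join(out)

# Root with three children: the first has two leaf children,
# the second is a leaf, the third has one leaf child (7 nodes).
tree = [[[], []], [], [[]]]
print(bp(tree))
print(dfuds(tree))
print(louds(tree))
```

Each sequence uses roughly two symbols per node, which is where the 2n-bit figure for these encodings comes from.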
Preorder and DFUDS order
[Figure: example tree with nodes numbered in preorder and in DFUDS order]
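The two numberings can be compared on a small tree. This sketch assumes that DFUDS order means: visit nodes in preorder and, on visiting a node, give all of its children consecutive numbers. That definition, the adjacency-dictionary format, and the node labels are assumptions made for illustration and are not taken from the slide's figure.

```python
def preorder_numbers(children, root):
    """Number nodes 1, 2, ... in the order a preorder traversal visits them."""
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(reversed(children.get(v, [])))
    return {v: i + 1 for i, v in enumerate(order)}

def dfuds_numbers(children, root):
    """Number nodes so that all children of a node get consecutive numbers,
    with parents processed in preorder (assumed definition of DFUDS order)."""
    order, stack = [root], [root]
    while stack:
        v = stack.pop()
        kids = children.get(v, [])
        order.extend(kids)            # children are numbered together
        stack.extend(reversed(kids))  # then visited in preorder
    return {v: i + 1 for i, v in enumerate(order)}

# Hypothetical 8-node tree.
children = {"a": ["b", "c", "d"], "b": ["e", "f"], "e": ["h"], "d": ["g"]}
pre = preorder_numbers(children, "a")
dfs = dfuds_numbers(children, "a")
for v in sorted(children["a"] + ["a", "e", "f", "g", "h"]):
    print(v, pre[v], dfs[v])
```

Under this assumption the numbering differs from both preorder and level order: here node h is numbered before g because its parent e is visited before d.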
Navigational Operations Considered
Parent, Child, Level_ancestor, Depth, Subtree_size, LCA, …
All in O(1) time with 2n+o(n) bits on the word RAM
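To pin down what these operations return, here is a naive pointer-based reference, written only to illustrate their semantics. It is not succinct and not constant time, and the parent-dictionary format and function names are hypothetical.

```python
def depth(parent, x):
    """Number of edges from x up to the root."""
    d = 0
    while parent[x] is not None:
        x = parent[x]
        d += 1
    return d

def level_ancestor(parent, x, i):
    """Ancestor of x that is i levels above it (None if there is none)."""
    for _ in range(i):
        if x is None:
            return None
        x = parent[x]
    return x

def subtree_size(parent, x):
    """Number of nodes in the subtree rooted at x, including x."""
    def is_under(v):
        while v is not None:
            if v == x:
                return True
            v = parent[v]
        return False
    return sum(is_under(v) for v in parent)

def lca(parent, x, y):
    """Lowest common ancestor: lift the deeper node, then climb together."""
    dx, dy = depth(parent, x), depth(parent, y)
    while dx > dy:
        x, dx = parent[x], dx - 1
    while dy > dx:
        y, dy = parent[y], dy - 1
    while x != y:
        x, y = parent[x], parent[y]
    return x

# Hypothetical 8-node tree given by parent pointers (root has parent None).
parent = {"a": None, "b": "a", "c": "a", "d": "a",
          "e": "b", "f": "b", "h": "e", "g": "d"}
print(depth(parent, "h"), level_ancestor(parent, "h", 2),
      subtree_size(parent, "b"), lca(parent, "h", "g"))
```

A succinct representation has to answer the same queries in O(1) time without storing explicit pointers.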
Motivations and Objectives
Three main representations: BP, DFUDS, TC
The fact: different representations support different operations on trees
Example: height, node_rank_DFUDS, node_rank_POST
New problem: a representation supporting all the navigational operations
Motivations and Objectives (Continued)
The assumption: there may be new operations supported by one of these representations
New problem: one representation that can compute an arbitrary word of all the other representations
The Tree Covering Algorithm by Geary et al.
The idea
Cover the tree with a set of mini-trees
Cover each mini-tree with a set of micro-trees
Compute the set of mini-trees (micro-trees) in a bottom-up, greedy fashion
Properties
Any two mini-trees (micro-trees) can only share their root
Size of a mini-tree (micro-tree): from M to 3M-4 (from M’ to 3M’-4)
Parameters: M = lg^4 n, M’ = lg n / 24
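One consequence of the micro-tree parameter can be worked out directly. The derivation below is a rough sketch that ignores rounding of M’, and it assumes a micro-tree's shape is written with 2 bits per node (for example as a BP sequence).

```latex
\begin{align*}
\text{micro-tree size} &\le 3M' - 4 < 3 \cdot \frac{\lg n}{24} = \frac{\lg n}{8} \text{ nodes} \\
\text{bits to describe its shape} &< 2 \cdot \frac{\lg n}{8} = \frac{\lg n}{4} \\
\text{number of distinct micro-tree shapes} &< 2^{\lg n / 4} = n^{1/4}
\end{align*}
```

Under these assumptions, a lookup table with one entry per possible micro-tree shape has fewer than n^{1/4} entries, so precomputed answers for all shapes can fit in o(n) bits as long as each entry is polylogarithmic in size.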
The Tree Covering Algorithm: An Example
M = 8, M’ = 3
Operations supported on TC by Geary et al.
The old TC (Geary et al. 2004): child, child_rank, depth, level_anc, nbdesc, degree, node_rank_PRE, node_select_PRE, node_rank_POST, node_select_POST
New Definitions and Properties: Preorder Changers
Tier-1 preorder changers
Number of Tier-1 preorder changers: at most twice the number of mini-trees
Tier-2 preorder changers are similar
DFUDS Order Changers
Tier-1 DFUDS order changers
Number of Tier-1 DFUDS order changers: at most four times the number of mini-trees
Tier-2 DFUDS order changers are similar
τ*-name of a Node
Preorder numbers: node x
τ-names: τ(x)=<τ1(x),τ2(x),τ3(x)>
τ*-names: τ*(x)=<τ1(x),τ2(x),τ3*(x)>
Example: node 29
τ(29)=<3,1,5>
τ*(29)=<3,1,4>
Supporting node_select_DFUDS
From the DFUDS number to the τ*-name
From the τ*-name to the τ-name: table lookup
From the τ-name to the preorder number: Geary et al. 2004
Computing τ1(x)
[Figure: nodes 1, 2, …, 10 listed by DFUDS number, with the τ1 values stored for some of them]
Example: x = 8th node in DFUDS order, τ1(x)=2
Computing τ1(x) (Continued)
Dictionary
Universe: n, size: O(n / lg^4 n)
Space cost: o(n) bits (Raman et al. 2002)
τ1’s stored
Number of elements stored: O(n / lg^4 n)
Each element: O(lg n) bits
Space cost: o(n) bits
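The dictionary-plus-array idea can be sketched as follows: a value is stored only at the positions where it changes, and a predecessor search finds the stored value that covers a query position. This is a minimal sketch, not the authors' structure. A sorted list with binary search stands in for the succinct dictionary of Raman et al., and the class name and the sample τ1 values are hypothetical.

```python
from bisect import bisect_right

class SparseLabels:
    """Stores a label only at positions where it differs from the previous one."""

    def __init__(self, labels):
        self.starts = []  # positions (1-based) where a new run begins
        self.values = []  # label of each run
        for pos, label in enumerate(labels, start=1):
            if not self.values or self.values[-1] != label:
                self.starts.append(pos)
                self.values.append(label)

    def lookup(self, pos):
        """Label at position pos: find the last run starting at or before pos."""
        run = bisect_right(self.starts, pos) - 1
        return self.values[run]

# Hypothetical tau_1 values for nodes 1..10 in DFUDS order.
tau1 = SparseLabels([1, 1, 1, 2, 2, 1, 1, 2, 2, 2])
print(tau1.starts, tau1.values)
print(tau1.lookup(8))
```

In the succinct setting the binary search would be replaced by a constant-time rank query on the dictionary, and the number of stored values is bounded by the number of order changers instead of n.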
Computing τ2(x) and τ3*(x)
[Figure: nodes 1, 2, …, 10 listed by DFUDS number, with the τ2 and τ3* values stored for some of them]
Example: τ2(x)=1, τ3*(x)=4
Computing τ2(x) and τ3*(x) (Continued)
Dictionary
Universe: n, size: O(n / lg n)
Space cost: o(n) bits (Raman et al. 2002)
τ2’s and τ3*’s stored
Number of elements stored: O(n / lg n)
Each element: O(lglg n) bits
Space cost: o(n) bits
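The o(n) space claims for the stored values follow from multiplying the number of elements by the bits per element, as stated on the slides for τ1 and for τ2 and τ3*:

```latex
\begin{align*}
\tau_1:\quad & O\!\left(\frac{n}{\lg^4 n}\right) \cdot O(\lg n)
  = O\!\left(\frac{n}{\lg^3 n}\right) = o(n) \text{ bits} \\
\tau_2,\ \tau_3^*:\quad & O\!\left(\frac{n}{\lg n}\right) \cdot O(\lg\lg n)
  = O\!\left(\frac{n \lg\lg n}{\lg n}\right) = o(n) \text{ bits}
\end{align*}
```

The second bound is the tighter of the two: it is sublinear only because each element needs O(lglg n) bits instead of O(lg n).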
Other Operations Supported
height, LCA, distance, leaf_rank and leaf_select, leftmost_leaf and rightmost_leaf, leaf_size, node_rank_DFUDS, level_leftmost and level_rightmost, level_succ and level_pred
Data Abstraction: Computing a subsequence of BP and DFUDS
The problem: store the tree using TC, and support the computation of a word of its BP or DFUDS sequence
Results
Time: compute a word (Θ(lg n) bits) of the BP or DFUDS sequence in O(f(n)) time
Space: n/f(n) additional bits
Conclusions
A succinct representation of ordinal trees using 2n+o(n) bits that supports all the navigational operations
Our representation also supports level-order traversal, a useful ordering previously supported only with a very limited set of operations
Our encoding scheme supports BP and DFUDS as abstract data types
Open Problems
Support new operations that are not supported by BP, DFUDS or TC
Constant-time computation of a word of BP or DFUDS using o(n) additional bits (or is this possible at all?)
Thank you!