Lets Impl Sbv
echizen_tm Dec. 10, 2011
http://d.hatena.ne.jp/echizen_tm/20111210/1323541165
ID: echizen_tm
Blog: EchizenBlog-Zwei (http://d.hatena.ne.jp/echizen_tm/)
web ()
(1/3) Succinct data structures (Succinct Data Structure): compact data representations that still support operations in O(1) or O(log N) time
(2/3)
mozc (Google) (WEB+DB PRESS Vol.64): mozc uses LOUDS
Sedue (PFI)
(3/3)
LOUDS() LOUDS
LOUDS
(1/7) (Succinct Bit Vector)
A bit vector that supports rank/select in O(1) or O(log N) time. An N-bit vector is stored in N + o(N) bits. Used by LOUDS.
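(2/7) rank(i): the number of 1 bits before position i (positions are 0-origin)
Example for the bits (00110010), positions 0 to 7 counted from the rightmost bit:
rank(0) = 0, rank(1) = 0, rank(2) = 1, rank(3) = 1, rank(4) = 1, rank(5) = 2, rank(6) = 3, rank(7) = 3, rank(8) = 3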
(3/7) rank example
char v[] = {0, a, 0, 0, n, x, 0, 0}: 8 bytes
char w[] = {a, n, x} and uint8_t b = 0x32 (0011 0010): (3 + 1) bytes
get(w, b, 1) = w[rank(b, 1)] = w[0] = a
get(w, b, 4) = w[rank(b, 4)] = w[1] = n
get(w, b, 5) = w[rank(b, 5)] = w[2] = x
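The lookup above can be sketched in C++ as follows. This is a minimal illustration, not code from any of the libraries discussed: the names rank8 and sparse_get are hypothetical, rank is computed with a plain loop, and position 0 is assumed to be the least significant bit of b, which is what the 0x32 example implies.

```cpp
#include <cassert>
#include <cstdint>

// rank8(b, i): number of 1 bits of b in positions [0, i).
// Position 0 is the least significant bit. Valid for 0 <= i <= 8.
inline unsigned rank8(std::uint8_t b, unsigned i) {
    assert(i <= 8);
    unsigned count = 0;
    for (unsigned pos = 0; pos < i; ++pos) {
        count += (b >> pos) & 1u;
    }
    return count;
}

// sparse_get(w, b, i): value at logical index i of the sparse array.
// w holds only the non-empty slots; bit i of b says whether slot i is used.
// Returns 0 for an empty slot (an assumption made for this sketch).
inline char sparse_get(const char* w, std::uint8_t b, unsigned i) {
    assert(i < 8);
    if (((b >> i) & 1u) == 0) {
        return 0;
    }
    return w[rank8(b, i)];
}
```

With w = {a, n, x} and b = 0x32, sparse_get(w, b, 4) reads w[rank8(b, 4)], which is w[1].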
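(4/7) select(i): the position of the 1 bit with index i (indices are 0-origin)
Example for the bits (00110010), positions 0 to 7 counted from the rightmost bit:
select(0) = 1, select(1) = 4, select(2) = 5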
(5/7) select example
char s[] = appleorangeremon and char p[] = {0, 5, 11}: (16 + 3) bytes
char s[] = appleorangeremon and uint16_t b = 0x0821 (0000 1000 0010 0001): (16 + 2) bytes
get(s, b, 0) = select(b, 0) = 0
get(s, b, 1) = select(b, 1) = 5
get(s, b, 2) = select(b, 2) = 11
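A possible C++ sketch of this use of select is shown below. The names select16 and word_at are hypothetical, select is a plain loop rather than an optimized implementation, and word_at goes one step beyond the slide by cutting out the word between two consecutive start offsets.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// select16(b, i): position of the 1 bit with 0-origin index i,
// counting positions from the least significant bit.
// Returns -1 when b has fewer than i + 1 one bits.
inline int select16(std::uint16_t b, unsigned i) {
    for (int pos = 0; pos < 16; ++pos) {
        if ((b >> pos) & 1u) {
            if (i == 0) {
                return pos;
            }
            --i;
        }
    }
    return -1;
}

// word_at(s, b, i): the i-th word of s, where each 1 bit of b marks
// the offset at which a word starts. Returns "" if there is no such word.
inline std::string word_at(const std::string& s, std::uint16_t b, unsigned i) {
    assert(s.size() <= 16);
    const int begin = select16(b, i);
    if (begin < 0) {
        return std::string();
    }
    int end = select16(b, i + 1);
    if (end < 0) {
        end = static_cast<int>(s.size());  // last word runs to the end of s
    }
    return s.substr(static_cast<std::size_t>(begin),
                    static_cast<std::size_t>(end - begin));
}
```

For s = "appleorangeremon" and b = 0x0821, the start offsets are 0, 5 and 11, so word_at(s, b, 1) covers offsets 5 to 10.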
(6/7) Relationship between rank and select
rank(select(i)) = i:
rank(select(0)) = rank(1) = 0
rank(select(1)) = rank(4) = 1
rank(select(2)) = rank(5) = 2
rank undoes select.
(7/7) (sparse)(dense)
501100011, 01100100
501000010, 11101101(00010010)012 01 5
Linux / 2.27GHz 2core / 24GB, rank/select 1000

(5000bit / 1bit)
name         rank(sec)  select(sec)  size(byte)
ux-trie      18.4       21.1         14,062,520
rx-trie      19.1       20.1         14,597,160
marisa-trie  18.7       18.8         15,234,440

(1000bit / 1bit)
name         rank(sec)  select(sec)  size(byte)
ux-trie      19.0       21.8         14,062,520
rx-trie      18.5       20.4         14,597,160
marisa-trie  18.7       19.7         14,921,944
(1/25) (ux,rx,marisa)
rank and select in 3 implementations
(2/25)
ux, marisa: C++
rx: C
ux, marisa: uint8_t, uint32_t, uint64_t (stdint.h)
rx: char, int
C++ (C/C++), uintXX_t
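(3/25) rank: implemented as popcount + a table of rank values
popcount counts the 1 bits in a uint32_t or uint64_t. The rank value is stored for every B bits.
When i is a multiple of B, rank(i) is found in O(1).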
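(4/25) Example (B=8): popcount per block and the resulting rank
block     popcount  rank
11001101  5         5
11000100  3         8
10000111  4         12
01100011  4         16
When i is a multiple of B, rank(i) is found in O(1): rank(0) = 0, rank(8) = 5, rank(16) = 8, ...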
/B
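(5/25) When i is not a multiple of B, rank(i) is still O(1)
rank(21) over 11001101 11000100 10000111:
Bits 0 to 15: the rank comes from the table in O(1): 8
Bits 16 to 23: popcount on the 3rd block, 10000111, after shifting out the bits that are not needed (00111000): 3
rank(21) = 8 + 3 = 11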
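The table lookup plus popcount described above can be sketched as follows for B = 8. This is an illustration under assumptions, not the code of ux, rx or marisa: BlockRank and popcount8 are hypothetical names, each block is one byte with position 0 at its least significant bit, and R[k] holds the number of 1 bits in blocks 0 to k, matching the rank column of the B=8 example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Count the 1 bits of one 8-bit block by clearing the lowest set bit
// until nothing is left.
inline unsigned popcount8(std::uint8_t x) {
    unsigned n = 0;
    for (; x != 0; x = static_cast<std::uint8_t>(x & (x - 1))) {
        ++n;
    }
    return n;
}

struct BlockRank {
    std::vector<std::uint8_t> V;   // bit vector, one block per byte (B = 8)
    std::vector<std::uint32_t> R;  // R[k] = number of 1 bits in blocks 0..k

    explicit BlockRank(std::vector<std::uint8_t> blocks) : V(std::move(blocks)) {
        std::uint32_t total = 0;
        for (std::uint8_t block : V) {
            total += popcount8(block);
            R.push_back(total);
        }
    }

    // Number of 1 bits in positions [0, i).
    std::uint32_t rank(std::size_t i) const {
        assert(i <= V.size() * 8);
        const std::size_t block = i / 8;
        const unsigned offset = static_cast<unsigned>(i % 8);
        const std::uint32_t before = (block == 0) ? 0 : R[block - 1];
        if (offset == 0) {
            return before;  // i is a multiple of B: table lookup only
        }
        // Shift out the bits at position offset and above, count the rest.
        const std::uint8_t low =
            static_cast<std::uint8_t>(V[block] << (8 - offset));
        return before + popcount8(low);
    }
};
```

With the blocks 11001101 11000100 10000111 01100011 (0xCD, 0xC4, 0x87, 0x63), rank(21) takes 8 from the table and 3 from the shifted third block.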
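(6/25) rank with block size B
B is the block size, V is the bit vector and R is the table of rank values, one per B bits.
rank(i) = R[i / B - 1] + popcount(the low i % B bits of V[i / B])
popcount of 11001101, first step: the shifted value 0 1 0 0 0 1 0 0 is added to give 10 00 10 01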
(18/25) Step for 2-bit fields
(1) 10 00 10 01 & 0x33 (00110011) = 00 00 00 01
(2) 10 00 10 01 & 0xCC (11001100) = 10 00 10 00
Shift (2) right by 2 and add (1):
10 00 10 00 >> 2 = 00 10 00 10
00 10 00 10 + 00 00 00 01 = 0010 0011
(19/25) Step for 4-bit fields
(1) 0010 0011 & 0x0F (00001111) = 0000 0011
(2) 0010 0011 & 0xF0 (11110000) = 0010 0000
Shift (2) right by 4 and add (1):
0010 0000 >> 4 = 0000 0010
0000 0010 + 0000 0011 = 00000101
The number of 1 bits in (11001101) is 5 (= 00000101)
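The masking and shifting steps above can be written out for a single byte as in the sketch below. The function name popcount8_steps is hypothetical, and the 1-bit masks 0x55 and 0xAA are an assumption made by analogy with the 0x33/0xCC and 0x0F/0xF0 masks shown on the slides.

```cpp
#include <cstdint>

// Divide-and-conquer popcount for one byte.
// Each step adds neighbouring fields, doubling the field width.
inline unsigned popcount8_steps(std::uint8_t x) {
    unsigned v = x;
    // 1-bit fields -> 2-bit fields (masks assumed: 01010101 / 10101010)
    v = (v & 0x55u) + ((v & 0xAAu) >> 1);
    // 2-bit fields -> 4-bit fields (masks 00110011 / 11001100)
    v = (v & 0x33u) + ((v & 0xCCu) >> 2);
    // 4-bit fields -> one 8-bit field (masks 00001111 / 11110000)
    v = (v & 0x0Fu) + ((v & 0xF0u) >> 4);
    return v;
}
```

For 11001101 (0xCD) the intermediate values are 10 00 10 01, then 0010 0011, then 00000101, the same sequence as on the slides.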
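(20/25) popcount and rank
The intermediate results of popcount are the counts of 1 bits per field:
(3) 1 1 0 0 1 1 0 1: 1 1 0 0 1 1 0 1 (1-bit fields)
(2) 10 00 10 01: 2 0 2 1 (2-bit fields)
(1) 0010 0011: 2 3 (4-bit fields)
00000101: 5 (8-bit field)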
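(21/25) select(1) on 11001101
(1) Decide between the upper 4 bits and the lower 4 bits
(1) 0010 0011: 2 3 (4-bit fields)
select(1) looks for the 1 bit with index 1 (0-origin). By (1), the lower 4 bits hold 3 one bits (indices 0, 1, 2; indices 3, 4 are in the upper 4 bits), so select(1) lies in the lower 4 bits, positions 0 to 3.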
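(22/25) (2) Decide between bits 0-1 and bits 2-3
(2) 10 00 10 01: 2 0 2 1 (2-bit fields)
select(1) looks for the 1 bit with index 1 (0-origin). By (2), bits 0-1 hold 1 one bit and bits 2-3 hold 2 (index 0 is in bits 0-1; indices 1, 2 are in bits 2-3), so select(1) lies in bits 2-3.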
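(23/25) (3) Decide between bit 2 and bit 3
(3) 1 1 0 0 1 1 0 1 (1-bit fields)
select(1) looks for the 1 bit with index 1 (0-origin). By (2), index 0 is in bits 0-1, and by (3), bit 2 is 1, so select(1) = 2.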
(24/25) Search the popcount intermediate results in 2^n-bit fields
select(1):
(1) 0010 0011: 2 3 (4-bit fields)
(2) 10 00 10 01: 2 0 2 1 (2-bit fields)
(3) 1 1 0 0 1 1 0 1: 1 1 0 0 1 1 0 1 (1-bit fields)
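The three decisions (1), (2), (3) can be sketched for one byte as below. The name select8 is hypothetical and the code is only an illustration of the idea, assuming position 0 is the least significant bit and that the caller asks for a 1 bit that exists.

```cpp
#include <cassert>
#include <cstdint>

// select8(x, i): position of the 1 bit with 0-origin index i in x.
// Precondition: x contains more than i one bits.
inline unsigned select8(std::uint8_t x, unsigned i) {
    // Intermediate results of the divide-and-conquer popcount.
    const unsigned c2 = (x & 0x55u) + ((x & 0xAAu) >> 1);    // counts per 2-bit field
    const unsigned c4 = (c2 & 0x33u) + ((c2 & 0xCCu) >> 2);  // counts per 4-bit field
    assert(i < (c4 & 0x0Fu) + (c4 >> 4));

    unsigned pos = 0;
    // (1) lower 4 bits or upper 4 bits?
    unsigned n = c4 & 0x0Fu;  // number of 1 bits in positions 0-3
    if (i >= n) {
        i -= n;
        pos += 4;
    }
    // (2) lower 2 bits or upper 2 bits of the chosen half?
    n = (c2 >> pos) & 0x03u;
    if (i >= n) {
        i -= n;
        pos += 2;
    }
    // (3) lower bit or upper bit of the chosen pair?
    n = (x >> pos) & 0x01u;
    if (i >= n) {
        pos += 1;
    }
    return pos;
}
```

For x = 11001101 and i = 1, step (1) stays in the lower half, step (2) moves to bits 2-3, and step (3) stops at bit 2.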
(25/25) select with larger blocks: B=64 uses popcount; B>64 is handled 64 bits at a time
B=51264 B=512
popcount*8rank popcount (12/1000) (select) 8
marisa-trieselect
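Cost of rank and select with B=512:
rank: rank table lookup (O(1)) + popcount (O(B/64) = O(1))
select: binary search over the rank table (O(log(N/B)) = O(logN)) + scan of the 64-bit words in the block (O(B/64) = O(1)) + search in the popcount results (O(log64) = O(1))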
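The structure of select (a binary search over the rank table followed by a search inside one block) can be sketched as follows. To keep it short this sketch assumes B = 64, so the scan over the 64-bit words of a 512-bit block is skipped, and a plain loop stands in for the popcount-based search inside the word. SelectIndex and its members are hypothetical names, not the API of any library mentioned here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct SelectIndex {
    std::vector<std::uint64_t> V;  // blocks; position 0 is the LSB of V[0]
    std::vector<std::uint64_t> R;  // R[k] = number of 1 bits in blocks 0..k

    explicit SelectIndex(const std::vector<std::uint64_t>& blocks) : V(blocks) {
        std::uint64_t total = 0;
        for (std::uint64_t w : V) {
            for (; w != 0; w &= w - 1) {  // clear the lowest set bit
                ++total;
            }
            R.push_back(total);
        }
    }

    // Position of the 1 bit with 0-origin index i.
    // Precondition: the vector holds more than i one bits.
    std::size_t select(std::uint64_t i) const {
        assert(!R.empty() && i < R.back());
        // Binary search: first block whose cumulative count exceeds i.
        const std::size_t k = static_cast<std::size_t>(
            std::upper_bound(R.begin(), R.end(), i) - R.begin());
        std::uint64_t remaining = i - (k == 0 ? 0 : R[k - 1]);
        // Inside the block: drop the lower 1 bits that come before the target.
        std::uint64_t w = V[k];
        for (; remaining > 0; --remaining) {
            w &= w - 1;
        }
        // The lowest remaining 1 bit is the answer.
        std::size_t pos = 0;
        while (((w >> pos) & 1u) == 0) {
            ++pos;
        }
        return k * 64 + pos;
    }
};
```

The binary search is the O(log(N/B)) term; everything after it touches a single block, which is why the remaining work is bounded by a constant for a fixed B.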
Sedue (http://preferred.jp/sedue.html)
mozc (http://code.google.com/p/mozc/)
ux-trie (http://code.google.com/p/ux-trie)
marisa-trie (http://code.google.com/p/marisa-trie)
revision: mozc: r73, ux-trie: r42, marisa-trie: r83