fast and parallel webpage layout
CS722 Advanced System Topics
![Page 1: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/1.jpg)
Fast and Parallel Webpage Layout
Leo A. Meyerovich, Rastislav Bodik
University of California, Berkeley
CPSC 722: Advanced Systems Seminar Presenter: Tian Pan
![Page 2: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/2.jpg)
Let's get started with a story… In June 2012, Facebook…
NYTimes: Facebook to rewrite their iOS app
BBC: Facebook recodes iOS mobile app to address speed complaints
Guardian: Facebook doubles iPhone app speed by dumping HTML5 for native code
…
![Page 3: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/3.jpg)
There are 85,000+ iPhone applications in the same situation: refactoring the existing UI or rewriting clients completely.
+ Downloaded over 2 billion times
- Cover less than 1% of online content
![Page 4: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/4.jpg)
So we still need a browser that supports an emerging and diverse class of mobile devices: a fast and parallel mobile browser.
However:
- CPU computational resources are limited.
- The power wall forces hardware architects to apply increases in transistor counts towards improving parallel performance, not sequential performance.
![Page 5: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/5.jpg)
Outline
1. Problem and background
2. Challenges
3. Solutions
4. Conclusion
![Page 6: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/6.jpg)
Data flow in a browser
![Page 7: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/7.jpg)
Lower bounds on CPU times for loading popular pages (Laptop)
Where are the bottlenecks in loading a page?
![Page 8: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/8.jpg)
Where are the bottlenecks in loading a page?
Layout matching and rendering accounts for 34% of the lower-bound CPU time (laptop).
![Page 9: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/9.jpg)
Layout matching and rendering (34%)
Input: HTML tree, CSS, fonts
Output: absolute element positions
![Page 10: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/10.jpg)
Layout matching and rendering steps
Categories:
I. Selector matching: step 1
II. Box and text layout: steps 2, 4, 5, 6
III. Glyph handling: step 3
IV. Painting or rendering: step 7
![Page 11: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/11.jpg)
Where are the bottlenecks in layout matching and rendering?
Challenges:
1. CSS selector matching
2. Box and text layout solving
3. Glyph rendering
3 < 2 < 1
![Page 12: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/12.jpg)
Outline
1. Problem and background
2. Challenges
3. Solutions
3.1. CSS selector matching
3.2. Box and text layout
3.3. Glyph rendering
4. Conclusion
![Page 13: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/13.jpg)
3.1 CSS Selector Matching
Match CSS rules with HTML nodes.
CSS rule: p img { margin: 10px; }
Selector: p img
Style constraints: { margin: 10px; }
DOM node with CSS rules: <p> <img blahblah></p>
![Page 14: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/14.jpg)
CSS: a list of selector {rules} entries
…id1 {r1}, …id2 {r2}, …class1 {r3}, …tag1 {r4}, …class2 {r5}, …class3 {r6}, …
id hash table (attribute -> rules): id1 -> r1, id2 -> r2, …
class hash table (attribute -> rules): class1 -> r3, class2 -> r5, class3 -> r6, …
tag hash table (attribute -> rules): tag1 -> r4, …
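The table-building step on this slide could be sketched roughly as follows. This is a minimal illustration, not the browser's actual code: the `RULES` list, the `build_tables` helper, and the choice to bucket each rule by its rightmost simple selector are assumptions made for the example; only the id/class/tag split and the names id1, class1, tag1, r1 to r6 come from the slide.

```python
from collections import defaultdict

# Hypothetical rule list: (selector, rule). '#' marks an id, '.' a class,
# a bare word a tag. The names mirror the slide; the syntax is ordinary CSS.
RULES = [
    ("#id1", "r1"), ("#id2", "r2"), (".class1", "r3"),
    ("tag1", "r4"), (".class2", "r5"), (".class3", "r6"),
]

def build_tables(rules):
    """Read the rule list once and bucket each rule by its rightmost simple selector."""
    id_hash = defaultdict(list)
    class_hash = defaultdict(list)
    tag_hash = defaultdict(list)
    for selector, rule in rules:
        key = selector.split()[-1]          # rightmost part of the selector
        if key.startswith("#"):
            id_hash[key[1:]].append(rule)
        elif key.startswith("."):
            class_hash[key[1:]].append(rule)
        else:
            tag_hash[key].append(rule)
    return dict(id_hash), dict(class_hash), dict(tag_hash)

id_hash, class_hash, tag_hash = build_tables(RULES)
```

After this single pass, looking up a node's id, classes, or tag is a hash probe rather than a scan over the whole stylesheet.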
![Page 15: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/15.jpg)
Hash tables (attribute -> rules)
id: id1 -> r1, id2 -> r2, …
class: class1 -> r3, class2 -> r5, class3 -> r6, …
tag: tag1 -> r4, …
HTML nodes (node: attributes)
n1: id2, class2, class3, tag1
n2: id1, tag1
n3: class1, …
Map (node/rule pairs emitted per hash table)
id: (n2, r1), (n1, r2), …
class: (n3, r3), (n1, r5), (n1, r6), …
tag: (n1, r4), (n3, r4)
Reduce (node: rules)
n1: r2, r5, r6, r4
n2: r1, r4
n3: r4, …
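The map and reduce steps on this slide might look like the sketch below. It assumes the three hash tables already exist and that every rule has a single id, class, or tag key; the `NODES` layout and the `map_table` and `reduce_pairs` helpers are hypothetical, and n3's tag is an assumption because the slide elides n3's remaining attributes. Each map call is independent of the others, which is what would let the tables be processed in parallel.

```python
from collections import defaultdict

ID_HASH = {"id1": ["r1"], "id2": ["r2"]}
CLASS_HASH = {"class1": ["r3"], "class2": ["r5"], "class3": ["r6"]}
TAG_HASH = {"tag1": ["r4"]}

# Node attributes follow the slide; n3's tag is assumed for illustration.
NODES = {
    "n1": {"id": "id2", "classes": ["class2", "class3"], "tag": "tag1"},
    "n2": {"id": "id1", "classes": [], "tag": "tag1"},
    "n3": {"id": None, "classes": ["class1"], "tag": "tag1"},
}

def map_table(table, attrs_of):
    """Map: probe one hash table with every node, emitting (node, rule) pairs."""
    pairs = []
    for name, node in NODES.items():
        for attr in attrs_of(node):
            for rule in table.get(attr, []):
                pairs.append((name, rule))
    return pairs

def reduce_pairs(*pair_lists):
    """Reduce: merge the per-table pairs into one rule list per node."""
    matched = defaultdict(list)
    for pairs in pair_lists:
        for name, rule in pairs:
            matched[name].append(rule)
    return dict(matched)

id_pairs = map_table(ID_HASH, lambda n: [n["id"]] if n["id"] else [])
class_pairs = map_table(CLASS_HASH, lambda n: n["classes"])
tag_pairs = map_table(TAG_HASH, lambda n: [n["tag"]])
matches = reduce_pairs(id_pairs, class_pairs, tag_pairs)
```

A real matcher would still have to verify the rest of each candidate selector; this sketch only covers the hash-probe stage shown on the slide.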
![Page 16: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/16.jpg)
Optimizations adopted by WebKit:
• Hashtables. [×] Check the CSS repeatedly for every node. [√] Read the CSS only once, build hash maps, and check the hashes.
• Right-to-left matching. Most selectors can be matched by examining only a short suffix of the path.
Other optimizations:
• Hash tiling. Partition the hashtable into idHash, classHash, tagHash, … to reduce cache misses. (This could also have been done in parallel.)
• Tokenization. Store attributes as integer tokens instead of strings to save cache space and comparison time.
• Random load balancing. Assign selector-matching work randomly instead of in the original sequential order.
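Right-to-left matching can be illustrated with a small sketch. It assumes selectors made only of tag names joined by the descendant combinator (a space), such as `p img`; the `Node` class and `matches_right_to_left` function are hypothetical names for this example, not WebKit's API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    tag: str
    parent: Optional["Node"] = None

def matches_right_to_left(node, selector):
    """Match a descendant selector such as 'p img', starting from its last part."""
    parts = selector.split()
    if node.tag != parts[-1]:
        return False                 # most nodes are rejected after one comparison
    remaining = parts[:-1]
    ancestor = node.parent
    # Walk up the tree, consuming selector parts from right to left.
    while remaining and ancestor is not None:
        if ancestor.tag == remaining[-1]:
            remaining.pop()
        ancestor = ancestor.parent
    return not remaining

body = Node("body")
p = Node("p", body)
img = Node("img", p)
```

Because the rightmost part is tested first, a node that cannot match is usually discarded without walking its ancestor path at all.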
![Page 17: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/17.jpg)
Other optimizations (continued):
• Result pre-allocation. Pre-allocate space for popular sites.
• Delayed set insertion. Preallocate a vector with a size of potential matches.
• Non-STL sets. Create the vector with a size of potential matches, add matches one by one, and do linear collision checks.
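The idea behind the non-STL set can be sketched as a fixed-size array with a linear duplicate check. This is only an illustration in Python (the original work targets C++ vectors); the class name `PreallocatedRuleSet` and its methods are assumptions made for the example.

```python
class PreallocatedRuleSet:
    """Set-like container backed by a preallocated array and linear collision checks."""

    def __init__(self, capacity):
        self._slots = [None] * capacity   # allocated once, sized to the potential matches
        self._count = 0

    def add(self, rule):
        """Insert rule unless it is already present; return True if it was inserted."""
        for i in range(self._count):      # linear collision check
            if self._slots[i] == rule:
                return False
        if self._count == len(self._slots):
            raise OverflowError("capacity underestimated")
        self._slots[self._count] = rule
        self._count += 1
        return True

    def items(self):
        return self._slots[:self._count]

rules = PreallocatedRuleSet(capacity=4)
inserted = [rules.add(r) for r in ["r2", "r5", "r2", "r4"]]
```

For the small match sets typical of a single node, a linear scan over contiguous memory can be cheaper than a general-purpose set, which is presumably the motivation on the slide.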
![Page 18: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/18.jpg)
3.1 CSS Selector Matching: Evaluation
Cilk++: overall speedup of 13x with Gmail and 14.8x without Gmail
Intel TBB: overall speedup of 55.2x with Gmail and 64.8x without Gmail
Workstation: 204 ms -> 3.5 ms
Handheld: 3000 ms -> 50 ms
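As a quick check on the before and after times quoted on this slide, the implied speedups can be computed directly. The `speedup` helper is a hypothetical name; the slide does not state how these times relate to the Cilk++ and TBB figures, so the ratios below are only the arithmetic on the quoted numbers.

```python
def speedup(before_ms, after_ms):
    """Ratio of the original running time to the optimized running time."""
    return before_ms / after_ms

workstation = speedup(204, 3.5)    # times quoted on the slide
handheld = speedup(3000, 50)

print(f"workstation: {workstation:.1f}x, handheld: {handheld:.1f}x")
# prints "workstation: 58.3x, handheld: 60.0x"
```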
![Page 19: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/19.jpg)
3.2 Box and text layout
Input: HTML tree nodes with symbolic constraint attributes
Output: actual layout details (size, shape, position) waiting to be painted into pixels
[Figure: layout constraints input and layout constraints output]
![Page 20: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/20.jpg)
Unfortunately, layout is hard to optimize, because CSS:
• Is informally written and cross-cutting, e.g. infinite loops
• Is confusing for webpage designers
• Needs standards-compliant engines
![Page 21: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/21.jpg)
Berkeley Style Sheets (BSS)
A new, more orthogonal, concise, well-defined intermediate layout language
• Transformed from CSS
• Specified with an attribute grammar (chances for parallelization)
• BSS0 (vertical and horizontal boxes), BSS1 (BSS0 + shrink-to-fit sizing), BSS2 (BSS1 + left floats)
![Page 22: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/22.jpg)
BSS0 (vertical and horizontal boxes)
![Page 23: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/23.jpg)
Attribute Grammars: potential for parallelization
Legend: attr = attribute, Iattr = inherited attribute, S = synthesized attribute
[Figure: a tree of nodes n1 to n7 with attributes attrA to attrG; calcInherited() propagates inherited attributes (IattrA, IattrB) down the tree, and calcSynthesized() computes synthesized attributes (S1 to S9) back up the tree]
O(log|tree|)
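The two passes named on this slide can be sketched on a toy layout tree. This is an assumption-laden illustration rather than the BSS grammar itself: it models only vertically stacked boxes, treats width as the inherited attribute and height as the synthesized one, and uses hypothetical names (`Box`, `calc_inherited`, `calc_synthesized`).

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Box:
    own_height: int = 0                       # intrinsic height of a leaf (assumed input)
    children: List["Box"] = field(default_factory=list)
    width: int = 0                            # inherited attribute, set top-down
    height: int = 0                           # synthesized attribute, set bottom-up

def calc_inherited(box, available_width):
    """Top-down pass: each child inherits the width of its parent."""
    box.width = available_width
    for child in box.children:                # sibling subtrees do not depend on each other
        calc_inherited(child, box.width)

def calc_synthesized(box):
    """Bottom-up pass: a box's height is its own height plus its children's heights."""
    for child in box.children:                # sibling subtrees do not depend on each other
        calc_synthesized(child)
    box.height = box.own_height + sum(c.height for c in box.children)

inner = Box(children=[Box(own_height=5), Box(own_height=7)])
root = Box(children=[Box(own_height=10), inner])
calc_inherited(root, 800)
calc_synthesized(root)
```

Within each pass the recursive calls on sibling subtrees touch disjoint data, so they could be run as parallel tasks; the sequential loops here are only for simplicity.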
![Page 24: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/24.jpg)
3.2 Layout Constraint Solving Evaluation
Slashdot.org, BSS1, Cilk++: 3x~4x
![Page 25: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/25.jpg)
3.3 Glyph Rendering
So far, the size and position of the text have been calculated. How is this text rendered?
requests -> request groups -> pull and render
Parallel and locality benefits
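The request, group, and render pipeline might be sketched as below. The slide does not say how requests are grouped, so grouping identical (font, size, character) requests is an assumption, and `render_glyph` is a hypothetical stand-in for a real rasterizer call. Python threads are used only to show the structure.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def render_glyph(font, size, char):
    """Hypothetical stand-in for a real rasterizer call."""
    return f"<bitmap {font} {size}pt {char!r}>"

def render_text(requests, workers=4):
    """Group identical glyph requests, render each group once in parallel, then fan out."""
    groups = defaultdict(list)                # (font, size, char) -> request positions
    for index, key in enumerate(requests):
        groups[key].append(index)

    keys = list(groups)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        bitmaps = list(pool.map(lambda k: render_glyph(*k), keys))

    output = [None] * len(requests)
    for key, bitmap in zip(keys, bitmaps):    # pull each result back to its requests
        for index in groups[key]:
            output[index] = bitmap
    return output, len(keys)

requests = [("Arial", 12, "a"), ("Arial", 12, "b"), ("Arial", 12, "a")]
output, unique_glyphs = render_text(requests)
```

Grouping means each distinct glyph is rasterized once however often it appears, and the groups are independent units of work that can be handed to separate workers.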
![Page 26: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/26.jpg)
3.3 Glyph Rendering: Evaluation
FreeType2 font library, TBB: 3x~4x
![Page 27: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/27.jpg)
4 Conclusion
Address three bottlenecks of loading a page:
1. CSS selector matching
• Pre-built hash tables, map-reduce
2. Box and text layout solving
• Specify layout as attribute grammars
3. Glyph rendering
• Combine requests into groups and render in parallel
Milestone in building a parallel and mobile browser
![Page 28: Fast and Parallel Webpage Layout](https://reader036.vdocuments.site/reader036/viewer/2022081400/554f66f8b4c905c8088b4e37/html5/thumbnails/28.jpg)
Thanks~