are we fast yet? html & javascript performance - utahjs

Are we fast yet? JavaScript & HTML performance Trevor Linton - July 2014

Upload: trevor-linton

Post on 15-Jan-2015


DESCRIPTION

Presentation to UtahJS on webkit.js and HTML/Javascript performance.

TRANSCRIPT

Page 1: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Are we fast yet? JavaScript & HTML performance

Trevor Linton - July 2014

Page 2: Are We Fast Yet? HTML & Javascript Performance - UtahJS

JavaScript? Sure.

•  Firefox has asm.js (A subset of JavaScript)

•  Chrome has V8

•  Safari now has LLVM optimizations

[Chart: C++/Clang vs. Firefox asm.js]

In some measurements we’re butting up against native C++.

Page 3: Are We Fast Yet? HTML & Javascript Performance - UtahJS

JS has road blocks in HTML though.

JIT Begins optimizing.

STOP, unknown what this function may do or return.

Recalculate Layouts

STOP, Waiting for return value from renderer

STOP, Events have unknown values, cannot pre-optimize

Mouse click from user

Renderer kicks off JS

HTML Renderer | JS compiler | Code

function() {
  ....
  var el = document.getElementById('id')
  ....
  var bounds = el.getBoundingClientRect()
  ....
}
addEventListener('click', function(e) { ... });

Page 4: Are We Fast Yet? HTML & Javascript Performance - UtahJS

An experiment to overcome this

Re-implement HTML5 rendering so that it is JavaScript-based.

Page 5: Are We Fast Yet? HTML & Javascript Performance - UtahJS

An experiment to overcome this

•  Re-implement HTML5 rendering in JavaScript.

•  JS can fully JIT through any DOM operation and optimize.

•  The JS optimizer can anticipate inputs from C++ in sync/async events.

•  Using ASM.js we can get near C++ runtime speeds.

Original C++ WebKit code (WebCore, actually)

Using LLVM/Clang and Emscripten, compile it down to JavaScript.

webkit.js

Page 6: Are We Fast Yet? HTML & Javascript Performance - UtahJS

webkit.js speed results (x=iter.)

•  Rendering becomes substantially faster after progressive runs.

•  Rendering speed on par with native speeds.

•  Firefox faster due to built-in 1:1 ASM.js optimizations.

•  DEMO: http://trevorlinton.github.io

[Chart: webkit.js speed results. Horizontal axis: iteration (1 to 8). Vertical axis: 0 to 1.2 (unit not given). Series: webkit.js in Chrome 35, webkit.js in Firefox 30, Chrome 35, Firefox 30.]

Page 7: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Getting over the DOM fence...

•  Continue building a JS based HTML renderer.

•  Firefox, Chromium and others are working on pulling more of the DOM into native JS.

•  Proposal is out to recreate CSS styles in JS for Chromium.

•  Firefox is already getting close to this...

Page 8: Are We Fast Yet? HTML & Javascript Performance - UtahJS

But WebKit is complex...

CSS Animation

Rendering

Hardware Compositing

Other Things… (Layout, Network, Parsing, DOM, CSS, Javascript)

Compositing, Painting, Drawing and Rendering

ChromeClient (implemented as ChromeClient***.cpp)

AcceleratedCompositor (GraphicsLayerClient)

GraphicsLayer (TextureMapperLayer)

WebView

TextureMapperGL (TextureMapper / GraphicsLayerClient)

ContextGL, OpenGLES V2 (platform specific, accelerated)

CREATES BUT DOES NOT MANAGE

CREATES CHROMECLIENTJS AND HANDS OFF ACCELERATED COMPOSITOR ONCE CREATED

ChromeClientJS executes on AcceleratedContext: setRootGraphicsLayer, enabled?(), scheduleLayerFlush, resizeRootLayer

Chrome class: a proxy for the ChromeClient interface, passed into Frame

When a graphics layer is created it sends attachRootGraphicsLayer to ChromeClientJS; in addition, it will execute WidgetSizeChanged (or WebView may), setNeedsOneShotDrawingSynchronization, scheduleCompositingLayerFlush and scheduleAnimation. These are all passed through to the AcceleratedCompositor on behalf of WebKit. Chrome/WebKit/WebCore will only do this if accelerated compositing is turned on by settings and ACCELERATED_COMPOSITING=1 && TEXTURE_MAPPER_GL=1 && TEXTURE_MAPPER=1 in compiler settings.

DEVICE SCALE FACTOR, PAGE SIZE, ETC. Executes setDeviceScaleFactor(float), usually 2 in webkit.js for HiDPI rendering. Also sets the viewport size to set the size of the view. This causes the frame, in both accelerated and non-accelerated mode, to produce a bitmap image twice the size when bit-blitting. However, all coordinates are still in logical pixels.

The “layout black box”. This is where the magic happens; we are informed of results through the ChromeClient, executed by Chrome.

Sends the created layers and root layers to the texture mapper, which tiles them and uploads them to GL for display. These manage scrolling, memory use and other things for us, so we don’t just haphazardly create 20,000 different compositing layers, textures, etc.

paintContents on ChromeClientJS actually draws contents to the TextureMapperLayer as the TextureMapperGL interface needs them; it requests these through paintContents.

GraphicsLayer (CREATION)

HostWindow or IPC channel for WebKit2/Chrome

GraphicsContext3D (GraphicsContext)

TextureMapper sends paintContents to AcceleratedCompositor, which in turn manages clearing OpenGL and maintaining buffers/contexts.

Clears buffers, makes current context.

Chrome/WebKit pushes a graphics layer to ChromeClient that is created by GraphicsLayerFactory

GraphicsLayerFactory

Created by taking the ChromeClientJS that’s held by Frame, or the global default constructor, to create all RenderLayers and GraphicsLayers

Note, on some platforms this is part of ChromeClient

WebView creates the chrome client that is platform specific; it’s sent to WebCore::Frame and a copy is retained for WebView. WebCore::Frame, WebCore::FrameView, WebCore::Page and a whole host of other classes run methods on the chrome client when specific work needs to be done.

Inform each other of size changes, when graphics layers need to be flushed, and a whole host of other things to sync states.

Pushes textures that are tiled or full as “composited” layers to GL.

Used for special transforms or accelerated scaling.

Painting / Drawing

cairo (or other drawing library: skia, CoreAnimation, etc.)

pixman for fast patched drawing optimizations

Image, ImageBuffer: libjpeg-turbo, libpng

(note gif and bmp are built in to WebCore)

zlib (decompressing pngs)

FreeType, FontConfig: used for font parsing and layout

GraphicsContext (library/platform specific)

WebCore::Frame WebCore::Page

ChromeClient (created by WebView, passed to Frame/Page for WebCore to use)

There’s also coordinated graphics and tiling.

Platform Blit Surface (non-accelerated)

Software Compositing

TextureMapper

GraphicsLayer

When attachRootGraphicsLayer is executed by Chrome, the GraphicsLayer is passed into the accelerated compositor. The compositor is checked to see if it’s enabled; if not, compositing is turned off, and if so, compositing is turned on.

Non-accelerated, non-composited, bit-blt path.

Composited, but not accelerated path (not bit-blted)

WebView creates a device GL and EGL (OpenGL ES v2) context via SDL. This context in WebKit is globally available once created. It then creates the AcceleratedCompositor and does nothing else than hand it to ChromeClientJS. It also makes these the current context and sets the device viewport size (not the GL context size). ContextGL and ContextEGL are hacked to pass specific params to Emscripten to create the right compatible surface; these hacks are wrapped in PLATFORM(JS) preprocessor guards.

RenderLayerCompositor

RenderLayer

Accelerated, but not composited bit-blt path.

Composited and accelerated path

Compiled Vertexes & Shaders

Classes compile layout commands into OpenGL Vertex & Shader Program

WebCore::FrameView

WebCore::Document

Video Codecs

GraphicsLayerTextureMapper.cpp

GraphicsLayer::create: factory ? factory->create : GraphicsLayerTextureMapper()

ChromeClient->graphicsLayerFactory() (GraphicsLayerFactory passed through from ChromeClient->factory(); if none exists, use the default TextureMapper implementation.)

RenderLayerBacking

Plugins

Layout and painting produce a render tree that is managed by a host of classes. The RenderLayers and RenderLayerTree communicate with the RenderLayerCompositor to determine the GraphicsLayers that are then passed on through the RenderLayerBacking.

glBindTexture() / Canvas / SDL / GLUT / XWindow / DWM / NSOpenGL / etc..

AnimationController

AnimationBase

AnimationControllerPrivate

Document

New

StartWaitTimer

StartWaitStyleAvailable

StartWaitResponse

Looping

Ending

PausedNew

PausedWaitTimer

PausedWaitResponse

PausedWaitStyleAvailable

PausedRun

Done

FillingForwards

Animation state, viewed as a state machine with enum m_animState

Knows about, and fires, AnimationController methods as states change.

Element

Knows about and executes style recalculations on documents and elements. However, it does not actually change the style’s value, just whether it should recalculate and potentially layout/render.

Document::updateStyleIfNeeded

Element::setNeedsStyleRecalc

CompositeAnimation

RenderElement

Knows about and interacts with AnimationBase, unclear why.

WaitingAnimationSet (An array of AnimationBase)

! Seems to be a list of animations (AnimationBase classes) waiting to be animated; their state is stored in AnimationBase and could potentially become out of sync by being in an array that’s technically not waiting.

RenderStyle

! AnimationController has two paths based on whether request animation frame is enabled or not. In addition, there is a request animation frame timing feature that further branches into a new path, which makes the implementation flow confusing.

Performs most of its work in AnimationControllerPrivate as a proxy; this seems unnecessary and it is unclear why.

! Performs separate paths for compositing animations, this makes for confusing bugs.

AnimationUpdateBlock (implemented in AnimationController.h)

! Issues beginAnimationUpdate or endAnimationUpdate simply through its constructor/destructor; very unclear why, and it seems to pollute the paths.

animation() contains one controller per frame.

! Has a circular dependency with AnimationController, unclear why.

! Runs on a one-shot timer, unclear why.

! Has a circular dependency with AnimationBase, unclear why.

! Implementation hides “AnimationControllerPrivate” rather than implementing AnimationController. Unclear why.

Creates an animation update block on the stack, letting the constructor/destructor fire begin/end calls to AnimationController. Gets the AnimationController from frame.animation().

Uses the frame reference only to get access to the frame view class to execute flushCompositingStateIncludingSubframes and other flush compositing state classes.

Combined with RenderLayerCompositor these do the actual changes to the styles and are called by AnimationController, AnimationBase and Frame/Element.

KeyframeAnimation

KNOWN DESIGN ISSUES:

This system has a race condition if the compositor is flushed or invalidated too quickly (e.g., the chrome client calls scheduleLayerFlush on AcceleratedContext.cpp): the animation base’s timer (within AnimationController) fails to remove waiting animations that have already completed from the WaitingAnimationSet. Because the AnimationController has no chance to remove these on its next timer run between the AcceleratedContext’s scheduled layer flushes, items within WaitingAnimationSet are thought to be “Waiting” for an animation but have an m_animState (on AnimationBase) of Ending, Done or other. In other words, the AnimationController thinks that animations that have completed are still waiting for their style because the accelerated compositor is plowing through them too quickly.

The cure for this is to simply think of requests from ChromeClient to flush, invalidate or paint as “suggestions” and prevent them from executing more often than once every 1/60th of a second. In addition, do not allow more than one flush to be issued at a time (e.g., two timers on separate threads running a flush concurrently).
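The “suggestion” idea can be sketched as a small gate around the flush call. This is a JavaScript illustration only, not the WebKit C++ code: createFlushGate, requestFlush and the injected now clock are hypothetical names, and dropping an early request (rather than deferring it) is an assumption made to keep the sketch short.

```javascript
// Hypothetical helper: treats flush requests as suggestions.
// `flush` does the real work; `now` returns the current time in milliseconds.
function createFlushGate(flush, now) {
  const MIN_INTERVAL_MS = 1000 / 60; // at most one flush per 1/60th of a second
  let lastFlush = -Infinity;
  let flushing = false;

  return function requestFlush() {
    const t = now();
    if (flushing || t - lastFlush < MIN_INTERVAL_MS) {
      return false; // dropped: too soon, or a flush is already in progress
    }
    flushing = true;
    lastFlush = t;
    try {
      flush();
    } finally {
      flushing = false; // always release the gate, even if flush throws
    }
    return true;
  };
}
```

A fuller version would probably re-schedule a dropped request for the next frame instead of discarding it.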

flushPendingLayerChanges

flushCompositingStateIncludingSubframes

Page 9: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Academic exercises aside...

What can we do now?

FIRST REMEMBER:

•  There’s a difference between perceived performance and actual performance (e.g., is your event just firing late?)

•  Be careful when optimizing your code; it’s a rabbit hole and sometimes a pitfall (80/20 rule).

Page 10: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

Avoid interacting with the DOM with these patterns:

•  Changing a DOM parameter (adding, modifying, removing elements) then reading from another. This requires a layout validation / invalidation since the renderer has no idea if the change you made could potentially cause a change to the value you’re trying to read!
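A minimal sketch of the safer pattern: do all the reads first, then all the writes. halveWidths is a hypothetical helper, and els is assumed to be an array of elements (anything exposing offsetWidth and style will do).

```javascript
// Hypothetical helper: `els` is an array of objects with `offsetWidth` and `style`.
function halveWidths(els) {
  // Phase 1: reads only. Nothing has been written yet, so layout stays valid.
  const widths = els.map(function (el) { return el.offsetWidth; });

  // Phase 2: writes only. Layout is invalidated, but nothing reads it back here.
  els.forEach(function (el, i) {
    el.style.width = (widths[i] / 2) + 'px';
  });
}
```

Interleaving the two phases (write one element, then read the next) is the pattern the slide warns about.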

Page 11: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

Avoid incremental changes to DOM if you can batch them together:

For instance, if you need to create HTML elements in JavaScript, using innerHTML is faster than using document.createElement, that is if you’re creating nested or more than one element.
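A sketch of batching under these assumptions: the markup for a whole list is built as one string and written with a single innerHTML assignment. escapeHtml, buildListHtml and renderList are hypothetical helpers, and container is assumed to be an element.

```javascript
// Escape text so it can be placed safely inside HTML markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: build the markup for the whole list as one string.
function buildListHtml(items) {
  const rows = items.map(function (item) {
    return '<li>' + escapeHtml(item) + '</li>';
  });
  return '<ul>' + rows.join('') + '</ul>';
}

// One DOM write instead of one createElement/appendChild pair per item.
function renderList(container, items) {
  container.innerHTML = buildListHtml(items);
}
```

Escaping matters here because innerHTML parses whatever string it is given.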

Page 12: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

Avoid JavaScript that interacts with DOM functions (vs. strings or properties on the DOM)

•  JavaScript can safely optimize more if you’re modifying a string rather than executing a function.

•  Again, innerHTML does not cause a JS optimization pause (if you’re writing, appending but not reading), but document.createElement will.

Page 13: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

Give the browser as much information about animations as you can. Use declarative animation styles in CSS.

•  Use animation key frames and transition in CSS.

•  Use will-change CSS property for properties that frequently change (not yet implemented, but SOON!)

•  These can be pre-compiled by the RenderLayer prior to the animation ever being executed!

Page 14: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

Use linear transformations rather than standard CSS style rules to change the position or scale.

•  Using the CSS transform property you can apply linear transformations that can be entirely done in the compositor and GPU.

•  Changing the X/Y (left/top) or width/height will cause a reflow/relayout and a new texture to upload in the GPU.
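As a rough illustration, positioning and scaling can go through style.transform instead of left/top/width/height. place is a hypothetical helper, and el is assumed to be an element (or any object with a style property).

```javascript
// Hypothetical helper: move and scale an element with a transform only.
// No left/top/width/height is touched, so no relayout is requested here.
function place(el, x, y, scale) {
  const s = scale === undefined ? 1 : scale;
  el.style.transform = 'translate(' + x + 'px, ' + y + 'px) scale(' + s + ')';
}
```

For example, place(box, 10, 20) sets the transform to "translate(10px, 20px) scale(1)".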

Page 15: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

Use requestAnimationFrame whenever possible.

•  requestAnimationFrame prevents layout thrashing as it’s explicitly done before the next layout loop and after compositing.

•  The compositor is aware of requestAnimationFrame and lets you modify elements prior to compositing frames.

•  This can keep you from interrupting a layout and causing a new one to run.
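One way to apply this is to queue DOM writes and run them together in a single requestAnimationFrame callback. This is a sketch, not a library API: createFrameQueue and enqueue are hypothetical names, and the raf parameter exists only so the scheduler can be swapped out (for example outside a browser).

```javascript
// Hypothetical scheduler: collects jobs and runs them in one animation frame.
// `raf` defaults to the browser's requestAnimationFrame but can be injected.
function createFrameQueue(raf) {
  const schedule = raf || function (cb) { return requestAnimationFrame(cb); };
  let queue = [];
  let scheduled = false;

  function runFrame() {
    const jobs = queue;
    queue = [];        // jobs queued during this frame go to the next one
    scheduled = false;
    jobs.forEach(function (job) { job(); });
  }

  return function enqueue(job) {
    queue.push(job);
    if (!scheduled) {
      scheduled = true;
      schedule(runFrame); // only one frame callback is ever pending
    }
  };
}
```

Typical use would be something like enqueue(function () { el.style.opacity = '0.5'; }) from anywhere in the code.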

Page 16: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

Simplify your CSS

•  Do not use overly complex selectors.

•  Duplicate styles must be resolved and increase layout time.

•  This has a r*e growth rate! (r=rules, e=elements), reducing either will lower your layout time.

•  Rules have a z*r*e growth rate! (z=number of selector parameters)
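As a rough worked example of the z*r*e estimate (the numbers are invented for illustration, and styleMatchCost is a hypothetical name, not a browser API):

```javascript
// Rough cost model from the slide: work grows with z * r * e.
// z = selector parameters per rule, r = rules, e = elements.
function styleMatchCost(z, r, e) {
  return z * r * e;
}

// Illustrative numbers only.
const costBefore = styleMatchCost(3, 500, 2000); // 3,000,000
const costAfter = styleMatchCost(2, 250, 2000);  // 1,000,000
console.log(costBefore / costAfter);             // 3
```

Under this model, halving the rules and trimming one selector parameter cuts the estimate to a third without touching the markup.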

Page 17: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

Do not add CSS rules or explicitly set style parameters after a document load.

•  Browsers can cache possible states (or visited style states), but not when they are dynamically set.

•  Create various possible “style states” for each element and switch the class on the element rather than setting the style attribute.
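Switching between predefined “style states” might look like the sketch below. setStyleState is a hypothetical helper; the state class names passed in (for example is-open and is-closed) are assumptions and would be defined in the stylesheet that ships with the page.

```javascript
// Hypothetical helper: `states` lists every predefined state class for `el`.
// Exactly one of them is left on the element; no inline styles are written.
function setStyleState(el, states, next) {
  if (states.indexOf(next) === -1) {
    throw new Error('Unknown style state: ' + next);
  }
  states.forEach(function (state) {
    if (state !== next) {
      el.classList.remove(state);
    }
  });
  el.classList.add(next);
}
```

For example, setStyleState(panel, ['is-open', 'is-closed'], 'is-open') replaces a direct write such as panel.style.height = '200px'.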

Page 18: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

Avoid, if you can, using libraries and frameworks

Best practice:

•  Prototype with libraries, then profile and begin removing/replacing functionality with a smaller limited set/needs.

•  Most libraries and frameworks are built for ease of use, and not performance.

Page 19: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

JavaScript Memory Leaks are easy to create. It’s fairly easy to accidentally have an object refer to another object that refers back to itself. Be careful (and aware) of these corner cases.

•  Use closures and avoid objects that take in other objects.

•  Avoid defining variables in the global scope

Page 20: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

Use prototypes rather than user-defined objects.

var obj = {foo: function() { console.log('hello'); }};

Create 10,000 of these and you’ll have 10,000 definitions AND instances. Careful!
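The difference can be seen by comparing function identity. makeGreeter and Greeter are hypothetical names, and the method returns a string here (instead of logging) only so the result is easy to check.

```javascript
// Per-instance method: every call to makeGreeter creates a new function object.
function makeGreeter() {
  return { foo: function () { return 'hello'; } };
}

// Shared method: defined once on the prototype and reused by every instance.
function Greeter() {}
Greeter.prototype.foo = function () { return 'hello'; };

const first = new Greeter();
const second = new Greeter();
console.log(first.foo === second.foo);                // true: one definition
console.log(makeGreeter().foo === makeGreeter().foo); // false: one per object
```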

Page 21: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

Don’t fear iframes

•  If you have complex controls (visjs, d3?) that may need their own UI loop consider placing them into iframes

•  iframes give you a new thread and potentially a new process!

•  Useful, but don’t overdo it, iframes are heavy weight.

Page 22: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

Be careful mixing contents

•  Plugins, video, webgl, CSS animations and traditional DOM rendering all run on separate contexts.

•  They’re pulled together via render layers and graphics layers.

•  The more contexts you introduce the more complex the synchronization between them can become.

•  Contexts != compositing layers (but sometimes they can be)

Page 23: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

Avoid listening to high-throughput events

•  A common performance mistake is not removing event listeners on DOM elements or reacting to the DOM event in the event thread.

•  High-throughput events such as mousemove, touchmove and scroll should very rarely be used.

•  If you need to use these, cache the result and animate in requestAnimationFrame, NOT the event.
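A sketch of “cache the result, animate in requestAnimationFrame”. trackPointer is a hypothetical helper: target is assumed to expose addEventListener/removeEventListener, render(x, y) is whatever visual update is needed, and raf is requestAnimationFrame (passed in so the sketch also runs outside a browser).

```javascript
// Hypothetical helper: the event handler only caches; rendering happens per frame.
function trackPointer(target, render, raf) {
  let latest = null;
  let frameRequested = false;

  function onFrame() {
    frameRequested = false;
    render(latest.x, latest.y); // one update per frame, using the newest position
  }

  function onMove(e) {
    latest = { x: e.clientX, y: e.clientY }; // cache only, no DOM work here
    if (!frameRequested) {
      frameRequested = true;
      raf(onFrame);
    }
  }

  target.addEventListener('mousemove', onMove);

  // Return a cleanup function so the listener does not outlive its use.
  return function stop() {
    target.removeEventListener('mousemove', onMove);
  };
}
```

In a browser this could be called as trackPointer(document, draw, requestAnimationFrame), where draw is the page's own function.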

Page 24: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Some rules of thumb:

Be conservative when forcing a compositing layer (e.g., translate3d(0,0,0) or translateZ(0))

•  Creating a graphics layer/render layer is expensive.

•  Generally the rendering sub-system is very efficient at figuring out what should and shouldn’t be layers.

•  It makes very little sense to force compositing layers in a nested manner, be careful doing this!

•  It makes very little sense to force compositing layers if they don’t have a linear transformation or mask (e.g., overflow:scroll)

Page 25: Are We Fast Yet? HTML & Javascript Performance - UtahJS

CSS styles that cause paints

Repaints are the most expensive operation, and should ALWAYS be declarative (when possible)

color                 border-style
visibility            background
text-decoration       background-image
background-position   background-repeat
outline-color         outline
outline-style         border-radius
outline-width         box-shadow
background-size

Page 26: Are We Fast Yet? HTML & Javascript Performance - UtahJS

CSS styles that cause layout

Layouts are lighter than repaints but can (in certain circumstances) trigger a repaint as well!

width         height        overflow-y       font-weight
padding       margin        display          border-width
border        top           position         font-size
float         text-align    overflow         left
font-family   line-height   vertical-align   right
clear         white-space   bottom           min-height

Page 27: Are We Fast Yet? HTML & Javascript Performance - UtahJS

CSS styles that cause a composite

Composites are generally not expensive, declarative or imperative style declarations are fine (note, not all of these cause a NEW composite layer, but cause a composite of existing layers):

opacity             -webkit-user-select
cursor              -webkit-transform
z-index             transform(scale)
transform3D         transformZ()
transform(rotate)

Page 28: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Identifying Issues: Jank

The overuse of graphics layers causing pages to take excessively long to composite:

Cause: Composite CSS calls used in a nested pattern.

Diagnose: Large composite times.

Cure: Remove nested transform3d/transformZ, reduce linear transforms, remove scroll event listeners, remove opacity or CSS composite filters.

http://wesleyhales.com/blog/2013/10/26/Jank-Busting-Apples-Home-Page/

Page 29: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Identifying Issues: Paint Storm

Cause: Changing a paint CSS style on a high-throughput event or circular flip/flopping a CSS paint style.

Diagnose: Very frequent paint->composite in frames.

Cure: Find where paint CSS styles are changing.

Page 30: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Identifying Issues: Layout Thrashing

Common Cause: Reading layout DOM properties or modifying DOM.

Diagnose: Frequent but short layout requests without paint/composite after.

Cure: You’re most likely reading a CSS property or DOM property A LOT in your JavaScript code (perhaps in a tight loop?)

// els is an array of elements
// Problem pattern: offsetWidth is read on every iteration, right after a style write.
for (var i = 0; i < els.length; i += 1) {
  var w = someOtherElement.offsetWidth / 3;
  els[i].style.width = w + 'px';
}
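One possible fix for the loop above, assuming the measured width does not need to change while the loop runs: read offsetWidth once, then only write. setWidthsToThird is a hypothetical name.

```javascript
// `els` is an array of elements; `someOtherElement` is the element being measured.
function setWidthsToThird(els, someOtherElement) {
  // Read the layout property once, before any writes.
  const w = someOtherElement.offsetWidth / 3;
  for (let i = 0; i < els.length; i += 1) {
    els[i].style.width = w + 'px'; // writes only, no reads inside the loop
  }
}
```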

Page 31: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Identifying Issues: JS Memory Leaks

Common Cause: Circular references or code holding onto object references for longer than necessary.

Diagnose: Use DOM shims and run your code in node with the --expose-gc option, use “gc()” to force garbage collection.

Cure: Use binary-search type methods for isolating the offending code and fix/refactor.
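A small harness along these lines can make the diagnosis step concrete. It is a sketch: heapAfterGc and measureHeapGrowth are hypothetical helpers, the file name in the comment is a placeholder, and the leaky task exists only to give the harness something to measure.

```javascript
// Run with: node --expose-gc leak-check.js   (the file name is a placeholder)
function heapAfterGc() {
  if (typeof global.gc === 'function') {
    global.gc(); // only defined when node is started with --expose-gc
  }
  return process.memoryUsage().heapUsed;
}

// Hypothetical harness: run `task` repeatedly and report heap growth in bytes.
function measureHeapGrowth(task, iterations) {
  const before = heapAfterGc();
  for (let i = 0; i < iterations; i += 1) {
    task(i);
  }
  return heapAfterGc() - before;
}

// Example: a deliberately leaky task that keeps every allocation reachable.
const retained = [];
const growth = measureHeapGrowth(function (i) {
  retained.push(new Array(1000).fill(i));
}, 1000);
console.log('heap growth (bytes):', growth);
```

Growth that stays high after a forced collection points at references that are still reachable; halving the code under test and re-measuring is the binary search the slide refers to.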

Page 32: Are We Fast Yet? HTML & Javascript Performance - UtahJS

Mobile App Best Practices

•  Don’t use touch move events on scrollable items.

•  Nest overflow elements to produce scroll effects

•  Overflow elements should be in 500px intervals

–  WebKit uses tiling for composite layers, each tile is 500px.

•  Use absolute positioning/transforms wherever possible.

•  Avoid nesting elements

•  Less is more when listening to events

•  Pre-paint items soon to show up, use display:none to hide.

•  Mobile has more memory to lend, less GPU/CPU.

–  Declarative style CSS animations are key here.

–  Be careful when forcing a compositing layer with transforms.