las16-307: benchmarking schedutil in android

LAS16-307 Benchmarking Schedutil in Android Steve Muckle

Upload: linaro

Post on 23-Jan-2018



LAS16-307: Benchmarking Schedutil in Android

Steve Muckle


ENGINEERS AND DEVICES

WORKING TOGETHER

Overview

● Creating an Android Perf/Power Suite

● Schedutil Comparison Results



Why is Android Important to Test?

● Maintainers want real world impact

○ Unit tests/synthetic workloads do not always guarantee this

● By far the dominant mobile OS

○ 66% in US, ~75% in top five EU markets, 77% in China

http://venturebeat.com/2016/05/11/android-is-eating-apples-ios-market-share-everywhere/

○ 1.4 Billion active devices as of Sept 2015

http://www.androidcentral.com/google-says-there-are-now-14-billion-active-android-devices-worldwide

● Importance of perf/power dynamic in this space


How is Android Usually Profiled?

● Custom internal-only platforms

● Expensive test equipment

● Commercial-quality software stack

○ Mature drivers, userspace

○ Tuned for product level power and performance

● Carefully chosen benchmarks

○ Relevant workloads

○ Industry acceptance


Advantages of Commercial Entities

● Manpower

● $$$

● Access to internal-only platforms

● Connections

○ to bloggers

○ to authors of benchmarks

■ Note: access to source, ability to influence benchmark (!)

○ to chip vendors/handset vendors

○ to Google


Hikey vs. Internal Platform

● Access to platform

○ hikey: full community access to hardware, firmware, schematics and sources

○ typical: no access

● Software stack

○ hikey: untuned, community-supported

○ typical: product quality

● Power domain accessibility

○ hikey: total SoC power

○ typical: CPU power (individual rails)

● Measurement tools

○ hikey: do-it-yourself with a sense resistor

○ typical: professionally calibrated/supported test equipment, thermal chambers


Creating a Test Suite

● Three areas:

○ Performance

○ Ux

○ Power


Performance

● Composite system benchmark

○ harder to analyze

○ someone’s perception of the right mixture of tests

○ Licensing :(

○ AnTuTu to start

■ 3D, UX, CPU, RAM

● TODO: I/O, gaming benchmarks


Ux - What is a Jank?

● UI ideally runs at 60fps

Why 60fps: https://www.youtube.com/watch?v=CaMTIgxCSqU

● Dropped/delayed frames = Jank

● Something took too long
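
The jank definition above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a frame counts as janky when it takes longer than the 60fps budget, and the frame durations are made-up sample values, not output from any Android tool. The names `FRAME_BUDGET_MS` and `janky_percentage` are hypothetical.

```python
# At 60 fps each frame has a budget of roughly 16.67 ms.
FRAME_BUDGET_MS = 1000.0 / 60.0


def janky_percentage(frame_times_ms, budget_ms=FRAME_BUDGET_MS):
    """Return the percentage of frames that exceeded the frame budget."""
    if not frame_times_ms:
        return 0.0
    janky = sum(1 for t in frame_times_ms if t > budget_ms)
    return 100.0 * janky / len(frame_times_ms)


# Made-up frame durations in milliseconds (assumption, for illustration only).
sample = [12.0, 15.5, 16.0, 33.4, 14.2, 18.1, 16.6, 9.8, 50.0, 16.0]
print(f"{janky_percentage(sample):.1f}% janky frames")
```

The "% janky frames" figures reported later in the deck are percentages of this kind, although the real accounting on Android may differ in detail.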


Ux

● Recentfling test

○ Part of Android Tree

○ Measures flinging back and forth across recent apps

● New tests based on recentfling

○ Email fling

○ Browser fling

○ Gallery fling


Power

● Idle home screen

● Mp3 playback

● Mpeg4 video playback

● Todo:

○ Email browsing

○ Web browsing


Automation

● Reduce tedium

● Help with repeatability

○ Fixed duration power measurements (energy)

○ Capture the same background activity (hopefully)

○ Precise touches and flings
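
A fixed-duration energy measurement with a sense resistor could be computed along these lines. This is a minimal sketch, assuming evenly spaced samples of the voltage across the shunt and of the supply rail; the function name, resistor value, and sample data are invented for illustration and do not describe the actual hikey setup.

```python
def energy_joules(shunt_volts, rail_volts, shunt_ohms, sample_period_s):
    """Integrate power over the capture window with a simple rectangle rule."""
    if len(shunt_volts) != len(rail_volts):
        raise ValueError("sample lists must have the same length")
    energy = 0.0
    for v_shunt, v_rail in zip(shunt_volts, rail_volts):
        current = v_shunt / shunt_ohms                # Ohm's law: I = V / R
        energy += v_rail * current * sample_period_s  # E += P * dt
    return energy


# Made-up capture: 1000 samples at 10 ms spacing (a 10 second window),
# 4 mV across a hypothetical 0.02 ohm shunt on a 5 V rail, i.e. about 1 W.
shunt = [0.004] * 1000
rail = [5.0] * 1000
print(f"{energy_joules(shunt, rail, 0.02, 0.01):.2f} J")
```

Because the current is derived as V / R, any error in the assumed resistor value carries straight through to the energy figure, which is one reason a fixed window and the same hardware matter when comparing runs.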


Workload Automation Suite

● Apache v2.0 licensed

● Created by ARM

● Written in Python

● Supports many popular workloads out of the box*

● Very modular

● http://workload-automation.readthedocs.io/en/latest/

● https://git.linaro.org/people/steve.muckle/wa.git


Workload Support

● WA support often lags app updates

● New app version = ?

● A few changes needed to support a non-gapps Android environment


Challenges in Testing Power

● Be careful reworking your dev boards


Challenges in Testing Power

● Random background activity

● Temperature

● Tolerance in shunt resistor

● Aliasing

● Great presentation on all this by Andy Green at

http://www.slideshare.net/linaroorg/how-to-measuresocpower


Issues

● Target stability

○ MAILBOX_QUEUE_LEN on Hikey

● Framework stability

○ intermittent failures (“adb returned early” and others)

● Test stability

○ AnTuTu failing to start

○ Inconsistent flings


Overview

● Creating an Android Perf/Power Suite

● Schedutil Comparison Results


Two Test Builds

● EAS not upstream yet

● Comparisons with/without EAS


Test Build 1

● Android 4.4 hikey kernel (Linux 4.4 based), pre-EAS

● Cpufreq tip from Rafael Wysocki’s bleeding-edge branch

● Schedutil patch to use rt-avg for RT instead of fmax

● OPP dependencies

● Fixes to interactive gov for tip cpufreq

● AOSP master as of August 22


Tuning

● Interactive

above_hispeed_delay: 20000

boostpulse_duration: 1000000

go_hispeed_load: 99

hispeed_freq: 1200000

io_is_busy: 1

min_sample_time: 80000

target_loads: 65 729000:75 960000:85

timer_rate: 20000

timer_slack: 20000


Tuning

● Ondemand

ignore_nice_load: 0

io_is_busy: 0

min_sampling_rate: 10000

powersave_bias: 0

sampling_down_factor: 1

sampling_rate: 20000

up_threshold: 95


Tuning

● schedutil

rate_limit_us: 20000
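
Governor tunables such as the ones listed for test build 1 are normally written through sysfs. The sketch below only builds the `adb shell` commands and does not run them. The sysfs base path is an assumption: depending on the kernel, tunables live in a global directory or in per-policy directories, so it must be adapted to the target. The helper name `tunable_commands` is hypothetical.

```python
import shlex

# Assumed location of governor tunables; adjust for the kernel in use.
SYSFS_BASE = "/sys/devices/system/cpu/cpufreq"

# Values copied from the test build 1 tuning slides.
INTERACTIVE_BUILD1_TUNABLES = {
    "above_hispeed_delay": "20000",
    "boostpulse_duration": "1000000",
    "go_hispeed_load": "99",
    "hispeed_freq": "1200000",
    "io_is_busy": "1",
    "min_sample_time": "80000",
    "target_loads": "65 729000:75 960000:85",
    "timer_rate": "20000",
    "timer_slack": "20000",
}
SCHEDUTIL_TUNABLES = {"rate_limit_us": "20000"}


def tunable_commands(governor, tunables, base=SYSFS_BASE):
    """Build one `adb shell` command per tunable; nothing is executed here."""
    commands = []
    for name, value in tunables.items():
        path = f"{base}/{governor}/{name}"
        remote = f"echo {shlex.quote(str(value))} > {shlex.quote(path)}"
        commands.append(["adb", "shell", remote])
    return commands


for cmd in tunable_commands("schedutil", SCHEDUTIL_TUNABLES):
    print(" ".join(cmd))
```

Each command list could be handed to `subprocess.run` on a host with a connected device. Generating the commands separately from running them makes it easy to log exactly which settings a test run used.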


AnTuTu Total (higher is better)

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand        31328    31478    31026    31277      230
interactive     31796    30724    30405    30975      728
schedutil       30874    30512    30526    30637      205

Respectable result for schedutil.
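
The avg and stddev columns can be recomputed from the three trials. The sketch below uses the build 1 AnTuTu totals and assumes that stddev means the sample standard deviation (n - 1 in the denominator); that assumption reproduces the slide values to within rounding but is not stated in the deck.

```python
from statistics import mean, stdev

# AnTuTu totals for test build 1, copied from the table above.
antutu_build1 = {
    "ondemand": [31328, 31478, 31026],
    "interactive": [31796, 30724, 30405],
    "schedutil": [30874, 30512, 30526],
}


def summarize(trials):
    """Return (average, sample standard deviation) for a list of trials."""
    return mean(trials), stdev(trials)


for governor, trials in antutu_build1.items():
    avg, sd = summarize(trials)
    print(f"{governor:12s} avg={avg:.1f} stddev={sd:.1f}")
```

With only three trials per governor the standard deviation is a rough indicator at best, so small differences between governors should be read with caution.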


recentfling (lower is better)

% janky frames

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand          18%      19%      18%    18.3%     0.57
interactive       11%      17%      14%      14%        3
schedutil         33%      45%      43%    40.3%      6.4

Nasty regression for schedutil. Needs further investigation. PELT? WALT?


emailfling (lower is better)

% janky frames

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand           1%       0%       0%    0.33%     0.57
interactive        4%       3%       2%       3%        1
schedutil          0%       0%       1%    0.33%     0.57

Results in the noise.


browserfling (lower is better)

% janky frames

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand           0%       0%       0%       0%        0
interactive        8%       8%       7%    7.66%     0.57
schedutil          0%       0%       0%       0%        0

Good result for schedutil here...


galleryfling (lower is better)

% janky frames

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand           0%       0%       0%       0%        0
interactive        1%       1%       1%       1%        0
schedutil          1%       1%       1%       1%        0

Results in noise.


Idle home screen energy (lower is better)

Energy in joules (J)

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand         12.6   12.935   12.955    12.83     0.20
interactive    13.734   13.717    13.71    13.72     0.01
schedutil      14.017   13.875   13.893    13.93     0.07

Schedutil is competitive.


Mp3 playback energy (lower is better)

Energy in joules (J)

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand       22.888   23.127   23.004    23.01     0.12
interactive    23.871   23.928   23.467    23.76     0.25
schedutil      23.953   23.866   23.746    23.86     0.10

Schedutil is competitive.


720p mpeg4 playback energy (lower is better)

Energy in joules (J)

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand       20.375   20.485   20.739    20.53     0.18
interactive    19.505   20.095   19.331    19.64     0.40
schedutil      21.103   20.879   20.766    20.92     0.17

Schedutil is competitive.


Test Build 2

● Android 4.4 hikey kernel + EAS 5.2 - schedfreq/schedtune + schedutil/cpufreq from test build 1

○ Thanks to Juri Lelli for providing this branch

● AOSP master as of August 22


Tuning

● Interactive

above_hispeed_delay: 20000

boostpulse_duration: 80000

go_hispeed_load: 99

hispeed_freq: 1200000

io_is_busy: 0

min_sample_time: 80000

target_loads: 90

timer_rate: 20000

timer_slack: 80000


Tuning

● Ondemand

ignore_nice_load: 0

io_is_busy: 0

min_sampling_rate: 10000

powersave_bias: 0

sampling_down_factor: 1

sampling_rate: 20000

up_threshold: 95


Tuning

● schedutil

rate_limit_us: 20000


AnTuTu Total (higher is better)

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand        29959    30066    30108    30044       77
interactive     31050    31083    30863    30998      118
schedutil       29606    29279    28942    29275      332
schedutil-w     30364    30607    30447    30472      123
perf            31688    31413    31548    31549      137

WALT clearly helps schedutil on AnTuTu.


recentfling (lower is better)

% janky frames

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand          30%      30%      31%    30.3%     0.57
interactive       18%      17%      21%    18.6%     2.08
schedutil         39%      44%      42%    41.6%      2.5
schedutil-w       22%      32%      36%      30%      7.2
perf              23%      16%      14%    17.6%      4.7

WALT also helps with recentfling regression (not completely though).


emailfling (lower is better)

% janky frames

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand           0%       0%       0%       0%        0
interactive        0%       1%       1%    0.66%     0.57
schedutil          5%       4%       5%    4.67%     0.57
schedutil-w        6%       6%       4%    5.33%     1.15
perf               0%       0%       0%       0%        0

Respectable result for schedutil.


browserfling (lower is better)

% janky frames

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand           0%       0%       0%       0%        0
interactive        2%       1%       1%     1.3%     0.57
schedutil          9%       7%       5%       7%        2
schedutil-w        3%       4%       8%       5%     2.64
perf               0%       0%       0%       0%        0

Possible small regression in browserfling.


galleryfling (lower is better)

% janky frames

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand           0%       0%       1%    0.33%     0.57
interactive        1%       1%       2%    1.33%     0.57
schedutil         20%      21%      27%   22.66%     3.79
schedutil-w        5%       6%       6%    5.66%     0.57
perf               4%       4%       4%       4%        0

Regression with schedutil.


Idle home screen energy (lower is better)

Energy in joules (J)

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand       11.704   11.761   11.730    11.73     0.03
interactive    10.685   10.753   10.844    10.76     0.08
schedutil       10.47   11.554   10.613    10.87     0.59
schedutil-w    12.146   12.144   12.127    12.14     0.01
perf           12.999   13.187   12.822    13.00     0.18

Energy regression with WALT.


Mp3 playback energy (lower is better)

Energy in joules (J)

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand       20.044   20.992   21.178    20.74     0.61
interactive    21.699   21.933   21.667    21.76     0.15
schedutil      18.458   18.446   18.471    18.46     0.01
schedutil-w     22.87   22.914   22.995    22.93     0.06
perf           22.539   23.445   22.869   22.951     0.46

Energy regression with WALT.


720p mpeg4 playback energy (lower is better)

Energy in joules (J)

governor      trial 1  trial 2  trial 3      avg   stddev
ondemand       19.932   19.443   19.578   19.651     0.25
interactive     19.34   19.408   19.295   19.347     0.06
schedutil       17.48   17.508   17.485   17.491     0.01
schedutil-w    21.313   21.028   21.218   21.186     0.15
perf           21.574   20.979   21.263   21.272     0.30

Energy regression with WALT.
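
To put a number on the energy regression with WALT, the schedutil-w averages can be compared with the plain schedutil averages from the three build 2 energy tables. This is a minimal sketch; `percent_change` is a hypothetical helper and the inputs are the avg columns as printed on the slides.

```python
def percent_change(baseline, candidate):
    """Relative change of candidate versus baseline, in percent."""
    return 100.0 * (candidate - baseline) / baseline


# (schedutil avg, schedutil-w avg) in joules, from the build 2 energy tables.
energy_avg_j = {
    "idle home screen": (10.87, 12.14),
    "mp3 playback": (18.46, 22.93),
    "720p mpeg4 playback": (17.491, 21.186),
}

for workload, (schedutil, schedutil_w) in energy_avg_j.items():
    change = percent_change(schedutil, schedutil_w)
    print(f"{workload:22s} {change:+.1f}% energy with WALT")
```

On these averages the increase is roughly 12% for the idle home screen and above 20% for both playback workloads.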


Next Steps

● test framework

○ repeatability

○ WA stability

● deeper result analysis

○ recentfling regression

○ possible galleryfling regression

○ energy regression with WALT

● incorporate schedtune


Thank You

#LAS16

For further information: www.linaro.org

LAS16 keynotes and videos on: connect.linaro.org