I/O Stack Optimization for Smartphones
1
I/O Stack Optimization for Smartphones
2013. 10. 28 Mobile Lab
박세준
2
Contents
- Intro
- Past work
- Elimination of JOJ
- External Journaling
- Polling based I/O
- Evaluation
- Related Work & Conclusion
3
Intro
• Android adopts SQLite as its DB
– Featherweight file-based DB (more a DB API than a DBMS)
– Library size is under 300KB at minimum and under 500KB at maximum
– Provides journaling to support atomic transactions
– Widely used
• Web browsers: Firefox, Chrome, Opera
• Web languages: HTML5 (default web storage), Python
• OS: BlackBerry, Windows Phone, iOS, Android, Symbian, WebOS, …
Nevertheless...
Ref: sqlite.org
4
• Ext4
– A default file system since Linux kernel ver. 2.6.28
– Also uses journaling to improve reliability
– Became the default Android file system from ICS, ver. 4.0.4
• Replaces rootfs for the Linux kernel, YAFFS2 for the previous internal flash, and FAT32 for external SD
• Uses MTP (Media Transfer Protocol) to interface between the Ext4 external SD and NTFS on Windows
– Has drawn criticism: Theodore Ts'o, a developer of Ext4, stated that btrfs is better because it has more advanced techniques than Ext4
Ref: https://ext4.wiki.kernel.org/index.php/Main_Page https://lkml.org/lkml/2008/8/1/217
Intro
5
Intro
[Figure: chart; surviving labels: 90%, "% about overall", 30% / 70%, 75%, 64%]
6
• Revisiting storage for smartphones
– Web cache writes tend to be sequential; SQLite writes do not.
-> The web cache varies by page, while SQLite reuses specific DB addresses, so its writes show no such locality (web cache: large contiguous page contents; SQLite: discrete, much smaller chunks)
– In addition, almost all SQLite write requests are synchronous, and they delay I/O when committing atomic operations
Past work
7
• Revisiting storage for smartphones
– Two improvement factors
• fsync
– Alleviate overly frequent syncs
• DB in RAM
– Write performance: NAND flash << RAM
– Improves write performance itself
Past work
8
• Revisiting storage for smartphones
Treat random write as sequential write
Lazy evaluation
Past work
9
• Revisiting storage for smartphones
NILFS: log-structured file system
PCM: Phase-Change Memory, a kind of NVRAM (Non-Volatile RAM)
Past work
Kingston&Webbench
RiData&Webbench
Kingston&Facebook
RiData&Facebook
10
Revisiting storage for smartphones
Evolved!!
I/O stack optimization for smartphones
11
• Journaling in SQLite
– 6 journaling modes
• DELETE (default before Android 4.0.4)
• TRUNCATE (default since Android 4.0.4)
• PERSIST
• MEMORY
• WAL (Write-Ahead Logging)
• OFF
Elimination of JOJ (Journaling of Journal)
12
• Journaling Mode: Delete
– Deletes the journal when an atomic transaction completes successfully (unlink, delete)
• Journaling Mode: Truncate
– Does not delete the journal when a transaction completes; instead, truncates it to zero length
• But both Delete and Truncate need new block allocation for the next journal
Elimination of JOJ
13
• Journaling Mode: Persist
– Overwrites the journal header with zeros (= zero-fill)
– No reallocation needed: when the journal has to be updated, the zero-filled journal is reused
• Journaling Mode: Memory
– Keeps the journal in memory
– If the application crashes entirely, there is no way to roll back
– Fastest
Elimination of JOJ
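
The difference between the Delete, Truncate and Persist modes can be observed from outside SQLite by looking at the rollback journal file after a commit. The following is a minimal sketch using Python's standard sqlite3 module, not code from the presentation; the file name demo.db, the table t and the helper journal_state_after_commit are made up for illustration.

```python
import os
import sqlite3
import tempfile


def journal_state_after_commit(mode):
    """Commit one INSERT under the given journal mode and return
    (mode reported by SQLite, journal file exists, journal size in bytes)."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")  # hypothetical file name
        conn = sqlite3.connect(db_path)
        try:
            # The PRAGMA returns the journal mode that is actually in effect.
            reported = conn.execute(f"PRAGMA journal_mode={mode}").fetchone()[0]
            conn.execute("CREATE TABLE t (x INTEGER)")
            conn.execute("INSERT INTO t VALUES (1)")
            conn.commit()
            # The rollback journal sits next to the database file.
            journal = db_path + "-journal"
            exists = os.path.exists(journal)
            size = os.path.getsize(journal) if exists else 0
            return reported, exists, size
        finally:
            conn.close()


if __name__ == "__main__":
    for mode in ("DELETE", "TRUNCATE", "PERSIST", "MEMORY", "WAL", "OFF"):
        print(mode, journal_state_after_commit(mode))
```

With a stock SQLite build, Delete should leave no journal file behind, Truncate should leave an empty one, and Persist should leave the file in place for reuse.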
14
• Journaling Mode: WAL
– Creates a separate WAL file (.db-wal)
– The WAL file is checkpointed at every specified threshold
• Journaling Mode: OFF
– No journal
– No guarantee of atomic operations
Elimination of JOJ
[Figure: Database (.db) and WAL (.db-wal); the WAL is checkpointed repeatedly]
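
Checkpointing in WAL mode can be illustrated by watching the size of the WAL file, which SQLite stores next to the database with a -wal suffix. This is a hedged sketch with Python's standard sqlite3 module; the file name demo.db, the table t and the helper wal_sizes_around_checkpoint are assumptions for illustration. It triggers a checkpoint by hand with PRAGMA wal_checkpoint(TRUNCATE) rather than waiting for the automatic threshold (PRAGMA wal_autocheckpoint).

```python
import os
import sqlite3
import tempfile


def wal_sizes_around_checkpoint():
    """Return the WAL file size in bytes before and after a manual checkpoint."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")  # hypothetical file name
        wal_path = db_path + "-wal"
        conn = sqlite3.connect(db_path)
        try:
            conn.execute("PRAGMA journal_mode=WAL")
            conn.execute("CREATE TABLE t (x INTEGER)")
            conn.executemany("INSERT INTO t VALUES (?)",
                             [(i,) for i in range(100)])
            conn.commit()  # committed pages now sit in the WAL, not in the .db
            before = os.path.getsize(wal_path)
            # Copy WAL content into the database file and truncate the WAL.
            conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
            after = os.path.getsize(wal_path)
            return before, after
        finally:
            conn.close()


if __name__ == "__main__":
    print(wal_sizes_around_checkpoint())
```

The commit itself only appends to the WAL; the write back into the database file is deferred until the checkpoint.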
15
• Journaling in EXT4
– Overhead of journaling is negligible
• A journal transaction is much bigger than one in SQLite
• The journal transaction interval is much longer (e.g. 5 sec)
– But what about under SQLite?
• SQLite calls fsync() to make EXT4 commit its journal
• In the experiment, one SQLite INSERT issues 2 or more fsync() calls within 2 ms
• Moreover, each fsync() consists of tiny chunks containing very few records
– This makes fsync() very inefficient
-> 200% overhead
Elimination of JOJ
16
• What fsync causes
Red numbers denote data size (KB); each red X represents a write operation
Elimination of JOJ
17
• Comparison across SQLite journaling modes
– 24KB of operation instruction size
– 3 fsync() calls
– 9 write operations
Elimination of JOJ
18
• Comparison across SQLite journaling modes
– 16KB of operation instruction size
– No opening & unlinking of .db-journal
– 2 fsync() calls
– 8 write operations
Elimination of JOJ
19
• Comparison across SQLite journaling modes
– 8KB of operation instruction size
– No opening & unlinking of .db-journal
– 3 fsync() calls
– 12 write operations -> caused by zero-filling: not suited to EXT4
Elimination of JOJ
20
• Comparison across SQLite journaling modes
– 16KB of operation instruction size
– 1 fsync() call
– 2 write operations
Elimination of JOJ
21
• Another filesystem for Android: BTRFS
– Uses a B+ tree
– COW (Copy on Write) mechanism
– Wandering tree problem: updating a node forces other nodes to be updated
– 1 fsync : 6 writes
Elimination of JOJ
22
• Another filesystem for Android: NILFS2
– Log structured
– Merges chunks and combines them into a segment
– Size of a segment is 128KB
– fsync includes a command to flush the segment
– Because of the large segment and the flush operation, it shows the worst performance
Elimination of JOJ
23
• Another filesystem for Android: XFS
– Designed to support enterprise-level storage
– But the result was the reverse
– One fsync issues only one journal write
– Minimum journal write is 1KB (in EXT4: 4KB)
Elimination of JOJ
24
• Another filesystem for Android: F2FS
– Log structured
– Supports not only merging data into segments but also updating small chunks in storage, which other LFSs (e.g. NILFS2) do not support
– So it does not suffer from updating tiny chunks
Elimination of JOJ
25
• Sequential and random write with fsync
Elimination of JOJ
26
• fsync, fdatasync and noatime
– fsync flushes metadata on every operation
– fdatasync does not flush metadata unless the metadata has to be flushed, i.e. when the metadata change is significant enough to be journaled
– noatime: mount option
• atime: Linux logs the time of last access
• relatime: updates the access time only when it is older than the current modify or change time
• noatime: does not log the access time, so nothing is logged unless the file is written (modified) (default Android mount option)
Elimination of JOJ
Ref: http://www.lug.or.kr/m/bbs/view.php?bo_table=centos_book&wr_id=70&page=8
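
At the system-call level, the choice between fsync and fdatasync looks as follows. This is a small sketch using Python's os module, not code from the presentation; the helper name durable_append is made up, and os.fdatasync is not available on every platform, so the sketch falls back to os.fsync when it is missing.

```python
import os


def durable_append(path, payload, data_only=True):
    """Append payload to path and force it to storage before returning.

    data_only=True asks for fdatasync(), which may skip metadata that is
    not needed to read the data back; data_only=False uses fsync().
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, payload)
        if data_only:
            # Fall back to fsync where fdatasync does not exist.
            sync = getattr(os, "fdatasync", os.fsync)
        else:
            sync = os.fsync
        sync(fd)
    finally:
        os.close(fd)
```

A call such as durable_append("journal.bin", b"record") returns only after the sync call has completed; both variants leave the same bytes in the file and differ only in how much metadata is forced out with them.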
27
• Evaluation of JOJ elimination
– F2FS is best, XFS is next
– BTRFS & NILFS2 hardly benefit
Caused by COW: the COW mechanism interferes with the advantage of fdatasync, which doesn't flush metadata
Elimination of JOJ
28
• Evaluation of JOJ
– Varying SQLite journaling modes and filesystems
– For DB updates on the Galaxy S3, no journal file is created (the difference between Insert/sec and Update/sec)
Elimination of JOJ
29
• Exploit locality
– Data I/O seems random
– Journal I/O looks contiguous
– External journaling
• Put the journal on another block device (= partition)
– As a result, external journaling can exploit locality to the maximum
External journaling
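
The locality argument can be made concrete with a toy trace. The sketch below is an illustration only: the block addresses are invented, and sequential_fraction is a hypothetical helper that counts how many requests land on the block right after the previous one. It compares one device that receives interleaved data and journal writes with a separate journal device that receives journal writes only.

```python
def sequential_fraction(trace):
    """Fraction of requests that hit the block right after the previous one."""
    if len(trace) < 2:
        return 0.0
    hits = sum(1 for prev, cur in zip(trace, trace[1:]) if cur == prev + 1)
    return hits / (len(trace) - 1)


# Hypothetical block addresses, made up for illustration only.
data_blocks = [17, 530, 42, 911]            # scattered DB page updates
journal_blocks = [1000, 1001, 1002, 1003]   # append-only journal writes

# One device: journal and data writes interleave in a single stream.
shared_device = [b for pair in zip(journal_blocks, data_blocks) for b in pair]

# External journaling: the journal device sees only the journal writes.
journal_device = journal_blocks

if __name__ == "__main__":
    print("shared device :", sequential_fraction(shared_device))
    print("journal device:", sequential_fraction(journal_device))
```

In this toy trace the shared device sees no back-to-back sequential requests, while the dedicated journal device sees only sequential ones.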
30
• Flash vs HDD
– In a legacy PC system, the size of I/O per interrupt is much bigger than in a smartphone system
– So frequent context switches may be harmful
– In fact, as Table 1 shows, polling I/O consumes more CPU, but this is negligible because power consumption is dominated by the display and telecommunication components
Polling-based IO
31
• Power consumption in smartphones
Polling-based IO
[Figure panels: Power usage on E-Mail; Power usage on Web browsing]
32
• Combining all advances
– EXT4 -> EXT4 advanced: about 2.4X
– EXT4 -> F2FS baseline: about 2.2X
– EXT4 -> F2FS advanced: about 4X
Overall result
33
34
Any questions?