LLVMLinux project
Clang and Linux: The New Normal
Presented by:
Behan Webster
(LLVMLinux project lead)
Presentation Date: 2015.02.18
New Clang Benchmarks (compilation)
http://www.phoronix.com/scan.php?page=article&item=gcc49_compiler_llvm35&num=1
[Bar chart, OpenBenchmarking.org: Timed ImageMagick Compilation v6.8.1-10, Time To Compile. Seconds, Less Is Better]
GCC 4.8.2                  60.35 (SE +/- 0.36)
GCC 4.9.0 RC1              60.59 (SE +/- 0.04)
LLVM Clang 3.5 20140413    29.61 (SE +/- 0.14)
Powered By Phoronix Test Suite 5.0.1
New Clang Benchmarks (execution)
http://www.phoronix.com/scan.php?page=article&item=gcc49_compiler_llvm35&num=1
[Bar chart, OpenBenchmarking.org: ebizzy v0.3. Records/s, More Is Better]
GCC 4.8.2                  42849 (SE +/- 109.85)
GCC 4.9.0 RC1              42950 (SE +/- 44.08)
LLVM Clang 3.5 20140413    43202 (SE +/- 53.03)
Powered By Phoronix Test Suite 5.0.1
1. (CC) gcc options: -pthread -lpthread -O3 -march=native
LLVMLinux Patched Mainline Kernel Tree
● A mainline kernel tree with all LLVMLinux patches applied on top is now available:
  – git://git.linuxfoundation.org/llvmlinux/kernel.git
● Dated llvmlinux branches
  – remotes/origin/llvmlinux-2015.01.18
● The master branch is rebased regularly
LLVMLinux Project Status
● LLVM/clang:
  – All LLVMLinux patches for LLVM are Upstream
  – Newer LLVM patches to support the Linux kernel are mostly being added by upstream maintainers
  – Bug 4068 - [META] Compiling the Linux kernel with clang
  – 8 LLVM open Bugs to compile the kernel
LLVMLinux Project Status
● Linux Kernel:
  – Roughly 40 kernel patches for 4 architectures
    ● X86, arm, arm64/aarch64, mips
  – And ppc support independently upstreamed
  – LLVMLinux branch in linux-next
Kernel Patches
Architecture   Number of patches   Submitted
No arch        21                  9
aarch64        3                   0
arm            6                   2
mips           6                   0
x86_64         4                   0
TOTALS         40                  11
Building the kernel with clang
● Install clang (from distro, source, or llvm.org)
  – http://llvm.org/releases/download.html
● Get the code: mainline or patched kernel
  git clone git://git.linuxfoundation.org/llvmlinux/kernel.git
● Native build
  make CC=clang
● Cross build with clang
  make CC=clang ARCH=arm CROSS_COMPILE=arm-linux-gnueabi-
Up to date prebuilt binaries
● Up to date 3.6 prebuilt binaries available for:
  – Debian from sid/main
  – Fedora from llvm.org
  – OpenSUSE from llvm.org
  – Ubuntu from ppa:xorg-edgers/ppa or llvm.org
http://llvm.org/releases/download.html
Clang vs gcc
● Clang generates code which is roughly the same size and speed as gcc
● Clang is improving at a tremendous rate however...
● Competition from clang has meant that gcc has improved a lot recently (error reporting, etc)
● So why use clang?
Advantages to using more than one CC
● The C standard is long, can be difficult to interpret, and doesn't define everything → Undefined behavior
● This means that every C compiler is a little different
Advantages to using more than one CC
● Most programmers try their code with one compiler only
● If it works, most people move on
● This means that assumptions about what is valid C code can inadvertently lead to code which uses undefined behavior
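As an illustration of this point (not taken from the talk): signed integer overflow is undefined in C, so an overflow check written as "a + b < a" can be kept by one compiler and optimized away by another. The helper name add_would_overflow is hypothetical; this minimal sketch tests the operands before adding, which stays within defined behavior.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not from the talk.
 * A check such as "a + b < a" relies on signed overflow, which is
 * undefined behavior, so a compiler may legally remove it.
 * Comparing against INT_MAX / INT_MIN before adding avoids the overflow.
 */
static bool add_would_overflow(int a, int b)
{
	if (b > 0)
		return a > INT_MAX - b;
	if (b < 0)
		return a < INT_MIN - b;
	return false;
}
```

A caller would test add_would_overflow(a, b) first and only compute a + b when it returns false.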
Advantages to using more than one CC
● Using more than one interpretation/implementation of the C standard can catch issues before they become a problem
● The Linux kernel has also traditionally used a lot of gnu extensions which mostly aren't strictly required anymore with newer C versions
More than one CC FTW
● More than one CC means code is more portable
● More portable often means more standards compliant
● More standards compliant often means more correct
Debian recompiled with clang
Clang version   Packages built   Failures   Percentage
3.3             18854            2188       11.6%
3.4             21204            2221       10.5%
3.4.2           21383            2040       9.5%
3.5.0           22202            1261       5.7%
3.6.0rc1        22202            1307       5.9%
Unpublicized success
● Recently a large company was able to find a significant error in their private kernel which had apparently been missed by gcc
● Over 2 weeks of work had been put into finding the error which clang found within minutes
● I hear now that their engineers compile their kernel code with both compilers
Toolization
● Both libllvm and libclang allow the same technology used in clang to be included in other tools
● gcc is explicitly not toolized in order to make it difficult to commingle it with non-copylefted code
What can be built with libclang?
● Most static analyzers implement their own parser/grammar which means yet-another-C-standard-interpretation
● Clang Static Analyzer uses libclang which means having full access to the parse tree (AST)
● This means that it sees what the compiler can see
Integrated Assembly Integration
● The Assembler is used, in part, to inline assembly into C code
● In these situations there are a lot of parameters instructing the compiler on how the ASM should be included
● gcc uses an external assembler and leaves much of the validation of the ASM to gnu as
● Clang can also use gnu as, but also has the option of using the Integrated Assembler which is faster and more integrated into the compiler for tighter code checking and better error messages
Integrated Assembly Integration (2)
● Gcc doesn't validate ASM, but clang does (because of IA)
● There are places in the kernel code where ASM not being validated is taken advantage of to export information from CC (the generation of bounds.s)
● This code breaks in clang since what is being exported are macros and not valid ASM
->NR_PAGEFLAGS $23 __NR_PAGEFLAGS
->MAX_NR_ZONES $4 __MAX_NR_ZONES
● In this case clang catches the inappropriate use of ASM
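To make the marker lines above concrete: they are not instructions, only text that the build later rewrites into #define lines. The kernel build does this rewriting with a script, not with C code; the function below is a hypothetical stand-in (the name marker_to_define and the buffer sizes are assumptions) that shows the transformation for the x86-style "$" prefix only.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the post-processing step: turn one marker
 * line such as "->NR_PAGEFLAGS $23 __NR_PAGEFLAGS" into a #define.
 * Returns 0 on success, -1 if the line is not a marker or does not fit.
 * Only the "$" immediate prefix is handled in this sketch.
 */
static int marker_to_define(const char *line, char *out, size_t outlen)
{
	char name[64], comment[64];
	long value;
	int n;

	/* Anything that does not start with "->" is ordinary assembly. */
	if (strncmp(line, "->", 2) != 0)
		return -1;
	if (sscanf(line, "->%63s $%ld %63s", name, &value, comment) != 3)
		return -1;

	n = snprintf(out, outlen, "#define %s %ld /* %s */", name, value, comment);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Because the marker text is only ever consumed by such a rewriting step, an assembler that validates its input has no way to know it is meant to be ignored.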
extern inline: Different for gnu89 and gnu99/C11
● GNU89/GNU90 (used by gcc, gcc5 is C11)
  – Function will be inlined where it is used
  – No function definition is emitted
  – A non-inlined function may also be provided
● GNU99/C99 (used by clang, clang-3.6 is C11)
  – Function will be inlined where it is used
  – An external function is emitted
  – No other function of the same name may be provided.
● Solution? Use “static inline” instead.
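A minimal sketch of the suggested fix, using a hypothetical helper (clamp_to_byte is not a kernel function). With "static inline" every translation unit gets its own private copy, so the result no longer depends on which "extern inline" rules the compiler applies.

```c
/*
 * Hypothetical header-style helper, not taken from the kernel.
 * "static inline" has the same meaning under gnu89 and C99/C11:
 * the function has internal linkage, so no external definition is
 * emitted or expected, and clashes between translation units cannot occur.
 */
static inline int clamp_to_byte(int v)
{
	if (v < 0)
		return 0;
	if (v > 255)
		return 255;
	return v;
}
```

The trade-off is that a copy may be emitted in each object file when the compiler decides not to inline the call.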
Other things which affect inlining
● Both gcc5 and clang-3.6 are C11 by default
● However a number of kernel developers have found issues with code which makes the kernel not yet C11 compliant
● The changes to the kernel code for clang pushed for at least C99 compliance with the idea that C11 isn't far off
● As of Linux v3.18 -std=gnu89 is now being passed to CC to force the old behavior
● Linus insists that this should only be a temporary workaround
Attribute Order
● gcc is less picky about placement of __attribute__(())
● clang requires it at the end of the type or variable
● This arguably ends up being a lot more readable
-struct __read_mostly va_alignment va_align = {
+struct va_alignment __read_mostly va_align = {
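A user-space sketch of the accepted placement. The kernel's __read_mostly is a section attribute; here an alignment attribute stands in for it (an assumption of this sketch, as are the macro name demo_attr and the struct fields) so that the effect can be checked in an ordinary program.

```c
#include <stdint.h>

/*
 * Stand-in for the kernel's __read_mostly. An alignment attribute is
 * used instead of a section attribute so the result is observable.
 */
#define demo_attr __attribute__((aligned(64)))

/* Hypothetical fields; only the struct name comes from the slide. */
struct va_alignment {
	unsigned long mask;
	unsigned long step;
};

/* Attribute placed after the complete type name, before the variable. */
static struct va_alignment demo_attr va_align = {
	.mask = 63,
	.step = 64,
};

/* Returns nonzero when the attribute took effect on the variable. */
static int va_align_is_aligned(void)
{
	return ((uintptr_t)&va_align % 64) == 0;
}
```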
Named Registers
● Global named registers now fixed for x86, arm, aarch64, mips
register unsigned long current_stack_pointer asm ("sp");
Variable Length Arrays In Structs
● VLAIS isn't supported by Clang (now documented gcc extension)
char vla[n]; /* Supported, C99/C11 */
struct {
    char flexible_member[]; /* Supported, C99/C11 */
} struct_with_flexible_member;
struct {
    char vlais[n]; /* Explicitly not allowed by C99/C11 */
} variable_length_array_in_struct;
● VLAIS is used in the Linux kernel in a number of places, spreading mostly through reusing patterns from data structures found in crypto (fixes for which have now been merged)
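One standard-conforming way to get a header followed by a run-time-sized payload is a flexible array member plus a single allocation. The sketch below is illustrative only: struct blob and blob_new are hypothetical names, and kernel code would use its own allocators in place of malloc.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type: fixed header followed by len payload bytes. */
struct blob {
	size_t len;
	char data[];	/* flexible array member, valid C99/C11 */
};

/*
 * Allocate header and payload in one block, which gives the same
 * contiguous layout a VLAIS would, without the extension.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static struct blob *blob_new(const char *src, size_t n)
{
	struct blob *b = malloc(sizeof(*b) + n);

	if (!b)
		return NULL;
	b->len = n;
	memcpy(b->data, src, n);
	return b;
}
```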
VLAIS Removal Example
- struct {
- struct shash_desc shash;
- char ctx[crypto_shash_descsize(hash)];
- } desc;
+ SHASH_DESC_ON_STACK(shash, hash);
- desc.shash.tfm = hash;
+ shash->tfm = hash;
(from crypto/hmac.c)
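The diff replaces a struct containing a variable-length member with a macro. A toy analogue of that idea is sketched below, assuming the macro's job is to reserve an aligned char buffer for header plus context and expose it through a typed pointer. struct toy_desc, TOY_DESC_ON_STACK and toy_use_desc are hypothetical names, not the kernel's definitions. Note that viewing a char buffer through a struct pointer leans on compiler behavior; the kernel is built with -fno-strict-aliasing.

```c
#include <stddef.h>
#include <string.h>

/* Toy descriptor, a stand-in for struct shash_desc. */
struct toy_desc {
	void *tfm;
	char ctx[];	/* per-algorithm context follows the header */
};

/*
 * Hypothetical analogue of SHASH_DESC_ON_STACK: reserve a plain char
 * VLA big enough for header plus context, then view it through a
 * typed pointer. No struct with a variable-length member is declared.
 */
#define TOY_DESC_ON_STACK(name, ctx_size)				\
	char name##_buf[sizeof(struct toy_desc) + (ctx_size)]		\
		__attribute__((aligned(__alignof__(struct toy_desc))));	\
	struct toy_desc *name = (struct toy_desc *)name##_buf

/*
 * Fill a ctx_size-byte context with a marker byte and return the last
 * byte written. ctx_size must be at least 1.
 */
static int toy_use_desc(void *tfm, size_t ctx_size)
{
	TOY_DESC_ON_STACK(desc, ctx_size);

	desc->tfm = tfm;
	memset(desc->ctx, 0x5a, ctx_size);
	return desc->ctx[ctx_size - 1];
}
```

A plain char VLA is standard C99, so both compilers accept it, whereas the original struct with a run-time-sized member is a gcc-only extension.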
Status of VLAIS in the Linux Kernel
● USB Gadget patch is in mainline
● Netfilter patch is in mainline
● Crypto patches are upstream (still a few cleanup patches)
● Only the following VLAIS patches are left: (exofs, md-raid10, nfs, wimax-i2400m)
Nested Functions
● The kernel community generally don't like the use of nested functions
● However they kept being added to the kernel
● About half a dozen uses of nested functions have been removed.
● Clang keeps finding more of them
● The latest one is in drivers/md/bcache/sysfs.c
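Removing a nested function usually means hoisting it to file scope and passing the captured variables explicitly. The sketch below is a generic illustration with hypothetical names (count_ctx, above_threshold, count_above); it is not the bcache code.

```c
#include <stddef.h>

/*
 * A GNU C nested function could read "threshold" straight from the
 * enclosing scope. The portable form is a file-scope static function
 * that receives the captured state through an explicit pointer.
 */
struct count_ctx {
	int threshold;
};

static int above_threshold(int value, const struct count_ctx *ctx)
{
	return value > ctx->threshold;
}

/* Count the entries of vals[] that exceed threshold. */
static size_t count_above(const int *vals, size_t n, int threshold)
{
	struct count_ctx ctx = { .threshold = threshold };
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (above_threshold(vals[i], &ctx))
			hits++;
	return hits;
}
```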
How Can You Help?
● Make it known you want to be able to use Clang to compile the kernel
● Test the LLVMLinux patches
● Report bugs to the mailing list
● Help get clang related patches upstream
● Work on unsupported features and Bugs
● Submit new targets and arch support
● Patches welcome
Embrace the Dragon.
He's cuddly.
Thank you
http://llvm.linuxfoundation.org
Contribute to the LLVMLinux Project
● Project wiki page
  – http://llvm.linuxfoundation.org
● Project Mailing List
  – http://lists.linuxfoundation.org/mailman/listinfo/llvmlinux
  – http://lists.linuxfoundation.org/pipermail/llvmlinux/
● IRC Channel
  – #llvmlinux on OFTC
  – http://buildbot.llvm.linuxfoundation.org/irclogs/OFTC/%23llvmlinux/
● LLVMLinux Community on Google Plus