![Page 1: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/1.jpg)
X-Containers: Breaking Down Barriers to Improve Performance and Isolation of Cloud-Native Containers
Zhiming Shen
Cornell University
Joint work with Zhen Sun, Gur-Eyal Sela, Eugene Bagdasaryan, Christina Delimitrou, Robbert Van Renesse, Hakim Weatherspoon
![Page 2: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/2.jpg)
Software Containers
![Page 3: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/3.jpg)
Cloud-Native Container Platforms
Img src: https://pivotal.io/cloud-native
![Page 4: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/4.jpg)
Cloud-Native Container Platforms
Img src: https://pivotal.io/cloud-native
• Single Concern Principle: Every container should address a single concern and do it well.
• Making containers easier to
  o Replace, reuse, and upgrade transparently
  o Scale horizontally
  o Debug and troubleshoot
![Page 5: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/5.jpg)
The Problem
[Diagram: two containers, each running two processes, share one Linux kernel (namespaces, cgroups, SELinux) on top of the hardware.]
• Shared kernel attack surface and TCB
• Not allowed to install kernel modules
• Hard to tune or optimize for a specific container
![Page 6: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/6.jpg)
Existing Solutions
[Diagram: layer stacks of three existing designs: Container (processes in a container on Linux), Clear Container (processes on Linux inside a VM, on KVM and Linux), and gVisor (processes in a container on gVisor).]
• Clear Container: requires nested hardware virtualization support in the cloud
• gVisor, Ptrace mode: high overhead
• gVisor, KVM mode: requires nested virtualization
Criteria: Isolation, Customization, Optimization, Portability, Performance
![Page 7: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/7.jpg)
X-Containers achieve
• VM-level Isolation
• Support of Kernel Customization
• Support of Kernel Optimization
• Good Portability (without the need of hardware-assisted virtualization)
• High Performance
AND
• Backward Compatibility
![Page 8: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/8.jpg)
X-Containers
[Diagram: two containers, each with two processes, on top of the OS kernel; user mode and kernel mode are marked.]
![Page 9: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/9.jpg)
X-Containers
[Diagram: two containers, each with two processes and its own OS kernel, on top of a shared exokernel; user mode and kernel mode are marked.]
![Page 10: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/10.jpg)
X-Containers
[Diagram: two containers, each with two processes and its own OS kernel, on top of a shared exokernel; user mode and kernel mode are marked.]
![Page 11: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/11.jpg)
X-Containers
[Diagram: two X-Containers, each with two processes and its own X-LibOS, on top of a shared X-Kernel; user mode and kernel mode are marked.]
![Page 12: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/12.jpg)
• A new security paradigm for cloud-native containers
• X-Kernel: an exokernel with a small attack surface and TCB
• X-LibOS: a LibOS that decouples security isolation from the process model
[Diagram: the X-Containers stack (processes and X-LibOS in X-Containers on the X-Kernel) shown next to the Container, Clear Container, and gVisor stacks.]
![Page 13: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/13.jpg)
Threat Model and Design Trade-offs
• Threat model
• Trade-offs
  • Reduced intra-container isolation
  • Improved inter-container isolation and performance
  • Process isolation and kernel-supported security features are not effective
[Diagram: three X-Containers on the X-Kernel, each with its own X-LibOS and one or more processes.]
![Page 14: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/14.jpg)
Implementation
• X-LibOS from Linux kernel
  • Binary compatibility
  • Highly customizable
• X-Kernel from Xen
  • Para-virtualization interface
  • Concurrent multi-processing
• Limitations
  • Memory management
  • Spawning time
[Diagram: three X-Containers, each with two processes and an X-LibOS, on top of the X-Kernel; user mode and kernel mode are marked.]
![Page 15: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/15.jpg)
Optimizing System Calls
• Existing solutions
  • Patch source code
  • Link to another library
• Our solution
  • Automatic Binary Optimization Module (ABOM)
  • Binary level equivalence
  • Position-independence
[Diagram: processes and X-LibOS inside an X-Container in user mode, X-Kernel in kernel mode; system calls are turned into function calls.]
For many applications, more than 90% of syscalls are turned into function calls
![Page 16: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/16.jpg)
Evaluation Setup
• Testbed
  • Amazon EC2
  • Google Compute Engine
• Compared container runtimes
  • Docker
  • gVisor (Ptrace in Amazon, and KVM in Google)
  • Clear-Container (only in Google)
  • Xen-Container
  • X-Container
• Configurations
  • Patched for Meltdown
![Page 17: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/17.jpg)
System Call Performance
[Bar chart: normalized system call performance (y-axis 0 to 30) on Amazon and Google for Docker, Clear-Container, gVisor, Xen-Container, and X-Container.]
Up to 27X of Docker (patched) and 1.6X of Clear-Container
![Page 18: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/18.jpg)
Real Application Performance
[Bar charts: normalized throughput on Amazon and Google, one panel per application, with the annotated ranges below.]
• NGINX: 1.21x~1.27x
• Memcached: 2.64x~3.08x
• Redis: 1x~1.2x
• Apache: 0.64x~0.72x
![Page 19: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/19.jpg)
Spawning Time and Memory Footprint
[Stacked bar chart: Memory Footprint (MB, y-axis 0 to 30) for Docker and X-Container; components: Free, X-LibOS, Extra, micropython. Data labels in extraction order: 1.00, 1.93, 3.56, 11.16, 8.80, 2.10.]
[Stacked bar chart: Spawning Time (s, y-axis 0 to 5) for Docker, X-Container, and X-Container'; components: User Program, X-LibOS Booting, Xen Tool Stack. Data labels in extraction order: 3.66, 0.46, 0.28, 0.28, 0.56, 0.29, 0.29.]
Reduced to 460ms. Can be further reduced to <10ms.
![Page 20: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/20.jpg)
More Evaluations in the Paper
• More micro/macro benchmarks
• Patched and unpatched for Meltdown
• Comparing to Unikernel and Graphene
• Scalability (up to 400 containers on a single host)
![Page 21: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/21.jpg)
Conclusion
• X-Containers: a new security paradigm for isolating single-concerned cloud-native containers
  • X-Kernel: an exokernel with a small attack surface and TCB
  • X-LibOS: a LibOS that decouples security isolation from the process model
  • Trade-off: intra-container isolation vs. inter-container isolation
• Implemented with Xen and Linux
  • Binary compatibility
  • Concurrent multi-processing
• More at http://x-containers.org
Thank You. Questions?
![Page 22: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/22.jpg)
Backup Slides
![Page 23: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/23.jpg)
Pros and Cons of the X-Container Architecture
| | Container | gVisor | Clear-Container | LightVM | X-Container |
|---|---|---|---|---|---|
| Inter-container isolation | Poor | Good | Good | Good | Good |
| System call performance | Limited | Poor | Limited | Poor | Good |
| Portability | Good | Good | Limited | Good | Good |
| Compatibility | Good | Limited | Good | Good | Good |
| Intra-container isolation | Good | Good | Good | Good | Reduced |
| Memory efficiency | Good | Good | Limited | Limited | Limited |
| Spawning time | Short | Short | Moderate | Moderate | Moderate |
| Software licensing | Clean | Clean | Clean | Clean | Need discussion |
![Page 24: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/24.jpg)
Comparing Isolation Boundaries
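[Diagram: layer stacks of six designs, with the isolation boundary marked in each.]
• Container: processes in a container on a shared kernel
• Virtual Machine: processes on a kernel inside a VM, on a hypervisor
• X-Container: processes and X-LibOS inside an X-Container, on the X-Kernel
• Unikernel, Dune, EbbRT, OSv: a process inside a VM, on a hypervisor
• Library OS (Exokernel): processes, each with its own LibOS, on an exokernel
• L4Linux (Microkernel): processes on a kernel (L4Linux), on a microkernel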
![Page 25: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/25.jpg)
Automatic Binary Optimization Module (ABOM)
7-Byte Replacement (Case 1)
  Before:
    00000000000eb6a0 <__read>:
       eb6a9: b8 00 00 00 00          mov    $0x0,%eax
       eb6ae: 0f 05                   syscall
  After:
    00000000000eb6a0 <__read>:
       eb6a9: ff 14 25 08 00 60 ff    callq  *0xffffffffff600008

9-Byte Replacement
  Before:
    0000000000010330 <__restore_rt>:
       10330: 48 c7 c0 0f 00 00 00    mov    $0xf,%rax
       10337: 0f 05                   syscall
  Phase-1:
    0000000000010330 <__restore_rt>:
       10330: ff 14 25 80 00 60 ff    callq  *0xffffffffff600080
       10337: 0f 05                   syscall
  Phase-2:
    0000000000010330 <__restore_rt>:
       10330: ff 14 25 80 00 60 ff    callq  *0xffffffffff600080
       10337: eb f7                   jmp    0x10330

7-Byte Replacement (Case 2)
  Before:
    000000000007f400 <syscall.Syscall>:
       7f41d: 48 8b 44 24 08          mov    0x8(%rsp),%rax
       7f422: 0f 05                   syscall
  After:
    000000000007f400 <syscall.Syscall>:
       7f41d: ff 14 25 08 0c 60 ff    callq  *0xffffffffff600c08
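The byte patterns on this slide can be reproduced with a small offline patcher. The following is a minimal sketch in Python, not the actual ABOM implementation. It covers only 7-Byte Replacement (Case 1) and the 9-Byte Replacement. The names `TABLE_BASE`, `MAX_NR`, `entry_address`, `callq_abs`, and `patch` are hypothetical. The slot formula `TABLE_BASE + 8 * (nr + 1)` is an assumption inferred from the two call targets shown on the slide (`0x...600008` for syscall 0 and `0x...600080` for syscall 15), and the `MAX_NR` cut-off is likewise an assumption chosen so that slots stay below the `0x...600c08` entry used by Case 2.

```python
import struct

# Assumption: base of the entry table, inferred from the call targets on the slide.
TABLE_BASE = 0xFFFFFFFFFF600000
# Assumption: hypothetical upper bound so slots stay below the 0x...600c08 entry.
MAX_NR = 383

SYSCALL = b"\x0f\x05"          # syscall
MOV_EAX_IMM = b"\xb8"          # mov $imm32,%eax  (5 bytes with the immediate)
MOV_RAX_IMM = b"\x48\xc7\xc0"  # mov $imm32,%rax  (7 bytes with the immediate)
JMP_BACK_9 = b"\xeb\xf7"       # jmp -9: lands on the first byte of the 9-byte site


def entry_address(nr: int) -> int:
    """Hypothetical mapping from a syscall number to its table slot."""
    return TABLE_BASE + 8 * (nr + 1)


def callq_abs(addr: int) -> bytes:
    """Encode `callq *addr` as ff 14 25 disp32 (disp32 is sign-extended)."""
    disp = addr & 0xFFFFFFFF
    extended = disp | 0xFFFFFFFF00000000 if disp & 0x80000000 else disp
    if extended != addr:
        raise ValueError("address is not reachable with a 32-bit displacement")
    return b"\xff\x14\x25" + struct.pack("<I", disp)


def patch(code: bytes) -> bytes:
    """Rewrite matching syscall sites in place; the length never changes.

    A linear byte scan can misjudge instruction boundaries in real binaries;
    this sketch ignores that problem.
    """
    out = bytearray(code)
    i = 0
    while i < len(out):
        # Case 1: mov $nr,%eax ; syscall  ->  callq *slot  (7 bytes)
        if out[i:i + 1] == MOV_EAX_IMM and out[i + 5:i + 7] == SYSCALL:
            nr = struct.unpack_from("<I", out, i + 1)[0]
            if nr <= MAX_NR:
                out[i:i + 7] = callq_abs(entry_address(nr))
                i += 7
                continue
        # 9-byte: mov $nr,%rax ; syscall  ->  callq *slot ; jmp back
        elif out[i:i + 3] == MOV_RAX_IMM and out[i + 7:i + 9] == SYSCALL:
            nr = struct.unpack_from("<I", out, i + 3)[0]
            if nr <= MAX_NR:
                out[i:i + 7] = callq_abs(entry_address(nr))  # Phase-1
                out[i + 7:i + 9] = JMP_BACK_9                # Phase-2
                i += 9
                continue
        i += 1
    return bytes(out)


if __name__ == "__main__":
    print(patch(bytes.fromhex("b8000000000f05")).hex(" "))
    print(patch(bytes.fromhex("48c7c00f0000000f05")).hex(" "))
```

Case 2 is left out because the syscall number is loaded from the stack at run time, so the replacement has to call a generic entry rather than a per-syscall slot. How the callee returns past the trailing `jmp` in the 9-byte case is not shown on the slide and is not modeled here.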
![Page 26: X-Containers: Breaking Down Barriers to Improve ...delimitrou/slides/... · Linux Kernel namespaces cgroups SELinux Container Shared kernel attack s s surface and TCB Not allowed](https://reader034.vdocuments.site/reader034/viewer/2022042920/5f657dd480e3d952370bf1fc/html5/thumbnails/26.jpg)
The Exokernel Approach
• Separating protection and management
[Diagram: two stacks side by side. Monolithic OS Kernel: processes on an operating system kernel on the hardware. Exokernel: processes, each with its own library OS, on an exokernel on the hardware.]