Filesystem Layouts and Apache Performance
Jeff D. Almeida [email protected]
Foreword
Convention used throughout: Our sample site is “puppysurprise.com”, a hypothetical
advertiser-supported content site where professional football players share their favorite dog stories from each year of their careers.
Acknowledgements/Apologies-in-advance: Dean Gaudet, Christopher Alexander, The Grateful Dead
The Proposition
System Administrators can directly impact how their webservers perform by decisions they make when initially configuring systems.
The Mission
Teach system administrators about “best practices” for system installations that lead to optimum performance.
Enable system administrators to make informed decisions regarding trade-offs among performance and other constraints (e.g. security, multiple users, available hardware).
The Destination
Apache webservers on optimally configured hardware.
Apache performing at its best.
Hardware not taxed unnecessarily.
The Origin
Default installation for Apache is suboptimal.
Many vendor-provided installs are even worse.
Default/“standard” hardware configurations exacerbate the problem.
Many times, site design decisions contribute as well.
A Long, Strange Trip
Until “recently” performance wasn’t much of a consideration.
Old, proven ways of thinking among Unix admins.
Resulting DocumentRoot:
  /usr/local/apache/htdocs
  /usr/local/web/sites/puppysurprise/docs
  /var/web/puppysurprise/docs
Vendor Paradox
In theory, not selling high-performance webservers.
In practice, web services have been the “killer app” for Linux, FreeBSD, etc.
A lot of this appears to be whimsy.
Resulting DocumentRoot:
  /home/httpd/sites/puppysurprise/docs
  /usr/share/webdocs/puppysurprise
  /usr/users/httpd/htdocs
Hardware Vendors
Old guard, traditional Unix vendors slowest to adapt.
High-density rackmount Intel/Linux vendors doing better.
  Webservers are their bread-and-butter.
Problems arise from the disk configuration.
Site Design Problems
Traditional thinking favors “deep” websites.
  /teams/1997/Pats/qbs/Drew_Bledsoe/index.html
Traditional thinking favors designer-configurable access controls.
  .htaccess files
The Timeless Way of Building Webservers
Optimized hardware configuration
Optimized filesystem layouts
Optimized Apache configuration
Optimized website design
Patterns for Successful Server Design
Not a “one-size-fits-all” sort of solution.
Different environments and requirements call for different solutions.
Learn to make installation-time decisions that optimize aspects of server configuration.
Patterns are manifestations of “the way”.
Hardware Optimizations I
Avoiding Device Contention
  Three things looking for disk seeks: docs, logs, swap
  In order of decreasing efficiency, keep the three on separate:
    Buses (unlikely on Intel hardware)
    Controllers
    Channels
    Spindles
    Partitions
  Avoid clutter to speed seeks
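
The weakest level of that separation (separate partitions) can be checked by comparing device IDs. The following is a minimal sketch, assuming a POSIX system; the function names are hypothetical and not part of any Apache tooling. Note that st_dev only distinguishes filesystems, so it cannot tell whether two partitions share a spindle, channel, or controller.

```python
import os

def device_of(path):
    """Return the device ID (st_dev) of the filesystem that holds path."""
    return os.stat(path).st_dev

def shared_devices(paths):
    """Group paths by device ID and keep only groups with more than one path.

    An empty result means every path is on its own filesystem. A non-empty
    result lists the paths that will compete for seeks on the same partition.
    """
    groups = {}
    for p in paths:
        groups.setdefault(device_of(p), []).append(p)
    return {dev: ps for dev, ps in groups.items() if len(ps) > 1}
```

Calling shared_devices() with the docroot and the log directory of a given server returns an empty dict when the two live on different partitions.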
Hardware Optimizations II
Make the right purchasing decisions
  Enough RAM to never swap
  Disks and controllers
Hardware RAID 0+1
  Lesson from the database gurus
  Smooth out spikes
  Superior to Software RAID, RAID 5
SCSI not IDE
  SCSI is non-blocking
Filesystem Layouts I
Understand the way Apache opens files for reading
  Not a bug, actually proper behavior
Example:
  Docroot /usr/local/web/sites/puppysurprise/docs
  File /teams/1997/Pats/qbs/Drew_Bledsoe/index.html
  12 layers of parent directories!
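
The count of 12 can be reproduced by listing every directory above the file, starting at the filesystem root. This is a small sketch of that arithmetic only; the function name is hypothetical and it does not model Apache internals.

```python
from pathlib import PurePosixPath

def parent_directories(docroot, url_path):
    """List every directory above the requested file, root first."""
    full = PurePosixPath(docroot) / url_path.lstrip("/")
    # .parents runs from the nearest directory up to "/", so reverse it.
    return [str(p) for p in list(full.parents)[::-1]]

deep = parent_directories(
    "/usr/local/web/sites/puppysurprise/docs",
    "/teams/1997/Pats/qbs/Drew_Bledsoe/index.html",
)
flat = parent_directories("/puppysurprise", "/97Pats/Drew_Bledsoe.html")
print(len(deep), len(flat))
```

The deep layout yields 12 directories (counting "/"), while the flat layout from the Conclusions slide yields 3.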
Filesystem Layouts II
Based on this, put our docroot at /puppysurprise
Why not?
Worst case, it’s on a separate partition anyway at this point, right?
Apache Configuration I
We’re focusing on those performance-tuning tips that are filesystem-specific. Plenty of others exist.
  Options +FollowSymLinks
  AllowOverride None
  Don’t use suexec (or UserDir at all, if possible)
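
To see what AllowOverride None saves, it helps to list the .htaccess files the server would have to look for on a single request. This is a hedged sketch, assuming overrides are enabled all the way from the filesystem root down; the function name is hypothetical.

```python
from pathlib import PurePosixPath

def htaccess_candidates(file_path, filename=".htaccess"):
    """Return the per-directory override files checked for one request.

    Assumes overrides are allowed in every directory from "/" down to the
    directory containing file_path.
    """
    dirs = list(PurePosixPath(file_path).parents)[::-1]  # root first
    return [str(d / filename) for d in dirs]

for candidate in htaccess_candidates(
    "/usr/local/web/sites/puppysurprise/docs/teams/1997/Pats/qbs/Drew_Bledsoe/index.html"
):
    print(candidate)
```

Each printed path is a file lookup that happens whether or not the file exists. With AllowOverride None the list is empty by definition.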
Apache Configuration II
Can you afford to log in some other fashion?
Unless docs and logs are on separate buses, there’s always going to be some contention.
Logging options:
  disable it altogether
  buffered logging
  remote syslog
  remote database
  conjectures: named pipe? netpipe?
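
The idea behind buffered logging is to trade many small writes for fewer large ones. The sketch below illustrates that trade-off with Python's standard logging module, not with Apache itself; the logger name, the capacity value, and the helper function are all assumptions made for the example.

```python
import io
import logging
from logging.handlers import MemoryHandler

def make_buffered_logger(stream, capacity=100):
    """Build a logger that holds records in memory and writes them in batches."""
    target = logging.StreamHandler(stream)
    target.setFormatter(logging.Formatter("%(message)s"))
    # Records are flushed to the target once `capacity` records are buffered
    # (or immediately if a record is CRITICAL or above).
    buffered = MemoryHandler(capacity=capacity,
                             flushLevel=logging.CRITICAL,
                             target=target)
    logger = logging.getLogger("access_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers = [buffered]
    return logger, buffered

stream = io.StringIO()  # stands in for the log file on disk
logger, buffered = make_buffered_logger(stream, capacity=3)
logger.info("GET /97Pats/Drew_Bledsoe.html 200")
logger.info("GET /index.html 200")
nothing_written_yet = stream.getvalue() == ""
logger.info("GET /97Pats/index.html 200")  # third record triggers the flush
```

The cost of this approach is that records still in the buffer are lost if the process dies before a flush.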
Site Design
Performance favors the horizontal
Let the sysadmin set the access controls in httpd.conf
Akamai? Or at least Aka-me?
Conclusions
File /puppysurprise/97Pats/Drew_Bledsoe.html is in a performance-optimized location.
Putting /puppysurprise/ on its own RAID 0+1 controller is a hardware performance optimization.
Server logs are on their own partition (at worst) or another machine altogether.