Preventing Server Sickness Becoming A Pandemic - Benelux March 2013
Prevent Server Sickness Becoming a Pandemic!
Gabriella Davis
The Turtle Partnership
[email protected]: gabturtle

Fixing Your Server
What causes server sickness
Tools to spot sickness
Getting Your Server Back to Full Health

Server Sickness
The problem with Domino
How does a server get sick?
– Vulnerabilities
– Aging Configurations
– Bad Habits
– Developers Gone Wild

The Problem With Domino
“My Server Is Running Fine”
Server Stability
– Often despite our best efforts
Tasks that just run
– even without being properly configured

Vulnerabilities
Start with the OS
– patch levels
– unnecessary processes with exposed ports
– disk and data security
Then the hardware
– It’s all about disk performance
– Using a SAN? Is the SAN configured for Domino?
– Transaction logs configured?

Vulnerabilities
Security
– ACLs
  • -Default- and Anonymous
  • LocalDomainServers
HTTP vs HTTPS
LDAP
DIIOP
Sametime

Aging Configurations
What can give you problems over time
– Database sizes
– More users
– More tasks and features

Bad Habits
What are your users doing?
– what features are they using
– how are they using them
  • are they creating repeating 10yr appointments for instance
  • are they copying themselves on emails
Password quality for HTTP passwords

Giving Developers Power
Allowing development to dictate replication and agent scheduling
The curse of XPages code that has not been production tested
Demands for “LDAP” or “DIIOP” for an application to work

Tools to Spot Sickness
Understanding Priorities
DDM Probes and Event Analysis
Statistics
Catalog.nsf
QoS - new with Domino 9
Enhanced Fault Reporting - new with Domino 9

Understanding Priorities
Server role
– What do you want from your server
– What are statistics telling you
Warning Levels
– Is it safe to ignore ‘Warning (Low)’ and focus on ‘Fatal’ or ‘Failure’?
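
One way to think about the warning-level question is to treat urgent events and repeated low-level warnings as two separate queues. The sketch below is only an illustration in Python: the event tuples, the source names and the `repeat_threshold` value are hypothetical, and it does not read from DDM or any Domino API.

```python
from collections import Counter

# Lower rank = more urgent. Only the severities named on the slide are used.
SEVERITY_RANK = {"Fatal": 0, "Failure": 1}

# Hypothetical (severity, source) pairs, e.g. copied from an event report.
events = [
    ("Warning (Low)", "Replication"),
    ("Failure", "Mail routing"),
    ("Warning (Low)", "Replication"),
    ("Fatal", "Database"),
    ("Warning (Low)", "Replication"),
]

def triage(rows, repeat_threshold=3):
    """Return (urgent events sorted by severity, low warnings that keep repeating)."""
    urgent = sorted(
        (row for row in rows if row[0] in SEVERITY_RANK),
        key=lambda row: SEVERITY_RANK[row[0]],
    )
    low_counts = Counter(row for row in rows if row[0] == "Warning (Low)")
    persistent = [row for row, count in low_counts.items() if count >= repeat_threshold]
    return urgent, persistent

urgent, persistent = triage(events)
print(urgent)
print(persistent)
```

The point of the second list is that a low warning seen once may be safe to ignore, while the same low warning repeating is worth a look.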

Bringing Problems to You
Event Handlers, Event Generators, Statistics, Fault Reports and DDM Probes - where to start
Setting Statistic Thresholds
Choosing and configuring probes
Reviewing Faults
Setting up QoS behaviour

Bringing Problems To You
Why we set up collection hierarchies for DDM
– and how
Daily and Weekly DDM reviews
– What to look out for

Probes for Mail Servers
Security - Weekly
Directory Performance
Critical mail routes
Mail ‘Slack’

Probes for Application Servers
Agent run times
– agent cpu usage
Security and Web Configuration

Probes for Struggling Servers
OS level
– disk performance (beware of reported SAN problems)
– memory
– network

What to look for
Fatal problems
Persistent Warnings
Peak activity behaviour
– uptick in problems at 9am, 1pm etc
Repetitive low level ‘annoyances’
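
Spotting an uptick at particular times of day comes down to counting events per hour. The following is a minimal Python sketch, assuming you have already exported timestamps and severities into a list; the sample data and the `threshold` value are invented for illustration and nothing here talks to Domino.

```python
from collections import Counter
from datetime import datetime

# Hypothetical (timestamp, severity) pairs from an exported event list.
events = [
    ("2013-03-04 09:02", "Warning (Low)"),
    ("2013-03-04 09:10", "Warning (Low)"),
    ("2013-03-04 09:41", "Failure"),
    ("2013-03-04 13:05", "Warning (Low)"),
    ("2013-03-04 13:20", "Failure"),
    ("2013-03-04 16:30", "Warning (Low)"),
]

def events_per_hour(rows):
    """Count events by hour of day (0-23)."""
    counts = Counter()
    for stamp, _severity in rows:
        hour = datetime.strptime(stamp, "%Y-%m-%d %H:%M").hour
        counts[hour] += 1
    return counts

def peak_hours(rows, threshold):
    """Return the hours whose event count reaches the threshold, in order."""
    return sorted(h for h, n in events_per_hour(rows).items() if n >= threshold)

print(peak_hours(events, threshold=2))
```

Run over several days of data, the same count makes it easier to tell a genuine 9am login peak from a one-off.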

Catalog.nsf
Not every database is immediately visible but they are all there (just hidden with selection formulae)
It’s a good place to start looking for multiple replicas
It’s a good place to find ACL issues
Replicates around your domain and updates overnight

QoS - Quality of Service
Monitors server health and performance
Monitors application behavior, stability and hangs
Restarts Domino if it thinks there are memory issues or an application is hung
Shuts down Domino if a clean shutdown doesn’t happen and the server hangs
Controlled via notes.ini settings and dcontroller.ini
Requires Domino to be running under the Java Controller
  • nserver -jc

QoS Configuration
Starting Domino under Java Controller should create a dcontroller.ini file
QOS_Enable=1
In Notes.ini
  • QOS_ProbeInterval (defaults to 1 min)
  • QOS_ProbeTimeout (defaults to 5 mins)
  • QOS_ShutDown_Timeout
  • QOS_Apps_Timeout
  • QOS_Shutdown_Timeout

QOS - Potential Problems
QOS doesn’t support passwords on server IDs; the restart will pause at the password entry screen
QOS timeouts being too low
Don’t enable QOS on servers without transaction logging

Enhanced Fault Reporting
Fault Reporting Database - lndfr.nsf
Expanded to include a By Disposition view
– all faults when analyzed have a disposition value that categorises as
  • Problem
  • Possible Problem (possibly actionable)
  • Possible Problem (likely NOT actionable)
  • Informational
  • Unknown (investigate)

Possible Problem - Actionable
Out Of Memory: Represents a crash in which the Java virtual machine (JVM) ran out of a memory resource such as heap space.
Launched Notes multiple times: Indicates that the user quickly launched multiple instances of the Notes client.
Possible hang: Indicates that the Notes client was manually terminated while it appeared to be doing useful work.
User Kill: Indicates that the user manually terminated the client while it appeared to be waiting for input or network timeout.

Back to Full Health
Getting Control
– Mail, Databases and ECLs
– SMTP
– Agent Scheduling
– Directories
– Adminp
– LDAP
– Tasks and Internet Site Documents
Domino Configuration Tuner

Getting Control - Mail and Databases
Setting ACLs at directory level (Editor)
Lock down ECLs via Policies
Introducing quotas alongside server based archiving
Consider archiving files to a dedicated server
Upgrade to 8 and enable OOO router instead of agents
Disable forwarding rules set up by users
Use message tracking and mail rules very sparingly
Disable on the fly searching of non indexed databases

Database Management Tools
DBMT Server Command
  • runs copy-style compact operations
  • purges deletion stubs
  • expires soft deleted entries
  • updates views
  • reorganizes folders
  • merges full-text indexes
  • updates unread lists
  • ensures that critical views are created for failover
– Replaces Updall
  • load updall -nodbmt tells updall to run but not perform the functions that DBMT already does

DBMT Parameters
-compactThreads
-updallThreads
-ftiThreads
-timeLimit refers to compact timeout for DBMT
-range starttime stoptime
-compactNdays (run Compact every x days)
-ftiNdays (run FT Index every x days)
-force d (day, Sunday = 1) fixup if compact fails for consecutive day

Getting Control - SMTP
Restrict relaying to specific IP addresses not network ranges
Beware of allowing authenticated relaying and opening up to dictionary attacks
Restrict rights to send to internal groups from internet addresses
Don’t accept mail for local part matches
Configure your server for HTML mail not plain text
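
The advice to relay only for specific addresses can be audited with a short script. This is a sketch under stated assumptions: the allow-list is written in CIDR notation purely for illustration (it is not the syntax of the Domino relay fields), the addresses are documentation examples, and `broad_relay_entries` is a hypothetical helper name.

```python
import ipaddress

def broad_relay_entries(entries):
    """Return the entries that cover more than one address."""
    broad = []
    for entry in entries:
        # strict=False accepts a host address written with a prefix length.
        network = ipaddress.ip_network(entry, strict=False)
        if network.num_addresses > 1:
            broad.append(entry)
    return broad

# Hypothetical allow-list transcribed from a relay configuration.
allowed = ["192.0.2.25", "192.0.2.0/24", "198.51.100.7/32"]
print(broad_relay_entries(allowed))
```

Anything the function returns is a range rather than a single host and is a candidate for tightening.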

Getting Control - SMTP (more)
Don’t allow all connecting hosts to deliver mail inbound; if you use a service, restrict to those hosts
Use services / tools to spot attacks such as
– persistent attempts to mass deliver within a time period
– continual failures by a host to deliver to a correct address
Move responsibility for that first line of defense away from native Domino

Getting Control - Agent Scheduling
When are agents set to run
– amgr_newmaileventdelay
– amgr_newmailagentmininterval
If you’re using OOO agents how often are they scheduled
Do users have private agents running
– Sh Agents [DBName]
  • All shared and private agents in a database
Who has rights to run agents

Getting Control - Directories
Avoid adding additional views to the Domino Directory
The risk of allowing local replicas with Author rights
Directory Assistance
– Sh xdir

Getting Control - Adminp
Purge old documents
Requests awaiting approval
Tell adminp process NEW not ALL

Getting Control - LDAP
Allowing anonymous access to query LDAP
Authenticating LDAP queries
Extended Directory Catalog used by LDAP
Relying on DNS
Not configuring the LDAP task correctly to allow large searches with no timeouts
Maintaining schema.nsf

Getting Control - Tasks and Program Documents
Disable tasks you don’t need
Schedule overnight tasks so they don’t overlap
– and don’t conflict with backups
Use program documents so you can review and manage easily
– sh config servertasksat*
Keeping templates on every server
Using compact -B
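
Checking that overnight tasks do not overlap each other or the backup window is a simple interval comparison. The Python sketch below assumes a hand-maintained schedule; the task names and times are hypothetical, and it assumes no window crosses midnight.

```python
from itertools import combinations

# Hypothetical overnight schedule: name -> (start, end) in minutes after midnight.
schedule = {
    "compact": (60, 180),   # 01:00-03:00
    "updall": (150, 240),   # 02:30-04:00
    "backup": (240, 360),   # 04:00-06:00
}

def overlapping(tasks):
    """Return pairs of task names whose time windows overlap."""
    clashes = []
    for (a, (a_start, a_end)), (b, (b_start, b_end)) in combinations(sorted(tasks.items()), 2):
        # Two windows overlap when each starts before the other ends.
        if a_start < b_end and b_start < a_end:
            clashes.append((a, b))
    return clashes

print(overlapping(schedule))
```

Windows that merely touch (one ends at 04:00, the next starts at 04:00) are not reported; add a margin to the end times if tasks tend to overrun.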

Getting Control - Internet Site Documents
Web Configuration means TCPIP tasks are configured in the server document and are server wide
– often enabled by default
Internet site documents require you to opt in for TCPIP services
– configured by hostname

Domino Configuration Tuner
Domino Configuration Tuner is an analysis tool based on a set of pre-configured best practice/worst practice rules
The Rules are shipped by IBM with the Lotus installs and are updated via a public update site
Makes recommendations on configuration changes to enhance performance and security and reduce TCO

How does it work?
Run and installed via the Domino Configuration Tuner database
Updated by online template updates and rule updates
DCT rules and results are held in a local database and will require a restart of the client for changes to take effect
Scans
– Server documents
– notes.ini settings
– advanced database properties
Intended to scan servers in a single domain

How does it work?
Creates reports on each scanned server based on the rules you select
Each report contains
– issues
– recommendations for adjustments
– links to supporting documentation

Pre-requisites
v8 Notes client (standard or basic) or administrator
dct.nsf database and dct.ntf template
servers 7.x or higher

Setup
DCT.NSF
StdDominoConfigTuner Template (dct.ntf)
ID must have reader access to names.nsf
ID must have ‘View Administrator’ rights
Requires no server or domain changes

View Administrator Rights
Server Document
Security Tab
View Administrator is a subset of ‘Administrator’ rights
Think of it as ‘Show’ not ‘Tell’ rights
– Sh users - YES
– tell http refresh - NO

DCT Preferences
List of all rules
Review rule, description and supporting documentation
All rules are enabled by default for all scans
Enable and Disable rules

DCT Updates
Connects to the IBM site to download
– must have outbound connectivity

DCT Updates
Click ‘check for updates’
Connects to an external IBM site to identify any template or rule updates

DCT Updates
Accept license and updates download
It’s not possible to selectively download

DCT Updates - Finished
“Successful” screen will notify you to restart your client
You may need to do 2 client restarts before DCT can be used

Running the tuner
First select the servers in your current domain you want to run against
The list of servers is retrieved from the domain of the home server identified in your location document
Change locations to scan a different domain

Running the tuner
You can manually type in the full hierarchical names of any other servers you want to scan as part of this analysis
Separate multiple server names with commas, semicolons or new lines
You can only scan servers you can reach so you need a connection document to any you list
– or the server needs to be available via your passthru server in your location
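
If you keep the list of extra servers in a text file, the three accepted separators can be normalised before pasting the names into the tuner. This is a small Python sketch only; the server names are made up and `parse_server_list` is a hypothetical helper, not part of DCT.

```python
import re

def parse_server_list(text):
    """Split on commas, semicolons or new lines and drop empty entries."""
    parts = re.split(r"[,;\n]+", text)
    return [part.strip() for part in parts if part.strip()]

# Hypothetical hierarchical server names using all three separators.
raw = "Mail01/Servers/Acme, App01/Servers/Acme;Hub01/Servers/Acme\nDir01/Servers/Acme"
print(parse_server_list(raw))
```

Stripping whitespace around each name also removes stray carriage returns from files saved on Windows.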

Understanding the Results
Summary results
Issues by criticality

Understanding the Results
Summary results
Servers that failed to scan
– reason why scan failed

Understanding the Results
Summary results
Detailed list of rules evaluated

Understanding the Results
View the current report
Select ‘change’ to view a different report

Understanding the Results
Filter results to make analysis easier
– by server
– by specific rules
– by severity

Understanding the results
Categorised results of recommendations
Sorted by criticality and then by server name

Understanding the results
Each recommendation comes with an explanation so you can decide, result by result, whether you want to make the change

Understanding the results
Each recommendation is provided with a link to supporting best / worst practice documentation

Working with Rules
Disabling and enabling rules can be done through the ‘Preferences’

Working with Rules
Selecting a rule shows the description and links to the best / worst practice documentation

Making Changes
Advanced Database Properties
– assigned en masse via Domino Admin
notes.ini settings
– assigned via the command set config xxx=x
– shown via the command sh config xxx
Many recommendations refer to ‘some databases’ but don’t specify which ones - check which ones will be affected

Resources
Domino Configuration Tuner blog
– http://www.bleedyellow.com/blogs/DCT/
– details and explanations of new rules published each month

Summary
• No matter how well your servers are configured they will continue to degrade in performance over time unless you pro-actively monitor and fix
• Many of the server performance issues will be seen first by your users before they filter down to you
• Make reviewing your server configuration using DDM probes followed by a DCT analysis part of every server upgrade
• Enable probes that are specific to the server role: Mail and Directory probes on Mail servers and Agent probes on Application servers
• Use Security and Database probes configured in DDM to stay on top of any low level warnings that could cause larger problems in the future
• Don’t over configure your servers to monitor everything or you’ll be looking for a needle in a haystack. Ask your servers to tell you only what you need to be aware of immediately
• Use the built in tools (DCT, Statistics, DDM, Catalog, Activity Trends) to monitor your servers and gain a good understanding of what is their ‘normal’ behaviour so you can more easily spot when something goes wrong.