Network Administration
Research and Analysis
Week 7
Theory of Network Admin
Burgess, Ch. 11
- Science vs. Technology
- Studying Complex Systems
- Purpose of Observation
- Evaluation Methods and Problems
- Evaluating a Hierarchical System
- Deterministic and Stochastic Behaviour
- Observational Errors
- Strategic Analyses
A Scientific Basis for System Administration
- System administration has always involved experimentation.
- The development of networks has led to an exponential increase in system complexity and a corresponding increase in the difficulty of management.
- A purely mechanical approach may no longer be adequate: it is time for a theoretical basis.
- There is world-wide interest, encouraged by professional organisations (SAGE, USENIX, ACM, IEEE, ACS).
Network Admin Research: Science vs Technology
- System admin studies are mostly “applied research”, which results in the development of a specialised toolset that solves a local or specific problem.
- Some workers have attempted to collate results to form a more general technology of more permanent or global value.
- But this is not Science!
What is “Science”? The Scientific Method
- Knowledge is advanced by a series of studies that either verify or falsify a hypothesis.
- A study may be theoretical or practical, but all contribute to a larger ongoing discussion that leads to progress.
- A single study is rarely the end of the discussion.
- Each study is usually repeated and verified or challenged by other researchers.
- Reproducibility is very important.
Scientific Method
- Motivation: statement of context and objectives
- Appraisal of problems
- Theoretical model: used to understand or solve problems and provide a framework for comparison and measurement
- Design an experiment: the approach
- Perform an experiment: obtain results
- Evaluation or verification of approach and results
Scientific Method
- Science is a dialogue of theories
- Science proceeds by experiment
- Theory is needed to interpret observations
- Observations are needed to disprove theory
Network Admin Research: Studying Complex Systems
Areas of study in system admin have been technical and/or behavioural, and include:
- Reliability studies
- Finding and evaluating methods for system integrity
- Observations which apply to non-linear behaviour
- Issues related to strategy and planning
Mostly empirical or qualitative case studies.
Purpose of Observation
- To gather information about a problem, enabling development of a technology which solves it
- To evaluate the technology for effectiveness (i.e. whether it fulfils its design goals)
- But evaluation of sysadmin experiments is difficult, due to vested interests and a lack of clearly defined metrics
Evaluation Methods and some Problems
- Ideally there should be a repeatable test yielding measurements.
- The trouble is that, while a good system administrator could do this heuristically:
  - Heuristics are very difficult to quantify
  - Different sysadmins work in different ways
  - There is extreme variability in systems and users
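One hedged way to picture a "repeatable test yielding measurements" is to run the same task several times and report the spread as well as the mean. The sketch below is illustrative only: `measure`, `summarise` and the sample workload are hypothetical names, not something taken from the lecture or from Burgess.

```python
import statistics
import time


def measure(task, repeats=5):
    """Run `task` several times and return the elapsed wall-clock times in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        timings.append(time.perf_counter() - start)
    return timings


def summarise(samples):
    """Return (mean, sample standard deviation) of a list of measurements."""
    mean = statistics.mean(samples)
    # stdev needs at least two samples; report 0.0 for a single measurement
    spread = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return mean, spread


if __name__ == "__main__":
    # Hypothetical workload standing in for an administrative task
    timings = measure(lambda: sum(range(100_000)), repeats=5)
    mean, spread = summarise(timings)
    print(f"mean = {mean:.6f} s, stdev = {spread:.6f} s over {len(timings)} runs")
```

A large spread relative to the mean is itself a finding: it suggests the variability in systems and users is dominating whatever is being measured.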
Some Research Topics
- Efficiency & automation
- Network administration methods/models
- Reliability studies
  - Fault management
  - Metrics
  - Patterns of events, prediction & performance
A Common Research Topic and the Problems
Ways to relieve administrators of tedious work, so they can use their talents better in other ways. What sort of experiment is needed? E.g.:
- Measure the time spent working on a system, but the time required usually expands to occupy the time available!
- Record the actions of an automatic system and compare them with those of a human administrator, but this depends on the person: different people do things in different ways.
Network Admin Research: Effect of Vested Interests
- Sysadmins require tools.
- Such tools often acquire a dedicated following of users who grow to like them regardless of what the tools allow them to achieve.
- The marketing skills of one software vendor might be better than others' and create a bias in the marketplace that affects the perceived usefulness of a particular tool.
- So one cannot estimate the effectiveness of a tool based just on the number of people who use it.
Evaluating a Hierarchical System
- What level of detailed decomposition of levels within the hierarchy is appropriate?
- Building a model of the hierarchy is often the best way to address complexity: focus on what’s important or practical.
- Experiments based on this model might then involve:
  - Measurements
  - Simulations
  - Case studies
  - User surveys
Faults
IEEE classifies software anomalies as:
- O/S crash
- Program hang
- Program crash
- Input problem
- Output problem
- Failed required performance
- Perceived total failure
- System error message
- Service degraded
- Wrong output
- No output
Most common faults for sysadmins are:
- Input problem: missing or inappropriate configuration
- Failed performance: usually through loss of resources
Software problems can be eliminated by re-evaluation of individual software components.
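To make the idea of "most common faults" concrete, an incident log can be tallied by anomaly category. The following is only a sketch: the `Anomaly` enum mirrors the categories listed on the slide, while `most_common_faults` and the example log are hypothetical and the log entries are made up for illustration.

```python
from collections import Counter
from enum import Enum


class Anomaly(Enum):
    """Anomaly categories as listed on the slide."""
    OS_CRASH = "O/S crash"
    PROGRAM_HANG = "Program hang"
    PROGRAM_CRASH = "Program crash"
    INPUT_PROBLEM = "Input problem"
    OUTPUT_PROBLEM = "Output problem"
    FAILED_PERFORMANCE = "Failed required performance"
    PERCEIVED_TOTAL_FAILURE = "Perceived total failure"
    SYSTEM_ERROR_MESSAGE = "System error message"
    SERVICE_DEGRADED = "Service degraded"
    WRONG_OUTPUT = "Wrong output"
    NO_OUTPUT = "No output"


def most_common_faults(log, top=2):
    """Return the `top` most frequent categories as (Anomaly, count) pairs."""
    return Counter(log).most_common(top)


# Made-up incident log, for illustration only (not real data)
EXAMPLE_LOG = [
    Anomaly.INPUT_PROBLEM,
    Anomaly.FAILED_PERFORMANCE,
    Anomaly.INPUT_PROBLEM,
    Anomaly.PROGRAM_CRASH,
    Anomaly.INPUT_PROBLEM,
    Anomaly.FAILED_PERFORMANCE,
]

if __name__ == "__main__":
    for anomaly, count in most_common_faults(EXAMPLE_LOG):
        print(f"{anomaly.value}: {count}")
```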
Reliability and Redundancy
Average (mean) time before failure:
  $R = \dfrac{\text{Mean Uptime}}{\text{Total Elapsed Time}}$
With parallel or redundant components:
  $R_{\text{parallel}} = 1 - (1 - R_1)(1 - R_2) \cdots (1 - R_n)$
With serial or dependent components:
  $R_{\text{series}} = R_1 R_2 R_3 \cdots$
Probability of failure:
  $P(t) = \exp(-Rt)$
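The combination rules named on this slide can be tried out numerically. The sketch below rests on stated assumptions rather than on the slide's exact notation: it treats R as the uptime fraction, assumes components fail independently, and uses the standard textbook rules (a product for dependent components, the complement of a product of failure fractions for redundant ones). All function names are illustrative, and `rate` is a hypothetical constant failure rate.

```python
import math


def availability(mean_uptime, total_elapsed):
    """Reliability R as the fraction of elapsed time the system was up."""
    if total_elapsed <= 0:
        raise ValueError("total_elapsed must be positive")
    return mean_uptime / total_elapsed


def series(reliabilities):
    """Dependent components: all must work, so the fractions multiply."""
    return math.prod(reliabilities)


def parallel(reliabilities):
    """Redundant components: the system fails only if every component fails."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


def exponential_law(rate, t):
    """exp(-rate * t): with an assumed constant failure rate, the chance of no failure up to time t."""
    return math.exp(-rate * t)


if __name__ == "__main__":
    parts = [0.9, 0.9, 0.9]
    print(f"series:   {series(parts):.3f}")
    print(f"parallel: {parallel(parts):.3f}")
```

With three components at R = 0.9 each, the serial arrangement drops to about 0.729 while the redundant one rises to about 0.999, which is the practical argument for redundancy.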
MTBF and Computers
Computer system MTBF doesn’t account for:
- Dependency: not all systems have the same attachments
- Fail-over and latency of service: systems may fail, then recover after a single delay, and this may occur repeatedly!
- Patterns of usage: user behaviour may bias the outcome
Some Metrics: Net
- Total number of packets
- Amount of IP fragmentation
- Density of broadcast messages
- Number of collisions
- Number of sockets (TCP) in and out
- Number of malformed packets

Some Metrics: Storage
- Disk usage in bytes
- Disk operations per second
- Paging rate (free memory and thrashing)

Some Metrics: Processes
- Number of privileged processes
- Number of non-privileged processes
- Maximum percentage CPU used in processes

Some Metrics: Users
- Number logged on
- Total number
- Average time spent logged on per user
- Load average
- Disk usage rise per session per user per hour
- Latency of services
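A couple of these metrics can be sampled with nothing but the Python standard library. This is a minimal sketch, assuming a Unix-like host (`os.getloadavg` is not available on Windows); the function name `sample_metrics` and the dictionary keys are hypothetical, and the network and per-user metrics listed above would need other tools.

```python
import os
import shutil


def sample_metrics(path="/"):
    """Collect disk usage and load average for one host at one instant."""
    usage = shutil.disk_usage(path)          # named tuple: total, used, free (bytes)
    load1, load5, load15 = os.getloadavg()   # 1, 5 and 15 minute load averages
    return {
        "disk_total_bytes": usage.total,
        "disk_used_bytes": usage.used,
        "load_average_1min": load1,
        "load_average_5min": load5,
        "load_average_15min": load15,
    }


if __name__ == "__main__":
    for name, value in sample_metrics().items():
        print(f"{name}: {value}")
```

Sampling this at regular intervals and storing the results gives the time series from which patterns of usage can be studied.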
Distributions
- Delta: constant X
- Uniform: constant Y
- Gaussian or Random
- Normal: “bell curve”
- Black-Body or Planck: approx. exponential
- Poisson: random arrival with mean rate
- Pareto: power law
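Several of these distributions can be sampled directly with Python's `random` module, which helps when building a simulation of a measured quantity. The sketch below is illustrative: `sample_distributions` and all parameter values are assumptions chosen for the example, the Poisson case is represented by the exponential gaps between arrivals at a mean rate, and the Planck distribution is left out because the standard library has no sampler for it.

```python
import random


def sample_distributions(n=1000, seed=0):
    """Draw n samples from several distributions, reproducibly via a fixed seed."""
    rng = random.Random(seed)
    return {
        # Delta: every observation has the same value
        "delta": [5.0] * n,
        # Uniform: every value in the range is equally likely
        "uniform": [rng.uniform(0.0, 10.0) for _ in range(n)],
        # Gaussian / normal "bell curve" with mean 5 and standard deviation 1
        "gaussian": [rng.gauss(5.0, 1.0) for _ in range(n)],
        # Poisson arrivals at mean rate 2 per unit time have exponential gaps
        "poisson_interarrival": [rng.expovariate(2.0) for _ in range(n)],
        # Pareto power law with shape parameter 1.5 (values start at 1)
        "pareto": [rng.paretovariate(1.5) for _ in range(n)],
    }


if __name__ == "__main__":
    for name, values in sample_distributions().items():
        print(f"{name}: min={min(values):.3f} max={max(values):.3f}")
```

Fixing the seed keeps the experiment reproducible, so another researcher can regenerate exactly the same samples.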