Longitudinal Workforce Analysis using Routinely Collected Data: Challenges and Possibilities
Shereen Hussein, BSc MSc PhD
King’s College London
Longitudinal Analysis
General advantages
• Can control for individual heterogeneity
• Subjects serve as their own controls
• Between-subject variation excluded from error
• Can better assess causality than cross-sectional data
General challenges
• Conventional statistical methods require independence between observations
• Longitudinal data are likely to violate this assumption
• Missing data due to attrition
• Data availability
29/5/2012 2
Workforce Data Example: NMDS-SC
• Structure
• Design
• Coverage
• Time span
• Type of information collected
• Data collection and archiving
• Size
NMDS-SC data structure
• Social care providers in England complete NMDS-SC returns
• Aggregate information on the workforce: providers’ database
• Detailed information on all or some individual workers: workers’ database
• The two databases are linkable
NMDS-SC longitudinal analysis: potential
• Data coverage
• Wide range of providers and individual workers’ information
• Sector-specific: uniqueness
• Hierarchical structure
• Workforce development and business sustainability
• Timely
– Demographics, austerity, unemployment
• Economics
– Care costs, including turnover costs
– Pay
• Linkable to local data characteristics
Challenges in NMDS-SC longitudinal analysis
• No sampling framework
• No regular intervals for data collection
• Irregularities in data completion by different providers
• Additions/alterations of variables and fields
• Cumulative nature and consequences on data size and structure
• Archiving
Challenges in NMDS-SC longitudinal analysis- continued
Computational
• Data size
– Innovation in system design and architecture
• Accumulative property
– Scalability of the system
• Changes in data fields
• Variable additions and omissions
• Data over-ride and archiving
– Software and hardware issues
Methodological
• Unusual patterns of follow-up
– Censoring
• Variability in the database over time
• Unbalanced cohort design
• Missing data
– Update frequency
– Attrition
– True exit
• Other methodological issues
Providers’ level longitudinal mapping
• From December 2007 to March 2011
• Linked 18 separate databases on the providers’ level
• Each has records from 13,095 to 25,266
• 421,671 valid records included in the construction
• Number of updates ranged from 0 to 18 per provider
• Continuous process, more records added every 3 months
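The linkage step above can be sketched roughly as follows. This is a minimal illustration only, not the procedure used in the original work: the record layout (a provider identifier plus a last-updated date) and the names `link_snapshots` and `update_counts` are hypothetical, and it assumes an unchanged record reappears in every later extract, so only distinct update dates are treated as events. How an "update" was actually counted is not stated in the slides.

```python
from collections import defaultdict
from datetime import date


def link_snapshots(snapshots):
    """Combine periodic provider extracts into one history per provider.

    `snapshots` is a list of extracts; each extract is a list of
    (provider_id, last_updated) tuples (hypothetical layout).
    Only distinct update dates are kept, on the assumption that an
    unchanged record is carried forward into later extracts.
    """
    history = defaultdict(set)
    for extract in snapshots:
        for provider_id, last_updated in extract:
            history[provider_id].add(last_updated)
    # Sort each provider's distinct update dates chronologically.
    return {pid: sorted(dates) for pid, dates in history.items()}


def update_counts(history):
    """Number of updates after the first recorded event, per provider."""
    return {pid: len(dates) - 1 for pid, dates in history.items()}


# Three toy extracts standing in for the 18 linked databases.
snapshots = [
    [("P1", date(2007, 12, 10)), ("P2", date(2007, 11, 2))],
    [("P1", date(2008, 3, 5)), ("P2", date(2007, 11, 2))],
    [("P1", date(2008, 6, 20)), ("P2", date(2007, 11, 2)),
     ("P3", date(2008, 6, 1))],
]
history = link_snapshots(snapshots)
counts = update_counts(history)
```

Keeping the full list of dates per provider, rather than only a count, leaves the history available for later steps such as measuring the time between updates.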
Meta-data analysis: providers with different numbers of events
Specific example 1: Providers with 18 updates
Specific example 2: Providers with 2 updates
Density distribution plot of providers with at least 2 updates during the period December 2007 to March 2011
Density distributions of the number of days elapsed between two updated providers’ events
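The quantity plotted here, days elapsed between consecutive update events, could be derived along these lines. The provider identifiers and dates are made up for illustration, and `days_between_updates` is a hypothetical helper, not part of any NMDS-SC tooling.

```python
from datetime import date
from statistics import median


def days_between_updates(update_dates):
    """Gaps in days between consecutive update events of one provider."""
    ordered = sorted(update_dates)
    return [(later - earlier).days
            for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical update histories for two providers.
histories = {
    "P1": [date(2008, 1, 15), date(2008, 4, 14), date(2008, 7, 16)],
    "P2": [date(2008, 2, 1), date(2009, 2, 1)],
}
gaps = {pid: days_between_updates(dates) for pid, dates in histories.items()}

# Pool the gaps across providers; these pooled values are what a
# density plot would summarise.
all_gaps = [g for provider_gaps in gaps.values() for g in provider_gaps]
typical_gap = median(all_gaps)
```

In this toy data P1 updates roughly quarterly while P2 leaves about a year between updates, which is the kind of irregular follow-up the slides describe.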
Simple example using providers’ database: workforce stability over time
• Longitudinal changes in care workers’ turnover and vacancy rates over time
– From January 2008 to January 2010
• Changes in reasons for leaving the sector, identified by employers
– Differentiating between those with improved (reduced) turnover rates and those with worse (increased) turnover rates
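A rough sketch of how providers might be split into improved and worse groups is given below. The slides do not define the turnover rate, so the definition used here (leavers over a period divided by staff employed) is an assumption, as are the function names, the `tolerance` parameter and the toy figures.

```python
def turnover_rate(leavers, employees):
    """Leavers as a share of staff employed (assumed definition).

    Returns None when no staff are recorded, so the rate is undefined.
    """
    if employees <= 0:
        return None
    return leavers / employees


def classify_change(rate_t1, rate_t2, tolerance=0.0):
    """Label the direction of a provider's turnover change from T1 to T2."""
    if rate_t1 is None or rate_t2 is None:
        return "unknown"
    change = rate_t2 - rate_t1
    if change < -tolerance:
        return "improved"  # turnover fell between T1 and T2
    if change > tolerance:
        return "worse"     # turnover rose between T1 and T2
    return "unchanged"


# Hypothetical (leavers, employees) pairs at the two time points.
providers = {
    "P1": {"t1": (5, 20), "t2": (2, 20)},
    "P2": {"t1": (1, 10), "t2": (4, 10)},
    "P3": {"t1": (3, 0), "t2": (1, 12)},
}
labels = {
    pid: classify_change(turnover_rate(*r["t1"]), turnover_rate(*r["t2"]))
    for pid, r in providers.items()
}
```

A non-zero `tolerance` would stop very small changes from being labelled as improved or worse; whether that is appropriate depends on the analysis.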
Pre analysis
• Selecting and constructing providers’ panel
– Including those with at least two updates within +/- 3 months of T1 and T2
– 2953 providers with mean coverage duration of 602 days
• Investigate sample representation
• Data quality checks
• Data manipulation/imputation
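The panel selection rule can be illustrated with a short sketch. Several details are assumptions rather than facts from the slides: T1 and T2 are placed on 1 January 2008 and 1 January 2010, "3 months" is approximated as 91 days, the rule is read as requiring one update near each time point, and `nearest_update` and `build_panel` are hypothetical names.

```python
from datetime import date, timedelta

T1 = date(2008, 1, 1)        # assumed day within January 2008
T2 = date(2010, 1, 1)        # assumed day within January 2010
WINDOW = timedelta(days=91)  # "3 months" approximated as 91 days


def nearest_update(update_dates, target, window=WINDOW):
    """Closest update to `target` within +/- `window`, or None if none."""
    candidates = [d for d in update_dates if abs(d - target) <= window]
    if not candidates:
        return None
    return min(candidates, key=lambda d: abs(d - target))


def build_panel(histories):
    """Keep providers that have an update near T1 and another near T2.

    Returns {provider_id: (first_date, second_date, coverage_days)}.
    """
    panel = {}
    for pid, dates in histories.items():
        first = nearest_update(dates, T1)
        second = nearest_update(dates, T2)
        if first is not None and second is not None and first < second:
            panel[pid] = (first, second, (second - first).days)
    return panel


# Hypothetical update histories.
histories = {
    "P1": [date(2008, 2, 10), date(2009, 12, 20)],  # near both T1 and T2
    "P2": [date(2008, 1, 5), date(2008, 9, 1)],     # nothing near T2
    "P3": [date(2007, 6, 1), date(2010, 2, 1)],     # nothing near T1
}
panel = build_panel(histories)
```

Averaging the third element of each panel entry would give a mean coverage duration comparable in kind to the 602 days reported above.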
Some findings: changes in turnover rates
Reason for leaving and turnover rate changes
Analysis expansion: next steps
• Consider changes over a longer period of time
• Examine other providers’ characteristics
• Different take on panel inclusion criteria
• Link to individual workers’ longitudinal databases to examine relations with detailed workforce structure
– Pay, qualifications, profile etc.
• Build economic elements, e.g. specific turnover costs, into the longitudinal models
Workers’ level longitudinal analysis
• A much larger database
– Same period of time: over 11M records
• Providers not required to complete information for ‘all’ workers
– Structural/design missing data
– True missing data
• Linkage issues: more data fields required for identification and linkage
• Considerably large number of variables and fields
– Careful planning; analysis-tailored data retrieval
• Changes in database
– Amendments, new variables etc.
– Programming-intensive and demanding models (may not be replicable for different databases)
Issues to consider
• Suitability of models
– Longitudinal structure
– Competing risks
• Measurement window
– Late entry into risk sets
• Use proxies, other variables in the dataset
• Adopt suitable approach/model
– Censoring (LHS and RHS)
• Assumptions
– Guided by:
  • Sector-specific knowledge
  • Intelligence from other variables in the data
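One way late entry and right-hand-side censoring can be handled together is a product-limit (Kaplan-Meier) estimate in which a worker joins the risk set only once observed. The sketch below illustrates that general technique; it is not the model used in this research. It does not cover left-hand-side censoring or competing risks, and the record layout and toy durations are assumptions.

```python
def kaplan_meier(records):
    """Product-limit survival estimate allowing delayed (late) entry.

    Each record is (entry, exit, exited):
      entry  - time at which the worker first comes under observation
      exit   - time of leaving, or of last observation
      exited - True for an observed exit, False if right-censored
    A worker is in the risk set at time t when entry < t <= exit.
    Returns a list of (time, at_risk, events, survival) tuples.
    """
    for entry, exit_, _ in records:
        if entry >= exit_:
            raise ValueError("each record needs entry < exit")
    event_times = sorted({exit_ for _, exit_, exited in records if exited})
    survival, curve = 1.0, []
    for t in event_times:
        at_risk = sum(1 for entry, exit_, _ in records if entry < t <= exit_)
        events = sum(1 for _, exit_, exited in records
                     if exited and exit_ == t)
        survival *= 1 - events / at_risk
        curve.append((t, at_risk, events, survival))
    return curve


# Hypothetical job durations in months since starting the job.
records = [
    (0, 6, True),    # observed from the start, left at 6 months
    (0, 12, False),  # still in post at 12 months (right-censored)
    (4, 10, True),   # first observed at 4 months (late entry), left at 10
    (8, 14, True),   # first observed at 8 months (late entry), left at 14
    (0, 10, False),  # last observed at 10 months (right-censored)
]
curve = kaplan_meier(records)
```

The worker first observed at 8 months does not count in the risk set for the exit at 6 months, which is what prevents late entrants from inflating apparent survival at early durations.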
Current longitudinal research
Watch this space!!
• Workforce mobility within the sector
• Occupation durations
• Characteristic-specific probabilities of exiting or remaining in the sector
• Characteristic-specific probabilities of moving employer within the sector or having multiple jobs
• Career pathways within the sector
Acknowledgments
• Thanks to the Department of Health for funding this work
• Thanks to Skills for Care for providing the data on a regular basis
• Thanks to Analytical Research Ltd for their technical and quantitative support
Further information
• 02078481669
• See:
• http://www.kcl.ac.uk/sspp/departments/sshm/scwru/res/knowledge/nmdslong.aspx