
Impact of Stability on Setting and Meeting Specifications in an Uncertain World

William R. Porter

MBSW 2012

Impact of Stability on Specs MBSW May 2012

Setting and Meeting Specifications

Statistical approaches to setting specifications and devising methods to meet them have been developed in many different contexts, including widget manufacturing, biologics manufacturing, classical stability trial design and measurement uncertainty.

Copyright 2012 W. R. Porter


Widget Manufacturing

Classical industrial quality control was developed for the durable goods industries, especially for the manufacture of machined items.
Parts had to mate together without customization.
The primary source of product variability was the manufacturing process itself.
Measurement tools were highly precise and accurate…
• …as shown by Gage R&R (repeatability & reproducibility) studies.



Biologics Manufacturing

The three most important things required for developing a biologics product have been: analytical methods, analytical methods and analytical methods.
• (i.e., location, location and location, just as in real estate investment; but scale [dispersion] is even more important.)
• The major source of variability in product performance was traceable to the methods used to monitor quality, particularly bioassays.



Classical Stability Trial Design

Stability trial design was formalized in ICH guidelines Q1A, Q1B, Q1C, Q1D and especially Q1E.
ICH Q1E Appendix B suggests designs and methods for data analysis and interpretation.
• Focus is on studies with only a few batches using ANCOVA.
• Methods to evaluate batch variation are unconvincing and established by fiat.



Measurement Uncertainty

All measurements are uncertain; there are none that are not uncertain.
All measurements are wrong, but some are useful.

• (with apologies to G. E. P. Box)

Only a quantitative estimate of uncertainty distinguishes useful data from worthless garbage.

• In Bayesian terms, numbers without informative prior distributions are worthless. Useful measurements have informative priors.

Beginning in the 1990s, mainly in Europe, efforts to formalize evaluation of measurement uncertainty were undertaken.
• GUM: Guide to the Expression of Uncertainty in Measurement (1995, 2008).
• EURACHEM/CITAC Guide: Use of Uncertainty Information in Compliance Assessment (2007).



Measurement Uncertainty (2)

The pharmaceutical industry has been slow to incorporate accepted international practices for quality control of chemical measurements.
Producers must set tighter specifications than required by customers to allow for measurement uncertainty.
Formal uncertainty studies as part of method validation need improvement.
This should not be a problem, as methods for assessing measurement uncertainty are now well established; we just have to get on with it.



Specification Uncertainty

Many specifications are digital; that is, expressed as decimal numbers with a specified number of significant digits.
Digital specifications also imply the expected maximum uncertainty in measurements used to confirm compliance with specifications.
Frequently, the number of significant digits in written specifications is TOO SMALL to meet actual quality expectations.
• ICH impurity guidelines are at least one order of magnitude too imprecise.
• Round-off errors result.
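A minimal sketch of how round-off interacts with a limit written to one significant digit. The limit of 0.1, the helper name `complies` and the round-half-up convention are assumptions chosen for illustration; they are not taken from any specific guideline text.

```python
from decimal import Decimal, ROUND_HALF_UP

def complies(result: str, limit: str) -> bool:
    """Round `result` to the decimals written in `limit`, then test result <= limit."""
    lim = Decimal(limit)
    # quantize() rounds to the same number of decimal places as the written limit
    rounded = Decimal(result).quantize(lim, rounding=ROUND_HALF_UP)
    return rounded <= lim

# Hypothetical measured values compared with a "not more than 0.1" limit
for measured in ("0.10", "0.14", "0.149", "0.15"):
    print(measured, complies(measured, "0.1"))
```

Under these assumptions, any result below 0.15 rounds to 0.1 and passes, so the effective limit sits almost 50% above the nominal one.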



Customer Expectations

We know what patients expect. What P(failure) is acceptable?

Regulators expect that any sample, selected for the convenience of the regulators from normal supply channels, will meet specifications for purity and potency at all times within the stated shelf life of the drug product, for all batches.
The testing can be performed by any qualified laboratory using any qualified equipment and reagents by any trained personnel following validated SOPs.



Sampling

What distinguishes the approaches used for widgets, biologics and classical stability trials is how they:
Address the extent to which measurement uncertainty contributes to overall variation, and
Address within-batch and between-batch sampling as sources of variation.

• Sampling has been described as “The Mother Lode” of all errors.


McConnell J, Nunnally BK, McGarvey B. Sampling—The ‘Mother Lode of All Errors. J. Validation Technol. 18(1); 45-49 (2012).


Convenience Sampling

Since any sample from any batch (not just ‘random’ samples) must meet specifications during the entire shelf life, we need to know:
What is the variability of samples selected within batches?
What is the variability of samples selected between batches?
How confident are we of our estimates for these components of variance?
How does measurement uncertainty affect our estimates?


Batch Variation in Widgets

Traditional approaches to industrial quality control, as promoted by Walter A. Shewhart, W. Edwards Deming, Joseph M. Juran and their followers, rely on control charts.
Typically, a minimum of ~30 batches is thought to be needed to demonstrate that a process is in statistical control, but Donald J. Wheeler suggests that as few as 10 batches may be adequate.
• Measurement uncertainty is small compared to batch variation.
• QbD initiatives are based largely on methods devised for controlling the quality of widgets.
Formal random sampling plans are well-defined.
Experimental (non-continuous) data can be handled using the ANOM (Analysis of Means) graphical method.


Example Widget Batch Data

Wheeler DJ. Advanced Topics in Statistical Process Control. Knoxville, TN: SPC Press, p. 374 (1995).

The example in the reference is analyzed graphically using Analysis of Means. Batches 2 & 4 were detectably higher than the Grand Average. Batch 3 was detectably lower than the Grand Average. Note that ANOM control limits are tighter than Average & Range chart limits for continuous processes.


1-WAY ANOVA
Source of Variation    SS      df    MS       F      P-value   F crit
Between Batches        19830    4    4957.5   7.89   0.0012    3.06
Within Batches          9425   15     628.3
Total                  29255   19

Batch 1   Batch 2   Batch 3   Batch 4   Batch 5
250       310       250       340       250
260       330       230       270       240
230       280       220       300       270
270       360       260       320       290
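The ANOVA table above can be reproduced from the batch data with a few lines of arithmetic. The sketch below uses only the Python standard library. The function name `one_way_anova` and the method-of-moments split into variance components are illustrative additions, not part of the cited reference.

```python
from statistics import mean

# Coating weights (grams) for the five batches listed above
batches = [
    [250, 260, 230, 270],
    [310, 330, 280, 360],
    [250, 230, 220, 260],
    [340, 270, 300, 320],
    [250, 240, 270, 290],
]

def one_way_anova(groups):
    """Return (SS_between, SS_within, df_between, df_within, F) for a one-way layout."""
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = sum(len(g) for g in groups) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova(batches)
ms_b, ms_w = ss_b / df_b, ss_w / df_w

# Method-of-moments variance components (balanced design, 4 samples per batch)
n_per_batch = len(batches[0])
var_within = ms_w
var_between = max((ms_b - ms_w) / n_per_batch, 0.0)

print(f"SS between = {ss_b:.0f}, SS within = {ss_w:.0f}, F = {f_ratio:.2f}")
print(f"variance between = {var_between:.1f}, variance within = {var_within:.1f}")
```

The sums of squares and the F ratio agree with the table. The variance-component split is one common way to separate between-batch from within-batch variation, and it assumes a balanced design.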


Now That’s Odd…

The data represent grams of coating per sample aliquot.
Note that each value ends in ‘0’.
The laboratory weighed the recovered coating using a balance readable only to the nearest 0.01 kilogram.
The smallest range within batches is 0.04 kg.
The data granularity is too coarse; within-batch variation is on the same order of magnitude as measurement uncertainty.
The measurement tool is not capable of providing sufficient accuracy and precision.




Batch Variation in Biologics

Assay variation is the dominant factor in controlling the quality of biologics, and batch variation takes a back seat.
Biologics are typically compared, batch by batch, to a certified reference standard.
From personal experience, variation between testing laboratories participating in round-robin certification of new global standards grossly exceeds in-house assay variability.
• Batch variation is swamped by measurement uncertainty.
• ICH recognized the need for a separate guideline (Q5C).
Sampling plans are poorly defined.
• Biologics traditionally were homogeneous solutions, so sampling was not considered to be an issue.



Example Bioassay Data


Wolfenson C, Groisman J, Couto AS, Hedenfalk M, Cortvrindt RG, Smitz JE, Jespersen S. Batch-to-batch consistency of human-derived gonadotrophin preparations compared with recombinant preparations. Reprod Biomed Online. 2005 Apr;10(4):442-54.

The relative standard deviation is expressed as a percentage and is obtained by multiplying the standard deviation (per batch) by 100 and dividing this value by the average (per batch).

All values are IU/vial (relative SD in %, n = 5).

Product            Drug product batch no.   FSH immunoactivity   LH immunoactivity   HCG immunoactivity
Pergonal           0331206B                 58.77 (2.2)          13.49 (3.6)         3.39 (1.7)
Humegon            43905119                 65.12 (1.7)          5.77 (1.0)          6.86 (1.8)
2nd WHO standard   (FSH 54/LH 46)           77.72 (5.0)          7.39 (2.4)          7.22 (4.8)
4th WHO standard   (FSH 72/LH 70)           86.14 (5.3)          3.82 (1.8)          10.10 (5.1)
Menopur            32509                    74.17 (1.9)          0.29 (5.2)          9.61 (2.3)
Menopur            32307                    73.44 (3.9)          0.48 (1.7)          9.05 (3.3)
Menopur            34104                    82.62 (1.3)          0.39 (3.1)          11.06 (1.8)

Table 4. FSH, LH and HCG immunoactivity in different HMG preparations.
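The relative standard deviation defined above can be inverted to recover the standard deviation in IU/vial, which makes it easier to judge how many reported digits are meaningful. This is a minimal sketch. The helper name `sd_from_rsd` is hypothetical, and the three FSH entries are copied from Table 4.

```python
def sd_from_rsd(mean_value: float, rsd_percent: float) -> float:
    """Invert RSD% = 100 * SD / mean to recover the standard deviation."""
    return rsd_percent * mean_value / 100.0

# (mean IU/vial, relative SD in %) for FSH immunoactivity, from Table 4
fsh = {
    "Pergonal 0331206B": (58.77, 2.2),
    "Menopur 32509": (74.17, 1.9),
    "Menopur 34104": (82.62, 1.3),
}

for name, (avg, rsd) in fsh.items():
    print(f"{name}: SD = {sd_from_rsd(avg, rsd):.2f} IU/vial")
```

Each recovered standard deviation is on the order of 1 IU/vial, so the digits after the decimal point carry little information, even before inter-day or interlaboratory variation is considered.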


Now That’s Odd…

Two generations of the WHO standard were tested.
The 2nd generation standard is certified to contain FSH 54 IU/ampoule and LH 46 IU/ampoule.
The 4th generation standard is certified to contain FSH 71.9 (69.0-74.9 [95% fiducial limits]) IU/ampoule and LH 70.2 (61.7-80.0 [95% fiducial limits]) IU/ampoule.
The values reported are in comparison with the standards provided by the test kit vendor.
The reported values for LH differ grossly from the certified values; the FSH values are systematically high.
The reported values are overly precise; the last two digits in the four-digit reported values are meaningless.
The ‘uncertainty’ reported is the within-day, within-analyst repeatability, and does not include inter-day, inter-analyst or, most importantly, interlaboratory uncertainty.
There is insufficient evidence that any of the batches are different.



Batch Variation in Stability Trials

Conventional experimental design (ICH Q1E) relies on a small number of batches.
Minimally 3 batches, and who even does more than the minimum?
• Not enough batches are studied to demonstrate that the process is in a state of statistical control, using the widget-making approach to quality control.
• Recall that at least 10, and preferably 30, batches are required for control charts.
Measurement uncertainty is a substantial component of variance, even for small molecule drugs.
Sampling issues are not addressed in the guidance.



Example Stability Batch Data

Subbarao N, Huynh-Ba K. Evaluation of Stability Data. In: Huynh-Ba K (Ed.) Handbook of Stability Testing in Pharmaceutical Development. New York: Springer, pp. 266-267 (2009).


[Figure annotations on the stability data: “Outlier?” and “Outlying batch?”]


Now That’s Odd…

Batch 4 is much less stable than the other batches tested.
But there is no reason to doubt that Batch 4 is just as representative of the process as the other batches.
Could the initial potency for Batch 4 have been mis-measured?
The 9-month result for Batch 1 seems unusually low.
But, after the fact, there is no way to determine if this low result is a laboratory error (e.g., due to inadequate sample preparation). We must include it.


[Figure: Batch Variation. % Potency (y-axis, 91 to 101) versus Time, months (x-axis, 0 to 24), for Batches 1 through 6.]


Batch Variation & Measurements

In widget-making, the accuracy and precision of measurements, as demonstrated by Gage R&R studies, is typically a minor component of total variance. Batch variation is easy to detect and measure.

In biologics manufacture, the accuracy and precision of measurements, as demonstrated by interlaboratory testing, is typically the dominant component of total variance. Batch variation is difficult to detect and measure.

Conventional small molecule stability trials occupy a middle ground. Batch variation is handled by crude conventions.



Components of Variance

The relative importance of different sources of variation in the measured quality of drug products must be rigorously assessed in order to define realistic release limits.
Obtaining sufficient data is a challenge.
R&D (non-GMP) data and GMP data may need to be combined to increase the reliability of our assessment.



Measurement Variation

Measurement uncertainty can be affected by:
Within operators, within equipment, within labs, within days (repeatability).
Between operators, between equipment and between days (together: intermediate precision).
Between labs (interlaboratory precision).
Between dosage forms/strengths (sample preparation, excipient interactions).
• Interactions with other components of measurement uncertainty.
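When the components listed above can be treated as independent, their standard deviations are commonly combined by root-sum-of-squares, as in the GUM approach to a combined standard uncertainty. The sketch below uses hypothetical component values (in % of label claim) and ignores the interaction terms, so it is an illustration rather than a validated uncertainty budget.

```python
import math

# Hypothetical standard deviations (% label claim) for each precision component
components = {
    "repeatability": 0.40,
    "between_operators": 0.25,
    "between_equipment": 0.20,
    "between_days": 0.30,
    "between_labs": 0.60,
}

def combined_standard_uncertainty(sds):
    """Root-sum-of-squares combination; valid only for independent components."""
    return math.sqrt(sum(sd ** 2 for sd in sds))

u_c = combined_standard_uncertainty(components.values())
expanded = 2.0 * u_c  # expanded uncertainty with an assumed coverage factor k = 2

# Share of total variance contributed by each component
shares = {name: sd ** 2 / u_c ** 2 for name, sd in components.items()}

print(f"u_c = {u_c:.3f}, U(k=2) = {expanded:.3f}")
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {share:.0%} of variance")
```

Ranking the variance shares shows which component dominates the budget, and so where added replication would be most useful.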


Initial Product Variation

Initial uncertainty can be affected by:
Manufacturing sites.
Manufacturing scale (equipment variation).
Dosage strength.
Packaging.
Between batches.
Homogeneity within batches.
Interactions between all of the above.



Stability (Time-Dependent) Variation

Degradation rate uncertainty can be affected by:
Overall “average” rate of degradation.
Interaction with:
• Manufacturing sites.
• Manufacturing scale (equipment variation).
• Dosage strength.
• Packaging.
• Environmental excursions (temperature, humidity).
• Between batches.
• Within batches.
• Interactions between all of the above.



Estimating Uncertainties

Not all of the factors enumerated in the previous slides will have equal weight.
We need to distinguish between “the vital few and the useful many” (Juran).
• Pareto’s principle: 80% of the variation in product quality is caused by 20% of the quality-impacting factors.
Designed experiments and observational studies can provide insight.



But Wait! There’s More!

What About Supply Chain Control?

Control of storage conditions during manufacture and distribution (e.g., maintaining cold conditions for temperature-sensitive products) has been a major recent concern.
Proper design of stress degradation experiments during product development and monitoring of conditions during distribution can address these issues.



Unaddressed Supply Chain Issues

What about mail-order pharmacies?
Regulatory expectations are that the supply chain ends when the patient takes the drug, and not before then.
Medications are dispensed with storage instructions.
The U.S. Post Office specifically states that control of temperature is NOT provided, and that the shipper is responsible for protection against temperature extremes.
What about military deployments, where troops are issued 180-day supplies of medications?



Using Stress Experiments

In order to study uncertainty due to stability issues, we need ways to SHRINK TIME, because the issue is what will happen at the end of shelf life.
Properly designed stress degradation studies can map the temperature × humidity design space.
• In many cases, results for two years under ‘normal’ storage can be achieved in weeks.
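One way to picture "shrinking time" is an acceleration factor from a humidity-corrected Arrhenius model, ln k = ln A - Ea/(RT) + B·RH. The slides do not name a model, so this choice is an assumption. The activation energy, humidity coefficient and stress conditions below are hypothetical, and real values would have to come from the stress study itself.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def acceleration_factor(t_stress_c, rh_stress, t_store_c, rh_store,
                        ea_j_mol, b_per_rh):
    """Ratio of degradation rates (stress / storage) under the assumed model
    ln k = ln A - Ea/(R*T) + B*RH."""
    t_stress = t_stress_c + 273.15  # convert Celsius to kelvin
    t_store = t_store_c + 273.15
    ln_ratio = (ea_j_mol / R) * (1.0 / t_store - 1.0 / t_stress) \
        + b_per_rh * (rh_stress - rh_store)
    return math.exp(ln_ratio)

# Hypothetical parameters: Ea = 100 kJ/mol, B = 0.03 per %RH
af = acceleration_factor(50.0, 75.0, 25.0, 60.0,
                         ea_j_mol=100_000.0, b_per_rh=0.03)
shelf_life_days = 730.0  # two years of 'normal' storage

print(f"acceleration factor: {af:.1f}")
print(f"equivalent stress time: {shelf_life_days / af:.1f} days")
```

The uncertainty of the fitted Ea and B carries over into the predicted shelf-life behavior, which is why the stress study has to be properly designed.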



Error Budget

Given a set of final specifications, these must be narrowed by amounts sufficient to account for:
Measurement uncertainty (guard banding).
Stability-related changes in quality metrics, especially within- and between-batch uncertainty at end of shelf life.
Whatever remains are the narrow limits that define the release specifications.
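The error budget can be sketched as simple arithmetic on a potency specification. Everything numeric below is hypothetical: the 90.0 to 110.0% shelf-life limits, the expected loss, the standard uncertainties and the coverage factor k = 2. The function name `release_limits` is likewise an illustrative assumption. The sketch also assumes that potency only decreases on storage, so only the lower limit carries the stability terms.

```python
import math

def release_limits(spec_low, spec_high, loss_at_expiry,
                   u_measurement, u_stability, k=2.0):
    """Narrow shelf-life specifications into release limits.

    loss_at_expiry: expected decrease of the attribute over the shelf life
    u_measurement, u_stability: standard uncertainties, combined in quadrature
    k: coverage factor (k = 2 is an assumed, roughly 95%, choice)
    """
    guard = k * math.sqrt(u_measurement ** 2 + u_stability ** 2)
    low = spec_low + loss_at_expiry + guard  # allow for loss plus uncertainty
    high = spec_high - k * u_measurement     # measurement guard band only
    if low >= high:
        raise ValueError("error budget exceeds the specification width")
    return low, high

low, high = release_limits(90.0, 110.0, loss_at_expiry=2.4,
                           u_measurement=0.5, u_stability=0.6)
print(f"release limits: {low:.2f} to {high:.2f} % label claim")
```

If the budget consumes the whole specification width, the function raises an error. That is the arithmetic form of a shelf life that is too long, or a method that is too imprecise, for the stated specification.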



Fixed vs. Random Effects

We tend to do at least an adequate job of designing experiments to study the effects of ‘fixed’ factors on product quality targets.
That is, we can estimate bias [location].
We tend to do a poor job of designing experiments to study the effects of ‘random’ factors on variation of product quality.
That is, we do not do enough to estimate uncertainty or variation [scale] of measurements, sampling or degradation.



Degrees of Freedom


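Governs design for fixed effects (bias), μ:
For n = 3, the confidence limit (C.L.) is about 2.2 times as wide (roughly 120% larger) than for n = ∞.
For n = 4, the C.L. is 62% larger than for n = ∞.

For n = 7, the C.L. is about 2.2 times as wide (roughly 120% larger) than for n = ∞.
For n = 14, the C.L. is 62% larger than for n = ∞.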
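The sample sizes quoted above can be checked against standard critical values. The sketch assumes two-sided 95% limits. For the n = 3 and n = 4 case it takes the ratio of the Student t quantile to the normal quantile. For the n = 7 and n = 14 case it assumes the figures refer to the upper confidence limit for a standard deviation, sqrt(df / χ²), which is an interpretation and is not stated on the slide. The critical values are hard-coded from standard tables to keep the example free of third-party libraries.

```python
import math
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)  # two-sided 95% normal quantile

# Tabulated critical values, keyed by degrees of freedom (df = n - 1)
T_975 = {2: 4.3027, 3: 3.1824}      # Student t, upper 2.5% point
CHI2_025 = {6: 1.2373, 13: 5.0088}  # chi-square, lower 2.5% point

def mean_limit_ratio(n):
    """Width of a t-based limit for a mean relative to the n = infinity limit."""
    return T_975[n - 1] / z

def sd_upper_limit_ratio(n):
    """Upper limit for sigma divided by the sample SD: sqrt(df / chi-square)."""
    df = n - 1
    return math.sqrt(df / CHI2_025[df])

for n in (3, 4):
    print(f"mean,  n = {n}: {mean_limit_ratio(n):.2f} x")
for n in (7, 14):
    print(f"sigma, n = {n}: {sd_upper_limit_ratio(n):.2f} x")
```

Under these assumptions, roughly twice as many observations are needed to pin down a scale parameter as a location parameter for the same relative inflation of the limit.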


Insurance

When estimates of a component of variance are based on limited data, then:
Consider meta-analysis combining data from non-GMP development studies with GMP data to increase degrees of freedom.
Inflate the estimated variance component by a factor that accounts for the uncertainty in estimating it, to compensate for low degrees of freedom.
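Combining variance estimates from separate studies can be sketched as a degrees-of-freedom-weighted pool. The two estimates below are hypothetical. The approach assumes both studies estimate the same underlying variance component, which should be checked before pooling.

```python
def pooled_variance(estimates):
    """Pool independent estimates of the same variance component.

    estimates: iterable of (variance, degrees_of_freedom) pairs.
    Returns (pooled_variance, total_degrees_of_freedom).
    """
    estimates = list(estimates)
    total_df = sum(df for _, df in estimates)
    pooled = sum(var * df for var, df in estimates) / total_df
    return pooled, total_df

# Hypothetical between-batch variance estimates: (variance, df)
development = (0.50, 4)  # non-GMP development batches
gmp = (0.80, 2)          # GMP batches

var_pooled, df_pooled = pooled_variance([development, gmp])
print(f"pooled variance = {var_pooled:.2f} on {df_pooled} df")
```

The pooled estimate carries the combined degrees of freedom, so any inflation factor applied to it for low degrees of freedom is smaller than for either study alone.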



OOS and OOT

Failure to adequately estimate measurement uncertainty and stability variation will result in many out-of-specification (OOS) and out-of-trend (OOT) results.
Either measurement uncertainty is underestimated, or
Presumed shelf life is too long.
Thus release specifications are too wide.

If a process is under control, OOS and OOT results will be rare; contrapositively, if OOS and OOT results are not rare, the process is not under control.



Root Causes for Failure

Have all sources of variation been studied enough to obtain a reliable estimate of their magnitude? HAVE YOU IDENTIFIED THE VITAL FEW?

Are factors in experimental designs assumed to be fixed really random instead? ARE YOU ESTIMATING LOCATION OR SCALE?

Are release specifications based on sufficient degrees of freedom for EACH factor? DO YOU REALLY HAVE ENOUGH DATA?



But Really, Who Cares?

In a QbD world, we should aim to produce consistent product with minimum variance.
“While conformance to specifications is important, the fundamental concept that some processes are predictable, while others are not, makes the issue of conformity to specifications an issue which cannot be addressed directly. If a process is predictable, then its conformity or nonconformity will also be predictable. If a process is unpredictable, then its conformity will be unpredictable, and anything we say about the process will amount to little more than wishes and hopes.”

• Wheeler DJ. Advanced Topics in Statistical Process Control: The Power of Shewhart’s Charts, Knoxville, TN: SPC Press, p. 187 (1995).

If we can reduce process variation, reduce product degradation variation and reduce measurement uncertainty to low enough levels through QbD, then setting release limits becomes moot.



Discussion


