PPQ Power Assessment Theoretical Results

Yalin Zhu (yalin.zhu@merck.com)

2020-10-07

Preliminaries

Suppose \(n\) outcomes of the Critical Quality Attribute (CQA) are independent and normally distributed, denoted by \(X_i \stackrel{\text{i.i.d.}}{\sim} \mathcal{N}(\mu, \sigma^2)\) for \(i=1,\dots, n\). Then the sampling distributions of the sample mean \(\bar{X}\) and the sample variance \(S^2\) are well known: \[\begin{equation} \bar{X} \sim \mathcal{N}(\mu, \frac{\sigma^2}{n}) \end{equation}\] and \[\begin{equation} \label{diststd} \dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2 (n-1). \end{equation}\] Moreover, the sample mean and the sample standard deviation are independent under the normality assumption.

Denote the lower and upper specification limits by \(L\) and \(U\), respectively. The prediction or tolerance interval can be expressed as \[\begin{equation} \left[ Y_1, Y_2 \right] = \left[ \bar{X} - kS, \ \bar{X} + kS \right] , \end{equation}\] where \(k > 0\) is the multiplier specific to the chosen interval. For example, for a two-sided \(100(1-\alpha)\%\) prediction interval for a single future observation, \(k=t_{1-\alpha/2,n-1}\sqrt{1+\frac{1}{n}}\).
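
As a minimal sketch of the prediction-interval case, the multiplier \(k\) can be computed with qt() and applied to a sample. The function name `pi_multiplier` and the simulated data (mean 100, standard deviation 2) are illustrative assumptions, not part of the original derivation.

```r
# Prediction-interval multiplier: k = t_{1 - alpha/2, n - 1} * sqrt(1 + 1/n)
# (pi_multiplier is a hypothetical helper name)
pi_multiplier <- function(n, alpha = 0.05) {
  qt(1 - alpha / 2, df = n - 1) * sqrt(1 + 1 / n)
}

set.seed(1)
x <- rnorm(10, mean = 100, sd = 2)  # hypothetical CQA results, n = 10
k <- pi_multiplier(length(x))       # multiplier for a 95% prediction interval

# Interval [Y1, Y2] = [xbar - k * s, xbar + k * s]
interval <- c(lower = mean(x) - k * sd(x), upper = mean(x) + k * sd(x))
print(k)
print(interval)
```

For a tolerance interval only the value of \(k\) changes; the form of the interval stays the same.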

Specification test for one release batch

The release result \(X_{rl}\) follows the same distribution as any single observation, \(X_{rl} \sim \mathcal{N}(\mu, \sigma^2)\), so the probability of passing the specification at release is

\[\begin{equation} \begin{split} \Pr(\text{Passing Specification for Release}) & = \Pr(L \le X_{rl} \le U) \\ & = \Phi \left(\dfrac{U-\mu}{\sigma}\right) - \Phi\left(\dfrac{L-\mu}{\sigma}\right), \end{split} \end{equation}\]

where \(\Phi\) is the standard normal cumulative distribution function. This probability can be computed directly in software, for example with pnorm() in R.
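
A short sketch of this calculation follows. pnorm() accepts the mean and standard deviation directly, which is equivalent to standardizing the limits by \(\mu\) and \(\sigma\) first. The function name `release_pass_prob` and the numerical values are hypothetical.

```r
# Probability that one release result falls within [L, U]
# when X_rl ~ N(mu, sigma^2). release_pass_prob is a hypothetical name.
release_pass_prob <- function(L, U, mu, sigma) {
  pnorm(U, mean = mu, sd = sigma) - pnorm(L, mean = mu, sd = sigma)
}

# Hypothetical specification limits and process parameters
p_release <- release_pass_prob(L = 95, U = 105, mu = 100, sigma = 2)
print(p_release)  # about 0.9876, since both limits sit 2.5 sigma from mu
```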

Test for PPQ Batches

A PPQ batch passes when the whole interval \([Y_1, Y_2]\) lies within the specification limits, so

\[\begin{equation} \label{probpass} \begin{split} \Pr(\text{Passing a Single PPQ Batch}) & = \Pr(L \le Y_1 \le Y_2 \le U) \\ & = \int_{L}^{U} \int_{L}^{y_2}f_{Y_1,Y_2}(y_1, y_2) dy_1 dy_2 \end{split} \end{equation}\]

The key step is to obtain the joint distribution of the lower and upper limits of the prediction/tolerance interval, that is, to find the joint probability density function (PDF) \(f_{Y_1,Y_2}(y_1,y_2)\).

Since \(Y_1=\bar{X} - k S\) and \(Y_2=\bar{X} + k S\), we can obtain \(f_{Y_1,Y_2}(y_1,y_2)\) from the bivariate PDF \(f_{\bar{X},S}(x,s)\) through a change of variables.

Solving for \(\bar{X}\) and \(S\) gives \(x=\dfrac{y_1+y_2}{2}\) and \(s=\dfrac{y_2-y_1}{2k}\), so the Jacobian of the transformation is \[\begin{equation} |J|= \left| \begin{array}{cc} \dfrac{\partial x}{\partial y_1} & \dfrac{\partial x}{\partial y_2}\\ \\ \dfrac{\partial s}{\partial y_1} & \dfrac{\partial s}{\partial y_2} \end{array} \right| = \left| \begin{array}{cc} \dfrac{1}{2} & \dfrac{1}{2}\\ \\ -\dfrac{1}{2k} & \dfrac{1}{2k} \end{array} \right| = \dfrac{1}{2k}. \end{equation}\]

Thus, the single-batch passing probability \(\eqref{probpass}\) can be calculated as

\[\begin{equation} \label{extend} \begin{split} \int_{L}^{U} \int_{L}^{y_2}f_{Y_1,Y_2}(y_1, y_2) dy_1 dy_2 & = \int_{L}^{U} \int_{L}^{y_2}f_{\bar{X},S}\left(\dfrac{y_1+y_2}{2}, \dfrac{y_2-y_1}{2k}\right) |J| dy_1 dy_2 \\ & = \dfrac{1}{2k} \int_{L}^{U} \int_{L}^{y_2}f_{\bar{X}}\left(\dfrac{y_1+y_2}{2}\right) f_{S}\left( \dfrac{y_2-y_1}{2k}\right) dy_1 dy_2. \end{split} \end{equation}\] The second equality holds because the sample mean and the sample standard deviation of a normal sample are independent, so their joint PDF factors into the product of the marginal PDFs.

Similarly, we can obtain the PDF of the sample standard deviation, \(f_S(s)\), by a change of variables. By the distribution of the sample variance \(\eqref{diststd}\), let \(v= \dfrac{(n-1)s^2}{\sigma^2}\), whose PDF \(f_V\) is that of the \(\chi^2(n-1)\) distribution. Then, for \(s > 0\), the Jacobian of the transformation is \[\begin{equation} |J| = \left|\dfrac{dv}{ds}\right| = \dfrac{2(n-1)s}{\sigma^2}. \end{equation}\] Thus, \[\begin{equation} \label{Spdf} \begin{split} f_S(s) & = f_V\left(\dfrac{(n-1)s^2}{\sigma^2}\right) |J| \\ & = \dfrac{2(n-1)s}{\sigma^2}f_V\left(\dfrac{(n-1)s^2}{\sigma^2}\right) \end{split} \end{equation}\]
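
The density of \(S\) can be checked numerically. The sketch below codes \(f_S(s)\) with dchisq() and confirms that it integrates to one and that its mean matches the known closed form \(E[S] = \sigma \sqrt{2/(n-1)}\, \Gamma(n/2) / \Gamma((n-1)/2)\). The function name `sd_density` and the values \(n = 10\), \(\sigma = 2\) are assumptions chosen for illustration.

```r
# PDF of the sample standard deviation S for a normal sample of size n:
# f_S(s) = 2 (n - 1) s / sigma^2 * f_V((n - 1) s^2 / sigma^2), V ~ chi^2(n - 1)
# (sd_density is a hypothetical helper name)
sd_density <- function(s, n, sigma) {
  v <- (n - 1) * s^2 / sigma^2
  2 * (n - 1) * s / sigma^2 * dchisq(v, df = n - 1)
}

n <- 10      # hypothetical sample size
sigma <- 2   # hypothetical process standard deviation

# The density should integrate to 1 over (0, Inf)
total <- integrate(function(s) sd_density(s, n, sigma), 0, Inf)$value

# Its mean should match the closed-form expectation of S
mean_s <- integrate(function(s) s * sd_density(s, n, sigma), 0, Inf)$value
mean_s_exact <- sigma * sqrt(2 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)

print(c(total = total, mean_s = mean_s, mean_s_exact = mean_s_exact))
```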

Substituting the PDF of the sample standard deviation \(\eqref{Spdf}\) into the double integral \(\eqref{extend}\) gives the final result: \[\begin{equation} \begin{split} & \Pr(\text{Passing a Single PPQ Batch}) \\ = & \dfrac{1}{2k} \int_{L}^{U} \int_{L}^{y_2}f_{\bar{X}}\left(\dfrac{y_1+y_2}{2}\right) \dfrac{2(n-1)\dfrac{y_2-y_1}{2k}}{\sigma^2}f_V\left\{\dfrac{(n-1)\left[\dfrac{y_2-y_1}{2k}\right]^2}{\sigma^2}\right\} dy_1 dy_2 \\ = & \dfrac{n-1}{2k^2 \sigma^2}\int_{L}^{U} \int_{L}^{y_2}f_{\bar{X}}\left(\dfrac{y_1+y_2}{2}\right) f_V\left\{\dfrac{(n-1)(y_2-y_1)^2}{4k^2\sigma^2}\right\} (y_2-y_1) dy_1 dy_2, \end{split} \end{equation}\] where \(\bar{X} \sim \mathcal{N}(\mu, \frac{\sigma^2}{n})\) and \(V \sim \chi^2 (n-1)\). This quantity can be evaluated numerically in software, for example with dnorm() and dchisq() for the two densities and integrate() for the integration in R. Because integrate() handles one dimension at a time, the double integral requires one call nested inside another.
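
One way to evaluate the final double integral is to nest two integrate() calls, as sketched below. The function name `ppq_pass_prob`, the parameter values, and the Monte Carlo cross-check are all illustrative assumptions added here; the simulation draws \(\bar{X}\) and \(S\) from their sampling distributions and is only a sanity check on the numerical integration.

```r
# Probability that [xbar - k s, xbar + k s] lies within [L, U],
# evaluated from the final double integral (ppq_pass_prob is a hypothetical name)
ppq_pass_prob <- function(L, U, mu, sigma, n, k) {
  # Integrand without the constant factor; vectorized in y1 for a fixed y2
  integrand <- function(y1, y2) {
    d <- y2 - y1
    dnorm((y1 + y2) / 2, mean = mu, sd = sigma / sqrt(n)) *
      dchisq((n - 1) * d^2 / (4 * k^2 * sigma^2), df = n - 1) * d
  }
  # Inner integral over y1 from L to y2; sapply makes it vectorized in y2
  inner <- function(y2) {
    sapply(y2, function(y) {
      integrate(integrand, lower = L, upper = y, y2 = y)$value
    })
  }
  # Outer integral over y2 from L to U, times the constant factor
  (n - 1) / (2 * k^2 * sigma^2) * integrate(inner, lower = L, upper = U)$value
}

# Hypothetical inputs: 95% prediction interval, n = 10
n <- 10; mu <- 100; sigma <- 2; L <- 94; U <- 106
k <- qt(0.975, df = n - 1) * sqrt(1 + 1 / n)
p_exact <- ppq_pass_prob(L, U, mu, sigma, n, k)

# Monte Carlo cross-check (an added sanity check, not part of the derivation)
set.seed(123)
N <- 2e5
xbar <- rnorm(N, mean = mu, sd = sigma / sqrt(n))
s <- sigma * sqrt(rchisq(N, df = n - 1) / (n - 1))
p_mc <- mean(xbar - k * s >= L & xbar + k * s <= U)

print(c(integral = p_exact, monte_carlo = p_mc))
```

The two estimates should agree up to simulation error. If the integrand is sharply peaked relative to the width of \([L, U]\), the tolerances of integrate() may need tightening.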

We can also calculate the probability of passing \(m\) PPQ batches. Assuming the batches are independent and have the same expected performance, this probability is \[ \Pr(\text{Passing } m \text{ batches }) = \{\Pr(\text{Passing a Single PPQ Batch}) \}^m\]
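
The sketch below illustrates how the overall probability shrinks as \(m\) grows. The single-batch probability of 0.85 and the 0.90 target are hypothetical placeholders, and the function names are assumptions. Inverting the formula gives the single-batch probability needed to reach a target overall probability.

```r
# Probability of passing all m independent, identically performing batches
# (m_batch_pass_prob is a hypothetical helper name)
m_batch_pass_prob <- function(p_single, m) p_single^m

p_single <- 0.85                       # hypothetical single-batch probability
p_three <- m_batch_pass_prob(p_single, m = 3)
print(p_three)                         # 0.85^3 = 0.614125

# Overall probability for m = 1, ..., 5 batches
print(m_batch_pass_prob(p_single, 1:5))

# Single-batch probability required to reach a target overall probability
required_single <- function(target, m) target^(1 / m)
print(required_single(0.90, m = 3))
```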