Reference-scaling: Don’t use FARTSSIE!

» What BE limits should be taken into consideration while estimating sample size for replicate design studies for USFDA & EMA.

If you want to go with reference-scaling, none! The limits depend on the CVwR and are obtained in the actual study. Hence, you need simulations, where your assumed CVwR is the mean. Either use the tables provided by the two Lászlós* or the package PowerTOST for R. The software is open source and free of charge.

» Should it be 80-125% ?

No (see above). Only if you don’t want reference-scaling (i.e., conventional ABE). Actually reference-scaling was introduced to avoid the extreme sample sizes required for ABE.

» We normally use FARTSSIE

FARTSSIE cannot perform the necessary simulations and therefore, is not suitable for any of the reference-scaling methods.

Do not assume a T/R of 95% for HVDP(s). The two Lászlós recommend a 10% difference and for good reasons. Therefore, 90% is the default in PowerTOST’s reference-scaling functions.

Whenever possible avoid the partial replicate design. Though the evaluation works for the EMA’s methods, sometimes no convergence is achieved for the FDA’s mixed-effects model (if CVwR <30%) – independent of the software. Study done, no result, you are fired.
Better to opt for full replicate 2-sequence designs: 4-period (TRTR|RTRT) or – if you are afraid of dropouts – 3-period (TRT|RTR). In the latter you need at least twelve eligible subjects in sequence RTR according to the EMA’s Q&A document (not relevant in practice).

After installing R and downloading PowerTOST, start the R-console, and type library(PowerTOST). If you want to make yourself familiar with PowerTOST, type help(package=PowerTOST).
Try these examples, where "2x2x4" denotes the 4-period full replicate, "2x2x3" the 3-period full replicate, and "2x3x3" the 3-sequence 3-period (partial) replicate.
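The example calls themselves are missing from this post; a sketch of what they might have looked like (the CVwR of 45% is my assumption – it was not preserved in the text):

```r
library(PowerTOST)
# EMA: average bioequivalence with expanding limits (ABEL)
sampleN.scABEL(CV = 0.45, theta0 = 0.90, design = "2x2x4")  # 4-period full replicate
sampleN.scABEL(CV = 0.45, theta0 = 0.90, design = "2x2x3")  # 3-period full replicate
sampleN.scABEL(CV = 0.45, theta0 = 0.90, design = "2x3x3")  # partial replicate
# FDA: reference-scaled average bioequivalence (RSABE)
sampleN.RSABE(CV = 0.45, theta0 = 0.90, design = "2x2x4")
sampleN.RSABE(CV = 0.45, theta0 = 0.90, design = "2x2x3")
sampleN.RSABE(CV = 0.45, theta0 = 0.90, design = "2x3x3")
```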

Sample sizes for the FDA (22, 34, 30) are always smaller than the ones for the EMA (28, 42, 39) because in RSABE there is no 50% CVwR cap for scaling. On the other hand, Health Canada’s ABEL sometimes requires fewer subjects than the EMA’s because its upper cap is 57.38%. If your boss insists on a T/R of 95%, add the argument theta0=0.95 to the function calls – but don’t blame me if the study fails. If you want 90% power instead of the default 80%, add the argument targetpower=0.90.
I suggest performing a power analysis (recommended by ICH E9) assessing the influence of deviations from the assumptions (lower/higher CVwR, a T/R deviating more from 100% than expected, fewer eligible subjects due to dropouts). Try (with the default partial replicate):
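The call is missing from this post; in PowerTOST such a power analysis is available via pa.scABE(). A sketch (the CVwR of 45% is my assumption):

```r
library(PowerTOST)
# power analysis: explores how far CVwR, theta0, and n may deviate from
# the assumptions before power drops below minpower (default 0.7)
res <- pa.scABE(CV = 0.45, theta0 = 0.90, targetpower = 0.80,
                design = "2x3x3", regulator = "EMA")
print(res)
plot(res)  # visualizes power against the deviations
```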

For the FDA the CVwR can be as high as ~97% before power drops to 70%, even though the sample size (30) is smaller than the EMA’s and Health Canada’s (39). With the same sample size of 39, the CVwR can rise to ~76% for Health Canada but only to ~64% for the EMA. This is a consequence of the differing upper scaling caps.

Reference-scaling: Don’t use FARTSSIE!

» Just a few words in defence of FARTSSIE. Maybe the approximations are crude, far from the exact PowerTOST (see #), but the results overall seem to be correlated for CV >30%:

Nope. The EMA’s ABEL means not only expanding the limits (based on the observed – not the assumed – CVwR) but also assessing the GMR-restriction. It’s a complex framework and therefore no simple formula to calculate power exists. Try this one:
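The table that followed is missing here; a sketch of how it could be reproduced (the grid of CVs and the 4-period full replicate are my assumptions, and FARTSSIE’s approximate sample sizes are stood in for by the exact ABE ones):

```r
library(PowerTOST)
CVs <- seq(0.30, 0.60, 0.05)
res <- data.frame(CV = CVs, n.FARTSSIE = NA, pwr.ABE = NA,
                  pwr.FARTSSIE = NA, n.ABEL = NA, pwr.ABEL = NA)
for (i in seq_along(CVs)) {
  # exact ABE sample size as a stand-in for FARTSSIE's approximation
  n.ABE <- sampleN.TOST(CV = CVs[i], theta0 = 0.90, design = "2x2x4",
                        print = FALSE)[["Sample size"]]
  res$n.FARTSSIE[i]   <- n.ABE
  res$pwr.ABE[i]      <- power.TOST(CV = CVs[i], theta0 = 0.90, n = n.ABE,
                                    design = "2x2x4")
  # power of that sample size under the EMA's ABEL (simulated)
  res$pwr.FARTSSIE[i] <- power.scABEL(CV = CVs[i], theta0 = 0.90, n = n.ABE,
                                      design = "2x2x4")
  # sample size which actually gives the target power under ABEL
  tmp <- sampleN.scABEL(CV = CVs[i], theta0 = 0.90, design = "2x2x4",
                        print = FALSE, details = FALSE)
  res$n.ABEL[i]   <- tmp[["Sample size"]]
  res$pwr.ABEL[i] <- tmp[["Achieved power"]]
}
print(res, row.names = FALSE)
```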

For all CVs FARTSSIE’s sample sizes will give the desired power for ABE (column pwr.ABE) but not for ABEL (column pwr.FARTSSIE). For a CV of 30% the sample size will be too high (only 34 subjects are required), but for CVs >30% too low (more subjects are required). Only sampleN.scABEL() will give sample sizes (n.ABEL) with at least the desired power (pwr.ABEL).

As expected, FARTSSIE screws up completely if one wants to assess the Type I Error. Set “Method C1” and the 4-period replicate. Try Calculate Power with CV 30%, T/R 125%, and a sample size of 34. I got

Reference-scaling: USE PowerTOST

Thank you for the response. Honestly, this is too much statistics for me to understand. We are discussing with our statisticians & medical writing team to fully understand the response and implement it accordingly.

With nsims=1E6 and setseed=F, on the other hand, I didn’t succeed in obtaining power >0.8 (20 tries).
One more reason to use PowerTOST’s power.scABEL() / sampleN.scABEL() functions – also w.r.t. the tables in the paper of the two Lászlós.
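With the default setseed=TRUE the simulations are reproducible; a sketch of the difference (CV, n, and theta0 taken from the discussion above):

```r
library(PowerTOST)
# fixed seed (the default setseed = TRUE): repeated calls are identical
p1 <- power.scABEL(CV = 0.8, theta0 = 0.95, n = 57, design = "2x3x3", nsims = 1e5)
p2 <- power.scABEL(CV = 0.8, theta0 = 0.95, n = 57, design = "2x3x3", nsims = 1e5)
identical(p1, p2)
# random seed: results scatter from run to run
p3 <- power.scABEL(CV = 0.8, theta0 = 0.95, n = 57, design = "2x3x3",
                   nsims = 1e5, setseed = FALSE)
```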

—
Regards,

Detlew

mittyri

Russia,
2017-12-07 16:03

@ d_labes · Posting: # 18043

Use PowerTOST, not Endrényi L, Tóthfalusi L tables

here comes a magic (citing the paper): “The precision of the estimation was evaluated by running the simulations twenty times at twenty-two different conditions (different CVs, different GMRs, and different designs).”

Seed was sampled ‘unfortunately’ about 20 times?

Precision in Endrényi L, Tóthfalusi L tables

» here comes a magic (citing the paper): The precision of the estimation was evaluated by running the simulations twenty times at twenty-two different conditions (different CVs, different GMRs, and different designs).
»
» Seed was sampled ‘unfortunately’ about 20 times?

What do you mean by 'unfortunately'?

This sentence describes an investigation of how precise the simulated power values are.
Read on in the paragraph: “The standard deviation of the simulated powers (20 values under each combination of CV, GMR, and design, each with 10,000 sims; my insertion) was calculated under each condition; the mean of these standard deviations was 0.460. Thus, the precision of the power estimation is about ±0.5%.”
They must, of course, always have chosen a different seed for the 20 repetitions under each combination of CV, GMR, and design. Otherwise they would always have obtained the same result under each combination.

Of course they also had the opportunity to obtain a precision estimate via the binomial distribution, without the burden of all those calculations. But then the random number generator would have to be ideal.
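For comparison, that binomial standard error takes only a few lines of plain R (no PowerTOST needed): each simulated study either passes or fails, i.e., a binomial experiment with nsims trials.

```r
# standard error of a power p estimated from nsims simulated studies
se.power <- function(p, nsims) sqrt(p * (1 - p) / nsims)
se.power(p = 0.80, nsims = 1e4)  # 0.004, i.e., +/-0.4%
se.power(p = 0.50, nsims = 1e4)  # 0.005, the worst case
```

For powers in the range explored by the paper this gives roughly ±0.4–0.5% with 10,000 sims, consistent with the empirically found ±0.5%.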

Precision of PowerTOST

» I tried to figure out the precision of PowerTOST estimations with the following code:
» scABELdata$power[j] <- power.scABEL(CV=0.8, n=54, theta0=0.95,
»                                     design="2x3x3", nsims=10000,
»                                     setseed=F)
» please correct if it is wrong.

I think so.
You got the right answer to a wrong question. For theta0=0.95 and targetpower=0.8 the sample sizes are 57 for design="2x3x3" and 56 for design="2x2x3". Therefore, with 54 subjects power will be <0.8 for both designs:
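The verifying calls were not preserved; a sketch:

```r
library(PowerTOST)
# power with only 54 subjects (below the required 57 and 56, respectively)
power.scABEL(CV = 0.8, theta0 = 0.95, n = 54, design = "2x3x3")
power.scABEL(CV = 0.8, theta0 = 0.95, n = 54, design = "2x2x3")
# sample sizes actually required for >=80% power
sampleN.scABEL(CV = 0.8, theta0 = 0.95, design = "2x3x3")
sampleN.scABEL(CV = 0.8, theta0 = 0.95, design = "2x2x3")
```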

Reference-scaling: Don’t use FARTSSIE!

» If you want to go with reference-scaling, none! The limits depend on the CVwR and are obtained in the actual study. Hence, you need simulations, where your assumed CVwR is the mean. Either use the tables provided by the two Lászlós* or the package PowerTOST for R. The software is open source and free of charge.

Reference-scaling: Don’t use FARTSSIE!

» Just wondering what parameters/variables are applied in the simulations, likewise in the sample sizes suggested by the Lászlós or by the package PowerTOST,

Try the respective functions sampleN.RSABE() for the FDA’s reference-scaling and sampleN.scABEL() for the EMA’s and Health Canada’s average bioequivalence with expanding limits in PowerTOST. Both functions are extensively documented. In the R-console after typing library(PowerTOST) try help(sampleN.RSABE) or help(sampleN.scABEL).
László Tóthfalusi’s code (in MATLAB) is not available in the public domain. It is slow (therefore, only 50,000 simulations) and for years the two Lászlós have recommended PowerTOST instead.

» Also what could be impact of difference in variable assumption for simulation incase of borderline high and very large variability (like 30% and 300%),

With a high (say >50%) CVwR the GMR-restriction (80.00% ≤ GMR ≤ 125.00%) takes precedence over the CI-decision. At a CVwR of 300% practically only the GMR matters (esp. for RSABE).
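The expansion of the limits can be inspected with scABEL(); a sketch (EMA defaults):

```r
library(PowerTOST)
scABEL(CV = 0.30)  # 0.8000 1.2500 - no expansion
scABEL(CV = 0.50)  # ~0.6984 ~1.4319 - maximum expansion
scABEL(CV = 3.00)  # same as above: the EMA caps scaling at CVwR 50%
```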

» varying T/R ratios (100%, …

Never, ever assume a T/R-ratio of 100%. It will strike back when the study fails due to a sample size which was too low. We recommend 90% at best.

» … 80.01%)

That’s – for CVwR ≤30% (no scaling) – almost a simulation of the Type I Error (assuming a true null). That would be at exactly 80.00%.
Try
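A sketch of such a simulation (the sample size of 34 is my assumption; at CVwR 30% the EMA’s limits are not expanded, so the true null sits at exactly 80.00%):

```r
library(PowerTOST)
# empirical Type I Error of the EMA's ABEL: simulate at the (unexpanded) limit
power.scABEL(CV = 0.30, theta0 = 0.80, n = 34, design = "2x2x4", nsims = 1e6)
# compare the result with the nominal alpha = 0.05
```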