From fixed to flexible trial designs, the Summer 2019 nQuery release (Ver 8.4) sees the continued strengthening of nQuery to help Biostatisticians and clinical researchers save costs and reduce risk.
41 new sample size tables have been added in total.
With rising costs and failure rates in drug development becoming an increasingly important issue, adaptive trials offer one mechanism to alleviate these problems and make clinical trials better reflect the statistical and practical needs of trial sponsors. These issues have also spurred an increasing openness towards innovative clinical trial designs in regulatory agencies around the world.
Summer 2019 nQuery Release Notes | ver 8.4
What's new in the nQuery Adaptive module?
The Summer 2019 release extends the number of tables on offer for adaptive designs. 7 new tables will be added to this area.
In this release the following areas are targeted for development:
- Conditional Power and Predictive Power
- Unblinded Sample Size Re-estimation
- Blinded Sample Size Re-estimation
A background to these areas along with a list of the sample size tables which are added in the Summer release is given in the adjacent sections.
Conditional Power and Predictive Power
In group sequential designs and other adaptive designs, access to the interim data gives the ability to answer the important question of how likely a trial is to succeed based on the information accrued so far. The two most commonly cited statistics to evaluate this are conditional power and predictive power.
Conditional power is the probability that the trial will reject the null hypothesis at a subsequent look given the current test statistic and the assumed parameter values, which are usually assumed to equal their interim estimates. Predictive power (a.k.a. Bayesian Predictive Power) is the conditional power averaged over the posterior distribution of the effect size. Both give an indication of how promising a study is based on the interim data and can be used as ad-hoc measures for futility testing or for defining “promising” results for unblinded sample size re-estimation.
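As a rough illustration of the two definitions above, the sketch below computes both quantities for a design with a single remaining final analysis. It assumes a normally distributed test statistic, a fixed one-sided critical value (no group sequential boundary adjustment) and a flat prior for predictive power. The function names are hypothetical and this is not nQuery's implementation.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def conditional_power(z_interim, info_frac, drift=None, alpha=0.025):
    """P(final Z exceeds the one-sided critical value | interim Z, drift).

    drift is the expected final Z statistic. When omitted, the interim
    estimate z_interim / sqrt(info_frac) is used (the "current trend").
    """
    if drift is None:
        drift = z_interim / sqrt(info_frac)
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha)
    b_t = z_interim * sqrt(info_frac)  # interim value on the score (Brownian motion) scale
    mean_final = b_t + drift * (1 - info_frac)
    return _STD_NORMAL.cdf((mean_final - z_crit) / sqrt(1 - info_frac))

def predictive_power(z_interim, info_frac, alpha=0.025):
    """Conditional power averaged over a flat-prior posterior for the drift."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha)
    return _STD_NORMAL.cdf(
        (z_interim - z_crit * sqrt(info_frac)) / sqrt(1 - info_frac)
    )

# Interim Z of 2.0 observed at half of the planned information
cp = conditional_power(2.0, 0.5)
pp = predictive_power(2.0, 0.5)
print(f"conditional power = {cp:.3f}, predictive power = {pp:.3f}")
```

Predictive power is typically less extreme than conditional power under the current trend because it also accounts for uncertainty in the interim effect estimate.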
Building on the initial nQuery Adapt release, 1 table will be added for conditional power and predictive power as follows:
- Conditional Power for 2x2 Cross-over Design
Crossover trials use a repeated measures design, where each subject receives more than one treatment, with different treatments given in different time periods. The main benefit of a crossover trial is the removal of the between subject effect that impacts parallel trials. This can yield a more efficient use of resources as fewer subjects may be required in the crossover design than comparable designs.
Unblinded Sample Size Re-estimation
In group sequential designs and other similar designs, access to the interim data provides the opportunity to improve a study to better reflect the updated understanding of the study. One way a group sequential design can use the interim effect size estimate is not only to decide whether or not to stop a trial early but also to increase the sample size if the interim effect size is considered “promising”. This optionality gives the trialist the chance to initially power for a more optimistic effect size, thus reducing up-front costs, while still being confident of being able to detect a smaller but clinically relevant effect size by increasing the sample size if needed.
The most common way to define whether an interim effect size is promising is conditional power. Conditional power is the probability that the trial will reject the null hypothesis at a subsequent look given the current test statistic and the assumed parameter values, which are usually assumed to equal their interim estimates. For “promising” trials, where the conditional power falls above a lower bound (a typical value would be 50%) but below the target study power, the initial sample size can be increased to make the conditional power equal the target study power.
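The promising-zone rule described above can be sketched as follows for a two-look design. This is a minimal sketch and not the rule set used in nQuery: it assumes the conventional unweighted statistic, conditional power evaluated under the current trend, and sample sizes expressed in information units (for a survival trial these would be events rather than subjects). The function names and default bounds are assumptions.

```python
from math import sqrt
from statistics import NormalDist

_N = NormalDist()

def cp_current_trend(z1, n1, n_total, alpha=0.025):
    """Conditional power at the final analysis if the interim trend continues."""
    n_incr = n_total - n1  # information still to be collected after the interim
    z_crit = _N.inv_cdf(1 - alpha)
    threshold = (z_crit * sqrt(n_total) - z1 * sqrt(n1)) / sqrt(n_incr)
    return 1 - _N.cdf(threshold - z1 * sqrt(n_incr / n1))

def reestimate(z1, n1, n_planned, n_max, target=0.9, lower=0.5, alpha=0.025):
    """Return the new total sample size under a simple promising-zone rule."""
    cp = cp_current_trend(z1, n1, n_planned, alpha)
    if not (lower <= cp < target):
        return n_planned  # unfavourable or already favourable: keep the plan
    for n_new in range(n_planned, n_max + 1):
        if cp_current_trend(z1, n1, n_new, alpha) >= target:
            return n_new  # smallest size reaching the target conditional power
    return n_max  # target not reachable: cap at the maximum

# Interim Z of 1.5 after 100 of 200 planned units, with a cap of 400
print(reestimate(1.5, n1=100, n_planned=200, n_max=400))
```

Capping at a maximum sample size mirrors the choice described in these notes between increasing to the maximum and increasing only as far as the target conditional power requires.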
Building on the initial nQuery Adapt release, the following table will be added for unblinded sample size re-estimation:
- Interim Monitoring and Unblinded Sample Size Re-estimation for Survival
This table allows nQuery Adapt users to extend their initial group sequential design for survival in two groups using the Log-Rank test (with or without unequal follow-up) by giving tools which allow users to conduct interim monitoring and conduct a flexible sample size re-estimate at a specified interim look.
This table will be accessible by designing a study using either of the two group sequential designs for survival tables and using the “Interim Monitoring & Sample Size Re-estimation” option from the group sequential “Looks” table. This table will provide for two common approaches to unblinded sample size re-estimation: Chen-DeMets-Lan and Cui-Hung-Wang. There is also an option to ignore the sample size re-estimation and conduct interim monitoring for a standard group sequential design.
The Chen-DeMets-Lan method allows a sample size increase while using the standard group sequential unweighted Wald statistics without appreciable error inflation, assuming an interim result has sufficiently "promising" conditional power. The primary advantages of the Chen-DeMets-Lan method are being able to use the standard group sequential test statistics and that each subject will be weighted equally to the equivalent group sequential design after a sample size increase. However, this design is restricted to the final interim analysis and Type I error control is expected but not guaranteed depending on the sample size re-estimation rules.
The Cui-Hung-Wang method uses a weighted test statistic, using pre-set weights based on the initial sample size and the incremental interim test statistics, which strictly controls the type I error. However, this statistic will differ from that of a standard group sequential design after a sample size increase and since subjects are weighted on the initial sample size, those subjects in the post-sample size increase cohort will be weighted less than those before.
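To make the difference between the two statistics concrete, the sketch below compares a weighted statistic whose weights are fixed by the originally planned sample size with the conventional unweighted statistic after a sample size increase. It assumes a two-look design with independent stage-wise Z statistics. The function names are hypothetical, and for a survival endpoint the sample sizes would be numbers of events.

```python
from math import sqrt

def chw_statistic(z1, z2_incremental, n1, n_planned):
    """Weighted final statistic with weights fixed by the original design."""
    w1 = sqrt(n1 / n_planned)
    w2 = sqrt((n_planned - n1) / n_planned)
    return w1 * z1 + w2 * z2_incremental

def unweighted_statistic(z1, z2_incremental, n1, n_actual):
    """Conventional statistic that weights every subject equally."""
    return (sqrt(n1) * z1 + sqrt(n_actual - n1) * z2_incremental) / sqrt(n_actual)

# Interim after 100 of 200 planned units; total later increased to 400
z1, z2 = 1.5, 2.0
print(chw_statistic(z1, z2, n1=100, n_planned=200))
print(unweighted_statistic(z1, z2, n1=100, n_actual=400))
```

In this example the second stage contains 300 units but still receives the weight planned for 100, which is why post-increase subjects count for less under the weighted approach. The two statistics coincide when the sample size is not changed.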
There will be full control over the rules for the sample size re-estimation including sample size re-estimation look (for Cui-Hung-Wang), maximum sample size, whether to increase to the maximum sample size or the sample size to achieve the target conditional power and bounds for what a “promising” conditional power is, among others.
Blinded Sample Size Re-estimation
Sample size determination always involves a level of uncertainty over the assumptions made to find the appropriate sample size. Many of these assumed values are for nuisance parameters which are not directly related to the effect size. Thus it would be useful to have a better estimate for these values than relying on external sources or incurring the cost of a separate pilot study, but without the additional regulatory and logistical costs of using unblinded interim data. Blinded sample size re-estimation allows improved estimates for these nuisance parameters to be obtained without unblinding the study.
In the Summer 2019 release, five tables will be added for blinded sample size re-estimation using the internal pilot method. The internal pilot method assigns an initial cohort of subjects as the “pilot study” and then calculates an updated value for a nuisance parameter of interest. This updated nuisance parameter value is then used to increase the study sample size if required, with the final analysis conducted with standard fixed term analyses with the internal pilot data included.
The new additions to the Adapt Module expand the scope of the nQuery Adapt blinded sample size re-estimation tables to the cases where unequal sample sizes and continuity corrections are needed. The new tables will be as follows:
- Blinded Sample Size Re-estimation for Two Sample t-test for Inequality (common variance, unequal n's)
- Blinded Sample Size Re-estimation for Two Sample t-test for Non-inferiority (unequal n's)
- Blinded Sample Size Re-estimation for Two Sample t-test for Equivalence (unequal n's)
- Blinded Internal Pilot Sample Size Re-estimation for Two Sample χ2 Test for Non-inferiority (Continuity Corrected)
- Blinded Internal Pilot Sample Size Re-estimation for Two Sample χ2 Test for Inequality (Continuity Corrected)
These tables will provide full flexibility over the size of the internal pilot study, whether sample size decreases are allowable in addition to increases, and tools to derive the best blinded estimate from the internal pilot.
Blinded sample size re-estimation for the two sample t-test updates the sample size based on a blinded estimate of the common within-group standard deviation. Three methods are available to estimate the within-group standard deviation from the internal pilot data: pilot standard deviation, bias-adjusted pilot standard deviation, upper confidence limit for pilot standard deviation.
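A minimal sketch of the first two estimation methods is given below. It makes several assumptions that go beyond these notes: 1:1 allocation (the new tables also cover unequal group sizes), a normal-approximation sample size formula rather than an exact t-based calculation, and a commonly used bias adjustment that subtracts the variance inflation caused by the assumed treatment difference. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist, variance

_N = NormalDist()

def n_per_group(sigma2, delta, alpha=0.05, power=0.9):
    """Normal-approximation sample size per group, two-sided two-sample test."""
    z = _N.inv_cdf(1 - alpha / 2) + _N.inv_cdf(power)
    return ceil(2 * sigma2 * z ** 2 / delta ** 2)

def blinded_variance(pooled_values, assumed_delta=None):
    """One-sample variance of the pooled pilot data, optionally bias-adjusted."""
    n = len(pooled_values)
    s2 = variance(pooled_values)  # group labels are never used, so blinding is kept
    if assumed_delta is not None:
        # The pooled variance is inflated by the between-group difference
        s2 -= n / (4 * (n - 1)) * assumed_delta ** 2
    if s2 <= 0:
        raise ValueError("adjusted variance is not positive; check assumed_delta")
    return s2

def reestimated_n(pooled_values, delta, n_initial, adjust_bias=True,
                  allow_decrease=False, alpha=0.05, power=0.9):
    """Updated per-group sample size after the internal pilot."""
    s2 = blinded_variance(pooled_values, delta if adjust_bias else None)
    n_new = n_per_group(s2, delta, alpha, power)
    return n_new if allow_decrease else max(n_new, n_initial)

pilot = [float(i) for i in range(10)]  # toy blinded pilot data
print(reestimated_n(pilot, delta=2.0, n_initial=30))
```

The `allow_decrease` flag reflects the option, mentioned above, of permitting the re-estimated sample size to fall below the initial one.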
Blinded sample size re-estimation for the two sample chi-squared test updates the sample size based on a blinded estimate of the total proportion of successes and combining this with the initial proportion difference estimate. The user can enter either the proportion of successes or number of successes for the equivalent analysis.
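The same idea for binary outcomes can be sketched as follows. The sketch assumes 1:1 allocation, so that the two group proportions are recovered as the blinded overall proportion plus and minus half the assumed difference, and it uses a normal-approximation formula with the Fleiss-style continuity correction. These choices and the function names are assumptions and may differ from what the nQuery tables compute.

```python
from math import ceil, sqrt
from statistics import NormalDist

_N = NormalDist()

def n_per_group_props(p1, p2, alpha=0.05, power=0.9, continuity=True):
    """Per-group sample size for a two-sided test of two proportions."""
    z = _N.inv_cdf(1 - alpha / 2) + _N.inv_cdf(power)
    diff = abs(p1 - p2)
    n = z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / diff ** 2
    if continuity:
        n = n / 4 * (1 + sqrt(1 + 4 / (n * diff))) ** 2
    return ceil(n)

def blinded_reestimate(successes, pilot_total, assumed_diff,
                       alpha=0.05, power=0.9, continuity=True):
    """Recompute the sample size from the blinded overall success proportion."""
    p_bar = successes / pilot_total  # pooled over both arms, labels unknown
    p1 = p_bar + assumed_diff / 2
    p2 = p_bar - assumed_diff / 2
    if not (0 < min(p1, p2) and max(p1, p2) < 1):
        raise ValueError("implied group proportions fall outside (0, 1)")
    return n_per_group_props(p1, p2, alpha, power, continuity)

# 40 successes among 100 pilot subjects, assumed difference of 0.2
print(blinded_reestimate(40, 100, 0.2))
```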
List of New Adapt Tables
Blinded Sample Size Re-estimation
- MTT24U Blinded Sample Size Re-estimation for Two Sample t-test for Inequality (common variance, unequal n's)
- MTE28U Blinded Sample Size Re-estimation for Two Sample t-test for Non-inferiority (unequal n's)
- MTE29U Blinded Sample Size Re-estimation for Two Sample t-test for Equivalence (unequal n's)
- PTE12 Blinded Internal Pilot Sample Size Re-estimation for Two Sample χ2 Test for Non-inferiority (Continuity Corrected)
- PTT27 Blinded Internal Pilot Sample Size Re-estimation for Two Sample χ2 Test for Inequality (Continuity Corrected)
Conditional Power and Predictive Power
- MTT38 Conditional Power for 2x2 Cross-over Design
Unblinded Sample Size Re-estimation
- STT17 Unblinded Sample Size Re-estimation and Interim Monitoring for Two Survival
How to Update
To access the adaptive module you must have an nQuery Advanced Pro subscription. If you do, then nQuery should automatically prompt you to update.
You can manually update nQuery Advanced by clicking Help>Check for updates.
nQuery Advanced 8.2 Latest Update | Release Notes
nQuery Advanced 8.2 | April 2018 Update
The nQuery April 2018 release will add a wide range of sample size tables, ranging from extensions of pre-existing tables for a better and clearer user experience to those based on the latest academic research and user feedback.
In the April 2018 release, we will be adding 52 new sample size tables to nQuery Advanced and 20 new tables to nQuery Bayes. This release summary will provide an overview of what areas have been targeted in this release along with the full list of tables being added.
nQuery Advanced Tables
In the April 2018 release, three main overarching areas were targeted for significant improvement. These were:
- Epidemiological Methods
- Non-inferiority and Equivalence Tests
- Correlation and Diagnostic Testing (ROC) Methods
There are also a number of tables which do not fall into these categories, related to areas such as inequality testing for log-normal data, testing of variances and standard deviations and non-parametric tests. These are described at the end of this document.
We will provide background on each of these below and a list of the sample size tables which will be added in the April release. References for each method are provided at the end of this article.
1. Epidemiological Methods
Epidemiology is the branch of medicine which primarily studies the incidence, distribution, and possible control of diseases and other factors relating to health. Epidemiological studies are a cornerstone of research into areas such as public health, health policy or preventative medicine (e.g. vaccines).
Due to often having to study the effect of medicines and interventions at a more complex society-wide level, processes and methods for epidemiology often adjust for problems that are less prominent in well-controlled clinical trials. These issues include being unable to individually randomise, relying on observational data or attempting to extract causal relationships from highly complex data.
Due to these and other issues, the study designs and statistical methods used by epidemiologists will often have to include adjustments for exogenous effects and have a greater reliance on processes such as pair-matching. While statistical methods for clinical trials provide a useful starting point for getting adequate sample size estimates, there is a growing desire for methods which have found traction in the epidemiological field.
In the April release, 12 new tables will be added with the main areas of focus in the Epidemiology upgrade being the following:
- Case-Control Studies (Observational and Prospective)
- Vaccine Efficacy Studies
- Cluster Randomized Trials (CRT)
- Mendelian Randomization Studies
These areas and the tables in each category are explored below.
Case-Control Studies
Case-control studies are those where the analysis assumes that the effect of a treatment, intervention or prognostic factor can be modelled by comparing the effect on paired cases and controls. In the epidemiological context, this is most commonly associated with retrospective studies attempting to find a relationship between a risk factor and a disease of interest (e.g. effect of smoking on lung cancer rates) using pre-existing sources such as health databases.
In this context, the nQuery April 2018 release adds four tables which should provide additional flexibility when planning a case-control study using nQuery. These are:
- Test for Binary Covariate in Logistic Regression
- Case-Control Test with M Controls per Case (McNemar Extension)
- Conditional Logistic Regression with Binary Risk Factor
- Conditional Logistic Regression with Continuous Risk Factor
These tables complement our pre-existing nQuery tables for chi-squared tests, exact tests, correlated proportions and logistic regression and add options for conditional logistic regression to allow for greater flexibility when exploring sample estimates for case-control studies.
Vaccine Efficacy Studies
Vaccine efficacy studies face significant challenges compared to other clinical trials. These include having a much larger scale due to often being nation or region wide campaigns, dealing with rare diseases or conditions and the challenges of doing work in the field rather than in a fully controlled setting. For reasons such as these, vaccine efficacy designs and statistical methods have developed their own approaches and terminology to help the relevant researchers and public or private bodies of interest.
For vaccine efficacy, the nQuery April 2018 release adds two additional tables tailored for finding the sample size for the precision of an estimate of the vaccine efficacy. These are:
- Confidence Interval for Vaccine Efficacy in a Cohort Study
- Confidence Interval for Vaccine Efficacy in a Case-Control Study
In conjunction with our wide range of pre-existing tables for binomial proportions and survival rates, these tables will give researchers in vaccine research more tailored options for their study.
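For readers less familiar with the quantity these tables target, the sketch below estimates vaccine efficacy in a cohort study as one minus the relative risk, with a confidence interval built on the log relative risk scale. This is a standard large-sample approach chosen for illustration; the notes do not state which interval method the nQuery tables use, and the function name is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def vaccine_efficacy_ci(cases_vax, n_vax, cases_unvax, n_unvax, level=0.95):
    """Return (VE, lower, upper) with VE = 1 - relative risk."""
    rr = (cases_vax / n_vax) / (cases_unvax / n_unvax)
    # Large-sample standard error of log(RR)
    se = sqrt(1 / cases_vax - 1 / n_vax + 1 / cases_unvax - 1 / n_unvax)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    rr_low = exp(log(rr) - z * se)
    rr_high = exp(log(rr) + z * se)
    # A high relative risk corresponds to a low efficacy, so the limits swap
    return 1 - rr, 1 - rr_high, 1 - rr_low

ve, lower, upper = vaccine_efficacy_ci(10, 1000, 50, 1000)
print(f"VE = {ve:.2f} ({lower:.2f} to {upper:.2f})")
```

A sample size table for precision would work in the opposite direction, searching for the cohort size at which the expected interval width meets a chosen target.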
Cluster Randomized Trials
Cluster randomized trials are studies where the unit of randomization is a cluster or group rather than the individual subject. This is a common design when there are natural blocks or clusters such as schools or hospitals. By assigning the same treatment to all subjects within a cluster, the administrative and financial cost of field trials can be reduced significantly. For this reason and others, this design type is very commonly seen in public health policy studies.
For cluster randomized trials, the nQuery April 2018 release includes four additional tables which expand upon our pre-existing options for cluster randomized trials. These are:
- Cluster Randomized Trial for Inequality Test of Two Means (unequal sample sizes)
- Cluster Randomized Trial for Non-inferiority Test of Two Means
- Cluster Randomized Trial for Equivalence Test of Two Means
- Cluster Randomized Trial for Superiority by a Margin of Two Means
These options expand upon our pre-existing tables for cluster randomized trials comparing means, proportions, incidence rates and survival curves and for alternative cluster randomized trials such as the matched-pair design.
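The cost of randomizing clusters rather than individuals is usually expressed through a design effect. The sketch below is an illustration only: it assumes equal cluster sizes, a known intra-cluster correlation (a parameter not discussed in these notes) and a normal-approximation formula for comparing two means. The function name and arguments are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def clusters_per_arm(delta, sigma, cluster_size, icc, alpha=0.05, power=0.9):
    """Clusters needed per arm for a two-sided comparison of two means."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)
    n_individual = 2 * (sigma * z / delta) ** 2  # per arm, individual randomization
    design_effect = 1 + (cluster_size - 1) * icc  # variance inflation from clustering
    return ceil(n_individual * design_effect / cluster_size)

# Clusters of 20 subjects with an intra-cluster correlation of 0.05
print(clusters_per_arm(delta=0.5, sigma=1.0, cluster_size=20, icc=0.05))
```

Even a small intra-cluster correlation can roughly double the required number of subjects when clusters are moderately large, which is why dedicated tables are useful for these designs.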
Mendelian Randomization
Mendelian Randomization is a form of randomization which takes advantage of the growing availability and understanding of genetic information to make causal claims about potential treatments without using the common fully randomized approach. By using well characterised relationships between genes and phenotypes with a known secondary effect on an outcome of interest, Mendelian randomization offers the opportunity to use genetic information as an instrumental variable to find the causal relationship between a risk factor of interest and a disease outcome.
For studies which use Mendelian Randomization, the nQuery April 2018 release provides two new tables. These are:
- Mendelian Randomization for Continuous Outcome
- Mendelian Randomization for Binary Outcome
These are the first tables in nQuery which account for this novel design. Innovative approaches such as this will be of active interest in the area.
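The instrumental variable idea can be illustrated with the simplest estimator used in this setting, the ratio of the gene-outcome association to the gene-exposure association. The sketch assumes a single genetic variant, summary-level association estimates and a first-order standard error that ignores uncertainty in the gene-exposure estimate. The function name is hypothetical and the notes do not state which estimator the nQuery tables assume.

```python
def wald_ratio(beta_gene_outcome, se_gene_outcome, beta_gene_exposure):
    """Causal effect of the exposure on the outcome via a single instrument."""
    if beta_gene_exposure == 0:
        raise ValueError("the instrument must be associated with the exposure")
    estimate = beta_gene_outcome / beta_gene_exposure
    std_error = se_gene_outcome / abs(beta_gene_exposure)  # first-order approximation
    return estimate, std_error

estimate, std_error = wald_ratio(0.10, 0.02, 0.50)
print(f"causal estimate = {estimate:.2f} (SE {std_error:.2f})")
```

Because the standard error is divided by the gene-exposure association, weak instruments inflate it sharply, which is one reason these studies often need very large samples.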
2. Non-inferiority and Equivalence Testing
Non-inferiority and equivalence testing are used to statistically evaluate how similar a proposed treatment is to a pre-existing standard treatment. This is a very common objective in areas such as generics and medical devices. This is particularly important if using a placebo group would be required otherwise.
As non-inferiority and equivalence testing will typically involve evaluation against a well-defined treatment (e.g. RLD), there is a lower incidence of the large parallel studies typically seen in Phase III clinical trials. One-sample, paired samples or cross-over designs are common as these will generally require a lower cost and sample size.
In the April release, we will be adding an additional 20 tables for non-inferiority and equivalence testing (22 if the cluster randomized trial tables for means referenced above are included). These are focused on expanding the options available for the non-inferiority and equivalence testing of continuous data, binary data and incidence rates. The focus areas are as follows:
- Continuous Outcome Studies
- Binary Outcome Studies
- Incidence Rate Outcome Studies
These areas and the tables in each category are explored below.
Continuous Outcome Studies
In the context of non-inferiority and equivalence testing, the comparison of continuous outcomes using means is the most common situation to encounter. A wide range of design types and statistical methods are available for comparing this type of data depending on the assumptions and constraints relevant to the proposed study. Common design types in this context would be one-sample, paired, cross-over and parallel studies. The most common statistical methods are based on assuming either that the data is normally distributed and comparing the difference in means (Additive model) or that the data is log-normally distributed and analysing the ratio of (geometric) means (Multiplicative model).
For non-inferiority and equivalence testing of continuous data, the nQuery April 2018 release adds an additional 12 tables. These are as follows:
- Non-inferiority for One Normal Mean
- Non-inferiority for One Log-Normal Mean
- Non-inferiority for Paired Means Ratio
- Equivalence for One Mean
- Equivalence for Paired Means
- Equivalence for One Log-Normal Mean
- Equivalence for Paired Means Ratio
- Non-inferiority for cross-over design
- Non-inferiority for Two-sample Mean Ratio on Log-scale
- Non-inferiority for Cross-over Mean Ratio on Log-scale
- Non-inferiority for Two-sample Mean Ratio on Original Scale
- Non-inferiority for Cross-over ratio on Original Scale
These tables expand upon the large number of pre-existing tables for non-inferiority and equivalence testing for means to give the largest number of options available to find the sample size.
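The relationship between the additive and multiplicative models can be sketched for a parallel two-group non-inferiority test. The sketch assumes higher values are better, 1:1 allocation, a one-sided normal-approximation formula and, for the ratio case, a log transformation with the log-scale standard deviation derived from the coefficient of variation. The function names are hypothetical and the results are approximations rather than nQuery output.

```python
from math import ceil, log, sqrt
from statistics import NormalDist

def n_noninferiority_diff(true_diff, margin, sd, alpha=0.025, power=0.9):
    """Per-group n for H0: difference <= -margin (additive model)."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha) + nd.inv_cdf(power)
    return ceil(2 * (sd * z / (true_diff + margin)) ** 2)

def n_noninferiority_ratio(true_ratio, limit_ratio, cv, alpha=0.025, power=0.9):
    """Per-group n for H0: ratio <= limit_ratio (multiplicative model)."""
    sd_log = sqrt(log(1 + cv ** 2))  # SD on the log scale for log-normal data
    # On the log scale the ratio problem becomes a difference problem
    return n_noninferiority_diff(log(true_ratio), -log(limit_ratio), sd_log,
                                 alpha, power)

print(n_noninferiority_diff(true_diff=0.0, margin=0.5, sd=1.0))
print(n_noninferiority_ratio(true_ratio=1.0, limit_ratio=0.8, cv=0.3))
```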
Binary Outcome Studies
In the context of non-inferiority and equivalence testing, the comparison of binary outcomes is less common but has grown more recently as additional statistical methods have become popularised. Common design types in this context would be one-sample, paired and parallel studies. One of the most noticeable aspects of binary data is the wide variety of options available, ranging from relatively simple normal approximation tests to more complex exact methods, and sample size methods have followed this trend.
For non-inferiority and equivalence testing of binary data, the nQuery April 2018 release adds an additional 2 tables. These are as follows:
- Non-inferiority for a Single Binary Proportion
- Equivalence Test for Two Independent Proportions
Note that both these tables integrate more exact binomial enumeration methods as an option in addition to the typical normal approximation methods. They also include options for a wide range of proposed statistics, with the main categories being chi-squared tests (including an option for continuity correction), Z and t-test approximations and several likelihood score statistics (Miettinen and Nurminen, Gart and Nam, Farrington and Manning). These tables expand upon the large number of pre-existing tables for the non-inferiority and equivalence testing of binary proportions.
Incidence Rate Outcome Studies
In the context of non-inferiority and equivalence testing, the comparison of incidence rates is a relatively uncommon scenario. Incidence rates are where the outcome of interest is the number of events which occur on average in a given time period (i.e. the event rate). The wider availability of software to analyse incidence rates directly rather than relying on normal approximations has seen a growth of interest in methods such as Poisson and Negative Binomial regression. This has naturally extended to the case of non-inferiority and equivalence testing of incidence rate data. The time-dependent nature of incidence rates means that models can integrate greater flexibility for time dependencies and this is reflected in the rapidly growing literature for sample size in the area.
For non-inferiority and equivalence testing of incidence rates data, the nQuery April 2018 release adds 6 tables. These are as follows:
- Non-inferiority for Two Rates using Poisson Model
- Equivalence for Two Rates using Poisson Model
- Equivalence for Negative Binomial Model (Equal Follow-up, Dispersion)
- Non-inferiority for Negative Binomial Model (Equal Follow-up, Dispersion)
- Equivalence for Negative Binomial Model (Unequal Follow-up, Dispersion)
- Non-inferiority for Negative Binomial (Unequal Follow-up, Dispersion)
These tables expand upon the pre-existing options for analysing incidence rate data in the context of inequality testing. These methods represent the latest research and in the last two tables can integrate the effects of dispersion and unequal follow-up on the sample size estimate.
3. Correlation and Diagnostic Testing (ROC) Methods
Correlation, agreement and ROC methods are interested in characterising the strength of the relationship between a predictor (e.g. presence of treatment) and outcome (disease progression) of interest. These measures are often used in conjunction with models and statistical testing to characterise the nature of the relationship of interest. These methods are of interest when attempting to communicate the strength of a model or a relationship.
These types of measures are seen throughout statistical practice but are particularly prominent in areas such as diagnostic testing, the social sciences and biomarker studies.
In the nQuery April release, we will be adding 9 additional tables in this area, which fall into the following main categories:
- Correlation and Agreement Measures
- Diagnostic Screening Measures
These are summarised below.
Correlation and Agreement Measures
Correlation measures are used to characterise the strength of relationship between continuous and/or ordinal outcomes and measures such as Pearson’s correlation are ubiquitous in statistical practice. Agreement measures are used to analyse the strength of the ability of more than one rater (e.g. tester or test) to agree and correctly diagnose the condition of one or more entities (e.g. subject disease status). Both of these are common outcomes of interest in a wide variety of settings.
Due to the ubiquity of these methods, a wide range of measures have been proposed to adjust for scenarios which diverge from the most common correlation and agreement measures (e.g. Pearson correlation and Cohen’s Kappa). Common complications adjusted for are the presence of ordinal instead of continuous variables or divergences from common distributional assumptions (e.g. Normal).
In the nQuery April release, we are adding four additional options in this area. These are as follows:
- Confidence Interval for Kendall-Tau Correlation
- Confidence Interval for Spearman Correlation
- Test for One Intra-cluster Correlation
- Test for One Cronbach Alpha
These add to the options available for other common correlation and agreement measures such as the Pearson Correlation, Lin’s Concordance Coefficient and Cohen’s Kappa Coefficient.
Diagnostic Screening Measures
Diagnostic screening measures are very common in clinical research. These measures are used to assess the performance of a diagnostic test to accurately predict a condition of interest in the population(s) of interest. Areas where this type of analysis has become particularly popular are biomarker studies, machine learning and predictive genetic tests.
Commonly, this strength is characterised by the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve, which provides a useful summary measure of screening performance over all potential cut-off points for a screening measure. However, a large number of other statistics may be of interest at specific cut-offs such as sensitivity (a.k.a. recall), specificity and positive predictive value (PPV, a.k.a. precision), among others.
In the nQuery April release, we are adding 5 new tables in this area. These are as follows:
- Confidence Interval for Area under the Curve (AUC)
- Confidence Interval for One Sensitivity
- Confidence Interval for One Specificity
- Simultaneous Interval for One Sensitivity and One Specificity
- Test for Paired Sensitivity
These add to the pre-existing tables already present in nQuery for the testing of AUC values under varying design types and sensitivity and specificity.
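The measures named in this section can be computed directly from a two-by-two classification table and from raw screening scores. The sketch below is illustrative only: the function names are hypothetical, and the AUC is computed with the rank-based (Mann-Whitney) definition, in which it equals the probability that a randomly chosen diseased subject scores higher than a randomly chosen non-diseased subject, with ties counted as one half.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Sensitivity, specificity and positive predictive value at one cut-off."""
    return {
        "sensitivity": tp / (tp + fn),  # recall among the truly diseased
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),          # precision among positive tests
    }

def auc(scores_diseased, scores_healthy):
    """Empirical area under the ROC curve over all possible cut-offs."""
    wins = 0.0
    for d in scores_diseased:
        for h in scores_healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(scores_diseased) * len(scores_healthy))

print(diagnostic_summary(tp=80, fp=10, fn=20, tn=90))
print(auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2]))
```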
4. Miscellaneous Tables
In the April release of nQuery, 11 tables do not fit into the above categorisations. These cover areas such as the testing of log-normal means, the testing of variances and standard deviations and non-parametric tests. These tables are as follows:
- Confidence Interval for One Variance
- Confidence Interval for One Variance using Tolerance Probability
- Confidence Interval for One Variance using Relative Error
- Confidence Interval for One Standard Deviation
- Confidence Interval for One Standard Deviation using Tolerance Probability
- Confidence Interval for One Standard Deviation using Relative Error
- Confidence Interval for Ratio of Two Variances
- Confidence Interval for Ratio of Two Variances using Relative Error
- One Sample t-test for Log-Normal data
- Paired t-test for Mean Ratio (logscale)
- One Sample/Paired Sample Wilcoxon Sign-Rank Test
These options expand upon nQuery’s pre-existing options in these areas.