4  Sampling, power and effect size

In this chapter, you will learn how to work with three related concepts: sampling, power, and effect size.

Effect size is a measure of the strength of a relationship in a statistical population: either the size of the difference between two groups or the strength of an association between two variables. In this chapter, we will learn how to calculate the effect size for different types of data.

The size of an effect is important when planning a study and trying to determine the sample size required. If an effect is small, we need a larger sample size to detect it. If an effect is large, we can detect it with a smaller sample size.

The ability of a study to detect an effect, using its sample, is called statistical power. Power is the probability that a study will correctly reject a false null hypothesis. In other words, power is the probability that a study will find a true effect when there is one (avoiding a Type 2 error).

Tip

The power of a study is influenced by the sample size, the effect size, and the significance level. A larger sample size, a larger effect size, and a higher (less strict) significance level all increase the power of a study.

However, since our significance level is usually set at 0.05 and the effect size is a property of the phenomenon being studied rather than something we choose, the sample size is usually the main factor that we can control to increase the power of a study.
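The relationships described in this tip can be checked numerically. The sketch below uses base R's power.t.test() for a two-sided, two-sample t-test; the helper name power_for() and all input values are illustrative assumptions, not data from a study.

```r
# Base R sketch: how power responds to sample size, effect size, and alpha.
# All input values below are illustrative assumptions, not study data.

# Helper (hypothetical name): power of a two-sided, two-sample t-test.
# With sd = 1, delta is expressed in standard deviation units (Cohen's d).
power_for <- function(n, d, alpha) {
  power.t.test(n = n, delta = d, sd = 1, sig.level = alpha)$power
}

# 1. Larger samples give more power (d and alpha held fixed)
power_by_n <- sapply(c(10, 20, 40, 80), power_for, d = 0.5, alpha = 0.05)

# 2. Larger effects give more power (n and alpha held fixed)
power_by_d <- sapply(c(0.2, 0.5, 0.8), function(d) power_for(30, d, 0.05))

# 3. A higher significance level gives more power (n and d held fixed)
power_by_alpha <- sapply(c(0.01, 0.05, 0.10), function(a) power_for(30, 0.5, a))

round(power_by_n, 2)
round(power_by_d, 2)
round(power_by_alpha, 2)
```

Each of the three vectors should increase from left to right, which is the pattern the tip describes.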

4.1 Different measures of effect size

There are different ways to calculate effect size depending on the type of data and the statistical test used. Here are some common effect size measures:

  • Cohen’s d: This is a measure of the difference between two means in standard deviation units. It is commonly used with t-tests and with pairwise comparisons following ANOVA tests.

  • Eta-squared (\(\eta^2\)): This is a measure of the proportion of variance in the dependent variable that is explained by the independent variable. It is commonly used in ANOVA tests.

  • Phi coefficient (\(\phi\)): This is a measure of the association between two binary variables. It is commonly used with chi-square tests on 2 × 2 tables.

  • Correlation coefficient (\(r\)): This is a measure of the strength and direction of the relationship between two continuous variables. It is commonly used in correlation tests.

In the following sections, we will calculate the effect size for different types of data using some of these measures.

4.2 Calculating Cohen’s d

Cohen’s d is a measure of the difference between two means in standard deviation units. It is calculated as the difference between the means divided by the pooled standard deviation. The formula for Cohen’s d is:

\[ d = \frac{{\bar{X}_1 - \bar{X}_2}}{{s_p}} \]

where:

  • \(\bar{X}_1\) and \(\bar{X}_2\) are the means of the two groups.

  • \(s_p\) is the pooled standard deviation, calculated as:

\[s_p = \sqrt{\frac{{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}}{{n_1 + n_2 - 2}}} \]

where:

  • \(n_1\) and \(n_2\) are the sample sizes of the two groups.

  • \(s_1\) and \(s_2\) are the standard deviations of the two groups.

Let’s calculate Cohen’s d for a hypothetical dataset with two groups. The dataset contains the following information:

  • Group 1: Mean = 10, Standard deviation = 2, Sample size = 30
  • Group 2: Mean = 12, Standard deviation = 3, Sample size = 30

To calculate Cohen’s d, we first calculate the pooled standard deviation (\(s_p\)) using the formula above, and then divide the difference between the two means by it.

Let’s calculate Cohen’s d for this dataset using R:

# Calculate Cohen's d

# Group 1
mean1 <- 10
sd1 <- 2
n1 <- 30

# Group 2
mean2 <- 12
sd2 <- 3
n2 <- 30

# Calculate pooled standard deviation
sp <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))

# Calculate Cohen's d
d <- (mean1 - mean2) / sp

d
[1] -0.7844645

The calculated value of Cohen’s d is -0.7844645. This negative value indicates that the mean of Group 1 is smaller than the mean of Group 2 by approximately 0.78 standard deviations.
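In practice you will often have the raw observations rather than summary statistics. As a minimal sketch, the same formula can be wrapped in a small function; cohens_d() is a hypothetical helper name (not a function from a package), and the data are simulated purely for illustration.

```r
# Sketch: Cohen's d from raw observations rather than summary statistics.
# cohens_d() is a hypothetical helper name, not a function from a package.
cohens_d <- function(x, y) {
  n1 <- length(x)
  n2 <- length(y)
  # Pooled standard deviation, following the formula for s_p given earlier
  sp <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
  (mean(x) - mean(y)) / sp
}

# Simulated data for illustration only (assumed normal, loosely matching
# the means and standard deviations of the worked example)
set.seed(123)
group1 <- rnorm(30, mean = 10, sd = 2)
group2 <- rnorm(30, mean = 12, sd = 3)

cohens_d(group1, group2)
```

Because the data are random draws, the result will be close to, but not exactly equal to, the value obtained from the summary statistics.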

4.3 Calculating Eta-squared and other effect size measures

It is possible to calculate other effect size measures, such as Eta-squared and the Phi coefficient. However, these measures are most commonly calculated from the output of statistical tests such as ANOVA and chi-square tests. To obtain them for the purpose of sample size calculation, you would usually look at previous studies or meta-analyses to determine the expected effect size. For clinical research, you may also use the minimal clinically important difference (MCID) as a guide to determine the effect size.
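As a rough illustration of how these measures fall out of test output, the sketch below derives Eta-squared from a one-way ANOVA on R's built-in PlantGrowth data and the Phi coefficient from a chi-square test on a 2 × 2 table. The table counts are made up for this example, and the object names are arbitrary.

```r
# Eta-squared from a one-way ANOVA, using R's built-in PlantGrowth data
fit <- aov(weight ~ group, data = PlantGrowth)
ss <- summary(fit)[[1]][["Sum Sq"]]   # sums of squares: between groups, residual
eta_sq <- ss[1] / sum(ss)             # SS_between / SS_total

# Phi from a chi-square test on a 2 x 2 table
# The counts below are made up purely for illustration
tab <- matrix(c(30, 10,
                15, 25), nrow = 2, byrow = TRUE)
chi <- chisq.test(tab, correct = FALSE)  # no continuity correction
phi <- sqrt(unname(chi$statistic) / sum(tab))

eta_sq
phi
```

Here Eta-squared is the between-group sum of squares as a share of the total, and Phi is the square root of the chi-square statistic divided by the total count.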

4.4 Power analysis

Power analysis is a method used to determine the sample size required to detect an effect of a given size with a chosen power and significance level. It is important to conduct a power analysis before conducting a study to ensure that the sample size is adequate to detect the effect of interest.

To conduct a power analysis, you need to specify the following parameters:

  • The effect size: The size of the effect you want to detect. This is usually determined based on previous studies, meta-analyses or MCID.

  • The significance level (\(\alpha\)): The probability of rejecting the null hypothesis when it is true (Type 1 error rate). This is commonly set at 0.05.

  • The power (\(1 - \beta\)): The probability of correctly rejecting the null hypothesis when it is false (1 - Type 2 error rate). This is commonly set at 0.80 or 0.90.

  • The number of groups or conditions: The number of groups or conditions being compared in the study, which determines the statistical test and therefore which power function applies.

The sample size required to achieve a desired power level can be calculated using power analysis functions in R. There are many R packages for power analysis. One of them, pwr, provides functions to calculate the sample size required for different types of statistical tests.

The functions for some basic research designs are:

  • pwr.t.test(): For t-tests

  • pwr.anova.test(): For ANOVA tests

  • pwr.chisq.test(): For chi-square tests

  • pwr.f2.test(): For regression models
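As a hedged sketch of how the functions other than pwr.t.test() are called, the block below solves for sample size with each of them. The effect sizes are Cohen's conventional "medium" values, used only as placeholders; the number of groups, degrees of freedom, and number of predictors are likewise assumptions made for this example.

```r
library(pwr)

# Effect sizes below are Cohen's conventional "medium" values, used here
# only as placeholders; a real study should justify its own values.

# One-way ANOVA with 3 groups, Cohen's f = 0.25
anova_res <- pwr.anova.test(k = 3, f = 0.25, sig.level = 0.05, power = 0.80)

# Chi-square test with 1 degree of freedom, Cohen's w = 0.3
chisq_res <- pwr.chisq.test(w = 0.3, df = 1, sig.level = 0.05, power = 0.80)

# Regression with 2 predictors (u = 2), Cohen's f2 = 0.15
reg_res <- pwr.f2.test(u = 2, f2 = 0.15, sig.level = 0.05, power = 0.80)

ceiling(anova_res$n)   # sample size per group
ceiling(chisq_res$N)   # total sample size
ceiling(reg_res$v)     # denominator degrees of freedom
```

Note that the three functions report different quantities: pwr.anova.test() returns n per group, pwr.chisq.test() returns the total N, and pwr.f2.test() returns the denominator degrees of freedom v, from which the total sample size is v + u + 1.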

Let’s calculate the sample size required to achieve a power of 0.80 for a t-test with the example data we used earlier. We will use the pwr.t.test() function from the pwr package with the following parameters:

  • Effect size (Cohen’s d) = -0.7844645

  • Significance level (\(\alpha\)) = 0.05

# Load the pwr package
library(pwr)

# Calculate the sample size required for a t-test
# using the d value calculated earlier
pwr.t.test(d = d, sig.level = 0.05, power = 0.80)

     Two-sample t test power calculation 

              n = 26.50429
              d = 0.7844645
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

NOTE: n is number in *each* group

The output of the pwr.t.test() function provides the sample size required to achieve a power of 0.80 for a t-test with the specified effect size and significance level. The output includes the following information:

  • n = The sample size required for each group to achieve a power of 0.80.

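  • d = The effect size (Cohen’s d) used in the power analysis. For a two-sided test only the magnitude of the effect matters, which is why the output shows 0.7844645 rather than -0.7844645.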

  • sig.level = The significance level used in the power analysis.

  • power = The power level achieved with the specified sample size.

The sample size required to achieve a power of 0.80 for a t-test with the specified effect size and significance level is 26.5 for each group. Since the sample size must be a whole number, we round up, because rounding down would leave the study slightly underpowered. Therefore, the sample size required for each group is 27, or 54 in total.
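One way to confirm the rounded figure is to re-run pwr.t.test() with n supplied and power left unspecified, so that the function solves for power instead. This is a minimal sketch; the object names are arbitrary and the effect size is entered as the magnitude of the d calculated earlier.

```r
library(pwr)

# Solve for n, then round up to a whole number per group
res <- pwr.t.test(d = 0.7844645, sig.level = 0.05, power = 0.80)
n_per_group <- ceiling(res$n)

# Re-run with n fixed and power left unspecified to get the achieved power
achieved <- pwr.t.test(n = n_per_group, d = 0.7844645, sig.level = 0.05)

n_per_group
achieved$power
```

The achieved power should come out slightly above the 0.80 target, since rounding up adds a small amount of extra sample.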

4.5 More complex power analysis

For more complex designs, different approaches to power analysis might be necessary, such as using simulation. This is possible to do in R, but is beyond the scope of this chapter.
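To give a flavour of the simulation approach without going into detail, the sketch below estimates power for a simple two-sample t-test by generating many simulated studies and counting how often the test rejects the null hypothesis. It assumes normally distributed outcomes with equal standard deviations; the group size, effect size, and number of simulations are illustrative choices.

```r
# Simulation sketch: estimate power for a two-sample t-test by brute force.
# Assumes normally distributed outcomes with equal standard deviations;
# n_per_group, d, and n_sims are illustrative choices.
set.seed(42)
n_per_group <- 27
d <- 0.78
n_sims <- 2000

p_values <- replicate(n_sims, {
  g1 <- rnorm(n_per_group, mean = 0, sd = 1)
  g2 <- rnorm(n_per_group, mean = d, sd = 1)
  t.test(g1, g2, var.equal = TRUE)$p.value
})

# Estimated power: the share of simulated studies that reject at alpha = 0.05
est_power <- mean(p_values < 0.05)
est_power
```

For this simple design the estimate should land near the analytical power; the value of simulation is that the data-generating step can be replaced with whatever structure a more complex design requires.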