AgreeStat Analytics
Research & Software for Analyzing Inter-Rater Reliability Data

Intraclass Correlation Sample Size Determination
Prospective Power Analysis/1-Way Random Subject Effects Model

Problem

Assume you are in the planning stage of an inter-rater reliability experiment, and that the one-way random subject effects design will be adopted.  However, you do not know how many subjects should be used.

AgreeStat360 can be used to determine the optimal number of subjects as well as the optimal number of ratings to take per subject.  The input data needed to run this module is described in the figure below. To allow the software to suggest the most meaningful recommendations, it is essential to provide the following:

  • The maximum number of subjects and the maximum number of ratings per subject you would consider, as shown in the figure. In this example I assume that no more than 10 ratings will be taken from a single subject, and that no more than 500 subjects will be considered.

  • The desired statistical power for detecting the Minimum Detectable Difference. In this example, the desired power is 90% and the Minimum Detectable Difference is 0.1.

[Figure: intraclass correlation sample size determination with AgreeStat360]

Analysis with AgreeStat360

To see how AgreeStat360 processes these inputs to produce the power analysis, please play the video below.  This video can also be watched on youtube.com for more clarity if needed.

Results

The output that AgreeStat360 produces is shown below.  It contains, among other things, the 3-column "Power Table" on the right side, which shows the power associated with the number of subjects and the number of ratings per subject listed in the first 2 columns.

  • The highlighted row shows that with 50 subjects and 8 ratings per subject, for a total of 400 ratings, one can expect to achieve a power of 0.9046.

  • If a particular row is of interest due to the number of ratings per subject (e.g. 8), you may then use the bottom left box to modify the number of subjects only and observe how that affects the power.

  • If a particular row is of interest due to the number of subjects (e.g. 50), you may then use the bottom right box to modify the number of ratings per subject only and observe how that affects the power.

[Figure: intraclass correlation sample size determination]
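
For readers who want to explore outside the software how power responds to the number of subjects and the number of ratings per subject, the following is a minimal Python sketch. It is not AgreeStat360's algorithm, which this page does not describe. It assumes a one-sided F test of a null intraclass correlation rho0 against rho0 plus the Minimum Detectable Difference under the one-way random effects model, and it estimates power by Monte Carlo simulation. The function names, the null value rho0 = 0.7, and the 5% significance level are hypothetical choices made for illustration only, so the resulting numbers should not be expected to match the Power Table.

```python
import random


def variance_ratio(rho, k_ratings):
    """Ratio E[MSB] / E[MSW] in the one-way random effects model
    when the intraclass correlation equals rho."""
    return (1 + (k_ratings - 1) * rho) / (1 - rho)


def draw_central_f(rng, df1, df2):
    """Draw one value from a central F(df1, df2) distribution.
    A chi-square with df degrees of freedom is Gamma(shape=df/2, scale=2)."""
    numerator = rng.gammavariate(df1 / 2, 2) / df1
    denominator = rng.gammavariate(df2 / 2, 2) / df2
    return numerator / denominator


def simulated_power(n_subjects, k_ratings, rho0, rho1,
                    alpha=0.05, reps=20000, seed=1):
    """Monte Carlo estimate of the power of a one-sided F test of
    H0: ICC = rho0 against H1: ICC = rho1 (rho1 > rho0).
    ASSUMPTION: this test and these defaults are illustrative; they are
    not taken from AgreeStat360."""
    rng = random.Random(seed)
    df1 = n_subjects - 1                  # between-subject degrees of freedom
    df2 = n_subjects * (k_ratings - 1)    # within-subject degrees of freedom

    # Under H0, (MSB / MSW) / variance_ratio(rho0) follows a central F.
    # Estimate its (1 - alpha) quantile empirically.
    null_draws = sorted(draw_central_f(rng, df1, df2) for _ in range(reps))
    crit = null_draws[min(int((1 - alpha) * reps), reps - 1)]

    # Under H1, the same statistic is a central F scaled by this factor.
    scale = variance_ratio(rho1, k_ratings) / variance_ratio(rho0, k_ratings)
    rejections = sum(scale * draw_central_f(rng, df1, df2) > crit
                     for _ in range(reps))
    return rejections / reps


if __name__ == "__main__":
    rho0 = 0.7   # hypothetical null ICC, not given on this page
    mdd = 0.1    # Minimum Detectable Difference from the example
    for n in (25, 50, 100):
        print(n, 8, simulated_power(n, 8, rho0, rho0 + mdd))
```

Varying `n_subjects` while holding `k_ratings` fixed, or the reverse, mirrors the two adjustment boxes described above.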