Marczyk, DeMatteo, Festinger CH 4

Assessing Reliability

Test-Retest Reliability

Refers to the stability of test scores over time and involves repeating the same test on at least one other occasion. For example, administering the same measure of academic achievement on two separate occasions 6 months apart allows this type of reliability to be estimated. The interval between administrations should be considered with this form of reliability because test-retest correlations tend to decrease as the time interval increases.
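Test-retest reliability is usually reported as a correlation between the two sets of scores. The following is a minimal sketch, assuming a Pearson correlation is used; the function name `pearson_r` and the scores are hypothetical and do not come from the chapter.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each list
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical achievement scores for five students, tested 6 months apart
time1 = [82, 75, 91, 68, 88]
time2 = [80, 78, 93, 65, 85]

test_retest_r = pearson_r(time1, time2)
print(round(test_retest_r, 2))
```

A value near 1 would indicate that students kept roughly the same rank order across the two administrations.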

Split-Half Reliability

Refers to the administration of a single test that is then divided into two equal halves. For example, a 60-question aptitude test that purports to measure one aspect of academic achievement could be broken down into two separate but equal tests of 30 items each. Theoretically, the items on both halves measure the same construct. This approach is much less susceptible to time-interval effects because all of the items are administered at the same time and then split into separate item pools afterward.
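The split described above can be sketched in code. This example assumes an odd/even split of the items (the notes do not say how the halves are formed) and also applies the Spearman-Brown formula, a commonly used adjustment that is not mentioned in these notes, to estimate reliability at full test length. The function names and the response data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per respondent.

    Returns (r_half, r_full): the correlation between the two half-test
    totals, and the Spearman-Brown estimate for the full-length test.
    """
    # Assumed split: items in positions 1, 3, 5, ... vs. 2, 4, 6, ...
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    # Spearman-Brown adjustment (an addition, not from the notes)
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full


# Hypothetical data: 5 respondents, 6 items scored 1 (correct) or 0
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]

r_half, r_full = split_half_reliability(responses)
print(round(r_half, 2), round(r_full, 2))
```

The adjustment is included because each half contains only half the items, so the raw half-to-half correlation tends to understate the reliability of the full test.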

Alternate-Form Reliability

Is expressed as the correlation between different forms of the same measure, where the items on each form represent the same item content and construct. This approach requires two different forms of the same instrument, which are then administered at different times. The two forms must cover identical content and have a similar difficulty level. The scores on the two forms are then correlated.

Inter-Rater Reliability

Is used to determine the agreement between different judges or raters when they are observing or evaluating the performance of others. For example, assume you have two evaluators assessing the acting-out behavior of a child. You operationalize "acting-out behavior" as the number of times that the child refuses to do his or her schoolwork in class. The extent to which the evaluators agree on whether or when the behavior occurs reflects this type of reliability.
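One way to quantify the agreement in the example above is sketched below. It assumes each evaluator records, for the same set of observation intervals, whether the child refused schoolwork (1) or did not (0). Percent agreement is the simplest index; Cohen's kappa, which is not mentioned in these notes, is a common alternative that corrects for agreement expected by chance. The function names and ratings are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Proportion of observations on which the two raters gave the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty, equal-length lists of ratings")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    categories = set(rater_a) | set(rater_b)
    # Chance agreement: product of each rater's marginal proportions
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1:
        # Both raters used a single identical category throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes for 10 observation intervals
# 1 = child refused schoolwork, 0 = child did not refuse
evaluator_1 = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
evaluator_2 = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]

print(round(percent_agreement(evaluator_1, evaluator_2), 2))
print(round(cohens_kappa(evaluator_1, evaluator_2), 2))
```

Here the evaluators disagree on one interval out of ten, so percent agreement is 0.9, while kappa is lower because some of that agreement would be expected by chance alone.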

