As defined earlier, traditional assessment generally refers to written testing, such as multiple choice, matching, true/false, and fill-in-the-blank items. Learners typically complete written assessments within a specified time, and there is a single correct response for each item. The assessment, or test, assumes that all learners should learn the same thing, and it relies on rote memorization of facts. Responses are often machine scored and offer little opportunity to demonstrate the thought processes characteristic of critical thinking skills.
Traditional assessment lends itself to instructor-centered teaching styles. The instructor teaches the material at a relatively low level of learning, and the measure of performance is correspondingly limited. In traditional assessment, fairly simple grading matrices, such as the one shown in Figure 1, are used. As a result, a satisfactory grade on one lesson may not reflect a learner's ability to apply that knowledge in a different situation.
Figure 1. Traditional grading
Still, tests of this nature do have a place in the assessment hierarchy. Multiple choice, supply type, and other such tests are useful in assessing the learner’s grasp of information, concepts, terms, processes, and rules—factual knowledge that forms the foundation needed for the learner to advance to higher levels of learning.
Characteristics of a Good Written Assessment (Test)
Figure 2. Effective tests have six primary characteristics
Reliability is the degree to which test results are consistent with repeated measurements. If identical measurements are obtained every time a certain instrument is applied to a certain dimension, the instrument is considered reliable. The reliability of a written test is judged by whether it gives consistent measurement to a particular individual or group. Keep in mind, though, that knowledge, skills, and understanding can improve with subsequent attempts at taking the same test, because the first test serves as a learning device.
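In quantitative terms, consistency of this kind is often summarized as a test-retest correlation: the same group takes the same test twice, and the two sets of scores are correlated. The sketch below is only an illustration of that idea, not a procedure from this handbook; the scores and the `test_retest_reliability` helper are hypothetical.

```python
from statistics import correlation  # available in Python 3.10+

def test_retest_reliability(first_scores, second_scores):
    """Pearson correlation between two administrations of the same test.

    Values near 1.0 suggest the test measures consistently (reliably);
    values near 0 suggest the scores are not repeatable.
    """
    return correlation(first_scores, second_scores)

# Hypothetical scores for five learners who took the same written test
# on two different days.
first_attempt = [72, 85, 90, 65, 78]
second_attempt = [75, 83, 92, 68, 80]

print(f"Test-retest reliability: "
      f"{test_retest_reliability(first_attempt, second_attempt):.2f}")
```

Note that, as the paragraph above cautions, a second administration of the same test tends to produce higher scores simply because the first attempt acts as a learning device, which can distort this kind of estimate.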
Validity is the extent to which a test measures what it is supposed to measure, and it is the most important consideration in test evaluation. To estimate validity, several instructors read the test critically and consider its content relative to the stated objectives of the instruction. Items that do not pertain directly to the objectives of the course should be modified or eliminated.
Usability refers to how practical a test is to administer. A usable written test is printed in a type size large enough for learners to read easily, the directions for taking the test and the test items themselves are worded clearly and concisely, and any graphics, charts, or illustrations are clearly drawn. The test should also be easy to grade.
Discrimination is the degree to which a test distinguishes between learners with differing levels of achievement, and it may be appropriate for assessment of academic achievement. However, minimum standards are far more important in assessments leading to pilot certification. When used for classroom evaluation of academic achievement, a test must measure small differences in achievement in relation to the objectives of the course. A test designed for discrimination contains the following (a simple discrimination-index calculation is sketched after this list):
- A wide range of scores
- All levels of difficulty
- Items that distinguish between learners with differing levels of achievement of the course objectives
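A common classroom statistic for the third point is the item discrimination index: compare how often the highest-scoring and lowest-scoring learners answer a given item correctly. The sketch below is a hypothetical illustration of that calculation, not a method prescribed by this handbook; the data and the `discrimination_index` helper are assumptions.

```python
def discrimination_index(item_correct, total_scores, group_fraction=0.27):
    """Upper-minus-lower discrimination index for one test item.

    item_correct: list of 0/1 flags, one per learner, for this item.
    total_scores: each learner's total test score, in the same order.
    Learners are ranked by total score; the index is the difference in the
    item's proportion correct between the top and bottom groups.
    Values near +1 mean the item separates strong and weak performers well;
    values near 0 (or negative) mean it does not discriminate.
    """
    ranked = sorted(zip(total_scores, item_correct), reverse=True)
    n = max(1, int(len(ranked) * group_fraction))
    upper = sum(flag for _, flag in ranked[:n]) / n
    lower = sum(flag for _, flag in ranked[-n:]) / n
    return upper - lower

# Hypothetical results for one multiple-choice item and ten learners.
item = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
totals = [95, 90, 88, 84, 80, 72, 70, 65, 60, 55]
print(f"Discrimination index: {discrimination_index(item, totals):.2f}")
```

An item answered correctly mostly by learners who also did well on the test overall scores near +1; an item that strong and weak performers answer correctly at the same rate scores near 0 and tells the instructor little about differences in achievement.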
Please see Developing a Test Item Bank for information on the advantages and disadvantages of multiple choice, supply type, and other written assessment instruments, as well as guidance on creating effective test items.