Teachers’ Estimation of Problem Difficulty

 

 

 

Fong-lok LEE
Rex M. Heyworth
The Chinese University of Hong Kong
Paper presented at The Thirteenth Annual Conference, Hong Kong Educational Research Association.
(Draft)

 

 

Teachers commonly estimate the difficulty of the problems they assign to their students. The purpose may be either to order the problems in an exercise by difficulty, so that students are more motivated to continue with their work, or to choose the problems most appropriate to the students’ academic standard. Either purpose requires the teacher to estimate difficulty accurately; otherwise it cannot be achieved.

Problem difficulty is usually measured by the item difficulty ratio and by response time. The first is the ratio of the number of respondents who answer correctly to the total number of responses to the problem (Gronlund, 1981), while the second is defined as the total time that elapses between the presentation of the item and the response (Mason, Zollman, Bramble, & O’Brien, 1992; Plake, Glover, Kraft, & Dinnel, 1984). As both measures require the item to be tested with a group of students, it is unlikely that a teacher would use them as the basis for an estimate made on the fly.

On the other hand, all a teacher has when estimating the difficulty of a problem is the problem expression itself. It is thus possible that the estimate is based on observable features of the problem expression. This assertion is supported by researchers such as Jerman (1983), Lester (1980), Silver & Thompson (1984) and Zweng, Turner, & Geraghty (1979), who suggested that mathematics problems are more complex and more difficult to solve when they require several steps to obtain a solution, when subgoals must be reached before a solution can be obtained, and when the problems contain numbers of high computational complexity. Since these factors can be observed from the problem expression, teachers may be using them to estimate problem difficulty. The present study therefore aims to identify these factors and to investigate the accuracy of teachers’ estimation.

Procedure and Results

If the problem difficulty measure is used to arrange problems in an order that maintains students’ motivation, or to choose problems suited to the students’ academic standard, it is the students’ perception of difficulty during the problem solving process that matters. This perception was therefore obtained and treated as the frame of reference. Two other measures of problem difficulty, namely the item difficulty ratio and the teachers’ estimation, were also obtained, and the accuracy of these measures was determined by comparing them with the students’ perception. In addition, factors that predict these measures were identified and compared. The following paragraphs describe how this was done.

Students’ Perception of Problem Difficulty

One hundred and twenty-five secondary school students in Hong Kong took two mathematics tests containing 20 and 12 problems respectively (Appendix A). Each item was accompanied by a five-point scale on which the student indicated how difficult the item was, with 1 as very easy and 5 as very difficult. Students were asked to mark the scale as they solved each problem. The ratings of the 125 students were averaged for each item; the data are given in Table 1.

Item difficulty ratio

The test was then marked and the total number of correct responses for each question in the tests was counted. The item difficulty ratio for each question was calculated by using this formula:

Item difficulty ratio = (number of correct responses to the item) / (total number of responses to the item)

The data obtained are in Table 1.
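As a minimal sketch of the calculation just described, the ratio for one item can be computed from a list of marked responses. The function name and the input format (one boolean per student) are illustrative assumptions, not part of the original study.

```python
def item_difficulty_ratio(responses):
    """Return the fraction of responses to one item that are correct.

    `responses` holds one boolean per student who answered the item
    (True = correct). The input format is an assumption for illustration.
    """
    if not responses:
        raise ValueError("at least one response is required")
    return sum(responses) / len(responses)


# Hypothetical marking data for one item answered by 5 students.
marks = [True, True, False, True, False]
print(item_difficulty_ratio(marks))  # 0.6
```

A higher ratio therefore indicates an easier item, which is why this measure runs in the opposite direction to the five-point difficulty ratings.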

Teachers’ Estimation of Problem Difficulty

Twenty-eight teachers completed a questionnaire containing the same items and the same estimation scale as the mathematics test papers, but did not solve the problems. Most of the participating teachers were studying for the Diploma in Education at The Chinese University of Hong Kong. All except one were part-time students with full-time jobs who had taught for several years. Colleagues of some of these students also participated. Table 2 shows the profile of the participating teachers.

Based on the data in Table 2, the participating teachers had an average of 5.03 years of teaching experience, of which 4.70 years were spent teaching Secondary 3, 4 or 5 mathematics, in which logarithms are taught. All 28 teachers were university graduates, 13 of them holding a Diploma of Education or a master’s degree. Hence, these teachers should have had the background to rate the different factors as well as problem difficulty. The teachers’ estimates of the difficulty level of each item were averaged and are reported in column 5 of Table 1.

Factors Affecting Problem Difficulty

To find out how the teachers estimated problem difficulty, the questionnaire asked them not only to predict the item difficulty but also to identify factors they thought would be important in predicting it. Six factors, called the complexity factors, were assumed to affect how teachers predict problem difficulty. Each factor is described separately below.

 

Table 1

Table showing Teachers’ Estimation and Students’ Perception of Problem Difficulty
 
 

Paper  Question  Students’ Perception  Item Difficulty Ratio  Teachers’ Estimation
1      1         1.40                  0.82                   1.63
       2         1.80                  0.69                   2.85
       3         2.22                  0.60                   2.78
       4         1.90                  0.78                   2.11
       5         1.71                  0.74                   1.81
       6         1.85                  0.62                   2.26
       7         2.28                  0.52                   2.56
       8         1.62                  0.82                   2.74
       9         1.84                  0.60                   3.11
       10        2.46                  0.49                   2.85
       11        2.12                  0.62                   2.44
       12        2.66                  0.38                   2.78
       13        2.38                  0.63                   3.63
       14        1.87                  0.70                   3.11
       15        2.77                  0.28                   3.22
       16        1.94                  0.71                   3.70
       17        2.36                  0.57                   3.56
       18        3.17                  0.10                   3.74
       19        2.54                  0.54                   4.00
       20        2.53                  0.40                   3.26
2      A1        2.28                  0.43                   3.33
       A2        1.97                  0.78                   2.96
       A3        1.86                  0.73                   2.56
       A4        2.46                  0.55                   3.48
       A5        2.33                  0.70                   3.67
       A6        2.56                  0.59                   3.89
       B1        2.87                  0.34                   3.44
       B2        2.44                  0.61                   3.23
       B3        2.45                  0.64                   3.11
       B4        2.45                  0.33                   3.37
       B5        3.17                  0.37                   4.00
       B6        3.36                  0.27                   4.33
 

Table 2

Profile of Teachers Participating in the Estimation of Problem Difficulty
 
Teacher Characteristics No. of teachers
Sex  
Male 24
Female 4
Age group  
20-25 4
26-30 14
31-35 5
36-40 2
>40 3
Education level  
Secondary 0
Post-Secondary 0
University Degree 15
University Degree + Diploma of Education 7
Master or Above 6
Mathematics as major subject studied  
Yes 26
No 2
Teaching experience:  
0-2 years 3
3-4 years 13
5-6 years 5
7-8 years 3
more than 9 years 4
Teaching Experience (Secondary 3,4,5)  
0-2 years 5
3-4 years 12
5-6 years 5
7-8 years 3
more than 9 years 3
 
 

Table 3

Problem Difficulty as Predicted by the Complexity Factors
 
Paper No.  Q. No.  f1  f2   f3  f4  f5  f6
1          1       1   2.5  0   2   2   1
           2       2   3.0  1   2   2   1
           3       4   6.5  2   1   1   1
           4       1   3.0  1   2   2   1
           5       1   3.0  1   2   2   1
           6       1   5.0  1   2   2   1
           7       3   4.0  2   1   1   1
           8       2   3.5  1   2   1   1
           9       2   1.5  2   3   3   1
           10      2   3.0  2   2   2   1
           11      3   5.0  1   1   1   1
           12      2   5.0  1   3   3   1
           13      2   2.5  0   3   4   1
           14      3   3.0  3   2   2   1
           15      3   8.0  4   3   3   1
           16      1   2.0  2   2   1   1
           17      4   3.0  3   2   2   1
           18      5   3.8  2   4   4   1
           19      3   6.0  3   3   2   1
           20      1   5.0  1   1   1   1
2          A1      2   2.5  2   1   1   2
           A2      1   1.5  1   2   2   2
           A3      1   1.5  1   2   2   2
           A4      1   2.3  1   3   3   2
           A5      2   7.5  2   3   3   2
           A6      2   4.0  4   4   4   2
           B1      1   5.0  1   1   2   3
           B2      1   3.5  2   1   2   3
           B3      2   4.5  1   1   2   3
           B4      2   6.0  2   1   2   3
           B5      1   5.5  2   2   3   3
           B6      4   9.5  2   2   3   3
 

Perceived number of difficult steps during the problem solving process (f1)

This measure reflects whether students would encounter difficulties in the solving process. Difficult steps were assumed to be those at which students usually made non-trivial errors. Experienced teachers would normally be able to identify these steps just by looking at the problem expressions, and so could use them to estimate problem difficulty. For the present study, the measure was obtained by counting the opportunities for frequent errors to occur, where frequent errors are errors that students made more than three times in the mathematics test (Appendix A). The number of difficult steps counted for each question is shown in Table 3.

Number of steps required to finish the problem (f2)

This is defined as the number of steps that an expert would require to finish a problem. Since a problem may have more than one solution path, the number of steps on the shortest path was counted. The number of steps required was calculated by a computer system called Electronic Homework (Lee, 1996). The results are given in Table 3.

Numerical complexity (f3)

A measure of numerical complexity was developed. Intuitively, the larger a number is, the more complex it should be, since it is harder to calculate with larger numbers. However, to avoid an unnecessarily detailed scale, numerical complexity was measured by assigning weights to the numerical values instead of using the values themselves. Every value between one and ten was assigned a weight of 1, while decimals and numerical values greater than ten were assigned a weight of 2. The sum of these weights gave the numerical complexity of the problem, which is shown in Table 3.
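The weighting rule above can be sketched as follows. This is an illustration under stated assumptions, not the authors’ scoring procedure: the text does not say which numerals in an expression were counted or how values below one that are not decimals were treated, so the sketch counts every numeral it finds and gives such values a weight of 1. It should not be expected to reproduce the f3 column of Table 3. The function name and the example expression are hypothetical.

```python
import re


def numerical_complexity(expression):
    """Sum the weights of all numerals found in a problem expression.

    Weight 2 for decimals and for values greater than ten, weight 1
    otherwise. Counting every numeral is an assumption of this sketch.
    """
    total = 0
    for token in re.findall(r"\d+(?:\.\d+)?", expression):
        is_decimal = "." in token
        total += 2 if (is_decimal or float(token) > 10) else 1
    return total


# Hypothetical expression: 25 has weight 2 and 4 has weight 1.
print(numerical_complexity("log 25 + log 4"))  # 3
```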

Number of occurrences of "log" (f4)

This is simply the number of logarithmic functions that can be found in the problem. Such numbers were counted and are listed in Table 3.
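As a small illustration, this count can be obtained from a plain-text problem expression. The function name is hypothetical, and the example uses the one item whose text is legible in Appendix A.

```python
import re


def count_log(expression):
    """Count occurrences of the function name "log" in an expression."""
    return len(re.findall(r"log", expression))


print(count_log("log(9x-26)=2"))  # 1
```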

Number of operations in the question (f5)

This again was obtained by simply counting the number of operations in the problem. An operation is any one of the following: addition, subtraction, multiplication, division and exponentiation. The counts are listed in Table 3.

Degree of familiarity of the student with the question (f6)

Students might find some problems more familiar than others, and it is possible that they would find the familiar problems easier to solve. Students normally learn the topic of logarithms in three stages: first the simplification of numerical expressions, second the simplification of expressions involving variables, and third the solving of logarithmic equations. Further, knowledge learned at the earlier stages is also used at later stages. It is therefore reasonable to assume that problems learned at earlier stages are more familiar to the students. This forms the basis for the degree of familiarity assigned to each problem. For simplicity, all problems on the simplification of numerical expressions were assigned a value of 1, those on the simplification of expressions involving variables a value of 2, and those on the solving of logarithmic equations a value of 3. The values assigned to the problems in the test can be found in Table 3.

Teachers’ Rating of the Relative Importance of the Complexity Factors

Teachers were requested to rate the importance of each of these factors on a five-point scale before they estimated the problem difficulty. Besides rating these suggested factors, the teachers were also required to add any other factors which they thought were important. The data obtained showed that no additional factors were suggested by the teachers. Values of the relative importance of the complexity factors as rated by the teachers can be found in Table 4.

 

 

Table 4

Teachers' rating on importance of factors affecting problem difficulty
 
Factor  Level of importance
(f1) Perceived no. of difficult steps during the problem solving process  (r1) 4.00
(f2) No. of steps required to finish the problem  (r2) 3.43
(f3) Numerical complexity  (r3) 3.86
(f4) No. of occurrences of "log"  (r4) 2.96
(f5) No. of operations in the question  (r5) 3.21
(f6) Students’ degree of familiarity with the question  (r6) 3.93
 

From Table 4, it can be seen that all but one of the levels of importance were greater than 3, the mid-value, and that the exception, the number of occurrences of "log" (2.96), fell only just below it. This shows that the complexity factors were considered quite important by the teachers.

Comparing the two Measures of Problem Difficulty

The correlation coefficient between the teachers’ estimation and the students’ perception was found to be .74, while that between the item difficulty ratio and the students’ perception was -.86. The negative sign of the latter coefficient reflects the fact that when a problem is perceived as more difficult, fewer students answer it correctly, which lowers the item difficulty ratio. Hence, teachers’ estimation was not as accurate as the item difficulty ratio, although both agree with the students’ perception to a considerable extent.
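The comparison can be sketched with a plain Pearson correlation over the 32 item means transcribed from Table 1. That the reported coefficients were computed over item means is an assumption of this sketch. Because the table values are rounded, the output may differ slightly from the figures quoted above, so no expected output is shown.

```python
from math import sqrt

# Item means transcribed from Table 1 (Paper 1 Q1-20, then Paper 2 A1-B6).
students = [1.40, 1.80, 2.22, 1.90, 1.71, 1.85, 2.28, 1.62, 1.84, 2.46,
            2.12, 2.66, 2.38, 1.87, 2.77, 1.94, 2.36, 3.17, 2.54, 2.53,
            2.28, 1.97, 1.86, 2.46, 2.33, 2.56, 2.87, 2.44, 2.45, 2.45,
            3.17, 3.36]
ratio = [0.82, 0.69, 0.60, 0.78, 0.74, 0.62, 0.52, 0.82, 0.60, 0.49,
         0.62, 0.38, 0.63, 0.70, 0.28, 0.71, 0.57, 0.10, 0.54, 0.40,
         0.43, 0.78, 0.73, 0.55, 0.70, 0.59, 0.34, 0.61, 0.64, 0.33,
         0.37, 0.27]
teachers = [1.63, 2.85, 2.78, 2.11, 1.81, 2.26, 2.56, 2.74, 3.11, 2.85,
            2.44, 2.78, 3.63, 3.11, 3.22, 3.70, 3.56, 3.74, 4.00, 3.26,
            3.33, 2.96, 2.56, 3.48, 3.67, 3.89, 3.44, 3.23, 3.11, 3.37,
            4.00, 4.33]


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r_teachers = pearson(students, teachers)
r_ratio = pearson(students, ratio)
print(f"students' perception vs teachers' estimation: {r_teachers:.2f}")
print(f"students' perception vs item difficulty ratio: {r_ratio:.2f}")
```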

To investigate how teachers and students rate the problems, multiple linear regression was used to predict the three difficulty measures from the complexity factors. The results are shown in Table 5.

Table 5

Summary of regression coefficients found

 
 
  Item Difficulty Ratio Students' Perception Teachers' Estimation
Regression Coefficients 0.52*** 0.81*** 0.72***
*p<.05, **p<.01, ***p<.001.

 

Table 5 shows that the item difficulty ratio, the teachers’ estimation and the students’ perception can all be predicted by the complexity factors, although with different degrees of accuracy. Further investigation shows that not all the complexity factors are required to predict these dependent variables. Table 6 lists the variables that appear in the regression equations.
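A regression of this kind can be sketched with ordinary least squares, here predicting the teachers’ estimation (Table 1) from the six complexity factors (Table 3). The paper does not state the variable selection procedure or the software used, so entering all six factors at once is an assumption of this sketch, and it is not expected to reproduce Table 5 or Table 6. The helper names are hypothetical.

```python
from math import sqrt

# Rows of (f1, f2, f3, f4, f5, f6) transcribed from Table 3.
factors = [
    (1, 2.5, 0, 2, 2, 1), (2, 3.0, 1, 2, 2, 1), (4, 6.5, 2, 1, 1, 1),
    (1, 3.0, 1, 2, 2, 1), (1, 3.0, 1, 2, 2, 1), (1, 5.0, 1, 2, 2, 1),
    (3, 4.0, 2, 1, 1, 1), (2, 3.5, 1, 2, 1, 1), (2, 1.5, 2, 3, 3, 1),
    (2, 3.0, 2, 2, 2, 1), (3, 5.0, 1, 1, 1, 1), (2, 5.0, 1, 3, 3, 1),
    (2, 2.5, 0, 3, 4, 1), (3, 3.0, 3, 2, 2, 1), (3, 8.0, 4, 3, 3, 1),
    (1, 2.0, 2, 2, 1, 1), (4, 3.0, 3, 2, 2, 1), (5, 3.8, 2, 4, 4, 1),
    (3, 6.0, 3, 3, 2, 1), (1, 5.0, 1, 1, 1, 1), (2, 2.5, 2, 1, 1, 2),
    (1, 1.5, 1, 2, 2, 2), (1, 1.5, 1, 2, 2, 2), (1, 2.3, 1, 3, 3, 2),
    (2, 7.5, 2, 3, 3, 2), (2, 4.0, 4, 4, 4, 2), (1, 5.0, 1, 1, 2, 3),
    (1, 3.5, 2, 1, 2, 3), (2, 4.5, 1, 1, 2, 3), (2, 6.0, 2, 1, 2, 3),
    (1, 5.5, 2, 2, 3, 3), (4, 9.5, 2, 2, 3, 3),
]
# Teachers' mean estimates transcribed from Table 1, in the same item order.
teachers = [1.63, 2.85, 2.78, 2.11, 1.81, 2.26, 2.56, 2.74, 3.11, 2.85,
            2.44, 2.78, 3.63, 3.11, 3.22, 3.70, 3.56, 3.74, 4.00, 3.26,
            3.33, 2.96, 2.56, 3.48, 3.67, 3.89, 3.44, 3.23, 3.11, 3.37,
            4.00, 4.33]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(rows, y):
    """Least-squares fit with intercept; returns (coefficients, R squared)."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot


beta, r2 = fit(factors, teachers)
print("intercept and coefficients f1..f6:", [round(b, 3) for b in beta])
print("multiple R:", round(sqrt(r2), 2), "R squared:", round(r2, 2))
```

Swapping `teachers` for the students’ perception or item difficulty ratio column of Table 1 gives the corresponding fit for the other two measures.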

Table 6

Variables in the Equations to Predict the Problem Difficulty Measures

 
 
Item Difficulty Ratio  Students’ Perception  Teachers’ Estimation
machstep               familar               familar
                       machstep              nolog
                       notmfac               numcomp
                       pererr

Note. machstep = Number of steps required for the computer to finish the problem; familar = Familiarity of the problems to the students; notmfac = Number of operations in the problem expression; pererr = Perceived no. of difficult steps during the problem solving process; numcomp = Numerical complexity; nolog = No. of occurrences of "log".

From Table 6, it can be seen that only one of the complexity factors appears in the regression equation that predicts the item difficulty ratio, while 4 and 3 of the factors appear in the equations that predict students’ perception and teachers’ estimation respectively. That only one factor appears in the equation for the item difficulty ratio suggests that this measure depends on the number of steps required to finish the problem, since the more steps are required, the greater the chance for a student to make errors. Students’ perception, on the other hand, is found to depend on 4 factors. Two of them, the number of steps required to finish the problem and the number of perceived errors, may require more in-depth thinking on the part of the students, while the other two, the number of operations within the problem expression and the degree of familiarity of the problems, can be observed directly from the problem expression. Hence students’ perception of problem difficulty is based on both “surface” and “deep” characteristics of the problem expressions.

Finally, only 3 of the factors appear in the equation that predicts teachers’ estimation: the familiarity of the problems, the number of occurrences of “log” and the numerical complexity (Table 6). As these factors are easily observable from the problem expressions alone, it seems that the teachers who participated in this study based their judgment on easily obtainable, superficial variables. That might explain why their estimation did not agree with the students’ perception as closely as the item difficulty ratio did.

Conclusion and Discussion

The results of the present study show that although teachers can, to a certain extent, correctly estimate problem difficulty, their estimation is not as good as the item difficulty ratio. Moreover, although the teachers claimed that all 6 complexity factors are important in their estimation, only 3 of them were actually used, and these 3 factors were ones that can be observed from the problem expressions alone. It seems that although the teachers know which factors matter for estimating problem difficulty accurately, in practice they base their estimation on “surface” features. This might be the reason why it is not as good a measure as the item difficulty ratio.

The problems used in the present study involved simplifying logarithmic expressions and solving logarithmic equations, which deal with algebraic expressions and equations in one variable. The results may therefore also be applicable to algebraic problems in general. For problems outside algebra, further investigation is needed. Also, as the sample size of the present study is quite limited, whether the results can be generalized to all teachers needs further investigation.

 
References
Gronlund, N. E. (1981). Measurement and evaluation in teaching. NY: Macmillan.
Holzman, T. G., Pellegrino, J. W., & Glaser, R. (1983). Cognitive variables in series completion. Journal of Educational Psychology, 75(4), 603-618.
Jerman, M. E. (1983). Problem length as a structural variable in verbal arithmetic problems. Educational Studies in Mathematics, 5, 109-123.
Lane, S. (1991). Use of restricted item response models for examining item difficulty ordering and slope uniformity. Journal of Educational Measurement, 28(4), 295-300.
Lee, F. L. (1996). Electronic Homework: An intelligent tutoring system in logarithms. Unpublished PhD dissertation, The Chinese University of Hong Kong.
Lester, F. K. (1980). Problem solving: Is it a problem? In M. M. Lindquist (Ed.), Selected issues in mathematics education (pp. 29-45). Chicago: McCutchan.
Linville, W. J. (1970). The effects of syntax and vocabulary upon the difficulty of verbal arithmetic problems with fourth grade students. Dissertation Abstracts International, 30, 4310A.
Loftus, G. R., & Loftus, E. F. (1976). Human memory: The processing of information. New York: Erlbaum.
Marzano, R. J., & Jesse, D. M. (1987). A study of general cognitive operations in two achievement test batteries and their relationship to item difficulty. Washington, D.C.: Office of Educational Research and Improvement, Department of Education. (ERIC Document Reproduction Service No. ED 299321).
Mason, E., Zollman, A., Bramble, W. J., & O’Brien, J. (1992). Response time and item difficulty in a computer-based high school mathematics course. Focus on Learning in Mathematics, 14(3).
Mayer, R. E. (1975). Information processing variables in learning to solve problems. Review of Educational Research, 45, 525-541.
Newman, D. L., Kundert, D. K., Lane, D. S., & Bull, K. S. (1988). Effect of varying item order on multiple-choice test scores: Importance of statistical and cognitive difficulty. Applied Measurement in Education, 1(1), 89-97.
Plake, B. S., Glover, J. A., Kraft, R. G., & Dinnel, D. (1984). Cognitive capacity usages in responding to test items. Journal of Psychoeducational Assessment, 2(4), 333-343.
Silver, E. A., & Thompson, A. G. (1984). Research perspectives on problem solving in elementary school mathematics. The Elementary School Journal, 84, 529-545.
Zweng, J. J., Turner, J., & Geraghty, J. (1979). Children's strategies of solving verbal problems. Columbus, OH. (ERIC Document Reproduction Service No. ED 178359).
 

Appendix A
Mathematics Test Items used for the Measuring of Problem Difficulty

PAPER I
1. 

2. 

3. 

4. 

5. 

6. 

7. 

8. 

9. 

10. 

11. 

12. 

13.

14. 

15. 

16. 

17.

18.
19.

20.

PAPER IIA
1)

2)

3)

4)

5)

6)
PAPER IIB
1)

2) log(9x-26)=2

3) 

4) 

5) 

6)

Abstract

Teachers usually estimate the difficulty of problems before assigning them to students. The purpose may be either to order the problems in an exercise by difficulty, so that students are more motivated to continue with their work, or to choose the problems most appropriate for a student to do. Either purpose requires teachers to estimate problem difficulty accurately.

Twenty-eight mathematics teachers were asked to estimate the difficulty of two sets of logarithmic problems and, at the same time, to evaluate the importance of 6 complexity factors that contributed to their estimation. The 6 complexity factors are the number of steps to finish the problem, the number of difficult steps perceived, the number of operations and the numerical complexity of the numbers in the problem expression, the number of occurrences of "log", and the degree of familiarity of the problem to the students. Results show that teachers rated these factors as quite important in their estimation. Analysis of the data collected also shows that although their estimation was correlated with students’ perception of problem difficulty, the correlation is not as high as that between students’ perception and the traditional measure, the item difficulty ratio. Moreover, in contrast with the teachers’ claim that all 6 complexity factors are important to their estimation, multiple regression analysis found that their estimation was based on only 3 of the factors: familiarity of the problem, number of occurrences of “log” and numerical complexity, all of which are obtainable just by looking at the problem expressions. Other factors, such as the number of steps required and the number of perceived difficult steps, which would require more in-depth thinking, were not found in the equation to predict teachers’ estimation. Implications of the findings are then discussed.