This study describes how a computer system can be made to simulate the errors generated by students in solving logarithms problems. A set of 14 primary mal-rules was hypothesized to be the causes of students’ errors and was incorporated into a computer system with the correct rules to solve logarithms problems. The system was found able to explain 90.71% of the errors by composing the primary mal-rules and the correct rules. Also, when 5 experienced teachers were asked to diagnose the same set of errors, the reasons they gave were found to agree with the primary mal-rules. Results of the study suggest that it is possible to replace the traditional way of representing errors with mal-rules by the use of fewer primary mal-rules. The use of primary mal-rules would make the system simpler and capable of diagnosing more errors.
A major function of an intelligent tutoring system is to diagnose students’ errors. This capability is enabled by storing a library of mal-rules (incorrect rules), each of which represents a previously recorded error. Each mal-rule is accompanied by a tutoring script that tries to help the student correct the error. Besides mal-rules, an intelligent tutoring system also includes a set of correct rules, which describe the correct actions to be taken when solving problems. Whenever a student’s input is to be diagnosed, the computer system tries to match the input expression against the stored rules. If a matching rule is found and it is a correct rule, the input expression is considered correct. If a matching mal-rule is found, the input expression is considered an error, and the corresponding tutoring script is displayed to help the student correct it.
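The matching process described above can be sketched as follows. This is a minimal illustration only, not the implementation of any particular tutoring system: the `Rule` class, the two library entries and the `diagnose` function are hypothetical names introduced for the example, and expressions are compared as plain strings rather than parsed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """A stored rule: rewrites `before` into `after` (hypothetical structure)."""
    name: str
    before: str
    after: str
    correct: bool          # True for a correct rule, False for a mal-rule
    script: str = ""       # tutoring script shown when a mal-rule matches


# Toy library, assumed for illustration only.
LIBRARY: List[Rule] = [
    Rule("product-law", "log 2 + log 5", "log 10", correct=True),
    Rule("mal-sum", "log 2 + log 5", "log 7", correct=False,
         script="log a + log b equals log(ab), not log(a + b)."),
]


def diagnose(before: str, after: str,
             library: List[Rule] = LIBRARY) -> Tuple[str, Optional[Rule]]:
    """Match one student step against the stored rules."""
    for rule in library:
        if rule.before == before and rule.after == after:
            return ("correct" if rule.correct else "error"), rule
    # No stored rule matches: the step cannot be diagnosed.
    return "undiagnosed", None


verdict, rule = diagnose("log 2 + log 5", "log 7")
if rule is not None and not rule.correct:
    print(verdict, "->", rule.script)
```

A step that matches no stored rule comes back as `"undiagnosed"`, which is the case in which such a system has no tutoring script to offer.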
This matching process works well in most cases. However, as there may be many possible errors and some of them are infrequent (VanLehn, 1982; Sleeman, 1984; Payne & Squibb, 1990), it may be impossible for any system, no matter how large, to include all of them. One option is simply to leave some of the infrequent errors undiagnosed, but the system could then offer no help to the students who make them.
Composition of Rules
The use of mal-rules to represent errors has the disadvantage that all errors are treated equally: no matter what the cause of the error is, each error is represented by a mal-rule. However, rules are declarative representations of procedures (pieces of procedural knowledge), and a series of procedures may be composed into one, provided the action part of the preceding procedure is identical to the condition part of the one that follows (Anderson, 1983; 1990). Some observed errors may therefore be the result of the composition of two or more procedures. Representing such an error as a single mal-rule may hide the fact that it stems from one erroneous component, which makes it hard to understand why the error occurs. The situation is even worse in some relatively complex topics such as logarithms, in which the use of the notation “log” causes a lot of confusion.
What is required is that some mal-rules be decomposed so that the erroneous components can be identified. Identifying the erroneous components not only helps to explain why errors occur, but also helps to reduce the number of rules that have to be stored in a system. One erroneous production, when composed with other rules, may produce several mal-rules. If these erroneous productions can be singled out and represented as rules in the system, then the original mal-rules may no longer be needed, since they can be generated on the fly (i.e., when diagnosing) by composing the erroneous productions with others. Although the system may become slower, since the composition takes time, it can diagnose exactly the same errors as before.
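The idea of generating mal-rules by composition can be sketched as follows. This is a minimal illustration under stated assumptions, not the rule format of any actual system: the `Rule` class, the `compose` and `derive` functions, the two example rules and the `log*A` notation for "log times A" are all hypothetical. Two rules compose only when the action of the first is identical to the condition of the second, as in the composition condition cited from Anderson.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    name: str
    condition: str   # expression pattern the rule applies to
    action: str      # expression pattern the rule produces
    correct: bool


def compose(first: Rule, second: Rule) -> Optional[Rule]:
    """Compose two rules if the action of `first` equals the condition of `second`."""
    if first.action != second.condition:
        return None
    return Rule(
        name=f"{first.name}+{second.name}",
        condition=first.condition,
        action=second.action,
        # A composed rule is correct only if every component is correct.
        correct=first.correct and second.correct,
    )


def derive(primary: Rule, others: List[Rule]) -> List[Rule]:
    """Generate rules on the fly by composing a primary mal-rule with other rules."""
    derived = []
    for other in others:
        for a, b in ((primary, other), (other, primary)):
            composed = compose(a, b)
            if composed is not None:
                derived.append(composed)
    return derived


# Hypothetical primary mal-rule: "log A" is misread as "log times A".
LOG_AS_FACTOR = Rule("log-as-factor", "log A + log B", "log*A + log*B", correct=False)
# Correct algebraic rule: factor out a common term.
FACTOR_OUT = Rule("factor-out", "log*A + log*B", "log*(A + B)", correct=True)

for rule in derive(LOG_AS_FACTOR, [FACTOR_OUT]):
    print(rule.name, ":", rule.condition, "->", rule.action)
```

In this sketch only the primary mal-rule and the correct rule are stored. The composite mal-rule is rebuilt each time it is needed, which trades diagnosis time for a smaller rule library.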
The purpose of the present study is thus to investigate the possibility of finding the erroneous procedures which, when composed with others, would produce all the mal-rules observed. To distinguish these productions from the others, they are referred to as primary errors and the rules representing them as primary mal-rules. In order to verify whether primary mal-rules can really explain students’ errors, teachers’ diagnoses of the same set of students’ errors were collected and compared with the primary mal-rules. Only when the two agree with each other can the primary mal-rules be regarded as a valid explanation. The following sections describe how this was done.
Three stages, collection of data, analysis of data and coding of rules to be used in the computer system, were involved in obtaining the results. The following subsections describe each of these in detail.
Collection of Data
Data on students’ errors and teachers’ diagnoses on the causes of errors were obtained separately as follows:
Students’ Errors
Two test papers, shown in Appendix A, were used to collect students’ mal-rules. The two papers contained items similar to those frequently found in common textbooks, with Paper 1 concentrating on simplifying expressions containing logarithms of numbers and Paper 2 concentrating on both simplifying expressions containing logarithms of variables and solving logarithmic equations. In terms of problem difficulty and degree of abstraction, the items in Paper 1 are, on average, easier than those in Paper 2, although both papers contain easy items as well as hard ones.
One hundred and twenty-five secondary four students from two subsidized schools of Hong Kong participated in the test programme. As the aim of these tests was to collect mal-rules, these students were recommended by teachers as the most likely to make errors. It was stipulated that these students should be of average academic ability in mathematics, the reason being that the good students would not make mistakes and poor students with insufficient knowledge to continue would soon give up and thus produce no mal-rules.
The mal-rule collecting tests were administered in October 1995. Students sat the tests during their normal lessons and were invigilated by their own teachers. The tests were administered in two separate periods and the students were provided with enough time to finish the problems. Scripts were then marked and errors found were recorded and coded as mal-rules with methods to be described in later sections.
Teachers’ Diagnoses on Causes of Errors
The teachers’ diagnoses on causes of students’ errors were obtained in order to be compared with the hypothesized causes of errors that were used in the computer system. Five experienced teachers were shown the list of mal-rules obtained and were asked to diagnose why the students had made such errors. The results collected were analyzed and are reported in Table 4.
Analysis of Data
Students' test papers in the two mal-rule collecting tests were analyzed to obtain the mal-rules. As the focus of the present study was on logarithms, and it was assumed that students should be quite familiar with basic arithmetic skills such as multiplication, division and exponentiation, errors in these skills were broadly categorized into groups, with one mal-rule used to represent each whole group of errors. An example is errors such as expressing the number 125 as […] and expressing the number 8 as […]; these were grouped together and represented as one mal-rule. The following shows the mal-rule that represents such errors:
![]()
This broad categorization is not an accurate representation of errors, since errors of identical form might not have the same cause. For example, expressing the number 125 as […] might be a careless slip, since […] was actually intended. On the other hand, expressing the number 8 as […] might be due to the misunderstanding that the product of 4 2’s is equivalent to […]. However, if such errors were placed into different groups according to the reasons behind them, only a few errors would fall into each group. As it would be very inefficient for any cognitive system to take care of all these infrequent mal-rules, the above categorization enables the system to work more efficiently. However, this measure was not applied to errors related to logarithms or to frequent errors. In these cases, errors were represented as separate rules according to the reasoning behind them. Examples of these are:
[…],
which represents errors such as expressing […] as […]; […] as […] and others.
Besides the above classification of errors, the analysis was based on the following principles:
If […] was expressed by a student as […] [Rule 3], then it would be treated as being composed of two errors: first, expressing […] and secondly expressing […] as […].
The mal-rules collected were grouped in terms of their assumed nature and causes. For example, errors like those represented by [Rule 2] above could be placed in the same group since they were thought to be caused by misinterpreting the expression "log A" as "log times A" for any expression A. Fourteen such groups were formed. Groups of a similar nature were then placed in the same category. A total of 6 such categories were then formed. Table 1 shows the descriptions of these groups and categories and the number of occurrences of the errors corresponding to mal-rules represented by the groups.
As seen in Table 1, the mal-rules are divided into five categories. The first two categories are related to logarithms, one on simplifying logarithmic expressions and the other on solving logarithmic equations. The next two categories contain mal-rules about other algebraic manipulations, one for solving equations and the other for simplifying expressions. Finally, the last category contains the rules representing the errors called slips, which are possibly caused by careless work.
The purpose of grouping these mal-rules was to enable the hypothesizing of the primary mal-rules. As the present study focuses on errors in solving logarithms problems, only groups about logarithms, i.e., those belonging to the first two categories, were considered. For each of these groups, all the mal-rules were hypothesized to be caused by one or two reasons which were then broken down into primary mal-rules. Thirteen such primary mal-rules were then formed. Table 2 shows the assumed reasons and the corresponding primary mal-rules.
The primary mal-rules were then incorporated into a computer system called Electronic Homework (Lee, 1996), in which the correct rules were already incorporated. The computer system diagnosed the errors which are examples of the mal-rules obtained. The computer system then recorded the rules used in diagnosing the errors and stored these for later comparison with the reasons given by human counterparts. The purpose was to see whether the computer diagnosis was comparable with that of the human experts, i.e., the teachers.
Mal-rules
According to the analysis, 114 mal-rules (Appendix B) and 13 primary mal-rules were obtained.
With the primary mal-rules, the computer system Electronic Homework was found to be able to diagnose most of the errors by composing the primary mal-rules with either the correct rules or other primary mal-rules. The primary mal-rules used to explain each mal-rule can be found in Table 3. For the sake of simplicity, only those rules with frequencies higher than 5 are listed. Table 3 also shows the number and the proportion of rules explained.
It can be seen from Table 3 that 90.71% of the errors found can be explained by the primary mal-rules. It should be pointed out that although some of the errors are left unexplained, this only implies that a satisfactory explanation has not yet been found; it does not mean that they cannot be explained in the future when more data are available. Hence, the true figure may exceed that reported.
Teachers’ Diagnoses
In Table 4, the reasons given by the teachers for each of the mal-rules are listed with their codes. The meaning of these codes is found at the bottom of the table. Some of the reasons suggested are too general to give any insight as to the causes of error. For example, reasons such as “confusion about grouping terms” or “difficult problem” are too general and so were rejected.
Although some discrepancies were found among the reasons suggested by the teachers, the majority agree with those in Table 2, which shows the primary mal-rules used by the system to explain the mal-rules. For example, for the first subgroup of rules (AA1, AA2,...), the system used a primary mal-rule MtR1 to explain them, where MtR1 states:
[MtR1]
On the other hand, reasons given by the teachers were:
log 5 + log 5
The teacher suggested this might have occurred because the student wanted to have log 10 in the next step, which might then easily lead to the answer. Such a suggestion is reasonable, but it remains true that the student is treating the logarithmic function as multiplication by a variable "log". In fact, all the teachers suggested the same explanation for this group of rules as that given by the computer system.
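The explanation above can be made concrete with a worked derivation. This is a hedged reconstruction, assuming that the error under discussion rewrote log 5 + log 5 as log 10, as the teacher's remark about log 10 suggests, and assuming that the primary mal-rule reads "log A" as "log times A". The intermediate steps and their labels are illustrative and are not quoted from the system.

```latex
\begin{align*}
\log 5 + \log 5
  &\;\to\; \log \times 5 + \log \times 5 && \text{(assumed primary mal-rule: ``log'' read as a factor)} \\
  &\;\to\; \log \times (5 + 5)           && \text{(correct rule: factor out a common term)} \\
  &\;\to\; \log 10                       && \text{(correct rule: } 5 + 5 = 10\text{)}
\end{align*}
% For comparison, the correct simplification uses the product law:
\[
\log 5 + \log 5 = \log (5 \times 5) = \log 25 .
\]
```

Under these assumptions only the first step is erroneous. The remaining steps are correct algebra applied to a misread expression, which is why a single primary mal-rule composed with correct rules would be enough to account for the observed error.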
By using similar arguments, the reasons suggested by the teachers to
explain the mal-rules can be categorized under the headings of the primary
mal-rules used in Table 2. Table 5 shows the result of such categorization:
Table 5 shows that the teachers’ suggestions are largely compatible
with the primary mal-rules used by the computer system. In a way, Electronic
Homework is doing just what experienced teachers are doing.