These are the functions for computing the effect of each component’s
history for each student (except for the fixed feature, the constant
intercept). Some features consist of a single term, such as exponential
decay (expdecafm), which transforms the sequence of prior trials using a
decay parameter. Other features are inherently interactive: base2, for
example, scales the logarithmic effect of practice by multiplying it by
a memory-decay term, and terms like base4 and ppe involve the
interaction of at least 3 inputs.
It should be noted that most features in this method are dynamic. A
“dynamic” feature is one whose effect in the model potentially changes
with each trial for a subject. Most dynamic features start at a value of
0 and change as a function of the student’s accumulating history as
time passes in some learning system.
Constant (intercept) - This is a simple generalized linear model
intercept, computed for a categorical factor (i.e., whatever categories
are specified by the levels of the component factor).
Total count (lineafm) - This feature is from the well-known AFM
model [7], which predicts performance as a linear function of the total
prior experiences with the KC (though of course it could also be the
count for the student, item, or other categorical factor in the
history).
Log total count (logafm) - This predictor has sometimes been used
in prior work and implies that there will be decreasing marginal returns
for practice as total prior opportunities increase, according to a
natural log function. For simplicity, we add 1 to the prior trial count
to avoid taking the log(0), which is undefined.
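The lineafm and logafm features described above reduce to simple functions of the prior opportunity count; a minimal sketch (function names follow the feature labels, argument names are illustrative):

```python
import math

def lineafm(prior_count):
    # Linear AFM feature: simply the total number of prior opportunities.
    return prior_count

def logafm(prior_count):
    # Log AFM feature: add 1 before taking the log so log(0) is never attempted.
    return math.log(prior_count + 1)
```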
Power-decay for the count (powafm) - This feature models a
power-law decrease in the effect of successive opportunities. By raising
the count to a positive power (a nonlinear parameter) between 0 and 1,
the model can describe more or less quickly diminishing marginal
returns.
It is a component of the predictive performance equation (PPE) model,
but for applications not needing forgetting, it may provide a simple,
flexible alternative to logafm.
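As described, powafm is a single power transform of the prior count; a minimal sketch, with the fractional power supplied as the nonlinear parameter:

```python
def powafm(prior_count, power):
    # Raise the prior opportunity count to a fractional power (0 to 1);
    # smaller powers imply more quickly diminishing marginal returns.
    return prior_count ** power
```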
Recency (recency) - This feature is inspired by the strong effect
of the time interval since the previous encounter with a component
(typically an item or KC). This feature was created for this paper and
has not been presented previously. This recency effect is well-known in
psychology and is captured with a simple power-law decay function to
simulate improved performance when the prior practice was recent.
This feature only considers the just prior observation; older trials are
not considered in the computation.
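A minimal sketch of the recency feature, assuming (as an illustration) that the feature is 0 before any prior observation exists and otherwise applies a power-law decay to the interval since the last encounter:

```python
def recency(time_since_last, decay):
    # No prior observation yet: the feature starts at 0.
    if time_since_last is None:
        return 0.0
    # Power-law decay of the interval since the most recent encounter;
    # a smaller interval (more recent practice) gives a larger value.
    return time_since_last ** -decay
```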
Exponential decay (expdecafm) - This predictor considers the
effect of the component as a decaying quantity according to an
exponential function. It behaves similarly to logafm or powafm, as shown
in Fig. 3.
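One plausible implementation of expdecafm (an assumption here, since the exact recurrence is not given above) decays a running count by the rate d at each opportunity and then adds 1 for the new trial:

```python
def expdecafm(n_prior, d):
    # Exponentially decayed count of prior opportunities: each earlier
    # trial's contribution shrinks by a factor of d per subsequent trial.
    v = 0.0
    for _ in range(n_prior):
        v = v * d + 1  # decay the accumulated count, then add the new trial
    return v
```

With d close to 1 this approaches the plain count (lineafm); with smaller d it saturates, giving the diminishing-returns shape noted above.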
Power-law decay (base, base2) - This predictor multiplies logafm
by the age since the first practice (trace creation) to the power of a
decay rate (negative power), as shown in Fig. 4. This predictor
characterizes situations where forgetting is expected to occur in the
context of accumulating practice effects. Because this factor doesn’t
consider the time between individual trials, it is essentially fit with
the assumption of even spacing between repetitions and doesn’t capture
recency effects. The base2 version modifies the age by shrinking the
time between sessions by some factor; for example, .5 would make
time between sessions count only 50% towards the estimation of age. This
mechanism to scale forgetting when interference is less was originally
introduced in the context of cognitive modeling.
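A minimal sketch of base2 under the description above (argument names are illustrative): the age since trace creation is split into within-session and between-session time, with the latter scaled down before the power-law decay is applied.

```python
import math

def base2(prior_count, within_session_age, between_session_age,
          decay, session_scalar):
    # Scaled age since first practice: between-session time counts only
    # partially (e.g., session_scalar = .5 counts it at 50%).
    age = within_session_age + session_scalar * between_session_age
    # logafm-style practice effect multiplied by power-law forgetting.
    return math.log(prior_count + 1) * age ** -decay
```

Setting session_scalar to 1 recovers the plain base feature, where all elapsed time counts equally.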
Power-law decay with spacing (base4) - This predictor involves
the same configuration as base2, multiplied by the mean spacing raised
to a fractional power. The fractional power scales the effect of spacing
such that if the power is 0 or close to 0, the spacing scaling factor is
1. If the fractional power is between 0 and 1, there are diminishing
marginal returns for increasing average spacing between trials.
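The spacing multiplier described above can be sketched as follows (a simplified illustration: the base2 term is collapsed into a single age argument rather than the scaled-session form):

```python
import math

def base4(prior_count, age, mean_spacing, decay, spacing_power):
    # base/base2-style term: log practice effect with power-law forgetting.
    practice = math.log(prior_count + 1) * age ** -decay
    # Spacing multiplier: a power of 0 makes the factor 1 (no effect);
    # powers between 0 and 1 give diminishing returns for wider spacing.
    return practice * mean_spacing ** spacing_power
```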
Performance Prediction Equation (ppe) - This predictor was
introduced over the last several years and shows great efficacy in
fitting spacing effect data (cite). It is novel in that it scales
practice like the powafm mechanism, captures power-law decay forgetting
and spacing effects, and has an interesting mechanism that weights
trials according to their recency.
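A heavily simplified sketch of PPE's structure, combining the three mechanisms named above; all parameter names (c, x, b, m) and the exact weighting and decay forms are assumptions for illustration, not the published equation:

```python
import math

def ppe(times, now, c, x, b, m):
    # times: timestamps of the n prior practices (ascending); now: current time.
    n = len(times)
    if n == 0:
        return 0.0
    ages = [now - t for t in times]
    # Recency weighting: each trial's age weighted by age**-x, normalized.
    w = [a ** -x for a in ages]
    wsum = sum(w)
    elapsed = sum(wi / wsum * a for wi, a in zip(w, ages))
    # Decay rate depends on spacing: massed practice (short lags) raises it.
    if n > 1:
        lags = [times[i] - times[i - 1] for i in range(1, n)]
        d = b + m * (sum(1.0 / math.log(lag + math.e) for lag in lags) / len(lags))
    else:
        d = b
    # Power-law learning (like powafm) times power-law forgetting.
    return n ** c * elapsed ** -d
```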
Log PFA (Performance Factors Analysis) (logsuc and logfail) -
These expressions are simply the log-transformed performance factors
(total successes or failures), corresponding to the hypothesis that
there are declining marginal returns according to a natural log
function.
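These log-transformed performance factors follow the same add-1 convention sketched for logafm (an assumption here, to keep log(0) defined):

```python
import math

def logsuc(prior_successes):
    # Log-transformed count of prior successes (add 1 to avoid log(0)).
    return math.log(prior_successes + 1)

def logfail(prior_failures):
    # Log-transformed count of prior failures (add 1 to avoid log(0)).
    return math.log(prior_failures + 1)
```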
Linear PFA (linesuc and linefail) - These terms are equivalent to
the terms in performance factors analysis (PFA).
Exponential decay (expdecsuc and expdecfail) - This expression
uses the decayed count of correct or incorrect responses. This method
appears to have been first tested by Gong, Beck, and Heffernan. It is
also part of R-PFA, where it is used for tracking failures only, whereas
R-PFA uses propdec to track correctness. The function is generally the
same as for expdecafm. However, when used with a performance factor, the
exponential decay weights recent events most heavily, so a history of
recent successes or failures can quickly change predictions, since only
the recent events count for much, especially if the decay rate is
relatively fast.
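A minimal sketch of these decayed performance counts, assuming the same decay-and-add recurrence sketched for expdecafm, applied separately to successes and failures:

```python
def _expdec_count(indicators, d):
    # indicators: 1 where the event of interest occurred, 0 elsewhere,
    # oldest first; recent events dominate as older ones fade by d.
    v = 0.0
    for y in indicators:
        v = v * d + y
    return v

def expdecsuc(history, d):
    # history: 1 = success, 0 = failure, oldest first.
    return _expdec_count([1 if y == 1 else 0 for y in history], d)

def expdecfail(history, d):
    return _expdec_count([1 if y == 0 else 0 for y in history], d)
```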
Linear sum performance (linecomp) - This term uses successes
minus failures to provide a simple summary of overall performance. The
advantage of this model is that it is parsimonious and therefore is less
likely to lead to overfitting or multicollinearity in the
model.
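The linear performance-factor terms above (linesuc, linefail) and their difference (linecomp) reduce to simple counts:

```python
def linesuc(prior_successes):
    # Linear PFA success term: the raw prior success count.
    return prior_successes

def linefail(prior_failures):
    # Linear PFA failure term: the raw prior failure count.
    return prior_failures

def linecomp(prior_successes, prior_failures):
    # Single parsimonious summary: successes minus failures.
    return prior_successes - prior_failures
```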
Proportion (prop) - This expression uses the prior probability
correct. It is seeded at .5 for the first attempt.
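A minimal sketch of the proportion feature, seeded at .5 before any attempts as described:

```python
def prop(prior_successes, prior_attempts):
    # Prior probability correct; seeded at .5 for the first attempt.
    if prior_attempts == 0:
        return 0.5
    return prior_successes / prior_attempts
```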
Exponential decay of proportion (propdec and propdec2) - This
expression uses the prior probability correct and was introduced as part
of the R-PFA model. This function requires an additional nonlinear
parameter to characterize the exponential rate of decay. For propdec, we
set the number of ghost successes at 1 and ghost failures at 1 as a
modification of Galyardt and Goldin. This modification produces an
initial value that can either decrease or increase, unlike the Galyardt
and Goldin version (propdec2), which can only increase due to the use of
3 ghost failures and no ghost successes. Our initial comparisons below
show that the modified version works as well for tracking subject level
variance during learning. Galyardt and Goldin illustrate an extensive
number of examples of propdec2’s behavior across patterns of successful
and unsuccessful trials at various parameter values. The new propdec
behaves analogously, except it starts at a value of .5 to represent the
different ratio of ghost successes to failures at the beginning of
practice. In fact, the numbers of ghost attempts of each type are
additional parameters, and we have implemented two settings: 1 ghost
success and 1 ghost failure (propdec), or 3 ghost failures
(propdec2).
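A sketch of propdec and propdec2 under these descriptions, with one simplifying assumption labeled here: the ghost attempts enter as undecayed pseudo-counts, which reproduces the .5 starting value for propdec and the 0 starting value for propdec2; the exact placement of ghost trials in the original formulations may differ.

```python
def propdec(history, d, ghost_successes=1, ghost_failures=1):
    # history: 1 = success, 0 = failure, oldest first.
    suc = fail = 0.0
    for y in history:
        suc = suc * d + (1 if y == 1 else 0)
        fail = fail * d + (0 if y == 1 else 1)
    # Ghost attempts as undecayed pseudo-counts (an assumption here).
    num = suc + ghost_successes
    den = suc + fail + ghost_successes + ghost_failures
    return num / den

def propdec2(history, d):
    # Galyardt and Goldin's setting: 3 ghost failures, no ghost successes,
    # so the value starts at 0 and can only increase with successes.
    return propdec(history, d, ghost_successes=0, ghost_failures=3)
```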
Logit (logit) - This expression uses the logit (natural log of
successes divided by failures). This function requires an additional
nonlinear parameter to characterize the initial amount of successes or
failures.
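A minimal sketch of the logit feature, assuming the nonlinear parameter c is added to both counts, which seeds the initial amounts and keeps the ratio defined when either count is zero:

```python
import math

def logit(prior_successes, prior_failures, c):
    # c seeds the initial successes/failures (a nonlinear parameter).
    return math.log((prior_successes + c) / (prior_failures + c))
```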
Exponential decay of logit (logitdec) - This expression uses the
logit (natural log of successes divided by failures). Instead of using
the simple counts, it uses the decayed counts like R-PFA, with the
assumption of exponential decay and 1 ghost success and 1 ghost
failure.
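Combining the pieces above, logitdec can be sketched as the logit of the decayed counts, again with the ghost success and ghost failure folded in as undecayed pseudo-counts (an assumption, as with propdec):

```python
import math

def logitdec(history, d, ghost_successes=1, ghost_failures=1):
    # history: 1 = success, 0 = failure, oldest first.
    suc = fail = 0.0
    for y in history:
        suc = suc * d + (1 if y == 1 else 0)
        fail = fail * d + (0 if y == 1 else 1)
    return math.log((suc + ghost_successes) / (fail + ghost_failures))
```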