Sequencer Background
The Ebbinghaus model uses the following formula:
$$R = e^{-t/S}$$
where $R$ is retention, $t$ is time, and $S$ is strength.
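As a rough illustration, the sketch below evaluates the forgetting curve, assuming the common exponential form in which retention equals `exp(-t / S)`. The function name `retention` and the sample values are hypothetical and chosen only for this example.

```python
import math


def retention(t: float, strength: float) -> float:
    """Estimated retention after time t, assuming R = exp(-t / S)."""
    if strength <= 0:
        raise ValueError("strength must be positive")
    return math.exp(-t / strength)


# Retention decays more slowly as strength grows.
for strength in (1.0, 5.0):
    curve = [round(retention(t, strength), 3) for t in (0, 1, 2, 5)]
    print(f"strength={strength}: {curve}")
```

Time and strength only need to share the same unit (for example, days); the curve itself is unitless.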
A complement to the Ebbinghaus Forgetting Curve:
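Other models suggest a sigmoid function, such as the CDF of the logistic distribution:
$$F(x) = \frac{1}{1 + e^{-(x - \mu)/s}}$$
where $\mu$ is the location and $s$ is the scale.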
Bayesian Knowledge Tracing
Bayesian Knowledge Tracing determines, given the pattern of learner responses, how likely it is that a learner knows a skill.
Bayesian Knowledge Tracing is based on Bayes’ Theorem.
$$P(A : B) = \frac{P(B : A) \, P(A)}{P(B)}$$
Here I am using `:` to mean ‘given’ because the pipe character implies a table in Markdown.
- $P(A : B)$, the posterior – what we believe after seeing the data
- $P(A)$, the prior – what we believe before the data
- $P(B : A)$, the likelihood – how likely the data is given our hypothesis
- $P(B)$, the normalizer – how likely the data is across all hypotheses
As $P(B)$ can be difficult to formulate, the following expression is often useful.
$$P(B) = \sum_{i} P(B : A_i) \, P(A_i)$$
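A minimal sketch of Bayes’ Theorem over a discrete set of hypotheses follows, assuming the normalizer is computed as the sum of likelihood times prior across all hypotheses. The function name `posterior` and the example numbers are hypothetical.

```python
def posterior(priors, likelihoods):
    """Return the posterior for each hypothesis.

    priors[i]      -- belief in hypothesis i before seeing the data
    likelihoods[i] -- probability of the data given hypothesis i
    """
    # Normalizer: probability of the data summed over all hypotheses.
    normalizer = sum(p * l for p, l in zip(priors, likelihoods))
    if normalizer == 0:
        raise ValueError("data is impossible under every hypothesis")
    return [p * l / normalizer for p, l in zip(priors, likelihoods)]


# Two hypotheses: the learner knows the skill, or does not.
# The data is one correct answer.
beliefs = posterior(priors=[0.5, 0.5], likelihoods=[0.9, 0.2])
print(beliefs)
```

Because the normalizer is built from the same terms as the numerators, the returned posteriors always sum to 1.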
For BKT, we have the following factors:
- $p(L)$ - probability the skill is learned
- $p(T)$ - probability the skill will be learned on a particular item
- $p(G)$ - probability the learner will just guess the right answer
- $p(S)$ - probability the learner will mess up even knowing the skill
For any item, the probability of getting the answer correct is:
$$p(C) = p(L) \, (1 - p(S)) + (1 - p(L)) \, p(G)$$
Putting this all together, the probability the learner has learned the skill, given a correct answer, is:
$$p(L : C) = \frac{p(L) \, (1 - p(S))}{p(L) \, (1 - p(S)) + (1 - p(L)) \, p(G)}$$
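The update can be sketched in a few lines of Python. The correct-answer branch applies Bayes’ Theorem with the learned, guess, and slip probabilities described above. The incorrect-answer branch and the final learning step, which uses the probability of learning the skill on this item, follow the standard BKT formulation and are assumptions here, as are the function and parameter names.

```python
def p_correct(p_learned, p_guess, p_slip):
    """Probability of a correct answer on the next item."""
    return p_learned * (1 - p_slip) + (1 - p_learned) * p_guess


def update_learned(p_learned, correct, p_guess, p_slip, p_transit):
    """Update belief that the skill is learned after one response.

    Assumes all probabilities lie strictly between 0 and 1.
    """
    if correct:
        # Knew the skill and did not slip, versus guessed correctly.
        numerator = p_learned * (1 - p_slip)
        denominator = numerator + (1 - p_learned) * p_guess
    else:
        # Knew the skill but slipped, versus did not know and did not guess.
        numerator = p_learned * p_slip
        denominator = numerator + (1 - p_learned) * (1 - p_guess)
    posterior = numerator / denominator
    # Assumed learning step: the learner may acquire the skill on this item.
    return posterior + (1 - posterior) * p_transit


belief = 0.5
for response in (True, True, False, True):
    belief = update_learned(belief, response, p_guess=0.2, p_slip=0.1, p_transit=0.1)
    print(response, round(belief, 3))
```

Setting `p_transit` to 0 reduces the update to the pure posterior, which is useful for checking the arithmetic by hand.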
Item Response Theory
Item Response Theory determines how likely a learner is to correctly answer a particular question. It is described as a logistic function.
The parameters read:
- $\theta$ - learner ability
- $b$ - item difficulty
- $a$ - item discrimination; how sharply the item separates learners of differing ability
- $c$ - item guess
There are two common forms:
$$P(\theta) = c + \frac{1 - c}{1 + e^{-a(\theta - b)}}$$
$$P(\theta) = c + (1 - c) \frac{e^{a(\theta - b)}}{1 + e^{a(\theta - b)}}$$
The formulas change slightly depending on author.
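For illustration, here is a small sketch of the three-parameter logistic form of Item Response Theory, assuming the conventional symbols: `theta` for learner ability, `a` for discrimination, `b` for difficulty, and `c` for guess. The function name is hypothetical, and some authors add a scaling constant that is omitted here.

```python
import math


def p_correct_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under the 3PL model."""
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))


# A four-option multiple choice item of average difficulty.
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(p_correct_3pl(theta, a=1.0, b=0.0, c=0.25), 3))
```

When ability equals difficulty, the result sits halfway between the guess rate and 1, which is a quick sanity check on any parameter set.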
Item Response Theory can be extended into Performance Factors Analysis, a model that competes with Bayesian Knowledge Tracing.
Knowledge Space Theory
We assume a learner has either learned a skill or not. Given the skills `+`, `-`, `*`, and `/`, we would form prerequisites, such as:
- `+ -> -`
- `+ -> *`
- `* -> /`
The knowledge space represents all possible sets of knowledge a learner might have, such as:
- `+, *, /`
- `+, -, *, /`
An individual learner has a likelihood for each of these sets. KST makes the assumption that an individual question may inquire about multiple skills. We begin by asking questions that use multiple skills, and work backwards to assess learner knowledge.
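The knowledge space for the arithmetic example can be enumerated by brute force. The sketch below assumes the three prerequisites listed earlier (addition before subtraction, addition before multiplication, multiplication before division) and keeps every subset of skills in which each known skill has its prerequisite. All names are hypothetical.

```python
from itertools import combinations

SKILLS = ["+", "-", "*", "/"]
# Each pair is (prerequisite, dependent skill).
PREREQUISITES = [("+", "-"), ("+", "*"), ("*", "/")]


def is_valid_state(state, prerequisites):
    """A state is valid if every known skill has its prerequisite."""
    return all(pre in state for pre, skill in prerequisites if skill in state)


def knowledge_space(skills, prerequisites):
    """Enumerate every valid set of skills, smallest first."""
    states = []
    for size in range(len(skills) + 1):
        for combo in combinations(skills, size):
            if is_valid_state(set(combo), prerequisites):
                states.append(combo)
    return states


for state in knowledge_space(SKILLS, PREREQUISITES):
    print(state)
```

Brute force checks every subset, so it only suits small skill sets; the number of subsets doubles with each added skill.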
Several systems exist that automatically determine prerequisites based on learner performance.
Spaced repetition suggests that learners will learn more efficiently by spreading out their practice, with reviews happening less frequently as ability improves.
The most popular algorithm is SuperMemo 2. The first review is after 1 day, and the second review is after 6 days. After that, the next review interval is:
$$I(n) = I(n - 1) \times EF$$
…where $EF$, the easiness factor, is how difficult or easy the item is. $EF$ is between 1.3 and 2.5, and the algorithm uses learner responses on a Likert scale to update it and determine the next time to review.
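A simplified sketch of the SuperMemo 2 schedule follows. The interval rule matches the description above; the easiness update is the formula from the published SM-2 description, with responses graded from 0 to 5. Clamping the upper bound at 2.5 follows the range stated here and is an assumption, as are the function names. The SM-2 rule that restarts repetitions after a poor response is left out for brevity.

```python
def next_interval(repetition: int, previous_interval: float, easiness: float) -> float:
    """Days until the next review for the given repetition number."""
    if repetition == 1:
        return 1.0
    if repetition == 2:
        return 6.0
    return previous_interval * easiness


def update_easiness(easiness: float, quality: int) -> float:
    """Adjust the easiness factor from a response graded 0 (worst) to 5 (best)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    # Assumed clamp to the range given in the text.
    return min(2.5, max(1.3, easiness + delta))


easiness, interval = 2.5, 0.0
for repetition, quality in enumerate([5, 4, 3, 4], start=1):
    easiness = update_easiness(easiness, quality)
    interval = next_interval(repetition, interval, easiness)
    print(repetition, round(interval, 1), round(easiness, 2))
```

A grade of 4 leaves the easiness factor unchanged, lower grades shrink it, and a grade of 5 raises it until it reaches the cap.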
Later versions of SuperMemo include other considerations, such as:
- Similar cards
- Previous iteration duration
- Ebbinghaus forgetting curve
The latest is version 11/15.
For each distribution, I am most interested in $\mu$, or the mean of the distribution, and $\sigma^2$, or the variance, which can determine how confident we can be in our assertion.
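Beta distributions map probabilities from 0 to 1, where $\alpha$ is the count of positive examples and $\beta$ is the count of negative examples. Computation is fairly straightforward for most statistics.
$$\mu = \frac{\alpha}{\alpha + \beta}$$
$$\sigma^2 = \frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$$
…where $\mu$ is the mean and $\sigma^2$ is the variance.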
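To show how the variance reflects confidence, the sketch below computes the standard mean and variance of a Beta distribution from counts of positive and negative examples. The function names are hypothetical.

```python
def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a Beta distribution with the given positive and negative counts."""
    return alpha / (alpha + beta)


def beta_variance(alpha: float, beta: float) -> float:
    """Variance of a Beta distribution with the given positive and negative counts."""
    total = alpha + beta
    return alpha * beta / (total ** 2 * (total + 1))


# Same success ratio, ten times the evidence.
for alpha, beta in ((8, 2), (80, 20)):
    print(alpha, beta, beta_mean(alpha, beta), beta_variance(alpha, beta))
```

Both cases have the same mean, but the variance shrinks as more examples accumulate, so the estimate backed by more evidence is the more confident one.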
The exponential distribution is often used to describe the time between events that occur at a constant average rate.
The normal distribution is parameterized directly by its mean and standard deviation, so $\mu$ and $\sigma$ are provided. Both can be estimated from a sample by using a Gaussian kernel.
The Pareto distribution uses $x_m$ for scale and $\alpha$ for shape.
The Poisson distribution is often used for counting events.
Binomial distributions count the number of successes $k$ in $n$ independent events, each with a success probability of $p$.
The Bernoulli distribution has only two hypotheses: 0 or 1. The mean is the probability of 1.