Hi! I can't really share the company, but I do love the space and I'm happy to discuss what I've been reading.
So the idea of Knowledge Tracing originated, from my understanding, with a paper in 1994: http://act-r.psy.cmu.edu/wordpress/wp-content/uploads/2012/1... This introduced the idea that you could model and understand a student's learning as they progress through a set of materials.
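That original formulation (Bayesian Knowledge Tracing, as I understand it) tracks a per-skill probability that the student has learned the skill and updates it after every answer. Here's a minimal sketch of the update rule, with made-up parameter values just for illustration:

```python
# Minimal sketch of classic Bayesian Knowledge Tracing. The parameter values
# here are made up for illustration, not fitted to any data.

def bkt_update(p_know, correct, p_transit=0.1, p_guess=0.2, p_slip=0.1):
    """Update the probability that a student knows a skill after one answer."""
    if correct:
        # Knew it and didn't slip, vs. didn't know it and guessed.
        posterior = (p_know * (1 - p_slip)) / (
            p_know * (1 - p_slip) + (1 - p_know) * p_guess
        )
    else:
        # Knew it but slipped, vs. didn't know it and didn't guess right.
        posterior = (p_know * p_slip) / (
            p_know * p_slip + (1 - p_know) * (1 - p_guess)
        )
    # Chance of learning the skill on this practice opportunity.
    return posterior + (1 - posterior) * p_transit

p = 0.3  # prior probability the student already knows the skill
for answer in [True, False, True, True]:
    p = bkt_update(p, answer)
    print(f"p(know) = {p:.3f}")
```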
Going forward a few years, you have a Stanford paper, Deep Knowledge Tracing (DKT): https://stanford.edu/~cpiech/bio/papers/deepKnowledgeTracing... which uses RNNs (recurrent neural networks) to model student knowledge over time.
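My rough reading of the DKT setup (this is my own toy sketch in PyTorch, not the authors' code): each interaction is encoded as a one-hot (skill, correctness) pair, fed into an RNN, and the output is a per-skill probability of answering correctly next time.

```python
import torch
import torch.nn as nn

class DKT(nn.Module):
    def __init__(self, num_skills, hidden_size=64):
        super().__init__()
        # Input encodes which skill was practiced and whether the answer
        # was correct, hence 2 * num_skills.
        self.rnn = nn.LSTM(2 * num_skills, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, num_skills)

    def forward(self, x):
        h, _ = self.rnn(x)                  # (batch, time, hidden)
        return torch.sigmoid(self.out(h))   # per-skill correctness probabilities

num_skills = 10
model = DKT(num_skills)

# Toy sequence: 1 student, 5 interactions of (skill id, correct?).
x = torch.zeros(1, 5, 2 * num_skills)
for t, (skill, correct) in enumerate([(0, 1), (3, 0), (3, 1), (7, 1), (0, 1)]):
    x[0, t, skill + (num_skills if correct else 0)] = 1.0

pred = model(x)
print(pred.shape)  # torch.Size([1, 5, 10])
```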
Jumping forward to 2024, we have another paper from Carnegie Mellon & the University of Pittsburgh, Automated Generation and Tagging of Knowledge Components from Multiple-Choice Questions: https://arxiv.org/pdf/2405.20526, and a very similar paper from Switzerland that I really enjoyed, Using Large Multimodal Models to Extract Knowledge Components for Knowledge Tracing from Multimedia Question Information: https://arxiv.org/pdf/2409.20167
Overall, the concept I've been gathering is that if you can break the skills involved down into smaller and smaller tasks, you can make much more intelligent decisions about what is best for the student.
The other thing I've been gathering is that Skills Taxonomies are only useful insofar as they help you make decisions about students. If you build a very rigid Taxonomy that can't accommodate change, you can't easily adapt to new course material or make dynamic decisions about students, so the idea of a rigid Taxonomy is quickly becoming outdated. Large language models are being used to generate fine-grained skills (Knowledge Components) from existing course material to help model a student's development based on performance, in a way that can be easily updated when materials change.
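To make that concrete, the generation step can look something like this (my own illustrative prompt and an assumed local setup via Ollama, not the pipeline from either paper):

```python
# Rough illustration of generating Knowledge Components from a question with
# a local LLM. The prompt, model name, and Ollama setup are my own choices.
import ollama  # assumes the Ollama Python client and a pulled Gemma 2 model

PROMPT = """You are tagging assessment items with Knowledge Components.
Read the multiple-choice question below and list 2-4 fine-grained skills
(Knowledge Components) a student needs to answer it, one per line.

Question:
{question}
"""

def extract_kcs(question: str, model: str = "gemma2:27b") -> list[str]:
    response = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": PROMPT.format(question=question)}],
    )
    text = response["message"]["content"]
    # Strip bullet characters and blank lines from the model's list.
    return [line.strip("-* ").strip() for line in text.splitlines() if line.strip()]

print(extract_kcs("What is the derivative of x^2 + 3x?"))
```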
I have worked through and replicated some of the findings in these later papers using local models: for example, using Google's open Gemma 2 27B model to generate Knowledge Components, then using sentence-embedding models and K-means clustering to gather them into groups of related Knowledge Components. It's been a really fun project and I've been learning quite a bit.
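The grouping step is roughly this (a minimal sketch; the embedding model name, example KCs, and cluster count here are just placeholders):

```python
# Group generated Knowledge Components: sentence embeddings + K-means.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

kcs = [
    "apply the power rule for derivatives",
    "differentiate a sum term by term",
    "solve a linear equation for x",
    "isolate a variable on one side",
    "identify the slope from y = mx + b",
]

# Embed each Knowledge Component as a vector.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = embedder.encode(kcs)

# Cluster the vectors into groups of related KCs.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(embeddings)
for label, kc in zip(kmeans.labels_, kcs):
    print(label, kc)
```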
Thank you!
I've had a similar idea for a long time and I'm interested in developing it, but I never found the time to dig deeper. With those references I can jump-start into the subject and refine it.
It's nice to know I'm not the only one thinking about that.
The trick for me is that it's a path in a graph for each student, so even if some component is not as strong for one student, they can fill the gap by taking another route. A good framework would be resilient if it found many possible paths to the same result rather than forcing one path. But then, teaching this way is more difficult.
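As a toy example of what I mean (the skills and prerequisite edges are made up), something like this enumerates every route to a target skill, so a planner could pick a route that avoids a student's weak components:

```python
# Toy prerequisite graph with more than one route to the same target skill.
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([
    ("counting", "addition"),
    ("addition", "multiplication"),
    ("addition", "repeated addition"),
    ("repeated addition", "multiplication"),
    ("multiplication", "area of a rectangle"),
])

# List every simple path from the starting skill to the target skill.
for path in nx.all_simple_paths(g, "counting", "area of a rectangle"):
    print(" -> ".join(path))
```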