This is pretty neat! It's great to see relatively recent advancements in machine learning put to use for children's education.
A few questions for @Cherian:
1. I see the ASR usage, but where does computer vision come into play?
2. Are you training and/or fine-tuning ASR models to deal with the speech characteristics of children and new speakers?
3. Is the ASR all cloud-side, or do you have it running locally in some fashion?
#2, #3: Due to our strict internal privacy policy, ASR is local to the device and the learner.
We plan to train and fine-tune through continuous beta testing with families that opt in through our research groups, not through existing learners/subscribers.
Some other team members may comment to provide more info.
Anyone in the Bay Area is welcome to come and check out the demo.
Osmo is a favorite with Hacker News parents. Two years back, we started this labor of love to help them in this difficult journey of learning how to read.
Critique us. We’d love to hear from you.
#1: Computer vision is used to detect which book the learner is reading, so we can unlock digital content for each book (72+ books) and page (20+ pages), and to detect the position of the book relative to the base/device/camera. CV also tracks the wand, which is used as a pointer to follow the words the learner is currently pronouncing. The wand is also a pointer in certain games for practicing what is being taught.
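For readers curious about the wand-as-pointer idea, here is a minimal sketch of how a tracked wand tip might be mapped to the word under it. This is an illustration only, not Osmo's actual code: the `WordBox` type, the `word_at` function, the page coordinate system, and the sample layout are all hypothetical, and it assumes the CV pipeline has already produced the wand tip position in page coordinates.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class WordBox:
    """Hypothetical bounding box of one printed word, in page coordinates."""
    text: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        # True when the point lies inside the box (edges included).
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


def word_at(boxes: Sequence[WordBox], px: float, py: float) -> Optional[str]:
    """Return the text of the first word whose box contains the point, else None."""
    for box in boxes:
        if box.contains(px, py):
            return box.text
    return None


# Made-up layout for a single line of text on one page (assumption).
PAGE = [
    WordBox("The", 10, 20, 30, 12),
    WordBox("cat", 45, 20, 28, 12),
    WordBox("sat", 78, 20, 26, 12),
]
```

With this layout, `word_at(PAGE, 50, 25)` returns `"cat"`, and a point outside every box returns `None`. The word found this way could then be compared against what the ASR heard, though how the real product combines the two is not described in this thread.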