It works by logging into your Kindle web reader account using Playwright, exporting each page of a book as a PNG image, and then using a vLLM (gpt-4o or gpt-4o-mini) to transcribe the text from each page to text. Once we have the raw book contents and metadata, then it's easy to convert it to PDF, EPUB, etc.
The repo supports a few different options for TTS providers to generate audiobooks from the resulting text.
I had a lot of fun w/ this project :)
It works by logging into your Kindle web reader account using Playwright, exporting each page of a book as a PNG image, and then using a vLLM (gpt-4o or gpt-4o-mini) to transcribe the text from each page to text. Once we have the raw book contents and metadata, then it's easy to convert it to PDF, EPUB, etc.
The repo supports a few different options for TTS providers to generate audiobooks from the resulting text.
Would love feedback && thanks!