About a year ago, I made a wistful plea for Amazon’s Whispersync technology, which keeps your place in an eBook synchronised over multiple devices, to be extended to other formats, particularly audiobooks.
I wasn’t being all that serious at the time. Obviously you can’t extend Whispersync to print books, even in principle, but even synching eBooks and audiobooks sounds like a tough technological challenge. All the same, there must be some interest in the idea because my earlier post seems to be getting a fair few hits via search engines.
So I got to thinking … is eBook to audiobook Whispersync remotely feasible? Now Amazon do own Audible.com so any solution would be entirely in house. I have no idea whether they have done any work on it, but essentially the problem is one of mapping position within a file. Amazon eBooks (mobi format) denote your place in a book by “locations” which do not, in general, correspond to the pages in the original print version (if any). A “location” marks the start of a short passage of text and you might typically have several “locations” to an original page of print. But you can’t specify your place down to the level of an individual word of the text.
Audiobooks are different. They come in audio files with a known playback time expressed in hours, minutes and seconds. Your place is expressed as a number of hours, minutes and seconds of listening time into that file, and it could correspond to halfway through a word.
Amazon would have to map eBook “locations” to the start of the corresponding block of recorded words in the audiobook. They could try to do this perfectly or settle for something approximate. If the latter, they could just divide the overall audiobook playing time (adding up multiple Audible files for a large book that has to be split up) by the number of “locations” in the eBook, to get a number of minutes and seconds per “location”. Getting from place in eBook to place in audiobook then boils down to application of a simple mathematical formula.
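As a rough sketch of that approximate formula, here is how it might look in Python. The function name, the parameters and the choice to count “locations” from 1 are all my own assumptions for illustration; I have no idea how Amazon actually represents any of this.

```python
def location_to_seconds(location, total_locations, file_durations):
    """Estimate the audiobook position, in seconds, for an eBook location.

    location        -- the reader's current location (assumed to count from 1)
    total_locations -- number of locations in the whole eBook
    file_durations  -- playing time in seconds of each audio file, in order
    """
    if total_locations <= 0:
        raise ValueError("total_locations must be positive")
    if not 1 <= location <= total_locations:
        raise ValueError("location is outside the book")

    # Add up the separate audio files to get the overall playing time.
    total_seconds = sum(file_durations)

    # Each location is assumed to take the same share of the playing time,
    # so location 1 maps to the very start of the recording.
    return (location - 1) * total_seconds / total_locations
```

For a book of 1,000 locations narrated over two one-hour files, location 501 comes out at 3,600 seconds, which is exactly the start of the second file.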
It could go wrong though. What about long fantasy books with masses of appendices? The latter are usually included in eBooks but left out of audiobooks, so the formula would in general produce an incorrect result. Even if you could detect and adjust for appendices, the simple formula would fall foul of other distorting factors, such as variations in the narrator’s speed, sections of denser or sparser text, etc.
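The appendix adjustment on its own is easy enough to sketch, assuming somebody has already worked out the last “location” that the narrator actually reads. That cut-off, and every name in the code below, is hypothetical. It does nothing about the narrator’s speed or denser passages of text.

```python
def narrated_location_to_seconds(location, last_narrated_location, total_seconds):
    """Map a location to seconds, ignoring unnarrated material at the end.

    last_narrated_location -- assumed cut-off: the final location that is
                              actually read aloud (appendices come after it)
    total_seconds          -- overall playing time of the audiobook
    Returns None when the location falls in the unnarrated part.
    """
    if last_narrated_location <= 0:
        raise ValueError("last_narrated_location must be positive")
    if location < 1:
        raise ValueError("location must be at least 1")
    if location > last_narrated_location:
        # The reader is in an appendix: there is no audio to jump to.
        return None

    # Spread the playing time over the narrated locations only.
    return (location - 1) * total_seconds / last_narrated_location
```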
A more accurate solution would require human intervention. That means someone listening to the audiobook while looking at the eBook and building up a table of “locations” and corresponding timing points in the audiobook. Very time consuming and very expensive. Who would pay for it? Maybe audio-to-text transcription is now good enough that this matching and mapping exercise could be automated. I wouldn’t rule it out, but I can see issues. For example, automated mapping might struggle with weird character or place names. Some human intervention might still be required to resolve “queries” raised by the automated system, and to deal with those pesky appendices where applicable.
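However the table gets built, by hand or by machine, using it is the easy part. Here is a minimal sketch, assuming the table is a list of (location, seconds) pairs sorted by location with no location repeated. Places that fall between two entries are estimated by a straight line between them. The names and the table layout are my own invention.

```python
from bisect import bisect_right


def seconds_for_location(location, sync_points):
    """Look up an audiobook time for a location from a table of sync points.

    sync_points -- list of (location, seconds) pairs, sorted by location,
                   with strictly increasing locations (an assumption)
    """
    if not sync_points:
        raise ValueError("the sync table is empty")

    locations = [loc for loc, _ in sync_points]
    i = bisect_right(locations, location)

    if i == 0:
        # Before the first recorded point: fall back to the first timing.
        return float(sync_points[0][1])
    if i == len(sync_points):
        # At or after the last recorded point: fall back to the last timing.
        return float(sync_points[-1][1])

    # Interpolate linearly between the two neighbouring table entries.
    (loc0, t0), (loc1, t1) = sync_points[i - 1], sync_points[i]
    fraction = (location - loc0) / (loc1 - loc0)
    return t0 + fraction * (t1 - t0)
```

The more entries in the table, the less the straight-line guess matters, so the cost of building the table trades off directly against accuracy.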
And there are other practical questions. What if the user is reading a book which is spread over several audiobook files and his/her place in the eBook corresponds to an as yet undownloaded audiobook file, or one of those wretched appendices? Not the end of the world: the user would have to be presented with an error message. But clearly there are a number of details that have to be thought about and planned for.
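To give a flavour of those details, here is a sketch of turning an overall playing time into a particular audio file and an offset within it, raising an error for the two awkward cases just mentioned. The AudioFile record, the SyncError exception and the wording of the messages are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AudioFile:
    name: str
    duration: float   # playing time in seconds
    downloaded: bool  # whether the file is on the device yet


class SyncError(Exception):
    """Raised when a position cannot be played; shown to the user."""


def resolve_position(seconds, files):
    """Return (file, offset_in_seconds) for a time measured across all files."""
    if seconds < 0:
        raise SyncError("Position is before the start of the audiobook.")

    start = 0.0
    for audio_file in files:
        end = start + audio_file.duration
        if seconds < end:
            if not audio_file.downloaded:
                raise SyncError(f"{audio_file.name} has not been downloaded yet.")
            return audio_file, seconds - start
        start = end

    # Past the end of the last file, e.g. the reader is in an appendix.
    raise SyncError("This part of the book is not in the audiobook.")
```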
Will we ever see this? I think we will, but it may not happen for a while and may not be perfect when we get it.