I'm finding the TTS hard to follow (though it's growing on me), but the core idea is superb and I'm catching interesting snippets. Changing the TTS voice between items is a genius idea I'd never have thought of myself; it really helps divide things up and keeps my attention.
A suggestion, perhaps, is to lean on tldr.io's system of providing well-written short summaries of Hacker News items rather than the actual content. That way it'd sound a lot more like a regular news bulletin and skirt around problems of third-party content. (I know one of the founders if you want an intro, but I believe you can grab their stuff somehow anyway.)
I've pointed them to this thread to see what they think. Look forward to seeing this progress. (And as an aside, I really wish someone would crack the TTS 'uncanny valley' problem 100%. It's a surprisingly difficult problem.)
There's quite a difference between the "normal" voices in OS X and the high-quality ones. If you have a few spare gigabytes, try out the French, Spanish, and Swedish versions.
For my kids, it's at least a few minutes' entertainment to make the computer speak high-quality foreign curse words.
I'd never noticed this before - cool! Samantha and Tom in the US English section sound particularly good.
The only major problem I seem to perceive is that the transitions between certain words and phones aren't smooth enough and are jarring. It doesn't sound like they're far off but I guess 90% of the work is in the last 10% ;-)
You're absolutely right about the TTS uncanny valley problem. We're betting on this being an industry-wide effort. Currently we don't have the resources to really invest in core TTS technology, but we're always on the lookout for breakthrough or experimental tech.
I find I don't have enough time to read all the stuff on Hacker News, but I spend heaps of time on the train where I’m usually listening to music.
Since I work at a startup doing some stuff with text-to-speech, I hacked together this text-to-speech "radio station" so I can listen to Hacker News on my mobile instead of music.
It uses native HTML5 audio that worked fine with iOS and Android in my testing (though some OEMs like HTC screw up the player skin), and uses the RSS feed to grab top articles. Obviously TTS isn’t perfect, but I find most articles except coding ones are comprehensible.
These guys rock. I'd been using SoundGecko to queue up and convert articles for listening on the train. Syncs with Dropbox or Drive, done. Love the concept of putting up a dedicated station for content providers and social aggregators like HN.
Re. HTML5 audio, until recently Chrome for Android had a problem where if you closed the screen/changed tab it would cease the audio playback.
Yeah unfortunately there isn't a lot of control over HTML5 audio AFAIK. There is a preload option but it's specific to the individual <audio> element.
Maybe as a hack I could change it so each article has a hidden <audio> somewhere that preloads it in the background; then when you actually play it back, it should come from the cache.
I don't know if that's wise over a cell connection though, downloading all the MP3s.
Perhaps a better approach would be to add a "preload" button, and then use the hidden element as you suggested so we aren't loading audio we aren't interested in.
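A rough sketch of the on-demand preload idea above, in TypeScript. `PreloadQueue`, `AudioLike`, and the cap of three items are made up for illustration and are not SoundGecko's code; the only browser behaviour assumed is that an audio element with `preload = "auto"` and a `src` starts fetching when `load()` is called, which the browser is still free to treat as a hint.

```typescript
// Minimal stand-in for the parts of HTMLAudioElement this sketch touches,
// so the logic can also run outside a browser.
interface AudioLike {
  preload: string;
  src: string;
  load(): void;
}

// Hypothetical helper: preloads only the articles the listener asked for.
class PreloadQueue {
  private readonly loaded = new Map<string, AudioLike>();
  private readonly createAudio: () => AudioLike;
  private readonly maxItems: number;

  constructor(createAudio: () => AudioLike, maxItems: number = 3) {
    this.createAudio = createAudio;
    this.maxItems = maxItems; // assumed cap, to go easy on cell connections
  }

  // Returns true if the URL is preloading (or already was),
  // false if the cap has been reached and nothing was fetched.
  request(url: string): boolean {
    if (this.loaded.has(url)) return true;
    if (this.loaded.size >= this.maxItems) return false;
    const audio = this.createAudio();
    audio.preload = "auto";
    audio.src = url;
    audio.load();
    this.loaded.set(url, audio);
    return true;
  }

  has(url: string): boolean {
    return this.loaded.has(url);
  }
}
```

In a page, the factory would be something like `() => new Audio()`, and `request(url)` would be wired to the "preload" button, so nothing is downloaded until the listener asks for it.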
Another idea would be to add a personal user queue to do the same thing. Awesome site btw.
Audio is the worst way to consume content. Why not just use Evernote or whatever and read what you've missed? Then you can continue to listen to music and catch up with what you've missed while you're on the train.
I actually find that audio is my favorite way to consume content, because I usually want to use my sight and hands to do something more proactive and creative (e.g. programming).
Perhaps this is unique to me, but I have found that I can function amazingly well digesting something through audio while working on something visually. I have no formal neuroscience education, but perhaps it's something to do with how the brain (or perhaps just my brain) processes various inputs and outputs.
As a secondary note to the author of this app: what a great idea! I really like where you guys are going with this. I do find it hard to listen to, as the narrator still sounds like a robot, but I'm sure with enough time this will be solved as well.
I can't code and listen to speech audio at the same time. I can do it if I'm only pattern-matching or doing something that is only visual and doesn't take much conscious thought. I used to listen to audiobooks at an old job whenever I had to do tedious work, digging through spreadsheets for anomalous data. And, I now listen to audiobooks while I'm riding my bike or exercising.
I figured that it was because coding, or writing something like this comment, uses the speech center of my brain, as does the incoming audio stream. Music doesn't have any adverse effects on my ability to code or write comments, since I don't really pay attention to any of the vocals that may be present.
As an audiobook fanatic, my go-to explanation for audio's effectiveness is that speaking and listening were the human race's primary way of transmitting and receiving information until very recently. The era in which most of the human race is literate is a blink on the scale of evolutionary history.
A recent article on audiobooks (in the New Yorker, I think?) cited studies saying that audiobook readers had markedly better recall of physical descriptions in books, presumably because the visual processing centers of the brain aren't occupied with the task of reading itself.
I wonder how difficult/viable it would be to crowd source actual people reading the articles? I am sure there must be radio journalist students who wouldn't mind reading pieces for feedback from listeners (on their reading?).
Nice job! Just the other day I was thinking about how great it would be to have a voice-read Readability (http://readability.com/).
* Is there any way to go to the listing on HN? (I like to check comments occasionally).
* When clicking "View original article" it takes me to a separate page where the article is read and there's a new button to visit the original article. When I click the new button it loads the article in the frame. Is there any way all this can be collapsed to just one step? (I click it the first time, it goes to the separate page and already has the article loaded).
* It would be really interesting to have a bookmarklet for this, like with Readability.
Can we get an option to speed up the voice? I think I can process info a little faster than base speed. I believe I saw something where blind people who use screen readers have the output come at blazing speeds.
Also, is it possible to skip links, or replace links with a ding noise? I'm not likely to write links down to visit them, but if I'm interested, I'd probably go back into the article to click them.
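For the link suggestion, one possible approach is to rewrite the article text before it reaches the TTS engine. This is only a sketch: `replaceLinksWithCue` and the `[link]` marker are invented here, the URL pattern is a rough heuristic (it will misjudge URLs that legitimately end in punctuation or a closing parenthesis), and how a marker gets turned into an actual ding depends on the TTS pipeline, which the thread doesn't describe.

```typescript
// Rough heuristic for bare http(s) URLs in plain text; not a full URL parser.
const URL_PATTERN = /https?:\/\/[^\s<>"]+/g;

// Hypothetical pre-processing step: swap each URL for a short cue marker,
// keeping any sentence punctuation that was glued to the end of the URL.
function replaceLinksWithCue(text: string, cue: string = "[link]"): string {
  return text.replace(URL_PATTERN, (match: string) => {
    const trailing = match.match(/[.,;:!?)]+$/);
    return cue + (trailing ? trailing[0] : "");
  });
}
```

Passing an empty string as the cue skips links entirely instead of marking them.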
Now, if only we could somehow provide the comment threads, and allow us the ability to assign voices to different handles. I imagine something like a booming Voice of God for pg, annoyed grammarian for tptacek, maybe an old Mac-style gibbering lunatic for losethos, etc.
:)
EDIT: I just tried to listen to A Tale of Two Cities from Gutenberg on SoundGecko. I think I broke it. Any way to cancel a job?
On my MacBook running OS X Mountain Lion, it works great and started playing on load (which didn't happen when I loaded it on iOS, but that's to be expected).
One thing that REALLY bugged me was this: the link on each item that reads "View Original Article" just links to the service, SoundGecko. It doesn't link to the original article at all!! Not even to the HN post, so neither the source nor the context (HN comments) is reachable AT ALL!!
Seriously, this site is brilliant, BUT in my opinion what you've done is unacceptable. At least link to the original article's author,
and then link to the service you're scraping as a courtesy.
This is absolutely fantastic! I occasionally have to make 2+ hour commutes to/from jobs, and I've long been wanting a way to digest the latest tech news while driving down the road.
Plus, I never have the forethought or the desire to sift through the various podcasts and guess at which ones I would like or find interesting, so being able to 'tune' into the front page seems like an excellent way to pass the time.
Excellent work, man. You've literally built the exact thing I've been wanting for a long time.
One feature addition that would be revolutionary (IMHO) would be to allow blog authors to embed an audio version of the story, in their own voice, using a custom HTML5 tag. Then, when you parse their page, use the supplied audio instead of TTS. Then, in conjunction with this, give preference to stories with supplied audio, so an HN page 3 story gets promoted to a top story on your radio app, as reward for the effort of supplying the audio.
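The promotion rule in that last sentence could look something like the following. It is a sketch under assumptions: the `Story` shape and `orderForRadio` are hypothetical, and `authorAudioUrl` stands in for whatever the proposed custom tag would expose once a page has been parsed.

```typescript
// Hypothetical shape of a parsed story; authorAudioUrl is set only when the
// author supplied their own recording.
interface Story {
  title: string;
  rank: number; // position on HN, 1 = top of the front page
  authorAudioUrl?: string;
}

// Stories with author-supplied audio come first; within each group the
// original HN ranking is kept. The input array is not modified.
function orderForRadio(stories: readonly Story[]): Story[] {
  return [...stories].sort((a, b) => {
    const aGroup = a.authorAudioUrl ? 0 : 1;
    const bGroup = b.authorAudioUrl ? 0 : 1;
    return aGroup - bGroup || a.rank - b.rank;
  });
}
```

A blunt two-group sort like this lets any story with audio jump the whole queue; a real station would probably want a gentler boost so the front page still sounds like the front page.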
One suggestion off the top of my head is to allow for offline listening. I.e., configure SoundGecko to take a snapshot of the HN front page at, say, 5pm, run the front page contents through TTS, and email the sound file as an attachment for listening.
This would be extremely useful in areas that have low mobile phone signal, or in countries where data plans are expensive and/or restrictive. (Looking at you, Australia!)
If you click on "What is Soundgecko", there's a forever-looping background movie that looks really cool. It looks like it downloads 2.8 MB of the movie every time the previous 2.8 MB finishes downloading. How does this work? Is it something like Node.js file streaming?
[Link here](http://soundgecko.com/)
It'd be awesome if this was on GitHub so people could improve the parsing. For one, it reads stuff that is obviously not meant to be read "live", like informational tables.
It'd also be awesome if this was a Shoutcast stream that also had live people doing shows.
Edit: Also, switch voices between entries, and continue reading entries one after the next.
Yeah, parsing is a tough challenge for something as diverse as Hacker News, where the content could be anything from a thesis or a blog post to a picture gallery.
Will investigate Shoutcast, but there might not be enough content to fill a continuous stream.
You'll be glad to know it already switches voices between entries and automatically plays the following entry :)
A really cool idea, I'm going to use this extensively I think.
One thing though: you need to put in audio cues as to when one article ends and another begins. I'm not looking at the player and sometimes the same voice gets selected for two consecutive articles, and I have no idea a new article just began!
Loved it, though it's still a pain to get it working flawlessly on my Dolphin browser. I'd be happier if we had an RSS feed for this station. I'd love to import that RSS directly into my Pulse reader and consume all the podcasts right in Pulse without leaving.
EDIT: And how do I get Hacker News as my SoundGecko channel?
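Both halves of that complaint are easy to sketch. The code below is an illustration only, not how SoundGecko picks voices: `pickNextVoice` and `withCues` are invented names, and the cue is assumed to be a short sound file played like any other playlist entry.

```typescript
// Picks a voice at random, but never the one used for the previous article
// (unless only one voice is configured). `random` is injectable for testing.
function pickNextVoice(
  voices: readonly string[],
  previous: string | null,
  random: () => number = Math.random,
): string {
  if (voices.length === 0) throw new Error("no voices configured");
  const candidates = voices.filter((voice) => voice !== previous);
  const pool = candidates.length > 0 ? candidates : voices;
  const index = Math.min(pool.length - 1, Math.floor(random() * pool.length));
  return pool[index];
}

// Inserts a cue sound between consecutive articles (not before the first
// or after the last), so a listener who isn't watching the player hears the break.
function withCues(articleUrls: readonly string[], cueUrl: string): string[] {
  const playlist: string[] = [];
  articleUrls.forEach((url, i) => {
    if (i > 0) playlist.push(cueUrl);
    playlist.push(url);
  });
  return playlist;
}
```

With only two voices configured this simply alternates them; the cue is what guarantees an audible boundary either way.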
I agree, please clarify, because this sounds extremely useful. Within the Pocket app, if I tap on an article's title, it doesn't highlight, it opens. Then two-finger tap does nothing. Long tap in a paragraph highlights one word with the option to speak that word.
Now I feel bad since I used to write the support docs at Apple and I messed up here.
1. Go to Settings then Accessibility. Check that Speak Selection is On.
You can also choose Dialects and set the speaking rate, or choose whether you want words highlighted as they're spoken.
2. In the Pocket app, select a body of text by tapping and holding on some text. Selector bars will appear; drag them to the start and the end of the article.
You can do this for anything now: websites, emails, etc. I set my voice to South African English (since it oddly sounds natural to me) and set my speed a little to the left of the middle.
So I just tried it with: Basecamp Personal, the Basecamp for all your projects outside of work.
For me the link bait was super obvious when read aloud and kind of ruined it, at least in this case. It felt like they said Basecamp 150 times in less than 2 minutes.
It's been a while since I actively followed TTS, but the speech of this one sounds surprisingly good. The service itself is useful as well. It could be a bit more interactive for my taste, but works quite well!
Thumbs up. I've been a fan of Long Zheng's work since his Taskforce Initiative (which inspired what I am doing now on the side). I use MetroTwit daily. And now I've got to take a closer look at SoundGecko.
Hi, this is a great idea, perfect for when I'm in transit. It would be nice if you could add a feature to control the speed of the voice. The output is way too slow for my liking
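Since the player is plain HTML5 audio, a speed control may not need any TTS changes: media elements expose a `playbackRate` property. A small sketch of the input handling, where `clampPlaybackRate` and the 0.5x to 3x bounds are assumptions chosen for illustration and not anything SoundGecko has said it uses:

```typescript
// Hypothetical helper: keeps a user-chosen speed inside a sane range before
// it is assigned to an audio element's playbackRate.
function clampPlaybackRate(
  requested: number,
  min: number = 0.5, // assumed lower bound
  max: number = 3.0, // assumed upper bound
): number {
  if (!Number.isFinite(requested)) return 1.0; // fall back to normal speed
  return Math.min(max, Math.max(min, requested));
}
```

In the page this would be used as `audio.playbackRate = clampPlaybackRate(userChoice)`. How natural the sped-up voice sounds depends on the browser's pitch handling, so it would need testing on the mobile browsers mentioned earlier in the thread.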
Very cool. Now our eyes can get some rest by letting this app read the top posts.
Can we also get to hear the top 2 or 3 comments from the corresponding discussion page?
I think early on I investigated getting comments for posts but couldn't do it reliably or "say it" in a consistent way that made sense. May add it as a future feature.
I'm a co-founder of the startup that built SoundGecko. But more specifically I had a bigger part in building this "feature" for Hacker News since it's something I actually wanted to use personally.
Really nice and useful work, longzheng! If you like these kinds of services and want to be able to convert any RSS feed (not just HN) into a TTS-spoken podcast, check out the somewhat similar site podcastomatic.com. I use it to kill my daily commute by listening to TTS-spoken blogs :)
longzheng, is there any way to be able to change articles? I'm on ff19, and clicking on a new article just continues the original article. Hovering over the left side of the article shows the "play" symbol, but clicking anywhere there just continues the original article. Not sure if this is a bug or not.
That's weird. It should definitely switch to the new article if you click the picture/play icon or the title. As a workaround, could you try "up/down" arrows? It's possible the HTML5 audio player in FF19 is buggy.
Yeah, doesn't work on FF19 even with the arrow keys. But it did work on Chrome Canary - must be a bug on FF.
edit: It looks like FF19 is complaining about the content-type: "audio/mpeg" when changing articles. Not sure what one it is using, but it does manage to play the first article only.