There are a lot of good ideas and sentiments in this article. And then at the end I find out the trail-blazing product he left academia for is "a newsfeed based on your interests."
How many startups like this are there now? 300? That is probably a low estimate. I'm sure that everyone living in the Valley personally knows at least one founder working on the exact same product. And none of them are better than my Hacker News/ Facebook/ Reddit/ RSS feed combo.
I've heard some of the many Prismatic competitors describe themselves as "a Pandora for news" which is an apt description, since rdio, soundcloud, and Spotify are better than Pandora. Self, social and community sourcing for digital entertainment tends to be better than algorithms. While I believe the "problem" domain is severely overworked, if you're going to stick with it then my bet is that social algorithms like collaborative filtering a la Netflix do a better job than topic modeling.
On the optimistic side, a good company is more than a single product. Maybe the author's considerable expertise and experience from building Prismatic will lead to something cool down the road.
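For readers unfamiliar with the collaborative filtering mentioned above, here is a minimal sketch of the user-based variant: score an unseen article for a reader as a similarity-weighted average of other readers' ratings. The `ratings` data and the `cosine` and `predict` helpers are made up for illustration; this is not how Netflix or Prismatic actually does it.

```python
import math

# Hypothetical reader -> {article: rating} data, for illustration only.
ratings = {
    "ann": {"a1": 5, "a2": 1, "a3": 4},
    "ben": {"a1": 4, "a2": 1, "a3": 5, "a4": 5},
    "cat": {"a1": 1, "a2": 5, "a4": 1},
}


def cosine(u, v):
    """Cosine similarity computed over the articles both readers rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def predict(user, item):
    """Similarity-weighted average of other readers' ratings for item."""
    num = den = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        sim = cosine(ratings[user], their_ratings)
        num += sim * their_ratings[item]
        den += sim
    return num / den if den else None


# ann has not rated a4; ben (similar taste) liked it, cat (dissimilar) did not.
print(predict("ann", "a4"))
```

The prediction leans toward ben's rating because ann's history matches his far more closely than cat's. Topic modeling, by contrast, would look at the article text rather than at who liked what.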
As a prismatic user, I think the product he's built is pretty far ahead of any competitors I've seen.
I don't think the fact that others have tried with limited success is an indictment of prismatic at all. If anything, that strengthens the argument that this project required a lot of NLP skill.
When I think about what I want the future to look like and what's missing now, a smarter news aggregator isn't on the top 10 list. I have no problem wasting infinite amounts of time with existing entertainment technologies. At the same time, thousands of engineers clearly disagree with me as they are risking their livelihoods and investing their talents in that field.
I do find the whole field boring. Making a news aggregator that's X% better than existing ones doesn't improve the world a whole lot, or even offer a compelling consumer value proposition.
I don't want to sound like I'm against all consumer entertainment tech. It's clearly something that people enjoy, and it's exciting for some people to make. But this particular application is one whose value I am very skeptical of.
Completely agree that a news aggregator isn't one of the world's 10 most pressing needs. It isn't in the top 100 either.
But Prismatic has told me about articles that make me better at what I do. Depending how widespread that experience is, they may be having a larger impact on our 10 most pressing needs than most teams that attack those needs directly.
Thank you for the kind words! Indeed, the NLP and ML behind Prismatic are pretty intricate. Not just because they use something mathematically sophisticated, but because you have to make a lot of good decisions about what can be tackled with a simple approach and when to spend a month thinking about a harder problem.
My feelings exactly. It's made worse by the fact that, out of all the academic departments out there, AI and NLP research has perhaps the greatest potential to transform society in so many ways.
He leaves the field trying to create intelligent machines in order to write a Google News clone? I truly don't get it.
To me, it feels like this person's talent is wasted on this project. With such a strong background in natural language processing, why isn't this person involved in a Siri competitor, in Web real time chat, VOIP compression, or AI speech? Is this the limited scope at which a typical CS Ph.D operates?
Reach for the stars. Let people of lesser ability make news apps, even if it's a struggle for them.
A really good news aggregation/management site would help me out a lot more than a marginally better Siri. I waste a lot of time on the news and miss a lot of stuff.
Agree with pseut. I would need a Siri smart enough to aggregate and manage news for me. We need someone to build that great aggregator, and then we'll talk about Siri integrating that.
Sure, but your original point seemed to be that any small degree of progress on a Siri variant would be more important than any large degree of progress on a news aggregator. My only point is that the magnitude of the progress matters, and a big change in the "smaller" problem of news aggregation would free up a lot more of my time than a small change in the "bigger" problem. As much as I'd like to be able to dictate research papers and code up a bunch of data analysis over my phone via Siri++ while I'm on a long run on the beach, at some point you have to trust that people choose to work where they think they can make the most progress.
I appreciate your passion but before you build a tower to space, you should probably make sure you're not building on shifting sands first. It's a telling arrogance you display when you say "let those other people struggle for their small gains, I have bigger fish." As if they would never be a customer down the line or that maybe you'll "get out ahead of them" and own the space... I'm not sure what your motivation is for this perspective, anyway.
The author already addresses your view though, since he shared it. We already have been using PhDs like this for some time now. I think you just don't understand that communication hubs like reddit and HN are not "a waste of talent" especially when you consider portals like these get sold to Condé Nast for millions x 10^x of dollars.
On the contrary, I think "reaching for the stars" here is taking ownership of the application instead and understanding the real burden of operations and the impact to your clients and what the real net benefit is (if any) in your application, not the number of 0s on your check. It's not sexy, but nothing complicated is. You're not going to do anything groundbreaking implementing what you were taught in school in the tiny scope of your contract, you're just creating tiny bubbles of technology that may or may not ever offer a benefit to anyone outside that space; contracting yourself out to another Fortune 500 to design a protocol or format to be used in some rented-out walled garden. It might be a great model for retiring, but not so much for a society, and it certainly isn't "reaching for the stars" unless you want to be alone when you get out there.
We can waste an incredible amount of time making sure the foundation is secure. Or we can plan for the shifting sands, like Google building commodity-hardware-failure into their business plans. Larry and Sergey didn't try to make computing hardware more reliable; they planned for it to fail.
To the contrary, many complex things are sexy. The iPhone is greatly complex, both in software and hardware, but it's still sexy. By the way, what if Steve Jobs and Steve Wozniak were happy just running a small electronics store in Palo Alto?
Interesting to see your opinion of "reaching for the stars". Also interesting to see you write 2x as much about what "isn't reaching for the stars".
Ok, another perspective: why would a PhD waste his time on trivial conveniences like web chat and voice recognition just because they are hard to solve, when there are unmet needs in a much larger market called media?
Or how about this: the key to solving problems correctly is to solve them as close to the root cause as possible. Why write a speech recognition when you can get the same functionality much more effectively by eliminating the microphone altogether and interfacing directly with the user (just a hypothetical scenario)? Why write it for iOS when the only reason the person is carrying a phone is so they can be connected on the go, and the only reason they are on the go is to learn something they could have read on Prismatic this morning? You'll never tackle any of these problems if you think that the lower layers are already solved problems cemented in concrete, only worthy to be maintained by tech monkeys.
Why do we make software and hardware? Why do people farm? Why do we do anything?
It's about advancing our society. Making more efficient use of our human resources to enable us to discover new ways of making more efficient use of our human resources. To raise quality of life. To allow more humans to live, which increases our pool of labor and of brainpower, both of which we can use to multiply the other further.
Why do we do this? At this point we're descending into existential ennui. When you look into the abyss, remember the staring game -- make it blink.
I'm all for exploration and self discovery or whatnot, but what we're talking about here is effective use of resources. My perspective of your opinion is you feel that PhDs should keep fitting into the spots where they are requested by big corporations, to develop their products that require high technology and expertise to be done quickly and cheaply, because everything else is just a solved problem that an expert would be wasting his time on.
My counter is that these things that we take for granted are not "solved problems" and innovation in this space is the most fundamental and disrupting innovation one can engage in, and increasingly the press would turn the public's eye away from this fact since obviously those who control the aggregation of news can control what a large number of people think about different topics. But perhaps you already knew that and find change at this level and its consequences too scary to contemplate.
Prismatic has a much larger vision of bringing together elements from my AI and machine learning research background along with modern interaction design to build smart everyday consumer products.
The current app is just the first step in a longer roadmap of building smart products. Currently, we're great at discovering new articles, but in the future will include discovering relevant apps, music, movies, and local events. Don’t be too surprised if we’re on your TV soon.
Keep in mind that if it happened that you had an AI that could quickly scope people's desires out, the application you would want to be selling would be:
"The 'it gives you what you want' thing"
Rather than any kind of fancy description. It wouldn't be the description that sold customers, it would be the fact that it really, actually gave them what they want effortlessly that would sell them. And of course, news gives you a lot of information that is easy to parse. So from the "I have a program that's unique in its language comprehension" perspective, this makes perfect sense. If the product was more specialized, it quite possibly would not have as much scope to demonstrate that it could choose for you.
Of course, whether AI could possibly work for this is another question but I can understand why he'd want to try this route.
To be fair, PG is the one advocating that engineers satisfy their immediate, and often extraordinarily boring, needs.
There are 300 startups that are funded doing newsfeeds, even while those startups might be a tiny minority of exciting ideas.
On the other hand, the role of business is not to realize and finance science fiction. Become a writer (or perhaps a Researcher) rather than a programmer if you genuinely have a great imagination.
Great post. As someone who works in a different field of science, one bit got me thinking:
>Like any academic community, the work within NLP had become largely an internal dialogue about approaches to problems the community had itself reified into importance.
I would argue that this is an issue of goals. If you're motivated by the application of research results to solve practical problems (as the author is), then this is a valid criticism. But for me science is also self evidently valuable — understanding the principles that govern the universe is a noble goal irrespective of finding opportunities to apply them.
Perhaps the field plays a role as well. In all sciences one's goal is to discover the rules and facts that govern a particular system. In the natural sciences, this system happens to be the world in which we live, whereas in the formal sciences (math / CS), the system is often an artificial one of human construction. In the natural sciences, the importance of the rules you discover is self evident (they govern our own lives and capabilities!), whereas in the formal sciences, it's more necessary to justify the importance of your discoveries (with, say, practical applications) because the importance of the system you're studying isn't as self-evident.
This isn't to say that the natural sciences are in some way superior; just a speculation about the attitudes/motivations of academics in different fields.
> In the natural sciences, this system happens to be the world in which we live
Not to say I disagree with your post, but isn't it more accurate to say that the natural sciences today explore models for the world in which we live? The models tend to be fairly obvious for things on the human scale, but when you talk about systems on the atomic scale, or on the cosmic scale, the immediacy of the models tends to break down. The result is that, just as for the abstract sciences (math, CS, etc), the usefulness of the models has to be justified. So, I feel that the line between purely abstract sciences and "natural sciences" is not quite as fine and well-defined as your post makes it to be.
> isn't it more accurate to say that the natural sciences today explore models for the world in which we live?
This is entirely true, and there are absolutely areas of natural science where the "right models" are pretty unclear, such as theoretical physics, as you mention. In this sense my distinction is somewhat fuzzy.
I am likely biased to perceive a large gulf because I work in molecular genetics, a field in which there is a lot we are certain about; much of what we do is filling in holes and fleshing out the details of overarching models that are known to be largely correct.
I really like knowing that there are people dedicated to thinking deep thoughts. There was a recent potential proof published on a tough math problem, and apparently the author had gone dark for 15 years creating a fantastic mathematical universe and vocabulary all on his own. That's worthwhile. Maybe not for everyone to go off doing it, but definitely for some.
Great article, and agreed on most everything you said. Regarding Prismatic itself, some constructive feedback (feel free to ignore).
My immediate first impression was that this was another app that would waste my time. I want less of those, not more. The absolute best news feed app I've used is Flipboard for the iPad, and even then I don't use it too much as I feel too much like I'm wasting time with it (like HN!).
Secondarily, the homepage doesn't grab you enough. The text in the graphics is too blurry and the pictures are generic. The homepage below the icons doesn't have the production values to explain why it's cool.
I'm not trying to be negative here. Some of the points in your article really hit home (email summarization would be awesome and paradigm shifting). So I wonder if you might focus more on making something that will save people time and solve a pain point vs. "another web-based time-wasting thing" (that may not be fair, but that was a first impression).
For example, can you scrape an inbox and list of Facebook/Github/Tumblr/RSS/Twitter feeds to get a single high-sensitivity greatest hits from all monitored news feeds? This way you can check a single page in five minutes on your phone and feel reasonably content that you saw the top headlines for that week. Kind of like news.ycombinator.com/best.
> For example, can you scrape an inbox and list of Facebook/Github/Tumblr/RSS/Twitter feeds to get a single high-sensitivity greatest hits from all monitored news feeds
You've missed the point already. The web is vast, on the whole filled with 99.99% crap. But .01% of something large is still very big. You cannot list the interesting things a priori with a few feeds. Prismatic is about learning about what you actually like to read and giving it to you. It's got the smooth/sleek usability of Google Reader (read title, keep smashing J when something isn't worth reading, essentially letting you skim/reject more than 10 uninteresting articles/minute), but without the low recall "enumerate every site I think I am interested in" subscription model.
Ok, I might well be missing the point. But enough people upvoted that I think this is a common perception: "Oh, not another time-waster". So, maybe make the homepage address this issue head on with a good video that shows why this is compelling, awesome, and productivity-increasing.
I do think it's mostly a time filler like say, hackernews or a good subreddit. Maybe you'll find some article that will increase your productivity, but that's definitely not the focus. It's just about giving you a stream of articles that you'll probably enjoy reading.
j/k: Jump to next/previous article
up/down: Scroll to next/previous article with animation
space: Scroll to next article
s: open share box for active article
o: open active article in new tab
b: bookmark active article in new tab
f: Go to search field and find new interests.
r: to recommend active story
gh: go to home feed
gd: go to your feed of saved stories
gr: go to your feed of recommended stories
gg: go to top news 'global' feed
Another comment on Prismatic (it took me a few days to get around trying it): Hacker News and Reddit are good for finding interesting new information, and they combine it with a pretty good discussion that usually attracts experts. That combination is really fantastic, and even if Prismatic improves the information-finding, can it also link me to an expert discussion of the article?
Hi Aria. I have a question regarding the use of automatic sentiment analysis and its relationship to choice. It seems to me that there is a fundamental divergence between the program of "give me what I will enjoy" and "give me what I need". It seems that your startup is focused entirely on the former question, and there's no doubt that if you are successful you will indeed be consuming more attention, which is of course the primary commodity in the attention economy.
But wouldn't it be better to use NLP to emancipate people from the attention economy altogether, to help identify not what people want, but what they need to be happier, and more engaged in the real world doing real things for real people?
>But wouldn't it be better to use NLP to emancipate people from the attention economy altogether
I work on ads recommendations, and yes, it would be "better" if I stop recommending KFC to the 36% of obese Americans & instead recommend Crossfit or Yoga or Gold Gym or Myoplex. If I do that, they will quit the site en-masse and the negative engagement rates will go through the roof, I'll be out of a job, will end up eating at McDonalds to save money, and then become obese, and then click on the same KFC coupon recommended by another data scientist hired to replace me :)
Yes, it would be "better" if people read the Economist instead of Cosmo, or went to grad school instead of building crud web apps, or watched Frontline PBS instead of Jerry Springer, but then, it would also be much "better" for the environment if we got rid of the 1000s of walmarts & safeways selling groceries & Uncle Sam shipped you healthy lettuce and spinach weekly via USPS....communism 101 :)
Sorry you got downvoted - wasn't me, I swear (it actually annoys me that people use downvoting to mean "I disagree".)
The problem with your comment is that you present a false choice.
I believe that there is strong - very strong - demand for real, lasting happiness. Indeed I think that is the strongest demand there is. NLP and technologies like it can be at the core of the tools that help people choose who they want to be and help (not coerce) them to act consistently with those choices.
I cannot reason with somebody whose main thesis is "I believe" and "I think". The real world actually shows the opposite of your beliefs. The obese guy does click consistently on KFC coupons recommended via NLP, and does dismiss ads of fitness products. It's really his choice. Maybe he has found "real, lasting happiness" eating the KFC :) I mean, who am I, a mere data scientist, to decide what leads to "real, lasting happiness" for the masses? Even saints & philosophers can't decide upon that. So I do what NLP is best for - if the obese guy wants KFC I give him that.
Am not being flippant...I actually share your concerns very much, just that data mining in the real world has made me uber-cynical for the future of mankind.
Ah, but you miss the crux of my point: people act inconsistently with their own choices! You're right that it's not up to me to decide what's good for people, but what gets me is when people decide for themselves what is good, and then find themselves unable to do it.
How does prismatic actually use NLP for finding articles you like? Does it use sentiment analysis of reviews from people it thinks are similar to you?
Btw I think "subreddits on steroids" is an incredible testimonial catchphrase - it would have been enough for me to buy your app, but I don't own a smartphone.
I recently answered a Quora question about this that has many more details http://qr.ae/1viyQ.
The core NLP behind Prismatic is topic modeling; we don't use anything off the shelf, but something crafted pretty specifically to our needs.
We do model user similarity based upon interest overlap and social graph analysis. So we know how likely you are to care about what someone in your extended network shares.
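To make "similarity based upon interest overlap" concrete, here is a minimal sketch using Jaccard similarity over sets of interests. This is an assumption for illustration only: the comment above does not say which measure Prismatic uses, and the users, interests, and the `jaccard` and `most_similar` helpers are all hypothetical.

```python
def jaccard(a, b):
    """Fraction of interests two users share: |A & B| / |A | B|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical users and interests, for illustration only.
interests = {
    "alice": {"nlp", "machine learning", "clojure"},
    "bob": {"nlp", "machine learning", "economics"},
    "carol": {"cooking", "travel"},
}


def most_similar(user):
    """Other users ranked by interest overlap with the given user."""
    scored = [
        (other, jaccard(interests[user], theirs))
        for other, theirs in interests.items()
        if other != user
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


print(most_similar("alice"))
```

A real system would presumably combine a score like this with the social graph signal mentioned above; that part is not sketched here.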
Do you plan to publish the algorithm/model at some point, or is it too closely connected to the business's "secret sauce"?
The latter would be perfectly understandable, so not meant to be a hostile question, more curiosity out of personal interest of how you handle that aspect of private-sector research. Worries about that are part of what keeps me personally from jumping from academia. I'd like to do less paper-writing and more applied work, which fits nicely, but I'd still want to do some paper-writing and be able to discuss techniques publicly, which seems to fit more awkwardly. I assume it's possible to pull off both, but all the people I know who've left academia for startups have stopped publishing completely, and many of them doing ML/big-data stuff are quite secretive about their techniques.
We will publish on interesting aspects of our models. Frankly, our main reason for not doing so is time and resources.
As for industry research on the whole and publishing: it is true that you can't publish everything, but look at Google. Arguably some of the most influential systems papers of the last decade or so have come out of Google. Google doesn't publish all the details of their search algorithms, but it turns out that because they address real-world problems they've done enough great stuff that some of it can safely be shared without endangering their moat.
Another aspect to consider is that while industry publishes less, we do tend to churn out useful open-source code (Prismatic will definitely be doing that soon) that is of at least comparable utility to people out there as most papers.
Hard to say: systems has a long lag time before good ideas get "proven" good by being successes in the marketplace. For example, paravirtualization was investigated from around 2000, and then Cambridge released Xen in 2003. It caught on by the end of the decade, in the late 2000s. If something released in 2010 will end up having similar impact, we'll know it by the late 2010s...
Hi. I'm trying to use natural language generation (NLG) software, and found lots of long-dead projects, two large Lisp-based packages (FUF/SURGE, KPML) that have not been updated in years and are not user friendly at all, and a simple Java package (SimpleNLG) that is indeed simple but not very sophisticated. Is this representative of the NLP/NLG software?
Controlled languages are interesting for natural language generation purposes - for example, GF (http://www.grammaticalframework.org/) may be useful for building sentences (in multiple languages, if needed) from an abstract data structure.
- Do you think you would have the opportunity to avail yourself of the fruits of "academia" if the current approach was fundamentally broken?
- What would happen if viability of academic research was tied to one being a successful business person? (Asking in light of your comment elsewhere in this page re. "tenure system".)
Loved your post. Very articulate and nicely written.
I've been amidst the academia -> industrial fun transition myself this past year, absolutely best decision i've made in the past half decade.
One point you only lightly touched on, but which I think is worth restating, is this: in academia there's this frustrating sense of "if you're not narrow, you can't be deep", whereas I see a lot of the more sophisticated bits of industry really valuing folks who strive for broad depth.
Would you agree with that assessment?
(admission: i'm presently having a go at building some tech products/tools that tie into a whole range of my researchy interests)
Really interesting article, especially since I'm an assistant prof idly considering the same sort of move. How much of a "business model" did you have before commitment -- did you have enough connections that you were confident you'd get funding once you had the idea mapped out in detail? Was UMass still a viable safety net, or were you fully committed?
A really fantastic article overall. I was never personally at risk of academia, so I don't resonate with that part of the article. But I particularly liked these two sentences for their practical insight:
that process of taking qualitative ideas and struggling to represent them computationally is the core of artificial intelligence (AI).
and
that path from research to product rarely works, and when it does it's because a company is built with research at its core
Thanks! It seems like an obvious thing, but thinking about "operationalizing" intuitions computationally really helped me hone in on what I should be focusing on with research. My advisor Dan Klein was really awesome at teaching that.
Awesome post. I -- along with many of my colleagues from grad school -- have had pretty similar experiences with academia (minus the outlook for a promising academic career ;) ). This sort of thing (very smart people abandoning academia) is going to continue to happen unless academia figures out a way to make itself more relevant.
academic |ˌakəˈdemik|
adjective
2 not of practical relevance; of only theoretical interest : the debate has been largely academic.
My experience in academia thus far: for every five smart people abandoning academia for industry, one truly brilliant person remains behind. That's all that is required for the model to sustain. (And even then, there are not enough academic jobs to go around.)
Well, I think the number one thing that would correct the balance between academia and industry is if people could freely go from one to the other. Someone could take insights from one arena to the other, enriching both.
There are many reasons why this is difficult, but the primary obstacle is the tenure system.
In the UK it is possible to switch between academia and industry, so long as you are satisfied with fixed term contracts in academia.
Recruiters tend to get confused when someone turns up at their doorstep with a PhD ("which marks did you get?"), and academics tend to not have a clue of what's the benefit of industrial experience (in spite of being effectively "agile" product managers themselves when participating in, say, a European Framework project).
Agree with the author that academia sometimes focusses on a very narrow band of research topics, which might or might not improve end user experience. Also, agree with the author that personalized news has very interesting NLP+Machine Learning problems.
However, I am not convinced that personalized news is what users want, or whether users would rather discover popular news and articles by serendipity, socially, than through personalization by algorithms. Further, I think personalized news has very limited opportunity for generating significant revenue.
I faced a similar problem in academia, but it was in economics. I started my undergrad career as a computer science major, later ending up in economics, and I went on for my PhD in economics after finishing my BA. For those that aren't familiar, a PhD in economics is a lot like a PhD in mathematics. If you don't already respect economists' mathematical ability, you really should, but that's not my point here because math is just a tool, and contains no absolute truths for the social scientist.
In an effort to combine solid theoretical research with well built empirical models, I learned how to program Python to scrape data from web sites. Since nobody around me knew anything about computer science, I found myself having to teach myself everything when it came to "how would I write a program that collects X things from Y products in Z Internet markets". In the beginning, when I was learning to use Python libraries for doing HTTP requests (eventually converging to using 'requests'), how to parse that HTML (started with BeautifulSoup and then converged to 'lxml'), and how to aggregate and analyze that data (eventually converging to 'pandas'), I had to spend a lot of time learning the ins and outs of Python. To this day, I still think it was a great choice because Python has changed the way I approach any kind of business question that could be answered with a well defined empirical model.
As a research assistant, I would spend 12 hours writing scripts to automate the collection and analysis of data for some project I discussed with a professor. The day after writing that script, I would go and talk to the professor and want to discuss some of the computational issues with the script (say, encoding issues, or even the use of computers on EC2 to help collect massive amounts of data every day), but the economics professors would not care at all to hear about any of that. "You need to be studying the economics, not the computer science" they'd tell me. This would infuriate me, because in my mind, if you wanted to be able to ask all these interesting questions that rely on the data, then you need to spend time to make sure you've done the computation correctly and in a way that is reliable. Maybe I dug my own grave by showing my excitement as I began to become more fluent with Python, and thus more confident in my abilities.
Nonetheless, I eventually decided to leave my PhD program because my interests in topics that lie at the intersection of computer science and economics were a bad fit for my program. It was so disheartening to me because I truly loved economics, and was confident that what I intended to do with my PhD research was going to be unique and lay a framework for the field of industrial organization. However, the experience I received in my program was the lack of an adviser that could truly help me achieve what I wanted to do, and a regular negative response to anything I'd bring up that was out of the realm of what my professors were used to talking to graduate students about.
Overall, as my tone may signal, graduate school has left me very, very bitter.
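For anyone curious what the "collect X things from Y products" workflow described above looks like in miniature, here is a hedged sketch of the parse-then-aggregate steps. The comment's actual stack was `requests`, `lxml`, and `pandas`; to keep this runnable without installs it swaps in the standard library's `html.parser` and `statistics` instead, and parses an inline string rather than fetching a page. The HTML, class names, and the `ProductParser` and `scrape` names are made up for illustration.

```python
from html.parser import HTMLParser
from statistics import mean

# Hypothetical product listing; a real script would fetch this over HTTP.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">App A</span><span class="price">0.99</span></li>
  <li class="product"><span class="name">App B</span><span class="price">2.99</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects one dict per <li class="product"> element."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # name of the span currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "product":
            self.rows.append({})
        elif tag == "span" and css_class in ("name", "price"):
            self._field = css_class

    def handle_data(self, data):
        if self._field and self.rows:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape(html):
    """Parse the listing into a list of {'name': str, 'price': float}."""
    parser = ProductParser()
    parser.feed(html)
    return [{"name": r["name"], "price": float(r["price"])} for r in parser.rows]


rows = scrape(SAMPLE_HTML)
average_price = mean(r["price"] for r in rows)
print(rows, average_price)
```

The encoding and reliability issues mentioned above tend to show up in exactly these steps, for example when a price field is missing or malformed, which is why this kind of plumbing deserves more attention than it usually gets.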
I've talked with many friends in a similar situation as yours (econ phd) and they all faced the same issues. Some advisors don't even care if you are coding in Python or in stone tables, as long as you "have three stars in your results table". Your comments about the quality of the computations also reminded me a lot about Ken Judd's (often ignored) arguments about paying attention to all these implementation details.
Good luck with your current goals, and I really hope you are more successful than within the limitations and issues of academia.
PS: Just out of curiosity, what were you interested in, regarding your IO research?
>>> Some advisors don't even care if you are coding in Python or in stone tables, as long as you "have three stars in your results table".
That is beautiful. It perfectly describes the situation!
I still to this day have scripts that collect data on a variety of markets, but I tend to favor applications to platforms and multi-sided markets. My main one was mobile apps and app stores, but I also am researching video games, hotels, desktop CPUs, to name a few. The questions that interest me most are antitrust problems and finding creative ways to measure competition in an Internet market. I also enjoy IO theory, with my absolute favorite topic being low-price guarantees. To this day, I'm still digging for a creative way to do an empirical study on low-price guarantees to test a theoretical hypothesis I wrote a paper about in an IO theory course I took in the 1st semester of my PhD (was the only 1st year student to take a field course in year 1).
In the first semester of the 2nd year of my PhD (I left after year 2), I got my solo work into a conference held jointly by Harvard and MIT called the "First Cambridge Area Economics and Computation Day", but my advisors didn't seem to care much. The conference was awesome and my work got a lot of attention there, including a long talk with the Chief Economist at Microsoft.
Trying to understand how firms compete in the Android store actually sounds like a great topic. Of course, we can only observe--at most--prices and sales per product, but it would be nice to think about how firms compete with each other and how they react. My guess is that the biggest problem would be finding an identification strategy, like a change in the structure of the app store, so we can exogenize our regressors.
BTW, great job with AppNash, it looks very interesting!
If you want to talk more about the econometric approach, feel free to email me. Would love to chat more. If you discovered AppNash, then you can also figure out my email :). FWIW, your point about finding exogenous changes is right on (model = 2SLS); however, getting the quantity sold data is not a simple procedure -- it's a matter of converting sales ranks to quantities. Without giving away all the secrets in public, here's a classic paper that guided some of my approach: http://www.business.illinois.edu/finance/papers/2003/chevali...
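For readers wondering what "converting sales ranks to quantities" can look like in practice, here is a minimal sketch. It assumes a power-law (Pareto) relation between rank and sales, which is a common starting point in this literature. It is not the commenter's actual method. The function names, the `scale` and `shape` parameters, and the calibration numbers are all hypothetical.

```python
import math


def rank_to_quantity(rank, scale, shape):
    """Implied quantity for a sales rank, assuming quantity = scale * rank ** (-1 / shape).

    Both `scale` and `shape` are assumptions that must be calibrated on
    products whose rank AND true sales are observed.
    """
    if rank < 1:
        raise ValueError("rank must be >= 1")
    return scale * rank ** (-1.0 / shape)


def fit_power_law(ranks, quantities):
    """Fit log(quantity) = a + b * log(rank) by OLS and return (scale, shape)."""
    xs = [math.log(r) for r in ranks]
    ys = [math.log(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    # slope = -1 / shape, intercept = log(scale)
    return math.exp(intercept), -1.0 / slope


# Hypothetical calibration points: (rank, observed units sold).
calibration_ranks = [1, 10, 100, 1000]
calibration_sales = [rank_to_quantity(r, scale=1000.0, shape=1.2) for r in calibration_ranks]

scale_hat, shape_hat = fit_power_law(calibration_ranks, calibration_sales)
implied_sales_at_rank_50 = rank_to_quantity(50, scale_hat, shape_hat)
```

The hard part in real data is the calibration step: you need some products where both rank and actual sales are known, and the fitted parameters may drift over time or differ by category.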
The original intent for my dissertation was the following: Chapter 1: Describe the theoretical framework of how to set up a system to automate the collection of data from an entire market (enabling the safe use of population estimators); Chapter 2: Apply that theory to analyze the app store data to test various firm- and product-level questions about the structure of strategies taken by firms (developers) in the app stores; Chapter 3: Another application of the theory using higher frequency data from the Internet (probably hotels, since hotels change their rates at the minutely/hourly level quite often, and regional hotel markets provide very well defined markets so new entry takes a while and is thus easy to account for).
I think part of your problem was that PhD economics programs are so narrow, and this is part of the problem with economics generally.
It has ignored too much of what is occurring outside its discipline, and turned in on itself and neat models that do not actually describe how the world works.
I don't want to get into a "what is wrong with economics as a discipline" argument, because I have far too much to say there. But I will say that between micro, macro and econometrics, macro is the farthest from reality by far (my program taught pure "Minnesota macro" or "freshwater macro", e.g. DSGE is god). Econometrics varies by program, but in my program it was 100% theory, and that was the problem. The idea of spending an entire year on econometrics in 2011 and never touching a single piece of real data truly pains me. I learned a lot that year, but it was so hard to find relevance -- that's why I tried on my own, creating my own large datasets with Python.
However, micro 1 (consumer/producer theory) and micro 2 (game theory) were awesome and are relevant. In game theory, we talked about everything from spying and wars to faking orgasms, cancer cells, penalty kicks, and so on.
Yeah I was thinking of macro in my comments. I'm currently doing an MPP in economic policy (I'm a policy public servant). The great thing about the course is that it's multidisciplinary - we do look at economics, but it keeps coming back to how it is applied in policy.
Game theory was my favorite course in college. I really think it should be offered at more schools to wider audiences; I had to take it as an economics minor elective during summer classes.
I found it funny that your background was the opposite of mine. I was in love with Economics in high school and undergrad. But when I took my first programming course in undergrad, I slowly shifted away from a B.A. to B.Sc. Btw, my primary attraction to CS was that I could create things in software. In grad school, I tried to merge my love for econ with CS .. it made for good CS work but no contributions to Economics. Funny how life turns out :) Best of luck to you!
What exactly do people get out of all these news aggregators? I'd imagine the front page of reddit, or at least a localised version, is more interesting than a personalised version.
When does it go from news to infotainment?
And if that's not the case, why would someone think something beyond simple Facebook like or Twitter comment extractions is a big deal? I've seen lots of papers where representing a web page and a user in a vector space based on TF-IDF performs well.
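To make the TF-IDF idea concrete, here is a minimal sketch of putting pages and a user in the same vector space and ranking by cosine similarity. It assumes pages are already tokenized and that the user profile is simply the average of the pages they liked. The toy pages and every function name are made up for illustration, and this is not a claim about how any particular aggregator works.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_counts = Counter(doc)
        vectors.append(
            {
                term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                for term, count in term_counts.items()
            }
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def user_profile(liked_vectors):
    """Average the vectors of the pages a user liked (one simple choice among many)."""
    profile = Counter()
    for vec in liked_vectors:
        for term, weight in vec.items():
            profile[term] += weight / len(liked_vectors)
    return dict(profile)


# Toy, hypothetical pages (already tokenized).
pages = [
    ["python", "data", "pandas"],
    ["python", "web", "flask"],
    ["football", "penalty", "kicks"],
    ["data", "python", "numpy"],
]
vectors = tf_idf_vectors(pages)
profile = user_profile([vectors[0]])  # the user liked page 0

# Rank the remaining pages by similarity to the user's profile.
ranking = sorted(range(1, len(pages)), key=lambda i: cosine(profile, vectors[i]), reverse=True)
```

Here the user who liked the pandas page gets the numpy page first and the football page last, purely from term overlap.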
I've even seen Yahoo research post slides about using some form of random bucket for news recommendation because the novelty of newer / stranger articles improves ad click-throughs.
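One way to read the "random bucket" idea is as simple exploration: most slots go to the top-scored articles, but each slot has some chance of being filled at random. The sketch below is an assumption about what such a scheme could look like, not a description of what the Yahoo slides actually did. The `recommend` function, `explore_rate`, and the article scores are hypothetical.

```python
import random


def recommend(scores, k, explore_rate, rng=None):
    """Pick k distinct articles from `scores` (dict: article -> score).

    Each slot is filled with a uniformly random remaining article with
    probability `explore_rate`, otherwise with the best remaining one.
    """
    rng = rng or random.Random()
    pool = sorted(scores, key=scores.get, reverse=True)  # best first
    picks = []
    while pool and len(picks) < k:
        if rng.random() < explore_rate:
            choice = rng.choice(pool)  # exploration slot
        else:
            choice = pool[0]  # exploitation slot: best remaining article
        pool.remove(choice)  # removal keeps the remaining pool in score order
        picks.append(choice)
    return picks


# Hypothetical relevance scores for five articles.
article_scores = {"a": 0.9, "b": 0.7, "c": 0.4, "d": 0.2, "e": 0.1}

pure_ranking = recommend(article_scores, k=3, explore_rate=0.0)
with_exploration = recommend(article_scores, k=3, explore_rate=0.2, rng=random.Random(0))
```

With `explore_rate=0.0` this reduces to plain top-k. Raising it trades some relevance for novelty, which is the trade-off the comment describes.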
Then again, did this article come out at the same time as the press release about funding for their company? If that's the case, is this just a wide-scale PR initiative?
I signed up to see how accurate it could be - the interests outlined in the e-mail I received seem pretty accurate but the stories that are being displayed have nothing to do with those topics. There was a dearth of interesting reading material.
Having spent several years doing academic research before finally leaving to work on real-world problems in machine learning and NLP, I can relate to this article.
If any of you have problems in this domain, I am interested to chat.
Building Prismatic is about way more than just having a formal ML background. It's that along with large-scale systems skills and having a sixth sense for working with data (text especially) and knowing what simple ideas will work and what needs to be complicated.
So to answer your question: yes, you need a formal ML background, but you need a lot else. Luckily, you can pick up all these skills from online courses, real-world building, and a lot of self-study and improvement.