The irony is that this is still going on at Google, and presumably everywhere else in the industry as well. Some new strong-willed engineer joins, takes a look at the code, declares "This is amateurish. We should be doing so much better," and then gets a whole bunch of people to rewrite it, usually in a different language, with a different style of coding.
I'll credit this with a major shift in my thinking about what constitutes "good" programming. From the outside, I looked at Google and thought "wow, they're accomplishing amazing things, they must have amazing engineering practices". And yeah, things are pretty rigorous...but what I found was that the people who were, by and large, responsible for all those amazing things didn't care. Most of them had fairly loose preferences for favorite programming languages and favorite development methodologies, and basically ignored the fad du jour. What they did have was an obsessive focus on the user, and on getting things done so they could move on to get other things done. Navel-gazing about what language was best or whether we should be using OOP or how stupid the previous engineers were was generally reserved for the B-players. The As were thinking about how we could return results as you type, or how we could process real-time microblog feeds, or how we could expose potentially groundbreaking new features without losing millions of dollars from UI tweaks.
Basically, the code quality of the initial version was "good enough", as was the code quality of every subsequent version of Google (except when it wasn't, which is when it got rewritten), and that's all that mattered. As long as you can do useful things for the user, it doesn't matter whether your coworker thinks you're a 1337 hacker.
That's not how I read the article. It sounded like Page's original version didn't work so he hired an RA to fix it. The RA spent more time debugging the (then very immature) Java than he did debugging Page's code so he rewrote it in Python.
Once they started getting traffic, the Python script that the RA had written couldn't stand up anymore, so they hired engineers to fix it again, who did so by rewriting it in C++.
Page's original choice of Java was probably a faddish one, given that Java was being hyped into the stratosphere by Sun at the time, even though neither the language nor its implementation was ready for prime time. The RA's choice of Python was more reasonable, but Python was a highly esoteric research language at that point and did not yet have the ecosystem around it for writing scalable production code.
> What they did have was an obsessive focus on the user
I'm surprised to read this. Google's products never seem to me to be the result of an obsessive focus on the user. Rather they seem to be the by-product of an elegant engineering concept, with the interface something of an afterthought.
I disagree. Google gained popularity not just for search results, but because their clean search page was a breath of fresh air compared to the busy portal sites of the day. Gmail became popular because its UI was so much more refined than the other email sites of the day. Likewise for Google Calendar.
I think they've taken people's attention seriously from day one.
I started using Google in 2001, after seeing it on the screen of another computer in an Internet Cafe. I was trying to search something, and Altavista was returning shit filled with all kinds of commercials.
Opened Google, typed my search query in that lovely text input, hit Enter, and WOW, the first 10 results were all relevant. And no commercials. I was so impressed that I instantly switched.
Even when they added commercials, they were marked as such, while being clean and non-intrusive (well, that changed in the meantime -- I find their search page a little cluttered with commercials nowadays).
Good UI design is invisible - if your UX designer is doing their job, you will never even think about them. For things like Search, GMail, and Maps, the user's needs definitely come first, and then the elegant engineering is a byproduct as people try to figure out how the hell to implement the product they have in mind.
The notable flops have been the ones that have been driven by an elegant engineering concept; in particular, one of the major differences between Wave and GMail is that Wave seemed to ask "What can we do, and how should we expose that to the user?" while GMail (pre-Labs) largely asked "What should we do, and how can we figure out how to implement that?"
> if your UX designer is doing their job, you will never even think about them.
UX is not something you can stitch on afterwards. Good UX is deeply ingrained in engineering and an omnipresent, pervasive focus of the team. I think that's what parent is referring to in regards to what Google is lacking.
That depends on the product. A 50/50 split between UX and back end is fairly common. But some products need a simple interface and a ridiculously complex back end, e.g. weather maps.
A big part of it is that the page loads FFFFFAST. They never went for clutter. They engineered the pages, so you can do what you want to do quickly. Do not underestimate the importance Google places on page-load times.
They don't consistently fix it if it's slow, though. It takes about four seconds to search for a term you haven't previously searched for in Gmail (which I think requires having 10k+ emails, but I'm not sure), and it has for a long time.
I haven't heard many people express dismay or annoyance with the UX in their software. Certainly their software doesn't "look" great and I think that's what a lot of people are referring to when they talk about Google's terrible UIs.
But the fact is, most of the UIs do what you expect and get out of the way without making a big fuss.
It's not just how it looks. To pick on my favorite tool, Google AdWords: you select a campaign, then you select the keyword tab, then you click on "add keywords", then it asks you which campaign you want to select... Didn't I just tell you that 3 clicks ago? Why do I have a choice again? That's an example of poor UX design.
> The irony is that this is still going on at Google, and presumably everywhere else in the industry as well.
The article describes how Google constantly improved and rewrote their codebase to make it better. I wish this was going on everywhere else in the industry. Unfortunately, in many cases the very first version of some application is deemed perfect, so no one is allowed to replace it, until it's too late.
Love it, I never went to school for cs, founded a company that did very well, and reached a point where I needed to move on. I was the chief geek, so the board wanted to replace me with someone credentialed, and all I heard from him during my transition was what a mess everything was. Not long after leaving they lost what I considered to be the best coders, because it is fun to code something up and deploy it, everything else can be labeled as work too easily.
I was developing an app in Java at the time (1996). The Java motto was "write once, run anywhere." (everywhere?) We used to say "write once, debug everywhere." It was a mess (granted, we were using it to build a client, not a server). A San Jose Mercury News reporter asked Eric Schmidt (who was running the Java show at the time) to respond to my claim that Java was buggy, and Eric Schmidt replied with something along the lines of "If it's crashing, then they must be bad programmers."
Of course I got the last laugh because he went on to become a billionaire. Wait a minute.... maybe he got the last laugh. Either way we're both laughing to this day.
The "debug everywhere" thing typically refers to GUIs and filesystem conventions. If you don't slavishly use File.separatorChar everywhere and let a "/" slip in there, yeah, you'll have bugs on Windows.
For most code, and if you're not going to run on Windows, it's not an issue. It's nothing like dealing with header files.
Maybe it's been long enough since I've written anything in Java for Windows that I've simply forgotten, or maybe I somehow never manipulated paths directly, but I don't recall having to worry about path separators on Windows. I believe I was using \ and / interchangeably, particularly by appending some cross-platform path with /s to the home directory path, which would have \s on Windows (e.g. %AppData%/.company/program/). Could it be something they fixed over time?
Yeah, maybe they do an auto-replace on Windows. I actually haven't been bitten by the bug because I've never written a program intended to run on Windows, ever :) I just happened to notice that File.separatorChar is there and that's what it's for. When it first came out, Apple was using ":", anyone remember that?
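For anyone who wants to see the two approaches from this sub-thread side by side, here is a minimal sketch, not a definitive answer to the "did they fix it" question. The class and method names (PathJoinSketch, joinWithSeparator, joinWithPaths) are made up for illustration, and the ".company/program" directory is just the example from the comment above. The first helper appends File.separatorChar by hand; the second hands the segments to java.nio.file.Paths.get (Java 7+), which picks the separator for the default filesystem.

```java
import java.io.File;
import java.nio.file.Paths;

// Hypothetical illustration class; none of these names come from the thread.
final class PathJoinSketch {

    // Join segments by hand, using the platform's separator character
    // ('/' on UNIX-like systems, '\\' on Windows).
    static String joinWithSeparator(String base, String... parts) {
        StringBuilder sb = new StringBuilder(base);
        for (String part : parts) {
            sb.append(File.separatorChar).append(part);
        }
        return sb.toString();
    }

    // Let the standard library join the segments for the default filesystem.
    static String joinWithPaths(String base, String... parts) {
        return Paths.get(base, parts).toString();
    }

    public static void main(String[] args) {
        // Assumed example layout: <home>/.company/program
        String home = System.getProperty("user.home");
        System.out.println(joinWithSeparator(home, ".company", "program"));
        System.out.println(joinWithPaths(home, ".company", "program"));
    }
}
```

Both helpers should produce the same string on a given platform for simple segments like these. Going through Paths.get means nobody has to remember to type the separator at all, which sidesteps the "let a / slip in" problem described above.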
What's quite funny is that back then Java was still almost exclusively an "applet" or "desktop application" thing. Choosing Java for a server application was quite unusual.
The most surprising thing to me was that the language choice rather than the application architecture is what's blamed (in the excerpt) for only being able to serve 10 reqs/sec.
To put this into context, at the time you would have been going via CGI scripts or maybe trying to embed directly into Apache. mod_python did not exist, asyncore was the only thing you had for doing async callouts so that you did not block the thread, etc. There is a reason Perl was considered state of the art for serving web pages at this point in history. To give you a bit of an example, YahooMail (nee RocketMail) was basically a bunch of HTML wrappers around Python code to read/manipulate mbox files, and you would have been amazed at the amount of hardware it required to keep that performant; Google had to be faster and was doing a lot of background work that we never had to deal with.
In the late 90s there was just not the same set of options available in the standard library, people were just learning what was required for horizontal scaling, and I can easily see how the added burden that Python 1.4 would have imposed compared to C or C++ would be a deal-killer when it came time for a re-write.
I don't see how that is so surprising. At the time, Python was still a rather new interpreted language that was more or less focused on research applications. Nowadays there are tons of improvements, and let's not forget we have CPU cycles to burn.
Python is still more stable than Java, although the gap has closed dramatically over the last decade. It's just that Python is slower (which wasn't the case in 1996).
>What surprised me was that Larry Page had lots of trouble getting his crawler and indexer to work, partly because he was in Levy's words "not a world-class programmer" but also because of lots of bugs in the brand new and still unstable language he was using, which was called Java.
Hey, this is really strange. My user agent was not set (or so I thought, but it was a different issue), so I googled for other solutions besides System.setProperty("http.agent", ""); and stumbled over this entry. Nothing related to your finding :)
Then I posted it on twitter and yesterday on hackernews. That's why I thought that it was my finding. Sorry :)
Hey. Sorry for the comment on the other thread ( http://news.ycombinator.com/item?id=2459123 ), I was thinking "damn, why didn't I submit this separately?", I thought it was a weird coincidence.
I think it was linked on Twitter / Buzz a few months ago (that's where I first heard of it), and that may have improved the post's ranking on Google, which would explain how you stumbled on it 15 years later ;)
Anyway, I'm glad you shared it, since lots more people got to see it.
Slightly tangential note. I read Steven Levy's book last week - it is a very good read. If you are even remotely interested in technology and entrepreneurship (which is probable given the site you are seeing this on :) ) - you should definitely go read this.
The parts on Google's early years are very nice. The chapter on Google in China is also phenomenal.
An excerpt from the new book In the Plex by Steven Levy, quoted in the Quora post:
"Over the course of that two years Page and Brin had figured out Backrub's applicability to web search, failed to license the technology to Excite or Altavista, and had founded their own company."
I still remember the days when Altavista was the very best at Web search, and Excite had some features that made it worth checking as a plan B. It was near the end of the two years described in the excerpt when I started noticing Backrub regularly crawling my site, which used to rely primarily on Yahoo for online search referrals. I became a Google user as soon as the Web crawling from Google identified a site I could visit to see the sender of the Web crawler in action. I became hooked almost immediately, and started telling my friends about Google in online forums as I discovered more and more pleasantly surprising highly relevant results from searches I did. Too bad for Altavista and Excite that neither company grabbed PageRank and other Google technologies when they had the chance.
As I remember it, the "highly relevant" results on Google were almost exclusively determined by a $299 Yahoo! Directory listing or a DMOZ directory listing. If you could get the Yahoo Directory or DMOZ to link to you with your preferred anchor text - you were all but guaranteed the #1 spot on Google. Of course it didn't mean as much back then, but boy was it easy.
This brings back a lot of memories. Like you said, Yahoo! was referring all of the traffic. For those of you that weren't around or don't remember - the Yahoo Directory was based on the alphabet, like the phone book. I had to beat out AAAUsedLaptops.com with exclamation-point-used-laptops.com (listed as "! Used Laptops"). No wonder Google won.
There were plenty of good websites in those days that had (and still have) DMOZ listings absolutely for free. That's even more so for the earliest sites listed on Yahoo back in its human-curated days.
I sort of miss the time when knowing how to use Google (and knowing to use Google at all) allowed you to be instantly 10-100x smarter and more productive than everyone else. (it still applies, just not in the tech industry; being able to google to find drivers for obscure industrial or enterprise computing hardware is sometimes a competitive advantage, but that's about the limit)
I think that being skilled in the use of Google still gives you a huge competitive advantage over your peers. You'd be surprised how many people don't even think to search for answers to their problems.
Heh, I know where you are coming from. In high school (before Google was really well known and popular) I participated in a "science olympiad". One of the competitions was focused on computers. You had to create a spreadsheet to calculate certain values, etc. One of the tasks was to answer a question using the internet. It was something along the lines of "Who was the man that refused to leave his house before threats of Mt. St. Helens erupting and was eventually buried in the ash." I typed something like [grumpy old man buried mt st helens] and the first result contained his name in the snippet. I finished way earlier than anyone else :)
There are two excerpts from the book at the bottom of the answer that also link to a free preview of the first chapter. I'd recommend reading the preview even if you don't plan on buying the book right now.
I don't think the problem was with Python. It seems that the problem was that their code wasn't very good.
It's worth pointing out that there are lots of good tools available now (which were not available then) which make Python an excellent choice for high performance systems.
I will take my Python zealot hat off now. You just raised my hackles slightly, is all :-)
Scaling a website and scaling a search engine are two very different things. Python websites are OK, Python search engines... well: http://whoosh.ca/blog/fools_errand
I don't recall the name, but I remember using something exactly like (the originally intended) Backrub my first year in the dorms (1996 or early 1997). The program I was using was more of an overlay where you could draw on the page (using a simple MS-Paint-like UI) in addition to adding text annotations. You could either keep the changes local, or share them on some central site. Its downfall, as I recall, was threefold: buggy code, popularity, and the fact that publicly annotated copies of popular websites turned into layer upon layer of graffiti. Something like the original concept for PageRank as mentioned would've been perfect.
It was a neat concept, but really only practical (as implemented) for the static web. It died long before dynamic page generation became widespread. (I just can't remember the name of it! :) )
Was it perhaps CritLink (aka crit.org)? Ka-Ping Yee created that back in this time frame and it was the first service I was aware of that would enable this sort of third-party annotation of web pages.
Possibly something that became ThirdVoice. The concepts are very similar. Whatever it was, it had to be around in the 1996-1997 timeframe, and the sources I can find for ThirdVoice show it as starting in 1999. (Perhaps Beta?).
Considering it costs about $2-$3 to create and ship the hardcover book to Amazon, depending on Amazon's cut of the Kindle book, the publisher might be making more money on the hardcover.