What.cd's Gazelle on GitHub

Spittie · on July 10, 2013

They wrote a small announcement in their forum. For anyone that can't read it:

  This is an important change for Gazelle coders. You may have noticed that gazelle/ is now redirecting to http://whatcd.github.io/Gazelle/. 
  This is due to the fact that we are now hosting our public Gazelle repository on GitHub. 
  The old gazelle site had become severely outdated and a nuisance to maintain. 
  Luckily for us, GitHub is the perfect solution. 
  You can view our Gazelle repository here—it is accompanied by a GitHub page providing a brief overview of the project as well as a wiki with important information such as installation instructions, coding standards, and API documentation.

kurrepalt · on July 10, 2013

Cool, it's like watching how people did PHP 4-5 years ago when there weren't many frameworks available, and people used GLOBAL variables for accessing the DB/cache.

wdewind · on July 10, 2013

https://github.com/WhatCD/Gazelle/blob/master/classes/wiki.c...

Yeah, not to be negative, but it literally has SQL and HTML in one file. Ooof.

kurrepalt · on July 10, 2013

Yes, the first thing I noticed too was that it doesn't have any bootstrap/router (the only 'bootstrap' is the one that's included at the top of each .php file, WordPress-style). Development must have been pure pain. :(

I really understand why people hate PHP so much when all they get to see is purely non-optimized shit. It generally has a really bad reputation which it doesn't deserve. Especially in HN-circles.

wdewind · on July 11, 2013

Yes, I agree wholeheartedly. There are two very fair knocks against PHP: it's not the fastest, and the API uses somewhat inconsistent standards and conventions (isset vs. is_array for instance...so frustrating). But for the slowness aspect, it's not so slow that you are really hindered by this unless you are doing something that requires very intense performance, in which case why are you using a higher-level language anyway? Pretty much the rest of its bad reputation is undeserved, and the downsides are on the level of give and take you get with any other language.

mercurial · on July 10, 2013

I guess the coding standards were that locality trumped separation of concerns. Also, I find string interpolation in SQL kind of hot. Especially when magic quotes are deprecated.

I'm sure you can do nice things in PHP if you try hard enough, but you sure don't need to make any effort to do something absolutely terrible.

dewey · on July 10, 2013

Well, the project is about 5 years old and grew over time. There were discussions about moving to a framework/a rewrite but that's something you can't pull off with < 5 developers while still keeping the current codebase up to date and fixing bugs.

FZeroX · on July 10, 2013

It's not an age thing, the code we were writing was considered terrible back then too, and there were frameworks available. Wheel re-invention is fun though!

calpaterson · on July 10, 2013

If you want to start a bittorrent tracker but don't want to bother with all of the trouble of things like Gazelle, you might like thehighseas, a straightforward bittorrent tracker I wrote:

https://github.com/calpaterson/thehighseas

dewey · on July 10, 2013

That's a different league though. Gazelle provides a lot more than just listing torrents.

calpaterson · on July 10, 2013

Absolutely, I just thought I'd mention it in case people want to get into bittorrent but want an easier way to get started

lysa · on July 10, 2013

I'm not sure about that. Feature-wise it might be in a different league but the code is absolutely horrible.

FZeroX · on July 10, 2013

Gazelle isn't actually a tracker though - it's a web front-end / torrent website, but doesn't actually do the tracking part itself.

calpaterson · on July 10, 2013

Thanks - but having written a bittorrent tracker + web interface, I'm aware of the distinction. Most people who want to set up something simple don't actually care though, and my hope is that thehighseas helps people get started sharing with friends.

FZeroX · on July 12, 2013

Sorry, I didn't mean to patronize your face off - was clarifying more for the benefit of other readers who are not so au fait with the infrastructure :)

daturkel · on July 10, 2013

This might get passed off as something for people looking to do evil/illegal/etc projects, but Gazelle is easily the best tracker front-end out there and it's super customizable (google "Gazelle tracker" and you'll likely find a bunch of screenshots of different sites implementing it differently, from skinning it to adding new features). Bittorrent is underused as a medium for legal downloads and I encourage anyone interested in a torrent site of any kind to check this out before they try to write their own.

FZeroX · on July 10, 2013

This is my go-to example of a cool legal* website running Gazelle: http://panda.cd/

If you know of any others please let me know!

* Not that the well-known ones are illegal either, really, but the point is slightly academic as they will still throw people in jail over it...

jzelinskie · on July 10, 2013

There's pressure being put on Gazelle right now because Waffles and AnimeBytes are writing a replacement called Batter[1] using Django. I've also been working on a tracker in Go called Chihaya[2] that will replace Ocelot. I'm glad they're finally opening things up more, but it might be a bit too late. If you're interested in working on a more modern rewrite of Gazelle's software stack that has actual software development practices (continuous integration, tests, style guidelines), check out the projects. Both projects are BSD 2-Clause, btw.

[1] https://github.com/wafflesfm/batter

[2] https://github.com/pushrax/chihaya

MrDOS · on July 12, 2013

What's the motivation for replacing Ocelot? My understanding is that it's mostly a drop-in replacement for XBTT (please correct me if I'm wrong); is one of the design goals of Chihaya to modernize the API?

wcauchois · on July 10, 2013

thanks for the links. i gotta say though, your indentation style on chihaya is crackodactyl.

jzelinskie · on July 10, 2013

It's only gofmt with lines <80 chars; there is a function call or two broken into multiple lines to meet that. I'm scrapping that limit and just going to make it as readable as possible. The real crazy thing is GitHub using 8 spaces for tabs!

quchen · on July 10, 2013

Could someone explain what this is?

aroch · on July 10, 2013

'Gazelle' is the bittorrent tracker frontend developed by What.CD (For those not in the know, one of the largest music trackers out there).

bigdubs · on July 10, 2013

actually 'ocelot' is the tracker itself, gazelle is just the site that the users access to get the torrents

aroch · on July 10, 2013

I corrected myself as soon as I posted...Not enough coffee this morning

andrewflnr · on July 10, 2013

So, what is a bittorrent tracker?

FZeroX · on July 10, 2013

The tracker tells you who else has the files you want, in a nutshell.

Basic bittorrent outline:

You download a .torrent file which contains (amongst other things) a list of files, their hashes, and a list of trackers to connect to.

Your BitTorrent client connects to the tracker and says "Hi, who else has or is looking for these files?"

It returns a list of these people (aka "peers") and your BitTorrent client then connects directly to peers to download the files (and upload them to other people).
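As a rough illustration of the second step in the outline above: the request a client sends to an HTTP tracker is an ordinary GET with the torrent's 20-byte hash and a few transfer counters in the query string. The Python sketch below only builds that URL. The function name, the tracker address and the peer ID are made up for the example, and a real client would send more fields and then decode the tracker's response.

```python
from urllib.parse import urlencode


def build_announce_url(announce_url, info_hash, peer_id, port, left):
    """Build an HTTP tracker announce URL (illustrative helper, not from Gazelle).

    info_hash and peer_id are raw 20-byte values; urlencode percent-encodes them.
    """
    params = {
        "info_hash": info_hash,  # identifies the torrent to the tracker
        "peer_id": peer_id,      # identifies this client
        "port": port,            # port this client listens on for peers
        "uploaded": 0,           # bytes uploaded so far
        "downloaded": 0,         # bytes downloaded so far
        "left": left,            # bytes still needed
        "compact": 1,            # ask for the compact peer list format
        "event": "started",      # first announce for this torrent
    }
    return announce_url + "?" + urlencode(params)


# Hypothetical tracker URL and placeholder hash/peer ID, for illustration only.
url = build_announce_url(
    "http://tracker.example/announce",
    bytes(20),
    b"-XX0001-000000000000",
    6881,
    1024,
)
print(url)
```

The tracker's reply contains the peer list and an interval telling the client when to announce again.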

andrewflnr · on July 10, 2013

And all this time I thought a torrent file had the actual peers in it. So you don't even really need the .torrent if you have the addresses of some trackers that know where the files are that you want?

nadaviv · on July 10, 2013

You would also need the torrent's hash (which enables you to tell the tracker exactly which files you want). That's how magnet links [1] work - they contain the torrent's hash and optionally a list of trackers (most clients today support DHT [2], a decentralized, tracker-less way to find peers).

[1] https://en.wikipedia.org/wiki/Magnet_URI_scheme

[2] https://en.wikipedia.org/wiki/Distributed_hash_table
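To make the magnet link point concrete, here is a minimal Python sketch that pulls the torrent hash and any tracker URLs out of a magnet URI. The function name and the example link are invented for illustration; it assumes the common `xt=urn:btih:<hash>` and `tr=<tracker>` parameters and ignores everything else.

```python
from urllib.parse import parse_qs

MAGNET_PREFIX = "magnet:?"
BTIH_PREFIX = "urn:btih:"


def parse_magnet(uri):
    """Return (info_hash, trackers) from a magnet URI (illustrative helper)."""
    if not uri.startswith(MAGNET_PREFIX):
        raise ValueError("not a magnet URI")
    # parse_qs percent-decodes values and groups repeated keys into lists.
    params = parse_qs(uri[len(MAGNET_PREFIX):])
    info_hash = None
    for topic in params.get("xt", []):
        if topic.startswith(BTIH_PREFIX):
            info_hash = topic[len(BTIH_PREFIX):]
            break
    return info_hash, params.get("tr", [])


# Made-up hash and tracker, for illustration only.
example = (
    "magnet:?xt=urn:btih:" + "ab" * 20
    + "&tr=http%3A%2F%2Ftracker.example%2Fannounce"
)
info_hash, trackers = parse_magnet(example)
print(info_hash, trackers)
```

If the tracker list is empty, a client has to fall back on DHT to find peers.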

FZeroX · on July 12, 2013

I made a few simplifications for brevity and "these files" was one of them. What the .torrent actually contains is a list of the files and how those files are divided up into chunks of equal size (say half a megabyte per chunk). The bulk of the torrent is a list of hashes of each chunk, which allows your client to verify the data it has received from other peers. If you then take a hash of <<the part of the torrent that lists the files and the chunk hashes>> the resulting hash is known as the infohash. The infohash describes this particular torrent uniquely, and it's the infohash you send to the tracker when asking for peers.

Hopefully from this explanation you can see that you need it for two purposes, one so you can give the right infohash to the tracker, and two so you can verify the data you receive from peers.
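The description above can be sketched in a few lines of Python: split the data into equal-sized chunks, hash each chunk with SHA-1, put the file metadata and the chunk hashes into the "info" dictionary, then hash the bencoded form of that dictionary to get the infohash. This is a toy single-file version under simplifying assumptions. The helper names are made up, and a real client would hash the info section exactly as it appears in the .torrent rather than re-encoding it.

```python
import hashlib


def bencode(value):
    """Minimal bencode encoder covering ints, bytes/str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must be emitted in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def make_info(name, data, piece_length):
    """Build a single-file 'info' dictionary with one SHA-1 hash per chunk."""
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return {
        "name": name,
        "length": len(data),
        "piece length": piece_length,
        "pieces": pieces,
    }


def infohash(info):
    """SHA-1 of the bencoded info dictionary, as a hex string."""
    return hashlib.sha1(bencode(info)).hexdigest()


# Toy data with a tiny chunk size; real torrents use much larger chunks.
info = make_info("hello.txt", b"a" * 40, 16)
print(infohash(info))
```

Changing any byte of the data changes a chunk hash, which changes the info dictionary and therefore the infohash.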

dewey · on July 10, 2013

http://en.wikipedia.org/wiki/BitTorrent_tracker

quchen · on July 10, 2013

Ah, thanks.

samolang · on July 10, 2013

What.cd is a private torrenting site. They decided to open source the code they use to run their site.

sciurus · on July 10, 2013

It appears they decided to publish the code, but the license isn't open source as defined at http://opensource.org/docs/osd

slacka · on July 10, 2013

See my post above for links/references, but it was originally released under the GPLv3.

dewey · on July 10, 2013

A minor detail: The codebase was open source for a long time already, just on their own git and available on what.cd/gazelle and the IRC channel instead of GitHub.

nutmeg · on July 10, 2013

The Bittorrent tracker software used to run What.CD, a popular music torrent site.

dustywusty · on July 10, 2013

This code's pretty riddled with SQL injection vulnerabilities. Can't imagine anyone recommending use of this for new projects.

For a single instance, https://github.com/WhatCD/Gazelle/blob/master/sections/user/...

The $UserId variable, which is used throughout the queries within this file, is set by an unfiltered GET variable.

deeebug · on July 10, 2013

Actually they check the $_GET['id'] variable, which is used to set the $UserId variable. Check the top of the source:

if (empty($_GET['id']) || !is_numeric($_GET['id']) || (!empty($_GET['preview']) && !is_numeric($_GET['preview']))) {
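For readers following the injection discussion, the general pattern is to reject anything that is not a plain number and then pass the value to the database as a bound parameter rather than interpolating it into the SQL string. The sketch below shows that idea in Python with the standard sqlite3 module. It is not Gazelle's code: the table, column and function names are hypothetical, and the digits-only check is stricter than PHP's is_numeric.

```python
import re
import sqlite3


def fetch_username(conn, raw_id):
    """Look up a username by id (hypothetical schema, for illustration)."""
    # Accept ASCII digits only; anything else is rejected before any query runs.
    if not re.fullmatch(r"[0-9]+", raw_id):
        raise ValueError("id must be numeric")
    # The ? placeholder keeps the value out of the SQL text entirely.
    row = conn.execute(
        "SELECT username FROM users WHERE id = ?", (int(raw_id),)
    ).fetchone()
    return row[0] if row else None


# In-memory database with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (id, username) VALUES (1, 'alice')")
print(fetch_username(conn, "1"))
```

With bound parameters the query stays safe even if a validation check is forgotten somewhere else in the file.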

FZeroX · on July 10, 2013

I was an architect and coder on this project back at the very beginning and for a few years after launch, if you have any questions about the code then ask away :)

kbar13 · on July 10, 2013

not really sure why we're posting private trackers' projects, but here is waffles.fm's:

https://github.com/wafflesfm/batter

antocv · on July 10, 2013

Is Waffles still going? Are they accepting new members?

dewey · on July 10, 2013

Yes and yes, but this is not the place to ask for invites.

foobarbazqux · on July 10, 2013

Why not?

dewey · on July 10, 2013

Because it's against their rules and the rules of pretty much all the other private trackers too.

antocv · on July 10, 2013

I respect those rules but I think they're stupid, as I haven't met anyone in person who knew what waffles.fm or what.cd is, but I've been a member of what.cd since a few weeks after what.cd began (after the great oink piggie raid, what was that one called again?). It's been years now.

adamdavis · on July 10, 2013

I remember working with gazelle for a torrent site for live Phish recordings. Truly a terrifying code base.

dfrey · on July 10, 2013

You must maintain a 0.4 ratio of pushes to pulls on github to use this repository.

adPothier · on July 10, 2013

Ocelot, their custom-made C++ tracker, is a nice piece of code. They were forced to drop PHP-based and XBTT-based trackers some years ago when the server load from their huge peer swarm became unmanageable.

TorrentFreak made a great story about this[1].

[1] http://torrentfreak.com/what-cd-debuts-lightweight-tracker-f...

FZeroX · on July 10, 2013

PHP wasn't really the limiting factor, it's the architecture that XBTT uses that really kills you - an SQL query for every single announce! (With millions of peers that means hundreds of SQL queries per second just to maintain your swarm)
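A back-of-the-envelope version of that load estimate: if every peer announces once per interval, the announce rate is simply peers divided by the interval, and with one SQL query per announce that is also the query rate. The numbers below are assumptions for illustration (one million peers, a 30-minute announce interval), not figures from What.CD.

```python
def announces_per_second(peers, announce_interval_seconds):
    """Average announce rate if each peer announces once per interval."""
    return peers / announce_interval_seconds


# Assumed values: 1,000,000 peers, one announce every 30 minutes.
rate = announces_per_second(1_000_000, 30 * 60)
print(f"{rate:.0f} announces per second")  # prints "556 announces per second"
```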

Some other cool trackers to look at are Shadowolf (although I believe that's discontinued, sadly) and Lioness if that's floating around anywhere (What.CD likes wildcat-based codenames)

brass9 · on July 10, 2013

Is the ocelot code opensource? I'd be interested in it.

dewey · on July 10, 2013

Yes, it's in the repository (https://github.com/WhatCD/Gazelle/blob/master/ocelot-0.6.tar...)

expnsv_hdphns · on July 10, 2013

Not free (or "open source") from what I can tell.

  COPYRIGHTED AND PATENTED PENDING BY WHAT.CD INCORPORATED, 
  DBA/AKA PROJECT GAZELLE

Which is a bit confusing, but then later:

  FOR NONCOMMERCIAL PURPOSES ONLY

slacka · on July 10, 2013

It was first released under the GPLv3 http://torrentfreak.com/gazelle-rejuvenates-the-bittorrent-t...

You can view a mirror of the original release here: https://code.google.com/p/gazellewhatcd/

sciurus · on July 10, 2013

Looking at your link, I see https://code.google.com/p/gazellewhatcd/source/browse/trunk/....

That isn't the GPLv3. That isn't even a free software license (https://www.gnu.org/philosophy/free-sw.html), since they only grant a license "FOR NONCOMMERCIAL PURPOSES".

k2052 · on July 10, 2013

I find it repulsive that a torrent tracker is released under a restrictive license with Copyrights and Patents.

stayclassytally · on July 10, 2013

Also, this page:

http://whatcd.github.io/Gazelle/

brass9 · on July 10, 2013

I don't understand. Why is this on HN now? IIRC project Gazelle had been open-sourced several years ago.

dewey · on July 10, 2013

Because they just started to mirror their own git to GitHub and the documentation is more accessible there than the internal wiki on a private tracker.