More

sinwave · on July 2, 2015

Lol - the code example seems to imply that Zuckerburg's user id is 4.

samhoggnz · on July 2, 2015

That's because it is 4!

https://www.facebook.com/4

sanderjd · on July 2, 2015

As a ruby programmer, my gut reaction was that this had something to do with bugs caused by the object id of `nil` being 4. But of course that's not the case :)

judofyr · on July 2, 2015

You need to upgrade your Ruby version. The object id of `nil` is 8 these days :)

sanderjd · on July 2, 2015

Ha, well they also got rid of `id` on nil altogether, so it stopped being a common bug. (I didn't know that its object id is actually 8 now though, so thanks for the useless-knowledge-nugget.)

revskill · on July 2, 2015

Who is 1,2,3 ?

ci5er · on July 2, 2015

You could always try...

  - https://www.facebook.com/1
  - https://www.facebook.com/2
  - https://www.facebook.com/3

My guess (and there may even be apocrypha about it), is that these were old registration test accounts that didn't survive the first Zuckerbergian purge.

leebyron · on July 2, 2015

Correct. SQL auto increment keys plus test accounts resulted in Zuck being 4.

prezjordan · on July 2, 2015

It is :)

sinwave · on May 29, 2015

Geoffrey Hinton gave results to this effect in a talk about "Dark Knowledge" [1]. Haven't seen any of these results published, though. I think he mentions something in the talk about NIPS rejecting the paper.

[1] - https://www.youtube.com/watch?v=EK61htlw8hY

sinwave · on May 4, 2015

Thought I'd chime in that for image compression (e.g in the JPEG2000 standard), the 2D discrete wavelet transform takes advantage of similar pixel intensities for neighboring pixels at various scales (i.e. "transformed magnitudes all nice and grouped for you"). The 2D-DWT is actually pretty cool under the hood. And, asymptotically, a bit faster than the FFT (DWT runs in O(N), and in 2D, O(width*height)).

sinwave · on April 29, 2015

May I please have your faster than nlogn sorting algorithm

titanomachy · on April 29, 2015

Perhaps he meant something along the lines of using sorting for the sake of development expediency when a different approach would have been more efficient. For example, finding the kth-largest element of a list: for large lists it might be more efficient [O(kn)] to repeatedly drop the max element than to sort and then index [O(nlogn)]. Of course, the first approach requires taking the time to write your own function, whereas the second may be as simple as `my_list.sort()[k]` depending on your language.

asgard1024 · on April 29, 2015

Sure thing: http://en.wikipedia.org/wiki/Radix_sort

alecco · on April 29, 2015

Radix sort is amazing, but it's kinda O(n log n), see

https://en.wikipedia.org/wiki/Radix_sort#Efficiency

  > Radix sort complexity is O(wn) for n keys which are integers of
  > word size w. Sometimes w is presented as a constant, which would
  > make radix sort better (for sufficiently large n) than the best
  > comparison-based sorting algorithms, which all perform O(n log n)
  > comparisons to sort n keys. However, in general w cannot be
  > considered a constant: if all n keys are distinct, then w has to
  > be at least log n for a random-access machine to be able to store
  > them in memory, which gives at best a time complexity O(n log n).[2]
  > That would seem to make radix sort at most equally efficient as the
  > best comparison-based sorts (and worse if keys are much longer than
  > log n).

vvanders · on April 29, 2015

This is one of those cases where O(...) notation belies the actual performance on real systems.

The big advantage of Radix sort is not O(wn), it's that Radix sort linearly accesses its values in memory. This means that you can take full advantage of DDR read speeds since the prefetcher will run ahead of your cache misses to get you the data(or if you're smart you can prefetch yourself).

DDR fetch speeds are on the order of hundreds to thousands of cycles and that's where Radix's performance gain comes from.

alecco · on April 29, 2015

Yeah, BigO is kind of dumb outside CS classes and interview questions. A Ludic Fallacy! Real top sorting records are most often based on radix (GPU) and merge (CPU+SIMD).

vvanders · on April 29, 2015

+1 Radix will destroy nlogn(sometimes by multiple orders of magnitude) for any data set that you can fit in memory(and with the radix size tuned for your appropriate cache line length).

Never underestimate the strength of the prefetcher and using the correct data access patterns.

marcosdumay · on April 29, 2015

For bounded countable sets (like integers between x and y), there are linear time sorting algorithms.

lorenzhs · on April 30, 2015

It's pretty clear that he's not talking about asymptotically faster than n log(n), but faster on real machines. While optimisations---like in-place partitioning, tail recursion, and using insertion sort when your problem size falls below a threshold (e.g. less than 16 objects)---won't change asymptotical complexity, they will make your implementation run five times faster.

Asymptotic notation hides the constant factors, which is good when you're talking about asymptotic growth. But when you want a practically fast algorithm, constant factors matter!

sinwave · on April 28, 2015

This is actually some really great advice to hear upon graduating from college. Thank you.

sinwave · on April 20, 2015

This is really cool work. There is a lackluster theano feature which allows you to print a flowchart figure for the "computation graph" corresponding to the symbolic representation of your model.

@ajtulloch's library provides what I imagined the feature would be at first glance - a comprehensible, elegant graphical representation of your NN model. And on top of that, all in Haskell, with Haskell DSL for running torch - so cool.

sinwave · on March 21, 2015

Yeah, imagining him touring while writing his thesis has been a source of inspiration lately.

sinwave · on March 11, 2015

Am I wrong that the title seems to imply something negative about word vectors? But the article is super pumped about them!

juxtaposicion · on March 11, 2015

It's just a play on the phrase a 'picture is worth a thousand words' :) The word vectors themselves contain sophisticated relationships that seem almost miraculous, we're definitely not negative on them

(Speaking as one of the authors of the post)

sinwave · on March 6, 2015

There is a seemingly dead github repo called "R2D3" by hadley wickham. Wonder what came of that project?

https://github.com/hadley/r2d3

sinwave · on March 4, 2015

cf. Searle's Chinese Room argument!