
There are multiple videos already, some even in broad daylight. For example, the NBC News clip on YouTube, about 15 seconds in.


> The big problem is that NEAT can't leverage a GPU effectively at scale (arbitrary topologies vs bipartite graphs)

Is that true? These graphs can be transformed into a regular tensor shape with zero weights on unused connections. If you were worried about too much time/space used by these zero weights, you could introduce parsimony pressure related to the dimensions of transformed tensors rather than on the bipartite graph.
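For illustration, a minimal NumPy sketch of that transformation (not NEAT itself, just the packing idea): pad every network to a fixed neuron count and let unused connections sit at zero.

    import numpy as np

    n = 16                                   # padded neuron count for this individual
    W = np.zeros((n, n), dtype=np.float32)   # dense weight matrix, mostly zeros

    # Arbitrary NEAT-style connection list: (src, dst, weight)
    connections = [(0, 5, 0.3), (2, 5, -1.1), (1, 9, 0.05), (5, 9, 0.8)]
    for src, dst, w in connections:
        W[dst, src] = w                      # unused entries stay zero

    x = np.random.default_rng(0).standard_normal(n).astype(np.float32)
    # One dense matvec now covers every possible connection; the zeros waste
    # FLOPs but keep the shape regular, which is what GPU kernels want.
    y = np.tanh(W @ x)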


Or even use cuSPARSE, if you don't mind a little extra work over plain cuDNN.

https://developer.nvidia.com/cusparse
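As a CPU-side illustration of the same idea with scipy.sparse (cuSPARSE itself is a CUDA library; CuPy exposes a similar sparse interface on the GPU):

    import numpy as np
    from scipy import sparse

    rng = np.random.default_rng(0)
    n = 1000

    # Sparse weight matrix for an arbitrary topology: ~5000 nonzeros out of 1M slots.
    rows = rng.integers(0, n, size=5000)
    cols = rng.integers(0, n, size=5000)
    vals = rng.standard_normal(5000).astype(np.float32)
    W = sparse.csr_matrix((vals, (rows, cols)), shape=(n, n))

    activations = rng.standard_normal((n, 64)).astype(np.float32)
    out = W @ activations        # sparse-dense matmul, only nonzeros do work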


Just because you can pack the topology into a sparse matrix doesn't make it actually go faster.

Sparse matrices often don't see good speedup from GPUs.

In addition, each network is unique, each neuron can have an entirely different activation function, and the topology is constantly changing. You will burn a lot of time constantly re-packing into matrices that then don't see the same speedups a denser, more wasteful topology would.

On the flip side, this narrative of "speedup" assumes bipartite graphs crunch faster on GPUs; the picture might not be the same if the basis of comparison is the actual utility of the behaviors generated by the networks. A cousin thread explores this better.


Another plausible strategy to neutralize arbitrary topologies: compile individual solutions or groups of similar solutions into big compute shaders that execute the network and evaluate expensive fitness functions, with parallel execution over multiple test cases (aggregated in postprocessing) and/or over different numerical parameters for the same topology.
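A rough CPU-side sketch of the "same topology, many parameter sets and test cases" part, using NumPy batching as a stand-in for an actual compute shader:

    import numpy as np

    n_candidates, n_cases, n_in, n_out = 64, 128, 16, 4
    rng = np.random.default_rng(0)

    # One weight matrix per candidate sharing this topology (here: a single layer).
    weights = rng.standard_normal((n_candidates, n_out, n_in))
    # The same batch of test cases is fed to every candidate.
    cases = rng.standard_normal((n_cases, n_in))

    # (n_candidates, n_out, n_in) @ (n_in, n_cases) -> (n_candidates, n_out, n_cases)
    outputs = np.tanh(weights @ cases.T)

    # Aggregate a toy fitness per candidate in postprocessing.
    fitness = (outputs ** 2).mean(axis=(1, 2))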


We can also show that sparse NNs under some conditions are ensembles of discrete subnets, and the authors of the original dropout paper argue that [dropout] effectively creates something akin to a forest of subnets all in "superposition".
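A toy NumPy sketch of that reading of dropout: each sampled mask picks out a discrete subnet, and averaging over many masks behaves like an ensemble of those subnets.

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.standard_normal((8, 8))
    x = rng.standard_normal(8)
    p_keep = 0.5

    def subnet_output(mask):
        # Zeroing hidden units is equivalent to running a smaller discrete subnet;
        # dividing by p_keep is the usual inverted-dropout scaling.
        h = np.maximum(W @ x, 0.0) * mask / p_keep
        return h.sum()

    masks = rng.random((1000, 8)) < p_keep
    ensemble_average = np.mean([subnet_output(m) for m in masks])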


If the point of this article is that there's generally performance left on the table, then if anything it's understating how much room for improvement there usually is, considering how much effort goes into matmul libraries compared to most other software.

Getting a 10-1000x or greater improvement on existing code is very common without putting in a ton of effort, if the code was not already heavily optimized. These are listed roughly in order of importance, but performance is often such a non-consideration for most developers that a little effort goes a long way.

1. Most importantly, is the algorithm a good choice? Can we eliminate some work entirely? (this is what algo interviews are testing for)

2. Can we eliminate round trips to the kernel and similar heavy operations? The most common huge gain here is replacing tons of malloc calls with a custom allocator.

3. Can we vectorize? Explicit vector intrinsics like in the blog post are great, but you can often get the same machine code by reorganizing your data into arrays / struct of arrays rather than arrays of structs (see the sketch after this list).

4. Can we optimize for cache efficiency? If you already reorganized for vectors this might already be handled, but this can get more complicated with parallel code if you can't isolate data to one thread (false sharing, etc.)

5. Can we do anything else that's hardware specific? This can be anything from using intrinsics to hand-coding assembly.
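A rough Python/NumPy sketch of point 3 (the same reorganization is what lets SIMD kick in for C/C++ code):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000

    # Array of structs: one object per particle, scalar math in a Python loop.
    aos = [{"x": float(rng.random()), "y": float(rng.random())} for _ in range(1000)]
    dist_aos = [(p["x"] ** 2 + p["y"] ** 2) ** 0.5 for p in aos]

    # Struct of arrays: one contiguous array per field, one vectorized pass.
    xs = rng.random(n)
    ys = rng.random(n)
    dist_soa = np.sqrt(xs ** 2 + ys ** 2)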


Don't forget the impact of network. I managed to get a several hundred times performance improvement on one occasion because I found a distributed query that was pulling back roughly 1M rows over the network and then doing a join that dropped all but 5 - 10 of them. I restructured the query so the join occurred on the remote server and only 5 - 10 rows were sent over the network and, boom, suddenly it's fast.

There's always going to be some fixed overhead and latency (and there's a great article about the impact of latency on performance called "It's the latency, stupid" that's worth a read: http://www.stuartcheshire.org/rants/latency.html) but sending far more data than is needed over a network connection will sooner or later kill performance.
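A toy illustration of that shape of fix, with sqlite3 standing in for the remote server (the real case was a distributed query, but the principle is the same):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                     [(i, i % 1000, i * 1.5) for i in range(100_000)])

    # Slow shape: pull everything back, then filter/join on the client.
    rows = conn.execute("SELECT * FROM orders").fetchall()
    mine = [r for r in rows if r[1] == 42]

    # Fast shape: push the filter to the server, ship only the rows you need.
    mine = conn.execute("SELECT * FROM orders WHERE customer_id = ?", (42,)).fetchall()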

Overall though, I agree with your considerations, and in roughish terms the order of them.


I meant this kind of thing to fall under #1. "Don't do work that can be avoided" includes pulling 1M rows, times a bunch of columns you don't need, over the network.

From your description, though, it doesn't sound like something I'd classify as a network issue. That's just classic ORM nonsense. I guess I don't know what you mean by "distributed query", but it sounds terrible.

The most classic network performance issue is forgetting to disable Nagle's algorithm.

The most classic SQL performance issue is not using an index.
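For the Nagle case, a minimal Python sketch of the usual fix:

    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Disable Nagle's algorithm so small writes go out immediately instead of
    # being buffered while waiting for ACKs (important for chatty protocols).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    sock.connect(("example.com", 80))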


A distributed query is something you execute over multiple instances of your DBMS. I actually would disagree with you that this specific issue was an instance of (1). There wasn't anything wrong with the query per se but rather the issue was with where the bulk of the work in the query was being done: move that work to the right place and the query becomes fast.

When considering performance issues, in my experience it's a mistake not to explicitly consider the network.


> I actually would disagree with you that this specific issue was an instance of (1). There wasn't anything wrong with the query per se

I think OP's #1 agrees that there's nothing "technically" wrong with such a query (or an algo). It just generated work you didn't have to do. Work takes time. So you used time you didn't have to use. I also think this is the number 1 way of improving performance in general (computer, life).

A perfectly valid query of fetching 1M rows turned into 99.xxx% unnecessary work when you only needed a handful of rows. The query wasn't slow, it was just generating more work than you actually needed. The network also wasn't slow, it simply had to transfer (even at peak theoretical efficiency) a lot of data you never used.

You then used an equally valid query that wasn't even necessarily fast, it just generated much less work. This query (quote from #1) "eliminate[d] some work entirely", the work of carrying over unnecessary data.


> 1. Most importantly, is the algorithm a good choice? Can we eliminate some work entirely? (this is what algo interviews are testing for)

Unfortunately this has turned into a cargo cult in practice. There are plenty of cases where doing more work results in better performance, because the "faster" algorithm has some pretty horrible constants.

A lot of interviews turn into a pop quiz about rote memorization of obscure algorithms because "that's what Google does", rather than actually focusing on being able to reason about and benchmark why an implementation is slow and what approaches could be taken to fix that.


I did not mean to endorse current software interviewing practices by pointing out a small overlap with good optimization fundamentals. The current status quo "Google" style interview is basically a joke. It started from a good place, but fizz buzz eventually became "invert a binary tree on the whiteboard", which by this point has been gamified to such an absurd degree that it means very little and likely optimizes for the wrong types of candidates entirely.


No


I've been daily-driving Linux for maybe 10 years, and I see all these pushes for "let's make it easy, no CLI!" as counter-productive. The CLI is the interface on Unix-based systems, and the more time you spend learning fundamental knowledge like that instead of some ephemeral, unportable skin over the CLI, the better off you'll be.


For power users. General users want the point-and-click thing; Windows/Mac got them used to it. It's sorta functional on mobile OSes too. It's not like the CLI would (ever) disappear.


Problem is that some things will be hard no matter what.

There are people that believe that everything should or can be easy.

It is just painful to watch those people.

Yes, you can build software that makes the FFT useful without the user even knowing an FFT is used in the application.

But telling whether an image is important enough to you that it should be compressed losslessly, or whether 1000 of your images are or are not, well, you have to sit down and make decisions and go one by one, or make some rules that won't be perfect.


Feynman's Razor


The example is hilariously terrible. Firstly, this is the currently required code:

    from datetime import datetime

    def set_deadline(deadline):
      if deadline <= datetime.now():
        raise ValueError("Date must be in the future")
    
    set_deadline(datetime(2024, 3, 12))
    set_deadline(datetime(2024, 3, 18))
There simply is no trade-off to be made at this point. Perhaps there will be eventually, but right now, there is one function needed in two places. Turning two functions that already could be one into a class is absurd.

Now, as far as teaching best practices goes, I also dislike this post because it doesn't explicitly explain the pros and cons of refactoring vs. not refactoring in any detail. There is no guidance whatsoever (e.g., Martin Fowler's Rule of Three). This is Google we're talking about, and newer developers could easily be led astray by nonsense like this. Addressing the two extremes, and getting into how solving this problem requires some nuance and practical experience, would be much more productive.


Almost all programming tutorials, and even books to a certain extent, suffer from the problem of terrible examples. Properly motivating most design patterns requires the context of a sufficiently complex codebase, which tutorials and books simply do not have the space to get into. This particular case is especially bad, probably because they had the goal of having the whole article fit on one page. ("You can download a printer-friendly version to display in your office.")

> There is no guidance whatsoever (e.g., Martin Fowler's Rule of Three).

That is completely unfair imo. Although not properly motivated, the advice is all there. "When designing abstractions, do not prematurely couple behaviors that may evolve separately in the longer term." "When in doubt, keep behaviors separate until enough common patterns emerge over time that justify the coupling."

Simplified maxims like the "Rule of Three" do more harm than good. "Don't couple unrelated concerns" is a much higher programming virtue than DRY.


> Properly motivating most design patterns requires context of a sufficiently complex codebase

As someone who's made a best-selling technical course, I strongly disagree.

It's 100% laziness and/or disregard for the reader.

The reason examples are as bad as they are is that people rush to get something published rather than put themselves in the audience's position and make sure it's concise and makes sense.

It's not like webpage space is expensive. There's plenty of room to walk through a good example, it just requires a little effort.


>It's not like webpage space is expensive.

It is not the webpage space. It is people's limited attention spans and ability to focus. A complex example is needed to properly motivate certain concepts, but too complex an example also contains so many other details that the reader gets bogged down and distracted from the main concept being discussed.

At least that is my hypothesis for why almost all programming books and tutorials have terrible examples. I am happy to be proven wrong.

Coming back to the article, I looked at some of the previous articles from the same series, and to me it feels like a very conscious decision to only include 3-4 line code examples.


> It's not like webpage space is expensive. There's plenty of room to walk through a good example, it just requires a little effort.

Right at the top of the page:

> A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

So no, there isn't room for a longer example.


What does sales have to do with what you're claiming? Please share the course and/or examples of it being done well without requiring that excessive context, so that there's something to support your claim.


Well, if my course and teaching were crap, I wouldn't get good reviews and therefore many sales. I've spent $0 on marketing.

https://www.udemy.com/neo4j-foundations/

There are many people who do teach and explain topics well. Richard Feynman comes to mind.

I've found Abdul Bari on YouTube to also be an excellent teacher around technical topics.


Not related to the topic at hand, but who buys these courses? Going off the chapter titles it looks like it’s all basic ‘read the documentation’ kind of stuff (to me). I could imagine it being useful to beginners, but not anyone with a moderate amount of experience (they’d just go to the Neo4j documentation).

On the other hand, what beginner starts with Neo4j and Cypher? Is there really enough of them to justify a whole course? Apparently there are, it just feels weird to me.


You're right in that if you go through the docs you can find all the info you might need.

It's really catered for beginners, people that have next to no knowledge of graph databases or Neo4j and want to get up to speed in just a few hours.

I imagine some people may not even be super technical, but may want to learn just the basics of querying a DB at work to get some basic info out of it.

Apart from lessons there are also exercises for people to practice what they just learnt, and I do my best to point out gotchas and keep it mildly entertaining with a gentle progression in difficulty.


My thought here is that they're focusing on the wrong thing.

Yes, we have repetitive date-checking logic, but we don't want to lay a trap for the future. Not sure what language this is in, so I'll use C#:

    void EnsureDateInFuture(DateTime time) { if (time <= DateTime.Now) throw new Exception("Date must be in the future"); }

One check for the future, one copy of the error message, not locked in.


I was going to say you were talking nonsense, but then realized I'd replaced the original post in my mind with this much nicer post that someone else linked in this thread:

https://verraes.net/2014/08/dry-is-about-knowledge/

They essentially say the same thing, but one is better than the other.


Your example, deduplicating the two functions into one, illustrates an interesting point, although I'd prefer still having the two specialized functions there:

    from datetime import datetime

    def set_deadline(deadline):
      if deadline <= datetime.now():
        raise ValueError("Date must be in the future")
    
    def set_task_deadline(task_deadline):
      set_deadline(task_deadline)

    def set_payment_deadline(payment_deadline):
      set_deadline(payment_deadline)

    set_task_deadline(datetime(2024, 3, 12))
    set_payment_deadline(datetime(2024, 3, 18))
You lose absolutely nothing. If you later want to handle the two cases differently, most IDEs allow you to inline the set_deadline method in a single keystroke.

So the argument from the article...

> Applying DRY principles too rigidly leads to premature abstractions that make future changes more complex than necessary.

...does not apply to this example.

There clearly are kinds of DRY code that are less easy to reverse. Maybe we should strive for DRY code that can be easily transformed into WET (Write Everything Twice) code.

(Although I haven't worked with LISPs, macros seem to provide a means of abstraction that can be easily undone without risk: just macro-expand them)

In my experience, it can be much harder to transform WET code into DRY code because you need to resolve all those little inconsistencies between once-perfect copies.


I can only assume the Google example would be part of a script/CLI program that is meant to crash with an error on a bad parameter or similar. Perhaps the point is to catch the exception for control flow?

My personal goal is to get things done in as few lines of code as possible, without cramming a bunch on one line. Instead of coming up with fancy names for things, I try to call it by the simplest name to describe what it's currently doing, which can be difficult and is subjective.

If we wanted to define a function which crashes like the example, I would probably write this:

    from datetime import datetime

    def throw_past_datetime(dt):
        if dt <= datetime.now():
            raise ValueError("Date must be in the future")
If the point is not to crash/throw for control flow reasons, I'd write this in non-CLI/script code instead of defining a function:

    dt = datetime(2024, 5, 29)
    if dt < datetime.now():
        # Handle past date gracefully?
If it needs to do more in the future, I can change it then.


>You lose absolutely nothing. If you later want to handle the two cases differently, most IDEs allow you to inline the set_deadline method in a single key stroke.

The problem with unintentional coupling isn't that you can't undo it. It is that someday someone from some other team is going to change the method to add behaviour they need for their own use case, which is different from yours, and you won't even notice until there is a regression.


In this case (which shouldn't happen, because it requires that you merged things that don't belong together; see accidental duplication), at least the person changing the method has all the information at hand and doesn't have to keep a potentially complex graph of copied code in their mind.


Did we not all evangelize Google in its early days?

Also, none of these accounts saying nice things appear to be bots or Kagi-focused in any way, so I think it's safe to assume they do actually just like it.


I don't know...my spidey sense has been going off a bit.

Kagi has a free trial, but you have to pay, which is the difference between it and early Google.

Of course, now we have Google ads instead, so who knows, maybe not bots.


Go check my history. Send me a mail and I'll send a photo of the fields and the wind breaks here that you can geolocate.

I am definitely not a bot.

I am however extremely fed up with Google. And equally thankful that I have found something that works as well as old Google (or better).


Obviously, some of this branching, etc. is not differentiable. Is it doing finite differences?


> Even most Hello Worlds don't check the pipe was written to correctly.

The word for this is not "perfect", but "overengineered".

