
1. When I said we revise the paper between two submissions, I wasn't implying it was becoming "better". The message was that there is no general consensus around what should be expanded and what can be kept concise. One reviewer believes you should discuss prior work more, another thinks the main algorithm requires more elaboration, another wants you to talk more about BayesOpt, etc., but you have fewer than 10 pages in the main paper, and putting this material in the Appendix, or citing a source, doesn't seem to be good enough in many cases (another comment in a sibling thread gives an example w.r.t. GANs, and my experience has been no different).

That's exactly my point: the reviews do not converge because the message is too diffuse or not justified enough. I recently had a paper rejected because it was too difficult to understand; it was 4 pages long, and it has now been expanded to 20 pages and sent to a better journal. The content was too big for a 4-page format; we couldn't fit enough justification. But in your paper there are still many places where the text could be shorter and clearer, gaining at least a page of content. Learning to write good research takes a lot of time, and a PhD is ideally the place where this should happen. It's difficult, but you'll get there if you work on it enough! Read the best-paper award winners from good conferences, notice how much material fits in the same number of pages, and reverse-engineer what they did to make the paper clear, concise, and easy to follow.

2. You say you randomly picked a few sentences to read; that's fine for a casual discussion, but that should not be how a review process functions. Some of the best reviewers I've encountered (and I hope I am continuing in that tradition) come back to say something like "I see what you're getting at, but your intro doesn't sell it well enough; think about writing it like this ...". Rejecting based on random skimming is exactly one of the things I'm calling out. Let's face it - like a lot of things, high-quality reviewing is hard. It isn't supposed to be quick or easy.

You cannot choose who will read. But even the more thorough readers will give a bad review if the paper is difficult to understand or missing justifications from the beginning, even if they read the whole thing. Reading should be like a conversation with the author; if that conversation, carried through the paper, is too sloppy or erratic, I will not understand the message. That's what is happening when I ask the author for more justification on some part: either I couldn't follow the logic well enough, or I didn't agree with that part, so I require more justification.

3. Predicting how much to elaborate: this is probably an extension of the first point, but I feel like this has become much harder in recent years. The rule that mostly works seems to be: if it's not a trending topic, explain it as much as you can, because cited background material is overlooked. This is unfair to areas that are not trending - the goal of research should be to situate itself closer to "explore" on the "explore-exploit" spectrum, but the review system today heavily favors "exploit". And like I mentioned, a page limit means the publication game is stacked against people not working on mainstream ideas. This should not be the case.

I agree; there are no general experts anymore, everyone works in a very niche subfield, and you don't get reviewers who know the state of the art. Learning the right tradeoff is difficult. My threshold is: don't explain the math unless the "why" is not self-evident. For a given equation, for example, I can give more insight into how it affects my method, and if a parameter of the equation is very important to my method, a complete analysis of its effects, with analogies and experiments to show its impact. I try to make the main storyline as crystal clear as possible; if I deviate too much, it's better kept for a second paper. My experiments should demonstrate non-trivial things. Finally, I make sure the abstract corresponds to the text. I mostly don't work in deep learning, so by default it's extremely hard to find reviewers for my topics; I feel the pain. But it's my job to make them understand what I'm achieving and why it's important.

Hope that helps :)




I feel like we are debating slightly different perspectives, and with that lens I agree with what you say. Here is the difference in perspectives (and this is decoupled from this particular paper): your take is that reviews today work in a certain way, and here are some things we could do to maximize our chances of acceptance. My take is that reviews shouldn't work this way.

To take some examples:

1. > You cannot choose who will read.

Specifically, no, but generally, yes. I'd expect the reviewer to understand ML. And if this is not the brand of ML they're familiar with, I'd expect them to put in the work to familiarize themselves during the review process, in the interest of fairness. After all, are we not seeking out qualified reviewers for the review process? This is not just anyone who stumbles across a paper on the internet.

2. > message is too diffuse

Any message would appear diffuse/opaque/abstract to someone unfamiliar with the area. This is exactly why an objective review process must equalize such communicative biases. This is partly facilitated by the conference picking the right reviewers and making the right review assignments, and partly it is also the duty of a reviewer to fill in whatever gaps of comprehension remain.

3. > Read the best-paper award winners from good conferences, notice how much material fits in the same number of pages, and reverse-engineer what they did to make the paper clear, concise, and easy to follow.

Good general advice, but you are preaching to the choir. I do read best papers from various conferences, and I run reading groups where we discuss papers from ongoing conferences. I run an applied ML research group in industry - this pretty much comes with the job. Further, I don't think that best papers are head and shoulders above non-best papers; they are often voted to the top because they solve a broadly known problem, or they further the understanding of such a problem. Writing plays some role here, but it is not the discriminating factor.

4. Requiring justifications. Yes, there is a rebuttal phase for that.

Just to be doubly clear, I am not saying papers (and this paper) can't be improved. But that is not the argument I am making.


I totally understand your take, and you are right in some respects. Reviews SHOULD NOT WORK THIS WAY. I totally agree with that. But consider this: the population is increasing, the number of topics is increasing, and we are no longer in a world where the review system works as it was designed to.

When it was designed, only a few people got the chance to read, and information was not easily available. So people became experts, and knowing the state of the art was mandatory. Now, given the sheer quantity of (good and bad) research, you cannot expect the review system to work properly.

But you are still stuck in this system. So consider what is important:

- Do you want to write and hope that, by chance, the right people will read it, that they will be educated enough in your topic, and that they will have enough time at their disposal? Or

- Do you think your idea is good and should be more widely known?

If it's the latter, it's your job to make your idea as clear as possible so that any (good) researcher can understand it, and therefore use it. We must work within the reality of the current system if we want to spread interesting ideas to the community. The publication system is a social system, and it evolves with the people inside it. You write to spread knowledge; how can you do that if the reader will most likely not fully understand?

My time is very limited and I always have many things to do, so I only read what I filter as worthy enough. That filter is based on the quality of the writing. If a paper is important but badly written, it automatically falls into my 'if I have time to read it' pile, and most of the time it will never reach the 'to read' category, because there are many, many well-written papers with good ideas inside.

We work in a biased system. It's extremely difficult to find reviewers; we do what we can with what we have.

I was also infuriated when I got a review saying 'you didn't explain structure from motion' at a conference whose topic was structure from motion. But this is the reality: if I want my papers to be read, I must adapt to my audience.

> Read the best-paper award winners from good conferences

With that sentence, I did not mean 'read it for knowledge' but read it through the lens of the writer: why did they present the topic this way, and what makes this paper clear while another on the same topic is not clear at all? Reverse-engineer the writing style. It's not about knowledge of the content of the paper, it's about communication. Best papers do not always have the best ideas inside, but they are presented in a way that provides insight even when the topic is difficult. And often those insights are what readers want to read. The math is not the important thing; the important thing is the insight you give to the readers. That insight can be translated into their own field if they internalize it enough.


  > I was also infuriated when I got a review saying 'you didn't explain structure from motion' at a conference whose topic was structure from motion. But this is the reality: if I want my papers to be read, I must adapt to my audience.

Honestly, I hate this take. I don't think it's good for science or academia. Papers do not need to be readable to everyone. The point is to be readable by other experts in your niche. Otherwise, I don't know who I'm writing to, and that's exactly the same problem the parent is having.

Writing to too broad an audience also makes papers unnecessarily long. You have to spend more time motivating the work and more time on the background. This has spinoff effects where reviewers can demand you cite them, contributing to the citation-mining nightmare. I've seen 8-page papers with 100+ references (the paper I referenced has 78). This is more what we expect from a survey paper. When background sections are minimal, a reviewer can't justify asking for a citation unless their work is critical to the exact problem being solved.

Every paper rewrite is time and money that would be better spent on research or other activities. Every rewrite is an additional submission to the next conference. I don't think labs are submitting 20+ papers per round because they wrote 20 papers in those 3 months (with some exceptions), but rather because they wrote a bunch and are recycling works from the previous few years. This increases reviewer load as well.

The question then is how people enter a topic. Truth be told, I don't think it's any easier than when papers had under 30 references. For reference, that one Cybenko paper we all know has under 30 references but is 10.5 pages. What I think we should do instead is allow citing of blogs and encourage people to write tutorials. I think this would actually be a really useful task for 2nd- and 3rd-year PhD students. You learn a lot when writing those things, and that's the stage where you should be entering expert level in your niche. The problem is that we have no incentive to do any of the other critical tasks in academia. This is why I personally hate it. We are hyper-focused on this novelty thing, but in reality that doesn't exist and is highly subjective. It just encourages obfuscation, which we've routinely seen from high-profile labs.

We work in ML; how are we not all keenly aware of reward hacking and knock-on effects?! I honestly think the fact that we cannot get our house in order is evidence that we can't build safe AGI yet. This is a task that is orders of magnitude easier, not to mention one with significant rewards (selfishly, it highly benefits us too!). Everyone feels that something is off, but no one wants to do anything about it. We've only implemented half-hearted measures that are more about signaling. Can't let an LLM review for you? But the author is responsible for proving the reviewer used one? We're all ML experts… we all know this isn't possible except in the most blatant cases. It's as if you got shot while blindfolded and the police won't investigate until you bring them evidence of who shot you and with what kind of gun. It shouldn't matter whether a review is bad because it was written by an LLM or because it was written by a human. Just like it shouldn't matter whether you were shot or stabbed.

  > The math is not the important thing; the important thing is the insight you give to the readers.

I also hate this take. The math often __is__ the insight. I agree that a lot of papers have needless math (look at any diffusion paper, or any paper with attention copy-pasting the same equations). But other works need it. The reason to use math is the same reason we program. If there were an easier way to communicate, we would use it (note: math isn't just symbols, it can be words too). Math and programming are hard because they are languages that are precise and dense. The precision is important when communicating. Yes, it might take longer to parse, but it is unambiguous once interpreted (it is also easier to parse when you're trained and in the habit, just like any other language).

I think we lost our way in academia. We got caught up in the excitement. We let the bureaucrats take too much control and dictate to the universities. We got lost in our egos (definitely not new) and became too focused on prestige and fame. Our systems should be fighting these things, not enabling or encouraging them. Yes, the people at the top benefit from these systems, but the truth is that even they would benefit from fixing things.


Thank you for taking the time to read this already long thread.

> Honestly, I hate this take. I don't think it's good for science or academia

I completely understand. I'm not saying that I like the system as it is, just that we are stuck in it because there is no real alternative.

I would also love to have experts reading papers, but the sad truth is that it is often a first-year PhD student doing the review, outside of their field.

> The question then is how people enter a topic.

I really like the idea of encouraging PhD students to write blogs and supporting material for their papers, or at least to reference the good-quality blogs that led them to insights. But it takes time, time they usually don't have; teaching assistants in particular spend most of their time working with students, following projects, etc.

If you have a solution for how to do it properly, I'm interested. I have had some ideas myself on how to solve this, but they would require time and money that I don't have.

> We are hyper-focused on this novelty thing, but in reality that doesn't exist and is highly subjective.

Thankfully I don't work on fashionable research. Of course I have some people working on fashionable deep learning, but we stay focused on the why, not just stacking boxes together and hoping it works.

> The math often __is__ the insight.

When the math is the insight, it should be there, followed by the insight in text and analysis. I'm not saying don't include math, but include interesting math. Nobody cares about math copy-pasted from some other paper, possibly with mistakes in it.

But I stand by my view: the math is a means, but ultimately what we want to transmit is the insight behind it; the math can be recreated at will once the insight is there.

I love math, and some fields are more mathematical than others, and they profit from it.

> We let the bureaucrats take too much control and dictate to the universities.

Agree totally.


Sorry, the topic is obviously a bit sensitive for me, haha. Thanks for the tone; I can tell you'd make a good mentor and manager.

  > If you have a solution for how to do it properly, I'm interested.

I have proposed solutions to a lot of this stuff, actually. But it does require a push from others, and it's necessary for those in prominent academic positions to push them. I think the issue is that there are a lot of interconnected parts to all of this. I doubt there is an optimal solution, and in such settings I think flexibility is the side one should err on. It gives us the room to adapt more easily to local changes in the environment. But there will always be some downsides, and I think it is easy to focus on those and not weigh them against the value of the gains.

For blogs:

I think we should just count these as publications. People are just looking at citations anyway (unfortunately). We should at least count them as citable.

There's a good knock-on effect to this one too. It can encourage replications (the cornerstone of science) and tutorials (which are teaching and a stepping stone towards books; they also help students write better), and it can help us shift towards other publication media in general. Why do we publish works about video or audio in a format that can't play video or audio? It's absurd, if you ask me. The only reason we do so is momentum.

I think it is also important to just be writing. To be a good scientist you must also be a philosopher. It is not just about doing the work, it is about why. The meta-science is just as important as the science itself. I think so are the fun and creativity that we have more room for in blogs (I think there should be more room for this in papers too). Research is just as much an art as it is a science, and we need to be working the creative muscles just as much as the analytical ones. I also think it helps with the discourse between papers. I mean, Andrew Gelman's and Scott Aaronson's blogs are famous for having all of these things. They are valuable to their communities, and I think even to a broader audience. But as the environment is now, this is disincentivized. I think more people want to do it and are motivated, but it is easy to put off when there is little to no reward (sometimes even punishment, like an advisor or boss saying "spend less time writing and more time researching"). If you're going to "slack off", you might as well do something that is more restorative, right? [0]

For reviewers/review system:

Again, incentive structures. The problem right now is that it's seen as a chore, one where it doesn't matter if you do it poorly. So my first suggestion is that we throw out the conference and journal structure as we know it.

These structures existed because we didn't have the ability to trivially distribute works. You hire editors to make sure you don't publish garbage, because publishing is expensive, and they correct spelling and grammar to make sure the work provides the most value (communication). There may be conversations to improve the works, but not to outright throw them away. Everyone here is well aligned in goals. They're all on the same team! But none of that happens now. We have a system where it is reviewers vs. authors. This should not be an adversarial setting. In fact, a few spelling mistakes are justification to reject a paper now (ask me how I know). The purpose was always about communicating work, not about optimizing for what is the best and most important type of work. Truth be told, no one knows that, and we can only tell later down the line.

There are two common critiques I hear with regard to this:

  1) How do we discover work?
  2) How do we ensure integrity of the work?

Well, who actually goes to the conference or journal websites to read papers? We all use the arXiv versions, which are more up to date. We're also reading preprints, and especially in ML this is the way to keep up to date. I only go there to grab the BibTeX, because the authors only have the arXiv one on their GitHub or webpage (a pet peeve of mine). We pretty much discover work from other papers, Google, peers, Twitter, and, well, you're a researcher, you know. The "getting published" part is just a byline in a tweet and a metric for the bureaucrats.

The physicists created arXiv because it was what they were doing already, which was: you publish your draft within your big lab, others read it and critique it, and you take that feedback to improve. There are always mean people, but mostly everyone is on the same side here. We just extended who had access to the library, that's all.

So discovery is not really the problem. But what about integrity and verifiability? I find this a feature, not a bug (and you'll be able to infer how this couples with writing directly to niche peers instead of broader groups). Sometimes, if you try to take too much control, you end up with less control. The truth is that the validity of a work is at least doubly intractable: you can't figure it out just from reading the paper. The only way verification happens is through replication, and that cannot be done by reading alone. Works are (often, but not always) falsifiable through review, but not verifiable. The distinction matters.

And I actually think the increased noise here is a good thing. Too many people treat "published" (in a conference or journal) as a mark of credibility, both outsiders and insiders (other academics). It is too lazy a metric. Many of our problems arise from this, and I'd say that oversimplification of a metric is a corollary to Goodhart's Law. We researchers can all read a paper and determine pretty quickly whether it is fraudulent, at least if it is in our field. But outsiders can't. They can't even tell the credibility of a conference or journal, and there are too many scam ones these days. This creates an extra problem where science journalists, who are also caught in the ridiculous expectation of producing work in infinitesimal amounts of time, end up writing about works with a poor understanding of them (and of the context surrounding them). Adding noise here pushes them to reach out to experts, which will increase quality overall, as the expert talking to them will not just filter out crap but also be able to point to important nuance, things that a novice would miss. Especially when those things are buried in technical language or equations :)

In addition to this, it removes power from the lazy bureaucrats AND harms the frauds. I think it is easy to believe that fraudulent work would flourish in this environment, but I think the opposite. Yes, arXiv has plenty of fraudulent works on it, but they are few in comparison. The frauds go to the fraud journals. Their scheme only works because they are able to have someone give their crap a mark of approval. When there is no seal of approval, one must go ask the experts. It is just a shift in the trust structure, not a destruction of it. There'll be collusion rings, but we already have those (including in high-profile venues!). I do suspect there may be a few more at first, before things stabilize and everyone adapts. But I think we already do most of this naturally, so it won't be that hard.

But I do think we should keep conferences. There is value in meeting in person (I also think universities should encourage more inter-departmental collaboration, as well as extra-departmental and extra-university collaboration; I do think it is silly that we aren't having deep conversations with our neighbors). But I think these should be more invitation-based. You have invited speakers, and the rest is focused on getting people to talk and on facilitating these collaborations. That's one of the primary goals of conferences: building these social networks. Social media helps, but there's a big difference when sitting face to face. I also think they should have training sessions (like they do now), and workshops should be focused around these, not around publication. So less stuff is "published" at conferences, because publishing is just releasing work!

There are obvious downsides to this, and there is definitely a lot of room for refinement, but I think this is a fairly good structure. At the end of the day we need to have faith in one another. The structure of science has always been "trust, but verify." But we stopped doing the latter and pigeonholed our measures. So now we have neither trust nor verification. I think it has all been done with good intentions. I'll spare you the cliché, but what is a cliché if not something everyone can state but few people actually follow? I get the desire to remove all noise, but I think such a goal is fruitless; it is impossible. So instead I think it is about finding the optimal noise. Instead of trying to get rid of it, we should embrace it. I hope that as ML researchers we can recognize this, as noise is critical to our work; without it, it all falls apart. It is a feature, not a bug. It is a necessary attribute for generalization, and I think that isn't just true for machines.

[0] Personally, I find that the big reason for stress is that we remove creativity, flexibility, and a lot of the humanity from the work. We're overburdened by deadlines, there's too much work to do, and the truth is that the work is difficult to measure. Progress is difficult to see a priori, so you can constantly feel "behind". This just leads to slowdown and burnout. We're all here out of passion (at least I hope so! It'd be insane to do a PhD or research without the passion!). The structure should be there to gently push us back on track when we get too lost in some other fascinating rabbit hole (who knows if it goes anywhere, but is going down it a bad thing?). But if we can't have fun, we are less effective at our jobs. If we can't take time to think about the meta, we get worse at our jobs. If we aren't talking to our peers, we get worse at our jobs.



