source99's comments | Hacker News

I would love to hear some stats on how many car/bike collisions happen every day in SF.


Prior to the pandemic, there was an average of 1.4 bike accidents per day - https://www.mjqlaw.com/san-francisco-2022-bicycle-accident-s...


I am unable to drill down on my phone right now, but presumably we can figure this out for other years thanks to the Vision Zero plan/project. https://sfgov.org/scorecards/transportation/traffic-fataliti...

Yes, I know this site only shows fatalities, but I'm assuming the data is there if somebody is able to figure out the API/Power BI stuff?


I don't know either but I do know that this collision has already got infinity times more press coverage and police attention than that time a wrong-way driver hit me on Howard and fled the scene. It's a problem with the whole AV discourse.


I can name two times in the last five years when, while biking, I had a worse encounter than the one described (though I reserve judgment on fault until more details come out), including one that required hospitalization, where the human driver decided to drive away.

The frustrating bit about AV discourse is that AVs are held against the standard of a perfect driver. Which I don't actually object to--all drivers should be compared to perfection. But it's enraging when people do so for AVs without comparing against human drivers, which, as everyone should know, fall far short of perfection. And with AVs we have the capability to iteratively improve them toward perfection, while there's no way to do the same with humans.


SewerAI | SewerAI.com | UX Designer | Bay Area (some days remote) | FT | $130K - $160K + Equity (series A/B company)

We help cities manage their sewer infrastructure. We have a data platform that manages terabytes of video data and an AI tool that helps cities find problems in their sewer inspection videos. Our AI is focused on computer vision - not LLMs. Our data platform helps cities and private businesses run their inspections quickly, easily and accurately. We are looking for a UX designer who can help drive customer productivity and be the voice of the customer. Please reach out to jobs at sewerai dot com


I have a very similar issue and likely the same cause. What has helped for me: cortisone shots (temporary relief but not a fix), losing weight, core-strengthening activities, and switching to a standing desk 100% of the time. YMMV. 43-year-old male, lifelong developer.


There are two kinds of VC-funded startups: those that give money to customers and those that give money to employees.


This is one of my favorite test cases:

Prompt: Here is a riddle. It is a common riddle but with some changes that make it more difficult. You are an alien that will live for at least 10000 years. You have no sense of temperature. you can not feel hot or cold. you have eyes and can see. you are in a house. downstairs are 3 light switches that control 3 light bulbs that are upstairs. Each light bulb will last for exactly 3 years. you can tell by looking at a bulb if it has burned out after 3 years. devise a strategy to determine which switch controls which light.

ChatGPT 4: The standard solution to this problem involves using the heat generated by the bulbs, but since you as an alien can't feel temperature, we need to utilize the fact that these bulbs have a specific lifespan of 3 years. Here's a strategy:

Turn on the first switch and wait for 3 years. After 3 years, turn it off.

Right after turning off the first switch, turn on the second switch.

Leave the third switch off throughout this process.

Now, go upstairs to inspect the bulbs:

The bulb that is off and has burned out is controlled by the first switch. This bulb was turned on for 3 years, so it has reached the end of its lifespan and burned out.

The bulb that is on is controlled by the second switch. You turned this on right after turning off the first switch, so it's still functioning.

The bulb that is off and hasn't burned out is controlled by the third switch. This switch was never turned on, so the bulb is still in a brand-new condition.

This strategy ensures that you can confidently identify which switch controls which bulb.


It's telling that it didn't pick up on the fact that the whole "once you look you can't touch the switches anymore" constraint isn't in this version of the riddle. I mean, the obvious strategy in this case is: turn on the first switch, go upstairs, look at the bulbs. Go back downstairs and try the second switch. You've now got them all mapped out in about 30 seconds.
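
Spelled out as a toy sketch (the helper functions here are made up, just to show how trivial the unconstrained version is):

  # toy sketch: with unlimited trips, mapping switches to bulbs is trivial
  def map_switches(turn_on, turn_off, look_at_bulbs):
      mapping = {}
      for switch in (1, 2):
          turn_on(switch)
          mapping[switch] = look_at_bulbs()    # whichever bulb just lit up belongs to this switch
          turn_off(switch)
      mapping[3] = "whichever bulb never lit"  # the third switch gets the leftover bulb
      return mapping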

Using GPT4 I asked if there was a way to do it in less than 3 years, but it couldn't figure this out even when I told it you can look at and use the switches as much as you want. Instead it suggested turning on a switch for 10 minutes, then using your "excellent alien vision" to determine which 3-year-lifespan bulb has 10 minutes of wear on it.

Makes me think GPT4 doesn't really have better reasoning, it just looks like better reasoning because it's been fed way more data.


Variants of common riddles remain the final frontier. I'm trying to cross the river in a canoe with a carrot, cabbage, and cucumber...


Variants of common riddles actually can be solved with GPT-4, but you have to rewrite them so they don't look like the riddle it has memorized (sometimes it's as easy as changing names to something completely different). Turns out language models trust their memory quite a bit. Slightly related - they won't actually use the results of tools if the results differ a lot from what the model expects the output to be - https://vgel.me/posts/tools-not-needed/


"Language models trust their memory quite a bit."

All they have is memory, either in the weights or the input prompt. To the extent that these models appear to reason, it is precisely in the ability to successfully substitute information from the prompt into reasoning patterns in the training data. It shouldn't be any surprise that this fails when patterns in the prompt strongly condition the model to reproduce particular patterns of reasoning (eg, many words in the riddle indicate a well known riddle, but the details are different).

I know the impulse to anthropomorphize is almost impossibly seductive, but I find that the best way to understand and use these models is to remember: they are giant conditional probability distributions for the next token.
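
To make that concrete, here's a rough sketch (in made-up, simplified Python) of what "generate text" amounts to; the model function is a stand-in for the real network, and all it returns is a conditional distribution over the vocabulary:

  # toy sketch: an LLM just repeatedly samples from P(next token | everything so far)
  import numpy as np

  def generate(model, prompt_tokens, n_new, vocab_size):
      tokens = list(prompt_tokens)
      for _ in range(n_new):
          probs = model(tokens)                             # conditional distribution over the vocab
          nxt = int(np.random.choice(vocab_size, p=probs))  # sample the next token
          tokens.append(nxt)
      return tokens

Everything interesting lives inside model, but the loop itself never does anything except pick the next token.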


LLMs trained on code reason better. Perform better on reasoning benchmarks even if the benchmarks have nothing to do with code. You're wrong.

https://arxiv.org/abs/2210.07128


Code is often just a sequence of steps (sometimes with comments to indicate goals). As such, it is just another form of patterns of reasoning. Many chains of thought that you would utilize in code are useful skeletons to think about other things.

I don't see how this undermines my point.


If code transfers to reasoning tasks that don't have anything to do with code then what is being "substituted" ? Ideas and concepts ?

Code and MMLU don't share similar "reasoning patterns" unless you're being extremely vague. In the "they both require reasoning" sense.


Structure. GPT has seen lots of logical constructions/arguments for things. These are either explicitly in code (in documentation) or are implicitly in code (code is often a linear sequence of steps building to, for example, a return value). ChatGPT learns patterns like this. A prompt may condition the generator to produce something like one of these patterns with elements from the prompt substituted into the generated text. This works relatively often, but fails exactly in the case where the prompt so strongly indicates a pattern that won't work for the prompt given.
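
As a made-up illustration of what I mean by code reading like a linear chain of steps toward a conclusion:

  def can_afford(price, budget, tax_rate):
      tax = price * tax_rate       # step 1: the tax owed
      total = price + tax          # step 2: the total cost
      return total <= budget       # the conclusion follows from the steps

Swap in different variables and you have the skeleton of plenty of non-code arguments.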

I won't say these models can't reason per se, but they can only reason using their memories and the prompt. There is nothing else for them to compute on.

In a hand-wavy kind of way, when ChatGPT fails at a riddle phrased so as to make it seem similar to a common riddle, you're seeing overfitting. But given the quantity of data these models consume, it's hard to imagine how to test for overfitting, because the training data contains things similar to almost anything you can imagine. Because of that I'm still very suspicious of claims that they "reason" in any strong sense of the word.

But if you try very hard you can find "held out" data and when you test on it, GPT4 stops looking so smart:

https://teddit.net/r/singularity/comments/121tc48/gpt4_fails...

That said, I've been very impressed by GPT4 as a productivity tool.


>but they can only reason using their memories and the prompt.

Eh no.

https://arxiv.org/abs/2212.10559

>But if you try very hard you can find "held out" data and when you test on it, GPT4 stops looking so smart:

This can be done to anybody. This can be done to you. It's not a gotcha. Nobody is saying GPTs don't/can't memorize.


Two things about this.

1. The paper in question demonstrates a formal duality between the transformer architecture and gradient descent. If you take this to indicate that the model reasons in some way, then it would be true of the smallest GPT as well as the largest (it is, after all, a consequence of the architecture rather than anything the model has learned to do per se). In any case, the fact that the model can perform the equivalent of a finite number of gradient-like steps on its way to calculating its final conditioned probabilities doesn't really suggest to me that the model reasons in a general way.

2. You are right that no one disputes the model's ability to memorize (and rephrase). What is at question here is whether the model can reason. If it can do 10 code questions it has seen before but fails to do 10 it hasn't (of similar difficulty) then it strongly suggests that it isn't reasoning its way through the questions, but regurgitating/rephrasing.


>If it can do 10 code questions it has seen before but fails to do 10 it hasn't (of similar difficulty) then it strongly suggests that it isn't reasoning its way through the questions, but regurgitating/rephrasing.

First of all, coding is one thing where expecting a perfect try on the first pass makes no sense. That GPT-4 didn't one-shot those problems doesn't mean it can't solve them.

Moreover, all this says, if true, is that GPT-4 isn't as good at coding as initially thought. Nothing else. It doesn't mean it doesn't reason. There are many other tasks where GPT-4 performs about as well on out-of-distribution/unseen data as it does on seen data.


This really just makes it seem like it's not reasoning at all. The trick (or rather un-trick) here is that you can look at the bulbs as many times as you want. Even if I explicitly tell GPT4 that, it doesn't get it.

It's not reasoning, it's mapping.


Maybe I'm missing something, but this seems like a really poorly worded riddle. Or just a bad riddle.

Given what you stated, the solution would be: walk upstairs and look at the 3 bulbs, walk downstairs and flip one of the switches, walk upstairs and see which bulb changed on/off. Repeat for the other two switches.

This takes roughly 5 mins rather than your solution which takes 3 years.


You're missing and getting the point at the same time.

Your solution would be the correct one, while GPT pulls the "only look once" constraint out of nowhere. The riddle is perfectly fine.


Ah thanks, that clears it up. The formatting made me miss that that was the ChatGPT response. I thought it was the user sharing the solution they were looking for.


This is the crux of the problem for GPT: enough of the text is from the "standard" riddle that it cannot break from it to actually answer the question.

I’m seeing this all of the time when trying to get GPT to perform certain question-answering tasks. It’s heavily biased towards the “correct” answer - even when the prompt presents directly contradictory information.


Except the part about "with some changes that make it more difficult" is false. The changes make it trivial.

Edit: Omitting that part still makes GPT4 overcomplicate the solution.


You just passed the test that you're not an LLM.


The reply is funny. I told ChatGPT that it can do multiple trips and that it can see both switches and light bulbs at the same time. ChatGPT keeps insisting that you have to turn on a switch, wait a few minutes, flip that switch off, turn the other switch on, then go and look and feel the switches. The classic solution to the riddle.

  But if you can see both switches and light bulbs, you turn on one switch and see which bulb turns on. You then turn on the 2nd switch and see which light turns on. You are done. No wait needs to happen ;-)


"feel the switches"?


Arguably LLMs are handicapped by tuning that always expects some answer without questioning the question.


Actually, this is a really good point. Asking a question means that you have an internal model of the world and you realize that some data is way off. Is it even possible at all to train an LLM to ask a question? This also gives me an idea for the future where most online text is generated by AI: we will play the game of responding to questions with questions, and then we know we are human!


Are you looking for this: turn one light on, walk upstairs and note the light that is on. Walk downstairs, turn another light on, walk upstairs and note the new light that is on. Now you know which switches control which lights, and it only took you 5 minutes instead of 6 years.


SewerAI | Walnut Creek CA | Bay Area / Remote | AI Engineer & Backend Engineer | https://www.sewerai.com/

SewerAI is an AI and data management company that is improving the speed, effectiveness and cost for cities and engineering firms to manage infrastructure. Our cutting edge AI lowers the total cost of ownership of cities’ infrastructure by reducing the dirty, dull and dangerous work that must be done to maintain infrastructure.

Come join a growing and A-round-funded startup to help provide real value to the world.

https://sewerai.com/careers


SewerAI | Remote | USA | Full-time | UX + UI Designer | $120K - $140K

SewerAI is a funded startup (with real revenues) that is helping cities manage their physical infrastructure. Yes we are really named SewerAI and our first product is AI to find problems in sewer inspection videos. Our second product is a cloud data management platform to help cities manage all of their data. We sell these products both to cities and the private service firms that support them.

We are looking for our first design hire. We worked with a consulting firm to roll out the first version of our site, but we would now like to bring that expertise in-house. We are looking for someone who can roll up their sleeves and work with our product and engineering team to add new features and capabilities to the site. There is a lot of work to do!

We are a technology company that is looking to make improvements in the physical world, helping the environment and the citizens these cities serve.

Please apply here - https://www.sewerai.com/careers-listing/ui-designer


If anyone is interested in the subject this is my company. www.sewerai.com


My company- www.sewerai.com


I had to stop reading The Dice Man. I just found it to be very depressing because of all the negative actions he takes.

