
skp1995 · 2025-02-08T13:16:54 1739020614

Hey I am the coredev on sidecar. The reason you see autogenerated PRs is cause I am using our agents to write the code for the agent lol

The big difference is the complete loop: each PR gets its own VM with the toolchains installed, so the agent can run cargo check or cargo test etc.

We do find the LLMs of today are not the best elite engineers but very very competent junior engineers. It's been a weird but eye opening workflow to use.

bravura · 2025-02-08T16:12:07 1739031127

I am super interested in sidecar. When will it support o3-mini-high?

I also need an SDK to script it, tbh. What I want is to have actually a few different agents that interact with each other. Do you expose a good SDK?

skp1995 · 2025-02-08T17:47:57 1739036877

It does support o3-mini-high already, we use it for a few flows in the agent.

What kind of SDK support are you looking for?

skp1995 · 2025-01-08T22:26:39 1736375199

Hey! One of the creators of Aide here.

ngl the total expenditure was around $10k. In terms of test-time compute, we ran up to 20X agents on the same problem to first understand if the bitter lesson paradigm of "scale is the answer" really holds true.

The final submission ran 5X agents, and the decider was based on the mean score of the rewards. The cost was around $20 per problem.
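A minimal sketch of what a "pick the candidate with the highest mean reward" decider could look like. The `Candidate` type, its fields, and `pick_best` are hypothetical names for illustration; the comment above does not say how the rewards were produced or how ties were handled, so this is only one possible reading.

```rust
/// One agent run's proposed patch (hypothetical type, not from sidecar).
struct Candidate {
    id: usize,
    /// Reward scores assigned to this candidate (assumed to be plain floats).
    rewards: Vec<f64>,
}

/// Arithmetic mean, or None when there are no scores to average.
fn mean(xs: &[f64]) -> Option<f64> {
    if xs.is_empty() {
        None
    } else {
        Some(xs.iter().sum::<f64>() / xs.len() as f64)
    }
}

/// Returns the candidate with the highest mean reward.
/// Candidates without any rewards are skipped.
fn pick_best(candidates: &[Candidate]) -> Option<&Candidate> {
    candidates
        .iter()
        .filter_map(|c| mean(&c.rewards).map(|m| (c, m)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(c, _)| c)
}
```

With 5 agents per problem, `candidates` would hold 5 entries and `pick_best` would return the one to submit.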

We are going to push this scaling paradigm a bit more. My honest gut feeling is that swe-bench as a benchmark is primed for saturation real soon:

1. These problem statements are in the training data for the LLMs

2. Brute-forcing the answer the way we are doing works and we just proved it, so someone is going to take a better stab at it real soon

skp1995 · 2024-11-17T01:10:54 1731805854

I do have to ask, I have worked in codebases which used lifetimes and didn't lean into Rc/Arc and vice-versa.

I used to think Arc/Rc was a shortcut to avoiding the borrow checker shenanigans, but have evolved that thinking over time.

You do mention it in your comment so wondering if you have anything to share about it
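For readers following along, a small sketch of the two styles being contrasted. `BorrowedView` and `SharedView` are made-up types for illustration: the first ties the struct to a lifetime, the second uses `Rc` so each holder keeps the data alive on its own.

```rust
use std::rc::Rc;

/// Lifetime-based: the view borrows the text and cannot outlive its owner.
struct BorrowedView<'a> {
    text: &'a str,
}

/// Rc-based: the view shares ownership, so no lifetime parameter is needed.
struct SharedView {
    text: Rc<str>,
}

/// The returned slice is tied to the original owner's lifetime 'a.
fn first_word_borrowed<'a>(v: &BorrowedView<'a>) -> &'a str {
    v.text.split_whitespace().next().unwrap_or("")
}

/// The returned slice is tied to the borrow of the SharedView itself.
fn first_word_shared(v: &SharedView) -> &str {
    v.text.split_whitespace().next().unwrap_or("")
}
```

The borrowed version costs nothing at runtime but pushes the lifetime into every signature that touches it; the `Rc` version pays for a reference count and keeps signatures simpler. `Arc` has the same shape for data shared across threads.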

skp1995 · 2024-11-17T00:43:43 1731804223

Rust can be hard to get right because of the borrow checker. I had a similar situation happen to me where I went about refactoring the code to make the borrow checker happy ... until the last bit when things stopped compiling and I realized my approach was completely wrong (in the rust world, I had a self-reference in the structs)

Having said this, the benefits of the borrow checker outweigh the shortcomings. I can feel myself writing better code in other languages (I tend to think about the layout and the mutability and lifetimes upfront more now)
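To illustrate the self-reference problem: a struct that owns a `String` and also stores `&str` slices into that same `String` cannot be expressed in safe Rust. One common workaround, sketched here with a hypothetical `Doc` type (not the code from the comment), is to store byte ranges instead of references and resolve them on demand.

```rust
use std::ops::Range;

// What one might want, but cannot build in safe Rust, because `words`
// would borrow from `text` inside the same struct:
// struct Doc<'a> { text: String, words: Vec<&'a str> }

/// Workaround: keep byte ranges into `text` rather than references.
struct Doc {
    text: String,
    words: Vec<Range<usize>>,
}

impl Doc {
    /// Splits on whitespace and records the byte range of each word.
    fn new(text: String) -> Self {
        let mut words = Vec::new();
        let mut start = None;
        for (i, ch) in text.char_indices() {
            match (ch.is_whitespace(), start) {
                (false, None) => start = Some(i),
                (true, Some(s)) => {
                    words.push(s..i);
                    start = None;
                }
                _ => {}
            }
        }
        // Close the last word if the text does not end in whitespace.
        if let Some(s) = start {
            words.push(s..text.len());
        }
        Doc { text, words }
    }

    /// Resolves the n-th range back into a slice borrowed from `self`.
    fn word(&self, n: usize) -> Option<&str> {
        self.words.get(n).map(|r| &self.text[r.clone()])
    }
}
```

The ranges carry no lifetime, so `Doc` can be moved and stored freely; the borrow only appears when `word` is called.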

My rust code now is very functional, which seems to work best with lifetimes.

I would love to know more about the author's pain. I do hope rustc gets better at lifetime compilation errors cause some of them can be very very gnarly.

estebank · 2024-11-17T00:48:30 1731804510

> I do hope rustc gets better at lifetime compilation errors cause some of them can be very very gnarly.

When this happens, file tickets! We do our best to improve diagnostics over time, but the best improvements have been reactive, by fixing a case that we never encountered but our users did.

skp1995 · 2024-11-17T00:59:34 1731805174

Will keep that in mind going forward! The most recent ones which I have been hitting are around "higher-ranked lifetime error".

I know my way around this now, which is to literally binary search over the timeline of my edits (commenting out code and then reintroducing it) to see what causes the compiler to trip over (there might be better ways to debug this, and I am all ears)
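The binary search over edits described above can be sketched as a plain bisection. Everything here is hypothetical: `first_breaking_edit` and the `builds_ok` callback are illustrative names, and the sketch assumes the code compiles with no edits applied and that once the offending edit is included, every later state also fails to compile.

```rust
/// Finds the index (0-based) of the first edit whose inclusion breaks the build.
/// `builds_ok(k)` should report whether the code compiles with the first `k`
/// edits applied. Returns None if everything compiles or there are no edits.
fn first_breaking_edit<F>(n_edits: usize, mut builds_ok: F) -> Option<usize>
where
    F: FnMut(usize) -> bool,
{
    if n_edits == 0 || builds_ok(n_edits) {
        return None;
    }
    // Invariant: the build passes with `lo` edits and fails with `hi` edits.
    let (mut lo, mut hi) = (0, n_edits);
    while hi - lo > 1 {
        let mid = lo + (hi - lo) / 2;
        if builds_ok(mid) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    Some(hi - 1)
}
```

In practice `builds_ok` would be the manual step of commenting code back in and running the compiler; the bisection only decides which state to try next, so the number of rebuilds grows with the logarithm of the number of edits.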

Most of the time this error is several layers deep in my application, so even tho I want to ticket it up, not being able to create a minimal repro for anyone to iterate against feels like a bit of wasted energy on all sides. Do let me know if I should change this way of thinking and I can promise myself to start being more proactive.

estebank · 2024-11-17T01:18:29 1731806309

If it's public code, a link to a branch with the issue can still be useful. Looking at the compiler internals you can get a better sense on how to minimize the issue. That being said, not having a minimised repro lowers the chance of it getting addressed quickly.

Even if you have already figured out how to deal with it, your future colleagues might not, and by improving the diagnostic you would also be getting that time manually commenting code back.

oneshtein · 2024-11-17T07:14:54 1731827694

> Rust can be hard to get right because of the borrow checker.

In the same vein: «C/C++ can be hard to get right because of the valgrind.» ;-)

skp1995 · 2024-11-11T15:44:42 1731339882

I am missing the link to the thread, but diffusion models also give a very consistent output when prompted with `IMG_{number}`. Part of the reason could be the training data distribution.

skp1995 · 2024-11-07T17:50:07 1731001807

That's a fair point, a significant part of our 4-person team had to skill up on the VSCode codebase to be able to meaningfully make changes to it.

I would love to know your workflow. You mention CLI tool or VSCode plugin, which one of them works for you? What's missing from them where Aide can fill in the gap?

portpecos · 2024-11-07T18:27:44 1731004064

Plugin, which gives you access to the UI.

tertle950 · 2024-11-07T23:22:10 1731021730

Make a plugin that interfaces with a CLI tool! Best of both worlds, I think!

skp1995 · 2024-11-07T17:21:44 1731000104

There is pinned context on the very top where you can pin the files which you frequently use.

We will start including the open file by default in the context very soon (one of the gotchas here is that the open file might not be related to the question you have)

skp1995 · 2024-11-07T09:20:57 1730971257

It supports all platforms, we took inspiration from the macOS Spotlight search widget for inline editing.

santiagobasulto · 2024-11-07T09:25:24 1730971524

I think it's confusing for people that only one big green button shows "Download for mac" and the other download links are at the footer.

skp1995 · 2024-11-07T11:52:46 1730980366

ugh.. in which case our platform detection code is not working as expected. We will look into that

skp1995 · 2024-11-07T03:32:05 1730950325

JetBrains is very interesting, what are the best performing extensions out there for it?

I do wonder what API-level access we get over there as well. For sidecar to run, we need LSP + a web/panel for the UX part (deeper editor layer like undo and redo stack access will also be cool but not totally necessary)

skp1995 · 2024-11-07T03:27:47 1730950067

how do you use the computer usage.. I do find it a very interesting API layer to play around with

jaylane · 2024-11-08T02:00:34 1731031234

I've been using it recently to have cline check my local dev server to review its changes and iterate if there is anything off with the design changes it has made. Example prompt:

I have attached a screenshot of an updated design for the Navbar component. My local dev server is running at localhost:3000. Update the component to match the new designs and double check your changes after save.