yaantc · 2024-11-20T17:25:11 1732123511

Emacs is in the process of moving from legacy language modes that use regexps and elisp for syntax analysis to new modes that use tree-sitter.

In this context, what should a name like "c-mode" mean? Options: 1) it should stick to the old mode, cc-mode here. To use the new mode, explicitly use c-ts-mode; 2) it should move to the new tree-sitter mode, c-ts-mode. To use the old mode, explicitly use cc-mode; 3) it should mean the new preferred Emacs mode, with a way for the user to take back control if they have a different preference. This preferred mode will change at some point from legacy to tree-sitter.

The change is (3), with a move to tree-sitter in Emacs 30 (to be released soon) IIUC. It makes sense to me. Saying that anyone owns a name as generic as "c-mode" in an open source project just because they were first and have a long history as a contributor (thanks by the way!) seems excessive. A change of default is normal in an evolving project, and as long as it's clearly documented with a way to override it (which is the case IIUC), it's fine with me. One can dislike the change, but it's impossible to please everyone anyway. Emacs users are used to adjusting their configuration based on their preferences.

I understand it can be an emotional situation for the maintainer of the legacy mode. But I don't see the need to call foul play.

juxtapose · 2024-11-20T22:51:02 1732143062

I agree with giving users control, but unfortunately I cannot agree with the move to c-ts-mode. And I cannot disagree more with associating CC mode with "legacy" when it's objectively better than the other alternative, at least currently. I don't think Emacs developers are doing users a favor in this specific case.

CC Mode is extremely capable. Over the years it has matured to the point that almost all needs can be satisfied, and performance has never been a problem for me. It contains very few bugs, if any, that affect my use.

On the other hand, the tree-sitter major modes are not at all production-ready enough to be considered as the default. For one thing, the whole highlighting can break for complex macros and ifdefs. (I'd be glad to be enlightened whether it's theoretically possible to fix at all -- can you correctly highlight ifdefs without doing semantic analysis with the help of a compiler?) For another, CC mode has a feature called c-guess that can quickly analyze an existing source buffer and generate a format definition, which proves extremely valuable. Alas, c-ts-mode has zero support for it.

I had high hopes for tree-sitter. I turned on tree-sitter modes for all my coding when it was out, and now I have zero enabled. They still have a long way to go and I don't want to spend time debugging emacs code at work. :-)

Tree-sitter is not a panacea. Fast parsing alone is not what makes a good major mode.

fasa99 · 2024-11-21T20:48:37 1732222117

As someone whose pronouns are C-programmer/vim, I feel unsafe.

My living nightmare would be to develop highly verbose Java programs in an editor with 999 gorillion different "modes" with seemingly random names.

"oh, you're making a singletonfactoryfacade in Treesat-19 mode, you'll need to be using CCC-mode, treesat-19 mode is for factoryencapsulationfactory patterns"

shadowgovt · 2024-11-20T19:55:52 1732132552

And indeed, if anything, a project like emacs being unable to make a decision like this results in a project that slowly dies from the weight of its own history.

Tree-sitter is fairly universally understood now to be "the future." While cc-mode will likely have its place for a long time (hard to beat regexes on speed, even if they break down when the input is too noisy), moving the default to the tree-sitter implementation aligns with the other language modes going to tree-sitter. For good or ill, consistency is almost certainly better than new users having to learn "Your code is parsed by tree-sitter. Oh, except your C and C++ code, unless you set this flag, because Mackenzie threw a fit in 2024. That's a fun bit of history you get to care about forever now as a user!"

yaantc · 2024-09-02T08:14:38 1725264878

Hi, in case you're not already aware of the name clash, there's already a `rr` in the programming world. It's "record and replay": https://rr-project.org/.

Very different, but a very fine tool too.

rafram · 2024-09-02T12:59:55 1725281995

It doesn’t seem like the rr that GP linked to is their own project, just something they’ve found useful.

In any case, in the non-software world, “RR” stands for railroad, as it does in the name of that tool. You can’t own a common two-letter abbreviation.

yaantc · 2024-08-10T11:59:26 1723291166

See just above the map: "This has been age-standardized, assuming a constant age structure of the population for comparisons between countries and over time." This is what you suggest, IIUC?

konschubert · 2024-08-10T12:21:52 1723292512

Yes, and I apologise for not seeing this.

yaantc · 2024-07-18T15:21:26 1721316086

Take the infinite loop as just an example of an issue with depth-first search and backtracking. To be more general, I'd say that the issue is that the overall performance of a Prolog program can be very dependent on the ordering of its rules.

As an anecdote, a long time ago on a toy project, switching the order of two rules took the runtime for finding all solutions from ~15 min to around a second (it was a long time ago, my memory is fuzzy...). The difference was going down a "wrong" path and wasting a lot of time evaluating failing possibilities, vs. taking the right path and getting to the solutions very quickly.

So in practice, even if Prolog is declarative, to get good results you need to understand how the search is done, and organize the rules so that this search is done in the most efficient way. The runtime search is a leaky abstraction in a way ;)
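To make the ordering effect concrete, here is a minimal Python sketch of depth-first search with backtracking that stops at the first solution. It is not Prolog and not the toy project mentioned above: the search tree, the node names and the `solve` / `count_visits` helpers are all made up for illustration. The only thing that changes between the two runs is the order in which alternatives are tried.

```python
def solve(node, goal, children, order, counter):
    """Depth-first search with backtracking; returns True once `goal` is reached."""
    counter[0] += 1  # count every node the search visits
    if node == goal:
        return True
    for child in order(children.get(node, [])):
        if solve(child, goal, children, order, counter):
            return True
    return False  # dead end: backtrack to the caller


# Hypothetical search tree: "bad" opens a large subtree of dead ends,
# while "good" leads straight to the goal.
children = {"start": ["bad", "good"], "good": ["goal"]}
children["bad"] = [f"dead{i}" for i in range(1000)]


def count_visits(order):
    """Run the search with a given alternative ordering; return (found, visits)."""
    counter = [0]
    found = solve("start", "goal", children, order, counter)
    return found, counter[0]


bad_first = count_visits(lambda alts: alts)                    # tries "bad" first
good_first = count_visits(lambda alts: list(reversed(alts)))   # tries "good" first
print(bad_first, good_first)
```

Both runs find the goal, but the first one walks through every dead end before it gets there. Same "declarative" description of the problem, very different amount of work.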

It's not an issue limited to Prolog: many solvers can be helped by steering the search in the "right" way. For example, a declarative language for constraint problems like MiniZinc provides ways to give the solver some indication of how best to search.

Also, most modern Prolog systems support tabling, which departs from strict DFS+backtracking and can help in some cases. But here too, getting the best results may require understanding how the engine will search, including tabling.
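As a rough analogy only: part of what tabling does is remember the answers to subgoals that have already been solved, so they are not recomputed. The Python sketch below shows just that memoization aspect using `functools.lru_cache`; real Prolog tabling does more (it also changes how recursive and cyclic goals are evaluated), and the path-counting problem and function names here are invented for the example.

```python
from functools import lru_cache

calls = {"plain": 0, "tabled": 0}


def paths_plain(r, c):
    """Count monotone grid paths from (r, c) to an edge, recomputing everything."""
    calls["plain"] += 1
    if r == 0 or c == 0:
        return 1
    return paths_plain(r - 1, c) + paths_plain(r, c - 1)


@lru_cache(maxsize=None)
def paths_tabled(r, c):
    """Same recursion, but each distinct (r, c) subproblem is solved only once."""
    calls["tabled"] += 1
    if r == 0 or c == 0:
        return 1
    return paths_tabled(r - 1, c) + paths_tabled(r, c - 1)


plain = paths_plain(10, 10)
tabled = paths_tabled(10, 10)
print(plain, tabled, calls)
```

Both versions return the same count, but the memoized one evaluates far fewer calls. How much it helps depends entirely on how often the search revisits the same subproblem, which is again something you only know if you understand how the engine searches.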

yaantc · 2024-07-08T12:39:40 1720442380

You may want to try x2go. It uses the older NX protocol version 3, while NoMachine is at version 4. It's good enough for my use case, and supports remote applications just fine: this is how I use it.

yaantc · 2024-07-07T07:55:23 1720338923

> I bet it had diesel generators when it was in service with AT&T to boot.

20 to 25 years ago I visited a telecom switch center in Paris, the one under the Tuileries garden next to the Louvre. They had a huge, empty diesel generator room. The generators had all been replaced by a small turbine (not sure it's the right English term), just the same as what's used to power a helicopter. It was in a relatively small soundproof box, with a special vent for the exhaust, kind of lost on the side of a huge underground room.

As the guy in charge explained to us, it was much more compact and convenient. The big risk was in getting it started, this was the tricky part. Once started it was extremely reliable.

tonyarkles · 2024-07-07T14:01:50 1720360910

> by a small turbine (not sure it's the right English term)

That's the right English word, yes. And that's pretty cool!

yaantc · 2024-06-21T12:21:18 1718972478

According to https://www.withouthotair.com/, no.

yaantc · 2024-06-18T12:36:04 1718714164

> I wonder how (if?) this thing runs Linux.

See there: https://www.qualcomm.com/developer/blog/2024/05/upstreaming-...

Should be good after all this has landed in upstream Linux and all the distros (which may take a bit of time, as usual).

w0m · 2024-06-18T13:23:50 1718717030

In the interim, I really want to see WSL performance. Hopefully in the next few days we get some clarity there; not really an MBA competitor for me until then.

yaantc · 2024-05-29T07:26:15 1716967575

> [...] the standard way to get structured output seems to be to retry the query until the stochastic language model produces expected output.

No, that would be very inefficient. At each token generation step, the LLM provides a likelihood for every defined token based on the past context. The structured output is defined by a grammar, which defines the legal tokens for the next step. You can then take the intersection of both (ignore any token not allowed by the grammar), and then select among the authorized tokens based on the LLM likelihood for them in the usual way. So it's a direct constraint, and it's efficient.
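A minimal Python sketch of a single such step, under stated assumptions: the `logits` dictionary stands in for the model's per-token scores, `allowed` stands in for the set of tokens the grammar accepts next, and `constrained_sample` is a hypothetical helper, not the API of any particular library. Real implementations work on tensors over the whole vocabulary and track the grammar state between steps.

```python
import math
import random


def constrained_sample(logits, allowed, rng):
    """Sample one token from `logits`, restricted to the `allowed` set."""
    # Keep only the tokens the grammar allows at this step.
    masked = {tok: score for tok, score in logits.items() if tok in allowed}
    if not masked:
        raise ValueError("grammar allows no token known to the model")
    # Softmax over the surviving tokens only, i.e. renormalise after masking.
    top = max(masked.values())
    weights = {tok: math.exp(score - top) for tok, score in masked.items()}
    total = sum(weights.values())
    # Draw a token in proportion to its renormalised probability.
    r = rng.random() * total
    acc = 0.0
    for tok, w in weights.items():
        acc += w
        if r <= acc:
            return tok
    return tok  # guard against floating-point rounding


# Toy step: the model prefers "hello", but the grammar only allows JSON openers.
logits = {"hello": 5.0, "{": 1.0, "[": 0.5, "}": 0.1}
allowed = {"{", "["}
rng = random.Random(0)
picks = [constrained_sample(logits, allowed, rng) for _ in range(1000)]
print(sorted(set(picks)))
```

Every draw is legal by construction, so nothing ever needs to be retried, and among the legal tokens the model's own preferences still decide which one gets picked.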

__loam · 2024-05-29T07:40:05 1716968405

Yeah that sounds way better. I saw one of the python libraries they recommended mention retries and I thought, this can't be that awful can it?

yaantc · 2024-05-08T17:27:24 1715189244

For a text-based version of the "tree of chats" idea, using Emacs, Org mode and gptel, see `gptel-org-branching-context` in: https://github.com/karthink/gptel?tab=readme-ov-file#extra-o...

tomfreemax · 2024-05-08T18:54:20 1715194460

Of course, it can be done with emacs and org mode...

It's almost like how every piece of software or library will get ported to JavaScript eventually, with the difference that emacs and org mode were there first.