Not sure I agree in this regard. We are, after all, aiming to create a mental mod...

mjburgess · 2024-03-10T23:27:26 1710113246

There's nothing to replicate. ML models are associative statistical models of historical data.

There are no experimental conditions, no causal properties, no modelled causal mechanisms, no theories at all. "Replication" means that you can reproduce an experiment designed to validate a causal hypothesis.

Fitting a function to data isn't an experiment; it's just a way of compressing the data into a more efficient representation. That's all ML is. There are no explanations here (of the data) to assess.

dartos · 2024-03-11T03:45:34 1710128734

I don’t think that’s true either.

Take the research into LoRA, for example. Surely the basic scientific method was followed when developing it. You can see that from the paper.

Obviously the results can be reproduced. Unlike in many other fields, reproducibility can be pretty trivial in CS.

Training a model isn’t really a science, but the work that goes into creating the models surely is.

mjburgess · 2024-03-11T07:46:32 1710143192

CS isn't science; it's discrete mathematics.

dartos · 2024-03-11T12:39:52 1710160792

All sciences are progressively more impure (e.g., applied) forms of math.

data_maan · 2024-03-11T10:59:19 1710154759

dartos · 2024-03-11T12:46:32 1710161192

Also there’s literally a causal relationship between model topology and quality of output.

This can be plainly seen when trying to get a model to replicate its input.

Some models perform better in fewer steps, some perform worse for many steps, then suddenly much better.

How is discovering these properties of statistical models NOT science?

mjburgess · 2024-03-11T14:20:03 1710166803

I do think there's an empirical study of ML models and that could be a science. Its output could include things like,

"the reason prompt Q generates A1..An is because documents D1..Dn were in the training data; these documents were created by people P1..Pn for reasons R1..Rn. The answers A1..An relate to D1..Dn in so-and-so way. The quality of the answers is Q1..Qn, and derives from the properties of the documents generated by people with beliefs/knowledge/etc. K1..Kn"

This explains how the distribution of the weights produces useful output by giving the causal process that leads to training data distributions.

The relationship between the weights and the training data itself is *not* causal.

E.g., X = 0,1,2,3; Y = A,A,B,B; f(x; w) = A if x <= w else B

w = 1 because the rule x <= 1 partitions Y such that P(x|w) is maximised. These are statistical and logical relationships ("partitions", "maximises").
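
The toy example above can be written out directly. This is only a minimal sketch: it assumes the fit is done by exhaustive search over a small, hypothetical grid of candidate thresholds, scored by training accuracy. The `accuracy` helper and the `candidates` list are illustrative and not part of the comment.

```python
# Training data from the comment: inputs X and labels Y.
X = [0, 1, 2, 3]
Y = ["A", "A", "B", "B"]


def f(x, w):
    """Threshold rule from the comment: A if x <= w, else B."""
    return "A" if x <= w else "B"


def accuracy(w):
    """Fraction of training pairs that the rule with threshold w labels correctly."""
    return sum(f(x, w) == y for x, y in zip(X, Y)) / len(X)


# Assumed fitting procedure: try each candidate threshold, keep the best scorer.
candidates = [-1, 0, 1, 2, 3]
best_w = max(candidates, key=accuracy)

print(best_w, accuracy(best_w))
```

Nothing in the search refers to how the data came about; `best_w` is picked purely by how well the rule partitions the labels.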

A causal relationship runs from one causal property of an object (extended in space and time) to another causal property, via a physical mechanism that reliably and necessarily brings about some effect.

So, "the heat of the boiling water cooked the carrot because heat is... the energetic motion of molecules ... and cooking is .... and so heating brings about cooking necessarily because..."

heating, water, cooking, carrot, motion, molecules, etc. -- their relationships here are not abstract; they are concretely in space and time, causally affecting each other, etc.

dartos · 2024-03-11T15:59:01 1710172741

So what do you call the process of discovering those causal properties?

Was physics not actually a science until we uncovered quarks, since we weren’t sure what caused the differences in subatomic particles? (I’m not a physicist, but I hope that illustrates my point)

Keep in mind most ML papers on arXiv are just describing phenomena we find with these large statistical models. Also there’s more to CS than ML.

mjburgess · 2024-03-11T16:21:08 1710174068

You're conflating the need to use physical devices to find relationships, with the character of those relationships.

I need to use my hand, a pen and paper to draw a mathematical formula. That formula (say, 2+2=4) expresses no causal relationships.

The whole field of computer science is largely concerned with abstract (typically logical) relationships between mathematical objects; or in the case of ML, statistical ones.

Computer science has no scientific methodology for producing scientific explanations -- it isn't science. It is science in the old German sense of just "a systematic study".

Scientists conduct experiments in which they hold fixed some causal variables (i.e., causally efficacious physical properties), and vary others, according to an explanatory framework. They do this in order to explore the space of possible explanations.

I can think of no case in the whole field of CS in which causal variables are held fixed, since there is no study of them. Computer science does not even study voltage, or silicon, or anything else as physical objects with causal properties (that is electrical engineering, physics, etc.).

Computer science ought just to be called "applied discrete mathematics".

dartos · 2024-03-11T19:17:20 1710184640

I see where you’re coming from, but I think there’s more to it than that, specifically with nondeterminism.

Say I observe some phenomenon in a bit of software that was built to translate language, such as the ability to summarize text.

Then I dig into that software and decide to change a specific portion of it, keeping all other aspects of the software and its runtime the same, and I notice it’s no longer able to summarize text.

In that case I’ve discovered a causal relationship between the portion I changed and the phenomenon of text summarization. Even though the program was constructed, there are unknown aspects.
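
The procedure described here (change one portion, hold everything else fixed, compare behavior) can be sketched as a toy ablation. Everything in the sketch is hypothetical: `pipeline`, its `keep_first_sentence` switch, and the `summarizes` check are stand-ins invented for illustration, not a real translation or summarization system.

```python
def pipeline(text, keep_first_sentence=True):
    """Hypothetical program with one switchable component under test."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    if keep_first_sentence:
        # The portion being changed: truncate to the first sentence.
        sentences = sentences[:1]
    return ". ".join(sentences) + "."


def summarizes(output, source):
    """Crude stand-in for 'able to summarize': output is shorter than the input."""
    return len(output) < len(source)


# Same input for both runs, so only the one component differs.
source = "Models compress data. They fit functions. Some people call that science."

baseline = pipeline(source, keep_first_sentence=True)
ablated = pipeline(source, keep_first_sentence=False)

print(summarizes(baseline, source), summarizes(ablated, source))
```

With the input held fixed, the only difference between the two runs is the switched-off component, which is the comparison the comment is pointing at.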

How is that not the process of science?

Sorry if this is just my question from earlier, rephrased, but I still don’t see how this isn’t a scientific method.