It is also about getting the most out of that hypothesis testing by defining success and failure as well as you can.
I have encountered this "mediocre success" many times in AI solutions due to a lack of problem definition. For instance, with LLMs it is now very easy to write a prompt that gives you the output you want on the 5 or 6 examples you have in mind. The hard part is building up your testing scenario from there, gathering as much data as possible until it is representative of your use cases.
That is the only way to actually test your prompts, RAG strategies, and so on, instead of buying into the latest CoT-like prompt trend.
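As a rough sketch of what such a testing scenario could look like, here is a tiny evaluation harness. Everything in it is an assumption for illustration: `Case`, `evaluate`, and `toy_sentiment` are hypothetical names, and `toy_sentiment` is a keyword stand-in for a real prompt + LLM call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Case:
    input_text: str
    expected: str  # what counts as success for this case


def evaluate(predict: Callable[[str], str], cases: list[Case]) -> dict:
    """Run `predict` over every case; report the pass rate and the failures."""
    failures = [
        c for c in cases
        if predict(c.input_text).strip().lower() != c.expected.lower()
    ]
    passed = len(cases) - len(failures)
    return {
        "total": len(cases),
        "passed": passed,
        "pass_rate": passed / len(cases) if cases else 0.0,
        "failures": failures,
    }


def toy_sentiment(text: str) -> str:
    # Hypothetical stand-in for a prompt + LLM call.
    return "positive" if "good" in text.lower() else "negative"


cases = [
    Case("This is good", "positive"),
    Case("This is bad", "negative"),
    Case("Not good at all", "negative"),  # the kind of case you only find by collecting more data
]
report = evaluate(toy_sentiment, cases)
print(report["pass_rate"], [c.input_text for c in report["failures"]])
```

The first two cases are the "examples you have in mind"; the third is what shows up once the set grows, and it is the one that tells you whether a prompt change actually helped.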
I'm not sure that is a metric you can rely on. LLMs are very sensitive to where items sit in the context, paying extra attention to the beginning and the end of a list.
See the discussion of the listwise approach in "Large Language Models are Effective Text Rankers with Pairwise Ranking Prompting", https://arxiv.org/abs/2306.17563
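One way to work around position sensitivity, loosely inspired by the pairwise idea in that paper, is to compare items two at a time and ask in both orders. The sketch below is an assumption-laden illustration, not the paper's implementation: `pairwise_rank` and `biased_prefer` are hypothetical names, the scoring rule (1 point for a consistent winner, 0.5 each when the two orders disagree) is my reading of the all-pairs variant, and `biased_prefer` is a toy comparator that simulates a model favouring whichever item it sees first.

```python
from itertools import combinations
from typing import Callable, Sequence


def pairwise_rank(items: Sequence[str], prefer: Callable[[str, str], str]) -> list[str]:
    """Rank unique items by querying every pair in both presentation orders."""
    scores = {item: 0.0 for item in items}
    for a, b in combinations(items, 2):
        first = prefer(a, b)   # a shown first
        second = prefer(b, a)  # b shown first
        if first == second:
            scores[first] += 1.0  # consistent winner
        else:
            scores[a] += 0.5      # orders disagree: split the point
            scores[b] += 0.5
    return sorted(items, key=lambda it: scores[it], reverse=True)


# Toy "model": picks the first item shown unless the second is clearly better.
relevance = {"doc_a": 4, "doc_b": 2, "doc_c": 0, "doc_d": 1}


def biased_prefer(first: str, second: str) -> str:
    return second if relevance[second] - relevance[first] >= 2 else first


ranking = pairwise_rank(["doc_c", "doc_d", "doc_a", "doc_b"], biased_prefer)
print(ranking)
```

Because every pair is asked in both orders, the first-position bias of the toy comparator cancels out and the result does not depend on the order of the input list. The price is a number of calls that grows quadratically with the list size.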
I wouldn't be surprised to see it help, along with the "you'll get $200 if you answer this right" trick and a bunch of others :) They're definitely worth trying.
Academic silo with little to no real transfer to business. ML eventually enabled building better continuous and discrete models for inference, control, and prediction.
I would have thought the other way around: a marketing buzzword for a non-problem that engineers have been solving since Maxwell's times with variations on the concept of hysteresis.
At least my recollections of fuzzy logic are from around the late eighties / early nineties and always involved the example of a thermostat that can only turn fully on or fully off. :)
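For reference, the plain hysteresis version of that on/off thermostat fits in a few lines. This is only a sketch of the classic approach (not fuzzy logic); the function name, the 20 degree setpoint, and the 1 degree band are made-up values for illustration.

```python
def thermostat_step(temp: float, heater_on: bool,
                    setpoint: float = 20.0, band: float = 1.0) -> bool:
    """Return the new heater state for an on/off thermostat with hysteresis."""
    if temp < setpoint - band:
        return True       # too cold: switch on
    if temp > setpoint + band:
        return False      # too warm: switch off
    return heater_on      # inside the dead band: keep the current state


readings = [18.0, 19.5, 20.5, 21.5, 20.5, 19.5, 18.5]
state = False
states = []
for temp in readings:
    state = thermostat_step(temp, state)
    states.append(state)
print(states)
```

The dead band is what keeps the heater from chattering on and off around the setpoint: the same reading of 20.5 leaves it on while warming up and off while cooling down.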
True, this is the first thing I'm reminded of when I hear this term. Then I wonder what fuzzy logic has to do with washing machines. Modern washing machines have every routine programmed and are deterministic.
Do they? Mine noticeably takes much longer to run on very dirty kitchen towels compared to clothes, even on the same setting and even though the clothing load is heavier. I had been assuming there's something to detect how soiled the water is.
Random fact of the day: (some?) dishwashers detect soiled water by using a resettable fuse on the water pump and counting how many times the fuse trips; dirty water makes the pump work harder and trip the fuse.
I don't think that would work for clothes washers though, because there isn't the same kind of water pumping for recirculation. Somewhat related, my washing machine (upright, high efficiency) measures the load by spinning it with a calibrated input impulse and measuring the rotation of the basket: less rotation means more clothes, which means more water. It's helpful to load the clothes around the edge to get the right amount of water. But that doesn't illuminate how cycle duration is determined.
I have sometimes found myself in that situation. I have to be careful not to overthink where to put my "brain room" for the day; otherwise I carry this overhead burden that rumbles all day long, questioning whether I should be putting that effort elsewhere.
Definitely, brains are fun. They can be your best ally and worst enemy.
I totally agree. I work in the private sector, coming from a research position too. I was also focused on the "interesting" side of the problem: the modeling, integrating domain knowledge into the analysis, drawing all sorts of plots... But there were other unavoidable and "uninteresting" needs for the research project, like building a data gathering system with its API and everything. This required my best software engineering abilities. Needless to say, my best wasn't precisely THE best, so as the project got bigger, the not-so-temporary fixes piled up, as did the poor design choices (if there were any design choices at all). This finally led to a complete restructure and an almost fresh start.
I feel some of it could have been avoided, so I learned the hard way that the whole modelling + software engineering process is a subtle craft. It is important to pay attention to the implications of your code and, especially, to how it's done, since it may come back to bite you eventually. This reconciled me with the more technical stuff (my tools), so that I could eventually do good work in a more satisfying way.
I believe the article is aimed at those who feel like the "nice guy" the author describes, which is my case, and expects you to act on how his words make you feel. I see myself very much reflected in the profile of seeking high affiliation, low power, and high emotional control; however, I am not sure to what extent it should focus only on the work environment.
Over time, I have found myself more willing to fight for my "slice" of power in the workplace, making my demands assertively and showing that I know my value without traces of regret or false humility. That is something I have had to work on a lot, and it makes me uncomfortable, but I understand that I "deserve" some power. If I am doing things right, I can claim my stake of influence and be heard. I think this aligns with your point on being responsible with yourself and not just being a bystander.
On the other hand, there are other spheres in which I do not expect such power beforehand, so I'm not usually ready to draw my weapons. This may be a social encounter with barely-known people, friends, or even family. There are power fights there too, but somehow they feel different. Maybe I'm not as convinced that I deserve such power, or maybe I'm not willing to make the effort anymore and I just want to fulfill my affiliation urge.
While you have a point, this process may make it harder to be a troll. Nowadays you can be a troll anywhere and anytime you want, especially online. I'm curious how this approach may benefit those who share their plain thoughts without the fear of being automatically targeted by trolls looking for triggered responses and tribal support.
I think it's a little more complicated than that. Trolls are often very fixated on their targets, but they also tend to be impulsive and to look for quick fixes of glory. This is why things like shadow banning, slow modes, and signup waiting periods work pretty well on them. Very few have the patience to wait for a fresh, limited sock account to become useful.
"I think it's a little more complicated than that." Agreed, but decades of observation tell me there is more than enough supply across the shitpost to troll spectrum to go around :)
The author's point of view seems quite condescending to me. It is as if being the janitor is considered lesser (and, inevitably, everybody sees the janitor that way), so the author is trying to flip that view around (because he can, he is "not less").
Needless to say, being polite is not in question here, and I am sure the author's intentions are not what I just described, but I can't help reading it that way.
I would agree that "please" and "thank you" can take you far; however, you shouldn't address them only to the janitor. They are for everybody.