weinzierl's comments

How much does this thing cost? What ballpark are we talking about?

I have a thing for monospaced fonts and sans-serif.

What they all have in common is slab-like serifs on the i and j. It is not an easy problem to make these two fit harmoniously in their allotted space, but does anyone know a monospaced font that solves this in a different way?


Pointfree does it right by giving i and l friendly curves.

Soldering by hand is a dying art...

I'm not sure. Back in the day we avoided SMD like the plague and it had a reputation of being unapproachable. THT parts were highly sought after, and I would even say that a good deal of AVR's success was because they offered THT versions of their microcontrollers long after most others had stopped. Some of us even engaged in the uphill battle of lead-free soldering, only to be disillusioned.

We thought hand soldering would die with THT, but it didn't.

I see a young generation that has mostly overcome these hurdles. With their young steady hands, sharp eyes, high-lead solder, small temperature-controlled irons and other modern equipment, they just get on with it. I envy them.


High lead solder? Are you sure about that?

For what it's worth, I took the time to learn lead-free soldering specifically so that I could teach my kids. I like to introduce them to safe hobbies (that's why they all went skydiving before their 10th birthday, too).


Leaded solder is safe to use if you wash your hands afterwards. Leaded solder - particularly 63/37 eutectic solder - is much more forgiving of poor technique. Lead-free isn't a massive inconvenience if you know what you're doing, but it can be absolutely infuriating for novices.

What about breathing the fumes? Also, doesn't (long-term) exposure to the fumes severely affect skin? I remember the folks soldering a lot of things for the Gemini and Apollo programs had very wrinkly, obviously damaged skin on their faces.

There is no lead in the solder fumes. If anything, lead-free solder is substantially worse, since the flux used for lead-free solder is a lot... harsher. Either way, this can be largely fixed by a small fume fan.

Thank you. In fact, I drive a small horizontal quadcopter propeller with a 3.7v LiPo while soldering to pull the fumes away.

I actually prefer lead-free solder because of its more aggressive flux and higher soldering temperature.

This allows burning away and tinning enameled wire with the soldering tip.

Though there are only two lead-free solders that qualify. I use the Felder Ultra-Clear EL Sn100Ni+. The other one is the Amasan BF32-3.


I suspect tip oxidation is an issue with lead-free? Or isn't it?

At what temperatures do you solder the lead-free alloys you mentioned?


Tip life is inevitably worse with lead-free solder. Using a brass wool cleaner rather than a wet sponge will help prolong tip life, as will the regular use of a suitable tinning/cleaning paste (e.g. Hakko FS-100 or JBC TT-A). Keep the tip wetted with solder as much as possible - a completely "clean" tip will oxidise much faster than one with a protective layer of solder.

The correct tip temperature for any hand soldering operation is the lowest temperature that will allow the joint to be completed in two to five seconds. In practice, that depends on a host of variables - the composition of the solder alloy, the properties and calibration of your iron, the thermal mass of the joint etc. A usual rule of thumb is the melting point of the solder plus 150°C, but your mileage will vary.
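As a rough illustration of that rule of thumb (the alloy names and liquidus temperatures below are approximate values I'm supplying, and the +150 °C margin is only a starting point, not a calibrated number):

```python
# Rough starting-point tip temperatures from the "melting point + 150 °C"
# rule of thumb. Liquidus temperatures are approximate; always tune down
# to the lowest temperature that still completes the joint in 2-5 seconds.
MELTING_POINTS_C = {
    "Sn63/Pb37 (eutectic)": 183,
    "Sn60/Pb40": 190,
    "SAC305 (lead-free)": 219,
    "Sn100Ni+ (lead-free)": 227,
}

def suggested_tip_temp_c(alloy: str, margin: int = 150) -> int:
    """Starting point only; real values depend on iron, tip and joint mass."""
    return MELTING_POINTS_C[alloy] + margin

for alloy, mp in MELTING_POINTS_C.items():
    print(f"{alloy}: melts ~{mp} °C, start around {suggested_tip_temp_c(alloy)} °C")
```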


Oxidation is a big problem with the harsh no-clean fluxes in many cheap lead-free solder wires. Switching to a name-brand solder with rosin flux (RMA or RA) should fix that (it did for me).

Metcal have a great big doc on tip care that covers this.


Maybe it’s just me but I find SMT easier and faster to prototype with than THT. Apply paste, place all components on the board with tweezers, reflow, done. With THT I have to bend / cut the legs of most components, and solder each point individually.

As for “high lead solder” - you won’t buy it in Europe. We had to learn using lead free for rework and you know what - it’s not much different, assuming you have high quality equipment.


> As for “high lead solder” - you won’t buy it in Europe. We had to learn using lead free for rework and you know what - it’s not much different, assuming you have high quality equipment.

That's bollocks. You can buy leaded solder in Europe just fine. You only need to worry about lead-free if you want to sell a commercial product.


I have a roll of 38% lead solder on my desk which I bought last year from Leroy Merlin in Portugal, so you can definitely buy it in Europe.

Any idea where I can buy some in Germany? Reichelt stopped selling it and I did not get a chance to stock up.

If you can do your joints on first try, lead free solder is exactly as "difficult" as leaded solder. I never notice the difference except when I have to remove it.

Yeah that's true if you have a reflow oven and avoid 0402 and below. Ovens aren't cheap or small though.

What’s strange is that the larger SMD packages are far simpler to solder than THT ones are. Give me an 0805 or SOP any day over a THT package.

I guess the real driver towards hobbyists using SMD is cheap and fast PCB design. You can’t really do stripboard or wire wrap with SMD packages.


>Some of us even engaged in the uphill battle of lead-free soldering only to be disillusioned.

I've never had any trouble using lead-free solder for through-hole or SMT (and I don't have any expensive or sophisticated equipment).


Isn't it the other way around?

SEO text carefully tuned to tf-idf metrics and keyword-stuffed to the empirically determined threshold Google just allows should have unnatural word frequencies.

LLM content should just enhance and cement the status quo word frequencies.

Outliers like the word "delve" could just be sentinels, carefully placed like trap streets on a map.
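A toy sketch of that kind of frequency check (the function name, the add-one smoothing, and the ratio threshold are all my own illustrative choices, not anyone's real detector):

```python
from collections import Counter

# Flag words whose relative frequency in a sample far exceeds a baseline
# corpus -- the way "delve" stands out in LLM output versus human text.
def frequency_outliers(sample: str, baseline: str, ratio: float = 5.0) -> dict:
    s, b = Counter(sample.lower().split()), Counter(baseline.lower().split())
    s_total, b_total = sum(s.values()), sum(b.values())
    outliers = {}
    for word, count in s.items():
        sample_freq = count / s_total
        base_freq = (b.get(word, 0) + 1) / (b_total + len(b))  # add-one smoothing
        if sample_freq / base_freq > ratio:
            outliers[word] = sample_freq / base_freq
    return outliers
```

With a toy baseline that never uses "delve", the word is flagged; ordinary shared vocabulary is not.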


But you can already see it with Delve. Mistral uses "delve" more than baseline, because it was trained on GPT.

So it's classic positive feedback. LLM uses delve more, delve appears in training data more, LLM uses delve more...

Who knows what other semantic quirks are being amplified like this. It could be something much more subtle, like cadence or sentence structure. I already notice that GPT has a "tone" and Claude has a "tone" and they're all sort of "GPT-like." I've read comments online that stop and make me question whether they're coming from a bot, just because their word choice and structure echoes GPT. It will sink into human writing too, since everyone is learning in high school and college that the way you write is by asking GPT for a first draft and then tweaking it (or not).

Unfortunately, I think human and machine generated text are entirely miscible. There is no "baseline" outside the machines, other than from pre-2022 text. Like pre-atomic steel.


is the use of miscible here a clue? Or just some workplace vocabulary you've adapted analogically?

Human me just thought it was a good word for this. It implies some irreversible process of mixing, I think that characterizes this process really well.

There were dozens of 20th Century ideological movements which developed their own forms of "Newspeak" in their own native languages. Largely, natural human dialog between native speakers and between those opposed to the prevailing regime recoils violently at stilted, official, or just "uncool" usages in daily vernacular. So I wouldn't be too surprised to see a sharp downtick in the popular use of any word that becomes subject to an LLM's positive-feedback loop.

Far from saying the pool of language is now polluted, I think we now have a great data set to begin to discern authentic from inauthentic human language. Although sure, people on the fringes could get caught in a false positive for being bots, like you or I.

The biggest LLM of them all is the daily driver of all new linguistic innovation: Human society, in all its daily interactions. The quintillions of daily phrases exchanged and forever mutating around the globe - each mutation of phrase interacting with its interlocutor, and each drawing from not the last 500,000 tokens but the entire multi-modal, if you will, experience of each human to date in their entire lives - vastly eclipses anything any hardware could ever emulate given the current energy constraints.

Software LLMs are just a state machine stuck in a moment in time. At best they will always lag, the way Stalinist language lagged years behind the patois of average Russians, who invented daily linguistic dodges to subvert and mock the regime.

The same process takes place anywhere there is a dominant official or uncool accent or phrasing. The ghetto invents new words, new rhythm, and then it becomes cool in the middle class. The authorities never catch up, precisely because the use of subversive language is humanity's immune system against authority.

If there is one distinctly human trait, it's sniffing out anyone who sounds suspiciously inauthentic. (Sadly, it's also the trait that leads to every kind of conspiracy theorizing imaginable; but this too probably confers in some cases an evolutionary advantage). Sniffing out the sound of a few LLMs is already happening, and will accelerate geometrically, much faster than new models can be trained.


humans also lag humans, the future may already be spoken, but the slang is not evenly memed out yet.

If you think that's niche wait til you hear about man-machine miscegenation

> LLM uses delve more, delve appears in training data more, LLM uses delve more...

Some day we may view this as the beginnings of machine culture.


Oh no, it's been here for quite a while. Our culture is already heavily glued to the machine. The way we express ourselves, the language we use, even our very self-conception originates increasingly in online spaces.

Have you ever seen someone use their smartphone? They're not "here," they are "there." Forming themselves in cyberspace -- or being formed, by the machine.


chat is this real?

1. People don't generally use the (big, whole-web-corpus-trained) general-purpose LLM base-models to generate bot slop for the web. Paying per API call to generate that kind of stuff would be far too expensive; it'd be like paying for eStamps to send spam email. Spambot developers use smaller open-source models, trained on much smaller corpuses, sized and quantized to generate text that's "just good enough" to pass muster. This creates a sampling bias in the word-associational "knowledge" the model is working from when generating.

2. Given how LLMs work, a prompt is a bias — they're one-and-the-same. You can't ask an LLM to write you a mystery novel without it somewhat adopting the writing quirks common to the particular mystery novels it has "read." Even the writing style you use in your prompt influences this bias. (It's common advice among "AI character" chatbot authors, to write the "character card" describing a character, in the style that you want the character speaking in, for exactly this reason.) Whatever prompt the developer uses, is going to bias the bot away from the statistical norm, toward the writing-style elements that exist within whatever hypersphere of association-space contains plausible completions of the prompt.

3. Bot authors do SEO too! They take the tf-idf metrics and keyword stuffing, and turn it into training data to fine-tune models, in effect creating "automated SEO experts" that write in the SEO-compatible style by default. (And in so doing, they introduce unintentional further bias, given that the SEO-optimized training dataset likely is not an otherwise-perfect representative sampling of writing style for the target language.)


On point 1, that’s surprising to me. A 2,000 word blog post would be 10 cents with GPT-4o. So you put out 1,000 of them, which is a lot, for $100.

There are two costs associated with using a hosted inference platform: the OpEx of API calls, and the CapEx of setting up an account in the first place. This second cost is usually trivial, as it just requires things any regular person already has: an SSO account, a phone number for KYC, etc.

But, insofar as your use-case is against the TOUs of the big proprietary inference platforms, this second cost quickly swamps the first cost. They keep banning you, and you keep having to buy new dark-web credentials to come back.

Given this, it’s a lot cheaper and more reliable — you might summarize these as “more predictable costs” — to design a system around a substrate whose “immune system” won’t constantly be trying to kill the system. Which means either your own hardware, or a “bring your own model” inference platform like RunPod/Vast/etc.

(Now consider that there are a bunch of fly-by-night BYO-model hosted inference platforms, that are charging unsustainable flat-rate subscription prices for use of their hardware. Why do these exist? Should be obvious now, given the facts already laid out: these are people doing TOU-violating things who decided to build their own cluster for doing them… and then realized that they had spare capacity on that cluster that they could sell.)


This makes sense. But now I’m wondering if people here are speaking from experience or reasoning their way into it. Like are there direct reports of which models people are using for blogspam, or is it just what seems rational?

But then you'll be competing for clicks with others who put out 1,000,000 posts at lower cost because they used a small, self-hosted model.

if you are a sales & marketing intern, have a potato laptop and $100 budget to spend on seo, you aren't going to be self hosting anything even if you know what that means.

This is about high-volume blog/news-spam created specifically to serve ads and affiliate links, not about occasional content marketing for legitimate companies.

  Too deep we delved, and awoke the ancient delves.

> LLM content should just enhance and cement the status quo word frequencies.

TFA mentions this hasn't been the case.


Would you mind dropping the link talking about this point? (context: I'm a total outsider and have no idea what TFA is.)

TFA means "the featured article", so in this case the "Why wordfreq will not be updated" link we're talking about.

To be pedantic, the F in TFA has the same meaning as the F in RTFM.

It’s the same origin. On Slashdot (the HN of the early 00’s) people would admonish others to RTFA. Then they started using it as a referent: TFA was the thing you were supposed to have read.


Oh that I'm aware of, but it's softened over time too haha

I miss the old Atomic MPC forums in the ~00s.


The Fucking Article, from RTFA - Read the Fucking Article - and RTFM - Read the Fucking Manual/Manpage

"I don't think anyone has reliable information about post-2021 language usage by humans."

We've been past the tipping point when it comes to text for some time, but for video I feel we are living through the watershed moment right now.

Smaller children especially don't have a good intuition about what is real and what is not. When I get asked if the person in a video is real, I still feel pretty confident answering, but I get less and less confident every day.

The technology is certainly there, but the majority of video content is still not affected by it. I expect this to change very soon.


There are a series of challenges like:

https://www.nytimes.com/interactive/2024/09/09/technology/ai...

https://www.nytimes.com/interactive/2024/01/19/technology/ar...

These are a little bit unfair, in that we're comparing handpicked examples, but I don't think many experts will pass a test like this. Technology only moves forward (and seemingly, at an accelerating pace).

What's a little shocking to me is the speed of progress. Humanity is almost 3 million years old. Homo sapiens are around 300,000 years old. Cities, agriculture, and civilization are around 10,000. Metal is around 4,000. The industrial revolution is 500. Democracy? 200. Computation? 50-100.

The revolutions shorten in time, seemingly exponentially.

Comparing the world of today to that of my childhood....

One revolution I'm still coming to grips with is automated manufacturing. Going on aliexpress, so much stuff is basically free. I bought a 5-port 120W (total) charger for less than 2 minutes of my time. It literally took less time to find it than to earn the money to buy it.

I'm not quite sure where this is all headed.


> so much stuff is basically free

It really isn't. Have a look at daily median income statistics for the rest of the planet:

https://ourworldindata.org/grapher/daily-median-income?tab=t...

  $2.48 Eastern and Southern Africa (PIP)
  $2.78 Sub-Saharan Africa (PIP)
  $3.22 Western and Central Africa (PIP)
  $3.72 India (rural)
  $4.22 South Asia (PIP)
  $4.60 India (urban)
  $5.40 Indonesia (rural)
  $6.54 Indonesia (urban)
  $7.50 Middle East and North Africa (PIP)
  $8.05 China (rural)
  $10.00 East Asia and Pacific (PIP)
  $11.60 Latin America and the Caribbean (PIP)
  $12.52 China (urban)
And more generally:

  $7.75 World
I looked around on Ali, and the cheapest charger that doesn't look too dangerous costs around five bucks. So it's roughly equal to one day's income of at least half the population of our planet.

100 W+ chargers are one of the products I prefer to spend a little more on, so I get something from a company that knows it can be sued if it makes a product that burns down your house or fries your phone.

Flashlights? Sure, bring on aliexpress. USB cables with pop-off magnetically attached heads, no problem. But power supplies? Welp, to each their own!


And then you plug your cheap pop-off USB cable into the expensive 100w charger?

Yeah, sure, what could possibly go wrong? :-P

But seriously, it's harder to accidentally make a USB cable that fries your equipment. The more common failure mode is that it fails to work, or wears out too fast. Chargers, on the other hand, handle a lot of voltage, generate a lot of heat, and output to sensitive equipment. More room to mess up, and more room for mistakes to cause damage.


Democracy (and republics) are thousands of years old. Computation is also quite old, though it only skyrocketed with electricity and semiconductors. This is not the first time the world created a potential for exponential growth (I'd consider the Pharaohs' and Roman empires to be examples).

There is the very real possibility that everything just stalls and plateaus where we are. You know, like our population growth: it should have gone exponential, but it did not. Actually, quite the reverse.


> One revolution I'm still coming to grips with is automated manufacturing. Going on aliexpress, so much stuff is basically free. I bought a 5-port 120W (total) charger for less than 2 minutes of my time. It literally took less time to find it than to earn the money to buy it.

Is there a big recent qualitative change here? Or is this a continuation of manufacturing trends (also shocking, not trying to minimize it all, just curious if there’s some new manufacturing tech I wasn’t aware of).

For some reason, your comment got me thinking of a fully automated system, like: you go to a website, pick and choose charger capabilities (ports, does it have a battery, that sort of stuff). Then an automated factory makes you a bespoke device (software picks an appropriate shell, regulators, etc). I bet we’ll see it in our lifetimes at least.


> "The revolutions shorten in time, seemingly exponentially."

The Technological Singularity - https://en.wikipedia.org/wiki/Technological_singularity


Democracy is 200? You're off by a full order of magnitude.

Progress isn't inevitable. It's possible for knowledge to be lost and for civilization to regress.


> When I get asked if the person in a video is real, I still feel pretty confident to answer

I don't. I mean, I can identify the bad ones, sure, but how do I know I'm not getting fooled by the good ones?


That is very true, but for now we have a baseline of videos that we either remember or that we remember key details of, like the persons in the video. I'm pretty sure if I watch The Primeagen or Tom Scott today, that they are real. Ask me in year, I might not be so sure anymore.

I never thought about that. Humans losing their ability to tell AI content from reality? It's frightening.

It's worse because many humans don't know they are.

I see a lot of outrage around fake posts already. People want to believe bad things from the other tribes.

And we are going to feed them with it, endlessly.


Did you think the same thing when photoshop came out?

It's relatively trivial to photoshop misinformation in a really powerful and undetectable way- but I don't see (legitimate) instances of groundbreaking news over a fake photo of the president or a CEO etc doing something nefarious. Why is AI different just because it's audio/video?


"AI" is different because it's low-effort and easily automated, making it easy to absolutely flood public spaces. Quantity has a quality all its own.

I did.

And it's not the groundbreaking fakes that are the problem, it's the little constant lies.

Last week a photoshopped Musk tweet was going around, people getting all up in arms against it despite the fact it was very easy to spot as a fabricated one.

People didn't care, they hate the guy, they just wanted to fuel their hate more.

The whole planet runs on fake content: magazine covers, food packaging, Instagram pics of places that never look that way...

And now, with AI, you can automate it and scale it up.

People are not ready. And in fact, they don't want to be.


It's even worse than that. Most people have no idea how far CGI has come, and how easily it is wielded even by a couple of dedicated teens on their home computer, let alone people with a vested interest in faking something for some financial reason. People think they know what a "special effect" looks like, and for the most part, people are wrong. They know what CGI being used to create something obviously impossible, like a dinosaur stomping through a city, looks like. They have no idea how easy a lot of stuff is to fake already. AI just adds to what is already there. Heck, to some extent it has caused scammers to overreach, with things like obviously fake Elon Musk videos on YouTube generated from (pure) AI and text-to-speech... when with just a little bit more learning, practice, and amounts of equipment completely reasonable for one person to obtain, they could have done a much better fake of Elon Musk using special effects techniques rather than shoveling text into an AI. The fact that "shoveling text into an AI" may in another few years itself generate immaculate videos is more a bonus than a fundamental change of capability.

Even what's free & open source in the special effects community is astonishing lately.


And you see things like the The Lion King remake or its upcoming prequel being called "live action" because it doesn't look like a cartoon like the original. But they didn't film actual lions running around -- it's all CGI.

Plus, movies continue (for some reason) to be made with very bad and obvious CGI, leading people to believe all CGI is easy to spot.

This is a common survivorship bias fallacy since you only notice the bad CGI.

I'm certain you'd be shocked to see the amount of CG that's in some of your favorite movies made in the last ~10-20 years that you didn't notice because it's undetectable


This is an amazing demo reel of effects shots used in "mundane" TV shows - comedies and police procedurals - for faking locations.

https://www.youtube.com/watch?v=clnozSXyF4k


Luckily, for those of us who prefer when film photography meant at least mostly actually filming things, there’s plenty of very good film and TV (and even more of lesser quality) to keep a person occupied for a couple lifetimes.

That is really something even as somebody who expects lots of CGI touch-up in sets.

I hate this. I did not notice the vast majority of them. So many backgrounds/sets are just green screens :(

And keep in mind - that video is 14 years old!

I won’t be, I’m aware that lots of movies are mostly CGI.

But, yeah, I do think it is some kind of bias. Maybe not survivorship, though… maybe it is a generalized sort of Malmquist bias? Like the measurement is not skewed by the tendency of movies with good CGI to go away. It is skewed by the fact that bad CGI sticks out.


Actually wait I take it back, I mean, I was aware that lots of Digital Touch-up happens in movie sets, more than lots of people might expect, and more often that one might expect even in mundane movies, but even still, this comment’s video was pretty shocking anyway.

https://news.ycombinator.com/item?id=41584276


I mean, it's already apparent to me that a lot of people don't have a basic process in place to detect fact from fiction. And it's definitely not always easy, but when I hear some of the dumbest conspiracy theories known to man actually get traction in our media, political figures, and society at large, I just have to shake my head and laugh to keep from crying. I'm constantly reminded of my favorite saying, "people who believe in conspiracy theories have never been a project manager."

>Humans losing their ability to detect AI content from reality ? It's frightening.

And it already happened, and no one pushed back while it was happening.


It's worse: they don't even care.

This video's worth a watch if you want to get a sense of the current state of things. Despite the (deliberately) clickbait title, the video itself is pretty even-handed.

It's by Language Jones, a YouTube linguist. Title: "The AI Apocalypse is Here"

https://youtu.be/XeQ-y5QFdB4


I find issue with this statement as content was never a clean representation of human actions or even thought. It was always driven by editorials, SEO, bot remixing and whatnot that heavily influences how we produce content. One might even argue that heightened content distrust is _good_ for our society.

It's a defense lawyer's dream.

Oh they definitely are. A lot of people are now calling out real photos as fake. I frequently get into stupid Instagram political arguments and a lot of times they come back with "yeah nice profile with all your AI art haha". It's all real high quality photography. Honestly, I don't think the avg person can tell anymore.

I've reached a point where even if my first reaction to a photo is to be impressed, I then quickly think "oh, but what if this is AI?" and then immediately my excitement for the photo is ruined, because it may not actually be a photo at all.

I don't get that perspective at all. Who cares what made it.

You don't find a difference between things that exist and things that don't?

> When I get asked if the person in a video is real, I still feel pretty confident to answer

I don't share your confidence in identifying real people anymore.

I often flag as "false-ish" a lot of things from genuinely real people, but who have adopted the behaviors of the TikTok/Insta/YouTube creator. Hell, my beard is grey and even I poked fun at "YouTube Thumbnail Face" back in 2020 in a video talk I gave. AI twigs into these "semi-human" behavioral patterns super fast and super hard.

There is a video floating around with pairs of young ladies with "This is real"/"This is not real" on signs. They could be completely lying about both, and I really can't tell the difference. All of them have behavioral patterns that seems a little "off" but are consistent with the small number of "influencer" videos I have exposure to.


I will certainly give it a try, because I am not 100% happy with SwiftScan anymore.

The biggest threat to these kinds of apps is getting bought by one of the big app boutique shops and being neglected.

SwiftScan started as an ambitious project by a small shop of dedicated people, but since it has been sold it started to show some cracks.


It's great. I had a Grammarly subscription for a couple of years and used both tools in parallel, but found myself using LanguageTool more and more. It is strictly better, I'd say even for English, but certainly if you need other languages or deal with multilingual documents. So I canceled Grammarly and haven't missed it since.

You also can self-host and we do that at my workplace, because we deal with sensitive documents.


Brilliant write up!

I do not really understand the "Sorting unknown uint-struct blobs" point.

Could you give an example or explain in more detail what an "unknown uint-struct blob" is?

The odd/even advantage could be put even more strongly, because every additional bit you know from the little end gives additional information about the number's divisibility. For example, one bit tells divisibility by two (aka odd/even), two bits tell divisibility by four, and so on.
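A minimal illustration of that point (the helper name is mine; it assumes you only have the k lowest bits of the number):

```python
# Knowing the k least-significant bits of an integer gives you its value
# mod 2**k, hence divisibility by 2, 4, 8, ... without seeing the rest
# of the number or knowing its total length.
def divisible_by_power_of_two(low_bits: int, k: int) -> bool:
    """low_bits: the k least-significant bits; True if divisible by 2**k."""
    return (low_bits & ((1 << k) - 1)) == 0

print(divisible_by_power_of_two(0b100, 1))  # True: lowest bit 0, so even
print(divisible_by_power_of_two(0b100, 2))  # True: lowest two bits 00, divisible by 4
print(divisible_by_power_of_two(0b110, 2))  # False: lowest two bits 10, not divisible by 4
```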


For example, if you had a file that comprised the following struct:

    struct someblob {
        uint64_t timestamp;
        uint64_t checksum;
        uint32_t item_count;
        struct something items[0];
    };
Even if you didn't know that a collection of files were structured this way, you could still read, say, the first 128 bits as an unsigned integer and compare them, and they'd just happen to be naturally ordered because the timestamp field grows from right to left, and would have precedence over the "lower 64 bits" of the checksum field.

It's a very minor benefit (of dubious real-world utility), but I wanted to be comprehensive :P


Thanks! That makes sense.

Mentally, I would put this in the "conventional" advantage category, because it relies on comparing fixed length chunks of memory and computationally it should not make a difference if `timestamp` is stored LE or BE for sorting.


A simpler case is reading only a fraction of a field. For example, suppose that you have a 8 byte key and you read the first four bytes of it. On a big-endian architecture, those are the high bytes and you can sort with them just fine (up to some level of detail). On a little-endian architecture, you'll be sorting by the lower bytes and the results will be meaningless. So the big-endian architecture allows you to sort by the first n bytes of a struct without caring what fields it contains. While there is obviously no guarantee that the results of this will be meaningful in the general case, it is far more likely than for a little-endian architecture.
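A small sketch of that effect, using 32-bit keys and comparing only the first two bytes of each serialization (the key values are arbitrary examples of mine):

```python
import struct

# Sort 32-bit keys by only the first 2 bytes of their serialization.
# Big-endian prefixes are the high bytes, so byte-wise order matches
# numeric order (up to ties within equal prefixes); little-endian
# prefixes are the low bytes, so the result is meaningless.
keys = [1, 258, 65536, 3_000_000_000]

be = sorted(keys, key=lambda k: struct.pack(">I", k)[:2])
le = sorted(keys, key=lambda k: struct.pack("<I", k)[:2])

print(be)  # numeric order
print(le)  # scrambled
```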

My counter argument to this would be that it is as expensive to compare LE k[4]s with each other as it is BE k[0]s.

As long as you deal with fixed length chunks of data accessing it from either end should be equal effort (in first approximation[1]).

This is qualitatively different from the odd/even case, because for a number of unknown length you can tell odd/even in O(1) for LE but need O(n) for BE (you have to find the LSB in n steps).

Mathematically, there is more information in having just the LSBs than in having just the MSBs without knowing the whole number and its length. I think this is the only reason why LE is marginally better; everything else boils down to convention.

[1] I know that on modern architectures it can be faster to read memory upwards than downwards because of the pre-fetcher, but this is what I meant when I said the advantage comes down to convention. If we had a symmetric pre-fetcher the point would be moot.


True. There is a significant asymmetry, though, in that you are more likely to be in a situation where you know the starting address of an object and a minimum size than you are to be in a situation where you know the end address of an object and a minimum size. Strictly speaking that's also an arbitrary convention (as I guess the address of a struct could be defined as the address of its last byte), but it's a near-universal one.

Actually, in this case it would. Consider the layout (byte-by-byte):

    BE: t8 t7 t6 t5 t4 t3 t2 t1 c8 c7 c6 c5 c4 c3 c2 c1
In the big-endian case, the byte-by-byte layout of the struct naturally places the timestamp at the high end of the 128-bit value you blindly read.

    LE: t1 t2 t3 t4 t5 t6 t7 t8 c1 c2 c3 c4 c5 c6 c7 c8
In the little-endian case, it's the CHECKSUM at the high end of the 128-bit value.

I think we agree, but it nags me that I still can't follow your line of thought.

Do you want to:

- Compare just the timestamp, so

    1970-01-01 00:00 0x01
    1970-01-01 00:00 0x00
    1970-01-01 00:00 0x01
    1970-01-01 00:01 0x01
    1970-01-01 00:01 0x00
    1970-01-01 00:01 0x01
could be a valid ordering, with the first three and last three in arbitrary ordering, because the checksum doesn't play a role.

- Compare timestamp and checksum, in the sense of ordering all files with the same checksum by timestamp, like this

    1970-01-01 00:00 0x00
    1970-01-01 00:01 0x00
    1970-01-01 00:02 0x00
    1970-01-01 00:00 0x01
    1970-01-01 00:01 0x01
    1970-01-01 00:02 0x01
- Compare timestamp and checksum, in the sense that files with the same timestamp are ordered by checksum, in effect grouping equal checksum files together under their respective date.

    1970-01-01 00:00 0x00
    1970-01-01 00:00 0x01
    1970-01-01 00:00 0x02
    1970-01-01 00:01 0x01
    1970-01-01 00:01 0x02
    1970-01-01 00:01 0x02
    1970-01-01 00:01 0x03
In the first case you could just compare the first 64 bits, so I don't think that's it. The second case would be an advantage for little-endian, so it doesn't support your argument. The third case supports the argument for BE, but is an unusual thing to want.

In other words: is the checksum crucial for your line of argumentation, or could you make your point with just a timestamp? If not, why not compare just the 64 bits? If yes, I don't follow why BE is better in this case.


Basically, (and this is getting really esoteric at this point), if you use big endian byte ordering in your data structures when saving to disk, then you can place items in order of descending "sorting order" importance at the beginning of your file. Anyone wishing to sort such files wouldn't need to know anything about the actual structure of the file, or what is stored where. They could simply choose an arbitrary number of bits to read (say, 512 bits), do a big endian sort based on that, and it will always come out right (even though they're technically reading more than they have to).

    struct myfile {
        uint32_t year;
        uint8_t month; // Assuming packed structs here
        uint8_t day;
        uint32_t seconds;
        uint16_t my_custom_ordering;
        uint8_t some_flags;
        uint64_t a_checksum_or_something;
        char name[100];
        ...
    }
Reading the first 64 bytes from this file will give year, then month, then day, then seconds, then my_custom_ordering, then some_flags, then a_checksum_or_something, then the first few bytes of name (assuming we used big endian byte ordering). The extra bytes won't hurt anything because they're lower order when we compare.

To do this with little endian ordered data, you would have to:

1) Reverse the ordering of the "sortable" fields to: my_custom_ordering, seconds, day, month, year

2) Know in advance that you have to read exactly 12 bytes (no more, no less) from any file using this structure. If you read any more, you'll get random ordering based on the reverse of what's in the "name", "a_checksum_or_something", and "some_flags" fields (because they comprise the "higher order" bytes when reading little endian).

3) If you were to add another field "my_extra_custom_ordering", you'd have to adjust the number of bytes you read. With big endian ordering, you can still read 64 bytes and not care. You'd only care once your "sortable fields" exceeds 64 bytes - at which point you'd read, say, 100 bytes to be completely arbitrary... It doesn't matter because with BE everything just sorts itself out.

The comparator function is also much simpler with BE: Just do a byte-by-byte compare until you find a difference. With LE, you have to start at a specific offset (in the above case, 11) and decrement towards 0.
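Here is a quick Python sketch of the same idea, packing only the 12-byte sortable prefix of the struct above (field names taken from the example; `struct.pack` with the `>` prefix produces packed big-endian fields with no padding):

```python
import struct

# Pack the sortable prefix: year, month, day, seconds,
# my_custom_ordering -- 4+1+1+4+2 = 12 bytes, big-endian.
def pack_be(year, month, day, seconds, ordering):
    return struct.pack(">IBBIH", year, month, day, seconds, ordering)

records = [
    (1999, 12, 31, 86399, 0),
    (2000, 1, 1, 0, 5),
    (1999, 12, 31, 0, 9),
]

# A plain byte-wise sort of the big-endian blobs matches
# sorting the records field by field.
by_bytes = sorted(pack_be(*r) for r in records)
by_fields = [pack_be(*r) for r in sorted(records)]
assert by_bytes == by_fields
```

The byte-wise comparator never needs to know where one field ends and the next begins, which is the whole point.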


That made it click. Thanks a lot for your patience and the detailed explanation.

This comes in really really handy in lexicographical ordering.

For example, if storing in the keys of a KV store a pattern of:

[u32, String, u32, String, …]

If you want those arrays to be sorted lexicographically, you’ll want to store those u32 instances in big endian, so that both those and the strings sort from left-to-right.
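A minimal Python sketch of such a key encoding (the `be_key`/`le_key` helpers are made up for illustration):

```python
import struct

def be_key(n: int, s: str) -> bytes:
    # Big-endian u32: byte-wise order equals numeric order.
    return struct.pack(">I", n) + s.encode("utf-8")

def le_key(n: int, s: str) -> bytes:
    return struct.pack("<I", n) + s.encode("utf-8")

# Big-endian: 1 (00 00 00 01) sorts before 256 (00 00 01 00).
assert sorted([be_key(256, "a"), be_key(1, "a")])[0] == be_key(1, "a")
# Little-endian: 256 (00 01 00 00) wrongly sorts before 1 (01 00 00 00).
assert sorted([le_key(256, "a"), le_key(1, "a")])[0] == le_key(256, "a")
```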


Is there a tool that can remove unused CSS while considering static HTML?

What I have in mind is a tool that I throw a CSS file and a bunch of static HTML files at, it will try to apply each CSS rule to the HTML like a browser would and remove all the rules that don't apply.

I don't expect it to ascertain whether a rule had visible effects. I also don't expect it to consider JavaScript. Just plain CSS and static HTML. It doesn't look to me like cssnano or LightningCSS can do that.


https://purgecss.com/ does this, kind of -- it used to be recommended by Tailwind back when Tailwind shipped a giant zip-bomb stylesheet of every possible property/value combination by default. I don't think it does the more complicated browser-like analysis you mention, though; it might just check whether class names appear in your HTML using a regex search.

The AMP WordPress plugin also does something like this IIRC (to try and fit stylesheets into AMP's size limit) but the tooling for it might be written in PHP.
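That naive class-name check can be sketched in a few lines of stdlib Python (a toy, not a real CSS parser; at-rules, nested braces, and attribute selectors all need more care):

```python
import re
from html.parser import HTMLParser

class ClassCollector(HTMLParser):
    """Collect every token that appears in a class="..." attribute."""
    def __init__(self):
        super().__init__()
        self.classes = set()
    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())

def purge(css: str, html: str) -> str:
    collector = ClassCollector()
    collector.feed(html)
    kept = []
    # Crude rule split on '}'; real CSS needs a proper parser.
    for rule in css.split("}"):
        if "{" not in rule:
            continue
        selector = rule.split("{")[0]
        names = re.findall(r"\.([\w-]+)", selector)
        # Keep rules with no class selector, or with any used class.
        if not names or any(n in collector.classes for n in names):
            kept.append(rule.strip() + " }")
    return "\n".join(kept)

html = '<div class="used"><p>hi</p></div>'
css = ".used { color: red } .unused { color: blue } p { margin: 0 }"
print(purge(css, html))
```

This is roughly the level of analysis PurgeCSS does by default; it cannot know whether a matched rule has any visible effect.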


How do you remove the unused CSS?

https://purifycss.online/

Above is a nice online version of PurifyCSS. But it just seems to minify the CSS and doesn't delete the unused CSS.


I wrote that for my company ~3 jobs ago, except instead of working only on static HTML, it would, for a small percentage of our traffic, loop in the background processing a couple of CSS selectors at a time, adding the selectors that matched to a bloom filter that was occasionally POSTed back to the server. Then I'd parse through our logs, pulling out the bloom filters and comparing them to our CSS rules to find unused ones. It wasn't perfect, so it required manual checking before deleting CSS rules, but it went a long way towards flagging potentially unused ones.
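The bloom-filter part of such a scheme is tiny. A minimal Python sketch (hash choice and sizes are arbitrary here, not what the above system used):

```python
import hashlib

class BloomFilter:
    """Minimal bloom filter: m bits, k positions derived from sha256."""
    def __init__(self, m=1024, k=4):
        self.m, self.k, self.bits = m, k, 0
    def _positions(self, item: str):
        digest = hashlib.sha256(item.encode()).digest()
        for i in range(self.k):
            # Two digest bytes per position -- fine for small m.
            yield int.from_bytes(digest[2 * i:2 * i + 2], "big") % self.m
    def add(self, item: str):
        for pos in self._positions(item):
            self.bits |= 1 << pos
    def __contains__(self, item: str) -> bool:
        return all(self.bits >> pos & 1 for pos in self._positions(item))

bf = BloomFilter()
bf.add(".header")      # a selector seen matching on the page
assert ".header" in bf
# Bloom filters can report false positives but never false negatives,
# which is why unmatched rules still needed manual review above.
```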

This would work nicely with static HTML, indeed. But once you have some JavaScript, i.e. dynamic HTML, it won't work reliably anymore. Worse, it might leave you maintaining a manually curated allow list of CSS selectors.

Waaay back there used to be a Firebug plugin that would monitor which CSS was ever used as you interacted with a page. Worked great for exactly these dynamic pages - I was using it with React for a few years before Quantum killed the XUL addons.

I'd forgotten about it and am now wondering if anyone made a replacement...


Chrome Dev tools has this feature. It's not as useful because when you navigate to another page, it resets the used CSS rules.

this is incorrect.

the only thing that does not work is generating the CSS class dynamically, like:

    var x = 'hase-';
    var t = x + 'danger';

which is an antipattern. (It can even happen in Java, C#, or whatever language you use, and it is still an antipattern.)


Can you please elaborate why it is an antipattern? I'm not sure I understand.

At least regarding tailwindcss: it scans your code to filter out unused CSS classes, so it is recommended to have full class names in the code instead of constructing them by concatenating strings. So if you have a variable `buttonColor` that can be `red` or `blue`, it is better to do something like

    switch (buttonColor) {
        case 'red': return 'button-red';
        case 'blue': return 'button-blue';
    }
over

    return 'button-' + buttonColor;

if you are in a big project and want to refactor classes, the string-concatenation pattern gets extremely painful; it's even worse when it's done both on the server and on the frontend. At some point you will have trouble removing your classes. My company did this a lot and now we struggle to clean it up, since basically all classes appear to be used somewhere. What we do is basically run purgecss, then diff the output and look for all occurrences that got removed, which will take us a few months.

Oh ok it was about concatenation, not just this example case. Got it, thank you!

We had similar requirements and used an older iPad, a model without SIM slot. You can lock down the WiFi to make it completely offline.

Only cellular iPads have GPS IIRC.

Now that you say it, I think our iPad doesn't have GPS indeed. The only app that required it and that we were using was the star map. Since we used the iPad at home it worked for us to just enter our position once manually.

No, all of them do
