> But who cares? Why not just let the command figure out the right buffer size automatically?
Well, because it doesn't. At least on the Linux version I used in the past, it defaulted to 512-byte blocks or something similarly small and committed every block individually, leading to really slow performance with USB sticks whose controllers aren't smart enough to bundle writes together. I wouldn't be surprised if that incurs some heavy write amplification at the flash level too. Perhaps it's smart enough to figure out a better block size now, but this is where that habit comes from.
Another thing: creating sparse files with the seek option (simply put, files full of zeros that are never actually written to disk and take up no space, but still read back as all those zeroes). Also something you can't duplicate with cat or head.
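For example (a sketch; sparse.img is a made-up name), this creates a 10 GiB file that occupies essentially no disk space:
dd if=/dev/zero of=sparse.img bs=1 count=0 seek=10G
du -h --apparent-size sparse.img   # shows 10G
du -h sparse.img                   # shows almost nothing actually allocated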
What I like about dd is that it can do pretty much all disk operations in one simple tool. Definitely worth learning all of it IMO.
It uses one write() syscall per block (which slows it down if you use 512-byte blocks instead of, say, 8M), but the OS does the buffering and the final writing to the device, unless you pass the fsync or fdatasync conversion options to dd.
Edit:
here's a write to an old and slow stick; you can see that dd finishes fast and then the OS has to transfer the buffered data onto the stick:
dd if=some-2GB-data status=progress of=/dev/sdX bs=8M
1801256960 bytes (1,8 GB, 1,7 GiB) copied, 7 s, 257 MB/s
0+61631 records in
0+61631 records out
2019556352 bytes (2,0 GB, 1,9 GiB) copied, 378,59 s, 5,3 MB/s
And the stick is placed in a USB 3.2 port on a fast machine ;-0
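Passing conv=fsync (or conv=fdatasync), the options mentioned above, makes dd flush to the stick before printing the final numbers, so the reported rate reflects the device rather than the page cache; a variant of the same command:
dd if=some-2GB-data status=progress of=/dev/sdX bs=8M conv=fsync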
> Can also do `oflag=direct` and it'll just skip as much of the caching as it can.
Correct, and that's one more point for dd compared to head/tail (which are fine commands by themselves).
But it wouldn't help much in my example, where I used a (very) old "high speed USB 2.0 stick" with 4 MByte/s write speed to demonstrate the difference between buffering and actual writing.
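For completeness, the direct-I/O variant of the command above looks like this; it bypasses the page cache, so the progress output follows the stick's actual pace even if the total time stays about the same:
dd if=some-2GB-data status=progress of=/dev/sdX bs=8M oflag=direct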
That's when it matters: with cat it would cache the entire write. If you don't mind it caching everything, having no idea how much time is left or how fast it is writing, and then waiting an hour for it to unmount, then cat is fine.
Right, and the parent comment is saying that `cat` doesn't figure it out very well.
I recall that GNU cat used a constant value that was tuned for read performance on common hard drives, giving no consideration to write performance. Looking at the GNU cat sources today, it doesn't look at all how I remember; I'd have to study it a bit to tell you what it does.
Edit: Hrmm, it seems I was mis-remembering. Perhaps I was thinking of the minimum values (32KiB, 64KiB, 128KiB below)? Or perhaps I was thinking of a BSD cat? Anyway:
How GNU cat sizes its buffers, by version:
- >=9.1 (2022-04-15) : Use the `copy_file_range()` syscall and let the kernel figure it out
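If you want to see what the cat on a given system actually does, a quick check (assuming strace is installed; in.img and out.img are scratch files) is to trace the relevant syscalls:
strace -e trace=copy_file_range,read,write cat in.img > out.img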
No, I was talking about dd. As far as I know it does not have any smart tuning of block sizes. It definitely didn't in the past and I doubt it was added.
Not sure what cat does, but like I said in my other comment, it's not really cat itself that does the disk writing in that command but rather the shell redirect. Edit: Nope, I'm wrong there!!
Ah, I misunderstood you. You're correct, `dd`'s default block size was and is 512, which is specified by POSIX.
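You can see the 512-byte default in the record counts when bs is left out, e.g.:
dd if=/dev/zero of=/dev/null count=4
# 4+0 records in, 4+0 records out, 2048 bytes copied (4 x 512)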
I interpreted "the command" in the original article as "the command you end up running, whether it be `dd` or `cat` or something else."
But you're mistaken about the shell redirect. In either scenario it's the command (`dd` or `cat`) making the write() syscall to the device. The shell passes the file descriptor to the command, then gets out of the way.
Indeed I was mistaken about the shell, sorry about that. I forgot how the redirect works a really long time ago and my memory made it into something it was not.
In that case I guess dd does call a sync() on every output block? Because it's definitely slower and the LED pattern on a USB stick is also much more 'flashy' when using 512 bytes.
For context, choosing the block size has implications on the speed at which data gets written to a USB stick, so not being able to tune that on cat can be a problem.
I guess that made sense when the standard HDD block size was 512 bytes. Then it went to 512/4 KB (512-byte logical on 4 KB physical sectors), then pure 4 KB; not sure what SSDs do, maybe also 4 KB? Experimenting with speeds and block sizes in the past has shown very quickly diminishing returns from increasing the block size. As long as the CPU can keep the queue depth over 1, the device should be flat out.
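If you're curious what a particular device reports (whatever it does internally with flash pages), lsblk can show the logical and physical sector sizes:
lsblk -o NAME,LOG-SEC,PHY-SEC /dev/sdX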
Generally they do a logical 4 KB, but physically it's usually larger pages of more than 1M per write. A good SSD will helpfully cache and batch up small writes, but if it gets it wrong then it'll amplify the wear and kill a drive quicker than needed. That's another reason to do dd with a larger block size, since it makes it a lot less likely that you write multiple blocks for a single update.
Good point. I guess there's so much going on with various types of caching and wear levelling that 'let the device figure it out' is best. And the queue can be on the device now with NVMe, not on the host, so it's not a dumb queue any more.
I don't read it that way. He calls out that 'weird' command specifically. But indeed he doesn't specify.
I wonder what cat does in terms of buffers; I kinda doubt it has any special optimisations, though I would guess the shell redirect might have. As that's really the thing doing the work there, not cat. Edit: Nope, I'm wrong there!!
Also, that command does more than just specify the memory buffer like he says. That's my point: it's useful for tuning, which can be super helpful with huge images.
It can also lead to some dangerous gotchas when working with files, but with full disk images these generally don't apply.
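One classic example of such a gotcha (made-up file names): unless you pass conv=notrunc, GNU dd truncates the output file right after the data it writes, so patching a region inside an existing file loses everything that followed:
dd if=patch.bin of=disk.img bs=512 seek=2048 conv=notrunc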
> though I would guess the shell redirect might have. As that's really the thing doing the work there, not cat.
No? The shell redirection is just
int tmp_fd = open("/dev/sdb", O_WRONLY|O_CREAT|O_TRUNC, 0666);  /* open the redirect target */
dup2(tmp_fd, STDOUT_FILENO);                                    /* make it the command's stdout */
close(tmp_fd);                                                  /* fd 1 keeps the file open */
(plus error handling and whatnot)
`cat` ends up with a file descriptor directly to the block device, same as `dd` does; the only difference is whether the `open()` call comes before or after the `execve()` call.
I don't have data at hand, but if you choose the right value it's meaningfully better, and it can also lead to more efficient patterns in bash scripts that are more complicated than `cat in > out`.
Unfortunately dd has not just footguns but foot cannons that are amplified by the mistakes people often make with string escaping, odd file names, loops, null checking, and conditionals in bash.