dd built-in progress introduced in coreutils 8.24 (savannah.gnu.org)
101 points by vdfs on Jan 10, 2016 | 48 comments



On BSD, unlike Linux, many utilities including dd(1) will dump a progress report when receiving SIGINFO. This was traditionally bound to ^T in shells, either by default or using stty(1).


It's SIGUSR1 on Linux; however, no signal is bound to ^T in Linux, nor is it apparently possible to bind new ones.

`killall -USR1 dd` works like a charm, or for the impatient: `watch killall -USR1 dd`
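
For anyone who would rather stay in one terminal, here is a minimal sketch of the same idea, assuming GNU dd on Linux (on BSD, SIGUSR1 would terminate dd instead). The copy from /dev/zero to /dev/null is only for illustration, and the variable names are arbitrary.

```shell
# Run dd in the background and collect whatever it prints on stderr.
log=$(mktemp)
LC_ALL=C dd if=/dev/zero of=/dev/null bs=1M 2>"$log" &
pid=$!

sleep 1               # give dd time to install its SIGUSR1 handler
kill -USR1 "$pid"     # ask for a progress report
sleep 1               # give dd time to write the report
kill -TERM "$pid"     # stop the otherwise endless copy
wait "$pid" || true   # reap dd; a non-zero status is expected after SIGTERM

report=$(cat "$log")
rm -f "$log"
echo "$report"
```

The first `sleep` matters: a SIGUSR1 that arrives before dd has installed its handler would kill it.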


This example highlights a very useful alternative function of `watch`: to simply run a command at regular intervals without caring about the output.


Yeah, but it's a pain to open another tab, run another command, and switch back to this tab to see the progress.

Why they didn't give a progress bar in the first place beats me.


progress(1) is a separate utility on BSD; one of my favorites. (Although it is also built in to BSD ftp.)

I regularly use progress with dd, tar and other utilities.

At least one BSD dd now has a "msgfmt" option that outputs parseable, printf-style format strings.

Truthfully, I prefer it when these basic utilities are static except for fixing bugs. Using the new features in scripts means the scripts might lose some portability.


There's also the pv(1) utility on Linux and OS X Homebrew:

  DESCRIPTION
  
       pv  shows the progress of data through a pipeline by giving infor-
       mation such as time elapsed, percentage completed  (with  progress
       bar), current throughput rate, total data transferred, and ETA.

       To use it, insert it in a pipeline between two processes, with the
       appropriate options.  Its standard input will be passed through to
       its  standard output and progress will be shown on standard error.
(edit: formatting)


I use this often on OS X :)


This is how it looks...

    [sc-lx-phys-05 /home/cvogel]
    $ dd if=/dev/zero of=/dev/null status=progress
    11086075904 bytes (11 GB) copied, 7.000001 s, 1.6 GB/s


I find this style of formatting specifiers an eyesore:

    fprintf (stderr,
            _("%"PRIuMAX"+%"PRIuMAX" records in\n"
              "%"PRIuMAX"+%"PRIuMAX" records out\n"),
            r_full, r_partial, w_full, w_partial);


I assume you mean the `%"PRIuMAX"` bits. It is an eyesore, but it is how to portably write the specifiers for types whose size may vary across architectures. That is, blame POSIX, not GNU coding style.


The `PRIuMAX` et al macros are specified by ISO C (starting with the 1999 standard), not (just) by POSIX.


That makes sense; the most recent editions of POSIX "import" ISO C '99. I had a copy of POSIX handy to check it, but not ISO C :)


Yeah, it's ugly, but it's sadly the only way to correctly do cross platform prints where the type may or may not be 64 bit.


The `PRIuMAX` macro in particular is not necessary. It expands to a format usable for `uintmax_t`, but you can also use `"%ju"`. Both were introduced in ISO C99.

(Unless your `printf` implementation doesn't recognize `"%ju"`, and the `PRIuMAX` macro and friends are defined in some non-standard system-specific manner.)


Related, the recently posted "How to write C in 2016":

https://news.ycombinator.com/item?id=10864176


How is this supposed to work? The underscore is a convenience macro for the gettext localization function. But gettext works by looking up its string argument in the localization database, and here the string varies from one platform to another, so how can you be sure that the localization will be found?

I guess the answer is that you have to redo the localization for each platform that has a different expansion of PRIuMAX. But that's horrible: the localization will be identical except for the format specifiers, which will just be copied from the English version of the string. Is there support for automating this in the translation toolchain?


What is the behavior of gettext() if the message is not found?

(This question is the answer to your question.)


If the message is not found, gettext returns the argument unchanged, which would be a bug (English text incorrectly appearing instead of localized text). So that's not it.

The answer to my question is that xgettext and gettext have special handling for PRIuMAX and the other formatting macros in inttypes.h. http://www.gnu.org/software/gettext/manual/gettext.html#Prep...


I've found that pv is useful for this purpose. However, I've never been quite clear on the best way to use bs with pv. Is it

  dd if=/dev/sda bs=1M | pv > sda.file
or is this better

  dd if=/dev/sda bs=1M | pv | dd of=sda.file bs=1M


If all you are doing is the above, then you don't need dd at all. pv will act like dd (but with stdin/out) and uses a sensible buffer size you can override. Your first example becomes:

  pv --buffer-size 1m < /dev/sda > sda.file
http://www.ivarch.com/programs/quickref/pv.shtml

  -B BYTES, --buffer-size BYTES
    Use a transfer buffer size of BYTES bytes. A suffix of "k", "m", "g", or
    "t" can be added to denote kilobytes (*1024), megabytes, and so on. The 
    default buffer size is the block size of the input file's filesystem 
    multiplied by 32 (512kb max), or 400kb if the block size cannot be 
    determined.
pv is a great tool and works really well. On Linux you can even point it at other process IDs, and it will automatically scan their open file descriptors and show progress of the relevant ones.


Just be very careful which direction you point those arrows :-)

I have similar nightmares about typoing `if` and `of` when using dd.


Quite often, when I'm about to write a command involving `dd`, the first character I type is #.


You can also drop the `<` in that example, pv takes a file to read from as an argument:

  pv --buffer-size 1m /dev/sda > sda.file
Will do the same thing, and give an accurate % complete progress bar.


> Will do the same thing, and give an accurate % complete progress bar.

I'm not sure that's true, last time I tried to do:

  pv < /dev/zero > /dev/sda
To zero a drive, it correctly printed out the total size of the output block device. Perhaps it also does this type of scanning on stdin, if possible.

That said, looking at the man page, it appears that this functionality is only available for the output end:

Note that if the input size cannot be calculated, and the output is a block device, then the size of the block device will be used and pv will automatically stop at that size as if -S had been given. (http://www.ivarch.com/programs/quickref/pv.shtml)


Ah, I see what you mean, I was thinking in terms of images/files.

As an argument, it gives % of the input (64GB USB drive, 8GB image):

  > pv /mnt/str/Downloads/stick-8g.img > /dev/sdi
  916MiB 0:00:35 [1.88MiB/s] [========>                                                                           ] 11% ETA 0:04:17
Otherwise it seems to give the size of the output if it's a block device.


There's also this incredible tool that spies on processes to get real time progress:

https://github.com/Xfennec/progress


dd was always a bit special to me. It's the only command-line utility I know of that takes the simple approach to argument parsing. I don't know why everyone wants to prefix their arguments with --


As mentioned in the online manual at http://gnu.org/software/coreutils/dd , that command-line syntax is inspired by the DD (data definition) statement of OS/360 JCL.


For one thing, nobody wants to type:

  mv opt=force old=oldname.txt new=/path/to/new.txt
In dd it doesn't matter because it's used for "surgical" jobs where you have to think very carefully about several simultaneous numeric details.


I've been using dcfldd for years now, just because of the progress bar.


ddrescue also shows progress by default. Not sure if it has a comparable feature-set as dcfldd, but for data rescue/forensics, it definitely does the trick, and seems to be much more actively maintained.


Example (from Reddit post):

    # dd if=arch.iso of=/dev/sdb bs=4M status=progress
    61432564 bytes (61 MB) copied, 3.024017 s, 20.3 MB/s


Is a progress bar or percentage really that much to ask?


You don't always know how many bytes remain to be copied, e.g. in the case of a FIFO. It could theoretically do things differently in the cases where you do know, however.
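
To make that concrete, here is a rough sketch of the arithmetic a percentage display needs. `percent_done` is a hypothetical helper, not anything dd or pv provides; it can only answer when the total is known.

```shell
# percent_done COPIED [TOTAL]
# Prints an integer percentage, or "unknown" when no usable total is given
# (for example when reading from a FIFO).
percent_done() {
    copied=$1
    total=$2
    if [ -z "$total" ] || [ "$total" -le 0 ]; then
        echo "unknown"
    else
        echo "$(( copied * 100 / total ))%"
    fi
}

percent_done 916 8192   # 916 MiB of an 8192 MiB image: prints 11%
percent_done 916        # no total available: prints unknown
```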


I get "No repositories found"...


I get "no repositories found" for that link.


See: cat -v Considered Harmful by Rob Pike


This is a pretty useful feature for a tool that regularly copies large files or disks whose progress you'd want to see; in fact, I'd say it should have been the default behavior for years.

I'd make some crude comment about cargo culting, but do you have an explicit complaint? dd itself is derided by the plan9 crew for many more reasons than a progress bar, and the cli arguments are obtuse because they're a joke [0].

[0]: http://www.catb.org/jargon/html/D/dd.html


Couldn't you just send a SIGUSR1 signal to see the progress?


That's more awkward and outputs a separate line per interrupt. Also, doing that programmatically (to provide progress to a program using dd underneath) is tricky to do robustly because of the default disposition of SIGUSR1, which terminates any process that has not installed a handler for it. SIGUSR1 is really only a hack provided on systems that don't support SIGINFO (which is easier to generate and whose default disposition is easier to handle robustly).
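
A quick way to see the default-disposition problem, as a sketch: `sleep` stands in here for any process that has not (or not yet) installed a SIGUSR1 handler, which is my choice of example and not something dd-specific.

```shell
# Send SIGUSR1 to a process that has no handler for it.
sleep 30 &
pid=$!
kill -USR1 "$pid"

# Collect the exit status without tripping `set -e`.
status=0
wait "$pid" || status=$?

# Shells report death by signal as 128 + the signal number.
echo "exit status: $status"
```

A status above 128 means the process was killed by the signal instead of reporting anything, which is what happens to a dd that is signalled too early.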


You could, but letting dd take care of it instead of doing it yourself is much better IMHO.


What if you started dd without status=progress?


Updating dd to continuously dump text to the screen by default would probably break a lot of existing scripts.


Just on SIGUSR1 would be good enough to get an idea where dd is at. And it would be on stderr, not stdout.
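
A small sketch of why stderr matters here, assuming any reasonably standard dd: the statistics never mix with the data stream, so a pipeline sees only the bytes that were copied.

```shell
# The data goes to stdout; discard the statistics on stderr and count bytes.
bytes=$(dd if=/dev/zero bs=1024 count=4 2>/dev/null | wc -c | tr -d ' ')
echo "$bytes"    # 4096

# Conversely, capture only the statistics that dd writes to stderr.
stats=$(LC_ALL=C dd if=/dev/zero of=/dev/null bs=1024 count=4 2>&1)
echo "$stats"
```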


> it would be on stderr, not stdout

Oh, that's true.

Anyway, what behavior do you want to see exactly? You can still send the USR1 signal and get the update behavior just like before. Were you just pointing out that there are cases where you still need to use that feature, or something else?


Then that's your own fault?


But surely if you could send a SIGUSR1 and get the progress it would be useful. I won't always have the foresight to put the progress argument. Maybe I expected the task to be quick but it's not.


If you're on BSD, that'll kill dd.



