Recovering deleted files using grep (atomicobject.com)
168 points by atomicobject on Aug 19, 2010 | 47 comments



Make sure your output file is on a different filesystem! Otherwise, it might be saved in the newly-freed blocks of the file you're trying to recover.
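For reference, the technique under discussion is roughly this (device name, search string, and destination path are assumptions, not from the post):

    # -a treats the raw device as text; -B/-A keep 100 lines of context
    # around each match; write the output to a different filesystem
    grep -a -B 100 -A 100 'some phrase from the lost file' /dev/sda1 > /mnt/usb/recovered.txt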


Better yet, mount the hard drive read-only (on a different computer if necessary).
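For example, assuming the disk shows up as /dev/sdb1 on the rescue machine:

    # mount read-only so nothing can touch the freed blocks
    sudo mkdir -p /mnt/rescue
    sudo mount -o ro /dev/sdb1 /mnt/rescue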


Similar option: grab a Helix CD and boot from that (and use a USB key etc. to copy files to)


The author's intent is that you write enough of the surrounding context to recover it on the first try.

Still, it raises two questions:

What about fragmentation?

Why don't we have a GNU safe-rm yet that moves files to the (freedesktop.org-specified) trash location to avoid this?


Because you told it to remove the file instead of moving the file? I don't see why we need a safe-rm on the command line.

File managers already implement a trash function as per the freedesktop.org spec: http://www.ramendik.ru/docs/trashspec.html


> I don't see why we need a safe-rm on the command line.

I think this is hilarious. :-) Throughout Unix/Linux/BSD history, there is a steady series of essays, lamentations, wails, and gnashings-of-teeth regarding the recovery, attempted recovery, or irretrievable loss of really important data that got somehow mistakenly rm'd by some admin.

...and, every single time, someone says, "Shouldn't this be made safer?", and every single time someone else says, "Nope, rm is doing exactly what it's supposed to! Just be more careful!"

As if the huge volume of arcane commands and various scripting languages disguised as configuration files weren't proof enough that the mass of Unix/Linux/BSD admins and developers all share a common streak of masochism, we also seem hell-bent on ensuring that we have tools which can -- and eventually will -- bite us in the ass.

For my part, I think that having some form of undelete option standard in every file system is as obvious as keeping backups.


The problem is not Unix as much as it is the work habits that Unix users have developed. The rm command is hard core, and yet everyone (including me) uses it regularly. It would be much smarter to create a command named "trash" or "del" or whatever to instead move files to a trash folder. Then "empty-trash" could actually use rm.
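A rough sketch of what that pair of commands could look like (script names are hypothetical; the trash location follows the freedesktop.org spec mentioned elsewhere in this thread, though a spec-compliant tool would also write the accompanying .trashinfo metadata):

    #!/bin/sh
    # trash -- move files into the trash folder instead of unlinking them
    TRASH="$HOME/.local/share/Trash/files"
    mkdir -p "$TRASH"
    mv -- "$@" "$TRASH/"

    #!/bin/sh
    # empty-trash -- the only place rm actually gets used
    rm -rf -- "$HOME/.local/share/Trash/files"/*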

Alternatively, just slow down a little bit before using rm, especially when operating as root. Understand that it's (intended to be) permanent. Use echo first when using rm with a splat in order to ensure you're actually deleting what you expect to delete.
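For example (the glob is made up):

    # see exactly what the splat expands to first
    echo rm -- *.log
    # then run it for real once the list looks right
    rm -- *.log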

The question, "Shouldn't this be made safer?" is irrelevant. At some level, you have to have an rm command. If users decide to use it regularly, then it's up to them to "Just be more careful!" The smarter thing would be to create a workflow that doesn't rely on using rm at all. Why whine and complain (not you, I mean users in general) about an operation that can be easily changed?


I never bother with "safe" deletion even on the desktop (Windows 7); I don't even have deletion prompts. For the once or twice a year that something gets mistakenly deleted - or more likely, overwritten - restore from backup does nicely.

Assuming you have backups, of course. Which you'd be insane not to.


If the file was on an ext3 filesystem, you can use ext3grep, written by Carlo Wood (http://www.xs4all.nl/~carlo17/howto/undelete_ext3.html)

(Grepping your hard drive for file fragments is suggested in the ext3 FAQ - http://batleth.sapienti-sat.org/projects/FAQs/ext3-faq.html)


> To help prevent this problem from happening in the first place, many people elect to alias the rm command to a script which will move files to a temporary location, like a trash bin, instead of actually deleting them.

Whatever happened to backups?

To help prevent this problem...

KEEP A BACKUP.


The kinds of files I most often regret rm-ing are the temporary files I have created myself as a step in a process, then deleted after I had moved onto the next step, not realizing an error had crept into the processor and that I would have to run it again on the source files (which are now, conveniently, gone.) Backups don't solve this problem, because the files themselves are never more than an hour old. A "trash" folder, however, fixes this perfectly: the semantic is that the file no longer has any place it "belongs," and may be purged if you successfully complete the project, but may be needed again if the project must be "rewound" to that step.

However, you're right that making rm(1) express move semantics isn't the right solution. Maybe if the filesystem had a "BEGIN TRANSACTION" command that you could ROLLBACK...


Storage is cheap.... Why remove intermediate files at all? If you don't want to permanently remove files, then why use rm? Just create a "del" command that moves deleted files to a trash folder. You can then make part of your backup routine be to empty the trash after performing the backup (since that file would now be available in the backups).


You're right—it's more of a "these files are in the way, and I'm sure I'm done with them... so it shouldn't hurt to just type those two little letters and reclaim the storage..."

Actually, that sounds like exactly the cognitive dissonance people had when they first started using Gmail. Perhaps filesystems need an "Archive" folder as well? Not even a Trash folder—because people want to empty a Trash folder—but rather just an enforced (and shell-supported) directory where things go when you don't have any reason to keep them, and therefore have no place to put them?


Btrfs to the rescue!
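For example, btrfs snapshots give you something close to a rollback point, assuming the directory in question is a btrfs subvolume (paths here are made up):

    # take a read-only snapshot before a risky cleanup
    btrfs subvolume snapshot -r /home /home/.pre-cleanup
    # if something important got removed, copy it back out of the snapshot
    cp -a /home/.pre-cleanup/me/project /home/me/project
    # drop the snapshot once you're sure you're done with it
    btrfs subvolume delete /home/.pre-cleanup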


TxNTFS!


My backups don't run on a minute-to-minute basis (dunno about yours), so it's totally plausible that I can spend all day working on a particular file and then mistakenly nuke it somehow, and it won't be retrievable under most backup schemes.


I bought a TimeCapsule and pointed TimeMachine at it on my Mac, so I have hourly backups. Losing an hour's worth of work is annoying, but considerably less annoying than losing a day's worth of work.


Been there, done that. I rm -rf'd a bunch of important files once, and at the time grep was giving me "memory exhausted" errors. I was able to use strings to grab all of the text of the disk, and then wade through the results with vim.

I guess this is a pretty common problem. The blog post I wrote about it in 2005 continues to be the most searched-for entry point on my site: http://csummers.com/2005/12/20/undelete-text-files-on-linux-...
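Roughly what that looks like (device and destination are assumptions; as noted above, send the output to a different filesystem):

    # pull every printable run of text off the raw device
    strings -a /dev/sda1 > /mnt/usb/disk-strings.txt
    # then search or page through the result at leisure
    grep -n 'something you remember typing' /mnt/usb/disk-strings.txt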


    cat /dev/mem | strings | grep -i llama


Hmm... I'm getting an error on that one.

    cat: /dev/mem: Operation not permitted
Edit: even as root


If I recall correctly, that's a bug that is present in a particular kernel from 6-9 months ago.


It's not a bug:

x86: introduce /dev/mem restrictions with a config option http://lwn.net/Articles/267427/ "This patch introduces a restriction on /dev/mem: Only non-memory can be read or written unless the newly introduced config option is set."

Command-line access to /dev/mem in Ubuntu http://superuser.com/questions/39583/command-line-access-to-...


Oh cool. Thanks.


Sad :(

I was looking forward to catting for llamas.


Huh. Good to know.


My memory is full of llamas!


Where the author says conservative, he means liberal.

(From afar, I understand my Colonial cousins' struggle with these two words.)


Well, I appreciated your joke, anyway.


I've been using this method since I first learned about raw disk access (dev files) and grep.

I think it should be mentioned that this will work properly only if the file was not fragmented, which will usually be the case on ext3 unless you are using almost all of the space on the drive, but fragmentation may happen frequently if you are using a FAT filesystem (which is used a lot on USB disks).

Also, if you just deleted a binary file, this method will be problematic as well. In that case you can use a tool like photorec to scan the disk, and even limit it to only the free space on the drive, which reduces the time it takes to go over the disk; it can detect all kinds of binary file types (it uses the file's magic number to detect the type).
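The basic invocation is just the device; the free-space-only limit and file-type filters are picked in photorec's interactive menus (device name is an assumption):

    sudo photorec /dev/sdb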

Like other people mentioned here before, you should recover all the data to a different partition/disk than the one you are trying to recover a file from.

With that said, recovering data is a tedious and error-prone process, so if the data is worth enough (and for some silly reason you don't have a backup) you should:

A. Turn off the computer immediately after you've discovered the loss of data (to reduce the chances of overwriting anything important).

B. Give the computer/disk to a professional to recover (because you obviously aren't one, since you don't keep backups).


Fortunately point A on Linux can be substituted with mount -o remount,ro /


Or, if you want to really delete a file, use the shred command.

From man shred:

    NAME
           shred - overwrite a file to hide its contents, and optionally delete it

I especially like the -n option!
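For example (filename is made up):

    # overwrite the file 5 times, show progress, then truncate and unlink it
    shred -n 5 -v -u secrets.txt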


Except that shred is not guaranteed to work on many (most?) modern filesystems. From `man shred`:

       CAUTION: Note that shred relies on a very  important  assumption:  that
       the  file system overwrites data in place.  This is the traditional way
       to do things, but many modern file system designs do not  satisfy  this
       assumption.   The following are examples of file systems on which shred
       is not effective, or is not guaranteed to be effective in all file sys‐
       tem modes:

       * log-structured or journaled file systems, such as those supplied with
       AIX and Solaris (and JFS, ReiserFS, XFS, Ext3, etc.)

       * file systems that write redundant data and  carry  on  even  if  some
       writes fail, such as RAID-based file systems

       *  file  systems  that  make snapshots, such as Network Appliance's NFS
       server

       * file systems that cache in temporary locations, such as NFS version 3
       clients


It works fine on default ext3; the only thing journaled is metadata. You snipped that part out. More from man shred:

In the case of ext3 file systems, the above disclaimer applies (and shred is thus of limited effectiveness) only in data=journal mode, which journals file data in addition to just metadata.

In both the data=ordered (default) and data=writeback modes, shred works as usual.


Via `reiserfsck --rebuild-tree`, you can also do that for ReiserFS partitions. It has worked very reliably for me. The only problem is that it doesn't always recover the filename and/or the directory structure (depending on how long ago you deleted the file).


Just don't do this if you've at some stage backed up another reiserfs filesystem inside your reiserfs filesystem with 'dd'.

The rebuild-tree trick mistakenly sees entries in the dd'd copy as files in the parent filesystem, and then sprays them all over your drive.


Better than burying them off in the woods somewhere.


One of my most memorable cluster fucks was recovering a database using strings on the disk. The customer ran REPAIR TABLE and ended up with a very small table :) . It was tedious, but it felt awesome actually getting a large part of the data back.


I've also used this technique. I even wrote a script with progress bar to do it, which is linked to at the end of:

http://www.pixelbeat.org/docs/disk_grep.html


I used this method once... the file created gets pretty huge, but you can even manually sift through it for lost code if you know roughly where it ended up!


Excellent Linux hack. I hadn't ever heard this before.


It works on all systems where you have raw access to the disk. And it isn't really that fancy if you think about how it works and how file systems work.


Yup, it's not very fancy, but in the end it serves its purpose and can really save your work.

The last part, about using an alias for rm, is something I've never thought about, and now I'm going to use it on all my servers.


Actually, I think the really great hack here is to alias the rm command to a trashbin script (as suggested at the end of the article).


The danger of aliasing the command itself (the bare 'rm') is that you come to count on the safety of the alias. Then you work one day on a friend's or coworker's machine and...BOOM.

What I do instead is make a nearby (and simple) alias. For example:

    alias rmi='rm -i'


This can be achieved using trash-cli
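If I remember the trash-cli commands correctly, the basic workflow is (filename made up):

    trash-put old-notes.txt    # move to the trash instead of unlinking
    trash-list                 # show what's in the trash, with original paths
    trash-restore              # interactively put something back
    trash-empty                # actually delete everything in the trash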


Clever stuff, thanks for sharing.


frequent automatic backups and version control are your friend



