Recently, through a series of dumb – dumber – dumbest moves, I accidentally deleted a mysqldump backup file of a database on one of our Ubuntu servers. The sordid details of how and why my brain made such an idiotic move are moot at this point, but the solution isn’t. In a panic, I started researching how to undelete files from an ext3 file system.
Short Answer
Unless conditions are near-perfect, you can’t.
Long Answer
Ext3 uses journaling – which is, for the most part, good. It makes sure that if your system crashes, the chances of the file system coming back in a consistent state are high – much higher than with ext2. The problem is that when you unintentionally issue commands, it makes sure it does them, too. In the case of deletion, it zeros the block pointers in the inode – completely and forever breaking the link between the file and the blocks its data resides in.
Tools You May Need
A live CD with lots of utilities
This is where Trinity Rescue Kit comes in. It’s a great distro with tons of boot options, including boot from the CD, boot to RAM drive, and boot with extra SCSI support. It has a relatively small footprint with all sorts of tools including an NTFS undelete tool and…
Data Recovery Software
If the files are small-ish in size, of a common type, and not a lot of new data has been written after the deletion, there is a chance you can recover the files in their entirety. I suggest using a handy little application called PhotoRec. It does a nice job of trying to identify files by looking at the raw drive contents. It can even help with bad or damaged partitions.
You may also want to look at the SourceForge project foremost, although I did not get a chance to evaluate it.
Tools don’t always help
In my case, my file was relatively large (18 MB) and fragmented. My disk was not really damaged, so PhotoRec was finding literally thousands of “files”. (Since it looks only at the raw data, it could not tell the difference between good files currently on the system and unlinked files.)
Also, I didn’t actually delete (rm) the file; I had overwritten it with a much smaller (3 KB) version of the file. This made finding the beginning of the file much harder.
To compound things, the data was on a RAID 0 SCSI partition, which even Trinity Rescue Kit could not mount with the enhanced SCSI boot (although it could “see” it).
The Manual Search For Data Begins
About the only thing I had going for me was that my file was text. This allowed me to manually search the raw data for key phrases using grep, find their byte positions, and then use dd to grab chunks of the hard drive and sift through them, piecing as much of the file as I could back together again.
Use ‘grep’ To Search the Partition
- Boot to the Live CD
- grep the raw device (e.g. /dev/sdb1).
- Make sure to use the --binary-files=text option so that grep treats the raw data as text and prints the matching lines instead of just reporting that a binary file matches.
- Make sure to use the -b option to show the byte position of each result. In my case, I also wanted to see the context of each result to decide whether it was a chunk worth pulling, so I used the -A and -B options to show the lines after and before the match.
- Send grep results to a file. My final grep command looked a little something like this:
grep --binary-files=text -A 20 -b 'INSERT INTO `attachments`' /dev/sdb1 > results.txt
- My results.txt file was saved to another external USB drive. Never use or continue writing to the partition you are searching.
- If your partition is large, this search may take a while.
- Note the byte position of any results. I was searching for text at the beginning of the file, so the byte position tells us where to start pulling from.
- Be careful when viewing the results. You will most likely see a mix of text and binary data. Depending on how the binary data is displayed, it might shoot your terminal window into graphics mode, which turns everything (including the prompt and any text you enter afterward) into gibberish. If this happens, you can reset the mode, but it can be a pain as you might not always know what you are typing.
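One way to look at mixed results without upsetting the terminal is to pipe them through cat -v, which prints control characters in caret notation instead of sending them raw. This is only a sketch: the temp file below is a made-up stand-in for results.txt, not real recovery output.

```shell
# Hypothetical stand-in for results.txt: SQL text mixed with a raw escape sequence.
results=$(mktemp)
printf 'INSERT INTO t VALUES (1);\n\033[31mraw bytes\n' > "$results"

# cat -v shows control characters in caret notation (ESC becomes ^[),
# so the terminal never interprets them as commands.
safe_view=$(cat -v "$results")
printf '%s\n' "$safe_view"

# If the terminal is already garbled, typing `reset` (or `stty sane`) and
# pressing Enter usually restores it, even when the keystrokes are not echoed.
rm -f "$results"
```

Viewing the file in a pager such as less is another option, since less typically warns before displaying binary content.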
Once you get the result of where the file starts, you can use dd to copy that chunk to another file on a different drive. If you remember file sizes, you can just guesstimate how much data to pull. If you know the text at the end of the file, you can run a second grep to find the position of the end of the file.
Depending on the data you are looking for, you may also be able to use the strings command to find where your file is.
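The search steps above can be tried safely on a scratch file before touching a real device. The sketch below assumes GNU grep and builds a small fake "partition image" (the file, its contents, and the 'Dump completed' end marker are made up for illustration). It adds -o, which was not part of my original command: with -o, -b reports the offset of the match itself, whereas without it -b reports the offset of the start of the line containing the match, and in raw binary data a "line" can begin a long way before the text you care about.

```shell
# Build a scratch image: 1000 bytes of filler, a fragment of SQL text, more filler.
img=$(mktemp)
head -c 1000 /dev/zero > "$img"
printf 'INSERT INTO `attachments` VALUES (1);\n-- Dump completed\n' >> "$img"
head -c 500 /dev/zero >> "$img"

# -a is the short form of --binary-files=text, -b prints the byte offset,
# and -o prints only the matched text, so each hit looks like "OFFSET:match".
start=$(grep -a -b -o 'INSERT INTO `attachments`' "$img" | head -n 1 | cut -d: -f1)
end=$(grep -a -b -o 'Dump completed' "$img" | tail -n 1 | cut -d: -f1)

echo "start=$start end=$end"
rm -f "$img"
```

The difference between the two offsets gives a rough idea of how much data to pull, though on a fragmented file the pieces in between may belong to something else entirely.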
Use ‘dd’ To Extract the Information
First off, be CAREFUL when running dd. If you mix up your input and output, you could end up overwriting or, worse yet, totally hoozling your partitions. The dd parameters you will use are:
- if=FILE – the source file (or in this case device) to pull from
- of=FILE – the destination to write to
- ibs=BYTES – the size in bytes of one block read from the source (I set it to 512)
- skip=BLOCKS – the number of blocks to skip before starting to read. In this case this will be the byte offset provided by our grep results above divided by the ibs (512), rounded down
- count=BLOCKS – the number of blocks to read (the size of the file you want to retrieve divided by the ibs (512), rounded up)
My dd command looked something like this:
dd if=/dev/sdb1 of=/mnt0/homres/recover.sql ibs=512 skip=337452145 count=360000
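The arithmetic behind skip and count can be sketched on a scratch file. Everything here is illustrative: the image, the offset of 1000, and the length of 38 bytes are assumed values, not numbers from my recovery. Because a grep offset rarely lands exactly on a 512-byte boundary, the first block usually carries some unrelated bytes in front of the data, which the last step trims off.

```shell
# Scratch image standing in for the partition (hypothetical file).
img=$(mktemp)
out=$(mktemp)
head -c 1000 /dev/zero > "$img"
printf 'INSERT INTO `attachments` VALUES (1);\n' >> "$img"
head -c 1000 /dev/zero >> "$img"

offset=1000   # byte offset as reported by grep -b (assumed)
length=38     # rough size of the data to pull, in bytes (assumed)
bs=512

skip=$(( offset / bs ))                      # whole blocks before the hit, rounded down
lead=$(( offset % bs ))                      # unrelated bytes at the front of the first block
count=$(( (lead + length + bs - 1) / bs ))   # rounded up so the tail is not cut off

echo "skip=$skip lead=$lead count=$count"
dd if="$img" of="$out" ibs="$bs" skip="$skip" count="$count" 2>/dev/null

# Trim the leading bytes that belong to the block but not to the file.
recovered=$(tail -c +"$(( lead + 1 ))" "$out" | head -c "$length")
echo "$recovered"
rm -f "$img" "$out"
```

On a real device, of= must point at a different drive than the one being read.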
But wait! I’m not done. Once I had the file, I had to sift through it using a text editor and cut out any pieces that were left from other files (my SQL dump was fragmented, so there was binary data from other files dispersed through it), find out what was missing, and do a series of additional greps to try and find those missing parts.
In addition to being lucky that the file I was looking for was text, I had a good idea of the structure of the data, and what record counts for the tables should look like after reincorporating the dump file.
Using this method, and a not-so-recent backup, I was able to manually piece together about 98% of the data I had inadvertently hoozled. It was no easy task, and there was still some data lost, but the alternative… well, I still shudder just thinking about it.
I would also like to thank my father for helping me out with this. His knowledge is always invaluable and, for me, it calmed my nerves a bit.
Inadvertent Knowledge
- grep searches using regular expressions – many might say “Dur… Global Regular Expression Print”, but I thought it was a straight string search utility. COOL! Now if I could just find a dramatic use for it…
- Not zeroing out your drive before formatting a partition leaves data on the drive. – I had parts of a Windows installation left by Dell before they shipped the computer to me “unformatted.” This was over four years ago. It also made it harder for me to sift through the raw data.
How to avoid this in the future
Many of the following points might seem obvious, but for my benefit, I had to write them down to get them into my thick, stubborn skull.
- Try not to code when you’re punchy-tired.
- Make backups, and backups of backups, before handling sensitive data.
- If you run the ext3 file system, you may want to install giis as a fail-safe. It will make backups of the inodes, along with the block positions, of recently deleted files.
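As one more small guard against the kind of overwrite described earlier, a shell’s noclobber option refuses to redirect output onto a file that already exists. This is a minimal sketch, with a hypothetical backup.sql in a temp directory standing in for a real dump; it only protects against `>` redirection, not against rm or a program that opens the file itself.

```shell
dir=$(mktemp -d)
dump="$dir/backup.sql"   # hypothetical dump file name
printf 'original dump contents\n' > "$dump"

set -o noclobber         # same as `set -C`

# With noclobber set, `>` refuses to overwrite an existing file.
# The subshell keeps the redirection error from affecting the rest of the script.
if ( printf 'tiny replacement\n' > "$dump" ) 2>/dev/null; then
    overwrite=allowed
else
    overwrite=refused
fi

kept=$(cat "$dump")
echo "overwrite=$overwrite kept=$kept"

# A deliberate overwrite is still possible with `>|`.
set +o noclobber
rm -rf "$dir"
```

Putting `set -o noclobber` in a shell startup file makes it the default for interactive sessions.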
