$Id$

badblock-guess: Quickly recover most of the data from a damaged disk

Purpose
-------

badblock-guess tries to find all readable sectors of a disk in minimal time.
It is similar to:

    dd if= of= bs=512 conv=noerror,sync

but the dd(1) solution can take many weeks or months on heavily corrupted
disk media, as each attempt to read a bad sector costs about 5 seconds (the
bad-block performance of your disk may vary significantly).

badblock-guess first finds and recovers most of the data from the healthy
zones of the disk. Only later does it try to recover the smaller bits of
information spread between and around the badblock zones. Its principle of
operation relies on the fact that badblocks usually occur in chunks.

If you do not terminate badblock-guess yourself and leave it to finish on
its own, its execution time should be approximately the same as for dd(1).
You may also use it to estimate the remaining execution time: during its
later phases it shows the remaining number of sectors ("TODO=x"), most of
which are probably badblocks at that point.

This program will not recover the data of any sector whose read command
fails. If the dd(1) command above finishes for you in a reasonable time,
you do not need this program.

No special read methods are used. No vendor-specific dependencies exist.
No IDE, SCSI or any other specific device is required.

License
-------

GNU GENERAL PUBLIC LICENSE, Version 2, June 1991.
See the file COPYING for its details. It is also available at:

    http://www.gnu.org/licenses/gpl.txt

Usage
-----

Syntax: badblock-guess []

Possible device cases (watch out for vs. differences!):

No is specified:
    Just the is scanned for errors and the detected badblocks list is
    output.

is a harddrive (/dev/hdc):
    Another harddrive of equal or higher capacity is recommended

is a partition (/dev/hdc1):
    A partition on any other physical drive in the system with exactly
    the same partition size is recommended

is a file:
    The file must exist but it will be enlarged when needed. The commands:
        rm -f /tmp/hdc1.img; touch /tmp/hdc1.img
    are recommended to specify file "/tmp/hdc1.img" for

All the numbers are always expressed as sector (that means 512 bytes)
number/count!

Never terminate badblock-guess by CTRL-C or kill(1) if you want to use its
results - always use 'f' followed by ENTER ('f' for 'finish'). You may not
see the 'f' while typing it - type it blindly (without quotes - just the
one letter!).

While finishing, the program writes the badblocks list to its stdout and
clears the 'bad' or 'not-yet-done' zones of the target disk (if one is
specified).

TODO: Resuming the operation from the badblocks list to continue the
scanning is not yet implemented.

hdparm
------

Although it is not required, it is strongly recommended to turn off drive
readahead during severe disk failure recoveries. To do so for your drive,
you can use:

    /sbin/hdparm -A0a0 /dev/hdX (or /dev/sdX etc.)

You may need to install an extra package providing hdparm(8) for your
Linux distribution.

Be aware that disk performance will be hit critically - you may expect a
read performance of about 90KB/s (approx. 7.5GB/24hours if no badblock
read retrying is needed). YMMV.

Linux kernel flaw
-----------------

Linux kernels have an internal blocksize of 1KB but partitions can be sized
in 512-byte sectors, and thus the last sector of a partition with an odd
sector count becomes inaccessible. This isn't a problem for whole disk
devices, as AFAIK all disks have an even number of sectors.
This program is aware of this flaw and it will not report such a last
sector as BAD - only an appropriate warning is printed (to stderr).
This may be visible, for example, while recovering an NTFS partition with
an odd cylinder number (=> odd sector count), as NTFS uses the last
partition sector for its superblock backup. Fortunately CHKDSK will fix it
back, of course.

Compilation
-----------

Type:

    make

You should now have the compiled binary file "badblock-guess".

You may need to install the following packages of your Linux distribution:

    e2fsprogs, e2fsprogs-devel
    glib, glib-devel (this is NOT glibc!)
    other standard C compilation tools and libraries...

The compiled binary is fully statically linked, so you can take it with you
on a floppy anywhere (a running Linux is still required, of course!).

Operation description
---------------------

During its run the program updates its progress line. All the numbers are
always expressed as sector (that means 512 bytes) number/count.

@342342/819223,TODO=8192,bad=0,largest=8192,hunks=1
 ^A     ^B          ^C       ^D        ^E         ^F

A=currently reading sector 342342 ...
B=... out of the total of 819223 sectors of the disk (or partition)
C=8192 sectors have not yet been attempted to be read
D=0 sectors were found with proof of read failure
E=currently read hunk of 8192 sectors
 =also there is a maximum size of hunk 8192 sectors
F=total remaining hunks to be processed

Initially there is just one hunk (0-media_size) to be read. When no errors
are found on the disk, this one hunk is finished and no output (bad sectors
list) is generated.

When we find a bad sector, we divide our todo-listed hunks by the schema:

start|-> successfully read 10 ->[1 bad][todo 1][todo 2][todo 4][todo 9][todo 19]|end

We always read the biggest "todo" hunk; "todo" hunks of the size are read
in backwards order (to approach the found bad blocks from the other side).

In this case the program would start with the status line (*):

@27/46,TODO=35,bad=1,largest=19,hunks=5
 ^A ^B      ^C     ^D        ^E       ^F

A=we start reading the first sector of the largest hunk (10+1+1+2+4+9)
B=total number of sectors on the disk (10+1+1+2+4+9+19)
C=still have to read all hunks (1+2+4+9+19)
D=we found just one confirmed bad sector (1)
E=the largest hunk "todo" on the disk is (19)
F=total number of the remaining "todo" hunks (1+1+1+1+1)

If you terminate the badblock-guess run (by the 'f' key, see above!), you
can be sure that you will lose the data of at most sectors after any bad
sector found. But you can still lose sectors of data.

(*) The status line is printed once per second, so in practice you cannot
    predict the exact states of the progress status line, of course.

Output badblock list format
---------------------------

The output badblock list has the form

    first_sector-behind_last_sector

where "behind_last_sector" is last_bad_sector+1. Thus a single bad sector
would be:

    5678-5679

The following three types of output lines can occur. The first one is just
a shortcut for the second+third - such output is chosen when TODO blocks
directly follow BAD blocks.

    1000-9192 ;BAD=100, TODO(@1100-@9192)=8092
    1000-1100 ;BAD=100
    1100-9192 ;TODO=8092
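
The three line types above can be read back mechanically. The following
Python sketch is only an illustration and not part of badblock-guess: the
function name parse_badblock_line is hypothetical, and the pattern assumes
the lines are formatted exactly as in the three examples above (the real
output may differ in whitespace or carry other fields).

```python
import re

# Pattern assumed from the three example lines in this README only.
LINE_RE = re.compile(
    r"^(?P<first>\d+)-(?P<behind>\d+)\s*;"
    r"(?:BAD=(?P<bad>\d+))?"
    r"(?:,\s*)?"
    r"(?:TODO(?:\(@\d+-@\d+\))?=(?P<todo>\d+))?\s*$"
)


def parse_badblock_line(line):
    """Split one list line into (kind, first_sector, behind_last_sector)."""
    m = LINE_RE.match(line.strip())
    if m is None or (m.group("bad") is None and m.group("todo") is None):
        raise ValueError("unrecognized badblock list line: %r" % line)
    pos = int(m.group("first"))
    behind = int(m.group("behind"))
    ranges = []
    # BAD sectors come first, TODO sectors (if any) directly follow them.
    for kind in ("bad", "todo"):
        if m.group(kind) is not None:
            count = int(m.group(kind))
            ranges.append((kind.upper(), pos, pos + count))
            pos += count
    # The counts must cover the announced range exactly.
    if pos != behind:
        raise ValueError("counts do not add up in line: %r" % line)
    return ranges
```

For instance, parse_badblock_line("1000-1100 ;BAD=100") returns
[("BAD", 1000, 1100)], and the combined first line type is split into one
BAD range followed by one TODO range.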