Thermonuclear Kernel Compile
Alright, I have just been victimized (and caught completely off guard) by a kernel compile gone very bad. The following are some details regarding my (former) system:
/dev/hdc - Experimental hd formerly housing a LFS install
/dev/hde - A 10 GB hd formerly housing a working Red Hat Linux install.
Running Red Hat Linux on hde, I compiled a new kernel for the LFS install on /dev/hdc. Why the curious use of the word "former"? Well, here is what gcc, make clean, make dep, and make modules managed to accomplish:
1/ Master boot record on /dev/hdc GONE!
2/ Master boot record on /dev/hde formerly booting Red Hat Linux, now booting the incredibly useful LIL- operating system
3/ Swap partition on /dev/hde GONE!
That's right, all gone. Now, I don't mind starting all over again (a task that will take about 2 days), but can anyone tell me how the F*** a kernel compile managed to strategically nuke MY ENTIRE PC!!!!!!
Needless to say, I am both angry and amazed at the sheer uncontrolled and reckless power that the kernel compilation process has. Just about any documentation online describes the kernel compilation process as an everyday task that any Linux enthusiast will "eventually need to do".
Considering what it did to my system, I would consider compiling the kernel one of the most dangerous things anyone could do.
Malakin
08-14-2001, 10:42 PM
It's impossible for compiling a kernel to cause any of what you're describing. Compiling the kernel is just like compiling any other program.
When you boot into lilo, you get your usual choices, correct? If you choose the option that used to boot you into Linux, what happens now?
Are you certain you didn't use the command 'make bzlilo'??? That's the only command I know of that, after the kernel compilation, will install the new kernel as the default under LILO and run LILO for you. That command would definitely have wiped your MBR if you had the lilo.conf file set up to do so. As for a 'missing' swap partition, you might have forgotten to enable ata66/100 in the kernel, which /dev/hde would have been counting on... no ata66 controller, no /dev/hde. Get the picture?
Take a deep breath. DO_NOT_REINSTALL_ANYTHING. Reboot the system, and if you have a disk like TOMSRTBT or a couple of the Debian rescue/root or Slackware boot/root floppies handy, get back into the system with them and fix your lilo.conf file, copy the old kernel image you were using to be the default image, etc. It's really not that bad.
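For anyone wanting a safety net before LILO gets another shot at the MBR, here is a minimal Python sketch, not something from the posts above. It relies only on the standard PC layout: the MBR is the first 512-byte sector of the disk and ends with the bytes 0x55 0xAA. The function names and the paths in the usage note are hypothetical, and reading a real device such as /dev/hde would need root.

```python
SECTOR_SIZE = 512                 # a classic PC MBR is the first 512-byte sector
BOOT_SIGNATURE = b"\x55\xaa"      # expected bytes at offsets 510 and 511


def read_mbr(device_path):
    """Return the first 512 bytes of a disk device or disk image."""
    with open(device_path, "rb") as dev:
        return dev.read(SECTOR_SIZE)


def has_boot_signature(sector):
    """True if the sector is full length and ends with 0x55 0xAA."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


def backup_mbr(device_path, backup_path):
    """Copy the first sector to backup_path and return the saved bytes."""
    sector = read_mbr(device_path)
    with open(backup_path, "wb") as out:
        out.write(sector)
    return sector
```

Something like backup_mbr("/dev/hde", "/root/hde-mbr.bin") before running /sbin/lilo would leave a copy to compare against later. A valid signature only says the sector looks like a boot sector; it does not prove the boot loader inside it still points at the right files.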
Thanks bdl,
Apparently disaster recovery skills are just as important as any other skill in the Linux world.
I have actually taken the opportunity to start over; there were many aspects of the installation on /dev/hde that bothered me (hard drive arrangement, kernel installation options etc.). With a fresh installation and the remnants of my LFS disk, I'm pumped, I'm ready to do it right!
As for the make command, no, I followed the NHF on kernel compiling to the letter and issued make bzImage. I tried to make a very small monolithic kernel for the LFS, and I still plan to do so on this attempt.
I wonder. I am aware that kernel compiling is (or at least should be) incapable of modifying the MBR; however, LILO installs a first / second stage boot loader with absolute file references (HDD address locations like 0x00C4 as opposed to /boot/bzImage). If compiling the kernel moved anything around in /boot, then the existing LILO configuration would have been invalid, since the files it expects to see on the HDD are no longer in the same absolute location (i.e. at 0x0C43 instead of 0x0C63). Upon reboot using the hde MBR, lilo gave the following output:
LIL-
That indicates difficulties with the second stage boot loader. Apparently it no longer knew where anything was on the hard drive. (I had several stanzas in my lilo.conf; the default kernel should have been completely unaffected, since all I did was download and compile a new kernel.)
One thing is for certain: after the next kernel compile, I will run /sbin/lilo on both disks just to be sure nothing has been moved around. Besides, you cannot run /sbin/lilo too many times. In addition, I will have the boot disks you mentioned handy!
Thanks for the support. It appears that I learn more from my mistakes than I do from my successes.
:D
In addition, the LIL- error code indicates that the /boot/map file has been moved, or corrupted. I never touched the map file on hde, the kernel compile MUST have altered it (the map file) somehow, or shuffled around the inodes in the /boot directory, to cause the map file to go missing.
In either case, it looks as though re-running /sbin/lilo on all disks is an excellent practice after compiling a kernel, just in case /boot files have been moved or altered.
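A rough way to spot the "files changed since LILO last ran" situation is sketched below in Python. It is only a heuristic and rests on an assumption: that /sbin/lilo rewrites the map file every time it runs, so anything in the boot directory with a later modification time was touched after the last run. The function name is hypothetical, and a file replaced with its old timestamp preserved would slip past the check.

```python
import os


def files_newer_than_map(boot_dir, map_name="map"):
    """List files in boot_dir modified after the LILO map file.

    Assumption: /sbin/lilo rewrites the map file on every run, so a later
    modification time means the file changed after the last run.
    """
    map_path = os.path.join(boot_dir, map_name)
    map_mtime = os.path.getmtime(map_path)
    newer = []
    for name in sorted(os.listdir(boot_dir)):
        path = os.path.join(boot_dir, name)
        # Skip the map file itself and anything that is not a regular file.
        if name == map_name or not os.path.isfile(path):
            continue
        if os.path.getmtime(path) > map_mtime:
            newer.append(name)
    return newer
```

If files_newer_than_map("/boot") returns anything, that is a hint to check lilo.conf and run /sbin/lilo again before rebooting. An empty list is no guarantee that the map is still good.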
manual_overide
08-15-2001, 03:15 PM
When you compile, you must tell lilo where the new kernel is and copy the new System.map into /boot. Then you must rerun lilo so it installs the new configuration. These are probably the most important steps. If you forget or do them wrong, your system won't boot.
Upon compiling the kernel (on hde), I moved System.map and the new kernel to /dev/hdc. I ran /sbin/lilo on hdc (albeit improperly, hence losing the MBR on hdc). However, I did not modify any of the contents of /boot on /dev/hde (did not copy the kernel there, did not copy the System.map file there, left it alone completely). The compilation must have altered something there.
Doesn't matter though, it never hurts to re-run /sbin/lilo on all drives in the future.