dchidelf
03-17-2004, 11:08 PM
Well... I'm sure I'm causing it, but I'm not sure that it is because of my code.
I am working on a program that does a lot of memory allocation and mucking around with pointers to pointers to structures of pointers....etc.
You get the point.
I have been running the code through its paces trying to find potential problems.
I can cause the code to generate a segmentation fault, but I'm not sure if the segmentation fault is due to a problem with the code itself or with the means by which I trigger the fault.
The system I am testing my program on is a Red Hat 7.3 box (kernel 2.4.18-3) with 128MB of RAM.
I launched 5 instances of my program and fed each of them enough input to make it use well over 256MB of RAM. At the same time I had a perl script running that I wrote to eat as much memory as it could. The result was constant thrashing, as was apparent from the unending hard drive access.
After an hour or more of this activity, one or more of my processes would terminate with a Seg Fault. I had logging in place to track whether memory allocation had failed within my program, and it hadn't. Several times while testing this way, my perl script also suffered a Seg Fault. On yet another occasion the system halted with a kernel panic.
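For reference, the allocation logging mentioned above could look roughly like the following. This is only a minimal sketch, not the actual code from the program: `logged_malloc`, `LOGGED_MALLOC` and `failed_allocs` are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counter: number of allocations that have failed so far. */
static unsigned long failed_allocs = 0;

/*
 * Hypothetical wrapper around malloc(). It behaves like malloc(), but
 * when a non-zero request returns NULL it bumps failed_allocs and logs
 * the call site to stderr.
 */
static void *logged_malloc(size_t size, const char *file, int line)
{
    void *p;

    assert(file != NULL);   /* callers go through the macro below */

    p = malloc(size);
    if (p == NULL && size != 0) {
        failed_allocs++;
        fprintf(stderr, "%s:%d: malloc(%lu) failed\n",
                file, line, (unsigned long)size);
    }
    return p;
}

/* Records the caller's file and line automatically. */
#define LOGGED_MALLOC(size) logged_malloc((size), __FILE__, __LINE__)
```

Every allocation site would call `LOGGED_MALLOC(n)` in place of `malloc(n)` and still check the result for NULL; a `failed_allocs` of zero at the time of the crash means no allocation request was ever refused.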
The only thing I have been able to tell from debugging is that my program seems to die shortly after function calls. The seg fault is triggered near the beginning of the called function (not always at the same spot in the code).
Every time I run the program it is working on the exact same input data and there is no randomness to how it processes the data, yet the Seg Faults occur at different times.
After adding an additional 256MB of RAM to my test system I haven’t been able to cause the Seg Faults. I even tried setting ulimit -m (and -v) to 4096, which makes my memory allocations start failing, but the code handles that nicely.
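The `ulimit -v` part of that test can also be reproduced from inside a C program with `setrlimit(RLIMIT_AS, ...)`. The following is a rough sketch under that assumption; `limit_address_space_kb` and `can_allocate` are hypothetical helper names, and the value is taken in kilobytes because that is the unit `ulimit -v` uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: cap this process's virtual address space at
 * `kbytes` kilobytes (soft limit only). Returns 0 on success, -1 on
 * failure with errno set by getrlimit()/setrlimit().
 */
static int limit_address_space_kb(unsigned long kbytes)
{
    struct rlimit rl;
    rlim_t wanted;

    assert(kbytes > 0);

    if (getrlimit(RLIMIT_AS, &rl) != 0)
        return -1;

    wanted = (rlim_t)kbytes * 1024;          /* the limit is in bytes */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;                /* cannot exceed hard limit */

    rl.rlim_cur = wanted;
    return setrlimit(RLIMIT_AS, &rl);
}

/* Returns 1 if a block of `bytes` bytes can be allocated, else 0. */
static int can_allocate(size_t bytes)
{
    void *volatile p = malloc(bytes);        /* volatile: keep the call */

    if (p == NULL)
        return 0;
    free(p);
    return 1;
}
```

Calling `limit_address_space_kb(4096)` early in `main` would correspond to `ulimit -v 4096`; after that, large requests should come back as NULL from `malloc`, which exercises the same failure-handling paths without having to exhaust the whole machine.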
My question:
Should I just chalk it up to the constant page faults eventually leading to some amount of memory corruption that ends up causing a seg fault in whichever process encounters it first?
I don't want to sound like one of those programmers who thinks everything is someone else's bug, but debugging my code seems to be getting me nowhere, especially when the errors are not even isolated to my program.
Thanks