On Fri, Apr 11, 2008 at 09:07:51PM -0700, Andrew Morton wrote:
> > Kernel panic - not syncing: Panic by panic_module.
> > __tunable_atomic_notifier_call_chain enter
> > msg_handler:panic_event was called.
> > ipmi_wdog:wdog_panic_handler was called.
> > notifier_test: notifier_test_panic() is called.
> > notifier_test: notifier_test_panic2() is called.
> OK. But I don't see anywhere in here the most important piece of
> information: why do we need this feature in Linux?
> What are the use-cases? What is the value? etc.
> Often I can guess (but I like the originator to remove the guesswork). In
> this case I'm stumped - I can't see any reason why anyone would want this.
To begin with, he wants kdb, kgdb etc. to co-exist with kdump. He wants
to put all the RAS tools that are interested in the panic event on a list,
export that list to user space, and let the user decide (based on priority)
in what order the tools get executed at panic time.
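
To make the ordering idea concrete, here is a rough user-space model of it.
This is only a sketch, not code from the patch: struct panic_tool,
set_priority() and run_panic_chain() are hypothetical names made up for
illustration. The one thing borrowed from the kernel is the convention that
a higher priority value in a notifier block runs earlier.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of one tool registered for the panic event. */
struct panic_tool {
	const char *name;
	int priority;			/* higher value runs first */
	void (*handler)(const char *name);
};

#define MAX_CALLS 8
const char *call_log[MAX_CALLS];	/* records the order handlers ran in */
int call_count;

/* Example handler: just records and prints that it ran. */
void record_call(const char *name)
{
	if (call_count < MAX_CALLS)
		call_log[call_count++] = name;
	printf("%s handler called\n", name);
}

/* qsort() comparator: sort by descending priority. */
int by_priority_desc(const void *a, const void *b)
{
	const struct panic_tool *x = a, *y = b;

	return (y->priority > x->priority) - (y->priority < x->priority);
}

/*
 * Stand-in for the user-space knob: change the priority of a tool by name.
 * Returns 0 on success, -1 if no tool with that name is on the list.
 */
int set_priority(struct panic_tool *tools, size_t n,
		 const char *name, int priority)
{
	size_t i;

	assert(tools != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(tools[i].name, name) == 0) {
			tools[i].priority = priority;
			return 0;
		}
	}
	return -1;
}

/* "Panic time": sort the list by priority and call each handler in turn. */
void run_panic_chain(struct panic_tool *tools, size_t n)
{
	size_t i;

	assert(tools != NULL);
	call_count = 0;
	qsort(tools, n, sizeof(*tools), by_priority_desc);
	for (i = 0; i < n; i++)
		tools[i].handler(tools[i].name);
}
```

With three entries, say kdb at 10, kdump at 20 and ipmi_wdog at 5, the
handlers run in the order kdump, kdb, ipmi_wdog. After
set_priority(tools, 3, "kdb", 30), the next run calls kdb first. The
priorities are kept distinct here because qsort() does not promise a stable
order for equal keys.
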
This raises some reliability concerns for kdump, because the notifier
code runs after the panic, before kdump gets control.
I think people want to use this infrastructure beyond RAS tools. I
remember somebody wanting to send a message to a remote node after a
panic (before kdump kicks in) so that the remote node can initiate failover.
Ideally, doing any operation after a panic is not safe; one should avoid
such things and do any required action in the next kernel (like
sending messages to remote nodes etc). Having said that, this makes the
job harder, as one needs to pass all the required data to the second kernel.
So it will now be left to the user whether to execute the code after the
panic in the first kernel or to create the required bits to execute it in
the second kernel. Things should be more reliable in the second kernel.
I am not very sure how paranoid one should be about this additional bit of
notifier code being executed after a panic. Probably we can take this in
to make the user's life easier.