> Any advice or insight into what we're doing wrong would be very much
> appreciated. My apologies in advance for the somewhat off-topic

By abruptly killing the primary server while it is doing IO, you're
probably pushing the envelope... You may have somewhat better luck with
a cluster fs; OCFS2 usually works very well for me (GFS is a complete
PITA to set up).
The better option would be to completely disable write
caching on the client side (because this is probably where it's going
wrong), but I don't know how. You can get it to flush extremely
often by playing with /proc/sys/vm/dirty_expire_centisecs
and /proc/sys/vm/dirty_writeback_centisecs, though. Be warned that
safer settings generally imply terrible performance.
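
For what it's worth, here is a rough sketch of how the two flush knobs
could be tightened from a script. The values in AGGRESSIVE are purely
illustrative assumptions on my part, not tested recommendations, and the
helper names (read_settings, apply_settings) are made up for this
example. Writing to /proc/sys/vm needs root, and the change does not
survive a reboot.

```python
from pathlib import Path

# Directory holding the VM writeback tunables on Linux.
VM_DIR = Path("/proc/sys/vm")

# Hypothetical "flush very often" values, in centiseconds.
# These numbers are assumptions for illustration only.
AGGRESSIVE = {
    "dirty_expire_centisecs": 100,    # dirty data counts as old after 1 s
    "dirty_writeback_centisecs": 50,  # flusher threads wake every 0.5 s
}


def read_settings(names, vm_dir=VM_DIR):
    """Return the current integer value of each named tunable."""
    return {
        name: int((Path(vm_dir) / name).read_text().strip())
        for name in names
    }


def apply_settings(settings, vm_dir=VM_DIR):
    """Write new values and return the previous ones for later restore."""
    previous = read_settings(settings, vm_dir)
    for name, value in settings.items():
        (Path(vm_dir) / name).write_text(f"{value}\n")
    return previous
```

As root, old = apply_settings(AGGRESSIVE) would switch to the tighter
values, and apply_settings(old) would put the previous ones back once
you are done testing. The vm_dir parameter is only there so the helpers
can be tried against a scratch directory first.
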
Ah, another thing: it may be some cache option in the iSCSI target. What
target are you using?