
Capturing and Handling Triple Faults

April 11, 2012

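An earlier post said that a Triple Fault cannot be captured. After a few days of digging through references, it turns out there are ways to do it. Where possible, the simplest way to capture a Triple Fault is to run the system inside a virtual machine: when the Triple Fault occurs it only affects the guest, and we can still do further processing on the host. There are many virtual machines to choose from, the most common being VMware, QEMU and Bochs. I only tried VMware: when a Triple Fault occurs, VMware pops up a dialog reporting that the CPU has entered shutdown mode; clicking OK reboots the virtual machine and clicking Cancel powers it off. That is of little help for debugging a Triple Fault. As for the other virtual machines, a quick web search suggests that QEMU reportedly prints oops information: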
http://readlist.com/lists/netbsd.org/current-users/5/28250.html
Bochs reportedly prints oops information as well:
http://www.brokenthorn.com/Resources/OSDev15.html
http://www.brokenthorn.com/Resources/OSDev9.html
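As a rough illustration of the virtual-machine approach, here is a minimal Python sketch that builds a QEMU command line and scans the resulting debug log. The options `-d`, `-D`, `-no-reboot` and `-no-shutdown` are real QEMU options, but several things here are my assumptions rather than anything from the links above: the image name `disk.img` and its raw format are hypothetical, the exception log is assumed to come from QEMU's software emulation rather than hardware-assisted virtualization, and the exact log strings (`check_exception old: ... new ...` and `Triple fault`) may differ between QEMU versions.

```python
import re
import shlex


def qemu_command(disk_image, log_path="qemu-triple-fault.log"):
    """Build a QEMU command line that logs exceptions and stops at the reset."""
    return [
        "qemu-system-x86_64",
        "-drive", f"file={disk_image},format=raw",  # hypothetical raw disk image
        "-d", "int,cpu_reset",  # log interrupts/exceptions and CPU state at reset
        "-D", log_path,         # write the debug log to this file
        "-no-reboot",           # do not reboot the guest when it resets
        "-no-shutdown",         # keep the stopped VM around for inspection
    ]


# Assumed log line format, e.g. "check_exception old: 0xd new 0xb".
EXC_RE = re.compile(r"check_exception old: (0x[0-9a-fA-F]+) new (0x[0-9a-fA-F]+)")


def summarize_log(text):
    """Return (list of (old, new) exception vectors, triple_fault_seen)."""
    chain = [(int(old, 16), int(new, 16)) for old, new in EXC_RE.findall(text)]
    return chain, "Triple fault" in text


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(qemu_command("disk.img")))
```

After the guest stops, reading the log file and passing its contents to `summarize_log` gives the chain of exceptions that preceded the reset, which is usually the first thing to look at.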

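What if a virtual machine is not an option? A Triple Fault can still be captured on a physical machine, but this requires lower-level work. Page 266 of the official Intel manual, Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 3A: System Programming Guide, says: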
If another exception occurs while attempting to call the double-fault handler, the processor enters shutdown mode. This mode is similar to the state following execution of an HLT instruction. In this mode, the processor stops executing instructions until an NMI interrupt, SMI interrupt, hardware reset, or INIT# is received. The processor generates a special bus cycle to indicate that it has entered shutdown mode. Software designers may need to be aware of the response of hardware when it goes into shutdown mode. For example, hardware may turn on an indicator light on the front panel, generate an NMI interrupt to record diagnostic information, invoke reset initialization, generate an INIT initialization, or generate an SMI. If any events are pending during shutdown, they will be handled after an wake event from shutdown is processed (for example, A20M# interrupts).
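This states explicitly that the hardware can take some action other than resetting the CPU directly: it can instead raise a signal such as an SMI or NMI, so that the CPU still gets a chance to do something, for example record diagnostic information.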

At least one IBM chipset architecture supports capturing a triple fault and logging an entry for it:
http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5072340
1. Triple faults occur as follows:
a) A CPU enters an exception routine because of a fault.
b) While in this exception routine, if the CPU encounters another fault, the CPU will enter another exception routine (Double Fault).
c) If a third fault occurs, the CPU cannot handle it and issues a Shutdown Special Cycle (Triple Fault).
2. When a triple fault occurs on servers without IBM X3 Architecture chipsets, the Shutdown Special Cycle causes the southbridge to reset with no logged messages.
3. When a triple fault occurs on servers with IBM X3 Architecture (Hurricane) chipsets, the System Management Interrupt handle the Shutdown Special Cycle, logs a triple fault event through BIOS, and then resets the system. The triple fault message is an information only message because most triple faults are contained at the kernel or device driver level and is not necessarily a hardware error.
4. Identifying and determining the cause of the triple fault is very difficult and can require software and hardware development debug.
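
The escalation described in point 1 can be modelled in a few lines. This is only a sketch under stated assumptions: the IBM text is simplified, so the classification of exceptions into benign, contributory and page-fault classes is taken from my reading of the Intel manual's double-fault conditions, and the function names (`classify`, `escalate`, `deliver`) are made up for illustration. Following the quoted Intel paragraph, any exception raised while calling the double-fault handler is treated as leading to shutdown.

```python
CONTRIBUTORY = {0, 10, 11, 12, 13}  # #DE, #TS, #NP, #SS, #GP
PAGE_FAULT = {14}                   # #PF
DOUBLE_FAULT = 8                    # #DF


def classify(vector):
    """Map an exception vector to its class (assumed classification)."""
    if vector in CONTRIBUTORY:
        return "contributory"
    if vector in PAGE_FAULT:
        return "page_fault"
    return "benign"


def escalate(first, second):
    """Outcome when `second` is raised while delivering `first`."""
    if first == DOUBLE_FAULT:
        # Per the quoted paragraph: a fault while calling the #DF handler
        # puts the processor into shutdown mode (the Triple Fault).
        return "shutdown"
    a, b = classify(first), classify(second)
    if a == "contributory" and b == "contributory":
        return "double_fault"
    if a == "page_fault" and b in ("contributory", "page_fault"):
        return "double_fault"
    return "serial"  # the two exceptions are simply handled one after another


def deliver(faults):
    """Walk a chain of faults, each raised while delivering the previous one."""
    current = faults[0]
    for nxt in faults[1:]:
        outcome = escalate(current, nxt)
        if outcome == "shutdown":
            return "shutdown"
        current = DOUBLE_FAULT if outcome == "double_fault" else nxt
    return f"handler for vector {current}"


if __name__ == "__main__":
    print(deliver([13, 11, 11]))  # #GP, then #NP, then #NP while calling #DF
```

The chain `[13, 11, 11]` ends in shutdown: the first two faults combine into a Double Fault, and the third fault hits while the double-fault handler is being called. Not every second fault escalates, though; for example a #PF raised while delivering a #GP is handled serially in this model.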

A more detailed method for capturing and handling Triple Faults, provided by IBM:
Method of detecting and reporting triple faults in software
http://ip.com/IPCOM/000168583

Local download: IPCOM000168583D

When reposting, please keep the original address: http://www.lenky.info/archives/2012/04/1507 http://lenky.info/?p=1507


