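0x00 Problem Description

Recently my PC has been blue-screening frequently, with the stop code DRIVER_IRQL_NOT_LESS_OR_EQUAL. Because I had just replaced the graphics card, I kept looking at the GPU and swapped graphics drivers several times, moving from the Game Ready release to the Studio release, but the problem persisted.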

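0x01 Troubleshooting

One afternoon the familiar blue screen appeared again. After the machine rebooted, I decided to track the problem down for good.

1. Open Computer Management > System Tools > Event Viewer > Windows Logs > System.

2. Filter the System log for errors in that time window. The entries show that Windows wrote a dump file (.dmp) for the unexpected shutdown.

3. Open the .dmp file with windbg.exe.

4. Once the dump has loaded, run the !analyze -v command to analyze it: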

ExtensionGallery settings after reading 'SOFTWARE\Microsoft\Debug Engine' registry:
ExtensionGallery ExtensionRepository: Implicit

************* Preparing the environment for Debugger Extensions Gallery repositories **************
   ExtensionRepository : Implicit
   UseExperimentalFeatureForNugetShare : true
   AllowNugetExeUpdate : true
   NonInteractiveNuget : true
   AllowNugetMSCredentialProviderInstall : true
   AllowParallelInitializationOfLocalRepositories : true
   EnableRedirectToChakraJsProvider : false

   -- Configuring repositories
      ----> Repository : LocalInstalled, Enabled: true
      ----> Repository : UserExtensions, Enabled: true

>>>>>>>>>>>>> Preparing the environment for Debugger Extensions Gallery repositories completed, duration 0.000 seconds

************* Waiting for Debugger Extensions Gallery to Initialize **************

>>>>>>>>>>>>> Waiting for Debugger Extensions Gallery to Initialize completed, duration 0.031 seconds
   ----> Repository : UserExtensions, Enabled: true, Packages count: 0
   ----> Repository : LocalInstalled, Enabled: true, Packages count: 46

Microsoft (R) Windows Debugger Version 10.0.29547.1002 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [C:\Windows\Minidump\051326-25703-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: srv*
Executable search path is: 
Windows 10 Kernel Version 19041 MP (16 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS
Kernel base = 0xfffff803`78800000 PsLoadedModuleList = 0xfffff803`7942a3f0
Debug session time: Wed May 13 16:35:39.228 2026 (UTC + 8:00)
System Uptime: 15 days 18:49:25.546
Loading Kernel Symbols
...............................................................
................................................................
................................................................
................................................................
..................
Loading User Symbols
PEB is paged out (Peb.Ldr = 000000ac`f0959018).  Type ".hh dbgerr001" for details
Loading unloaded module list
................
For analysis of this file, run !analyze -v
nt!KeBugCheckEx:
fffff803`78bfd510 48894c2408      mov     qword ptr [rsp+8],rcx ss:0018:ffff8603`91443b30=000000000000000a
15: kd> !analyze -v
Loading Kernel Symbols
...............................................................
................................................................
................................................................
................................................................
..................
Loading User Symbols
PEB is paged out (Peb.Ldr = 000000ac`f0959018).  Type ".hh dbgerr001" for details
Loading unloaded module list
................
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 0000000000000000, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000008, value 0 = read operation, 1 = write operation
Arg4: 0000000000000000, address which referenced memory

Debugging Details:
------------------


KEY_VALUES_STRING: 1

    Key  : Analysis.CPU.mSec
    Value: 1281

    Key  : Analysis.Elapsed.mSec
    Value: 6773

    Key  : Analysis.IO.Other.Mb
    Value: 0

    Key  : Analysis.IO.Read.Mb
    Value: 1

    Key  : Analysis.IO.Write.Mb
    Value: 1

    Key  : Analysis.Init.CPU.mSec
    Value: 593

    Key  : Analysis.Init.Elapsed.mSec
    Value: 15776

    Key  : Analysis.Memory.CommitPeak.Mb
    Value: 79

    Key  : Analysis.Version.DbgEng
    Value: 10.0.29547.1002

    Key  : Analysis.Version.Description
    Value: 10.2602.27.2 amd64fre

    Key  : Analysis.Version.Ext
    Value: 1.2602.27.2

    Key  : Bugcheck.Code.LegacyAPI
    Value: 0xd1

    Key  : Bugcheck.Code.TargetModel
    Value: 0xd1

    Key  : Dump.Attributes.AsUlong
    Value: 0x8

    Key  : Dump.Attributes.KernelGeneratedTriageDump
    Value: 1

    Key  : Failure.Bucket
    Value: AV_nt!KiPageFault

    Key  : Failure.Hash
    Value: {ec3e2762-48ae-ffe9-5b16-fbcb853e8320}

    Key  : Faulting.IP.Type
    Value: Null

    Key  : Hypervisor.Enlightenments.Value
    Value: 77057948

    Key  : Hypervisor.Enlightenments.ValueHex
    Value: 0x497cf9c

    Key  : Hypervisor.Flags.AnyHypervisorPresent
    Value: 1

    Key  : Hypervisor.Flags.ApicEnlightened
    Value: 1

    Key  : Hypervisor.Flags.ApicVirtualizationAvailable
    Value: 0

    Key  : Hypervisor.Flags.AsyncMemoryHint
    Value: 0

    Key  : Hypervisor.Flags.CoreSchedulerRequested
    Value: 0

    Key  : Hypervisor.Flags.CpuManager
    Value: 1

    Key  : Hypervisor.Flags.DeprecateAutoEoi
    Value: 0

    Key  : Hypervisor.Flags.DynamicCpuDisabled
    Value: 1

    Key  : Hypervisor.Flags.Epf
    Value: 0

    Key  : Hypervisor.Flags.ExtendedProcessorMasks
    Value: 1

    Key  : Hypervisor.Flags.HardwareMbecAvailable
    Value: 1

    Key  : Hypervisor.Flags.MaxBankNumber
    Value: 0

    Key  : Hypervisor.Flags.MemoryZeroingControl
    Value: 0

    Key  : Hypervisor.Flags.NoExtendedRangeFlush
    Value: 0

    Key  : Hypervisor.Flags.NoNonArchCoreSharing
    Value: 1

    Key  : Hypervisor.Flags.Phase0InitDone
    Value: 1

    Key  : Hypervisor.Flags.PowerSchedulerQos
    Value: 0

    Key  : Hypervisor.Flags.RootScheduler
    Value: 0

    Key  : Hypervisor.Flags.SynicAvailable
    Value: 1

    Key  : Hypervisor.Flags.UseQpcBias
    Value: 0

    Key  : Hypervisor.Flags.Value
    Value: 4853999

    Key  : Hypervisor.Flags.ValueHex
    Value: 0x4a10ef

    Key  : Hypervisor.Flags.VpAssistPage
    Value: 1

    Key  : Hypervisor.Flags.VsmAvailable
    Value: 1

    Key  : Hypervisor.RootFlags.AccessStats
    Value: 1

    Key  : Hypervisor.RootFlags.CrashdumpEnlightened
    Value: 1

    Key  : Hypervisor.RootFlags.CreateVirtualProcessor
    Value: 1

    Key  : Hypervisor.RootFlags.DisableHyperthreading
    Value: 0

    Key  : Hypervisor.RootFlags.HostTimelineSync
    Value: 1

    Key  : Hypervisor.RootFlags.HypervisorDebuggingEnabled
    Value: 0

    Key  : Hypervisor.RootFlags.IsHyperV
    Value: 1

    Key  : Hypervisor.RootFlags.LivedumpEnlightened
    Value: 1

    Key  : Hypervisor.RootFlags.MapDeviceInterrupt
    Value: 1

    Key  : Hypervisor.RootFlags.MceEnlightened
    Value: 1

    Key  : Hypervisor.RootFlags.Nested
    Value: 0

    Key  : Hypervisor.RootFlags.StartLogicalProcessor
    Value: 1

    Key  : Hypervisor.RootFlags.Value
    Value: 1015

    Key  : Hypervisor.RootFlags.ValueHex
    Value: 0x3f7


BUGCHECK_CODE:  d1

BUGCHECK_P1: 0

BUGCHECK_P2: 2

BUGCHECK_P3: 8

BUGCHECK_P4: 0

FILE_IN_CAB:  051326-25703-01.dmp

DUMP_FILE_ATTRIBUTES: 0x8
  Kernel Generated Triage Dump

FAULTING_THREAD:  ffff8b0b051d4080

READ_ADDRESS: fffff803794fb390: Unable to get MiVisibleState
 0000000000000000 

PROCESS_NAME:  aTrustXtunnel.

CUSTOMER_CRASH_COUNT:  1

STACK_TEXT:  
ffff8603`91443b28 fffff803`78c11da9     : 00000000`0000000a 00000000`00000000 00000000`00000002 00000000`00000008 : nt!KeBugCheckEx
ffff8603`91443b30 fffff803`78c0d778     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiBugCheckDispatch+0x69
ffff8603`91443c70 00000000`00000000     : fffff803`7e413865 ffff8b0a`f1c5ddf0 ffff8603`91443e90 00000000`00000000 : nt!KiPageFault+0x478


SYMBOL_NAME:  nt!KiPageFault+478

MODULE_NAME: nt

IMAGE_NAME:  ntkrnlmp.exe

IMAGE_VERSION:  10.0.19041.6216

STACK_COMMAND: .process /r /p 0xffff8b0b059c5080; .thread /r /p 0xffff8b0b051d4080 ; kb

BUCKET_ID_FUNC_OFFSET:  478

FAILURE_BUCKET_ID:  AV_nt!KiPageFault

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 10

FAILURE_ID_HASH:  {ec3e2762-48ae-ffe9-5b16-fbcb853e8320}

Followup:     MachineOwner
---------


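0x02 Analysis

Error type: DRIVER_IRQL_NOT_LESS_OR_EQUAL (0xD1)

  • This bug check means that a kernel-mode driver tried to access pageable (or completely invalid) memory while running at an interrupt request level (IRQL) that was too high. IRQL is an interrupt priority rather than a privilege level: at a raised IRQL the kernel cannot service page faults, so touching memory that may be paged out, or an address that is simply invalid, is fatal. Put simply, the driver accessed a memory address it must not access at that moment.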
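The four bug check arguments in the dump carry more detail than the summary line. A rough sketch of how they can be decoded, assuming the Arg3 meanings in Microsoft's reference page for bug check 0xD1 (0 = read, 1 = write, 2 and 8 = execute) and the usual x64 names for the low IRQLs; the function `describe_d1` is a hypothetical helper written for this post:

```python
# Arg3 values for bug check 0xD1, as listed in Microsoft's bug check reference.
OPERATIONS = {0: "read", 1: "write", 2: "execute", 8: "execute"}

# Names of the low IRQLs on x64 Windows.
IRQL_NAMES = {0: "PASSIVE_LEVEL", 1: "APC_LEVEL", 2: "DISPATCH_LEVEL"}

def describe_d1(arg1: int, arg2: int, arg3: int, arg4: int) -> dict:
    """Describe the arguments of DRIVER_IRQL_NOT_LESS_OR_EQUAL (0xD1).

    arg1: memory referenced, arg2: IRQL, arg3: operation,
    arg4: address of the instruction that referenced the memory.
    """
    return {
        "operation": OPERATIONS.get(arg3, "unknown"),
        "irql": IRQL_NAMES.get(arg2, f"IRQL {arg2}"),
        # Executing at address 0 from address 0 points to a call
        # through a null function pointer.
        "null_call": arg3 in (2, 8) and arg1 == 0 and arg4 == 0,
    }

if __name__ == "__main__":
    # Arg1..Arg4 exactly as reported in this dump.
    print(describe_d1(0x0, 0x2, 0x8, 0x0))
```

For this dump the arguments read as an execute operation at address 0 at DISPATCH_LEVEL, which agrees with the `Faulting.IP.Type: Null` line in the analysis and suggests that some kernel code called through a null function pointer.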

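Triggering process: PROCESS_NAME: aTrustXtunnel.

  • This is the key clue. The process running at the moment of the crash was aTrustXtunnel.exe (the dump truncates the name to "aTrustXtunnel."). It is normally part of the Sangfor zero-trust security client, aTrust, which installs its own kernel drivers to manage network connections and security policy.

Root cause:

  • Putting these two points together, the most likely conclusion is that one of the Sangfor aTrust client's drivers is defective or incompatible with the current system environment. When aTrustXtunnel.exe called into that driver, faulty logic inside the driver caused the illegal memory access, and Windows bug-checked and restarted to protect itself. Although the stack shows the failure in nt!KiPageFault (the Windows kernel's page-fault handler), that is only where the system caught the error; the real source is the driver that made the bad access.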

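0x03 An Open Question

I never enabled launch-at-startup for the aTrust client, so why was one of its processes running in the background? Task Manager confirms that it was.

Digging further, I opened the Services console and saw that aTrust had indeed installed a service. I opened its details.

The service description reads: "Sangfor zero-trust guard service. Disabling it will cause features to malfunction."

That description is baffling. I had not even launched the main program, so what exactly is it guarding in the background?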
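One possible explanation is that Windows services are started by the Service Control Manager according to their own start type, independently of any launch-at-login option in the application. I did not record the start type of this service, so the following is only a sketch of how it could be checked: it parses the output of the built-in `sc qc` command. The service name `ExampleGuardService` is a placeholder, not the real aTrust service name.

```python
import re
import subprocess

def parse_start_type(sc_qc_output: str):
    """Return the START_TYPE name (e.g. 'AUTO_START') from `sc qc` output, or None."""
    match = re.search(
        r"^\s*START_TYPE\s*:\s*\d+\s+(\S+)", sc_qc_output, re.MULTILINE
    )
    return match.group(1) if match else None

def query_start_type(service_name: str):
    """Run `sc qc <service_name>` (Windows only) and return the service's start type."""
    result = subprocess.run(
        ["sc", "qc", service_name], capture_output=True, text=True
    )
    return parse_start_type(result.stdout)

if __name__ == "__main__":
    # Placeholder name; substitute the service name shown in the Services console.
    print(query_start_type("ExampleGuardService"))
```

If the result is AUTO_START, the service starts at boot whether or not the client itself is ever opened, which would account for the background process.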

0x04 Solution

My upcoming work no longer requires Sangfor's aTrust client, so I simply uninstalled it. The blue screen has not recurred since.