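I previously wrote an article about how many logical CPUs SQL Server can recognize. A few days ago someone on the forum asked how Windows processor groups are divided.
How many logical CPUs can SQL Server actually recognize?
The forum thread pointed to two articles. Let's take a look at them now.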
http://social.technet.microsoft.com/Forums/zh-CN/17f34500-08d5-4302-a484-3ce487899a83/windows-2008-r2sql-server-2005cpu?forum=sqlserverzhchs
Uneven Windows Processor Groups
SQL Server 2005 and 2008 versions may not detect all available processors on a machine with more than 64 logical processors
Uneven Windows Processor Groups
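This article deals with hardware that has more than 64 logical CPUs.
We have discussed Windows Server 2008 R2 and its support for more than 64 logical processors. The hardware available so far had 8 cores per physical processor (socket).
With Hyper-Threading, that means 16 logical CPUs per socket. Each socket forms one or two NUMA nodes, and 4 or 8 of those NUMA nodes form a processor group.
Processor groups are assigned when the operating system boots. For this purpose, Windows Server 2008 R2 and later versions of Windows inspect the physical hardware architecture
in order to align processor groups with NUMA nodes, and they check memory latency to decide which logical CPU goes into which processor group. Once the assignment is made, it cannot be changed dynamically.
This assignment only happens on hardware with more than 64 logical CPUs. On typical 8-socket servers, resources and memory have usually been distributed evenly between the processor groups
(apart from some exotic hardware with 96 logical CPUs that appeared on the market in 2009 and 2010).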
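What happens to existing software when it is confronted with processor groups? On a machine with more than 64 logical CPUs, how does software end up on particular logical processors?
In practice, Windows assigns one of the processor groups to an application when it starts. The application then sees only that window of at most 64 logical CPUs to run on.
It does, however, see the full memory resources. A typical application is scheduled within one of the processor groups.
As long as the processor groups are evenly distributed and the software does not depend on the availability of particular NUMA nodes, all is well.
This balance, however, is upset by the latest Intel Xeon E7 processor family released by Intel (10 cores and 20 logical processors per socket).
Clearly, these core counts and logical processor counts do not add up neatly to 64. In my blog I have already covered
how these processors affect the SQL Server affinity mask settings.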
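To check how a running Windows system has actually split its logical processors into groups, the Win32 functions `GetActiveProcessorGroupCount` and `GetActiveProcessorCount` can be called from a script. The following is a minimal sketch using Python's `ctypes`. The function name `active_processors_per_group` is made up for this illustration, and returning an empty list on non-Windows systems is simply a convenience of the sketch.

```python
import ctypes
import sys


def active_processors_per_group():
    """Return the number of active logical processors in each processor group.

    Only meaningful on Windows 7 / Windows Server 2008 R2 or later.
    On any other platform this sketch returns an empty list.
    """
    if sys.platform != "win32":
        return []

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)

    # WORD GetActiveProcessorGroupCount(void);
    kernel32.GetActiveProcessorGroupCount.argtypes = []
    kernel32.GetActiveProcessorGroupCount.restype = ctypes.c_ushort

    # DWORD GetActiveProcessorCount(WORD GroupNumber);
    kernel32.GetActiveProcessorCount.argtypes = [ctypes.c_ushort]
    kernel32.GetActiveProcessorCount.restype = ctypes.c_uint32

    group_count = kernel32.GetActiveProcessorGroupCount()
    return [kernel32.GetActiveProcessorCount(g) for g in range(group_count)]


if __name__ == "__main__":
    for group, count in enumerate(active_processors_per_group()):
        print(f"Group {group}: {count} logical processors")
```

On a machine with 64 or fewer logical processors this should report a single group.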
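So far we have not discussed how Windows Server 2008 R2 distributes the 80 logical processors of a 4-socket server, or the 160 logical processors of an 8-socket server.
The original algorithm in Windows Server 2008 R2 creates as few processor groups as possible and keeps the number of processors in each group as large as possible.
With these new 10-core processors, that ends up producing uneven processor groups. Let's see what happens.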
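The arithmetic behind the uneven groups can be sketched as follows. This is not Windows' actual implementation, only an illustration of "as few groups as possible, each as large as possible" under two simplifying assumptions: a group holds at most 64 logical processors, and a NUMA node is never split across groups. The function name `pack_numa_nodes` is made up for this sketch.

```python
MAX_GROUP_SIZE = 64  # a processor group holds at most 64 logical processors


def pack_numa_nodes(node_sizes, max_group_size=MAX_GROUP_SIZE):
    """Greedily fill processor groups with whole NUMA nodes, in order.

    node_sizes: logical processors per NUMA node, e.g. [20, 20, 20, 20].
    Returns the number of logical processors in each resulting group.
    """
    groups = []
    current = 0
    for size in node_sizes:
        if size > max_group_size:
            raise ValueError("a single NUMA node does not fit into one group")
        if current + size > max_group_size:
            # The next node would overflow the group, so close it and start a new one.
            groups.append(current)
            current = 0
        current += size
    if current:
        groups.append(current)
    return groups


print(pack_numa_nodes([16] * 8))  # 8 sockets x 16 threads -> [64, 64]
print(pack_numa_nodes([20] * 4))  # 4 sockets x 20 threads -> [60, 20]
print(pack_numa_nodes([20] * 8))  # 8 sockets x 20 threads -> [60, 60, 40]
```

With 16 threads per socket the groups come out even. With 20 threads per socket only three sockets fit into a group, which leaves a smaller group at the end.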
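Checking the current processor group information
To see the exact processor groups on Windows Server 2008 R2, the hardware usually needs to expose more than 64 logical CPU threads. The tool used for the check is called "coreinfo".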
Download: http://technet.microsoft.com/en-us/sysinternals/cc835722.aspx
Download: http://files.cnblogs.com/lyhabc/Coreinfo.zip
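Download coreinfo.exe and run it in a cmd window.
It is best to redirect the coreinfo output to a text file with the following command so that it can be analyzed: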
coreinfo > structure.txt
Contents of structure.txt:
Intel(R) Pentium(R) CPU G630 @ 2.70GHz
Intel64 Family 6 Model 42 Stepping 7, GenuineIntel
HTT             *   Hyperthreading enabled
HYPERVISOR      -   Hypervisor is present
VMX             *   Supports Intel hardware-assisted virtualization
SVM             -   Supports AMD hardware-assisted virtualization
EM64T           *   Supports 64-bit mode
SMX             -   Supports Intel trusted execution
SKINIT          -   Supports AMD SKINIT
NX              *   Supports no-execute page protection
SMEP            -   Supports Supervisor Mode Execution Prevention
SMAP            -   Supports Supervisor Mode access Prevention
PAGE1GB         -   Supports 1 GB large pages
PAE             *   Supports > 32-bit physical addresses
PAT             *   Supports Page Attribute Table
PSE             *   Supports 4 MB pages
PSE36           *   Supports > 32-bit address 4 MB pages
PGE             *   Supports global bit in page tables
SS              *   Supports bus snooping for cache Operations
VME             *   Supports Virtual-8086 mode
RDWRFSGSBASE    -   Supports direct GS/FS base access
FPU             *   Implements i387 floating point instructions
MMX             *   Supports MMX instruction set
MMXEXT          -   Implements AMD MMX extensions
3DNOW           -   Supports 3DNow! instructions
3DNOWEXT        -   Supports 3DNow! extension instructions
SSE             *   Supports Streaming SIMD Extensions
SSE2            *   Supports Streaming SIMD Extensions 2
SSE3            *   Supports Streaming SIMD Extensions 3
SSSE3           *   Supports Supplemental SIMD Extensions 3
SSE4a           -   Supports Sreaming SIMDR Extensions 4a
SSE4.1          *   Supports Streaming SIMD Extensions 4.1
SSE4.2          *   Supports Streaming SIMD Extensions 4.2
AES             -   Supports AES extensions
AVX             -   Supports AVX intruction extensions
FMA             -   Supports FMA extensions using YMM state
MSR             *   Implements RDMSR/WRMSR instructions
MTRR            *   Supports Memory Type Range Registers
XSAVE           *   Supports XSAVE/XRSTOR instructions
OSXSAVE         *   Supports XSETBV/XGETBV instructions
RDRAND          -   Supports RDRAND instruction
RDSEED          -   Supports RDSEED instruction
CMOV            *   Supports CMOVcc instruction
CLFSH           *   Supports CLFLUSH instruction
CX8             *   Supports compare and exchange 8-byte instructions
CX16            *   Supports CMPXCHG16B instruction
BMI1            -   Supports bit manipulation extensions 1
BMI2            -   Supports bit manipulation extensions 2
ADX             -   Supports ADCX/ADOX instructions
DCA             -   Supports prefetch from memory-mapped device
F16C            -   Supports half-precision instruction
FXSR            *   Supports FXSAVE/FXSTOR instructions
FFXSR           -   Supports optimized FXSAVE/FSRSTOR instruction
MONITOR         *   Supports MONITOR and MWAIT instructions
MOVBE           -   Supports MOVBE instruction
ERMSB           -   Supports Enhanced REP MOVSB/STOSB
PCLULDQ         *   Supports PCLMULDQ instruction
POPCNT          *   Supports POPCNT instruction
LZCNT           -   Supports LZCNT instruction
SEP             *   Supports fast system call instructions
LAHF-SAHF       *   Supports LAHF/SAHF instructions in 64-bit mode
HLE             -   Supports Hardware Lock Elision instructions
RTM             -   Supports Restricted Transactional Memory instructions
DE              *   Supports I/O breakpoints including CR4.DE
DTES64          *   Can write history of 64-bit branch addresses
DS              *   Implements memory-resident debug buffer
DS-CPL          *   Supports Debug Store feature with CPL
PCID            *   Supports PCIDs and settable CR4.PCIDE
INVPCID         -   Supports INVPCID instruction
PDCM            *   Supports Performance Capabilities MSR
RDTSCP          *   Supports RDTSCP instruction
TSC             *   Supports RDTSC instruction
TSC-DEADLINE    *   Local APIC supports one-shot deadline timer
TSC-INVARIANT   *   TSC runs at constant rate
xTPR            *   Supports disabling task priority messages
EIST            *   Supports Enhanced Intel Speedstep
ACPI            *   Implements MSR for power management
TM              *   Implements thermal monitor circuitry
TM2             *   Implements Thermal Monitor 2 control
APIC            *   Implements software-accessible local APIC
x2APIC          -   Supports x2APIC
CNXT-ID         -   L1 data cache mode adaptive or BIOS
MCE             *   Supports Machine Check, INT18 and CR4.MCE
MCA             *   Implements Machine Check Architecture
PBE             *   Supports use of FERR#/PBE# pin
PSN             -   Implements 96-bit processor serial number
PREFETCHW       *   Supports PREFETCHW instruction

Maximum implemented CPUID leaves: 0000000D (Basic), 80000008 (Extended).

Logical to Physical Processor Map:
*-  Physical Processor 0
-*  Physical Processor 1

Logical Processor to Socket Map:
**  Socket 0

Logical Processor to NUMA Node Map:
**  NUMA Node 0

Logical Processor to Cache Map:
*-  Data Cache          0, Level 1,   32 KB, Assoc   8, LineSize  64
*-  Instruction Cache   0, Level 1,   32 KB, Assoc   8, LineSize  64
*-  Unified Cache       0, Level 2,  256 KB, Assoc   8, LineSize  64
-*  Data Cache          1, Level 1,   32 KB, Assoc   8, LineSize  64
-*  Instruction Cache   1, Level 1,   32 KB, Assoc   8, LineSize  64
-*  Unified Cache       1, Level 2,  256 KB, Assoc   8, LineSize  64
**  Unified Cache       2, Level 3,    3 MB, Assoc  12, LineSize  64

Logical Processor to Group Map:
**  Group 0
Open the file and look for the section called "Logical Processor to Group Map". It is usually the last section of the output.
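For a large machine it can be easier to count the processors per group with a script than by eye. The sketch below assumes the layout seen in the coreinfo output of this article: after the "Logical Processor to Group Map" header, each line holds a mask of `*` and `-` characters followed by `Group <n>`, where every `*` marks a logical processor that belongs to that group. The function name `processors_per_group` is made up for this illustration.

```python
def processors_per_group(coreinfo_text):
    """Count the logical processors in each group of a coreinfo dump.

    Returns a dict mapping group number -> number of '*' marks on its line.
    """
    counts = {}
    in_section = False
    for line in coreinfo_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Logical Processor to Group Map"):
            in_section = True
            continue
        if not in_section or not stripped:
            continue
        parts = stripped.split()
        # Expected shape: "<mask> Group <n>"
        if len(parts) == 3 and parts[1] == "Group" and parts[2].isdigit():
            counts[int(parts[2])] = parts[0].count("*")
        else:
            break  # anything else means the section is over
    return counts


# The tail of the structure.txt shown in this article (a 2-CPU desktop machine):
sample = """Logical Processor to NUMA Node Map:
**  NUMA Node 0

Logical Processor to Group Map:
**  Group 0
"""
print(processors_per_group(sample))  # {0: 2}
```

To run it against a real dump, read structure.txt into a string first. The encoding of that file depends on how the output was redirected, so it may need to be specified explicitly when opening the file.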
On a server with 80 logical CPU threads, the result might look like this: