Comparison of CPU microarchitectures

The following is a comparison of CPU microarchitectures.

Microarchitecture	Pipeline stages	Notable features
AMD K5		Out-of-order execution, register renaming, speculative execution
AMD K6		Superscalar, branch prediction
AMD K6-III		Branch prediction, speculative execution, out-of-order execution^[1]
AMD K7		Out-of-order execution, branch prediction, Harvard architecture
AMD K8		64-bit, integrated memory controller, 16 byte instruction prefetching
AMD K10		Superscalar, out-of-order execution, 32-way set associative L3 victim cache, 32-byte instruction prefetching
ARM7TDMI (-S)	3
ARM7EJ-S	5
ARM810	5
ARM9TDMI	5
ARM1020E	6
XScale PXA210/PXA250	7
ARM1136J(F)-S	8
ARM1156T2(F)-S	9
ARM Cortex-A5	8	Single issue, in-order
ARM Cortex-A7 MPCore	8	Partial dual-issue, in-order
ARM Cortex-A8	13	Dual-issue
ARM Cortex-A9 MPCore	8–11	Out-of-order, speculative issue, superscalar
ARM Cortex-A15 MPCore	15	Multi-core (up to 16), out-of-order, speculative issue, 3-way superscalar
ARM Cortex-A53		Partial dual-issue, in-order
ARM Cortex-A57		Deeply out-of-order, wide multi-issue, 3-way superscalar
AVR32 AP7	7
AVR32 UC3	3	Harvard architecture
Bobcat		Out-of-order execution
Bulldozer		Shared multithreaded L2 cache, multithreading, multi-core, pipeline of around 20 stages, integrated memory controller, out-of-order, superscalar, up to 16 cores per chip, up to 16 MB L3 cache, virtualization, Turbo Core, FlexFPU, which uses simultaneous multithreading^[2]
Piledriver		Shared multithreaded L2 cache, multithreading, multi-core, pipeline of around 20 stages, integrated memory controller, out-of-order, superscalar, up to 16 MB L2 cache, up to 16 MB L3 cache, virtualization, FlexFPU, which uses simultaneous multithreading,^[2] up to 16 cores per chip, up to 5 GHz clock speed, up to 220 W TDP, Turbo Core
Excavator	4
Zen	6	Simultaneous multithreading
Crusoe		In-order execution, 128-bit VLIW, integrated memory controller
Efficeon		In-order execution, 256-bit VLIW, fully integrated memory controller
Cyrix Cx5x86	6^[3]	Branch prediction
Cyrix 6x86		Superscalar, superpipelined, register renaming, speculative execution, out-of-order execution
DLX	5
eSi-3200	5	In-order, speculative issue
eSi-3250	5	In-order, speculative issue
EV4 (Alpha 21064)		Superscalar
EV7 (Alpha 21364)		Superscalar design with out-of-order execution, branch prediction, 4-way simultaneous multithreading, integrated memory controller
EV8 (Alpha 21464)		Superscalar design with out-of-order execution
65k	30+	Ultra-low power consumption, register renaming, out-of-order execution, branch prediction, multi-core, module, capable of reaching higher clock speeds
P5 (Pentium)	5	Superscalar
P6 (Pentium Pro)	14	Speculative execution, register renaming, superscalar design with out-of-order execution
P6 (Pentium II)	14^[4]	Branch prediction
P6 (Pentium III)	14^[4]
Intel Itanium	11^[5]	Speculative execution, branch prediction, register renaming, 30 execution units, multithreading, multi-core, coarse-grained multithreading, 2-way simultaneous multithreading, dual-domain multithreading, Turbo Boost, virtualization, VLIW, RAS with Advanced Machine Check Architecture, Instruction Replay technology, Cache Safe technology, Enhanced SpeedStep technology
Intel NetBurst (Willamette)	20	2-way simultaneous multithreading (Hyper-Threading), Rapid Execution Engine, Execution Trace Cache, quad-pumped front-side bus, Hyper-Pipelined Technology, superscalar, out-of-order
NetBurst (Northwood)	20	2-way simultaneous multithreading
NetBurst (Prescott)	31	2-way simultaneous multithreading
NetBurst (Cedar Mill)	31	2-way simultaneous multithreading
Intel Core	12	Multi-core, out-of-order, 4-way superscalar
Intel Atom	16	2-way simultaneous multithreading, in-order (no instruction reordering, speculative execution, or register renaming)
Intel Atom Oak Trail		2-way simultaneous multithreading, in-order, burst mode, 512 KB L2 cache
Intel Atom Silvermont		Out-of-order execution
Nehalem	14	2-way simultaneous multithreading, out-of-order, 6-way superscalar, integrated memory controller, L1/L2/L3 cache, Turbo Boost
Sandy Bridge	14	2-way simultaneous multithreading, multi-core, integrated memory controller, L1/L2/L3 cache, 2 threads per core, Turbo Boost
Intel Haswell	14	Multi-core, multithreading, 2-way simultaneous multithreading, hardware-based transactional memory (in selected models), L4 cache (in GT3 models), Turbo Boost, out-of-order execution, superscalar, up to 8 MB L3 cache (mainstream), up to 20 MB L3 cache (Extreme)
Broadwell		Multi-core, multithreading
Skylake		Multi-core, L4 cache
Intel Xeon Phi 7120x	7-stage integer, 6-stage vector	Multi-core, multithreading, 4 hardware-based simultaneous threads per core (which, unlike regular Hyper-Threading, cannot be disabled), time-multiplexed multithreading, 61 cores per chip, 244 threads per chip, 30.5 MB L2 cache, 300 W TDP, Turbo Boost, in-order dual-issue pipelines, coprocessor, floating-point accelerator, 512-bit wide vector FPU
LatticeMico32	6	Harvard architecture
POWER1		Superscalar, out-of-order execution
POWER3		Superscalar, out-of-order execution
POWER4		Superscalar, speculative execution, out-of-order execution
POWER5		2-way simultaneous multithreading, out-of-order execution, integrated memory controller
IBM POWER6		2-way simultaneous multithreading, in-order execution, up to 5 GHz
IBM POWER7+		Multi-core, multithreading, out-of-order, superscalar, 4 intelligent simultaneous threads per core, 12 execution units per core, 8 cores per chip, 80 MB L3 cache, true hardware entropy generator, hardware-assisted cryptographic acceleration, fixed-point unit, decimal fixed-point unit, Turbo Core, decimal floating-point unit
IBM Cell		Multi-core, multithreading, 2-way simultaneous multithreading (PPE), Power Processor Element, Synergistic Processing Elements, Element Interconnect Bus, in-order execution
IBM Cyclops64		Multi-core, multithreading, 2 threads per core, in-order
IBM zEnterprise zEC12	15/16/17	Multi-core, 6 cores per chip, up to 5.5 GHz, superscalar, out-of-order, 48 MB L3 cache, 384 MB shared L4 cache
PowerPC 401	3
PowerPC 405	5
PowerPC 440	7
PowerPC 470	9	Symmetric multiprocessing (SMP)
PowerPC A2	15
PowerPC e300	4	Superscalar, branch prediction
PowerPC e500	Dual 7 stage	Multi-core
PowerPC e600	3-issue 7 stage	Superscalar out-of-order execution, branch prediction
PowerPC e5500	4-issue 7 stage	Out-of-order, multi-core
PowerPC e6500		Multi-core
PowerPC 603	4	5 execution units, branch prediction, no SMP
PowerPC 603q	5	In-order
PowerPC 604	6	Superscalar, out-of-order execution, 6 execution units, SMP support
PowerPC 620	5	Out-of-order execution, SMP support
PWRficient		Superscalar, out-of-order execution, 6 execution units
R4000	8	Scalar
StrongARM SA-110	5	Scalar, in-order
SuperH SH2	5
SuperH SH2A	5	Superscalar, Harvard architecture
SPARC		Superscalar
hyperSPARC		Superscalar
SuperSPARC		Superscalar, in-order
SPARC64 VI/VII/VII+		Superscalar, out-of-order^[6]
UltraSPARC	9
UltraSPARC T1	6	Open source, multithreading, multi-core, 4 threads per core, integrated memory controller
UltraSPARC T2	8	Open source, multithreading, multi-core, 8 threads per core
SPARC T3	8	Multithreading, multi-core, 8 threads per core, SMP, 16 cores per chip, 2 MB L3 cache, in-order, hardware random number generator
Oracle SPARC T4	16	Multithreading, multi-core, 8 fine-grained threads per core of which 2 can be executed simultaneously, 2-way simultaneous multithreading, SMP, 8 cores per chip, out-of-order, 4 MB L3 cache, hardware random number generator
Oracle SPARC T5	16	Multithreading, multi-core, 8 fine-grained threads per core of which 2 can be executed simultaneously, 2-way simultaneous multithreading, 16 cores per chip, out-of-order, 16-way associative shared 8 MB L3 cache, hardware-assisted cryptographic acceleration, stream-processing unit, RAS features, 16 cryptography units per chip, hardware random number generator
Oracle SPARC M5	16	Multithreading, multi-core, 8 fine-grained threads per core of which 2 can be executed simultaneously, 2-way simultaneous multithreading, 6 cores per chip, out-of-order, 48 MB L3 cache, RAS features, stream-processing unit, hardware-assisted cryptographic acceleration, 6 cryptography units per chip, hardware random number generator
Fujitsu SPARC64 X		Multithreading, multi-core, 2-way simultaneous multithreading, 16 cores per chip, out-of-order, 24 MB L2 cache, RAS features
Imagination Technologies MIPS Warrior
VIA C7		In-order execution
VIA Nano (Isaiah)		Superscalar out-of-order execution, branch prediction, 7 execution units
WinChip	4	In-order execution

References

[1] "Products We Design". amd.com. Retrieved 19 January 2014.
[2] "wp-content/uploads/2013/07/AMD-Steamroller-vs-Bulldozer". cdn3.wccftech.com. Retrieved 19 January 2014.
[3] "Cyrix 5x86 ("M1sc")". pcguide.com. Retrieved 19 January 2014.
[4] "Computer Science 246: Computer Architecture" (PDF). Harvard University. Retrieved 23 December 2013. P6 pipeline.
[5] Intel Itanium 2 Processor Hardware Developer's Manual (2002). p. 14. http://www.intel.com/design/itanium2/manuals/25110901.pdf. Retrieved 28 November 2011.
[6] "Multi Core Processor SPARC64™ Series : Fujitsu Global". fujitsu.com. Retrieved 19 January 2014.

This article is issued from Wikipedia, version of 4/28/2015. The text is available under the Creative Commons Attribution/Share Alike license, but additional terms may apply for the media files.
