MOVNTPS--Move Aligned Four Packed Single-FP Non Temporal

Opcode

Instruction

Description

0F 2B /r

MOVNTPS m128, xmm

Move packed single-precision floating-point values from xmm to m128, minimizing pollution in the cache hierarchy.

Description

Moves the double quadword in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to minimize cache pollution during the write to memory. The source operand is an XMM register, which is assumed to contain four packed single-precision floating-point values. The destination operand is a 128-bit memory location.

The non-temporal hint is implemented by using a write combining (WC) memory type protocol when writing the data to memory. Using this protocol, the processor does not write the data into the cache hierarchy, nor does it fetch the corresponding cache line from memory into the cache hierarchy. The memory type of the region being written to can override the non-temporal hint, if the memory address specified for the non-temporal store is in an uncacheable (UC) or write protected (WP) memory region.

Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation such as SFENCE should be used in conjunction with MOVNTDQ instructions if multiple processors might use different memory types to read/write the memory location.

Operation

DEST SRC;

Intel(R) C++ Compiler Intrinsic Equivalent

MOVNTDQ void_mm_stream_ps(float * p, __m128 a)

SIMD Floating-Point Exceptions

None.

Protected Mode Exceptions

#GP(0) - For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. If memory operand is not aligned on a 16-byte boundary, regardless of segment.

#SS(0) - For an illegal address in the SS segment.

#PF(fault-code) - For a page fault.

#NM - If TS in CR0 is set.

#UD - If EM in CR0 is set. If OSFXSR in CR4 is 0. If CPUID feature flag SSE is 0.

Real-Address Mode Exceptions

#GP(0) - If memory operand is not aligned on a 16-byte boundary, regardless of segment.

Interrupt 13 - If any part of the operand lies outside the effective address space from 0 to 0FFFFH.

#NM - If TS in CR0 is set.

#UD - If EM in CR0 is set. If OSFXSR in CR4 is 0. If CPUID feature flag SSE2 is 0.

Virtual-8086 Mode Exceptions

Same exceptions as in Real Address Mode

#PF(fault-code) - For a page fault.