Executing MMX™ Instructions

A programmer must approach the use of MMX instructions

differently, based on whether the code being developed is at

the system level or at the application level. The details of these

differences are discussed in “Programming Considerations” on

page 9.

Before using the MMX instructions, the programmer must use

the CPUID instruction to determine if the processor supports

multimedia technology. See the AMD Processor Recognition

Application Note, order# 20734, for more information.

Function 1 (EAX=1) of the AMD-K6 processor CPUID

instruction returns the processor feature bits in the EDX

register. Software can then test bit 23 of the feature bits to

determine if the processor supports the multimedia technology.

If bit 23 is set to 1, MMX instructions are supported. All

AMD-K6 processors have bit 23 set. Once it is determined that

multimedia technology is supported, subsequent code can use

the MMX instructions. Alternatively, the AMD 8000_0001h

extended CPUID function can be used to test whether the

processor supports multimedia technology.

After a module of MMX code has executed, the programmer
must empty the MMX state by executing the EMMS instruction.
Because the MMX registers share the floating-point registers,
an instruction is needed to prevent MMX code from interfering
with floating-point code. The EMMS instruction clears the
multimedia state by setting all the floating-point tag bits to
empty (all ones), which marks the MMX/FP registers as empty
(invalid) and available.

Register Set

The AMD-K6 processor implements eight new 64-bit MMX

registers. These registers are mapped on the floating-point

registers. As shown in Figure 1 on page 5, the new MMX

instructions refer to these registers as mmreg0 to mmreg7.

Mapping the new MMX registers on the floating-point stack

enables backwards compatibility for the register saving that

must occur as a result of task switching.

AMD-K6™ Processor Multimedia Technology 5

20726D/0—January 2000 AMD-K6™ MMX™ Enhanced Processor Multimedia Technology

Preliminary Information

Figure 1. MMX™ Registers

Aliasing the MMX registers onto the floating-point stack

registers provides a safe way to introduce this new technology.

Instead of needing to modify operating systems, new MMX

applications can be supported through device drivers, MMX

libraries, or DLL files. See the Programming Considerations

section of this document for more information.

Current operating systems have support for floating-point

operations. Using the floating-point registers for MMX code is

an ingenious way of implementing automatic support for MMX

instructions. Every time the processor executes an MMX

instruction, all the floating-point register tag bits are set to zero

(00b=valid). Setting the tag bits after every MMX instruction

prevents the processor from having to perform extra tasks.

These extra tasks are normally executed on floating-point

registers when the Tag field is something other than 00b.

If a task switch occurs during an MMX or floating-point

instruction, the Control Register (CR0) Task Switch (TS) bit is

set to 1. The processor then generates an interrupt 7 (int 7

Device Not Available) when it encounters the next

floating-point or MMX instruction, allowing the operating

system to save the state of the MMX/FP registers.

[Figure 1 layout: eight 64-bit registers, mmreg0 through mmreg7 (bits 63 to 0), each with a 2-bit tag field (xx).]


If there is a task switch when MMX applications are running

with older applications that do not include MMX instructions,

the MMX/FP register state is still saved automatically through

the int 7 handler.

Data Types

The AMD-K6 processor multimedia technology uses a packed
data format. The data is packed in a single 64-bit MMX register
or memory operand as eight bytes, four words, two doublewords,
or one quadword. Each byte, word, doubleword, or quadword is an
integer data type.

The form of an instruction determines the data type. For

example, the MOV instruction comes in two different forms—

MOVD moves 32 bits of data and MOVQ moves 64 bits of data.

The four new data types are defined as follows:

Packed byte          Eight 8-bit bytes packed into 64 bits
                     Signed integer range (-2^7 to 2^7-1)
                     Unsigned integer range (0 to 2^8-1)

Packed word          Four 16-bit words packed into 64 bits
                     Signed integer range (-2^15 to 2^15-1)
                     Unsigned integer range (0 to 2^16-1)

Packed doubleword    Two 32-bit doublewords packed into 64 bits
                     Signed integer range (-2^31 to 2^31-1)
                     Unsigned integer range (0 to 2^32-1)

Quadword             One 64-bit quadword
                     Signed integer range (-2^63 to 2^63-1)
                     Unsigned integer range (0 to 2^64-1)

Figure 2 on page 7 shows the four new data types.
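As a rough illustration of the packed layout, the C sketch below treats one 64-bit value as eight bytes, four words, or two doublewords and extracts a single element by index. The helper name `packed_element` is hypothetical and is not part of any AMD interface; the sketch assumes that element 0 occupies the least-significant bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return element `index` of a 64-bit packed value.
   `width_bits` must be 8, 16, or 32; element 0 is the least-significant one. */
uint64_t packed_element(uint64_t q, unsigned width_bits, unsigned index)
{
    assert(width_bits == 8 || width_bits == 16 || width_bits == 32);
    assert(index < 64u / width_bits);
    /* width_bits is below 64, so this shift is well defined. */
    uint64_t mask = (1ULL << width_bits) - 1;
    return (q >> (index * width_bits)) & mask;
}
```

For the value 0807060504030201h, byte 0 is 01h, word 3 is 0807h, and doubleword 1 is 08070605h.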


Figure 2. MMX™ Data Types

Instructions

The AMD-K6 processor multimedia technology includes 57 new

MMX instructions. These new instructions are organized into

the following groups:

- Arithmetic
- Empty MMX registers
- Compare
- Convert (pack/unpack)
- Logical
- Move
- Shift

The following mnemonics are used in the instructions:

- P: Packed data
- B: Byte
- W: Word
- D: Doubleword
- Q: Quadword
- S: Signed
- U: Unsigned
- SS: Signed Saturation
- US: Unsigned Saturation

[Figure 2 layout, bits 63 to 0: packed bytes B7 to B0 (8 bits x 8); packed words W3 to W0 (16 bits x 4); packed doublewords D1 and D0 (32 bits x 2); quadword Q0 (64 bits x 1).]

For example, the mnemonic for the PACK instruction that packs

four words into eight unsigned bytes is PACKUSWB. In this

mnemonic, the US designates an unsigned result with

saturation, and the WB means that the source is packed words

and the result is packed bytes.

The term saturation is commonly used in multimedia

applications. Saturation allows mathematical limits to be

placed on the data elements. If a result exceeds the boundary of

that data type, the result is set to the defined limit for that

instruction. A common use of saturation is to prevent color

wraparound.
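The difference between wraparound and saturation can be sketched in C for a single unsigned byte. This is only a scalar model of the idea; the function names are made up for this example, and the MMX instructions apply the same rule to every element of a packed operand at once.

```c
#include <assert.h>   /* used by the checks in the note below */
#include <stdint.h>

/* Wraparound add: the result is reduced modulo 256. */
uint8_t add_wrap_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Unsigned saturating add: results above 255 are clamped to 255. */
uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}
```

With color components 200 and 100, `assert(add_wrap_u8(200, 100) == 44)` and `assert(add_sat_u8(200, 100) == 255)` both hold: the wrapped result turns a bright pixel dark, while the saturated result stays at full intensity.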

Instruction Formats

All MMX instructions, except the EMMS instruction that uses

no operands, are formatted as follows:

INSTRUCTION mmreg1, mmreg2/mem64

The source operand (mmreg2/mem64) can be either an MMX

register or a memory location. The destination operand

(mmreg1) can only be an MMX register.

The MOVD and MOVQ instructions also have the following

acceptable formats:

MOVD mmreg1, mreg32/mem32

MOVD mreg32/mem32, mmreg1

MOVQ mem64, mmreg1

In the first format, the source operand (mreg32/mem32) can
be either an integer register or a 32-bit memory location. The
destination operand (mmreg1) can only be an MMX register.
In the second format, the source operand is an MMX register,
and the destination operand (mreg32/mem32) can be either an
integer register or a 32-bit memory location. In the third
format, the source operand is an MMX register and the
destination operand is a 64-bit memory location.

The SHIFT instructions can also utilize an immediate source

operand. It is designated as imm8.

PSRLW mmreg1, imm8
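A minimal C model of the immediate form, assuming the usual behavior of a packed word logical right shift: each of the four 16-bit words is shifted independently and zero-filled, and a count above 15 clears every word. The count rule comes from general x86 behavior rather than from this section, and the name `psrlw_model` is hypothetical.

```c
#include <assert.h>   /* used by the check in the note below */
#include <stdint.h>

/* Model of PSRLW mmreg1, imm8: shift each 16-bit word right, filling with zeros. */
uint64_t psrlw_model(uint64_t mm, uint8_t imm8)
{
    if (imm8 > 15)
        return 0;                       /* every word is shifted out entirely */
    uint64_t result = 0;
    for (unsigned i = 0; i < 4; i++) {
        uint16_t word = (uint16_t)(mm >> (16 * i));
        word = (uint16_t)(word >> imm8);
        result |= (uint64_t)word << (16 * i);
    }
    return result;
}
```

For example, `assert(psrlw_model(0x8000400020001000ULL, 4) == 0x0800040002000100ULL)` holds: no bits cross from one word into its neighbor.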


2 Programming Considerations

This chapter describes considerations for programmers writing

operating systems, compilers, and applications that utilize

MMX instructions as implemented in the AMD-K6 MMX

enhanced processor.

Feature Detection

To use the AMD-K6 processor multimedia technology, the
programmer must determine if the processor supports it.

The CPUID instruction gives programmers the ability to

determine the presence of multimedia technology on the

processor. Software must first test to see if the CPUID

instruction is supported. For a detailed description of the

CPUID instruction, see the AMD Processor Recognition

Application Note, order# 20734.

The presence of the CPUID instruction is indicated by the ID

bit (21) in the EFLAGS register. If this bit is writable, the

CPUID instruction is supported. The following code sample

shows how to test for the presence of the CPUID instruction.


pushfd ; save EFLAGS

pop eax ; store EFLAGS in EAX

mov ebx, eax ; save in EBX for later testing

xor eax, 00200000h ; toggle bit 21

push eax ; put to stack

popfd ; save changed EAX to EFLAGS

pushfd ; push EFLAGS to TOS

pop eax ; store EFLAGS in EAX

cmp eax, ebx ; see if bit 21 has changed

jz NO_CPUID ; if no change, no CPUID
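The decision made by the assembly sample can be modeled in C as a comparison of two EFLAGS images. This is a sketch only: `cpuid_present` and `EFLAGS_ID_BIT` are names chosen for this example, and the model looks at bit 21 alone, whereas the assembly compares the whole register.

```c
#include <assert.h>   /* used by the checks in the note below */
#include <stdint.h>

#define EFLAGS_ID_BIT (1u << 21)   /* 00200000h, the ID flag */

/* `original` is EFLAGS before the test; `read_back` is EFLAGS as read after
   software wrote `original ^ EFLAGS_ID_BIT` to it. CPUID is available only
   if bit 21 actually changed. */
int cpuid_present(uint32_t original, uint32_t read_back)
{
    return ((original ^ read_back) & EFLAGS_ID_BIT) != 0;
}
```

If the processor ignores the write, `read_back` equals `original` and the function returns 0, which corresponds to the jump to NO_CPUID.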

If the processor supports the CPUID instruction, the
programmer must execute standard function 0 (EAX=0). This
function returns a 12-character string that identifies the
processor’s vendor. For AMD processors, standard function 0
returns a vendor string of “AuthenticAMD”. When software sees
this string, it must follow the AMD definitions for
subsequent CPUID functions and the values returned for those
functions.

The next step is for the programmer to determine if MMX

instructions are supported. Function 1 of the CPUID

instruction provides this information. Function 1 (EAX=1) of

the AMD CPUID instruction returns the feature bits in the EDX

register. If bit 23 in the EDX register is set to 1, MMX

instructions are supported. The following code sample shows

how to test for MMX instruction support.

mov eax, 1            ; set up function 1
cpuid                 ; call the function
test edx, 00800000h   ; test bit 23
jnz YES_MM            ; multimedia technology supported

Alternatively, the extended function 1 (EAX=8000_0001h) can

be used to determine if MMX instructions are supported.

mov eax, 80000001h    ; set up extended function 1
cpuid                 ; call the function
test edx, 00800000h   ; test bit 23
jnz YES_MM            ; multimedia technology supported
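The bit test itself is easy to get wrong, because the mask for bit 23 is 00800000h in hexadecimal (8,388,608 in decimal). A small C sketch of the same check follows; the names `CPUID_FEATURE_MMX` and `mmx_supported` are assumptions made for this example, and `edx` stands for the value that CPUID function 1 left in the EDX register.

```c
#include <assert.h>   /* used by the checks in the note below */
#include <stdint.h>

#define CPUID_FEATURE_MMX (1u << 23)   /* 00800000h */

/* Return 1 if the feature bits in `edx` report MMX support, otherwise 0. */
int mmx_supported(uint32_t edx)
{
    return (edx & CPUID_FEATURE_MMX) != 0;
}
```

A feature word of 008001BFh reports MMX support, while 000001BFh does not.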


Task Switching

A task switch is an event within an operating system that
allows multiple programs to be executed in parallel. Operating
systems that use task switching are called multitasking
operating systems.

There are two types of multitasking operating systems—

cooperative and preemptive.

Cooperative Multitasking

In cooperative multitasking operating systems, applications do

not care about other tasks that may be running. Each task

assumes that it owns the machine state (processor, registers, I/O,

memory, etc.). In addition, these tasks must take care of saving

their own information (i.e., registers, stacks, states) in their own

memory areas. The cooperative multitasking operating system

does not save operating state information for the applications.

There are different types of cooperative multitasking operating

systems. Some of these operating systems perform some level of

state saves, but this state saving is not always reliable. All

software engineers programming for a cooperative multitasking

environment must save the MMX or floating-point states before

relinquishing control to another task or to the operating

system. The FSAVE and FRSTOR instructions are used to
perform this task. Figure 3 illustrates this task switching
process.

Note: Some cooperative operating systems may have API calls to

perform these tasks for the application.

Figure 3. Cooperative Task Switching

[Diagram: Task 1 executes MMX™/FP code; before the task switch to Task 2, the program must save states (FSAVE). Task 2 must restore its states (FRSTOR), executes its code and, when the code module is finished, must save states (FSAVE) and go to Task 1. Task 1 must then restore its states (FRSTOR) and continues executing code.]

Preemptive Multitasking

In preemptive multitasking operating systems like OS/2,

Windows NT™, and UNIX, the operating system handles all

state and register saves. The application programmer does not

need to save states when programming within a preemptive

multitasking environment. The preemptive multitasking

operating system sets aside a save area for each task.

In a preemptive multitasking operating system, if a task switch

occurs, the operating system sets the Control Register 0 (CR0)

Task Switch (TS) bit to 1. If the new task encounters a

floating-point or MMX instruction, an interrupt 7 (int 7, Device

Not Available) is generated. The int7 handler saves the state of

the first task and restores the state of the second task. The int7

handler sets the CR0.TS to 0 and returns to the original

floating-point or MMX instruction in the second task. Figure 4

illustrates this task switching process.

Figure 4. Preemptive Task Switching

[Diagram: Task 1 executes MMX™/FP code. On the task switch to Task 2, CR0.TS is set to 1. Task 2 executes code until it encounters MMX/FP code; because TS=1, control goes to the INT 7 handler. The INT 7 handler saves the Task 1 state, restores the Task 2 state, sets CR0.TS=0, and returns to the Task 2 MMX/FP code.]
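The deferred save described for preemptive systems can be sketched as a small C simulation. Everything here is a simplified assumption: the `cpu_model` structure, its field names, and the two functions are hypothetical, the registers are modeled as eight 64-bit values only, and the check that skips the copy when the current task already owns the registers is an addition for this example rather than something the text requires.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_TASKS 4

/* Hypothetical, simplified model of the scheme described above. */
struct cpu_model {
    int      ts;                        /* models CR0.TS */
    int      current;                   /* task currently scheduled */
    int      owner;                     /* task whose state is in the registers */
    uint64_t regs[8];                   /* live MMX/FP register contents */
    uint64_t save_area[MAX_TASKS][8];   /* per-task save areas kept by the OS */
    int      saves;                     /* number of state saves performed */
};

/* Task switch: the OS only sets TS; no register state is copied yet. */
void task_switch(struct cpu_model *c, int next)
{
    assert(next >= 0 && next < MAX_TASKS);
    c->current = next;
    c->ts = 1;
}

/* Called before every MMX or floating-point instruction. */
void before_mmx_or_fp(struct cpu_model *c)
{
    if (!c->ts)
        return;                         /* TS clear: no int 7 */
    /* Body of the int 7 handler. */
    if (c->owner != c->current) {
        memcpy(c->save_area[c->owner], c->regs, sizeof c->regs);
        memcpy(c->regs, c->save_area[c->current], sizeof c->regs);
        c->owner = c->current;
        c->saves++;
    }
    c->ts = 0;                          /* return to the faulting instruction */
}
```

In this model, a task that never executes an MMX or floating-point instruction never triggers a state save.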


Exceptions

Table 1 contains a list of exceptions that MMX instructions can

generate.

The rules for exceptions have not changed in the

implementation of MMX instructions. None of the exception

handlers need to be modified.

Note:

1. An invalid opcode exception interrupt 6 occurs if an MMX

instruction is executed on a processor that does not

support MMX instructions.

2. If a floating-point exception is pending and the processor

encounters an MMX instruction, FERR# is asserted and, if

CR0.NE = 1, an interrupt 16 is generated.

Table 1. MMX™ Instruction Exceptions

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   X     X             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In Protected Mode, CPL = 3.)


Mixing MMX™ and Floating-Point Instructions

The programmer must take care when writing code that

contains both MMX and floating-point instructions. The MMX

code modules should be separated from the floating-point code

modules. All code of one type (MMX or floating-point code)

should be grouped together as often as possible. To obtain the

highest performance, routines should not contain any

conditional branches at the end of loops that jump to code of a

different type than the code that is currently being executed.

In certain multimedia environments, floating-point and MMX

instructions may be mixed. For example, if a programmer wants

to change the viewing perspective of a three-dimensional scene,

the perspective can be changed through transformation

matrices using floating-point registers. The picture/pixel

information is integer-based and requires MMX instructions to

manipulate this information. Both MMX and floating-point

instructions are required to perform this task.

The software must clean up after itself at the end of an MMX

code module. The EMMS instruction must be used at the end of

an MMX code module to mark all floating-point registers as

empty (11=empty/invalid). In cooperative multitasking

operating systems, the EMMS instruction must be used when

switching between tasks.

Note: In some situations, experienced programmers can utilize the

MMX registers to pass information between tasks. In these

situations, the EMMS instruction is not required.

The tag bits are affected by every MMX and floating-point

instruction. After every MMX instruction except EMMS, all the

tag bits in the floating-point tag word are set to 0. When the

EMMS instruction is executed, all the tag bits in the tag word

are set to 1.
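The two tag states can be written down as constants. The sketch below assumes the standard floating-point tag word layout of eight 2-bit fields in one 16-bit word; the names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_ALL_VALID 0x0000u   /* eight 2-bit tags of 00b */
#define TAG_ALL_EMPTY 0xFFFFu   /* eight 2-bit tags of 11b */

/* Tag word after any MMX instruction other than EMMS. */
uint16_t tag_after_mmx(void)  { return TAG_ALL_VALID; }

/* Tag word after EMMS. */
uint16_t tag_after_emms(void) { return TAG_ALL_EMPTY; }

/* Extract the 2-bit tag for register `reg` (0 to 7). */
unsigned tag_of(uint16_t tag_word, unsigned reg)
{
    assert(reg < 8);
    return (tag_word >> (2 * reg)) & 0x3u;
}
```

After an MMX instruction every register reads as tag 0 (valid); after EMMS every register reads as tag 3 (empty), which is the state floating-point code expects to find.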

Prefixes

All instructions in the x86 architecture translate to a binary
value or opcode. This 1- or 2-byte opcode value is different for
each instruction. The opcode can be followed by a Mod R/M
byte, which further describes the form of the instruction and
its operands.


The x86 opcode and the Mod R/M byte can also be followed by

an SIB byte. This byte is used to describe the Scale, Index and

Base forms of 32-bit addressing.
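As an illustration of how these bytes fit together, the C sketch below assembles the register-to-register form of an MMX instruction: the 0Fh byte, the second opcode byte, and a Mod R/M byte in which mod = 11b selects a register operand. The function names are hypothetical; the field layout (2-bit mod, 3-bit reg, 3-bit r/m) is the standard x86 one, with the destination MMX register in the reg field.

```c
#include <assert.h>
#include <stdint.h>

/* Build a Mod R/M byte from its fields: mod (2 bits), reg (3 bits), r/m (3 bits). */
uint8_t modrm(unsigned mod, unsigned reg, unsigned rm)
{
    assert(mod < 4 && reg < 8 && rm < 8);
    return (uint8_t)((mod << 6) | (reg << 3) | rm);
}

/* Encode "INSTRUCTION mmreg1, mmreg2" into out[0..2].
   With mod = 11b, no SIB byte or displacement follows. */
void encode_mmx_reg_reg(uint8_t out[3], uint8_t opcode2,
                        unsigned mmreg1, unsigned mmreg2)
{
    out[0] = 0x0F;
    out[1] = opcode2;
    out[2] = modrm(3, mmreg1, mmreg2);
}
```

With the MOVQ opcode byte 6Fh, destination register 0, and source register 1, the function produces the three bytes 0F 6F C1.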

The format of the x86 instruction allows for certain prefixes to

be placed before each instruction. These prefixes indicate

different types of command overrides.

The MMX instructions follow these rules just like all the

current existing instructions. This allows for an easy

implementation into the x86 architecture. All of the rules that

apply to the x86 architecture apply to MMX instructions,

including accessing registers, memory, and I/O.

Most opcode prefixes can be used with MMX instructions. The
prefixes affect MMX instructions as follows:

- The Segment Override prefixes (2Eh/CS, 36h/SS, 3Eh/DS, 26h/ES, 64h/FS, and 65h/GS) affect MMX instructions that contain a memory operand.
- The LOCK prefix (F0h) triggers an invalid opcode exception (interrupt 6).
- The Address Size Override prefix (67h) affects MMX instructions that contain a memory operand.


3 MMX™ Instruction Set

The following MMX instruction definitions are in alphabetical

order according to the instruction mnemonics.


EMMS

mnemonic opcode description

EMMS 0F 77h Clear the MMX state

Privilege: none

Registers Affected: MMX

Flags Affected: none

Exceptions Generated:

The EMMS instruction is used to clear the MMX state following the execution of a

block of code using MMX instructions. Because the MMX registers and tag words are

shared with the floating-point unit, it is necessary to clear the state before executing

code that includes floating-point instructions.

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.


MOVD

mnemonic                    opcode   description
MOVD mmreg1, reg32/mem32    0F 6Eh   Copy a 32-bit value from the general purpose register or memory location into the MMX register
MOVD reg32/mem32, mmreg1    0F 7Eh   Copy a 32-bit value from the MMX register into the general purpose register or memory location

Privilege: none

Registers Affected: MMX

Flags Affected: none

Exceptions Generated:

The MOVD instruction moves a 32-bit data value from an MMX register to a general

purpose register or memory, or it moves the 32-bit data from a general purpose

register or memory into an MMX register. If the 32-bit data to be moved is provided by

an MMX register, the instruction moves bits 31–0 of the MMX register into the

specified register or memory location. If the 32-bit data is being moved into an MMX

register, the instruction moves the 32-bits of data into bits 31–0 of the MMX register

and fills bits 63–32 with zeros.
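Both directions of MOVD can be modeled in a few lines of C. The function names are made up for this sketch, and a `uint64_t` stands in for the MMX register.

```c
#include <assert.h>   /* used by the checks in the note below */
#include <stdint.h>

/* MOVD mmreg1, reg32/mem32: bits 31-0 receive the value, bits 63-32 become zero. */
uint64_t movd_to_mmx(uint32_t value)
{
    return (uint64_t)value;
}

/* MOVD reg32/mem32, mmreg1: only bits 31-0 of the MMX register are copied out. */
uint32_t movd_from_mmx(uint64_t mm)
{
    return (uint32_t)(mm & 0xFFFFFFFFu);
}
```

Moving DEADBEEFh into a register yields 0000_0000_DEAD_BEEFh, and moving out of a register holding 0123_4567_89AB_CDEFh yields 89AB_CDEFh.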

Related Instructions See the MOVQ instruction.

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In Protected Mode, CPL = 3.)


MOVQ

mnemonic                      opcode   description
MOVQ mmreg1, mmreg2/mem64     0F 6Fh   Copy a 64-bit value from an MMX register or memory location into an MMX register
MOVQ mmreg2/mem64, mmreg1     0F 7Fh   Copy a 64-bit value from an MMX register into an MMX register or memory location

Privilege: none

Registers Affected: MMX

Flags Affected: none

Exceptions Generated:

The MOVQ instruction moves a 64-bit data value from one MMX register to another

MMX register or memory, or it moves the 64-bit data from one MMX register or

memory to another MMX register. Copying data from one memory location to another

memory location cannot be accomplished with the MOVQ instruction.

Related Instructions See the MOVD instruction.

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In Protected Mode, CPL = 3.)

PACKSSDW

mnemonic                          opcode   description
PACKSSDW mmreg1, mmreg2/mem64     0F 6Bh   Pack with saturation signed 32-bit operands into signed 16-bit results

Privilege: none

Registers Affected: MMX

Flags Affected: none

Exceptions Generated:

The PACKSSDW instruction performs a pack and saturate operation on two signed

32-bit values in the first operand and two signed 32-bit values in the second operand.

The four signed 16-bit results are placed in the specified MMX register.

The pack operation is a data conversion. The PACKSSDW instruction converts or

packs the four signed 32-bit values into four signed 16-bit values, applying saturating

arithmetic. If the signed 32-bit value is less than –32768 (8000h), it saturates to –32768

(8000h). If the signed 32-bit value is greater than 32767 (7FFFh), it saturates to 32767

(7FFFh). All values between –32768 and 32767 are represented with their signed

16-bit value.

The first operand must be an MMX register. In addition to providing the first operand,

this MMX register is the location where the result of the pack and saturate operation

is stored. The second operand can be an MMX register or a 64-bit memory location.

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In Protected Mode, CPL = 3.)


Functional Illustration of the PACKSSDW Instruction

The following list explains the functional illustration of the PACKSSDW instruction:

- Bits 63–32 of the source operand (mmreg2/mem64) are packed into bits 63–48 of the destination operand (mmreg1). The result is saturated to the most negative 16-bit number (8000h) because the 32-bit negative source value (8000_0002h) exceeds the capacity of the signed 16-bit destination.
- Bits 31–0 of the source operand are packed into bits 47–32 of the destination operand. The result is saturated to the largest 16-bit positive number (7FFFh) because the 32-bit positive source value (0000_8000h) exceeds the capacity of the signed 16-bit destination.
- Bits 63–32 of the destination operand are packed into bits 31–16 of the destination operand. The result is not saturated because the 32-bit negative value (FFFF_8002h) does not exceed the capacity of the signed 16-bit destination.
- Bits 31–0 of the destination operand are packed into bits 15–0 of the destination operand. The result is not saturated because the 32-bit positive value (0000_01FCh) does not exceed the capacity of the signed 16-bit destination.
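The four cases in the list above can be reproduced with a short C model. It is a sketch rather than a reference implementation: `packssdw_model` and `saturate_s32_to_s16` are hypothetical names, `uint64_t` values stand in for the operands, and the casts from unsigned to signed assume the usual two's complement behavior of mainstream compilers.

```c
#include <assert.h>   /* used by the check in the note below */
#include <stdint.h>

/* Clamp a signed 32-bit value to the signed 16-bit range. */
int16_t saturate_s32_to_s16(int32_t v)
{
    if (v < -32768) return (int16_t)-32768;
    if (v > 32767)  return (int16_t)32767;
    return (int16_t)v;
}

/* Model of PACKSSDW mmreg1, mmreg2/mem64.
   dst supplies result words 0 and 1; src supplies result words 2 and 3. */
uint64_t packssdw_model(uint64_t dst, uint64_t src)
{
    int32_t in[4] = {
        (int32_t)(uint32_t)dst, (int32_t)(uint32_t)(dst >> 32),
        (int32_t)(uint32_t)src, (int32_t)(uint32_t)(src >> 32)
    };
    uint64_t result = 0;
    for (unsigned i = 0; i < 4; i++) {
        uint16_t w = (uint16_t)saturate_s32_to_s16(in[i]);
        result |= (uint64_t)w << (16 * i);
    }
    return result;
}
```

With the illustrated operands, `assert(packssdw_model(0xFFFF8002000001FCULL, 0x8000000200008000ULL) == 0x80007FFF800201FCULL)` holds, matching the result words 8000h, 7FFFh, 8002h, and 01FCh.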

Related Instructions

See the PACKSSWB instruction.

See the PACKUSWB instruction.

See the PUNPCKHWD instruction.

See the PUNPCKLWD instruction.

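[Figure: functional illustration of PACKSSDW. Source mmreg2/mem64 (bits 63–0): 8000_0002h, 0000_8000h. Destination mmreg1 before the operation: FFFF_8002h, 0000_01FCh. Result in mmreg1: 8000h, 7FFFh, 8002h, 01FCh. The result words 8000h and 7FFFh are saturated values.]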

PACKSSWB

mnemonic                       opcode  description
PACKSSWB mmreg1, mmreg2/mem64  0F 63h  Pack with saturation signed 16-bit operands into signed 8-bit results

Privilege: none

Registers Affected: MMX

Flags Affected: none

Exceptions Generated:

The PACKSSWB instruction performs a pack and saturate operation on four signed 16-bit values in the first operand and four signed 16-bit values in the second operand. The eight signed 8-bit results are placed in the specified MMX register.

The pack operation is a data conversion. The PACKSSWB instruction converts, or packs, the eight signed 16-bit values into eight signed 8-bit values, applying saturating arithmetic. If the signed 16-bit value is less than –128 (FF80h), it saturates to –128 (80h). If the signed 16-bit value is greater than 127 (007Fh), it saturates to 127 (7Fh). All values from –128 through 127 are represented by their signed 8-bit value.

The first operand must be an MMX register. In addition to providing the first operand, this MMX register is the location where the result of the pack and saturate operation is stored. The second operand can be an MMX register or a 64-bit memory location.
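The saturation rule for a single element can be sketched in a few lines of C. This is an illustrative model only, assuming a hypothetical helper name `sat_word_to_s8`; it covers one 16-bit element, not the whole 64-bit operand, and the boundary values in the check function are chosen here for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert one signed 16-bit value to the signed 8-bit
 * result byte that PACKSSWB would produce for that element. */
uint8_t sat_word_to_s8(int16_t v)
{
    if (v < -128) return 0x80u;   /* saturate to -128 */
    if (v > 127)  return 0x7Fu;   /* saturate to  127 */
    return (uint8_t)v;            /* in range: keep the low 8 bits */
}

/* Boundary checks around both ends of the signed 8-bit range. */
void sat_word_to_s8_check(void)
{
    assert(sat_word_to_s8(-129) == 0x80);  /* just below the range */
    assert(sat_word_to_s8(-128) == 0x80);  /* lowest in-range value */
    assert(sat_word_to_s8(-1)   == 0xFF);
    assert(sat_word_to_s8(0)    == 0x00);
    assert(sat_word_to_s8(127)  == 0x7F);  /* highest in-range value */
    assert(sat_word_to_s8(128)  == 0x7F);  /* just above the range */
}
```

Note that –128 and –129 both yield 80h, so the result byte alone does not reveal whether saturation occurred.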

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In Protected Mode, CPL = 3.)

Functional Illustration of the PACKSSWB Instruction

[Figure: functional illustration of PACKSSWB. Source mmreg2/mem64 (bits 63–0): 007Eh, 7F00h, EF9Dh, FF88h. Destination mmreg1 before the operation: FF02h, 0085h, 007Eh, 81CFh. Result in mmreg1: 7Eh, 7Fh, 80h, 88h, 80h, 7Fh, 7Eh, 80h. The result bytes produced from 7F00h, EF9Dh, FF02h, 0085h, and 81CFh are saturated values.]

The following list explains the functional illustration of the PACKSSWB instruction:

• Bits 63–48 of the source operand (mmreg2/mem64) are packed into bits 63–56 of the destination operand (mmreg1). The result is not saturated because the 16-bit positive source value (007Eh) does not exceed the capacity of a signed 8-bit destination.

• Bits 47–32 of the source operand are packed into bits 55–48 of the destination operand. The result is saturated to the largest positive 8-bit value (7Fh) because the 16-bit positive source value (7F00h) exceeds the capacity of a signed 8-bit destination.

• Bits 31–16 of the source operand are packed into bits 47–40 of the destination operand. The result is saturated to the most negative 8-bit value (80h) because the 16-bit negative source value (EF9Dh) exceeds the capacity of a signed 8-bit destination.

• Bits 15–0 of the source operand are packed into bits 39–32 of the destination operand. The result is not saturated because the 16-bit negative source value (FF88h) does not exceed the capacity of a signed 8-bit destination.

• Bits 63–48 of the destination operand are packed into bits 31–24 of the destination operand. The result is saturated to the most negative 8-bit value (80h) because the 16-bit negative value (FF02h) exceeds the capacity of a signed 8-bit destination.

• Bits 47–32 of the destination operand are packed into bits 23–16 of the destination operand. The result is saturated to the largest positive 8-bit value (7Fh) because the 16-bit positive value (0085h) exceeds the capacity of a signed 8-bit destination.

• Bits 31–16 of the destination operand are packed into bits 15–8 of the destination operand. The result is not saturated because the 16-bit positive value (007Eh) does not exceed the capacity of a signed 8-bit destination.

• Bits 15–0 of the destination operand are packed into bits 7–0 of the destination operand. The result is saturated to the most negative 8-bit value (80h) because the 16-bit negative value (81CFh) exceeds the capacity of a signed 8-bit destination.
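The eight cases in the list can be checked with a scalar model of the whole 64-bit operation. The C sketch below is a hedged illustration rather than a description of the hardware: `word_lane`, `sat_to_s8`, `packsswb_model`, and `packsswb_manual_example` are hypothetical names, and 64-bit integers stand in for the MMX register and the register/memory operand. The input values are the ones quoted in the list.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read 16-bit lane idx (0 = bits 15-0 ... 3 = bits 63-48)
 * of a 64-bit operand and interpret it as a signed word. */
static int word_lane(uint64_t reg, unsigned idx)
{
    int w = (int)((reg >> (16u * idx)) & 0xFFFFu);
    return (w >= 0x8000) ? w - 0x10000 : w;
}

/* Hypothetical helper: saturate a signed value to the signed 8-bit range. */
static uint8_t sat_to_s8(int v)
{
    if (v < -128) return 0x80u;
    if (v > 127)  return 0x7Fu;
    return (uint8_t)v;            /* in range: keep the low 8 bits */
}

/* Scalar model of PACKSSWB mmreg1, mmreg2/mem64.
 * dst models mmreg1 before the instruction, src models mmreg2/mem64. */
uint64_t packsswb_model(uint64_t dst, uint64_t src)
{
    uint64_t r = 0;
    for (unsigned i = 0; i < 4; i++) {
        /* The destination words fill result bytes 0-3 (bits 31-0). */
        r |= (uint64_t)sat_to_s8(word_lane(dst, i)) << (8u * i);
        /* The source words fill result bytes 4-7 (bits 63-32). */
        r |= (uint64_t)sat_to_s8(word_lane(src, i)) << (8u * (i + 4u));
    }
    return r;
}

/* Checks the values from the functional illustration. */
void packsswb_manual_example(void)
{
    uint64_t src = 0x007E7F00EF9DFF88ULL;  /* 007Eh, 7F00h, EF9Dh, FF88h */
    uint64_t dst = 0xFF020085007E81CFULL;  /* FF02h, 0085h, 007Eh, 81CFh */
    assert(packsswb_model(dst, src) == 0x7E7F8088807F7E80ULL);
}
```

Reading the expected result from the most significant byte down gives 7Eh, 7Fh, 80h, 88h, 80h, 7Fh, 7Eh, 80h, which matches the eight list items in order.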

Related Instructions

See the PACKSSDW instruction.

See the PACKUSWB instruction.

See the PUNPCKHBW instruction.

See the PUNPCKLBW instruction.

PACKUSWB

mnemonic                       opcode  description
PACKUSWB mmreg1, mmreg2/mem64  0F 67h  Pack with saturation signed 16-bit operands into unsigned 8-bit results

Privilege: none

Registers Affected: MMX

Flags Affected: none

Exceptions Generated:

The PACKUSWB instruction performs a pack and saturate operation on four signed 16-bit values in the first operand and four signed 16-bit values in the second operand. The eight unsigned 8-bit results are placed in the specified MMX register.

The pack operation is a data conversion. The PACKUSWB instruction converts, or packs, the eight signed 16-bit values into eight unsigned 8-bit values, applying saturating arithmetic. If the signed 16-bit value is a negative number, it saturates to 0 (00h). If the signed 16-bit value is greater than 255 (00FFh), it saturates to 255 (FFh). All values from 0 through 255 are represented by their unsigned 8-bit value.

The first operand must be an MMX register. In addition to providing the first operand, this MMX register is the location where the result of the pack and saturate operation is stored. The second operand can be an MMX register or a 64-bit memory location.
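The unsigned saturation described above can be modeled the same way in scalar C. This is a sketch under stated assumptions: `signed_word_lane`, `sat_to_u8`, `packuswb_model`, and `packuswb_example` are hypothetical names, 64-bit integers stand in for the operands, and the example operand values are made up here to exercise both saturation limits; they are not taken from the manual.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read 16-bit lane idx (0 = bits 15-0 ... 3 = bits 63-48)
 * of a 64-bit operand and interpret it as a signed word. */
static int signed_word_lane(uint64_t reg, unsigned idx)
{
    int w = (int)((reg >> (16u * idx)) & 0xFFFFu);
    return (w >= 0x8000) ? w - 0x10000 : w;
}

/* Hypothetical helper: saturate a signed value to the unsigned 8-bit range. */
static uint8_t sat_to_u8(int v)
{
    if (v < 0)   return 0x00u;    /* negative inputs saturate to 0     */
    if (v > 255) return 0xFFu;    /* inputs above 255 saturate to 255  */
    return (uint8_t)v;
}

/* Scalar model of PACKUSWB mmreg1, mmreg2/mem64.
 * dst models mmreg1 before the instruction, src models mmreg2/mem64. */
uint64_t packuswb_model(uint64_t dst, uint64_t src)
{
    uint64_t r = 0;
    for (unsigned i = 0; i < 4; i++) {
        /* The destination words fill result bytes 0-3 (bits 31-0). */
        r |= (uint64_t)sat_to_u8(signed_word_lane(dst, i)) << (8u * i);
        /* The source words fill result bytes 4-7 (bits 63-32). */
        r |= (uint64_t)sat_to_u8(signed_word_lane(src, i)) << (8u * (i + 4u));
    }
    return r;
}

/* Illustrative operands (not from the manual). */
void packuswb_example(void)
{
    uint64_t src = 0x80007FFF010000FFULL;  /* -32768, 32767, 256, 255 */
    uint64_t dst = 0xFFFF00000080007FULL;  /* -1, 0, 128, 127         */
    assert(packuswb_model(dst, src) == 0x00FFFFFF0000807FULL);
}
```

In this example 128 (0080h) passes through as 80h, whereas the signed variant PACKSSWB would saturate the same input to 7Fh.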

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   -     -             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In Protected Mode, CPL = 3.)

   Bisnowden,3330 Adeline st. Berkeley,Ca94703 or send to bisnowden@yahoo.com Tele 510-595-1332
send mail to bisnowden@yahoo.com with questions or comments
  about this web site.

Last modified: July 07, 2011