Intel Architecture
Software Developer’s
Manual
Volume 1:
Basic Architecture
NOTE: The Intel Architecture Software Developer’s Manual consists of
three volumes: Basic Architecture, Order Number 243190; Instruction Set
Reference, Order Number 243191; and the System Programming Guide,
Order Number 243192.
Please refer to all three volumes when evaluating your design needs.
1999
Information in this document is provided in connection with Intel products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel’s Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications.
Intel may make changes to specifications and product descriptions at any time, without notice.
Designers must not rely on the absence or characteristics of any features or instructions marked “reserved” or “undefined.” Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.
Intel’s Intel Architecture processors (e.g., Pentium®, Pentium® II, Pentium® III, and Pentium® Pro processors) may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an ordering number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's literature center at http://www.intel.com.
COPYRIGHT © INTEL CORPORATION 1999
*THIRD-PARTY BRANDS AND NAMES ARE THE PROPERTY OF THEIR RESPECTIVE OWNERS.
TABLE OF CONTENTS
CHAPTER 10
INPUT/OUTPUT
10.1. I/O PORT ADDRESSING
10.2. I/O PORT HARDWARE
10.3. I/O ADDRESS SPACE
10.3.1. Memory-Mapped I/O
10.4. I/O INSTRUCTIONS
10.5. PROTECTED-MODE I/O
10.5.1. I/O Privilege Level
10.5.2. I/O Permission Bit Map
10.6. ORDERING I/O
CHAPTER 11
PROCESSOR IDENTIFICATION AND FEATURE DETERMINATION
11.1. PROCESSOR IDENTIFICATION
11.2. IDENTIFICATION OF EARLIER INTEL ARCHITECTURE PROCESSORS
11.3. CPUID INSTRUCTION EXTENSIONS
11.3.1. Version Information
11.3.2. Control Register Extensions
APPENDIX A
EFLAGS CROSS-REFERENCE
APPENDIX B
EFLAGS CONDITION CODES
APPENDIX C
FLOATING-POINT EXCEPTIONS SUMMARY
APPENDIX D
SIMD FLOATING-POINT EXCEPTIONS SUMMARY
APPENDIX E
GUIDELINES FOR WRITING FPU EXCEPTIONS HANDLERS
E.1. ORIGIN OF THE MS-DOS* COMPATIBILITY MODE FOR HANDLING FPU EXCEPTIONS
E.2. IMPLEMENTATION OF THE MS-DOS* COMPATIBILITY MODE IN THE INTEL486™, PENTIUM®, AND P6 FAMILY PROCESSORS
E.2.1. MS-DOS* Compatibility Mode in the Intel486™ and Pentium® Processors
E.2.1.1. Basic Rules: When FERR# Is Generated
E.2.1.2. Recommended External Hardware to Support the MS-DOS* Compatibility Mode
E.2.1.3. No-Wait FPU Instructions Can Get FPU Interrupt in Window
E.2.2. MS-DOS* Compatibility Mode in the P6 Family Processors
E.3. RECOMMENDED PROTOCOL FOR MS-DOS* COMPATIBILITY HANDLERS
E.3.1. Floating-Point Exceptions and Their Defaults
E.3.2. Two Options for Handling Numeric Exceptions
E.3.2.1. Automatic Exception Handling: Using Masked Exceptions
E.3.2.2. Software Exception Handling
E.3.3. Synchronization Required for Use of FPU Exception Handlers
E.3.3.1. Exception Synchronization: What, Why and When
E.3.3.2. Exception Synchronization Examples
E.3.3.3. Proper Exception Synchronization in General
E.3.3.4. FPU Exception Handling Examples
E.3.4. Need for Storing State of IGNNE# Circuit If Using FPU and SMM
E.3.5. Considerations When FPU Shared Between Tasks
E.3.5.1. Speculatively Deferring FPU Saves, General Overview
E.3.5.2. Tracking FPU Ownership
E.3.5.3. Interaction of FPU State Saves and Floating-point Exception Association
E.3.5.4. Interrupt Routing From the Kernel
E.3.5.5. Special Considerations for Operating Systems that Support Streaming SIMD Extensions
E.4. DIFFERENCES FOR HANDLERS USING NATIVE MODE
E.4.1. Origin with the Intel 286 and Intel 287, and Intel386™ and Intel 387 Processors
E.4.2. Changes with Intel486™, Pentium®, and P6 Family Processors with CR0.NE=1
E.4.3. Considerations When FPU Shared Between Tasks Using Native Mode
APPENDIX F
GUIDELINES FOR WRITING SIMD FLOATING-POINT EXCEPTION HANDLERS
F.1. TWO OPTIONS FOR HANDLING NUMERIC EXCEPTIONS
F.2. SOFTWARE EXCEPTION HANDLING
F.3. EXCEPTION SYNCHRONIZATION
F.4. SIMD FLOATING-POINT EXCEPTIONS AND THE IEEE-754 STANDARD FOR BINARY FLOATING-POINT COMPUTATIONS
F.4.1. Floating-Point Emulation
F.4.2. Streaming SIMD Extensions Response To Floating-Point Exceptions
F.4.2.1. Numeric Exceptions
F.4.2.2. Results of Operations with NaN Operands or a NaN Result for Streaming SIMD Extensions Numeric Instructions
F.4.2.3. Condition Codes, Exception Flags, and Response for Masked and Unmasked Numeric Exceptions
F.4.3. SIMD Floating-Point Emulation Implementation Example
- CHAPTER 1
- ABOUT THIS MANUAL
- 1.1. OVERVIEW OF THE INTEL ARCHITECTURE SOFTWARE DEVELOPER’S MANUAL, VOLUME 1: BASIC ARCHITECTURE
- 1.2. OVERVIEW OF THE INTEL ARCHITECTURE SOFTWARE DEVELOPER’S MANUAL, VOLUME 2: INSTRUCTION SET REFERENCE
- 1.3. OVERVIEW OF THE INTEL ARCHITECTURE SOFTWARE DEVELOPER’S MANUAL, VOLUME 3: SYSTEM PROGRAMMING GUIDE
- 1.4. NOTATIONAL CONVENTIONS
- 1.4.1. Bit and Byte Order
- 1.4.2. Reserved Bits and Software Compatibility
- 1.4.3. Instruction Operands
- 1.4.4. Hexadecimal and Binary Numbers
- 1.4.5. Segmented Addressing
- 1.4.6. Exceptions
- 1.5. RELATED LITERATURE
- CHAPTER 2
- INTRODUCTION TO THE INTEL ARCHITECTURE
- 2.1. BRIEF HISTORY OF THE INTEL ARCHITECTURE
- 2.2. INCREASING INTEL ARCHITECTURE PERFORMANCE AND MOORE’S LAW
- 2.3. BRIEF HISTORY OF THE INTEL ARCHITECTURE FLOATING-POINT UNIT
- 2.4. INTRODUCTION TO THE P6 FAMILY PROCESSOR’S ADVANCED MICROARCHITECTURE
- 2.5. DETAILED DESCRIPTION OF THE P6 FAMILY PROCESSOR MICROARCHITECTURE
- 2.5.1. Memory Subsystem
- 2.5.2. Fetch/Decode Unit
- 2.5.3. Instruction Pool (Reorder Buffer)
- 2.5.4. Dispatch/Execute Unit
- 2.5.5. Retirement Unit
- CHAPTER 3
- BASIC EXECUTION ENVIRONMENT
- 3.1. MODES OF OPERATION
- 3.2. OVERVIEW OF THE BASIC EXECUTION ENVIRONMENT
- 3.3. MEMORY ORGANIZATION
- 3.4. MODES OF OPERATION
- 3.5. 32-BIT VS. 16-BIT ADDRESS AND OPERAND SIZES
- 3.6. REGISTERS
- 3.6.1. General-Purpose Data Registers
- 3.6.2. Segment Registers
- 3.6.3. EFLAGS Register
- 3.6.3.1. Status Flags
- 3.6.3.2. DF Flag
- 3.6.4. System Flags and IOPL Field
- 3.7. INSTRUCTION POINTER
- 3.8. OPERAND-SIZE AND ADDRESS-SIZE ATTRIBUTES
- CHAPTER 4
- PROCEDURE CALLS, INTERRUPTS, AND EXCEPTIONS
- 4.1. PROCEDURE CALL TYPES
- 4.2. STACK
- 4.2.1. Setting Up a Stack
- 4.2.2. Stack Alignment
- 4.2.3. Address-Size Attributes for Stack Accesses
- 4.2.4. Procedure Linking Information
- 4.2.4.1. Stack-Frame Base Pointer
- 4.2.4.2. Return Instruction Pointer
- 4.3. CALLING PROCEDURES USING CALL AND RET
- 4.3.1. Near CALL and RET Operation
- 4.3.2. Far CALL and RET Operation
- 4.3.3. Parameter Passing
- 4.3.3.1. Passing Parameters Through the General-Purpose Registers
- 4.3.3.2. Passing Parameters on the Stack
- 4.3.3.3. Passing Parameters in an Argument List
- 4.3.4. Saving Procedure State Information
- 4.3.5. Calls to Other Privilege Levels
- 4.3.6. CALL and RET Operation Between Privilege Levels
- 4.4. INTERRUPTS AND EXCEPTIONS
- 4.4.1. Call and Return Operation for Interrupt or Exception Handling Procedures
- 4.4.2. Calls to Interrupt or Exception Handler Tasks
- 4.4.3. Interrupt and Exception Handling in Real-Address Mode
- 4.4.4. INT n, INTO, INT 3, and BOUND Instructions
- 4.5. PROCEDURE CALLS FOR BLOCK-STRUCTURED LANGUAGES
- 4.5.1. ENTER Instruction
- 4.5.2. LEAVE Instruction
- CHAPTER 5
- DATA TYPES AND ADDRESSING MODES
- 5.1. FUNDAMENTAL DATA TYPES
- 5.1.1. Alignment of Words, Doublewords, and Quadwords
- 5.2. NUMERIC, POINTER, BIT FIELD, AND STRING DATA TYPES
- 5.2.1. Integers
- 5.2.2. Unsigned Integers
- 5.2.3. BCD Integers
- 5.2.4. Pointers
- 5.2.5. Bit Fields
- 5.2.6. Strings
- 5.2.7. Floating-Point Data Types
- 5.2.8. MMX™ Technology Data Types
- 5.2.9. Streaming SIMD Extensions Data Types
- 5.3. OPERAND ADDRESSING
- 5.3.1. Immediate Operands
- 5.3.2. Register Operands
- 5.3.3. Memory Operands
- 5.3.3.1. Specifying a Segment Selector
- 5.3.3.2. Specifying an Offset
- 5.3.3.3. Assembler and Compiler Addressing Modes
- 5.3.4. I/O Port Addressing
- CHAPTER 6
- INSTRUCTION SET SUMMARY
- 6.1. NEW INTEL ARCHITECTURE INSTRUCTIONS
- 6.1.1. New Instructions Introduced with the Streaming SIMD Extensions
- 6.1.2. New Instructions Introduced with the MMX™ Technology
- 6.1.3. New Instructions in the Pentium® Pro Processor
- 6.1.4. New Instructions in the Pentium® Processor
- 6.1.5. New Instructions in the Intel486™ Processor
- 6.2. INSTRUCTION SET LIST
- 6.2.1. Integer Instructions
- 6.2.1.1. Data Transfer Instructions
- 6.2.1.2. Binary Arithmetic Instructions
- 6.2.1.3. Decimal Arithmetic
- 6.2.1.4. Logic Instructions
- 6.2.1.5. Shift and Rotate Instructions
- 6.2.1.6. Bit and Byte Instructions
- 6.2.1.7. Control Transfer Instructions
- 6.2.1.8. String Instructions
- 6.2.1.9. Flag Control Instructions
- 6.2.1.10. Segment Register Instructions
- 6.2.1.11. Miscellaneous Instructions
- 6.2.2. MMX™ Technology Instructions
- 6.2.2.1. MMX™ Data Transfer Instructions
- 6.2.2.2. MMX™ Conversion Instructions
- 6.2.2.3. MMX™ Packed Arithmetic Instructions
- 6.2.2.4. MMX™ Comparison Instructions
- 6.2.2.5. MMX™ Logic Instructions
- 6.2.2.6. MMX™ Shift and Rotate Instructions
- 6.2.2.7. MMX™ State Management
- 6.2.3. Floating-Point Instructions
- 6.2.3.1. Data Transfer
- 6.2.3.2. Basic Arithmetic
- 6.2.3.3. Comparison
- 6.2.3.4. Transcendental
- 6.2.3.5. Load Constants
- 6.2.3.6. FPU Control
- 6.2.4. System Instructions
- 6.2.5. Streaming SIMD Extensions
- 6.2.5.1. Streaming SIMD Extensions Data Transfer Instructions
- 6.2.5.2. Streaming SIMD Extensions Conversion Instructions
- 6.2.5.3. Streaming SIMD Extensions Packed Arithmetic Instructions
- 6.2.5.4. Streaming SIMD Extensions Comparison Instructions
- 6.2.5.5. Streaming SIMD Extensions Logical Instructions
- 6.2.5.6. Streaming SIMD Extensions Data Shuffle Instructions
- 6.2.5.7. Streaming SIMD Extensions Additional SIMD-Integer Instructions
- 6.2.5.8. Streaming SIMD Extensions Cacheability Control Instructions
- 6.2.5.9. Streaming SIMD Extensions State Management Instructions
- 6.3. DATA MOVEMENT INSTRUCTIONS
- 6.3.1. General-Purpose Data Movement Instructions
- 6.3.1.1. Move Instruction
- 6.3.1.2. Conditional Move Instructions
- 6.3.1.3. Exchange Instructions
- 6.3.2. Stack Manipulation Instructions
- 6.3.2.1. Type Conversion Instructions
- 6.3.2.2. Simple Conversion
- 6.3.2.3. Move and Convert
- 6.4. BINARY ARITHMETIC INSTRUCTIONS
- 6.4.1. Addition and Subtraction Instructions
- 6.4.2. Increment and Decrement Instructions
- 6.4.3. Comparison and Sign Change Instruction
- 6.4.4. Multiplication and Divide Instructions
- 6.5. DECIMAL ARITHMETIC INSTRUCTIONS
- 6.5.1. Packed BCD Adjustment Instructions
- 6.5.2. Unpacked BCD Adjustment Instructions
- 6.6. LOGICAL INSTRUCTIONS
- 6.7. SHIFT AND ROTATE INSTRUCTIONS
- 6.7.1. Shift Instructions
- 6.7.2. Double-Shift Instructions
- 6.7.3. Rotate Instructions
- 6.8. BIT AND BYTE INSTRUCTIONS
- 6.8.1. Bit Test and Modify Instructions
- 6.8.2. Bit Scan Instructions
- 6.8.3. Byte Set on Condition Instructions
- 6.8.4. Test Instruction
- 6.9. CONTROL TRANSFER INSTRUCTIONS
- 6.9.1. Unconditional Transfer Instructions
- 6.9.1.1. Jump Instruction
- 6.9.1.2. Call and Return Instructions
- 6.9.1.3. Return From Interrupt Instruction
- 6.9.2. Conditional Transfer Instructions
- 6.9.2.1. Conditional Jump Instructions
- 6.9.2.2. Loop Instructions
- 6.9.2.3. Jump If Zero Instructions
- 6.9.3. Software Interrupts
- 6.10. STRING OPERATIONS
- 6.10.1. Repeating String Operations
- 6.11. I/O INSTRUCTIONS
- 6.12. ENTER AND LEAVE INSTRUCTIONS
- 6.13. EFLAGS INSTRUCTIONS
- 6.13.1. Carry and Direction Flag Instructions
- 6.13.2. Interrupt Flag Instructions
- 6.13.3. EFLAGS Transfer Instructions
- 6.13.4. Interrupt Flag Instructions
- 6.14. SEGMENT REGISTER INSTRUCTIONS
- 6.14.1. Segment-Register Load and Store Instructions
- 6.14.2. Far Control Transfer Instructions
- 6.14.3. Software Interrupt Instructions
- 6.14.4. Load Far Pointer Instructions
- 6.15. MISCELLANEOUS INSTRUCTIONS
- 6.15.1. Address Computation Instruction
- 6.15.2. Table Lookup Instructions
- 6.15.3. Processor Identification Instruction
- 6.15.4. No-Operation and Undefined Instructions
- CHAPTER 7
- FLOATING-POINT UNIT
- 7.1. COMPATIBILITY AND EASE OF USE OF THE INTEL ARCHITECTURE FPU
- 7.2. REAL NUMBERS AND FLOATING-POINT FORMATS
- 7.2.1. Real Number System
- 7.2.2. Floating-Point Format
- 7.2.2.1. Normalized Numbers
- 7.2.2.2. Biased Exponent
- 7.2.3. Real Number and Non-number Encodings
- 7.2.3.1. Signed Zeros
- 7.2.3.2. Normalized and Denormalized Finite Numbers
- 7.2.3.3. Signed Infinities
- 7.2.3.4. NaNs
- 7.2.4. Indefinite
- 7.3. FPU ARCHITECTURE
- 7.3.1. FPU Data Registers
- 7.3.1.1. Parameter Passing with the FPU Register Stack
- 7.3.2. FPU Status Register
- 7.3.2.1. Top of Stack (TOP) Pointer
- 7.3.2.2. Condition Code Flags
- 7.3.2.3. Exception Flags
- 7.3.2.4. Stack Fault Flag
- 7.3.3. Branching and Conditional Moves on FPU Condition Codes
- 7.3.4. FPU Control Word
- 7.3.4.1. Exception-Flag Masks
- 7.3.4.2. Precision Control Field
- 7.3.4.3. Rounding Control Field
- 7.3.5. Infinity Control Flag
- 7.3.6. FPU Tag Word
- 7.3.7. FPU Instruction and Operand (Data) Pointers
- 7.3.8. Last Instruction Opcode
- 7.3.9. Saving the FPU’s State
- 7.4. FLOATING-POINT DATA TYPES AND FORMATS
- 7.4.1. Real Numbers
- 7.4.2. Binary Integers
- 7.4.3. Decimal Integers
- 7.4.4. Unsupported Extended-Real Encodings
- 7.5. FPU INSTRUCTION SET
- 7.5.1. Escape (ESC) Instructions
- 7.5.2. FPU Instruction Operands
- 7.5.3. Data Transfer Instructions
- 7.5.4. Load Constant Instructions
- 7.5.5. Basic Arithmetic Instructions
- 7.5.6. Comparison and Classification Instructions
- 7.5.6.1. Branching on the FPU Condition Codes
- 7.5.7. Trigonometric Instructions
- 7.5.8. Pi
- 7.5.9. Logarithmic, Exponential, and Scale
- 7.5.10. Transcendental Instruction Accuracy
- 7.5.11. FPU Control Instructions
- 7.5.12. Waiting Vs. Non-waiting Instructions
- 7.5.13. Unsupported FPU Instructions
- 7.6. OPERATING ON NANS
- 7.6.1. Operating on NaNs with Streaming SIMD Extensions
- 7.6.2. Uses for Signaling NANs
- 7.6.3. Uses for Quiet NANs
- 7.7. FLOATING-POINT EXCEPTION HANDLING
- 7.7.1. Arithmetic vs. Non-arithmetic Instructions
- 7.7.2. Automatic Exception Handling
- 7.7.3. Software Exception Handling
- 7.7.3.1. Native Mode
- 7.7.3.2. MS-DOS* Compatibility Mode
- 7.7.3.3. Typical Floating-Point Exception Handler Actions
- 7.8. FLOATING-POINT EXCEPTION CONDITIONS
- 7.8.1. Invalid Operation Exception
- 7.8.1.1. Stack Overflow or Underflow Exception (#IS)
- 7.8.1.2. Invalid Arithmetic Operand Exception (#IA)
- 7.8.2. Divide-By-Zero Exception (#Z)
- 7.8.3. Denormal Operand Exception (#D)
- 7.8.4. Numeric Overflow Exception (#O)
- 7.8.5. Numeric Underflow Exception (#U)
- 7.8.6. Inexact Result (Precision) Exception (#P)
- 7.8.7. Exception Priority
- 7.9. FLOATING-POINT EXCEPTION SYNCHRONIZATION
- CHAPTER 8
- PROGRAMMING WITH THE INTEL MMX™ TECHNOLOGY
- 8.1. OVERVIEW OF THE MMX™ TECHNOLOGY PROGRAMMING ENVIRONMENT
- 8.1.1. MMX™ Registers
- 8.1.2. MMX™ Data Types
- 8.1.3. Single Instruction, Multiple Data (SIMD) Execution Model
- 8.1.4. Memory Data Formats
- 8.1.5. Data Formats for MMX™ Registers
- 8.2. MMX™ INSTRUCTION SET
- 8.2.1. Saturation Arithmetic and Wraparound Mode
- 8.2.2. Instruction Operands
- 8.3. OVERVIEW OF THE MMX™ INSTRUCTION SET
- 8.3.1. Data Transfer Instructions
- 8.3.2. Arithmetic Instructions
- 8.3.2.1. Packed Addition and Subtraction
- 8.3.2.2. Packed Multiplication
- 8.3.2.3. Packed Multiply Add
- 8.3.3. Comparison Instructions
- 8.3.4. Conversion Instructions
- 8.3.5. Logical Instructions
- 8.3.6. Shift Instructions
- 8.3.7. EMMS (Empty MMX™ State) Instruction
- 8.4. COMPATIBILITY WITH FPU ARCHITECTURE
- 8.4.1. MMX™ Instructions and the Floating-Point Tag Word
- 8.4.2. Effect of Instruction Prefixes on MMX™ Instructions
- 8.5. WRITING APPLICATIONS WITH MMX™ CODE
- 8.5.1. Detecting Support for MMX™ Technology Using the CPUID Instruction
- 8.5.2. Using the EMMS Instruction
- 8.5.3. Interfacing with MMX™ Code
- 8.5.4. Writing Code with MMX™ and Floating-Point Instructions
- 8.5.4.1. Recommendations and Guidelines
- 8.5.5. Using MMX™ Code in a Multitasking Operating System Environment
- 8.5.5.1. Cooperative Multitasking Operating System
- 8.5.5.2. Preemptive Multitasking Operating System
- 8.5.6. Exception Handling in MMX™ Code
- 8.5.7. Register Mapping
- CHAPTER 9
- PROGRAMMING WITH THE STREAMING SIMD EXTENSIONS
- 9.1. OVERVIEW OF THE STREAMING SIMD EXTENSIONS
- 9.1.1. SIMD Floating-Point Registers
- 9.1.2. SIMD Floating-Point Data Types
- 9.1.3. Single Instruction, Multiple Data (SIMD) Execution Model
- 9.1.4. Pentium® III Processor Single Precision Floating-Point Format
- 9.1.5. Memory Data Formats
- 9.1.6. SIMD Floating-Point Register Data Formats
- 9.1.7. SIMD Floating-Point Control/Status Register
- 9.1.8. Rounding Control Field
- 9.1.9. Flush-To-Zero
- 9.2. STREAMING SIMD EXTENSIONS SET
- 9.2.1. Instruction Operands
- 9.3. OVERVIEW OF THE STREAMING SIMD EXTENSIONS SET
- 9.3.1. Data Movement Instructions
- 9.3.2. Arithmetic Instructions
- 9.3.2.1. Packed/Scalar Addition and Subtraction
- 9.3.2.2. Packed/Scalar Multiplication and Division
- 9.3.2.3. Packed/Scalar Square Root
- 9.3.2.4. Packed Maximum/Minimum
- 9.3.3. Comparison Instructions
- 9.3.4. Conversion Instructions
- 9.3.5. Logical Instructions
- 9.3.6. Additional SIMD Integer Instructions
- 9.3.7. Shuffle Instructions
- 9.3.8. State Management Instructions
- 9.3.9. Cacheability Control Instructions
- 9.4. COMPATIBILITY WITH FPU ARCHITECTURE
- 9.4.1. Effect of Instruction Prefixes on Streaming SIMD Extensions
- 9.5. WRITING APPLICATIONS WITH STREAMING SIMD EXTENSIONS CODE
- 9.5.1. Detecting Support for Streaming SIMD Extensions Using the CPUID Instruction
- 9.5.2. Interfacing with Streaming SIMD Extensions Procedures and Functions
- 9.5.3. Writing Code with MMX™, Floating-Point, and Streaming SIMD Extensions
- 9.5.3.1. Cacheability Hint Instructions
- 9.5.3.2. Recommendations and Guidelines
- 9.5.4. Using Streaming SIMD Extensions Code in a Multitasking Operating System Environment
- 9.5.4.1. Cooperative Multitasking Operating System
- 9.5.4.2. Preemptive Multitasking Operating System
- 9.5.5. Exception Handling in Streaming SIMD Extensions
- TABLE OF FIGURES
- Figure 1-1. Bit and Byte Order
- Figure 2-1. The Processing Units in the P6 Family Processor Microarchitecture and Their Interface with the Memory Subsystem
- Figure 2-2. Functional Block Diagram of the P6 Family Processor Microarchitecture
- Figure 3-1. P6 Family Processor Basic Execution Environment
- Figure 3-2. Three Memory Management Models
- Figure 3-3. Application Programming Registers
- Figure 3-4. Alternate General-Purpose Register Names
- Figure 3-5. Use of Segment Registers for Flat Memory Model
- Figure 3-6. Use of Segment Registers in Segmented Memory Model
- Figure 3-7. EFLAGS Register
- Figure 4-1. Stack Structure
- Figure 4-2. Stack on Near and Far Calls
- Figure 4-3. Protection Rings
- Figure 4-4. Stack Switch on a Call to a Different Privilege Level
- Figure 4-5. Stack Usage on Transfers to Interrupt and Exception Handling Routines
- Figure 4-6. Nested Procedures
- Figure 4-7. Stack Frame after Entering the MAIN Procedure
- Figure 4-8. Stack Frame after Entering Procedure A
- Figure 4-9. Stack Frame after Entering Procedure B
- Figure 4-10. Stack Frame after Entering Procedure C
- Figure 5-1. Fundamental Data Types
- Figure 5-2. SIMD Floating-Point Data Type
- Figure 5-3. Bytes, Words, Doublewords and Quadwords in Memory
- Figure 5-4. Numeric, Pointer, and Bit Field Data Types
- Figure 5-5. Memory Operand Address
- Figure 5-6. Offset (or Effective Address) Computation
- Figure 6-1. Operation of the PUSH Instruction
- Figure 6-2. Operation of the PUSHA Instruction
- Figure 6-3. Operation of the POP Instruction
- Figure 6-4. Operation of the POPA Instruction
- Figure 6-5. Sign Extension
- Figure 6-6. SHL/SAL Instruction Operation
- Figure 6-7. SHR Instruction Operation
- Figure 6-8. SAR Instruction Operation
- Figure 6-9. SHLD and SHRD Instruction Operations
- Figure 6-10. ROL, ROR, RCL, and RCR Instruction Operations
- Figure 6-11. Flags Affected by the PUSHF, POPF, PUSHFD, and POPFD instructions
- Figure 7-1. Binary Real Number System
- Figure 7-2. Binary Floating-Point Format
- Figure 7-3. Real Numbers and NaNs
- Figure 7-4. Relationship Between the Integer Unit and the FPU
- Figure 7-5. FPU Execution Environment
- Figure 7-6. FPU Data Register Stack
- Figure 7-7. Example FPU Dot Product Computation
- Figure 7-8. FPU Status Word
- Figure 7-9. Moving the FPU Condition Codes to the EFLAGS Register
- Figure 7-10. FPU Control Word
- Figure 7-11. FPU Tag Word
- Figure 7-12. Contents of FPU Opcode Registers
- Figure 7-13. Protected Mode FPU State Image in Memory, 32-Bit Format
- Figure 7-14. Real Mode FPU State Image in Memory, 32-Bit Format
- Figure 7-15. Protected Mode FPU State Image in Memory, 16-Bit Format
- Figure 7-16. Real Mode FPU State Image in Memory, 16-Bit Format
- Figure 7-17. Floating-Point Unit Data Type Formats
- Figure 8-1. MMX™ Register Set
- Figure 8-2. MMX™ Data Types
- Figure 8-3. Eight Packed Bytes in Memory (at address 1000H)
- Figure 9-1. SIMD Floating-Point Registers
- Figure 9-2. Packed Single-FP
- Figure 9-3. Four Packed FP Data in Memory (at address 1000H)
- Figure 9-4. SIMD Floating-Point Control/Status Register Format
- Figure 9-5. Packed Operations
- Figure 9-6. Scalar Operations
- Figure 9-7. Packed Shuffle Operation
- Figure 9-8. Unpack High Operation
- Figure 9-9. Unpack Low Operation
- Figure 10-1. Memory-Mapped I/O
- Figure 10-2. I/O Permission Bit Map
- Figure 11-1. EAX Return Values
- Figure 11-2. CPUID Feature Field Information Bits
- Figure 11-3. CR4 Register Extensions
- Figure E-1. Recommended Circuit for MS-DOS* Compatibility FPU Exception Handling
- Figure E-2. Behavior of Signals During FPU Exception Handling
- Figure E-3. Timing of Receipt of External Interrupt
- Figure E-4. Arithmetic Example Using Infinity
- Figure E-5. General Program Flow for DNA Exception Handler
- Figure E-6. Program Flow for a Numeric Exception Dispatch Routine
- Figure F-1. Control Flow for Handling Unmasked Floating-Point Exceptions
- TABLE OF TABLES
- Table 2-1. Processor Performance Over Time and Other Intel Architecture Key Features
- Table 3-1. Effective Operand- and Address-Size Attributes
- Table 4-1. Exceptions and Interrupts
- Table 5-1. Default Segment Selection Rules
- Table 6-1. Move Instruction Operations
- Table 6-2. Conditional Move Instructions
- Table 6-3. Bit Test and Modify Instructions
- Table 6-4. Conditional Jump Instructions
- Table 6-5. Information Provided by the CPUID Instruction
- Table 7-1. Real Number Notation
- Table 7-2. Denormalization Process
- Table 7-3. FPU Condition Code Interpretation
- Table 7-4. Precision Control Field (PC)
- Table 7-5. Rounding Control Field (RC)
- Table 7-6. Rounding of Positive Numbers with Masked Overflow
- Table 7-7. Rounding of Negative Numbers with Masked Overflow
- Table 7-8. Length, Precision, and Range of FPU Data Types
- Table 7-9. Real Number and NaN Encodings
- Table 7-10. Binary Integer Encodings
- Table 7-11. Packed Decimal Integer Encodings
- Table 7-12. Unsupported Extended-Real Encodings
- Table 7-13. Data Transfer Instructions
- Table 7-14. Floating-Point Conditional Move Instructions
- Table 7-15. Setting of FPU Condition Code Flags for Real Number Comparisons
- Table 7-16. Setting of EFLAGS Status Flags for Real Number Comparisons
- Table 7-17. TEST Instruction Constants for Conditional Branching
- Table 7-18. Rules for Generating QNaNs
- Table 7-19. Results of Operations with NaN Operands
- Table 7-20. Arithmetic and Non-arithmetic Instructions
- Table 7-21. Invalid Arithmetic Operations and the Masked Responses to Them
- Table 7-22. Divide-By-Zero Conditions and the Masked Responses to Them
- Table 7-23. Masked Responses to Numeric Overflow
- Table 8-1. Data Range Limits for Saturation
- Table 8-2. MMX™ Instruction Set Summary
- Table 8-3. Effect of Prefixes on MMX™ Instructions
- Table 9-1. Precision and Range of SIMD Floating-point Datatype
- Table 9-2. Real Number and NaN Encodings
- Table 9-3. Rounding Control Field (RC)
- Table 9-4. Streaming SIMD Extensions Behavior with Prefixes
- Table 9-5. SIMD Integer Instructions Behavior with Prefixes
- Table 9-6. Cacheability Control Instruction Behavior with Prefixes
- Table 9-7. Cache Hints
- Table 10-1. I/O Instruction Serialization
- Table 11-1. EAX Input Value and CPUID Return Values
- Table 11-2. New P6-Family Processor Feature Information Returned by CPUID in EDX
- Table A-1. EFLAGS Cross-Reference
- Table B-1. EFLAGS Condition Codes
- Table C-1. Floating-Point Exceptions Summary
- Table D-1. Streaming SIMD Extensions Instruction Set Summary
- Table F-1. ADDPS, ADDSS, SUBPS, SUBSS, MULPS, MULSS, DIVPS, DIVSS
- Table F-2. CMPPS.EQ, CMPSS.EQ, CMPPS.ORD, CMPSS.ORD
- Table F-3. CMPPS.NEQ, CMPSS.NEQ, CMPPS.UNORD, CMPSS.UNORD
- Table F-4. CMPPS.LT, CMPSS.LT, CMPPS.LE, CMPSS.LE
- Table F-5. CMPPS.NLT, CMPSS.NLT, CMPPS.NLE, CMPSS.NLE
- Table F-6. COMISS
- Table F-7. UCOMISS
- Table F-8. CVTPS2PI, CVTSS2SI, CVTTPS2PI, CVTTSS2SI
- Table F-9. MAXPS, MAXSS, MINPS, MINSS
- Table F-10. SQRTPS, SQRTSS
- Table F-11. #I - Invalid Operations
- Table F-12. #Z - Divide-by-Zero
- Table F-13. #D - Denormal Operand
- Table F-14. #O - Numeric Overflow
- Table F-15. #U - Numeric Underflow
- Table F-16. #P - Inexact Result (Precision)
CHAPTER 1
ABOUT THIS MANUAL
The Intel Architecture Software Developer’s Manual, Volume 1: Basic Architecture (Order
Number 243190) is part of a three-volume set that describes the architecture and programming
environment of all Intel Architecture (IA) processors. The other two volumes in this set are:
- The Intel Architecture Software Developer’s Manual, Volume 2: Instruction Set Reference
(Order Number 243191).
- The Intel Architecture Software Developer’s Manual, Volume 3: System Programming
Guide (Order Number 243192).
The Intel Architecture Software Developer’s Manual, Volume 1, describes the basic architecture
and programming environment of an IA processor; the Intel Architecture Software Developer’s
Manual, Volume 2, describes the instruction set of the processor and the opcode structure. These
two volumes are aimed at application programmers who are writing programs to run under
existing operating systems or executives. The Intel Architecture Software Developer’s Manual,
Volume 3 describes the operating-system support environment of an IA processor, including
memory management, protection, task management, interrupt and exception handling, and
system management mode. It also provides IA processor compatibility information. This
volume is aimed at operating-system and BIOS designers and programmers.
1.1. OVERVIEW OF THE INTEL ARCHITECTURE SOFTWARE
DEVELOPER’S MANUAL, VOLUME 1: BASIC ARCHITECTURE
The contents of this manual are as follows:
Chapter 1 — About This Manual. Gives an overview of all three volumes of the Intel
Architecture Software Developer’s Manual. It also describes the notational conventions
in these manuals and lists related Intel manuals and documentation of interest to
programmers and hardware designers.
Chapter 2 — Introduction to the Intel Architecture. Introduces the IA and the families of
Intel processors that are based on this architecture. It also gives an overview of the common
features found in these processors and a brief history of the IA.
Chapter 3 — Basic Execution Environment. Introduces the models of memory organization
and describes the register set used by applications.
Chapter 4 — Procedure Calls, Interrupts, and Exceptions. Describes the procedure stack
and the mechanisms provided for making procedure calls and for servicing interrupts and
exceptions.
Chapter 5 — Data Types and Addressing Modes. Describes the data types and addressing
modes recognized by the processor.
Chapter 6 — Instruction Set Summary. Gives an overview of all the IA instructions except
those executed by the processor’s floating-point unit. The instructions are presented in
functionally related groups.
Chapter 7 — Floating-Point Unit. Describes the IA floating-point unit, including the floating-
point registers and data types; gives an overview of the floating-point instruction set; and
describes the processor’s floating-point exception conditions.
Chapter 8 — Programming with Intel MMX™ Technology. Describes the Intel MMX™
technology, including MMX™ registers and data types, and gives an overview of the MMX™
instruction set.
Chapter 9 — Programming with the Streaming SIMD Extensions. Describes the Intel
Streaming SIMD Extensions, including the registers and data types.
Chapter 10 — Input/Output. Describes the processor’s I/O architecture, including I/O port
addressing, the I/O instructions, and the I/O protection mechanism.
Chapter 11 — Processor Identification and Feature Determination. Describes how to
determine the CPU type and the features that are available in the processor.
Appendix A — EFLAGS Cross-Reference. Summarizes how the IA instructions affect the
flags in the EFLAGS register.
Appendix B — EFLAGS Condition Codes. Summarizes how the conditional jump, move, and
byte set on condition code instructions use the condition code flags (OF, CF, ZF, SF, and PF) in
the EFLAGS register.
Appendix C — Floating-Point Exceptions Summary. Summarizes the exceptions that can be
raised by floating-point instructions.
Appendix D — SIMD Floating-Point Exceptions Summary. Provides the Streaming SIMD
Extensions mnemonics, and the exceptions that each instruction can cause.
Appendix E — Guidelines for Writing FPU Exception Handlers. Describes how to design
and write MS-DOS* compatible exception handling facilities for FPU and SIMD floating-point
exceptions, including both software and hardware requirements and assembly-language code
examples. This appendix also describes general techniques for writing robust FPU exception
handlers.
Appendix F — Guidelines for Writing SIMD-FP Exception Handlers. Provides guidelines
for the Streaming SIMD Extensions instructions that can generate numeric (floating-point)
exceptions, and gives an overview of the necessary support for handling such exceptions.
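
The Appendix B entry above names the condition code flags (OF, CF, ZF, SF, and PF) that conditional instructions test. As a rough illustration only, the following Python sketch models how those five flags would come out of an 8-bit subtraction, as a CMP-style comparison would compute them, and how a few common conditions are then derived from them. The function names flags_after_sub8 and condition are hypothetical, and the 8-bit operand width is an assumption made to keep the example small; Appendix B remains the authoritative reference.

```python
def flags_after_sub8(a: int, b: int) -> dict:
    """Model the status flags after an 8-bit a - b (illustrative sketch only)."""
    a &= 0xFF
    b &= 0xFF
    result = (a - b) & 0xFF

    def sign(v: int) -> int:
        # Most significant bit of an 8-bit value.
        return (v >> 7) & 1

    return {
        # Carry: set when the unsigned subtraction needs a borrow.
        "CF": int(a < b),
        # Zero: set when the result is zero.
        "ZF": int(result == 0),
        # Sign: copy of the result's most significant bit.
        "SF": sign(result),
        # Overflow: operands differ in sign and the result's sign differs from a's.
        "OF": int(sign(a) != sign(b) and sign(result) != sign(a)),
        # Parity: set when the low byte of the result has an even number of 1 bits.
        "PF": int(bin(result).count("1") % 2 == 0),
    }


def condition(name: str, f: dict) -> bool:
    """Evaluate a few condition codes from a flags dictionary."""
    table = {
        "E": f["ZF"] == 1,                      # equal
        "NE": f["ZF"] == 0,                     # not equal
        "B": f["CF"] == 1,                      # below (unsigned less than)
        "A": f["CF"] == 0 and f["ZF"] == 0,     # above (unsigned greater than)
        "L": f["SF"] != f["OF"],                # less (signed)
        "G": f["ZF"] == 0 and f["SF"] == f["OF"],  # greater (signed)
    }
    return table[name]


if __name__ == "__main__":
    # 80H is 128 unsigned but -128 signed, so "below" and "less" disagree.
    flags = flags_after_sub8(0x80, 0x01)
    print(flags, condition("B", flags), condition("L", flags))
```

Comparing 80H with 01H shows why both signed and unsigned conditions are provided: the same flag values answer "below" (unsigned) and "less" (signed) differently.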