Intel Processors: L1 Cache, Fetch/Decode Unit, and FPU Architecture, Study notes of Operating Systems

University of Nebraska–Lincoln (UNL), Operating Systems

Covers Intel Architecture processors, their design defects (errata), and the addition of an L1 cache to increase the instruction execution rate. It also covers the fetch/decode unit, FPU architecture, and exception handling in detail, and describes how the Intel P6 family processor microarchitecture incorporates two cache levels and uses speculative execution as part of dynamic execution.

Typology: Study notes

Pre 2010

Uploaded on 08/30/2009

koofers-user-4tm-1 🇺🇸

10 documents

Intel Architecture Software Developer’s Manual

Volume 1: Basic Architecture

NOTE: The Intel Architecture Software Developer’s Manual consists of three volumes: Basic Architecture, Order Number 243190; Instruction Set Reference, Order Number 243191; and the System Programming Guide, Order Number 243192. Please refer to all three volumes when evaluating your design needs.

1999

Information in this document is provided in connection with Intel products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel’s Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications.

Intel may make changes to specifications and product descriptions at any time, without notice.

Designers must not rely on the absence or characteristics of any features or instructions marked “reserved” or “undefined.” Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.

Intel’s Intel Architecture processors (e.g., Pentium®, Pentium® II, Pentium® III, and Pentium® Pro processors) may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.

Copies of documents which have an ordering number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's literature center at http://www.intel.com.

CHAPTER 10 INPUT/OUTPUT

10.1. I/O PORT ADDRESSING
10.2. I/O PORT HARDWARE
10.3. I/O ADDRESS SPACE
10.3.1. Memory-Mapped I/O
10.4. I/O INSTRUCTIONS
10.5. PROTECTED-MODE I/O
10.5.1. I/O Privilege Level
10.5.2. I/O Permission Bit Map
10.6. ORDERING I/O

CHAPTER 11 PROCESSOR IDENTIFICATION AND FEATURE DETERMINATION

11.1. PROCESSOR IDENTIFICATION
11.2. IDENTIFICATION OF EARLIER INTEL ARCHITECTURE PROCESSORS
11.3. CPUID INSTRUCTION EXTENSIONS
11.3.1. Version Information
11.3.2. Control Register Extensions

APPENDIX A EFLAGS CROSS-REFERENCE

APPENDIX B EFLAGS CONDITION CODES

APPENDIX C FLOATING-POINT EXCEPTIONS SUMMARY

APPENDIX D SIMD FLOATING-POINT EXCEPTIONS SUMMARY

APPENDIX E GUIDELINES FOR WRITING FPU EXCEPTIONS HANDLERS

E.1. ORIGIN OF THE MS-DOS* COMPATIBILITY MODE FOR HANDLING FPU EXCEPTIONS
E.2. IMPLEMENTATION OF THE MS-DOS* COMPATIBILITY MODE IN THE INTEL486™, PENTIUM®, AND P6 FAMILY PROCESSORS
E.2.1. MS-DOS* Compatibility Mode in the Intel486™ and Pentium® Processors
E.2.1.1. Basic Rules: When FERR# Is Generated
E.2.1.2. Recommended External Hardware to Support the MS-DOS* Compatibility Mode
E.2.1.3. No-Wait FPU Instructions Can Get FPU Interrupt in Window
E.2.2. MS-DOS* Compatibility Mode in the P6 Family Processors
E.3. RECOMMENDED PROTOCOL FOR MS-DOS* COMPATIBILITY HANDLERS
E.3.1. Floating-Point Exceptions and Their Defaults
E.3.2. Two Options for Handling Numeric Exceptions
E.3.2.1. Automatic Exception Handling: Using Masked Exceptions
E.3.2.2. Software Exception Handling
E.3.3. Synchronization Required for Use of FPU Exception Handlers
E.3.3.1. Exception Synchronization: What, Why and When
E.3.3.2. Exception Synchronization Examples
E.3.3.3. Proper Exception Synchronization in General
E.3.3.4. FPU Exception Handling Examples
E.3.4. Need for Storing State of IGNNE# Circuit If Using FPU and SMM
E.3.5. Considerations When FPU Shared Between Tasks
E.3.5.1. Speculatively Deferring FPU Saves, General Overview
E.3.5.2. Tracking FPU Ownership
E.3.5.3. Interaction of FPU State Saves and Floating-point Exception Association
E.3.5.4. Interrupt Routing From the Kernel
E.3.5.5. Special Considerations for Operating Systems that Support Streaming SIMD Extensions
E.4. DIFFERENCES FOR HANDLERS USING NATIVE MODE
E.4.1. Origin with the Intel 286 and Intel 287, and Intel386™ and Intel 387 Processors
E.4.2. Changes with Intel486™, Pentium®, and P6 Family Processors with CR0.NE=1
E.4.3. Considerations When FPU Shared Between Tasks Using Native Mode

APPENDIX F GUIDELINES FOR WRITING SIMD FLOATING-POINT EXCEPTION HANDLERS

F.1. TWO OPTIONS FOR HANDLING NUMERIC EXCEPTIONS
F.2. SOFTWARE EXCEPTION HANDLING
F.3. EXCEPTION SYNCHRONIZATION
F.4. SIMD FLOATING-POINT EXCEPTIONS AND THE IEEE-754 STANDARD FOR BINARY FLOATING-POINT COMPUTATIONS
F.4.1. Floating-Point Emulation
F.4.2. Streaming SIMD Extensions Response To Floating-Point Exceptions
F.4.2.1. Numeric Exceptions
F.4.2.2. Results of Operations with NaN Operands or a NaN Result for Streaming SIMD Extensions Numeric Instructions
F.4.2.3. Condition Codes, Exception Flags, and Response for Masked and Unmasked Numeric Exceptions
F.4.3. SIMD Floating-Point Emulation Implementation Example

TABLE OF CONTENTS
CHAPTER 1 ABOUT THIS MANUAL
1.1. OVERVIEW OF THE INTEL ARCHITECTURE SOFTWARE DEVELOPER’S MANUAL, VOLUME 1: BASIC ARCHITECTURE
1.2. OVERVIEW OF THE INTEL ARCHITECTURE SOFTWARE DEVELOPER’S MANUAL, VOLUME 2: INSTRUCTION SET REFERENCE
1.3. OVERVIEW OF THE INTEL ARCHITECTURE SOFTWARE DEVELOPER’S MANUAL, VOLUME 3: SYSTEM PROGRAMMING GUIDE
1.4. NOTATIONAL CONVENTIONS
1.4.1. Bit and Byte Order
1.4.2. Reserved Bits and Software Compatibility
1.4.3. Instruction Operands
1.4.4. Hexadecimal and Binary Numbers
1.4.5. Segmented Addressing
1.4.6. Exceptions
1.5. RELATED LITERATURE
CHAPTER 2 INTRODUCTION TO THE INTEL ARCHITECTURE
2.1. BRIEF HISTORY OF THE INTEL ARCHITECTURE
2.2. INCREASING INTEL ARCHITECTURE PERFORMANCE AND MOORE’S LAW
2.3. BRIEF HISTORY OF THE INTEL ARCHITECTURE FLOATING-POINT UNIT
2.4. INTRODUCTION TO THE P6 FAMILY PROCESSOR’S ADVANCED MICROARCHITECTURE
2.5. DETAILED DESCRIPTION OF THE P6 FAMILY PROCESSOR MICROARCHITECTURE
2.5.1. Memory Subsystem
2.5.2. Fetch/Decode Unit
2.5.3. Instruction Pool (Reorder Buffer)
2.5.4. Dispatch/Execute Unit
2.5.5. Retirement Unit
CHAPTER 3 BASIC EXECUTION ENVIRONMENT
3.1. MODES OF OPERATION
3.2. OVERVIEW OF THE BASIC EXECUTION ENVIRONMENT
3.3. MEMORY ORGANIZATION
3.4. MODES OF OPERATION
3.5. 32-BIT VS. 16-BIT ADDRESS AND OPERAND SIZES
3.6. REGISTERS
3.6.1. General-Purpose Data Registers
3.6.2. Segment Registers
3.6.3. EFLAGS Register
3.6.3.1. Status Flags
3.6.3.2. DF Flag
3.6.4. System Flags and IOPL Field
3.7. INSTRUCTION POINTER
3.8. OPERAND-SIZE AND ADDRESS-SIZE ATTRIBUTES
CHAPTER 4 PROCEDURE CALLS, INTERRUPTS, AND EXCEPTIONS
4.1. PROCEDURE CALL TYPES
4.2. STACK
4.2.1. Setting Up a Stack
4.2.2. Stack Alignment
4.2.3. Address-Size Attributes for Stack Accesses
4.2.4. Procedure Linking Information
4.2.4.1. Stack-Frame Base Pointer
4.2.4.2. Return Instruction Pointer
4.3. CALLING PROCEDURES USING CALL AND RET
4.3.1. Near CALL and RET Operation
4.3.2. Far CALL and RET Operation
4.3.3. Parameter Passing
4.3.3.1. Passing Parameters Through the General-Purpose Registers
4.3.3.2. Passing Parameters on the Stack
4.3.3.3. Passing Parameters in an Argument List
4.3.4. Saving Procedure State Information
4.3.5. Calls to Other Privilege Levels
4.3.6. CALL and RET Operation Between Privilege Levels
4.4. INTERRUPTS AND EXCEPTIONS
4.4.1. Call and Return Operation for Interrupt or Exception Handling Procedures
4.4.2. Calls to Interrupt or Exception Handler Tasks
4.4.3. Interrupt and Exception Handling in Real-Address Mode
4.4.4. INT n, INTO, INT 3, and BOUND Instructions
4.5. PROCEDURE CALLS FOR BLOCK-STRUCTURED LANGUAGES
4.5.1. ENTER Instruction
4.5.2. LEAVE Instruction
CHAPTER 5 DATA TYPES AND ADDRESSING MODES
5.1. FUNDAMENTAL DATA TYPES
5.1.1. Alignment of Words, Doublewords, and Quadwords
5.2. NUMERIC, POINTER, BIT FIELD, AND STRING DATA TYPES
5.2.1. Integers
5.2.2. Unsigned Integers
5.2.3. BCD Integers
5.2.4. Pointers
5.2.5. Bit Fields
5.2.6. Strings
5.2.7. Floating-Point Data Types
5.2.8. MMX™ Technology Data Types
5.2.9. Streaming SIMD Extensions Data Types
5.3. OPERAND ADDRESSING
5.3.1. Immediate Operands
5.3.2. Register Operands
5.3.3. Memory Operands
5.3.3.1. Specifying a Segment Selector
5.3.3.2. Specifying an Offset
5.3.3.3. Assembler and Compiler Addressing Modes
5.3.4. I/O Port Addressing
CHAPTER 6 INSTRUCTION SET SUMMARY
6.1. NEW INTEL ARCHITECTURE INSTRUCTIONS
6.1.1. New Instructions Introduced with the Streaming SIMD Extensions
6.1.2. New Instructions Introduced with the MMX™ Technology
6.1.3. New Instructions in the Pentium® Pro Processor
6.1.4. New Instructions in the Pentium® Processor
6.1.5. New Instructions in the Intel486™ Processor
6.2. INSTRUCTION SET LIST
6.2.1. Integer Instructions
6.2.1.1. Data Transfer Instructions
6.2.1.2. Binary Arithmetic Instructions
6.2.1.3. Decimal Arithmetic
6.2.1.4. Logic Instructions
6.2.1.5. Shift and Rotate Instructions
6.2.1.6. Bit and Byte Instructions
6.2.1.7. Control Transfer Instructions
6.2.1.8. String Instructions
6.2.1.9. Flag Control Instructions
6.2.1.10. Segment Register Instructions
6.2.1.11. Miscellaneous Instructions
6.2.2. MMX™ Technology Instructions
6.2.2.1. MMX™ Data Transfer Instructions
6.2.2.2. MMX™ Conversion Instructions
6.2.2.3. MMX™ Packed Arithmetic Instructions
6.2.2.4. MMX™ Comparison Instructions
6.2.2.5. MMX™ Logic Instructions
6.2.2.6. MMX™ Shift and Rotate Instructions
6.2.2.7. MMX™ State Management
6.2.3. Floating-Point Instructions
6.2.3.1. Data Transfer
6.2.3.2. Basic Arithmetic
6.2.3.3. Comparison
6.2.3.4. Transcendental
6.2.3.5. Load Constants
6.2.3.6. FPU Control
6.2.4. System Instructions
6.2.5. Streaming SIMD Extensions
6.2.5.1. Streaming SIMD Extensions Data Transfer Instructions
6.2.5.2. Streaming SIMD Extensions Conversion Instructions
6.2.5.3. Streaming SIMD Extensions Packed Arithmetic Instructions
6.2.5.4. Streaming SIMD Extensions Comparison Instructions
6.2.5.5. Streaming SIMD Extensions Logical Instructions
6.2.5.6. Streaming SIMD Extensions Data Shuffle Instructions
6.2.5.7. Streaming SIMD Extensions Additional SIMD-Integer Instructions
6.2.5.8. Streaming SIMD Extensions Cacheability Control Instructions
6.2.5.9. Streaming SIMD Extensions State Management Instructions
6.3. DATA MOVEMENT INSTRUCTIONS
6.3.1. General-Purpose Data Movement Instructions
6.3.1.1. Move Instruction
6.3.1.2. Conditional Move Instructions
6.3.1.3. Exchange Instructions
6.3.2. Stack Manipulation Instructions
6.3.2.1. Type Conversion Instructions
6.3.2.2. Simple Conversion
6.3.2.3. Move and Convert
6.4. BINARY ARITHMETIC INSTRUCTIONS
6.4.1. Addition and Subtraction Instructions
6.4.2. Increment and Decrement Instructions
6.4.3. Comparison and Sign Change Instruction
6.4.4. Multiplication and Divide Instructions
6.5. DECIMAL ARITHMETIC INSTRUCTIONS
6.5.1. Packed BCD Adjustment Instructions
6.5.2. Unpacked BCD Adjustment Instructions
6.6. LOGICAL INSTRUCTIONS
6.7. SHIFT AND ROTATE INSTRUCTIONS
6.7.1. Shift Instructions
6.7.2. Double-Shift Instructions
6.7.3. Rotate Instructions
6.8. BIT AND BYTE INSTRUCTIONS
6.8.1. Bit Test and Modify Instructions
6.8.2. Bit Scan Instructions
6.8.3. Byte Set on Condition Instructions
6.8.4. Test Instruction
6.9. CONTROL TRANSFER INSTRUCTIONS
6.9.1. Unconditional Transfer Instructions
6.9.1.1. Jump Instruction
6.9.1.2. Call and Return Instructions
6.9.1.3. Return From Interrupt Instruction
6.9.2. Conditional Transfer Instructions
6.9.2.1. Conditional Jump Instructions
6.9.2.2. Loop Instructions
6.9.2.3. Jump If Zero Instructions
6.9.3. Software Interrupts
6.10. STRING OPERATIONS
6.10.1. Repeating String Operations
6.11. I/O INSTRUCTIONS
6.12. ENTER AND LEAVE INSTRUCTIONS
6.13. EFLAGS INSTRUCTIONS
6.13.1. Carry and Direction Flag Instructions
6.13.2. Interrupt Flag Instructions
6.13.3. EFLAGS Transfer Instructions
6.13.4. Interrupt Flag Instructions
6.14. SEGMENT REGISTER INSTRUCTIONS
6.14.1. Segment-Register Load and Store Instructions
6.14.2. Far Control Transfer Instructions
6.14.3. Software Interrupt Instructions
6.14.4. Load Far Pointer Instructions
6.15. MISCELLANEOUS INSTRUCTIONS
6.15.1. Address Computation Instruction
6.15.2. Table Lookup Instructions
6.15.3. Processor Identification Instruction
6.15.4. No-Operation and Undefined Instructions
CHAPTER 7 FLOATING-POINT UNIT
7.1. COMPATIBILITY AND EASE OF USE OF THE INTEL ARCHITECTURE FPU
7.2. REAL NUMBERS AND FLOATING-POINT FORMATS
7.2.1. Real Number System
7.2.2. Floating-Point Format
7.2.2.1. Normalized Numbers
7.2.2.2. Biased Exponent
7.2.3. Real Number and Non-number Encodings
7.2.3.1. Signed Zeros
7.2.3.2. Normalized and Denormalized Finite Numbers
7.2.3.3. Signed Infinities
7.2.3.4. NaNs
7.2.4. Indefinite
7.3. FPU ARCHITECTURE
7.3.1. FPU Data Registers
7.3.1.1. Parameter Passing with the FPU Register Stack
7.3.2. FPU Status Register
7.3.2.1. Top of Stack (TOP) Pointer
7.3.2.2. Condition Code Flags
7.3.2.3. Exception Flags
7.3.2.4. Stack Fault Flag
7.3.3. Branching and Conditional Moves on FPU Condition Codes
7.3.4. FPU Control Word
7.3.4.1. Exception-Flag Masks
7.3.4.2. Precision Control Field
7.3.4.3. Rounding Control Field
7.3.5. Infinity Control Flag
7.3.6. FPU Tag Word
7.3.7. FPU Instruction and Operand (Data) Pointers
7.3.8. Last Instruction Opcode
7.3.9. Saving the FPU’s State
7.4. FLOATING-POINT DATA TYPES AND FORMATS
7.4.1. Real Numbers
7.4.2. Binary Integers
7.4.3. Decimal Integers
7.4.4. Unsupported Extended-Real Encodings
7.5. FPU INSTRUCTION SET
7.5.1. Escape (ESC) Instructions
7.5.2. FPU Instruction Operands
7.5.3. Data Transfer Instructions
7.5.4. Load Constant Instructions
7.5.5. Basic Arithmetic Instructions
7.5.6. Comparison and Classification Instructions
7.5.6.1. Branching on the FPU Condition Codes
7.5.7. Trigonometric Instructions
7.5.8. Pi
7.5.9. Logarithmic, Exponential, and Scale
7.5.10. Transcendental Instruction Accuracy
7.5.11. FPU Control Instructions
7.5.12. Waiting Vs. Non-waiting Instructions
7.5.13. Unsupported FPU Instructions
7.6. OPERATING ON NANS
7.6.1. Operating on NaNs with Streaming SIMD Extensions
7.6.2. Uses for Signaling NANs
7.6.3. Uses for Quiet NANs
7.7. FLOATING-POINT EXCEPTION HANDLING
7.7.1. Arithmetic vs. Non-arithmetic Instructions
7.7.2. Automatic Exception Handling
7.7.3. Software Exception Handling
7.7.3.1. Native Mode
7.7.3.2. MS-DOS* Compatibility Mode
7.7.3.3. Typical Floating-Point Exception Handler Actions
7.8. FLOATING-POINT EXCEPTION CONDITIONS
7.8.1. Invalid Operation Exception
7.8.1.1. Stack Overflow or Underflow Exception (#IS)
7.8.1.2. Invalid Arithmetic Operand Exception (#IA)
7.8.2. Divide-By-Zero Exception (#Z)
7.8.3. Denormal Operand Exception (#D)
7.8.4. Numeric Overflow Exception (#O)
7.8.5. Numeric Underflow Exception (#U)
7.8.6. Inexact Result (Precision) Exception (#P)
7.8.7. Exception Priority
7.9. FLOATING-POINT EXCEPTION SYNCHRONIZATION
CHAPTER 8 PROGRAMMING WITH THE INTEL MMX™ TECHNOLOGY
8.1. OVERVIEW OF THE MMX™ TECHNOLOGY PROGRAMMING ENVIRONMENT
8.1.1. MMX™ Registers
8.1.2. MMX™ Data Types
8.1.3. Single Instruction, Multiple Data (SIMD) Execution Model
8.1.4. Memory Data Formats
8.1.5. Data Formats for MMX™ Registers
8.2. MMX™ INSTRUCTION SET
8.2.1. Saturation Arithmetic and Wraparound Mode
8.2.2. Instruction Operands
8.3. OVERVIEW OF THE MMX™ INSTRUCTION SET
8.3.1. Data Transfer Instructions
8.3.2. Arithmetic Instructions
8.3.2.1. Packed Addition and Subtraction
8.3.2.2. Packed Multiplication
8.3.2.3. Packed Multiply Add
8.3.3. Comparison Instructions
8.3.4. Conversion Instructions
8.3.5. Logical Instructions
8.3.6. Shift Instructions
8.3.7. EMMS (Empty MMX™ State) Instruction
8.4. COMPATIBILITY WITH FPU ARCHITECTURE
8.4.1. MMX™ Instructions and the Floating-Point Tag Word
8.4.2. Effect of Instruction Prefixes on MMX™ Instructions
8.5. WRITING APPLICATIONS WITH MMX™ CODE
8.5.1. Detecting Support for MMX™ Technology Using the CPUID Instruction
8.5.2. Using the EMMS Instruction
8.5.3. Interfacing with MMX™ Code
8.5.4. Writing Code with MMX™ and Floating-Point Instructions
8.5.4.1. RECOMMENDATIONS AND GUIDELINES
8.5.5. Using MMX™ Code in a Multitasking Operating System Environment
8.5.5.1. COOPERATIVE MULTITASKING OPERATING SYSTEM
8.5.5.2. PREEMPTIVE MULTITASKING OPERATING SYSTEM
8.5.6. Exception Handling in MMX™ Code
8.5.7. Register Mapping
CHAPTER 9 PROGRAMMING WITH THE STREAMING SIMD EXTENSIONS
9.1. OVERVIEW OF THE STREAMING SIMD EXTENSIONS
9.1.1. SIMD Floating-Point Registers
9.1.2. SIMD Floating-Point Data Types
9.1.3. Single Instruction, Multiple Data (SIMD) Execution Model
9.1.4. Pentium® III Processor Single Precision Floating-Point Format
9.1.5. Memory Data Formats
9.1.6. SIMD Floating-Point Register Data Formats
9.1.7. SIMD Floating-Point Control/Status Register
9.1.8. Rounding Control Field
9.1.9. Flush-To-Zero
9.2. STREAMING SIMD EXTENSIONS SET
9.2.1. Instruction Operands
9.3. OVERVIEW OF THE STREAMING SIMD EXTENSIONS SET
9.3.1. Data Movement Instructions
9.3.2. Arithmetic Instructions
9.3.2.1. Packed/Scalar Addition and Subtraction
9.3.2.2. Packed/Scalar Multiplication and Division
9.3.2.3. Packed/Scalar Square Root
9.3.2.4. Packed Maximum/Minimum
9.3.3. Comparison Instructions
9.3.4. Conversion Instructions
9.3.5. Logical Instructions
9.3.6. Additional SIMD Integer Instructions
9.3.7. Shuffle Instructions
9.3.8. State Management Instructions
9.3.9. Cacheability Control Instructions
9.4. COMPATIBILITY WITH FPU ARCHITECTURE
9.4.1. Effect of Instruction Prefixes on Streaming SIMD Extensions
9.5. WRITING APPLICATIONS WITH STREAMING SIMD EXTENSIONS CODE
9.5.1. Detecting Support for Streaming SIMD Extensions Using the CPUID Instruction
9.5.2. Interfacing with Streaming SIMD Extensions Procedures and Functions
9.5.3. Writing Code with MMX™, Floating-Point, and Streaming SIMD Extensions
9.5.3.1. Cacheability Hint Instructions
9.5.3.2. Recommendations and Guidelines
9.5.4. Using Streaming SIMD Extensions Code in a Multitasking Operating System Environment
9.5.4.1. Cooperative Multitasking Operating System
9.5.4.2. Preemptive Multitasking Operating System
9.5.5. Exception Handling in Streaming SIMD Extensions
TABLE OF FIGURES
Figure 1-1. Bit and Byte Order
Figure 2-1. The Processing Units in the P6 Family Processor Microarchitecture and Their Interface with the Memory Subsystem
Figure 2-2. Functional Block Diagram of the P6 Family Processor Microarchitecture
Figure 3-1. P6 Family Processor Basic Execution Environment
Figure 3-2. Three Memory Management Models
Figure 3-3. Application Programming Registers
Figure 3-4. Alternate General-Purpose Register Names
Figure 3-5. Use of Segment Registers for Flat Memory Model
Figure 3-6. Use of Segment Registers in Segmented Memory Model
Figure 3-7. EFLAGS Register
Figure 4-1. Stack Structure
Figure 4-2. Stack on Near and Far Calls
Figure 4-3. Protection Rings
Figure 4-4. Stack Switch on a Call to a Different Privilege Level
Figure 4-5. Stack Usage on Transfers to Interrupt and Exception Handling Routines
Figure 4-6. Nested Procedures
Figure 4-7. Stack Frame after Entering the MAIN Procedure
Figure 4-8. Stack Frame after Entering Procedure A
Figure 4-9. Stack Frame after Entering Procedure B
Figure 4-10. Stack Frame after Entering Procedure C
Figure 5-1. Fundamental Data Types
Figure 5-2. SIMD Floating-Point Data Type
Figure 5-3. Bytes, Words, Doublewords and Quadwords in Memory
Figure 5-4. Numeric, Pointer, and Bit Field Data Types
Figure 5-5. Memory Operand Address
Figure 5-6. Offset (or Effective Address) Computation
Figure 6-1. Operation of the PUSH Instruction
Figure 6-2. Operation of the PUSHA Instruction
Figure 6-3. Operation of the POP Instruction
Figure 6-4. Operation of the POPA Instruction
Figure 6-5. Sign Extension
Figure 6-6. SHL/SAL Instruction Operation
Figure 6-7. SHR Instruction Operation
Figure 6-8. SAR Instruction Operation
Figure 6-9. SHLD and SHRD Instruction Operations
Figure 6-10. ROL, ROR, RCL, and RCR Instruction Operations
Figure 6-11. Flags Affected by the PUSHF, POPF, PUSHFD, and POPFD instructions
Figure 7-1. Binary Real Number System
Figure 7-2. Binary Floating-Point Format
Figure 7-3. Real Numbers and NaNs
Figure 7-4. Relationship Between the Integer Unit and the FPU
Figure 7-5. FPU Execution Environment
Figure 7-6. FPU Data Register Stack
Figure 7-7. Example FPU Dot Product Computation
Figure 7-8. FPU Status Word
Figure 7-9. Moving the FPU Condition Codes to the EFLAGS Register
Figure 7-10. FPU Control Word
Figure 7-11. FPU Tag Word
Figure 7-12. Contents of FPU Opcode Registers
Figure 7-13. Protected Mode FPU State Image in Memory, 32-Bit Format
Figure 7-14. Real Mode FPU State Image in Memory, 32-Bit Format
Figure 7-15. Protected Mode FPU State Image in Memory, 16-Bit Format
Figure 7-16. Real Mode FPU State Image in Memory, 16-Bit Format
Figure 7-17. Floating-Point Unit Data Type Formats
Figure 8-1. MMX™ Register Set
Figure 8-2. MMX™ Data Types
Figure 8-3. Eight Packed Bytes in Memory (at address 1000H)
Figure 9-1. SIMD Floating-Point Registers
Figure 9-2. Packed Single-FP
Figure 9-3. Four Packed FP Data in Memory (at address 1000H)
Figure 9-4. SIMD Floating-Point Control/Status Register Format
Figure 9-5. Packed Operations
Figure 9-6. Scalar Operations
Figure 9-7. Packed Shuffle Operation
Figure 9-8. Unpack High Operation
Figure 9-9. Unpack Low Operation
Figure 10-1. Memory-Mapped I/O
Figure 10-2. I/O Permission Bit Map
Figure 11-1. EAX Return Values
Figure 11-2. CPUID Feature Field Information Bits
Figure 11-3. CR4 Register Extensions
Figure E-1. Recommended Circuit for MS-DOS* Compatibility FPU Exception Handling
Figure E-2. Behavior of Signals During FPU Exception Handling
Figure E-3. Timing of Receipt of External Interrupt
Figure E-4. Arithmetic Example Using Infinity
Figure E-5. General Program Flow for DNA Exception Handler
Figure E-6. Program Flow for a Numeric Exception Dispatch Routine
Figure F-1. Control Flow for Handling Unmasked Floating-Point Exceptions
TABLE OF TABLES
Table 2-1. Processor Performance Over Time and Other Intel Architecture Key Features
Table 3-1. Effective Operand- and Address-Size Attributes
Table 4-1. Exceptions and Interrupts
Table 5-1. Default Segment Selection Rules
Table 6-1. Move Instruction Operations
Table 6-2. Conditional Move Instructions
Table 6-3. Bit Test and Modify Instructions
Table 6-4. Conditional Jump Instructions
Table 6-5. Information Provided by the CPUID Instruction
Table 7-1. Real Number Notation
Table 7-2. Denormalization Process
Table 7-3. FPU Condition Code Interpretation
Table 7-4. Precision Control Field (PC)
Table 7-5. Rounding Control Field (RC)
Table 7-6. Rounding of Positive Numbers with Masked Overflow
Table 7-7. Rounding of Negative Numbers with Masked Overflow
Table 7-8. Length, Precision, and Range of FPU Data Types
Table 7-9. Real Number and NaN Encodings
Table 7-10. Binary Integer Encodings
Table 7-11. Packed Decimal Integer Encodings
Table 7-12. Unsupported Extended-Real Encodings
Table 7-13. Data Transfer Instructions
Table 7-14. Floating-Point Conditional Move Instructions
Table 7-15. Setting of FPU Condition Code Flags for Real Number Comparisons
Table 7-16. Setting of EFLAGS Status Flags for Real Number Comparisons
Table 7-17. TEST Instruction Constants for Conditional Branching
Table 7-18. Rules for Generating QNaNs
Table 7-19. Results of Operations with NaN Operands
Table 7-20. Arithmetic and Non-arithmetic Instructions
Table 7-21. Invalid Arithmetic Operations and the Masked Responses to Them
Table 7-22. Divide-By-Zero Conditions and the Masked Responses to Them
Table 7-23. Masked Responses to Numeric Overflow
Table 8-1. Data Range Limits for Saturation
Table 8-2. MMX™ Instruction Set Summary
Table 8-3. Effect of Prefixes on MMX™ Instructions
Table 9-1. Precision and Range of SIMD Floating-point Datatype
Table 9-2. Real Number and NaN Encodings
Table 9-3. Rounding Control Field (RC)
Table 9-4. Streaming SIMD Extensions Behavior with Prefixes
Table 9-5. SIMD Integer Instructions Behavior with Prefixes
Table 9-6. Cacheability Control Instruction Behavior with Prefixes
Table 9-7. Cache Hints
Table 10-1. I/O Instruction Serialization
Table 11-1. EAX Input Value and CPUID Return Values
Table 11-2. New P6-Family Processor Feature Information Returned by CPUID in EDX
Table A-1. EFLAGS Cross-Reference
Table B-1. EFLAGS Condition Codes
Table C-1. Floating-Point Exceptions Summary
Table D-1. Streaming SIMD Extensions Instruction Set Summary
Table F-1. ADDPS, ADDSS, SUBPS, SUBSS, MULPS, MULSS, DIVPS, DIVSS
Table F-2. CMPPS.EQ, CMPSS.EQ, CMPPS.ORD, CMPSS.ORD
Table F-3. CMPPS.NEQ, CMPSS.NEQ, CMPPS.UNORD, CMPSS.UNORD
Table F-4. CMPPS.LT, CMPSS.LT, CMPPS.LE, CMPSS.LE
Table F-5. CMPPS.NLT, CMPSS.NLT, CMPSS.NLT, CMPSS.NLE
Table F-6. COMISS
Table F-7. UCOMISS
Table F-8. CVTPS2PI, CVTSS2SI, CVTTPS2PI, CVTTSS2SI
Table F-9. MAXPS, MAXSS, MINPS, MINSS
Table F-10. SQRTPS, SQRTSS
Table F-11. #I - Invalid Operations
Table F-12. #Z - Divide-by-Zero
Table F-13. #D - Denormal Operand
Table F-14. #O - Numeric Overflow
Table F-15. #U - Numeric Underflow
Table F-16. #P - Inexact Result (Precision)

CHAPTER 1

ABOUT THIS MANUAL

The Intel Architecture Software Developer’s Manual, Volume 1: Basic Architecture (Order Number 243190) is part of a three-volume set that describes the architecture and programming environment of all Intel Architecture (IA) processors. The other two volumes in this set are:

The Intel Architecture Software Developer’s Manual, Volume 2: Instruction Set Reference (Order Number 243191).

The Intel Architecture Software Developer’s Manual, Volume 3: System Programming Guide (Order Number 243192).

The Intel Architecture Software Developer’s Manual, Volume 1, describes the basic architecture and programming environment of an IA processor; the Intel Architecture Software Developer’s Manual, Volume 2, describes the instruction set of the processor and the opcode structure. These two volumes are aimed at application programmers who are writing programs to run under existing operating systems or executives. The Intel Architecture Software Developer’s Manual, Volume 3, describes the operating-system support environment of an IA processor, including memory management, protection, task management, interrupt and exception handling, and system management mode. It also provides IA processor compatibility information. This volume is aimed at operating-system and BIOS designers and programmers.

1.1. OVERVIEW OF THE INTEL ARCHITECTURE SOFTWARE DEVELOPER’S MANUAL, VOLUME 1: BASIC ARCHITECTURE

Intel Architecture Software Developer’s Manual

Volume 1: Basic Architecture

NOTE: The Intel Architecture Software Developer’s Manual consists of three volumes: Basic Architecture, Order Number 243190; Instruction Set Reference, Order Number 243191; and the System Programming Guide, Order Number 243192.

Please refer to all three volumes when evaluating your design needs.

TABLE OF CONTENTS

PROGRAMMING WITH THE INTEL

CHAPTER 10

INPUT/OUTPUT

10.1. I/O PORT ADDRESSING

10.2. I/O PORT HARDWARE

10.3. I/O ADDRESS SPACE

CHAPTER 11

PROCESSOR IDENTIFICATION AND FEATURE DETERMINATION

11.1. PROCESSOR IDENTIFICATION

11.2. IDENTIFICATION OF EARLIER INTEL ARCHITECTURE PROCESSORS

11.3. CPUID INSTRUCTION EXTENSIONS

APPENDIX A

EFLAGS CROSS-REFERENCE

APPENDIX B

EFLAGS CONDITION CODES

APPENDIX C

FLOATING-POINT EXCEPTIONS SUMMARY

APPENDIX D

SIMD FLOATING-POINT EXCEPTIONS SUMMARY

APPENDIX E

GUIDELINES FOR WRITING FPU EXCEPTIONS HANDLERS

E.1. ORIGIN OF THE MS-DOS* COMPATIBILITY MODE FOR HANDLING FPU EXCEPTIONS

E.2. IMPLEMENTATION OF THE MS-DOS* COMPATIBILITY MODE IN THE INTEL486™, PENTIUM®, AND P6 FAMILY PROCESSORS

APPENDIX F

GUIDELINES FOR WRITING SIMD FLOATING-POINT EXCEPTION HANDLERS

F.1. TWO OPTIONS FOR HANDLING NUMERIC EXCEPTIONS

F.2. SOFTWARE EXCEPTION HANDLING

F.3. EXCEPTION SYNCHRONIZATION

F.4. SIMD FLOATING-POINT EXCEPTIONS AND THE IEEE-754 STANDARD FOR BINARY FLOATING-POINT COMPUTATIONS

The contents of this manual are as follows:

Chapter 1 — About This Manual. Gives an overview of all three volumes of the Intel Architecture Software Developer’s Manual. It also describes the notational conventions in these manuals and lists related Intel manuals and documentation of interest to programmers and hardware designers.

Chapter 2 — Introduction to the Intel Architecture. Introduces the IA and the families of Intel processors that are based on this architecture. It also gives an overview of the common features found in these processors and a brief history of the IA.

Chapter 3 — Basic Execution Environment. Introduces the models of memory organization and describes the register set used by applications.

Chapter 4 — Procedure Calls, Interrupts, and Exceptions. Describes the procedure stack