Inner Loops
A Sourcebook for Fast 32-bit Software Development
Summary
Inner Loops: A Sourcebook for Fast 32-bit Software Development gives the green light to optimal PC performance with practical advice and a strategic sampling of important algorithms. Focused directly on the 32-bit future of PC computing, Inner Loops explores the new rules and opportunities of a wide-open memory space, parallel instruction execution, and clock speeds in the hundreds of megahertz. You'll be taken through:
- a thorough review of 32-bit code optimization for the 486, Pentium, and Pentium Pro
- making the transition from 16-bit to 32-bit assembly language
- principles of C and assembly language optimization
- tips for fast 32-bit software design
- real-world examples of top-speed inner loops for several important PC algorithms
- what MMX, the Intel multimedia extensions, mean for speed
Author Rick Booth backs up his theory of speed with practical examples and source code, including such topics as:
- Fast memory moves
- Random numbers
- Hashing
- Huffman compression
- Sorting
- Matrix math
- JPEG's inner loop
Consultant and developer Rick Booth is a 17-year veteran of the video game and digital video industries.
Table of contents:
- Acknowledgments
- About the CD-ROM
- Preface
- Introduction: Optimized Algorithms for the 32-Bit PC; Who This Book Is For; Overview of the Contents; Playing Devil's Advocate; Inner Loops
Part 1: Theory
- Chapter 1, The Reference World of the Pentium-100: The Pentium-100 Processor; Memory; The PCI and ISA Buses; Disks; The Modem; The Network Connection; Still Images; Video; Audio; Text; Compiled Code; Operating Systems; Conclusion
- Chapter 2, 32-Bit Assembly Language: What to Forget About 16-Bit Assembly Language; Basic 32-Bit Assembly Language; The Instruction Set; Cycle-Counting Conventions; Conclusion
- Chapter 3, Structured Assembly Language: Conventional Assembly Language; C and Assembly Language Interleaved; Calling and Returning; Data Structures; Structured Assembly Language; Debugging; Conclusion
- Chapter 4, Optimizing the 486: The CH486.EXE Instruction Timing Program; The Instruction Set; The Cache; The 486 Output Queue; Data and Code Alignment; Cycle Thieves and Cycle Savers; Conclusion; References
- Chapter 5, Optimizing the Pentium: The CHPENT.BAT Instruction Timing Program; Pentium Features Overview; How Instruction Pairing Works; The Instruction Set; Cache; The Pentium Output Queue; Data and Code Alignment; Cycle Thieves and Cycle Savers; Instruction Stream Timing Oddities; Conclusion; References
- Chapter 6, Optimizing the Pentium Pro: The CHPPRO.EXE Instruction Timing Program; Dynamic Execution; Micro-ops; New Instructions; Instruction Set Timing; The Floating-Point Unit; The Cache; Data and Code Alignment; The Partial Register Stall; Branch Prediction; Optimization Summary; Conclusion; References
- Chapter 7, Achieving Speed: When and Where to Speed Code Up; Rethinking Algorithms; The Special Advantages of Assembly Language; Data and Stack Alignment; Running on Seven Registers; Table-Driven Code; Unrolling Loops; Loop Counter Efficiency; Minimizing Pointer Increments; Jump Tables; Coiled Loops; Tandem Loops; Prefetching into Cache; Setting Flags; Fast Multiplication; Rethinking in Assembly Language; Conclusion
- Chapter 8, C Performance: How C Statements Translate into Assembly Language; Compiler Optimization Switch Behaviors; Predicting C Performance; Conclusion
- Chapter 9, MMX: General Processor Enhancements; The MMX Registers; The MMX Instructions; MMX Instruction Timing and Pairing; Software Emulation of MMX; Conclusion; References
Part 2: Practice
- Chapter 10, Moving Memory: The CHMEM.EXE Program; Alignment, The First Rule; A Pentium Read-Before-Write Speedup; Conclusion
- Chapter 11, Random Numbers and Primes: The Algorithms; On the CD-ROM; Random Number Theory; Random Numbers in Practice; Conclusion; References
- Chapter 12, Linked Lists and Trees: Nodes in a Nutshell; Speed Issues; Inner Loops; Linked List Searching; Binary Tree Searching; Traversing a Binary Tree; Conclusion
- Chapter 13, Hashing: When to Hash; How to Hash; Fast Hashing with IL_HashFindSymbol(); IL_HashAddSymbol(); The Demo Program: CHHASH.EXE; Conclusion
- Chapter 14, Huffman Compression: A Huffman Example; What Huffman Is Good For; Performance; Huffman, Step by Step; Huffman Compression; Huffman Decompression; Conclusion; References
- Chapter 15, A Fast Sort: Order N Sorting; The CHSORT.EXE Program; The First Inner Loop of the Distribution Sort; The Second Inner Loop of the Distribution Sort; Performance; The Loop Not Taken; Conclusion
- Chapter 16, Matrix Multiplication: Matrix Review; Fast Multiplication with a Three-by-Three Matrix; The CHMATRX.EXE Demonstration Program; Large Matrices; Conclusion
- Chapter 17, JPEG: How JPEG Works; Taking on the Inverse DCT; Conclusion; References
- Epilogue
- Index
Technical specifications
Print edition | |
Publisher(s) | Addison Wesley |
Author(s) | Rick Booth |
Publication date | 31/01/1997 |
Number of pages | 364 |
EAN13 | 9780201479607 |