Inner loops
A sourcebook for fast 32-bit software development
Summary
Inner Loops: A Sourcebook for Fast 32-bit Software Development gives the green light to optimal PC performance with practical advice and a strategic sampling of important algorithms. Focused directly on the 32-bit future of PC computing, Inner Loops explores the new rules and opportunities of a wide-open memory space, parallel instruction execution, and clock speeds in the hundreds of megahertz. You'll be taken through:
- a thorough review of 32-bit code optimization for the 486, Pentium, and Pentium Pro
- making the transition from 16-bit to 32-bit assembly language
- principles of C and assembly language optimization
- tips for fast 32-bit software design
- real-world examples of top-speed inner loops for several important PC algorithms
- what MMX, the Intel multimedia extensions, mean for speed
Author Rick Booth backs up his theory of speed with practical examples and source code, including such topics as:
- Fast memory moves
- Random numbers
- Hashing
- Huffman compression
- Sorting
- Matrix math
- JPEG's inner loop
Consultant and developer Rick Booth is a 17-year veteran of the video game and digital video industries.
Table of contents:
Acknowledgments ix
About the CD-ROM xi
Preface xiii
Introduction 1
Optimized Algorithms for the 32-Bit PC 1
Who This Book Is For 2
Overview of the Contents 2
Playing Devil's Advocate 5
Inner Loops 6
Part 1 Theory 9
Chapter 1 The Reference World of the Pentium-100 11
The Pentium-100 Processor 12
Memory 13
The PCI and ISA Buses 15
Disks 15
The Modem 18
The Network Connection 18
Still Images 19
Video 19
Audio 20
Text 20
Compiled Code 21
Operating Systems 21
Conclusion 22
Chapter 2 32-Bit Assembly Language 23
What to Forget About 16-Bit Assembly Language 24
Basic 32-Bit Assembly Language 26
The Instruction Set 34
Cycle-Counting Conventions 45
Conclusion 46
Chapter 3 Structured Assembly Language 49
Conventional Assembly Language 50
C and Assembly Language Interleaved 52
Calling and Returning 55
Data Structures 57
Structured Assembly Language 58
Debugging 65
Conclusion 65
Chapter 4 Optimizing the 486 67
The CH486.EXE Instruction Timing Program 68
The Instruction Set 69
The Cache 80
The 486 Output Queue 84
Data and Code Alignment 85
Cycle Thieves and Cycle Savers 87
Conclusion 89
References 89
Chapter 5 Optimizing the Pentium 91
The CHPENT.BAT Instruction Timing Program 92
Pentium Features Overview 92
How Instruction Pairing Works 93
The Instruction Set 98
Cache 109
The Pentium Output Queue 115
Data and Code Alignment 116
Cycle Thieves and Cycle Savers 118
Instruction Stream Timing Oddities 119
Conclusion 121
References 121
Chapter 6 Optimizing the Pentium Pro 123
The CHPPRO.EXE Instruction Timing Program 124
Dynamic Execution 124
Micro-ops 127
New Instructions 129
Instruction Set Timing 130
The Floating-Point Unit 134
The Cache 135
Data and Code Alignment 136
The Partial Register Stall 136
Branch Prediction 137
Optimization Summary 138
Conclusion 139
References 140
Chapter 7 Achieving Speed 141
When and Where to Speed Code Up 142
Rethinking Algorithms 143
The Special Advantages of Assembly Language 146
Data and Stack Alignment 147
Running on Seven Registers 148
Table-Driven Code 150
Unrolling Loops 151
Loop Counter Efficiency 154
Minimizing Pointer Increments 154
Jump Tables 156
Coiled Loops 157
Tandem Loops 158
Prefetching into Cache 160
Setting Flags 161
Fast Multiplication 162
Rethinking in Assembly Language 163
Conclusion 169
Chapter 8 C Performance 171
How C Statements Translate into Assembly Language 172
Compiler Optimization Switch Behaviors 180
Predicting C Performance 182
Conclusion 183
Chapter 9 MMX 185
General Processor Enhancements 186
The MMX Registers 189
The MMX Instructions 190
MMX Instruction Timing and Pairing 195
Software Emulation of MMX 196
Conclusion 202
References 203
Part 2 Practice 205
Chapter 10 Moving Memory 207
The CHMEM.EXE Program 208
Alignment, The First Rule 208
A Pentium Read-Before-Write Speedup 212
Conclusion 217
Chapter 11 Random Numbers and Primes 219
The Algorithms 220
On the CD-ROM 221
Random Number Theory 222
Random Numbers in Practice 227
Conclusion 250
References 251
Chapter 12 Linked Lists and Trees 253
Nodes in a Nutshell 253
Speed Issues 257
Inner Loops 258
Linked List Searching 259
Binary Tree Searching 261
Traversing a Binary Tree 264
Conclusion 266
Chapter 13 Hashing 267
When to Hash 267
How to Hash 268
Fast Hashing with IL_HashFindSymbol() 268
IL_HashAddSymbol() 278
The Demo Program: CHHASH.EXE 279
Conclusion 279
Chapter 14 Huffman Compression 281
A Huffman Example 281
What Huffman Is Good For 283
Performance 285
Huffman, Step by Step 285
Huffman Compression 286
Huffman Decompression 297
Conclusion 299
References 300
Chapter 15 A Fast Sort 301
Order N Sorting 301
The CHSORT.EXE Program 303
The First Inner Loop of the Distribution Sort 303
The Second Inner Loop of the Distribution Sort 307
Performance 308
The Loop Not Taken 312
Conclusion 313
Chapter 16 Matrix Multiplication 315
Matrix Review 315
Fast Multiplication with a Three-by-Three Matrix 317
The CHMATRX.EXE Demonstration Program 320
Large Matrices 320
Conclusion 322
Chapter 17 JPEG 323
How JPEG Works 324
Taking on the Inverse DCT 329
Conclusion 349
References 349
Epilogue 351
Index 353
Technical specifications
| PRINT | |
| Publisher(s) | Addison Wesley |
| Author(s) | Rick Booth |
| Publication date | 31/01/1997 |
| Pages | 364 |
| EAN13 | 9780201479607 |