Inner loops - A sourcebook for fast 32-bit software development -... - Librairie Eyrolles

Inner loops

A sourcebook for fast 32-bit software development

Rick Booth

364 pages, published 31/01/1997

Summary

No speed limits have been posted on the PC performance track, yet much software runs in the slow lane, functioning at 10 to 50 percent of its potential speed. The cause of these slowdowns? Bottlenecking on time-critical inner loops.

Inner Loops: A Sourcebook for Fast 32-bit Software Development gives the green light to optimal PC performance with practical advice and a strategic sampling of important algorithms. Focused directly on the 32-bit future of PC computing, Inner Loops explores the new rules and opportunities of a wide-open memory space, parallel instruction execution, and clock speeds in the hundreds of megahertz. You'll be taken through:

  • a thorough review of 32-bit code optimization for the 486, Pentium, and Pentium Pro
  • making the transition from 16-bit to 32-bit assembly language
  • principles of C and assembly language optimization
  • tips for fast 32-bit software design
  • real-world examples of top-speed inner loops for several important PC algorithms
  • what MMX, the Intel multimedia extensions, mean for speed

Author Rick Booth backs up his theory of speed with practical examples and source code, including such topics as:

  • Fast memory moves
  • Random numbers
  • Hashing
  • Huffman compression
  • Sorting
  • Matrix math
  • JPEG's inner loop

Many chapters contain high-performance demos, which are also found on the CD. These include one of the fastest sort engines possible, a top-speed Huffman compression system, and JPEG's decompression inner loop tuned for top performance.

Consultant and developer Rick Booth is a 17-year veteran of the video game and digital video industries.

Table of contents:

Acknowledgments                                    ix
About the CD-ROM                                   xi
Preface                                            xiii
Introduction                                       1
Optimized Algorithms for the 32-Bit PC             1
Who This Book Is For                               2
Overview of the Contents                           2
Playing Devil's Advocate                           5
Inner Loops                                        6
Part 1  Theory                                     9
  Chapter 1  The Reference World of the Pentium-100 11
    The Pentium-100 Processor                      12
    Memory                                         13
    The PCI and ISA Buses                          15
    Disks                                          15
    The Modem                                      18
    The Network Connection                         18
    Still Images                                   19
    Video                                          19
    Audio                                          20
    Text                                           20
    Compiled Code                                  21
    Operating Systems                              21
    Conclusion                                     22
  Chapter 2  32-Bit Assembly Language              23
    What to Forget About 16-Bit Assembly Language  24
    Basic 32-Bit Assembly Language                 26
    The Instruction Set                            34
    Cycle-Counting Conventions                     45
    Conclusion                                     46
  Chapter 3  Structured Assembly Language          49
    Conventional Assembly Language                 50
    C and Assembly Language Interleaved            52
    Calling and Returning                          55
    Data Structures                                57
    Structured Assembly Language                   58
    Debugging                                      65
    Conclusion                                     65
  Chapter 4  Optimizing the 486                    67
    The CH486.EXE Instruction Timing Program       68
    The Instruction Set                            69
    The Cache                                      80
    The 486 Output Queue                           84
    Data and Code Alignment                        85
    Cycle Thieves and Cycle Savers                 87
    Conclusion                                     89
    References                                     89
  Chapter 5  Optimizing the Pentium                91
    The CHPENT.BAT Instruction Timing Program      92
    Pentium Features Overview                      92
    How Instruction Pairing Works                  93
    The Instruction Set                            98
    Cache                                          109
    The Pentium Output Queue                       115
    Data and Code Alignment                        116
    Cycle Thieves and Cycle Savers                 118
    Instruction Stream Timing Oddities             119
    Conclusion                                     121
    References                                     121
  Chapter 6  Optimizing the Pentium Pro            123
    The CHPPRO.EXE Instruction Timing Program      124
    Dynamic Execution                              124
    Micro-ops                                      127
    New Instructions                               129
    Instruction Set Timing                         130
    The Floating-Point Unit                        134
    The Cache                                      135
    Data and Code Alignment                        136
    The Partial Register Stall                     136
    Branch Prediction                              137
    Optimization Summary                           138
    Conclusion                                     139
    References                                     140
  Chapter 7  Achieving Speed                       141
    When and Where to Speed Code Up                142
    Rethinking Algorithms                          143
    The Special Advantages of Assembly Language    146
    Data and Stack Alignment                       147
    Running on Seven Registers                     148
    Table-Driven Code                              150
    Unrolling Loops                                151
    Loop Counter Efficiency                        154
    Minimizing Pointer Increments                  154
    Jump Tables                                    156
    Coiled Loops                                   157
    Tandem Loops                                   158
    Prefetching into Cache                         160
    Setting Flags                                  161
    Fast Multiplication                            162
    Rethinking in Assembly Language                163
    Conclusion                                     169
  Chapter 8  C Performance                         171
    How C Statements Translate into Assembly Language 172
    Compiler Optimization Switch Behaviors         180
    Predicting C Performance                       182
    Conclusion                                     183
  Chapter 9  MMX                                   185
    General Processor Enhancements                 186
    The MMX Registers                              189
    The MMX Instructions                           190
    MMX Instruction Timing and Pairing             195
    Software Emulation of MMX                      196
    Conclusion                                     202
    References                                     203
Part 2  Practice                                   205
  Chapter 10  Moving Memory                        207
    The CHMEM.EXE Program                          208
    Alignment, The First Rule                      208
    A Pentium Read-Before-Write Speedup            212
    Conclusion                                     217
  Chapter 11  Random Numbers and Primes            219
    The Algorithms                                 220
    On the CD-ROM                                  221
    Random Number Theory                           222
    Random Numbers in Practice                     227
    Conclusion                                     250
    References                                     251
  Chapter 12  Linked Lists and Trees               253
    Nodes in a Nutshell                            253
    Speed Issues                                   257
    Inner Loops                                    258
    Linked List Searching                          259
    Binary Tree Searching                          261
    Traversing a Binary Tree                       264
    Conclusion                                     266
  Chapter 13  Hashing                              267
    When to Hash                                   267
    How to Hash                                    268
    Fast Hashing with IL_HashFindSymbol()          268
    IL_HashAddSymbol()                             278
    The Demo Program: CHHASH.EXE                   279
    Conclusion                                     279
  Chapter 14  Huffman Compression                  281
    A Huffman Example                              281
    What Huffman Is Good For                       283
    Performance                                    285
    Huffman, Step by Step                          285
    Huffman Compression                            286
    Huffman Decompression                          297
    Conclusion                                     299
    References                                     300
  Chapter 15  A Fast Sort                          301
    Order N Sorting                                301
    The CHSORT.EXE Program                         303
    The First Inner Loop of the Distribution Sort  303
    The Second Inner Loop of the Distribution Sort 307
    Performance                                    308
    The Loop Not Taken                             312
    Conclusion                                     313
  Chapter 16  Matrix Multiplication                315
    Matrix Review                                  315
    Fast Multiplication with a Three-by-Three Matrix 317
    The CHMATRX.EXE Demonstration Program          320
    Large Matrices                                 320
    Conclusion                                     322
  Chapter 17  JPEG                                 323
    How JPEG Works                                 324
    Taking on the Inverse DCT                      329
    Conclusion                                     349
    References                                     349
Epilogue                                           351
Index                                              353

Technical specifications

  PAPER
Publisher(s) Addison Wesley
Author(s) Rick Booth
Published 31/01/1997
Pages 364
EAN13 9780201479607
