Perl & LWP

Sean M. Burke

242 pages, published 12 July 2002

Summary

Perl soared to popularity as a language for creating and managing web content, but with LWP (Library for WWW in Perl), Perl is equally adept at consuming information on the Web. LWP is a suite of modules for fetching and processing web pages.

The Web is a vast data source that contains everything from stock prices to movie credits, and with LWP all that data is just a few lines of code away. Anything you do on the Web, whether it's buying or selling, reading or writing, uploading or downloading, from news to e-commerce, can be controlled with Perl and LWP. You can automate Web-based purchase orders as easily as you can set up a program to download MP3 files from a web site.

Perl & LWP covers:
  • Understanding LWP and its design
  • Fetching and analyzing URLs
  • Extracting information from HTML using regular expressions and tokens
  • Working with the structure of HTML documents using trees
  • Setting and inspecting HTTP headers and response codes
  • Managing cookies
  • Accessing information that requires authentication
  • Extracting links
  • Cooperating with proxy caches
  • Writing web spiders (also known as robots) in a safe fashion

Perl & LWP includes many step-by-step examples that show how to apply the various techniques. Programs to extract information from the web sites of BBC News, AltaVista, ABEBooks.com, and the Weather Underground, to name just a few, are explained in detail, so that you understand how and why they work.

Perl programmers who want to automate and mine the web can pick up this book and be immediately productive. Written by a contributor to LWP, and with a foreword by one of LWP's creators, Perl & LWP is the authoritative guide to this powerful and popular toolkit.

Contents

Foreword

Preface

1. Introduction to Web Automation
     The Web as Data Source
     History of LWP
     Installing LWP
     Words of Caution
     LWP in Action

2. Web Basics
     URLs
     An HTTP Transaction
     LWP::Simple
     Fetching Documents Without LWP::Simple
     Example: AltaVista
     HTTP POST
     Example: Babelfish

3. The LWP Class Model
     The Basic Classes
     Programming with LWP Classes
     Inside the do_GET and do_POST Functions
     User Agents
     HTTP::Response Objects
     LWP Classes: Behind the Scenes

4. URLs
     Parsing URLs
     Relative URLs
     Converting Absolute URLs to Relative
     Converting Relative URLs to Absolute

5. Forms
     Elements of an HTML Form
     LWP and GET Requests
     Automating Form Analysis
     Idiosyncrasies of HTML Forms
     POST Example: License Plates
     POST Example: ABEBooks.com
     File Uploads
     Limits on Forms

6. Simple HTML Processing with Regular Expressions
     Automating Data Extraction
     Regular Expression Techniques
     Troubleshooting
     When Regular Expressions Aren't Enough
     Example: Extracting Links from a Bookmark File
     Example: Extracting Links from Arbitrary HTML
     Example: Extracting Temperatures from Weather Underground

7. HTML Processing with Tokens
     HTML as Tokens
     Basic HTML::TokeParser Use
     Individual Tokens
     Token Sequences
     More HTML::TokeParser Methods
     Using Extracted Text

8. Tokenizing Walkthrough
     The Problem
     Getting the Data
     Inspecting the HTML
     First Code
     Narrowing In
     Rewrite for Features
     Alternatives

9. HTML Processing with Trees
     Introduction to Trees
     HTML::TreeBuilder
     Processing
     Example: BBC News
     Example: Fresh Air

10. Modifying HTML with Trees
     Changing Attributes
     Deleting Images
     Detaching and Reattaching
     Attaching in Another Tree
     Creating New Elements

11. Cookies, Authentication, and Advanced Requests
     Cookies
     Adding Extra Request Header Lines
     Authentication
     An HTTP Authentication Example: The Unicode Mailing Archive

12. Spiders
     Types of Web-Querying Programs
     A User Agent for Robots
     Example: A Link-Checking Spider
     Ideas for Further Expansion

A. LWP Modules

B. HTTP Status Codes

C. Common MIME Types

D. Language Tags

E. Common Content Encodings

F. ASCII Table

G. User's View of Object-Oriented Modules

Index

About the author - Sean M. Burke

Sean M. Burke is an active member in the Perl community and one of CPAN's most prolific module authors. He has been a columnist for The Perl Journal since 1998, and is an authority on markup languages. Trained as a linguist, he also develops tools for software internationalization and native language preservation. Sean is also the author of O'Reilly's Perl & LWP.

Technical specifications

  PRINT
Publisher(s): O'Reilly
Author(s): Sean M. Burke
Publication date: 12 July 2002
Number of pages: 242
Format: 18 x 23.5 cm
Binding: Paperback
Weight: 416 g
Interior: Black and white
EAN13: 9780596001780
ISBN13: 978-0-596-00178-0
