Programming Spiders, Bots, and Aggregators in Java
You will build on your basic knowledge of Java to quickly master the techniques essential to this specialized world of programming, including parsing HTML, interpreting data, working with cookies, reading and writing XML, and managing high-volume workloads. You'll also learn about the ethical issues associated with bot use, as well as the limitations imposed by some websites.
This book offers two levels of instruction, both focused on the library of routines provided on the companion CD. If your main concern is adding ready-made functionality to an application, you'll achieve your goals quickly thanks to step-by-step instructions and sample programs that illustrate effective implementations. If you're interested in the technologies underlying these routines, you'll find in-depth explanations of how they work and the techniques required to customize them.
Chapter 1: Java Socket Programming.
Chapter 2: Examining the Hypertext Transfer Protocol.
Chapter 3: Accessing Secure Sites with HTTPS.
Chapter 4: HTML Parsing.
Chapter 5: Posting Forms.
Chapter 6: Interpreting Data.
Chapter 7: Exploring Cookies.
Chapter 8: Building a Spider.
Chapter 9: Building a High-Volume Spider.
Chapter 10: Building a Bot.
Chapter 11: Building an Aggregator.
Chapter 12: Using Bots Conscientiously.
Chapter 13: The Future of Bots.
Appendix A: The Bot Package.
Appendix B: Various HTTP Related Charts.
Appendix C: Troubleshooting.
Appendix D: Installing Tomcat.
Appendix E: How to Compile Examples Under Windows.
Appendix F: How to Compile Examples Under UNIX.
Appendix G: Recompiling the Bot Package.