
Web Content Mining With Java: Techniques for Exploiting the World Wide Web

ISBN: 978-0-470-84311-6
328 pages
April 2002


Unlock the potential of the world's biggest database.

This practical book shows you how to build portals, search engines and other knowledge-based applications that mine the information you need from the Web.

* Written by a developer for developers
* A practical, hands-on approach
* Illustrates how Java and associated technologies (XML, HTML) can be combined with database technology to display and manipulate Web-derived information more effectively
* Demonstrates how to build a structure browser, a portal and a meta-search engine, and how to make 'Talking Pages'

Table of Contents

Preface

About the Author

Acknowledgements

1 Surveying the Scene

2 Language of the Web

3 HTML and XML Parsing

4 Data Filters and Structured Queries

5 Building a Portal with Java

6 Building a Search Engine with Java

7 Mail Mining With Java

8 Introduction to Text Mining

9 Introduction to Data Mining

10 Loose Ends and Looking Ahead

Appendix A: Software Installation and Configuration

Appendix B: Javadoc Extracts

Appendix C: Earlier Versions of JAXP

Appendix D: License and Copyright Statements

Appendix E: Census 1891 Data XML

Appendix F: Share Price Cluster Data

Appendix G: Glossary of Acronyms

References

Further Reading

Index



"When I got this book, I couldn't put it down. A lot of computer books sit on the shelf or send me to sleep, but not this one. Not only is it both topical and useful, but it hits a just-about-ideal balance between code and food for thought. The author has a real knack for useful solutions to complex problems." (Java Ranch, 17 May 2002)

