Class Summary
CrawlState Class containing state information about a crawling process.
HTMLLinkInfo Contains information about all links that were found in an HTML page.
HTTPFile Class representing a file that can be retrieved using HTTP.
URLNormalizer Class that can be used to normalize/canonicalize URLs so that they can be compared.
WebCrawlReader Reader implementation that provides an XML representation of the (filtered) contents of one or more websites, given one or more seed URLs.
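
The following is a minimal sketch of the kind of normalization a class such as URLNormalizer might perform so that two spellings of the same URL compare as equal. It is not the actual URLNormalizer API: the class name UrlNormalizationSketch and the method normalize are hypothetical, and the chosen rules (lowercase scheme and host, drop default ports, resolve "." and ".." segments, drop the fragment, use "/" for an empty path) are assumptions based on common practice, built only on java.net.URI.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the documented package.
final class UrlNormalizationSketch {

    // Returns a canonical string form of the given absolute URL.
    // Throws IllegalArgumentException if the URL cannot be parsed.
    static String normalize(String url) {
        // URI.normalize() resolves "." and ".." path segments.
        URI uri = URI.create(url.trim()).normalize();

        String scheme = uri.getScheme();
        String host = uri.getHost();
        if (scheme == null || host == null) {
            // Relative or opaque URIs (e.g. mailto:) are returned unchanged.
            return uri.toString();
        }
        scheme = scheme.toLowerCase(Locale.ROOT);
        host = host.toLowerCase(Locale.ROOT);

        // Drop the port when it is the default for the scheme.
        int port = uri.getPort();
        if (("http".equals(scheme) && port == 80)
                || ("https".equals(scheme) && port == 443)) {
            port = -1;
        }

        // Treat an empty path as the root path.
        String path = uri.getPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }

        try {
            // Passing null for user info and fragment drops both.
            return new URI(scheme, null, host, port, path, uri.getQuery(), null)
                    .toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

With these assumed rules, "HTTP://Example.COM:80/a/./b/../c.html#top" and "http://example.com/a/c.html" normalize to the same string, so a crawler could use the result as a key to avoid visiting the same page twice. Percent-encoding case and query parameter order are not handled in this sketch.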