Recent results suggest that World Wide Web traffic continues to increase at exponential rates [9]. One way to reduce web traffic and speed up web accesses is caching. Caching documents close to the clients that need them reduces the number of server requests and the traffic associated with them. Unfortunately, recent results suggest that the maximum cache hit rate achievable by any caching algorithm is usually no more than 40% to 50%; that is, regardless of the caching scheme in use, one out of every two documents cannot be found in the cache [1]. The reason is simple: most users browse and explore the web in search of new information, and caching previously seen documents is of limited use in such an environment.
One way to further increase the cache hit ratio is to anticipate future document requests and preload, or prefetch, these documents into a local cache. When the client later requests these documents, they will already be available in the local cache, and it will not be necessary to fetch them from the remote web server. Successful prefetching thus reduces the web latency observed by clients and lowers both server and network load. In addition, off-line prefetching performed after-hours, when bandwidth is plentiful and inexpensive, may reduce overall cost and improve performance. At the same time, prefetching facilitates off-line browsing, since new documents that the user will most probably be interested in are automatically prefetched.
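To make the mechanism concrete, the following is a minimal sketch of a prefetching local cache, assuming a simple in-memory store and a `fetch` callback that retrieves a document from the remote server; the names and structure are illustrative assumptions, not the system described in this paper.

```python
class PrefetchCache:
    """Illustrative sketch of a local cache that supports prefetching."""

    def __init__(self, fetch):
        self.fetch = fetch  # callback that retrieves a document remotely
        self.store = {}     # url -> document body

    def prefetch(self, urls):
        # Fetch anticipated documents ahead of time (e.g., after-hours,
        # when bandwidth is plentiful and inexpensive).
        for url in urls:
            if url not in self.store:
                self.store[url] = self.fetch(url)

    def get(self, url):
        # A hit is served locally, so the client observes no network latency.
        if url in self.store:
            return self.store[url]
        # A miss falls back to the remote web server, as in ordinary caching.
        doc = self.fetch(url)
        self.store[url] = doc
        return doc
```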
Unfortunately, it is difficult, if not impossible, to guess future user needs, since no program can predict the future. In this paper we propose Top-10, an approach to prefetching that successfully addresses these concerns by combining three elements: server knowledge, which identifies the most popular documents; proxies, which aggregate requests to a server and amortize the cost of prefetching over a large number of clients; and adaptation to each client's evolving access patterns and needs.
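As an illustration of the first element, the sketch below ranks documents by request count from a server access log, assuming the log is available as an iterable of requested URLs; the function name and interface are hypothetical, and the actual Top-10 policy is specified later in the paper.

```python
from collections import Counter

def top_n_documents(access_log, n=10):
    """Return the n most frequently requested URLs in an access log.

    This is only an illustrative sketch of the popularity ranking that
    a server could compute from its own logs; a proxy could periodically
    request this list and prefetch the documents on behalf of its clients.
    """
    counts = Counter(access_log)
    return [url for url, _ in counts.most_common(n)]

# Example: rank a small hypothetical log of requested URLs.
popular = top_n_documents([
    "/index.html", "/news.html", "/index.html",
    "/about.html", "/news.html", "/index.html",
], n=2)
# -> ["/index.html", "/news.html"]
```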