We envision the operation of Top-10 prefetching in a client-proxy-server framework (see fig. 1). Prefetching occurs at both the client and the proxy level. User-level clients prefetch from first-level proxies to cater to the needs of particular users. The benefits and costs of prefetching on user-level clients are discussed in section 3.2. First- and second-level proxies play both the client and the server role. First-level proxies are clients to second-level proxies, and they prefetch and cache documents for user-level clients (i.e., browsers). Second-level proxies are clients to various popular servers, from which they prefetch and cache documents to be served to their own clients. The performance results of Top-10 prefetching at first- and second-level proxies are discussed in sections 3.3 and 3.4 respectively.
We picture first-level proxies at the department level of companies or institutions and second-level proxies at the level of organizations or universities. Even though this framework implies a structure, the structure is dynamic and may support dynamic proxy configuration schemes. In any case, Top-10 prefetching may be transparent to the user and cooperate with the caching mechanisms of the browser or the proxy.
Figure 2: Top-10 prefetching depends on the cooperation of the various http servers and a client-side prefetching agent
The implementation of Top-10 prefetching is based on the cooperation of server and client-side entities (see fig. 2). On the server-side, the Top-10 daemon processes the access logs of the server, and compiles the Top-10, the list of the most popular documents on that server. Then, it updates a web page presenting this information and the Top-10 is served as yet another document by the http server. The frequency of evaluating the Top-10 depends on how frequently the content on the server changes. In section 3.5 we investigate this issue further.
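The server-side step described above can be sketched as follows. This is only an illustration under stated assumptions: it assumes the access log uses the Common Log Format, and the names `REQUEST_RE` and `compile_top_n` are hypothetical, not part of the Top-10 daemon itself. It also counts every logged request, whereas a real daemon might filter by status code or document type.

```python
import re
from collections import Counter

# Assumption: the log is in Common Log Format, where the request appears
# as a quoted field such as "GET /index.html HTTP/1.0".
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+)[^"]*"')


def compile_top_n(log_lines, n=10):
    """Return the n most requested document paths, most popular first."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            # Count one access for the requested path.
            counts[match.group(1)] += 1
    return [path for path, _ in counts.most_common(n)]
```

The resulting list could then be written into the web page that the http server publishes as the Top-10 document.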
On the client side, the prefetching agent logs all http requests of the client and adapts its prefetching activity based on them. The prefetching agent cooperates with a proxy that filters all http requests initiated by the client. If an http request can be served from the local cache of prefetched documents, the proxy serves the document from the cache. Otherwise, it forwards the request to the web server or the next-level proxy. Daily or weekly, depending on the configuration, the prefetching agent goes through the client access logs, which contain all http requests made by the client, and creates the prefetching profile of the client, that is, the list of servers from which prefetching should be activated. A server is included in the profile if the number of documents requested from it during the previous time interval exceeds ACCESS_THRESHOLD. Finally, based on the prefetching profile of the client, the prefetching agent requests the most popular documents from the servers that have been activated for prefetching. The number of documents prefetched from each server is equal to the number of requests to that server during the last time interval or TOP_10, whichever is less.
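The profile-building rule above can be illustrated with a short sketch. The function name `build_prefetch_profile` and the default value chosen for `ACCESS_THRESHOLD` are assumptions made for illustration only; the text does not fix a particular value here. Servers are identified by the host part of each requested URL.

```python
from collections import Counter
from urllib.parse import urlsplit

ACCESS_THRESHOLD = 5  # assumed value, for illustration only
TOP_10 = 10           # upper bound on documents prefetched per server


def build_prefetch_profile(requested_urls,
                           access_threshold=ACCESS_THRESHOLD,
                           top_n=TOP_10):
    """Map each activated server to the number of documents to prefetch.

    A server is activated when the client requested more than
    access_threshold documents from it in the last time interval.
    The prefetch count is min(requests to that server, top_n).
    """
    per_server = Counter(urlsplit(url).netloc for url in requested_urls)
    return {
        server: min(count, top_n)
        for server, count in per_server.items()
        if count > access_threshold
    }
```

For example, a client that made 12 requests to one server and 3 to another during the interval would, under these assumed values, prefetch 10 documents from the first server and none from the second.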
Although the details of prefetching Top-10 documents can be fine-tuned to suit each client, the underlying principle of prefetching only popular documents is powerful enough to lead to successful prefetching. An advanced prefetching agent may request and take into account additional parameters, such as document size, percentile document popularity, and client resources, to make a more informed decision on what should be prefetched.