Next: Conclusions
Up: A Top-10 Approach to
Previous: Previous Work
In this paper we present a systematic approach to reducing the
latency experienced by web clients by
prefetching documents before users actually request them.
Prefetching has not been widely employed in the Web so far, for several reasons:
(i) a prefetching robot can easily get out of control and
start prefetching everything that is out there,
(ii) prefetching may be ineffective, since nobody knows
what a client will want to access,
(iii) proxies may delay clients, and
finally
(iv) prefetching over high-speed interconnection networks may result in
only minor performance improvements.
We believe our prefetching approach addresses all of these concerns,
for the following reasons:
-
The Top-10 approach to prefetching uses several well-defined
thresholds to ensure
that only a small number of useful documents are prefetched.
The prefetched documents increase total
traffic by no more than 10%-20%. Our prefetching approach sometimes
even reduces total network traffic by aggregating several
clients' requests. Thus, it cannot get out of control and lead to traffic
chaos.
-
We prefetch only documents of proven popularity: those that most clients
have already accessed, and that future clients will probably want to access.
Thus, the risk of fetching useless documents is minimized.
Essentially, by prefetching only popular documents,
the risk of mispredicting the future is significantly reduced.
-
Although a user sometimes ``feels'' that it is faster to retrieve a document
directly from a server instead of going through a proxy,
this is only because most of the other users
go through a proxy and thereby avoid putting
unnecessary load on the server. If everybody started accessing the
servers without intervening proxies, most servers would collapse, and
most interconnection lines close to the servers would saturate.
The only way to avoid this collapse is to place proxies somewhere
along the traffic route from a client to a server.
One could argue that aggregating several clients through a proxy
will slow down all clients and eventually
saturate the proxy. Although this may be true to some extent,
it cannot easily happen. Proxies are usually powerful computers
capable of handling hundreds of requests per second. If each interactive
user browsing the Web makes one request every few seconds, then
a proxy will be able to handle up to a few
thousand users before it can no longer cope with additional requests.
Realistically, most departments do not have enough users to
generate the load needed to saturate a proxy.
-
Although prefetching over a high-speed interconnection may yield only
minor performance improvements, it certainly does not hurt
the clients, and it may benefit the servers by shifting some of the
load from the server to the proxies.
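The threshold-based, popularity-driven selection described in the first two points can be sketched as follows. The concrete parameters (a minimum access count, a 10% traffic budget, and a cap of ten documents) are illustrative assumptions, not the paper's exact values:

```python
def select_top10(access_log, total_bytes, min_accesses=5,
                 traffic_cap=0.10, max_docs=10):
    """Pick the most popular documents to prefetch, subject to a
    popularity threshold and a cap on the extra traffic generated.

    access_log: {url: (access_count, size_in_bytes)}
    total_bytes: total traffic observed so far, used to size the budget.
    """
    # Keep only proven-popular documents.
    popular = [(url, count, size)
               for url, (count, size) in access_log.items()
               if count >= min_accesses]
    # Most frequently accessed documents first.
    popular.sort(key=lambda t: t[1], reverse=True)

    # Prefetching may add at most a fixed fraction of the observed traffic.
    budget = traffic_cap * total_bytes
    chosen, spent = [], 0
    for url, count, size in popular[:max_docs]:
        if spent + size > budget:
            break
        chosen.append(url)
        spent += size
    return chosen
```

With the thresholds above, a document is prefetched only if it is among the most popular, has been requested often enough, and still fits in the remaining traffic budget, which is what keeps the scheme from "getting out of control."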
In addition to addressing previous concerns,
we believe that prefetching in general,
and Top-10 in particular, has several advantages:
-
Prefetching can be used during off-peak periods to download useful documents
at low transfer costs. By transferring documents during low-rate
periods, prefetching pays for itself, and may even result in a
profit.
-
Prefetching reduces client latency by spreading the load from busy
servers to idle clients/proxies. Thus, it improves user turnaround
time and user productivity.
-
Prefetching can also be used to organize up-to-date digital repositories
of related documents. For example, a prefetch agent may download
all documents related to a specific research topic, update them regularly
and make them available to local users.
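The off-peak transfer idea in the first advantage above can be sketched as a simple scheduling guard. The 1am-6am window and the `fetch` callback are hypothetical placeholders for whatever low-rate period and retrieval mechanism a site actually has:

```python
from datetime import datetime, time

# Assumed off-peak window (illustrative only): 1am to 6am local time.
OFF_PEAK_START = time(1, 0)
OFF_PEAK_END = time(6, 0)

def in_off_peak(now=None):
    """Return True when transfers run at low cost."""
    t = (now or datetime.now()).time()
    return OFF_PEAK_START <= t < OFF_PEAK_END

def prefetch_if_off_peak(urls, fetch, now=None):
    """Download the given documents only during the off-peak window;
    otherwise defer and download nothing."""
    if not in_off_peak(now):
        return []
    return [fetch(url) for url in urls]
```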
Although prefetching raises several concerns,
we believe that the Top-10 approach has been specifically designed to
address these concerns and bring out the
benefits of prefetching. Top-10 takes the risk out of prefetching
by doing it in a controlled way that yields significant
performance improvements with only a minor increase in traffic.
Evangelos Markatos
Fri Nov 1 16:38:26 EET 1996