head	1.4;
access;
symbols;
locks; strict;
comment	@# @;


1.4
date	2000.10.10.03.10.33;	author brianr;	state Exp;
branches;
next	1.3;

1.3
date	2000.10.10.03.09.37;	author brianr;	state Exp;
branches;
next	1.2;

1.2
date	2000.10.10.03.07.42;	author brianr;	state Exp;
branches;
next	1.1;

1.1
date	2000.10.10.02.55.00;	author brianr;	state Exp;
branches;
next	;


desc
@old version
.
@


1.4
log
@s/puts/leaves/
@
text
I've been thinking about this for a while now. Gnutella search results
currently contain the IP of the person with a match for the search request.
But wouldn't it be great if there was a way to get the file back to the end
user without revealing the possessor's IP address?

If one or more hosts between the file possessor and the requester
supported a special extension whereby the search results were rewritten to
traverse an HTTP proxy chain created on the fly, privacy would be improved.
Furthermore, if those HTTP proxy chains supported caching, performance might
be improved too. 

Here's how it works:

Host X joins the network, connecting to host Y, which is connected to host
Z. Host Y supports the new anonymous downloading feature. Host Z does not
support the anonymous downloading feature.

Host A, which may or may not support anonymous downloading, connects to the
gnutella network and searches for a document. The search request is
broadcast to attached hosts B and C. Host C happens to be connected to host
Z, which is connected to Y, and thus X. Host X sees that it has received a
search request for a document it has from host Y, and sends a routed message
back through Y to the gnutella network. Host Y rewrites the search result
to include its own IP address. It also makes an entry in a time-expired
table and agrees to proxy the request to host X for anyone that asks. If for
some reason Y cannot agree to proxy the request (perhaps it is over its
bandwidth cap) it will pass the search result unmodified to Z. When a
request comes for that document, Y will fetch it from X. Host Y hands off
the rewritten packet to Z, which goes to C, B, and A. From host A's
perspective, Y had the file, not X. At Y's discretion, Y will enter the
file it got from X in its cache and also answer search requests matching it
affirmatively.

Now the response is passed up the chain, eventually to host A. Host A
requests the document from host Y, which proxies it to host X, which has the
document. Who did the user get the document from? They think they got it
from Y, but did they? No. They got it from X. Even if host Y leaves host X's
IP in the response, how can we be sure host Y isn't just forwarding the
request for someone else? Even when responding to requests that can be
fulfilled locally, servers should insert a random delay. In fact, if such a
system is in use, there is no reliable way to prove who you got a document
from unless you can monitor the Internet connections between every site
involved in the transaction.

Further complicating the matter might be the use of encryption and
connection multiplexing between involved hosts. Hosts X and Y, for example,
might communicate all information including proxied requests over a single
encrypted channel. They might pass fodder on that channel when no
transactions were in progress to reduce the effectiveness of traffic
analysis.

One other great advantage is that caching could be employed to greatly improve
download rates for popular files. Host Y, for example, could agree to keep
around a few hundred megs of recently downloaded files. It then could
respond to search requests for those files. 

With additional client support, a system for finding other files with the
same checksum as a search result could be employed. A round-robin DNS of
hosts that agree to answer requests for a common namespace of files could be
established. If a given host in the DNS listing did not have a file locally,
it would try to get it. Since gnutella file transfers are based purely on
HTTP, such a DNS entry could be used in responses to improve speed for
gnutella clients fetching documents through a caching proxy server network
such as squid. The use of such a common namespace for responses would be
negotiated by the gnutella client.
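
To make the rewriting step on host Y a bit more concrete, here is a rough
Python sketch of the time-expired table described earlier. It is only an
illustration under assumptions of mine: the ProxyTable class, the dict
fields "ip", "index" and "name", and the ten minute lifetime are all made
up for the example and are not part of any gnutella client or of the
protocol itself.

```python
import itertools
import time


class ProxyTable:
    """Hypothetical time-expired table kept by a proxying host such as Y."""

    def __init__(self, own_ip, ttl_seconds=600.0, clock=time.monotonic):
        self.own_ip = own_ip
        self.ttl = ttl_seconds          # assumed lifetime of an entry
        self.clock = clock              # injectable so expiry can be tested
        self._next_index = itertools.count(1)
        self._entries = {}  # local index -> (origin ip, origin index, expiry)

    def rewrite(self, result, can_proxy=True):
        """Return the search result to forward toward the requester."""
        if not can_proxy:
            # e.g. over the bandwidth cap: pass the result on unmodified
            return result
        local_index = next(self._next_index)
        expiry = self.clock() + self.ttl
        self._entries[local_index] = (result["ip"], result["index"], expiry)
        # Advertise our own address; the original result is left untouched.
        return dict(result, ip=self.own_ip, index=local_index)

    def origin_for(self, local_index):
        """Return (origin ip, origin index) for a download, or None."""
        entry = self._entries.get(local_index)
        if entry is None:
            return None
        origin_ip, origin_index, expiry = entry
        if self.clock() >= expiry:
            del self._entries[local_index]  # entry has timed out
            return None
        return origin_ip, origin_index
```

Host Y would call rewrite() on each search result it routes back toward
the requester, and origin_for() when a download request arrives, to learn
which host to fetch the file from. Handing out a fresh local index per
result keeps two origins that offer the same file name from colliding in
the table.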
@


1.3
log
@removed silly capitalization
@
text
@d37 1
a37 1
from Y, but did they? No. They got it from X. Even if host Y puts host X's
@


1.2
log
@removed part about negotiation. In reality, any gnutella server between
X and A can rewrite search results and agree to proxy.
@
text
@d61 1
a61 1
it would try to get it. Since GNUtella file transfers are based purely on
d63 1
a63 1
GNUtella clients fetching documents through a caching proxy server network
d65 1
a65 1
negotiated by the GNUtella client.
@


1.1
log
@Initial revision
@
text
@d1 1
a1 1
I've been thinking about this for a while now.. GNUtella search results
d6 5
a10 5
Now if some users between the user and the destination supported a special
extension to the GNUtella protocol, this would be possible. Each server in
the path between a searcher and the hit knows the return path. This is why
search results are directed rather than broadcasted, while still traversing
the GNUtella network.
d15 2
a16 3
Z. Host Y supports the new anonymous downloading feature, and during the
initial connection to host X, the use of this feature is negotiated. Host Z
does not support the anonymous downloading feature.
d19 1
a19 1
GNUtella network and searches for a document. The search request is
d22 11
a32 8
search request for a document it has from host Y, which supports the
anonymous downloading extension. It returns a result without its IP address,
but with a special string instead. Host Y directs this response to Z, but
since Z does not support anonymous downloading, host Y replaces the special
string with Y's IP address. It also makes an entry in a time-expired table
and agrees to proxy the request to host X for anyone that asks. If for some
reason Y can not agree to proxy the request (perhaps it is over its
bandwidth cap) it will put host X's IP in and forward the response to Z. 
d42 2
a43 2
from unless you can monitor the Internet connections of every site involved
in the transaction.
@