Also posted to the mailing list.

Important Background Information:

A soundex hash is a 4-character representation of any English word, consisting of a letter followed by three digits. There is a high collision rate. When two or more words collide, they all have a similar phonetic structure. See also the Perl Text::Soundex manual page and _The Art of Computer Programming, Volume 3_ by Donald Knuth.

My posting follows:

I've figured out a way to create a search engine that can tell you which sites have hits for a search without the search engine actually knowing what each site has. This is particularly important for easing liability issues for search engine operators. It will probably become a part of my distributed anonymous Napster clone.

Here's how it works:

Each site makes up its list of files. Then it divides the filenames into words (a word being any alphanumeric sequence delimited by newlines, whitespace, or punctuation). Those words are hashed with soundex. The list of soundex tokens is sorted, uniq'd, and then sent to the search engine.

When a user searches for a keyword, the search term is hashed into a soundex token, and only that token is sent to the search engine. The search engine responds with contact information for every site that has a file with a word that "sounds like" the requested keyword.

The RIAA or other groups couldn't get search engine operators to ban certain search terms, because the operators never see the search terms, only their soundex hashes. The operators couldn't be made to ban certain filenames, because they never see the filenames. And they could refuse to ban particular soundex hashes, because doing so would ban every word that sounds like the targeted search term.

-- 
Brian Ristuccia
brianr@osiris.978.org
bristucc@nortelnetworks.com
bristucc@cs.uml.edu
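
For illustration only, here is a minimal sketch of the scheme described in the post, written in Python rather than with Perl's Text::Soundex. It is not the author's code. The names `soundex`, `site_tokens`, `SearchEngine`, `register`, and `lookup` are hypothetical, and so is the decision to skip words that contain no letters. The `soundex` function follows the commonly described American Soundex rules and may differ from Text::Soundex in edge cases.

```python
import re

# Letter -> digit table for the commonly described American Soundex rules.
_CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        _CODES[ch] = str(digit)


def soundex(word):
    """Return the 4-character soundex hash of word, or None if it has no letters."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return None  # assumption: purely numeric words are not indexed
    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "hw":  # h and w do not separate letters with the same code
            prev = code
    return (result + "000")[:4]  # pad with zeros, then cut to 4 characters


def site_tokens(filenames):
    """What a site would send: the sorted, de-duplicated soundex tokens of its words."""
    tokens = set()
    for name in filenames:
        for word in re.split(r"[^A-Za-z0-9]+", name):
            code = soundex(word)
            if code:
                tokens.add(code)
    return sorted(tokens)


class SearchEngine:
    """Hypothetical engine: it stores only soundex tokens and site contact strings."""

    def __init__(self):
        self.index = {}  # soundex token -> set of site contact strings

    def register(self, contact, tokens):
        for token in tokens:
            self.index.setdefault(token, set()).add(contact)

    def lookup(self, token):
        return sorted(self.index.get(token, ()))
```

A client would hash its keyword locally and send only the token, for example `engine.lookup(soundex("Rupert"))`. Because "Rupert" and "Robert" hash to the same token, a site holding a file with "Robert" in its name would be returned, which is the "sounds like" behaviour the post relies on.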