04-10-2004, 01:58 AM   #3
manalone
Once upon a time...
 
Sorry kel, I disagree.

1) The basic HITS algorithm is available. It's based on building the adjacency matrix of a base set of pages and then repeatedly updating a hub score and an authority score for each page from that matrix.
If you google for HITS algorithm, there are several pages with detailed explanations of the code.

Obviously, it's not exactly what Google does, but it is the same basic algorithm.
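
For what it's worth, here is a minimal Python sketch of the basic HITS iteration. It assumes the base set has already been collected, and it stores the adjacency matrix sparsely as a dict of outgoing links. The function names, the sample graph and the fixed iteration count are mine, purely for illustration.

```python
import math


def normalize(scores):
    """Scale the scores so that their squares sum to 1."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Run the basic HITS update on a small link graph.

    links maps each page to the set of pages it links to, i.e. the
    non-zero entries of one row of the adjacency matrix.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        # Hub score: sum of the authority scores of pages linked to.
        new_hub = {
            p: sum(new_auth[dst] for dst in links.get(p, ()))
            for p in pages
        }
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth


# Hypothetical base set: "a" and "b" both point at "c", and "c" points at "d".
links = {"a": {"c"}, "b": {"c"}, "c": {"d"}, "d": set()}
hub, auth = hits(links)
```

In this toy graph "c" should come out as the strongest authority, with "a" and "b" as the strongest hubs. A real implementation would stop on convergence rather than after a fixed number of rounds.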

2) I don't think that the relational schema is an issue, since HITS and other algorithms are based on well-defined data. The relational nature is actually a help, since you can do a lot of the work in situ, inside the database.
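
As a rough illustration of doing the work in situ, the hub and authority updates can be written as SQL over a link table. This sketch uses Python's sqlite3 module with a made-up schema: the links and scores tables, their columns and the hits_step helper are all assumptions for the example, not anyone's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (src TEXT, dst TEXT);
    CREATE TABLE scores (page TEXT PRIMARY KEY, hub REAL, auth REAL);
""")
# Hypothetical base set: "a" and "b" link to "c", and "c" links to "d".
conn.executemany(
    "INSERT INTO links VALUES (?, ?)",
    [("a", "c"), ("b", "c"), ("c", "d")],
)
# Every page that appears in a link starts with hub = auth = 1.
conn.execute("""
    INSERT INTO scores (page, hub, auth)
    SELECT page, 1.0, 1.0
    FROM (SELECT src AS page FROM links UNION SELECT dst FROM links)
""")


def hits_step(conn):
    """One round of HITS, with the sums computed inside the database."""
    # Authority: sum of the hub scores of the pages linking in.
    conn.execute("""
        UPDATE scores SET auth = COALESCE(
            (SELECT SUM(s.hub) FROM links l
             JOIN scores s ON s.page = l.src
             WHERE l.dst = scores.page), 0.0)
    """)
    # Hub: sum of the authority scores of the pages linked to.
    conn.execute("""
        UPDATE scores SET hub = COALESCE(
            (SELECT SUM(s.auth) FROM links l
             JOIN scores s ON s.page = l.dst
             WHERE l.src = scores.page), 0.0)
    """)
    # Normalize each column; the square root is taken in Python.
    for col in ("hub", "auth"):
        total = conn.execute(
            f"SELECT SUM({col} * {col}) FROM scores"
        ).fetchone()[0]
        if total:
            conn.execute(
                f"UPDATE scores SET {col} = {col} / ?", (total ** 0.5,)
            )


for _ in range(20):
    hits_step(conn)

top_authority = conn.execute(
    "SELECT page FROM scores ORDER BY auth DESC LIMIT 1"
).fetchone()[0]
```

The link table is just the adjacency matrix stored one row per non-zero entry, so the join plays the role of the matrix-vector product.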

tuckdiddy, you should try one of these books:
Modern Information Retrieval
Ricardo Baeza-Yates, Berthier Ribeiro-Neto, Addison Wesley, 1999.

Data Mining: Concepts and Techniques
Jiawei Han and Micheline Kamber, Morgan Kaufmann, 2000.

They have useful info on building search engines.
__________________
--
Man Alone
=======
Abstainer: a weak person who yields to the temptation of denying himself a pleasure.
Ambrose Bierce, The Devil's Dictionary.