How to say 相似網頁檢索 in English
Chinese pinyin [xiāngsìwǎngyèjiǎnsuǒ]
相似網頁檢索
English
find similar
- 相 : Ⅰ noun 1 (相貌; 外貌) looks; appearance 2 (坐、立等的姿態) bearing; posture 3 [physics] (相位...
- 網 : Ⅰ noun 1 (捕魚捉鳥的器具) net 2 (像網的東西) thing which looks like a net 3 (像網一樣的組織或...
- 頁 : Ⅰ noun (張) page; leaf Ⅱ measure word (面) page
- 檢 : Ⅰ verb 1 (查) check up; inspect; examine 2 (約束; 檢點) restrain oneself; be careful in one's c...
- 索 : Ⅰ noun 1 (大繩子; 大鏈子) a large rope 2 (姓氏) a surname Ⅱ verb 1 (搜尋; 尋找) search 2 (要; ...
- 相似 : 1. (相像) resemble; be similar; be alike 2. (相像處; 類似物) similarity; similitude; analogue
- 網頁 : web page
- 檢索 : retrieval; retrieve; search; searching
Document similarity search finds documents similar to a given query document and returns a ranked list of them to the user. It is widely used in text and web systems such as digital libraries and search engines. Traditional retrieval models, including Okapi's BM25 model and SMART's vector space model with length normalization, can handle this problem to some extent by treating the query document as a long query.
文檔相似搜索指從文檔集中檢索與給定查詢文檔相似的文檔。對於給定的查詢文檔,我們期望文檔相似搜索系統能夠返回一個按相似度排序的相似文檔列表。文檔相似搜索技術已經被廣泛應用到電子圖書館,搜索引擎等系統中,例如CiteSeer.IST科學文獻數字圖書館的相似文獻推薦功能, Google的相似網頁查詢功能等。
The paper then analyzes in depth several key technologies, including web page content extraction, website topological structure analysis, website subject analysis and indexing, and website retrieval. In particular, it proposes some novel solutions, such as a content extraction method based on the spacing between tags, a website tree construction method based on the similarity of directories in URLs, and a website concept acquisition method based on website structure. Finally, on the basis of these algorithms and theory, an intelligent website retrieval system is built, and experiments show that it achieves better results.
根據以上研究目的,本文首先分析了智能網站檢索技術的構造與實現,提出了基於主題標引的智能網站檢索系統的系統結構與實現策略,之後對系統中的主題分析、標引與檢索等關鍵技術進行了深入分析,針對主要技術難點重點討論了網頁正文抽取,網站拓撲結構分析,網站主題獲取等相關實現技術,提出了基於標簽間距的正文抽取演算法,基於url目錄相似度的網站結構分析演算法和基於網站結構的網站主題概念獲取演算法等解決方案。
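The first example sentence above describes document similarity search with a length-normalized vector space model, where the whole query document is treated as a long query. The following is only a minimal sketch of that idea, not the method of the quoted paper: it assumes whitespace tokenization and a smoothed TF-IDF weighting, and the function names `tokenize`, `tfidf_vectors`, and `find_similar` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a sketch.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one unit-length TF-IDF vector (as a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed weighting, chosen to avoid zero weights).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vec = {t: f * idf[t] for t, f in tf.items()}
        # Length normalization: scale each vector to unit Euclidean length.
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def find_similar(query_doc, corpus):
    """Rank corpus documents by cosine similarity to the query document."""
    vectors = tfidf_vectors(corpus + [query_doc])
    query_vec = vectors[-1]
    scores = []
    for i, vec in enumerate(vectors[:-1]):
        # Vectors are unit length, so the dot product is the cosine.
        score = sum(w * vec.get(t, 0.0) for t, w in query_vec.items())
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


corpus = [
    "web page retrieval with search engines",
    "cooking pasta with tomato sauce",
    "digital library search and retrieval",
]
ranking = find_similar("search engine retrieval of similar web pages", corpus)
for index, score in ranking:
    print(index, round(score, 3))
```

The result is a list of `(corpus index, score)` pairs sorted from most to least similar, which corresponds to the ranked list of similar documents the sentence mentions. A BM25-style scorer could be substituted for the cosine score without changing the overall shape of the sketch.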