What does "chinese corpus" mean in Chinese?

chinese corpus: explanation
中文語料庫
  • chinese : adj. 中國(人)的;中國(話)的。 the Chinese Wall 萬里長城。n. 〈sing. , pl. 〉 中國人;中國話,漢語。
  • corpus : n. (pl. corpora) 1 軀體,身體;〈主謔〉屍體。2 (法典等的)集成,全集。3 (事物的)主體;【法律】主...
  1. The Institute of Computational Linguistics at Peking University has completed the basic processing of a contemporary Chinese corpus of 27 million Chinese characters.

    摘要北京大學計算語言學研究所已經完成了一個有2700萬漢字的現代漢語語料庫的基本加工。
  2. The main content of this tourism culture festival includes a display of Fangshan District's finest resources, artistic performances, and other activities. The ceremonial part is solemn and elegant, while the performance part is joyful and wild, highlighting the countryside flavor, shaping a distinct countryside theme and a sense of affinity, and showing the new face of Fangshan as a major tourism district and a renowned cultural district. Themed "Happy Countryside, Frolic in Fangshan", the festival is built around three corpus (main) activities and six supporting activities. The three main activities are the opening ceremony of the 10th Fangshan Tourism Culture Festival, held together with the opening celebration of the Shenglianshan Scenic Resort; the Fangshan Economic and Trade Fair; and the publicity and education campaign "Rekindling the Fire of Human Civilization: The Motherland Is Good, the Capital Is Beautiful". The six supporting activities are the "Frolic Festival" at the Baicaopan campground; the Shidu River Lantern Festival; the Tangshang village folk "Bawang whip" performance celebrating July 1; the Yunju Temple blessing series; the treasure hunt across the four caves of Shihua Cave, Yinhu Cave, Xianqi Cave, and Yunshui Cave; and a one-day "going to the countryside" farming experience tour. Together they shape the overall image of "Beijing's ancestral roots, kingdom of karst caves". The festival also launches the northern Fangshan tourism corridor, adjusts the economic structure of the mountain area, improves the ecological environment, helps farmers increase their income, and speeds up the development of Fangshan tourism, offering more and better choices to visitors from the city and surrounding areas for travel, leisure, and recreation, and to Chinese and foreign businesspeople for investment and starting businesses. It aims to attract more people to travel to Fangshan and invest in Fangshan, and to further promote the all-round development of the district.

    儀式部分莊重典雅,表演部分歡樂狂野,突出郊野色彩,塑造鮮明的郊野主題和親和意識,展現房山旅遊大區文化名區的新氣象。本屆旅遊文化節以「 happy郊野撒歡房山」為主題,以第十屆房山旅遊文化節開幕式暨聖蓮山風景度假區開業慶典儀式房山經貿洽談會續燃人類文明之火「祖國好京城美」宣傳教育活動三項主體活動和白草畔野營地「撒歡兒節」十渡河燈節慶「七一」堂上鄉村民俗霸王鞭表演雲居寺祈福迎祥系列活動「石花洞銀狐洞仙棲洞雲水洞」四洞尋寶比賽及下鄉務農「插隊」體驗一日游六項支撐活動為主要內容,塑造「北京根祖,溶洞王國」的整體形象推出房山北線旅遊走廊,調整山區經濟結構,優化生態環境,促進農民增收,加快房山旅遊黃金圈建設步伐,為全市及周邊遊客出遊休閑娛樂和中外客商投資創業提供更多更理想的選擇。吸引更多的人旅遊到房山投資進房山,進一步推動該區全面發展。
  3. The problem is critical since, in the classical Chinese corpus developed by Academia Sinica over the past 14 years, there are more than 9,600 Chinese characters without appropriate codes. In this paper, we present a database of Chinese graphemes through which the structure of any missing character, as well as its attributes, can be represented.

    目前,對于繼承漢文化的地區來說,缺字問題已是一個共同的夢魘,凡是遇到漢字的人名、地名、史料等等,都有相當嚴重的缺字問題;所以,缺字問題已是一個國際性大家都關心的問題。
  4. A framework for dialectal Chinese speech recognition is proposed and studied, in which a relatively small dialectal Chinese speech corpus (in other words, a corpus of Chinese influenced by the speaker's native dialect) and dialect-related knowledge are used to transform a standard Chinese (Putonghua, abbreviated as PTH) speech recognizer into a dialectal Chinese speech recognizer.

    但在實際中,多數人所說的普通話因受其方言背景的影響而不十分標準,這大大影響了語音識別的性能。一種解決方案是,對每種方言都收集足夠多的語音數據然後構造相應的識別器,但由於漢語方言種類多且差異大,時間和成本都是很高的。
  5. Two kinds of knowledge sources are explored: one is expert knowledge and the other is a small dialectal Chinese corpus.

    方言相關的知識源有兩種,一是專家,一是小規模的方言背景普通話數據庫。
  6. First, for character and word errors in the text, the adjacency relations between characters or words are used, and errors are checked by the co-occurrence probability of character strings. Second, for syntax errors, based on statistics and analysis of a large-scale contemporary Chinese corpus, the predicate head word and the other sentence constituents are recognized in order to check for syntax errors. Third, for semantic errors, a semantic dependency tree is built from HowNet knowledge, and a method based on semantic dependency analysis is presented that computes sentence similarity to check for semantic errors.

    對于文本字詞錯誤的檢查,本文主要利用了字詞二元接續關系,根據同現概率檢查文本字詞錯誤;對于文本語法錯誤的檢查,本文利用教研室已有的一個大規模語料庫,通過對語料庫進行統計分析,獲得語法查錯所需要的語言規律和知識,利用謂語中心詞識別和其他句子成分識別的方法,檢查文本語法結構上的錯誤;對于文本語義錯誤的檢查,本文主要利用知網知識得到語義依存樹,通過對句子的有效搭配對的相似度計算檢查語義錯誤。
  7. In this paper, taking the synonyms and antonyms in "Wu Yue Chun Qiu" as the research subject, the author applies a scientific method that combines synchronic description with diachronic comparison and statistics with analysis, reveals the linguistic characteristics of "Wu Yue Chun Qiu", and makes clear that it has value as a linguistic corpus and occupies an important position in the history of the Chinese language.

    以《吳越春秋》中的同義詞與反義詞為研究對象,運用共時描寫與歷時比較、統計與分析相結合的科學方法,揭示了《吳越春秋》的語言特色,肯定了其重要的語料價值和在漢語史上的地位。
  8. On this foundation, a corpus-based algorithm was designed and implemented to acquire and filter binary semantic pattern rules automatically. The algorithm adopts a metarule-guided data mining method for cross-level association rules to find the semantic regularities of word combinations in a Chinese phrase corpus. Statistical results are then used to filter the findings.

    在此基礎上,本文設計並實現了基於語料庫的二元語義模式規則自動挖掘和優選演算法,該演算法先採用數據挖掘中元規則制導的交叉層關聯規則挖掘方法,自動發現漢語短語熟語料庫中詞語兩兩組合的語義規律,再根據統計結果自動優選后轉換生成候選二元語義模式規則集。
  9. The 25 presentations were organised into 7 sessions. They included papers on corpus-based and statistics-based studies of various synchronic and diachronic aspects of the Chinese language from the perspectives of grammar, sociology, political science, and cognition; on Chinese database design and encoding issues; and on Chinese computational linguistics. There were also papers involving Japanese and Thai.

    會議分七節舉行,共宣讀文章二十五篇,內容涉及各種共時及歷時,基於語料庫的和基於統計的理論語言學、社會語言學、政治學、語言認知等學術領域,以及中文語料庫設計及漢字顯示、漢語計算語言學等。
  10. The effects of conjunctions on Chinese EFL learners' CET-6 writing: a view from the college English learners' corpus

    從語料庫看連接詞在中國學生六級作文中的作用
  11. Bilingual corpus-based Chinese-English dictionary construction

    平行語料庫中雙語術語詞典的自動抽取
  12. In this paper, we try to explore morphological rules for Chinese using a corpus-based learning method.

    本論文從電腦處理語言的角度探討中文構詞律的表達方式及產生的方法。
  13. (2) The choice of classifier strongly affects the classification result; for example, the center-vector algorithm obtains better classification results than the other two algorithms. With the character feature, the average recall is 80.73% and the average precision is 82.94%; with the Chinese-word feature, the average recall is 83.6% and the average precision is 85.97%. Different corpora also influence the classification result. For example, the average recall is 89.31% and the average precision is 88.33% when news web pages from the website www.sina.com.cn are used as the corpus, with a classifier built by the center-vector algorithm and Chinese words selected as the feature.

    對三種分類器分別以字、詞為特徵進行分類測試、分析發現:使用相同的分類演算法,用詞作為特徵項,比以字作為特徵的分類效果好;用不同的演算法構造分類器對分類效果的影響很大,如中心向量演算法在字、詞特徵下的分類效果優于其他兩演算法;在以字為特徵的情況下,該演算法的平均查全率80.73%,平均查準率82.94%;在以詞為特徵的情況下,該演算法的平均查全率83.6%,平均查準率85.97%;選用語料不同對分類效果也有影響,如用新浪網(www.sina.com.cn)網頁語料進行測試,使用中心向量法分類器和詞作為特徵的情況下,平均準確率為89.31%,平均查全率為88.33%。
  14. A study on classified handling of word segmentation inconsistency in a Chinese corpus

    中文語料庫分詞不一致的分類處理研究
  15. On the construction of a Chinese corpus based on semantic dependency relations

    基於語義依存關系的漢語語料庫的構建
  16. Finally, the paper applies the SVM + TBL method to the CoNLL-2000 English corpus and to a Chinese corpus of our own definition.

    最後本文把svm + tbl的方法應用在conll2000英文語料和我們定義的中文語料上。
  17. Sun Maosong and Zuo Zhengping have presented a word segmentation algorithm based on a large Chinese corpus. The approach may be beneficial to understanding unrestricted Chinese texts.

    他們給出了一個基於大規模語料的歧義切分演算法,該方法有助於理解非受限中文文本。
  18. We present an extensive experimental evaluation of the refined concept index on two English collections and one Chinese corpus using a state-of-the-art support vector machine classifier.

    因此,在大規模文本分類應用中,特徵選擇演算法往往更受歡迎。不過,概念索引卻是一個例外。
  19. We have now written a lot of software for Chinese corpus processing and have made considerable progress. However, its output does not yet meet our needs very well and requires further improvement.

    當前對漢語語料的加工結果,雖已取得了一定的成績,但國家的評測結果表明,其離實際需要的差距還是很大的,還有待于進一步的提高。
  20. Part-of-speech tagging is a fundamental topic in natural language processing. It is significant for Chinese corpus tagging, machine translation, and information retrieval over large-scale text.

    詞性標注是自然語言處理中的一項基礎性課題,詞性標注的正誤對漢語語料庫標注、機器翻譯和大規模文本的信息檢索等都有重要的意義。
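
Example 6 above mentions checking character and word errors by character-string co-occurrence probability. A minimal sketch of that idea in Python follows. It is not the cited authors' method: the toy reference corpus, the 0.05 threshold, and the function names `train_bigrams` and `suspicious_pairs` are all assumptions made for illustration.

```python
from collections import Counter


def train_bigrams(corpus_sentences):
    """Count single characters and adjacent character pairs in a reference corpus."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus_sentences:
        unigrams.update(sentence)
        bigrams.update(zip(sentence, sentence[1:]))
    return unigrams, bigrams


def suspicious_pairs(text, unigrams, bigrams, threshold=0.05):
    """Return (position, pair) for adjacent characters whose estimated
    co-occurrence probability P(next | current) is below the threshold."""
    flagged = []
    for i, (a, b) in enumerate(zip(text, text[1:])):
        # Rough estimate of P(b | a); 0.0 when `a` never occurs in the corpus.
        prob = bigrams[(a, b)] / unigrams[a] if unigrams[a] else 0.0
        if prob < threshold:
            flagged.append((i, a + b))
    return flagged


# Toy reference corpus (hypothetical data, for illustration only).
corpus = ["中文語料庫", "語料庫加工", "中文語料"]
uni, bi = train_bigrams(corpus)

# "科" is a deliberate typo for "料"; both pairs that contain it are flagged.
print(suspicious_pairs("中文語科庫", uni, bi))
```

A real checker would need a far larger corpus and smoothing for unseen pairs. The sketch only shows how a low co-occurrence probability can point at a likely wrong character.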