text corpus 中文意思是什麼

text corpus 解釋
語料庫
  • text : n 1 原文,本文,正文;(文藝學等所說的)文本。2 課文,課本,教科書。3 基督教聖經經文,經句〈常引...
  • corpus : n (pl. corpora) 1 軀體,身體;〈主謔〉屍體。2 (法典等的)集成,全集。3 (事物的)主體;【法律】主...
  1. Concordance: an alphabetical index of all the words in a text or corpus of texts, showing every contextual occurrence of a word

    語詞索引:按字母順序對一篇文本或一批文本中的所有詞語編制的索引,其中顯示每個詞出現的上下文。
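    A minimal sketch of the concordance idea defined above, assuming nothing from the cited works: it builds an alphabetical keyword-in-context index over a tiny made-up corpus, showing every occurrence of each word with a few words of left and right context. The corpus text, window size and formatting are invented for illustration.

        from collections import defaultdict

        def build_concordance(texts, window=3):
            """Alphabetical index of every word with its left/right context."""
            index = defaultdict(list)
            for text in texts:
                tokens = text.lower().split()
                for i, word in enumerate(tokens):
                    left = " ".join(tokens[max(0, i - window):i])
                    right = " ".join(tokens[i + 1:i + 1 + window])
                    index[word].append((left, right))
            # Sort entries alphabetically, as in a printed concordance.
            return dict(sorted(index.items()))

        corpus = ["the corpus contains written and spoken text",
                  "a text corpus supports linguistic research"]
        for word, contexts in build_concordance(corpus).items():
            for left, right in contexts:
                print(f"{left:>30} [{word}] {right}")
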
  2. Taking Articles 65 and 66 of the Administrative Litigation Law as its basis, this thesis generalizes the concept of the power of administrative compulsory execution, holding that the existence and exercise of this power is a process. The duality of the corpus (subject) of the power mainly means that, in execution applied for through the people's court, both the administrative organ and the people's court exercise certain functions of the power of administrative compulsory execution. Seen as a whole, the power of administrative compulsory execution is an administrative power, but its operation passes through five stages: warning, application, review, decision and implementation.

    本文以行政訴訟法第65條、第66條的規定為基礎來概括行政強制執行權概念,認為行政強制執行權的存在和行使是一個過程,行政強制執行權主體的二元性主要是指在申請人民法院強制執行中,行政機關與人民法院都在行使行政強制執行權中的某些權能。從整體上看,行政強制執行權是行政權,但是,該權力的運行存在告誡、申請、審查、決定、實施等五個階段。
  3. In the corpus design, we first analyze the syllable distribution of the TH-CoSS corpus, classify its prosodic features and give the distribution of each feature. Based on a prosodic-feature vector, we construct an error function that is used to select the initial corpus for the simulation system, and show the distribution of prosodic features in that initial corpus. Finally, the greedy algorithm and the corpus self-adaptation process are described, laying a theoretical foundation for text-material search.

    在語料庫分析與設計方面,首先統計th - coss語料庫中音節分佈情況,給出th - coss語料庫韻律特徵分類,並對每一種韻律特徵進行統計,然後構造了一個基於韻律特徵向量的誤差函數,並採用該誤差函數提取語料組成模擬系統的初始語料庫,分析該庫的韻律特徵分佈,最後闡述了greedy演算法與語料自適應過程,為文本語料的搜索打下理論基礎。
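    Example 3 selects an initial corpus by minimizing an error function over a prosodic-feature distribution with a greedy algorithm. The following is only a toy sketch of that general idea, not the TH-CoSS procedure: the candidate sentences, feature labels, target distribution and squared-error function are all invented for illustration.

        from collections import Counter

        def distribution(feature_lists):
            """Relative frequency of each feature over a set of sentences."""
            counts = Counter(f for feats in feature_lists for f in feats)
            total = sum(counts.values()) or 1
            return {f: c / total for f, c in counts.items()}

        def error(dist, target):
            """Squared error between a candidate distribution and the target."""
            keys = set(dist) | set(target)
            return sum((dist.get(k, 0.0) - target.get(k, 0.0)) ** 2 for k in keys)

        def greedy_select(candidates, target, k):
            """Repeatedly add the sentence that most reduces the distribution error."""
            selected = []
            while len(selected) < k and candidates:
                best = min(candidates, key=lambda c: error(
                    distribution([feats for _, feats in selected] + [c[1]]), target))
                selected.append(best)
                candidates = [c for c in candidates if c is not best]
            return selected

        # Invented data: (sentence id, prosodic features) pairs and a target distribution.
        candidates = [("s1", ["rise", "fall"]), ("s2", ["fall", "fall"]),
                      ("s3", ["rise", "neutral"]), ("s4", ["neutral"])]
        target = {"rise": 0.4, "fall": 0.4, "neutral": 0.2}
        print([sid for sid, _ in greedy_select(candidates, target, k=2)])
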
  4. This thesis is divided into three parts. Drawing on existing theoretical results, the first part clarifies the connotation of credit, the evolution of its meaning and its application today, and stresses the importance of the credit quality of the corpus (subject) in modern society. On this basis it defines the connotation of credit quality and summarizes its features, namely the three characteristics of objectivity and subjectivity, a distinct character of the times, and plasticity. The author then closely analyzes how cultivating the credit quality of students benefits the healthy development of modern society, the building of a moral climate of credit, and the purification of the educational and academic environment.

    本文共分三部分,第一部分主要在認真吸取現有理論成果的基礎之上,經過分析和調查研究闡明了信用的內涵及其含義的演變,以及在今天生活中的應用,突出主體信用素質在現代社會的重要性,在此基礎之上比較科學地界定了信用素質的內涵,歸納信用素質的特點,即客觀性和主觀性、鮮明的時代性和可塑性三大特點。第二部分比較深入的分析了培養大學生信用素質對于促進現代社會的健康發展、信用道德風尚的建設、凈化教育環境和學術環境等具有的巨大價值,以及對培養合格人才具有非常重要的意義。
  5. First, for character and word errors in a text, the adjacency of characters and words is used to check them by means of character-string co-occurrence probabilities. Second, for syntactic errors, statistics drawn from a large-scale contemporary Chinese corpus are used to recognize the predicate head word and the other sentence constituents and to check the syntax. Third, for semantic errors, a semantic dependency tree is built from HowNet knowledge, and a method based on semantic dependency analysis is presented to compute sentence similarity and check the semantics.

    對于文本字詞錯誤的檢查,本文主要利用了字詞二元接續關系,根據同現概率檢查文本字詞錯誤;對于文本語法錯誤的檢查,本文利用教研室已有的一個大規模語料庫,通過對語料庫進行統計分析,獲得語法查錯所需要的語言規律和知識,利用謂語中心詞識別和其他句子成分識別的方法,檢查文本語法結構上的錯誤;對于文本語義錯誤的檢查,本文主要利用知網知識得到語義依存樹,通過對句子的有效搭配對的相似度計算檢查語義錯誤。
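    Example 5's first step, flagging character and word errors through co-occurrence probabilities, can be illustrated with a character-bigram model. This is a minimal sketch under invented assumptions (toy training text, add-one smoothing, an arbitrary 0.1 threshold); a real checker would use a large corpus and word-level statistics as well.

        from collections import Counter

        def train_bigram(corpus):
            """Character-bigram co-occurrence probabilities with add-one smoothing."""
            bigrams, unigrams = Counter(), Counter()
            for sentence in corpus:
                for a, b in zip(sentence, sentence[1:]):
                    bigrams[(a, b)] += 1
                    unigrams[a] += 1
            vocab = len(set("".join(corpus))) or 1
            return lambda a, b: (bigrams[(a, b)] + 1) / (unigrams[a] + vocab)

        def flag_errors(sentence, prob, threshold=0.1):
            """Return adjacent character pairs whose co-occurrence probability is suspiciously low."""
            return [(a, b) for a, b in zip(sentence, sentence[1:]) if prob(a, b) < threshold]

        model = train_bigram(["the corpus contains text", "the text is in the corpus"])
        print(flag_errors("the corpsu", model))   # the pairs from the transposition get flagged
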
  6. So it is a new and promising way of improving Web search. Finally, we propose a text summarization method based on sentence weights and a genetic algorithm; it, too, is a corpus-based approach.

    最後介紹了一種我們提出的基於語句權重和遺傳演算法的文件摘要方法,該方法從本質上說也是一種基於文件集的摘要方法。
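    Example 6 rests on weighting sentences; the sketch below shows only that part, scoring each sentence by the average corpus frequency of its words and keeping the top-ranked ones. The genetic-algorithm search described in the example is not reproduced, and the sample document and sentence count are invented.

        import re
        from collections import Counter

        def summarize(text, n_sentences=2):
            """Keep the n sentences whose words are most frequent in the whole text."""
            sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
            freq = Counter(re.findall(r"[a-z']+", text.lower()))
            def weight(sentence):
                tokens = re.findall(r"[a-z']+", sentence.lower())
                return sum(freq[t] for t in tokens) / (len(tokens) or 1)
            top = set(sorted(sentences, key=weight, reverse=True)[:n_sentences])
            # Keep the original order so the extract reads naturally.
            return " ".join(s for s in sentences if s in top)

        doc = ("A text corpus is a large collection of machine-readable text. "
               "Such a corpus is used to train and evaluate language models. "
               "Completely unrelated sentences score lower here.")
        print(summarize(doc))
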
  7. Latent semantic analysis (LSA) is a completely automatic theory and method for the acquisition and representation of knowledge; it extracts the contextual-usage meaning of words by statistical computations applied to a large corpus of text

    潛在語義分析(Latent Semantic Analysis, LSA)是一種用於自動地實現知識提取和表示的理論和方法,它通過對大量的文本集進行統計分析,從中提取出詞語的上下文使用含義。
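    Example 7 describes LSA; in practice the "statistical computation" is typically a truncated singular value decomposition of a term-document matrix. The sketch below shows the shape of that computation with NumPy on a toy corpus; the documents and the number of retained dimensions (k = 2) are invented, and a real application would use a large corpus and weighted counts.

        import numpy as np

        docs = ["human machine interface", "user interface system",
                "graph of trees", "graph minors trees"]
        vocab = sorted({w for d in docs for w in d.split()})

        # Term-document count matrix: rows are words, columns are documents.
        A = np.array([[d.split().count(w) for d in docs] for w in vocab], dtype=float)

        # Truncated SVD keeps only the k strongest latent dimensions.
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        k = 2
        word_vecs = U[:, :k] * s[:k]   # word representations in the latent space

        def cosine(u, v):
            return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

        # Words used in similar contexts end up close together in the latent space.
        print(cosine(word_vecs[vocab.index("graph")], word_vecs[vocab.index("trees")]))
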
  8. Starting from an analysis of the differences between the administrative contract and administrative contract acts, this thesis explores the legal relations arising from such acts and puts forward the author's own view: an administrative contract is an agreement reached, for an administrative purpose, between the administrative corpus (subject) and citizens, legal persons or other organizations through consultation and on the basis of a concurrence of intent; it is the result of agreement by both sides, and one and only one party to it is the administrative corpus. An administrative contract act, by contrast, is an act performed by the administrative corpus in concluding, performing, altering or terminating an administrative contract; it is unilateral and concrete. The two are clearly different in nature.

    本文試從通過行政合同與行政合同行為的差異分析入手,進而探索行政合同行為產生的法律關系,並由此提出自己的觀念:行政合同是行政主體為行政目的,與公民、法人和其他組織通過協商方式在意思表示一致的基礎上所達成的協議,它是雙方合意的結果,行政合同有且僅有一方當事人為行政主體;而行政合同行為則是行政主體在行政合同訂立、履行、變更和終止過程中發生的行為,它是單方的、具體的;兩者本質明顯不同。
  9. The experiments show that the method is better at avoiding the influence of the writing style of the corpus on stoplist selection, and is more suitable than traditional methods for the preprocessing stage of text categorization

    實驗結果表明,該方法更好地避免了語料的行文格式對停用詞選取的影響,比傳統方法更適用於文本分類的預處理。
  10. In further research, the following issues must be considered: 1) standardization of the corpus; 2) improving the accuracy of the Chinese word segmentation system, handling ambiguity and recognizing words that do not appear in the dictionary; 3) performing proper semantic analysis; 4) dynamically updating the training set from user feedback; 5) quantitatively analyzing how different factors affect system performance, and using an appropriate model to compare and evaluate the Web text classification system; 6) natural language understanding; 7) recognizing disguised sensitive words

    在以後的工作中考慮如下問題: 1 )數據集的標準化; 2 )分詞系統精度的提高,對歧義處理以及未登錄詞識別的能力的提高; 3 )進行合理的語義分析; 4 )利用用戶反饋信息動態更新訓練集; 5 )定量分析分類器不同要素對分類系統性能的影響,使用合適的模型來比較和評價分類系統; 6 )自然語言理解問題,如「引用」問題; 7 )對于敏感詞匯偽裝的識別問題。
  11. Statistical machine translation (SMT) translates text with statistical parameter models obtained from a training corpus, and it has become the mainstream of machine translation research

    統計機器翻譯是利用基於語料庫訓練得到的統計參數模型,將源語言的文本翻譯成目標語言,它是機器翻譯的主流方向。
  12. First, based on the phonetic and linguistic features of the Uighur language, this article solves text-analysis problems such as syllable segmentation and summarizes prosodic rules for stress, pause and tone. It then proposes the design of a "context vector" and uses a greedy algorithm to optimize the corpus. Finally, it introduces a synthesis method that concatenates variable-length units

    首先根據維語的語音和語言特徵,解決了音節劃分等有關文本分析的問題,並總結了重音、停頓、語氣等韻律規則;然後採用「語境矢量」的設計,用greedy演算法優化語料庫;最後採用不定長單元的拼接合成方法,首先選擇較大單元合成,當拼接單元為音節時,用viterbi演算法,基於語境挑選出最優的單元合成語音。
  13. In previous classifiers this process is very time-consuming and costly, which limits their applicability, so our classifier is designed to meet the requirements of real time and high accuracy. In this thesis we survey the state of the art in Chinese text categorization, from the building of the corpus, the word segmentation of Chinese Web documents, the selection of indexing terms and the design of term weights, to the structure of SCUSCTC (SCU Smart Chinese Text Classifier) and its implementation in Java

    在研究的過程中,我們系統考察了中文web文檔自動分類的各個環節以及具體的實現技術:從語料庫的建立,中文web文檔的分詞,索引的選擇,權重的設計方案及分詞系統smcw的建立,到特徵選擇方法的研究討論,各種分類方法的研究討論,最後到中文web文檔傾向性分類系統( SCUSCTC , SCU Smart Chinese Text Classifier )的結構提出及用java語言開發實現該系統,並對最後的分類結果及中間分詞結果進行了細致的實驗和考察。
  14. To address the main problems of traditional Mandarin text-to-speech systems, in-depth research was conducted on a series of key techniques such as prosodic-level annotation of text, corpus analysis and design, and unit-selection strategy. We first review the history of Mandarin speech synthesis technology and point out its shortcomings

    針對傳統的漢語文語轉換系統存在的主要問題,採用基於語料庫的語音合成方法,在文本韻律層級標注、語料庫分析與設計、合成單元挑選策略等關鍵技術上做了一系列研究。
  15. Part-of-speech tagging is a fundamental topic in natural language processing. It is significant for the annotation of Chinese corpora, for machine translation and for information retrieval over large-scale text

    詞性標注是自然語言處理中的一項基礎性課題,詞性標注的正誤對漢語語料庫標注、機器翻譯和大規模文本的信息檢索等都有重要的意義。
  16. We need a system that can answer questions from a text corpus and then synthesize and generalize the answers; at the same time the system should be capable of processing multimedia information

    我們需要建立一個系統,它可以根據給定的語料庫回答有關文本的問題,並且具備綜合和概括信息的能力。
  17. World Memex: build a system that, given a text corpus, can answer questions about the text and summarize it as precisely and quickly as a human expert in that field

    智能資料存儲處理系統:建立一個系統,對於一個給定的文本語料庫,任何關于所存儲的文本的問題都可以回答,可以對文本庫任何一段文本做出摘要,回答和摘要的準確程度和速度與該領域的專家相當。
  18. On the other hand, many text categorization (TC) techniques, such as Bayes, kNN and SVM, have been applied to email categorization (EC); however, there is no open Chinese email corpus available on the Internet, and researchers have to collect emails themselves for their experiments

    在另一方面,大量文本分類技術應用於郵件分類中,但目前並未有公開的中文郵件分類語料庫,實驗者都是在自己收集的語料上做實驗得出結論。且分類演算法的分類性能在某種程度上與訓練語料庫相關,好的或高質量的訓練語料庫可能會導致分類器得到好的分類性能。
  19. Collins Wordbank's online English corpus is composed of 56 million words of contemporary written and spoken text; this sampler allows you to draw on linguistic data for learning or research purposes

    柯林斯在線詞匯庫英文集由五千六百萬個現代書寫和口語文本中的詞匯組成,這個抽樣器可以為你學習或者研究提供語言數據。