1·Large corpora (masses of text) are a good place to start.
通过大型语料库(海量文本)来检查是个好方法。
2·Modeling the linguistic data found in corpora can help us to understand linguistic patterns, and can be used to make predictions about new language data.
对语料库中的语言数据进行建模可以帮助我们理解语言模式,并且可以用于对新的语言数据进行预测。
3·Supervised classifiers use labeled training corpora to build models that predict the label of an input based on specific features of that input.
监督式分类器使用带标签的训练语料库来构建模型,根据输入的特定特征预测该输入的标签。
4·Manipulating large corpora, exploring linguistic models, and testing empirical claims.
操作大型语料库,设计语言模型,测试经验假设。
5·As a basis for this study, this dissertation also discusses and summarizes existing processing methods from the perspective of corpora.
作为本研究的基础,本文还主要从语料库的角度对现有处理方法进行了讨论和总结。
1·One fairly simple thing you are likely to do with linguistic corpora is analyze frequencies of various events within them, and make probability predictions based on these known frequencies.
对于语料库,您可能要做的一件相当简单的事情是分析其中各种事件(events)的频率,并基于这些已知频率做出概率预测。
2·The relationship among threshold, weight, and matching degree is also discussed. To make querying friendlier, a method for avoiding query results that are either the empty set or the entire set is also presented.
研究了权重、阈值和匹配度之间的关系,提出了避免查询结果为空集或全集的方法,使得查询更加人性化。
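
The frequency analysis mentioned in the first sentence of this group can be illustrated with a minimal sketch. It assumes whitespace tokenization and a tiny invented corpus; the function name `relative_frequencies` is hypothetical, and relative frequency is used here only as the simplest possible probability estimate.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Map each token to its relative frequency in the token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}


# Toy corpus, invented for illustration only.
corpus = "the cat sat on the mat and the dog sat too".split()

# Use observed relative frequency as a naive estimate of the
# probability that the next token drawn from similar text is "the".
probs = relative_frequencies(corpus)
print(probs["the"])
```

On a real corpus the same counts would usually be smoothed, since an unseen word would otherwise receive a probability of zero.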