
Statistical Sentence Boundary Detection

  • File size: 6.62 MB
  • Uploaded: 2021-06-30
  • Downloads: 0
  • Views: 0
  • Cost: 1 point
  • Tags: detection, statistics, boundary, sentence

Resource Description

This package includes a tokenizer and trained models for high-accuracy sentence boundary detection (English only for now). The models are trained on Wall Street Journal news text combined with the Brown Corpus, which is intended to be broadly representative of written English. Error rates on test news data are near 0.25%. This is the source code for the paper "Sentence Boundary Detection and the Problem with the U.S.", which appeared at NAACL 2009. The code is written in Python. Author: Dan Gillick.
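To illustrate why sentence boundary detection is harder than it looks, here is a minimal rule-based baseline. It is not the method used by this package, which relies on trained statistical models; the function name `naive_split` and the small `ABBREVIATIONS` set are hypothetical and exist only for this sketch.

```python
import re

# Hypothetical, tiny abbreviation list for illustration only. The package
# ships its own data files, whose contents are not reproduced here.
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "u.s.", "inc.", "corp."}


def naive_split(text, abbreviations=ABBREVIATIONS):
    """Split after '.', '?' or '!' when followed by whitespace and a capital
    letter, unless the token ending at that point is a listed abbreviation."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.?!](?=\s+[A-Z])", text):
        end = match.end()
        # Last whitespace-delimited token of the candidate sentence.
        last_token = text[start:end].split()[-1].lower()
        if last_token in abbreviations:
            continue  # treat the period as part of the abbreviation
        sentences.append(text[start:end].strip())
        start = end
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


sample = "Mr. Smith moved to the U.S. He works at Acme Corp. in Boston."

# With the abbreviation list, the real boundary after "U.S." is missed.
print(naive_split(sample))

# Without it, "Mr." is wrongly treated as a complete sentence.
print(naive_split(sample, abbreviations=set()))
```

Either way the rule gets this input wrong: a period after "U.S." can mark an abbreviation, a sentence end, or both at once. Fixed rules cannot resolve that ambiguity from the token alone, which is the kind of case a statistical model trained on context is meant to handle.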

File List

README
SETUP
model_nb/
    lower_words
    non_abbrs
    feats
model_svm/
    lower_words
    non_abbrs
    svm_model
    feats
sample.txt
sbd.py
sbd_util.py
word_tokenize.py