
A Java desktop application for automatic document classification

Resource description

A Java desktop application (built on the J2SE 5 platform and the Swing API) for automatic classification of documents against a given training set. It was developed, and is packaged, as a NetBeans project. It uses stemmers generated with Snowball (http://snowball.tartarus.org, released under the BSD license) for text pre-processing, TF-IDF or the Bhattacharyya distance to rank the training-set documents against the query document, and the k-NN algorithm to classify the query. At present it only supports the classification of news from the ANSA website (http://www.ansa.it, the main Italian news agency), but the program has a modular architecture that allows it to be extended with plugins that scrape the content of other websites or handle other document types (PDF, DOC, ODT, etc.).
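
To make the ranking and classification steps described above more concrete, here is a minimal sketch of TF-IDF ranking followed by a k-NN majority vote, plus a Bhattacharyya distance over term distributions. It is not taken from the project's source: the class `KnnTfIdfSketch` and all of its method names are hypothetical, the IDF smoothing and the cosine similarity are assumptions chosen for illustration, and the code uses Java 8 features rather than the J2SE 5 the project targets. Input documents are assumed to be already tokenized and stemmed.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: names are hypothetical, not the project's API.
class KnnTfIdfSketch {

    // Raw term counts for one tokenized (already stemmed) document.
    static Map<String, Double> termFrequencies(List<String> tokens) {
        Map<String, Double> tf = new HashMap<>();
        for (String t : tokens) tf.merge(t, 1.0, Double::sum);
        return tf;
    }

    // Smoothed IDF (an assumption): log((1 + N) / (1 + df)) + 1.
    static Map<String, Double> inverseDocumentFrequencies(List<List<String>> docs) {
        Map<String, Double> df = new HashMap<>();
        for (List<String> doc : docs)
            for (String t : new HashSet<>(doc)) df.merge(t, 1.0, Double::sum);
        Map<String, Double> idf = new HashMap<>();
        for (Map.Entry<String, Double> e : df.entrySet())
            idf.put(e.getKey(), Math.log((1.0 + docs.size()) / (1.0 + e.getValue())) + 1.0);
        return idf;
    }

    // TF-IDF vector; terms never seen in the training set get weight 0.
    static Map<String, Double> tfIdf(List<String> tokens, Map<String, Double> idf) {
        Map<String, Double> v = termFrequencies(tokens);
        v.replaceAll((term, count) -> count * idf.getOrDefault(term, 0.0));
        return v;
    }

    // Cosine similarity between two sparse vectors (0 if either is all-zero).
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
            na += e.getValue() * e.getValue();
        }
        for (double x : b.values()) nb += x * x;
        return (na == 0 || nb == 0) ? 0 : dot / Math.sqrt(na * nb);
    }

    // Bhattacharyya distance -ln(sum_i sqrt(p_i * q_i)) between the two
    // documents' normalized term distributions; infinite if they share no term.
    static double bhattacharyyaDistance(Map<String, Double> a, Map<String, Double> b) {
        double sa = 0, sb = 0, bc = 0;
        for (double x : a.values()) sa += x;
        for (double x : b.values()) sb += x;
        if (sa == 0 || sb == 0) return Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Double> e : a.entrySet())
            bc += Math.sqrt((e.getValue() / sa) * (b.getOrDefault(e.getKey(), 0.0) / sb));
        return -Math.log(bc);
    }

    // Rank training documents by similarity to the query, then take a
    // majority vote among the k best; ties go to the label that reaches
    // the winning count first.
    static String classify(List<List<String>> trainDocs, List<String> labels,
                           List<String> query, int k) {
        Map<String, Double> idf = inverseDocumentFrequencies(trainDocs);
        Map<String, Double> q = tfIdf(query, idf);
        double[] sims = new double[trainDocs.size()];
        Integer[] order = new Integer[trainDocs.size()];
        for (int i = 0; i < sims.length; i++) {
            sims[i] = cosine(q, tfIdf(trainDocs.get(i), idf));
            order[i] = i;
        }
        Arrays.sort(order, (x, y) -> Double.compare(sims[y], sims[x]));
        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        for (int i = 0; i < Math.min(k, order.length); i++) {
            String label = labels.get(order[i]);
            int n = votes.merge(label, 1, Integer::sum);
            if (best == null || n > votes.get(best)) best = label;
        }
        return best;
    }
}
```

Calling `KnnTfIdfSketch.classify(trainDocs, labels, queryTokens, 3)` returns the majority label among the three most similar training documents. To rank by Bhattacharyya distance instead, one would sort in ascending order of `bhattacharyyaDistance` applied to the raw term counts; how the actual project combines the two measures is not stated in the description.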

File list

javadocs
documentclassifier
Scrapers
class-use
JLex
ansascraper
resources
inherit.gif
index-files
index-1.html
allclasses-frame.html
allclasses-noframe.html
constant-values.html
deprecated-list.html
help-doc.html
index.html
overview-frame.html
overview-summary.html
overview-tree.html
package-list
serialized-form.html
stylesheet.css
documentclassifier
DocumentClassifierApp.html
DocumentClassifierView.html
MapDefaultPreferences.html
package-frame.html
package-summary.html
package-tree.html
package-use.html
PreferencesDialog.html
DocumentClassifierAboutBox.html
Scrapers
ANSAScraper.html
package-frame.html
package-summary.html
package-tree.html
package-use.html
Scraper.html
class-use
ANSAScraper.html
Scraper.html
BhattacharryaDistanceComparator.html
TFIDFComparator.html
Main.html
DocumentClassifierAboutBox.html
DocumentClassifierApp.html
DocumentClassifierView.html
MapDefaultPreferences.html
PreferencesDialog.html
parser.html
sym.html
Yylex.html