
Computing the syntactic similarity (of content) between two given texts

Resource description

Compute the syntactic similarity of text. This is a Java program that compares two files and returns, as a percentage, how similar they are. For example:

    java -jar ss.jar c:/tmp/a.txt c:/tmp/b.txt

The output would be:

    Similarity is 89.60159%

Some texts are very similar to each other, such as near-duplicate news articles. The only difference may be a different advertisement in the middle of the text, or a slightly modified headline. This simple program tries to compute how similar two texts are, as a percentage.

Note: this is syntactic similarity, not lexical similarity. Only the structure of words and phrases is taken into account, not their meaning.

This project is used as part of the http://www.opfine.com/ online financial news text analyser to simplify processing and reduce resource load.
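The description does not say which algorithm the program uses. As a minimal sketch of one way a purely syntactic comparison could work, the Java example below splits each text into overlapping three-word sequences (word shingles) and reports the Jaccard overlap of the two shingle sets as a percentage. The class name SimilaritySketch, the shingle size, and the choice of Jaccard similarity are all assumptions made for illustration; they are not taken from ss.jar.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only: NOT the algorithm used by ss.jar (which is not documented here).
class SimilaritySketch {

    // Assumed parameter: number of consecutive words per shingle.
    static final int SHINGLE_SIZE = 3;

    // Lower-case the text, split on anything that is not a letter or digit,
    // and collect every run of SHINGLE_SIZE consecutive words.
    static Set<String> shingles(String text) {
        List<String> words = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        Set<String> result = new HashSet<>();
        if (words.isEmpty()) {
            return result;
        }
        if (words.size() < SHINGLE_SIZE) {
            // Text shorter than one shingle: treat the whole text as a single shingle.
            result.add(String.join(" ", words));
            return result;
        }
        for (int i = 0; i + SHINGLE_SIZE <= words.size(); i++) {
            result.add(String.join(" ", words.subList(i, i + SHINGLE_SIZE)));
        }
        return result;
    }

    // Jaccard similarity of the two shingle sets, scaled to 0..100.
    static double similarityPercent(String a, String b) {
        Set<String> sa = shingles(a);
        Set<String> sb = shingles(b);
        if (sa.isEmpty() && sb.isEmpty()) {
            return 100.0; // two empty texts are considered identical
        }
        Set<String> intersection = new HashSet<>(sa);
        intersection.retainAll(sb);
        Set<String> union = new HashSet<>(sa);
        union.addAll(sb);
        return 100.0 * intersection.size() / union.size();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java SimilaritySketch <fileA> <fileB>");
            return;
        }
        String a = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        String b = new String(Files.readAllBytes(Paths.get(args[1])), StandardCharsets.UTF_8);
        System.out.println("Similarity is " + similarityPercent(a, b) + "%");
    }
}
```

Because only word sequences are compared, two articles that differ by an inserted advertisement or a reworded headline still share most of their shingles and score high, while texts with the same meaning but different wording score low. This matches the syntactic (not meaning-based) behaviour described above, although the real program may compute its percentage differently.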