

Fast all-pairs similarity search under Jaccard distance

  • Resource size: 6.76 MB
  • Upload date: 2021-06-30
  • Downloads: 0
  • Views: 0
  • Resource cost: 1 point
  • Tags: search all fast similar distance

Resource Description

Introduction

SketchSortJ (1,2) is software for all-pairs similarity search. It takes data points as input and outputs approximate neighbor pairs within a given Jaccard distance (1.0 - Jaccard similarity). First, the input data points are mapped to sketches by minwise independent permutations, also called minhash. Next, neighbor pairs of sketches within a given Hamming distance are enumerated by the multiple sorting method (3). Finally, the Jaccard distance is calculated for each such pair, and a pair is output if its Jaccard distance is no more than a user-specified threshold.

Because the method is approximate, some true neighbor pairs may be missed. A theoretical bound on the expected missing edge ratio is derived, which makes it possible to choose parameters that keep the empirical missing edge ratio small.

Quick Start

To compile SketchSort, please type the following:
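
Separately from the compile steps, which this listing cuts off, the three-stage pipeline described in the Introduction can be illustrated with a minimal Python sketch. It is not SketchSortJ's implementation. All function names and parameters (`minhash_sketch`, `near_pairs`, `sketch_len`, `max_hamming`) are hypothetical, the hash family is a simple assumed choice, and a brute-force loop over all pairs stands in for the multiple sorting method, which the real tool uses to avoid exactly that loop.

```python
import random
from itertools import combinations

PRIME = 2_147_483_647  # assumed modulus for the illustrative hash family


def jaccard_distance(a, b):
    """Return 1.0 - |a & b| / |a | b| for two sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def minhash_sketch(items, hash_params):
    """Map a non-empty set of non-negative ints to a sketch.

    Each sketch entry is the minimum of h(x) = (a*x + b) mod PRIME
    over the set, for one (a, b) pair.
    """
    return [min((a * x + b) % PRIME for x in items) for a, b in hash_params]


def hamming(s, t):
    """Number of positions where two equal-length sketches differ."""
    return sum(u != v for u, v in zip(s, t))


def near_pairs(data, jaccard_threshold, sketch_len=64, max_hamming=40, seed=0):
    """Return (i, j, distance) for pairs within the Jaccard threshold."""
    rng = random.Random(seed)
    params = [(rng.randrange(1, PRIME), rng.randrange(PRIME))
              for _ in range(sketch_len)]
    # Stage 1: map every data point to a minhash sketch.
    sketches = [minhash_sketch(s, params) for s in data]
    result = []
    # Stage 2: candidate pairs within a Hamming distance.
    # Brute force here; a stand-in for the multiple sorting method.
    for i, j in combinations(range(len(data)), 2):
        if hamming(sketches[i], sketches[j]) <= max_hamming:
            # Stage 3: verify with the exact Jaccard distance.
            d = jaccard_distance(data[i], data[j])
            if d <= jaccard_threshold:
                result.append((i, j, d))
    return result
```

For example, `near_pairs([{1, 2, 3, 4}, {1, 2, 3, 4}, {100, 200, 300}], 0.3)` reports only the pair of identical sets: identical sets always get identical sketches, and disjoint sets fail the final Jaccard check regardless of their sketches. Pairs whose sketches differ in more than `max_hamming` positions are never verified, which is the source of the missed neighbor pairs discussed in the Introduction.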

File List

sketchsort-jaccard-0.0.5
dat
src