

ml-mapreduce

  • Resource size: 219.80 kB
  • Upload date: 2021-06-30
  • Downloads: 0
  • Views: 0
  • Resource points: 1 point
  • Tags:

Resource description

What is this project all about?

This is a class project that aims to demonstrate a simple proof of concept for massively parallelizing machine learning algorithms on a distributed file system. For the proof of concept we use the Hadoop DFS and its Java API. The first fully functional algorithm implemented was logistic regression using iterative Newton-Raphson.

What do we want to do?

Hadoop is an offline system built around the "job" philosophy: it is meant for data in the terabyte range and is largely an offline process. So you may ask how this can help in machine learning, where an online process might seem more useful. We feel this objection does not hold. Many machine learning algorithms take a long time to learn. For example, we may have to run an unsupervised machine learning algorithm time and again on accumulated data, say from the WWW. Another interesting case is cross-validation for feature selection, which can potentially take a lot of time. The EM algor
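
To make the idea of logistic regression by iterative Newton-Raphson in a job-style setting more concrete, here is a minimal sketch in plain Java with no Hadoop dependency. It is not taken from the project's source: the class name `NewtonLogisticSketch` and the methods `map`, `reduce`, `solve` and `fit` are hypothetical names chosen for illustration. The assumption is that each data shard can compute its own partial gradient and Hessian independently (the "map" side), and that a single step sums those partials and solves for the weight update (the "reduce" side).

```java
import java.util.ArrayList;
import java.util.List;

public class NewtonLogisticSketch {

    /** Partial sums one shard would emit: gradient g and matrix X^T S X. */
    static final class Partial {
        final double[] grad;
        final double[][] hess;
        Partial(int d) { grad = new double[d]; hess = new double[d][d]; }
    }

    static double sigmoid(double z) { return 1.0 / (1.0 + Math.exp(-z)); }

    /** "Map" side (hypothetical): accumulate contributions of one shard. */
    static Partial map(double[][] x, int[] y, double[] w) {
        int d = w.length;
        Partial p = new Partial(d);
        for (int i = 0; i < x.length; i++) {
            double z = 0.0;
            for (int j = 0; j < d; j++) z += w[j] * x[i][j];
            double mu = sigmoid(z);
            for (int j = 0; j < d; j++) {
                p.grad[j] += (y[i] - mu) * x[i][j];            // X^T (y - mu)
                for (int k = 0; k < d; k++) {
                    p.hess[j][k] += mu * (1.0 - mu) * x[i][j] * x[i][k]; // X^T S X
                }
            }
        }
        return p;
    }

    /** "Reduce" side (hypothetical): element-wise sum of all shard partials. */
    static Partial reduce(List<Partial> parts, int d) {
        Partial total = new Partial(d);
        for (Partial p : parts) {
            for (int j = 0; j < d; j++) {
                total.grad[j] += p.grad[j];
                for (int k = 0; k < d; k++) total.hess[j][k] += p.hess[j][k];
            }
        }
        return total;
    }

    /** Solve a * x = b by Gaussian elimination with partial pivoting.
     *  Assumes a is non-singular; no singularity check is made. */
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        double[][] m = new double[n][n + 1];
        for (int i = 0; i < n; i++) {
            System.arraycopy(a[i], 0, m[i], 0, n);
            m[i][n] = b[i];
        }
        for (int c = 0; c < n; c++) {
            int piv = c;
            for (int r = c + 1; r < n; r++) {
                if (Math.abs(m[r][c]) > Math.abs(m[piv][c])) piv = r;
            }
            double[] t = m[c]; m[c] = m[piv]; m[piv] = t;
            for (int r = c + 1; r < n; r++) {
                double f = m[r][c] / m[c][c];
                for (int k = c; k <= n; k++) m[r][k] -= f * m[c][k];
            }
        }
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double s = m[i][n];
            for (int k = i + 1; k < n; k++) s -= m[i][k] * x[k];
            x[i] = s / m[i][i];
        }
        return x;
    }

    /** One pass over all shards per Newton step: w <- w + (X^T S X)^{-1} X^T (y - mu). */
    static double[] fit(List<double[][]> xs, List<int[]> ys, int d, int iters) {
        double[] w = new double[d];
        for (int it = 0; it < iters; it++) {
            List<Partial> parts = new ArrayList<>();
            for (int s = 0; s < xs.size(); s++) parts.add(map(xs.get(s), ys.get(s), w));
            Partial total = reduce(parts, d);
            double[] delta = solve(total.hess, total.grad);
            for (int j = 0; j < d; j++) w[j] += delta[j];
        }
        return w;
    }
}
```

Because the gradient and Hessian are plain sums over examples, splitting the data into shards should not change the fitted weights beyond floating-point rounding. In an actual Hadoop job each Newton step would presumably be one job, with the current weights passed to the mappers; how the project itself does this is not stated in the description above.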