
XPath-Based Text Extraction

Resource Description

Welcome to Shawty

Shawty is an XPath-based text extractor written in Groovy. The idea is to extract a Map/Table of fields from any marked-up source, like X/HTML, XML, SGML, etc.

Building

You'll need Groovy 1.6+, Java 1.5+, and Maven 3. Once you've downloaded and installed all that stuff, running:

$ mvn test package

...will produce shawty-{version}.jar in the "target" sub-directory of the Shawty tree.

Example Usage

See $SHAWTY_DIR/src/test/groovy/com/google/shawty/XPathExtractorTests.groovy for details, but in a nutshell, XPathExtractor is what you'll primarily use, say, to extract some text from a web page:

```
def xml = """ Sample Page
```
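
The excerpt above does not show the XPathExtractor API itself, so the following is not Shawty code. It is a minimal sketch of the same general idea (a map of field names to XPath expressions, evaluated against a marked-up source to produce a map of extracted text) using only the JDK's standard javax.xml.xpath package. The class name XPathFieldSketch and the method extract are hypothetical names chosen for illustration, and the sketch assumes well-formed XML input.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of Shawty.
public class XPathFieldSketch {

    // Evaluates each XPath expression against the XML text and returns
    // a map of field name -> extracted string value, in insertion order.
    static Map<String, String> extract(String xml, Map<String, String> fieldToXPath) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : fieldToXPath.entrySet()) {
            try {
                // A fresh InputSource per expression, because the reader
                // is consumed by each evaluation.
                InputSource source = new InputSource(new StringReader(xml));
                String value = xpath.evaluate(field.getValue(), source);
                result.put(field.getKey(), value.trim());
            } catch (XPathExpressionException e) {
                // Covers both an invalid expression and unparsable input.
                throw new IllegalArgumentException(
                        "Could not evaluate XPath for field " + field.getKey(), e);
            }
        }
        return result;
    }
}
```

Calling extract with a small well-formed page and the mappings title to /html/head/title and heading to //h1 would return a two-entry map holding the trimmed text of those elements. Re-parsing the input for every expression keeps the sketch short; parsing once into a DOM and evaluating each expression against that would be the more efficient choice for larger documents.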