Chinese XML Forum (中文XML论坛) - A professional XML technology discussion board (http://bbs.xml.org.cn/index.asp) -- 『 Web Mining Technology 』 (http://bbs.xml.org.cn/list.asp?boardid=69) ---- What is Web Mining? (http://bbs.xml.org.cn/dispbbs.asp?boardid=69&rootid=&id=45637) |
-- Author: DavidPotter -- Posted: 4/18/2007 5:41:00 PM -- What is Web Mining?

Quoted from http://www.galeas.de/webmining.html

Web Mining is the extraction of interesting and potentially useful patterns and implicit information from artifacts or activity related to the World-Wide Web. There are roughly three knowledge discovery domains that pertain to web mining: Web Content Mining, Web Structure Mining, and Web Usage Mining.

Web content mining is the process of extracting knowledge from the content of documents or their descriptions. Web document text mining and resource discovery based on concept indexing or agent-based technology may also fall into this category. Web structure mining is the process of inferring knowledge from the organization of the World-Wide Web and the links between references and referents in the Web. Finally, web usage mining, also known as Web Log Mining, is the process of extracting interesting patterns from web access logs.

Web Content Mining

Web content mining is an automatic process that goes beyond keyword extraction. Since the content of a text document presents no machine-readable semantics, some approaches have suggested restructuring the document content into a representation that machines can exploit. The usual approach to exploiting known structure in documents is to use wrappers that map documents to some data model. Techniques using lexicons for content interpretation are yet to come. There are two groups of web content mining strategies: those that directly mine the content of documents, and those that improve on the content search of other tools such as search engines. |
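The wrapper idea in the quoted text (mapping a document's known structure onto a data model) can be illustrated with a minimal Python sketch. This is only one possible reading, not the method of the quoted source: the names `PageRecord`, `PageWrapper`, and `wrap` are hypothetical, and the "data model" is assumed to be just a page title plus the list of outgoing links. Only the standard-library `html.parser` module is used.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class PageRecord:
    """Hypothetical data model that the wrapper maps a page onto."""
    title: str = ""
    links: list = field(default_factory=list)


class PageWrapper(HTMLParser):
    """Maps known HTML structure (<title>, <a href>) to a PageRecord."""

    def __init__(self):
        super().__init__()
        self.record = PageRecord()
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            # attrs is a list of (name, value) pairs; value may be None.
            href = dict(attrs).get("href")
            if href:
                self.record.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Collect text only while inside the <title> element.
        if self._in_title:
            self.record.title += data.strip()


def wrap(html: str) -> PageRecord:
    """Run the wrapper over one HTML document and return its record."""
    parser = PageWrapper()
    parser.feed(html)
    parser.close()
    return parser.record
```

The title field is a small instance of content mining, while the collected links are the raw material that structure mining would work on. A real wrapper would be written against the specific layout of the target pages.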
-- Author: DavidPotter -- Posted: 4/18/2007 5:44:00 PM -- Starting to earn experience points so I can become a moderator... |
-- Author: teng_t1986 -- Posted: 5/12/2007 3:10:00 PM -- I support David Potter! I sent a message to your forum inbox :) |
-- Author: teng_t1986 -- Posted: 5/12/2007 4:33:00 PM -- Actually, if you just apply there should be no problem. It really is a bit odd that this board still has no moderator... |