Supervised Web Forum Crawling
Priyanka S. Bandagale, Dr. Lata Ragha
In this paper, we present a supervised web forum crawler. The goal of the proposed method is to crawl relevant forum content from the web with minimal overhead. Forum threads contain the information content that forum crawlers target. Although forums have different layouts or styles and are powered by various forum software packages, they always have similar implicit navigation paths, connected by specific URL types, that lead users from entry pages to thread pages. Based on this observation, we reduce the web forum crawling problem to a URL type recognition problem, and we demonstrate the results and applicability of our crawler. The crawler's multi-threaded downloader is responsible for starting threads and obtaining information about the website being fetched. Multiple processes run in parallel to perform this task, so that the download rate is maximized and the downloading time is minimized. Finally, we show that our proposed Naïve Bayes classifier performs better than a generic BFS (breadth-first search) crawler, using an application in the form of a local search engine.
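To make the idea of URL type recognition concrete, the following is a minimal sketch of a multinomial Naïve Bayes classifier over URL tokens. It is not the authors' implementation: the abstract does not specify the features or the class set, so the tokenizer, the labels ("index", "thread", "other"), the class name NaiveBayesUrlClassifier, and the forum.example.com training URLs are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(url):
    """Split a URL's path and query into lowercase tokens; digit runs become '<num>'."""
    path = re.sub(r"^https?://[^/]+", "", url.lower())  # drop scheme and host
    tokens = re.split(r"[/?&=._\-]+", path)
    return ["<num>" if t.isdigit() else t for t in tokens if t]


class NaiveBayesUrlClassifier:
    """Multinomial Naive Bayes over URL tokens with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()             # number of training URLs per label
        self.token_counts = defaultdict(Counter)  # token frequencies per label
        self.vocab = set()

    def fit(self, urls, labels):
        for url, label in zip(urls, labels):
            self.class_counts[label] += 1
            for tok in tokenize(url):
                self.token_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, url):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for tok in tokenize(url):
                # smoothed log likelihood of each token given the label
                score += math.log((self.token_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labeled URLs; a real crawler would need annotated pages from real forums.
TRAIN = [
    ("http://forum.example.com/index.php", "index"),
    ("http://forum.example.com/forumdisplay.php?f=12", "index"),
    ("http://forum.example.com/forumdisplay.php?f=7&page=2", "index"),
    ("http://forum.example.com/showthread.php?t=1001", "thread"),
    ("http://forum.example.com/showthread.php?t=2045&page=3", "thread"),
    ("http://forum.example.com/showthread.php?t=77", "thread"),
    ("http://forum.example.com/member.php?u=5", "other"),
    ("http://forum.example.com/login.php", "other"),
    ("http://forum.example.com/register.php", "other"),
]

clf = NaiveBayesUrlClassifier().fit(*zip(*TRAIN))

# Keep only the links predicted to lead to thread pages.
candidates = [
    "http://forum.example.com/showthread.php?t=9999",
    "http://forum.example.com/forumdisplay.php?f=3",
    "http://forum.example.com/login.php",
]
thread_urls = [u for u in candidates if clf.predict(u) == "thread"]
```

Under these assumptions, a crawler would follow only the links the classifier labels as index or thread URLs, whereas a generic BFS crawler would enqueue every discovered link regardless of type.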
Keywords: forum sites, crawling, ITF regex, URL classification, page type, URL pattern learning, URL type, EIT path.
Unique Identification Number - IJEDR1601049
Page Number(s) - 298-302
Published in - Volume 4 | Issue 1 | February 2016
DOI (Digital Object Identifier) -
Publisher - IJEDR (ISSN - 2321-9939)
Cite this Article
Priyanka S. Bandagale, Dr. Lata Ragha, "Supervised Web Forum Crawling", International Journal of Engineering Development and Research (IJEDR), ISSN: 2321-9939, Volume 4, Issue 1, pp. 298-302, February 2016. Available at: http://www.ijedr.org/papers/IJEDR1601049.pdf