Document Text Classification Using Support Vector Machine
Batoul Aljaddouh,  Nishith A. Kotak
Document categorization has become increasingly important because of the large amount of data available on the internet. Many classification algorithms exist (Naïve Bayes, SVM, K-means, KNN, etc.), and the appropriate choice depends on the type of classification and other features. In this paper, we observed the best accuracy and efficiency with the Support Vector Machine (SVM) approach, which is explored here. The dataset combines two sources: BBC news articles and the 20 Newsgroups collection. We used more than 3500 distinct articles to build a classifier with eight classes (Atheism, Business, Car, Entertainment, Politics, Space, Sport and Technology). The training accuracy of the classifier is 96.40%, corresponding to about 130 misclassified cases. The overall accuracy, tested on 20 articles from each class (160 test articles in total), is 89.70%, which is comparable to many existing algorithms. The paper further describes the factors affecting the classification technique and its implementation.
Keywords: Document Classification, Natural Language Processing, Text Mining, Text Classification, Support Vector Machine, Artificial Intelligence, Machine Learning
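The figures quoted in the abstract can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the paper: the variable names are hypothetical, and "about 130" is treated as an exact count purely for illustration. It derives the training-set size implied by the training accuracy and error count, and the number of correct test predictions implied by the test accuracy.

```python
# Figures quoted in the abstract (variable names are illustrative assumptions).
train_accuracy = 0.9640   # reported training accuracy
train_errors = 130        # "about 130" misclassified training cases (approximate)
num_classes = 8           # Atheism, Business, Car, Entertainment, Politics, Space, Sport, Technology
test_per_class = 20       # test articles per class
test_accuracy = 0.8970    # reported overall test accuracy

# Total number of test articles.
test_articles = num_classes * test_per_class

# Training-set size implied by the error count and the error rate:
# errors = size * (1 - accuracy)  =>  size = errors / (1 - accuracy)
implied_train_size = train_errors / (1 - train_accuracy)

# Number of correctly classified test articles implied by the test accuracy.
implied_test_correct = test_accuracy * test_articles

print("test articles:", test_articles)
print("implied training-set size:", round(implied_train_size))
print("implied correct test predictions:", round(implied_test_correct, 2))
```

Under these assumptions the implied training-set size is roughly 3600, which agrees with "more than 3500 distinct articles". The implied number of correct test predictions is not a whole number, so the 89.70% figure may have been rounded or computed in a way the abstract does not describe; the sketch makes no claim about how the authors obtained it.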
Cite this Article
Batoul Aljaddouh, Nishith A. Kotak, "Document Text Classification Using Support Vector Machine", International Journal of Engineering Development and Research (IJEDR), ISSN: 2321-9939, Volume 8, Issue 1, pp. 138-142, January 2020. Available at: http://www.ijedr.org/papers/IJEDR2001026.pdf