Research Title
AN EFFICIENT WEB USAGE MINING ALGORITHM BASED ON LOG FILE DATA
Author / Editor / Publisher
توفيق عبد الخالق عباس الاسدي
Citation Information
توفيق عبد الخالق عباس الاسدي, AN EFFICIENT WEB USAGE MINING ALGORITHM BASED ON LOG FILE DATA, Time 14/12/2016 06:15:44: College of Information Technology
Abstract
AN EFFICIENT WEB USAGE MINING ALGORITHM BASED ON LOG FILE DATA
Full Abstract
ABSTRACT

Information on the Internet, and especially on websites, is growing rapidly day by day and has become very large; this information plays an important role in discovering various kinds of knowledge on the Web. Web Usage Mining is one of the categories of Web Mining. It is concerned with discovering and analyzing useful information about link prediction, user navigation, customer behavior, site reorganization, web personalization, and frequent access patterns from the large volume of web data that is logged on the Web server side and stored in a standard text format called a log file, or Web usage data. This data can also be collected from an organization's database, such as NASA's. Web Usage Mining is the process of applying Data Mining techniques and applications to analyze and discover interesting knowledge from the Web. There are several existing research works on log file mining; some are concerned with web site structure, traversal pattern mining, association rule mining, Web page classification, and general statistics such as the amount of time spent on a page. In this paper we focus on mining the content of the different segments of Web log data entries in order to discover hidden information and interesting browsing content, and then apply a clustering algorithm to find groups of similar Web sites that share common browsing content.

Keywords: Web Mining, Web Usage Mining, Log File Analysis, Clustering, K-means, System Monitoring.

1. INTRODUCTION

The massive growth in the amount of data and information on the World Wide Web (WWW), and the enormous number of Web pages created, make mining and analyzing useful information a practical challenge. The WWW consists of billions of interconnected Web pages published by millions of authors around the world. A Web page is a document that is suitable for viewing on the WWW through a Web browser.
Web documents contain data of all types, such as structured tables, unstructured text, semi-structured content, and multimedia content (images, audio, and video) [1]. The process of extracting useful information and discovering knowledge from Web documents using Data Mining (DM) techniques is called Web mining. Web Mining is a multidisciplinary field that includes Data Mining (DM), machine learning, neural networks, information retrieval, statistics, and databases [2]. Web mining covers a wide domain of applications that aim to discover and extract hidden information from data stored on the Web. The main task of Web mining techniques is to discover and retrieve interesting information from huge data sets of web data, which are stored in a file called a log file. Web data contains various types of information, including web log data, web structure data, and user profile data. Web mining is divided into three categories: web content mining, web structure mining, and web usage mining [3].

Web Content Mining is the task of extracting useful and interesting information from the contents of web documents. Content data is the collection of web pages written in a web language; many languages can be used for this purpose, such as HTML, PHP, ASP, etc. Some Web documents are instead built with a Content Management System (CMS) such as Joomla, WordPress, Vivo, etc. Many techniques can be employed here, such as text mining and multimedia mining, in order to discover or extract web pages with similar content.

Web Structure Mining is the task of using graph theory to analyze and understand the connection structure of a web site. Web structure mining can be divided into two classes: extracting hyperlink patterns in the Web and mining document contents.
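
To make the web log data mentioned above more concrete, the following is a minimal sketch of how one log entry might be split into its segments. It assumes entries follow the Common Log Format (host, identity, user, timestamp, request, status, size); the text here does not specify the exact format used in the paper. The names `CLF_PATTERN` and `parse_log_entry` and the sample line are hypothetical and serve only as illustration.

```python
import re
from datetime import datetime

# Assumed layout (Common Log Format):
#   host ident authuser [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_entry(line):
    """Split one log line into named segments; return None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Convert the text segments into typed values.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    # The request segment is usually "METHOD path PROTOCOL"; tolerate short ones.
    parts = entry["request"].split()
    entry["method"] = parts[0] if parts else ""
    entry["path"] = parts[1] if len(parts) > 1 else ""
    return entry


# Hypothetical sample entry (the address is from a documentation-only range).
sample = '192.0.2.10 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245'
entry = parse_log_entry(sample)
print(entry["host"], entry["method"], entry["path"], entry["status"], entry["size"])
```

Once each entry is reduced to named segments like these, fields such as the requested path or the host could serve as input features for a later clustering step; how the paper actually builds its features is not stated in this section.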