Using HMM to Learn User Browsing Patterns for Focused Web Crawling

Evangelos Milios

Abstract: A focused crawler traverses the Web to gather documents on a specific topic. We present a new approach that uses a hidden Markov model (HMM) to predict which links are likely to lead to relevant pages. The system consists of three stages: user modelling, pattern learning, and focused crawling. We first collect the Web pages visited during a user browsing session. These pages are clustered, and the link structure among pages from different clusters is used to learn page sequences that are likely to lead to target pages. During crawling, links are prioritized according to the HMM's prediction of how likely the page is to lead to a target page. We compare the performance of this approach with Context-Focused crawling and Best-First crawling.
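
To make the crawling stage concrete, the following is a minimal Python sketch of HMM-based link prioritization. It is not the authors' implementation, and the abstract does not specify the model's states or parameters. The sketch assumes that hidden states encode how many links a page is from a target page, that observations are the cluster IDs of the pages on the path to a link, and that the HMM has already been learned. All probability values and the names `state_posterior`, `link_priority`, and `order_frontier` are hypothetical and chosen for illustration only.

```python
import heapq

# Assumption: hidden states encode distance to a target page.
# 0 = target page, 1 = one link from a target, 2 = two or more links away.
N_STATES = 3
TARGET = 0

# Illustrative parameters only (made up, not taken from the paper).
INITIAL = [0.1, 0.3, 0.6]            # P(state of first page on a path)
TRANSITION = [                       # TRANSITION[p][s] = P(next state s | state p)
    [0.5, 0.3, 0.2],
    [0.6, 0.2, 0.2],
    [0.1, 0.5, 0.4],
]
EMISSION = [                         # EMISSION[s][c] = P(cluster c | state s)
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7],
]


def _normalize(values):
    total = sum(values)
    return [v / total for v in values]


def state_posterior(cluster_path):
    """Normalized forward algorithm: P(state of last page | observed clusters)."""
    belief = _normalize(
        [INITIAL[s] * EMISSION[s][cluster_path[0]] for s in range(N_STATES)]
    )
    for cluster in cluster_path[1:]:
        belief = _normalize([
            EMISSION[s][cluster]
            * sum(belief[p] * TRANSITION[p][s] for p in range(N_STATES))
            for s in range(N_STATES)
        ])
    return belief


def link_priority(cluster_path):
    """P(the next page is a target | clusters observed along the path)."""
    belief = state_posterior(cluster_path)
    return sum(belief[p] * TRANSITION[p][TARGET] for p in range(N_STATES))


def order_frontier(candidates):
    """Return URLs sorted by descending priority.

    candidates maps each URL to the cluster IDs of the pages leading to it.
    """
    heap = [(-link_priority(path), url) for url, path in candidates.items()]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]
```

Under these made-up parameters, a link found on a page from cluster 1 (typical of pages one step from a target) is ranked ahead of a link found on a page from cluster 2, so `order_frontier({"a": [2], "b": [1]})` visits `"b"` first. A real crawler would estimate the three tables from the clustered browsing sessions rather than hard-coding them.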