Today I am going to post the code of a function I wrote to check broken links in a web page.
You can learn two things by going through it –
First, how to check broken links (I know that’s obvious :D).
Second, how to handle multiple windows in Selenium WebDriver + Python. There are three useful webdriver members for this –
1. webdriver.window_handles => returns the handles of all windows within the current webdriver session
2. webdriver.current_window_handle => returns the handle of the current window (the window currently in focus)
3. webdriver.switch_to_window(window_name) => switches focus to the window with the specified window_name or window_handle (a window handle can be passed in place of the window name)
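My code simply switches to the last entry in window_handles after a click. A slightly more defensive way to find the window that a click opened is to compare the handles before and after the click. Here is a minimal sketch of that idea. The helper name find_new_window is my own and not part of Selenium, and the handle strings are made up; it works on plain lists, so you can try it without a browser.

```python
def find_new_window(handles_before, handles_after):
    """Return the first handle present only after the click, or None."""
    known = set(handles_before)
    new_handles = [h for h in handles_after if h not in known]
    # If several windows opened, the first new one is returned.
    return new_handles[0] if new_handles else None


# Hypothetical handle strings, standing in for webdriver.window_handles values.
before = ["window-1"]
after = ["window-1", "window-2"]
print(find_new_window(before, after))   # window-2
print(find_new_window(before, before))  # None
```

With a real driver, you would save webdriver.window_handles before clicking, call the helper with the handles read after clicking, and switch only when the result is not None.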
To check for broken links, my code clicks each link on the page and checks whether the browser lands on the URL given in the link's href.
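The comparison made after each click can be pulled out into a small pure function, which makes the rule easier to see and to test without a browser. This is only a sketch that mirrors the checks in my code; the name landed_on_expected_page and the example paths /new/ and /used/ are made up for illustration.

```python
def landed_on_expected_page(link_address, current_url, home_page):
    """Return True if current_url is an acceptable result for link_address."""
    if link_address.endswith("#"):
        # A trailing "#" means the user stays on the same page.
        same_page = [link_address, link_address[:-1],
                     link_address[:-2], link_address + "/"]
        if current_url in same_page:
            return True
    # Otherwise accept the href itself, or the href resolved
    # against the home page when it starts with "/".
    return current_url in [link_address, home_page + link_address[1:]]


home = "http://www.carwale.com/"
print(landed_on_expected_page(home + "new/", home + "new/", home))  # True
print(landed_on_expected_page(home + "#", home, home))              # True
print(landed_on_expected_page(home + "new/", home + "404", home))   # False
```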
While checking links on a page, I also find some links that, when clicked, open a page in a separate window. The most common examples are the links to social media sites on a page.
E.g., in the code below, I am testing the home page of "http://www.carwale.com/".
This page has 4 links (the icons at the bottom of the page for Facebook, YouTube, Google Plus, and Twitter) that open their respective pages in a new window when clicked. This is where the section that handles multiple windows comes into play.
from selenium import webdriver
import time

browser = webdriver.Firefox()
home_page = "http://www.carwale.com/"


def check_page_broken_links(self, url):
    # `self` is the webdriver instance (e.g. `browser` above).
    # Sample usage:
    #     check_page_broken_links(browser, "http://www.carwale.com/")
    # will return empty list if all links in the page work fine
    # else it will return list of all the broken links
    # (either link text, or link href)
    # Will check for - i) "Page Not found" error
    #                  ii) Redirects
    try:
        failed = []
        self.implicitly_wait(5)
        self.get(url)
        number_of_links = len(self.find_elements_by_tag_name('a'))
        for i in range(number_of_links):
            # Save current browser window handle
            initial_window = self.current_window_handle
            ## print("initial_window_handle: %s" % initial_window)
            # Look the link up again: the page is reloaded after every click
            link = self.find_elements_by_tag_name('a')[i]
            link_address = link.get_attribute("href")
            link_name = link.text
            print("link checked: %s: %s: %s" % (i, link_name, link_address))
            if (link_address is not None
                    and "google" not in link_address
                    and "mailto" not in link_address
                    and link.is_displayed()):
                link.click()  # link clicked
                open_windows = self.window_handles
                ## print("window_handles: %s" % open_windows)
                # Navigate to the browser window where
                # latest page was opened
                self.switch_to_window(open_windows[-1])
                ## print("current_window_handle: %s" % self.current_window_handle)
                time.sleep(5)
                print("defined: %s" % link_address)
                print("current: %s" % self.current_url)
                if (link_address.endswith("#")
                        and self.current_url in [link_address,
                                                 link_address[:-1],
                                                 link_address[:-2],
                                                 link_address + '/']):
                    # A "#" at the end means user
                    # will stay on the same page. (Valid scenario)
                    pass
                elif self.current_url not in [link_address,
                                              home_page + link_address[1:]]:
                    # if user lands up in a page different
                    # from that intended
                    if link_name:
                        failed.append(link_name)
                    else:
                        failed.append(link_address)
                if len(self.window_handles) > 1:
                    # close newly opened window
                    self.close()
                    # Switch to main browser window
                    self.switch_to_window(initial_window)
                # Reload the page under test before checking the next link
                self.get(url)
    except Exception as e:
        return ['Exception occurred while checking', e]
    return failed


# call defined function to check broken links in carwale.com home page
print(check_page_broken_links(browser, "http://www.carwale.com/"))
If there are any broken links on the page at the URL you passed to the function, they will be printed out as a list at the end of program execution.
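Note that the function returns a list in all three cases: no broken links, some broken links, or an exception. If you want a more readable message than the raw list, something like the following sketch could be used. The helper name report is my own, and it assumes the first element is the literal string 'Exception occurred while checking' whenever the check was aborted.

```python
def report(result):
    """Turn the return value of the link checker into a one-line message."""
    if result and result[0] == "Exception occurred while checking":
        # The second element is the exception that stopped the check.
        return "Check aborted: %s" % (result[1],)
    if not result:
        return "All links work."
    return "%d broken link(s): %s" % (len(result), ", ".join(result))


# Hypothetical return values, not real results from carwale.com.
print(report([]))                                      # All links work.
print(report(["About Us", "http://www.carwale.com/x"]))
print(report(["Exception occurred while checking", ValueError("boom")]))
```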