Wednesday, August 21, 2013

Ways of dealing with StaleElementReferenceException in Selenium Webdriver

Lately I've been spending a lot of time debugging a bunch of flaky tests.  A while back I wrote a bunch of tests that were executing fine locally, but since moving it to a 3rd party grid, and running it across a larger range of browsers, I've find my self dealing with many flakiness issues, including 
StaleElementReferenceException.

What causes them?

In short, a StaleElementException is calling a reference on an element that has been torn down.  There are many causes for them, here are some of the common ones I've seen.

  • Page Refresh - I've seen this commonly working with tabbed pages.  Sometimes a page will look like 1 page with multiple tabs, but really are completely separate pages.  It might look like the same tab bar, but in reality you'll have to re-find the element in order to work with it.
  • JavaScript MVC frameworks - Many JavaScript MVC frameworks can update contents of a list or a control by rapidly tearing down and recreating the same element.  There are times an element looks like only the text is updating, but in reality the entire element is torn down and rebuilt.
  • Elements in transition - An interesting trick web developers use to create interesting transitions is to clone an section or div (and everything inside), then manipulate the clone that's overlaid on top the actual elements.  For example, to make a simple slide in transition to appear as if the old contents were sliding out and the new contents are sliding in, you could create a clone or the contents, then animate that clone to slide out while the new contents are in an identical div that's sliding in.
  • Elements being moved or reordered - If an element disappears off the screen to be relocated, it'll cause the original reference to go stale.
  • Elements being re-rendered - A good example is an ajax request that dumps it's return into the page.


Strategies to dealing with them

Here are some strategies I've used for dealing with some of the causes of StaleElementExceptions.  For the purpose of this article, I'll use Python since that's the language I'm currently using.

Note: In many of the examples, you'll see a 'wait_until()' method.  This method is just a simple wrapper function I created that executes a given statement until it returns true. You can find the source code for it here, https://github.com/wiredrive/wtframework/blob/master/wtframework/wtf/utils/wait_utils.py

Storing locators to your elements instead of references

This does not always work, but it'll reduce StaleElementExceptions by always performing a new find operation before each call to the WebElement.  It reduces staleness, but does add a bit of CPU and bandwidth usage to the overall test.  It is however very easy to do and work with.
from selenium import webdriver
import time
import unittest

class Test(unittest.TestCase):

    def testName(self):

        driver = webdriver.Firefox()
        driver.get("http://www.github.com")
        search_input = driver.find_element_by_name('q')
        search_input.send_keys('hello world\n') # Page contents refresh after typing in search results.
        time.sleep(5)
        search_input.send_keys('hello frank\n') # StaleElementReferenceException
But by storing the locator function instead of the reference to the element itself, we can avoid this.
from selenium import webdriver
import time
import unittest


class Test(unittest.TestCase):

    def testName(self):
        driver = webdriver.Firefox()
        driver.get("http://www.github.com")
        search_input = lambda: driver.find_element_by_name('q')
        search_input().send_keys('hello world\n') 
        time.sleep(5)
        search_input().send_keys('hello frank\n') # no stale element exception

This is because calling 'search_input()' re-evaluates the lambda statement and does a find on the element again.  This prevents staleness by getting a fresh reference to the WebElement each time.  

This technique is good for elements that get refreshed after an action, but don't necessarily change.  Like a 'Save' button that gets redrawn when a new entry is loaded, but has the same recognition properties.

Leverage hooks in the JS libraries used

Various frameworks and libraries will have different hooks that let you poll the state.  For example Jquery, has an animation queue, http://api.jquery.com/queue/,  you can query to detect if any transition effects are still running on the WebElement.  Polling the animation queue is a good way to avoid staleness that's caused in transition effects, such as slide in/out, fades, moving panes, etc...

It's pretty simple to poll on the transition queue.  You first need to figure out the selector (in that particular library/framework) for the element, then execute a javascript polling loop on it.

        # Using Jquery queue to get animation queue length.
        animationQueueIs = """
        return $.queue( $("#%s")[0], "fx").length;
        """ % element_id
        wait_until(lambda: self.driver.execute_script(animationQueueIs)==0)


This technique is best for waiting on animations or transitions to complete.  Most common one I've used it for is to wait for a sliding div containing elements I want to verify to completely slide into view before continuing.

Proactively wait for the element to go stale

This strategy is good if you have a reference to the element you want to wait for already and know  you are performing an event that'll update that element.  You can use this for waiting for an Ajax request to respond.  For example, say you have an counter that counts entries, and a button that creates a new entry.  And you want to verify that the count is incremented when a new entry is created.  You can wait for the counter to go stale before re-finding the counter and verifying it.

def is_element_stale(webelement):
    """
    Checks if a webelement is stale.
    @param webelement: A selenium webdriver webelement
    """
    try:
        webelement.tag_name
    except StaleElementReferenceException:
        return True
    except:
        pass
    
    return False

...


old_link_reference = driver.find_element_by_xpath("//div[@id='linklist']")

self.next_button().click()

# Wait till the element goes stale, this means the list has updated
wait_until(lambda: is_element_stale(old_link_reference))

updated_link_reference = driver.find_element_by_xpath("//div[@id='linklist']/*[text()='mylink']")

What this does is it waits for the old reference link to go stale, meaning the list has been cleared out and updated, then attempts to select the new link.  This approach is good to use when searching for an entry in a paginated list.  This allows you to stop polling and paginate to the next page the moment the ajax request for the next page returns, instead of waiting for some fixed timeout period polling for the desired element.

Javascript Latching

Javascript latching is a good technique when you have access to the source code, and can build in some test hooks into the webpage.  How JavaScript latching works is you add a promise to a asyc call in your Page's javascript source.

 //----------------------------------------------------------
 // Notice JQuery ajax calls return Promise objects.
 //----------------------------------------------------------
 var request = $.get("http://somewhere.com", {params:'...'})
               .then(function() {
                   // code developer wrote to process response...
                   // ......
                }).always(function(){
                   //----------------------------------------
                   // Notice I just daisy chain my LATCH onto  
                   // this code when the action completes.
                   //----------------------------------------
                   console.log("LATCH SET");
                   window._AUTOMATION_LATCH = true;
                });

Then in your Selenium test code, you can poll on the latch to wait on it to be set before continuing.

self.driver.find_element_by_id("searchInput").send_keys("Tom Dale")
        
self.wait_until(lambda: self.driver.execute_script("return window._AUTOMATION_LATCH;") == True)
        
# Then verify only tom dale is seen.
results = self.driver.find_elements_by_tag_name("li")
self.assertTrue( "Tom Dale" in results[0].text )

This is good for waiting on items that are manipulated by a complex set of asynchonous JavaScript.  The main advantage to this technique is you can easily daisy change this to the promises your JavaScript developers are already using to perform their async actions.

Moving your actions into JavaScript injection

This technique is good for situations when elements are rapidly being refreshed on the page.  By pushing your actions into the JavaScript side, you can execute your find and your action in the same context as the page, thus making it a synchronous call.

self.driver.execute_script("$(\"li:contains('Einstein')\").click()")

In this technique, we combine both the selection and the action into one synchronous JavaScript call (using jquery).  The reason why this is more reliable than doing on the python end like this,

self.driver.find_element_by_xpath("//li[contains(.,'Albert Einstein')]").click()
Is that in reality, the selection and the click action will actually be sent as 2 separate calls when it goes over the wire on Selenium's JSON wire protocol.  Doing this all in JavaScript will ensure the call gets executed in the same frame.

Getting your tests to stable

If you want to learn more about these techniques or see them in action.  I created a sample project demonstrating some of the techniques.  Feel free to fork this project and contribute code samples of your own on how to deal with Stale Element problems.

4 comments:

chicklitt said...

Thanks for sharing your techniques. This has been a frustrating occurrence in our test application. I look forward to trying some of these ideas.

Unknown said...

Thank you so much, trick to call making lambda function did work. Thanks a ton.

Unknown said...

could you please give the Java version of

from selenium import webdriver
import time
import unittest


class Test(unittest.TestCase):

def testName(self):
driver = webdriver.Firefox()
driver.get("http://www.github.com")
search_input = lambda: driver.find_element_by_name('q')
search_input().send_keys('hello world\n')
time.sleep(5)
search_input().send_keys('hello frank\n') # no stale element exception

harijay said...

Thanks this was super helpful. I upgraded my selenium , python and Firefox after a computer hardware failure and then all my tests were failing with staleelement exceptions and un-exptected page refreshes which I had not seen before. Adding in the lambdas and some more sleeps as you recommend helped getting the code working again and also good to keep in mind as a best practice.