Easily explore the DOM with find¶
In Selenium, web elements are usually retrieved with the methods find_element or find_elements. These methods work well for most basic tasks, but Selenium does not handle more complex use cases directly. For example, waiting for an element to appear in the DOM requires additional, non-trivial code. Handling all these use cases without factoring out the shared code can lead to a project that is hard to read and hard to maintain.
The function find in manen.finder aims to help you handle these use cases. With a set of arguments, you can easily adapt the function to what you need: wait for an element, retrieve one or several elements, try different selectors for the same element, etc.
This guide shows everything you can do with the find function.
First, we need an instance of Selenium WebDriver, used here for Chrome automation (but it could be any other browser).
[1]:
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.webdriver import WebDriver
from selenium.webdriver.common.selenium_manager import SeleniumManager
# Let Selenium Manager locate (or download) a ChromeDriver binary
selenium_manager = SeleniumManager()
paths = selenium_manager.binary_paths(["--browser", "chrome"])
# Start a Chrome session using the resolved driver path
service = Service(executable_path=paths["driver_path"])
driver = WebDriver(service=service)
We will use the search results for “selenium” on PyPI as a playground.
[2]:
driver.get("https://pypi.org/search/?q=selenium")
Let’s import the function from manen.finder.
[3]:
from manen.finder import find
The signature of the function is a good start to understand what you can do with it:
def find(
    selector: str | list[str] | None = None,
    *,
    inside: DriverOrElement | list[DriverOrElement] | None = None,
    many: bool = False,
    default: Any = NotImplemented,
    wait: int = 0,
):
    ...
(DriverOrElement is a type alias for Union[WebDriver, WebElement].)
Finding one or several elements¶
The first argument, and the most important one, is selector. It specifies the selection method and the selector used to locate an element. Each selector follows the format {selection_method}:{selector}, where selection_method is one of the following: xpath, css, partial_link_text, link_text, name, tag. If the selection method is omitted, it defaults to xpath when the selector starts with ./ or /, and to css otherwise.
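The defaulting rule described above can be sketched in plain Python. This is only an illustration of the rule as stated, not Manen's actual implementation: the names parse_selector and KNOWN_METHODS are hypothetical, and the assumption that an unrecognized prefix (such as the "a" in "a:hover") falls back to CSS is ours.

```python
# Illustrative sketch only: this is not Manen's source code.
# KNOWN_METHODS and parse_selector are hypothetical names.
KNOWN_METHODS = ("xpath", "css", "partial_link_text", "link_text", "name", "tag")


def parse_selector(raw: str) -> tuple[str, str]:
    """Split a raw selector into (selection_method, selector)."""
    prefix, sep, rest = raw.partition(":")
    # An explicit prefix wins, but only when it is a known selection method.
    # Assumption: this keeps CSS pseudo-classes such as "a:hover" intact.
    if sep and prefix in KNOWN_METHODS:
        return prefix, rest
    # No explicit method: XPath if the selector looks like a path, CSS otherwise.
    if raw.startswith(("./", "/")):
        return "xpath", raw
    return "css", raw


print(parse_selector("xpath://*[@id='content']//p"))
print(parse_selector("./div/a"))
print(parse_selector("ul li"))
```

Checking the prefix against a fixed set of methods, rather than splitting on any colon, is one way to avoid misreading selectors that legitimately contain a colon.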
The second argument, inside, is used to specify the context in which the element should be found.
[4]:
# Find the element with the information about the number of results
element = find("xpath://*[@id='content']//form/div[1]/div[1]/p", inside=driver)
element
[4]:
<selenium.webdriver.remote.webelement.WebElement (session="cd040c404e27870134e4343b42d77fa8", element="f.99D5A7C708561CB84620C8AE35889029.d.75E4D8CB9D3BDC1C43852C3A3BD379B3.e.1618")>
Using only these two parameters is equivalent to calling driver.find_element(By.{selection_method}, {selector}), so it returns a Selenium WebElement.
[5]:
print(element.text)
2,578 projects for "selenium"
If you want to use find_elements instead of find_element, you can set the many parameter to True.
[6]:
results = find("ul[aria-label='Search results'] li", inside=driver, many=True)
len(results)
[6]:
20
By default, Manen raises an ElementNotFound exception if the specified selectors match no elements in the area to inspect.
[7]:
find("css:i-dont-exist", inside=driver)
---------------------------------------------------------------------------
ElementNotFound Traceback (most recent call last)
Cell In[7], line 1
----> 1 find("css:i-dont-exist", inside=driver)
File ~/Documents/Projects/kodaho/manen/manen/finder.py:288, in find(selector, inside, many, default, wait)
285 return default
287 driver = inside if isinstance(inside, WebDriver) else inside.parent
--> 288 raise ElementNotFound(selectors=selectors, driver=driver)
ElementNotFound: Unable to find an element matching the selectors:
> css:i-dont-exist
Context of the exception:
- Title page: Search results · PyPI
- URL: https://pypi.org/search/?q=selenium
To avoid raising an error, you can specify a default value to be returned if no element is found, using the default parameter.
[8]:
find("i-dont-exist", inside=driver, default=None)
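The signature shown earlier declares default with the value NotImplemented, which suggests a sentinel pattern: a sentinel makes it possible to tell "no default supplied" apart from "the default is None". The following is a minimal sketch of that pattern on a plain list, assuming this interpretation is correct; first_match and NotFound are hypothetical names, not part of Manen.

```python
from typing import Any, Callable, Iterable


# Illustrative sketch only: NotFound and first_match are hypothetical names.
class NotFound(LookupError):
    """Raised when nothing matches and no default was supplied."""


def first_match(
    items: Iterable[Any],
    predicate: Callable[[Any], bool],
    default: Any = NotImplemented,
) -> Any:
    """Return the first item satisfying the predicate, or the default."""
    for item in items:
        if predicate(item):
            return item
    # NotImplemented acts as a sentinel meaning "no default supplied",
    # which leaves None free to be used as a real default value.
    if default is not NotImplemented:
        return default
    raise NotFound("no item matched")


print(first_match([1, 2, 3], lambda x: x > 5, default=None))
```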
Attempting to locate an element with different selectors¶
Another use case supported by the function is trying several different selectors to locate an element. It tries the selectors in order and returns an element as soon as one of them matches.
[9]:
# Find a link in the page. The first selector won't match but the second will
a_element = find(['fake-link-selector', 'a'], inside=driver)
print(a_element.get_property('href'))
https://pypi.org/search/?q=selenium#content
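The "first selector that matches wins" behaviour can be illustrated without a browser. In this sketch, FAKE_DOM is a hypothetical dictionary standing in for the page and find_first is a hypothetical helper; it mirrors the ordering described above and makes no claim about how Manen implements it.

```python
# Illustrative sketch only: FAKE_DOM and find_first are hypothetical stand-ins.
FAKE_DOM = {
    "a": ["<a href='#content'>", "<a href='/help/'>"],
    "h1": ["<h1>"],
}


def find_first(selectors: list[str], dom: dict[str, list[str]]) -> str:
    """Try each selector in order and return the first element found."""
    for selector in selectors:
        matches = dom.get(selector, [])
        if matches:
            # Stop at the first selector that yields a result.
            return matches[0]
    raise LookupError(f"no match for any of {selectors}")


print(find_first(["fake-link-selector", "a"], FAKE_DOM))
```

Listing the most specific selector first and a more generic one last is a common way to keep a scraper working when a page's markup changes.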
Changing the scope of the search¶
As in Selenium, instead of searching the whole page, you can restrict the scope of the search to a specific element by passing it as the inside parameter.
[10]:
# Get the name of the package in the first search result
element_name = find("h3 span.package-snippet__name", inside=results[0])
print(element_name.text)
selenium
If the inside keyword argument is a list instead of a single element, find returns a list with one result for each element in that list.
[11]:
elements = find("h3 span.package-snippet__name", inside=results)
assert isinstance(elements, list)
print("First 3 package names from the results", [element.text for element in elements][:3])
First 3 package names from the results ['selenium', 'selenium2', 'percy-selenium']
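The list behaviour amounts to running the same lookup once per context and collecting the results in order. Below is a small sketch of that dispatch, using dictionaries as hypothetical stand-ins for web elements; find_in and Context are illustrative names, and the real function may well be structured differently.

```python
from typing import Union

# Hypothetical stand-in for a WebDriver or WebElement.
Context = dict[str, str]


def find_in(selector: str, inside: Union[Context, list[Context]]):
    """Look up a selector in one context, or in each context of a list."""
    # A list of contexts yields one lookup per context, in the same order.
    if isinstance(inside, list):
        return [find_in(selector, inside=context) for context in inside]
    return inside[selector]


results = [{"name": "selenium"}, {"name": "selenium2"}]
print(find_in("name", inside=results[0]))
print(find_in("name", inside=results))
```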
Waiting for an element to appear in the DOM¶
By specifying the wait
keyword argument, you can specify the number of seconds to wait before raising an error if the error is not found. If you add a default value, it will be returned if the element is not found.
[12]:
%%time
# Try to find an element that should appear within 3 seconds, and return None if not found
find('css:i-dont-exist', inside=driver, wait=3, default=None)
CPU times: user 13.8 ms, sys: 2.31 ms, total: 16.1 ms
Wall time: 3.09 s
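The combination of wait and default can be pictured as a polling loop with a deadline. The sketch below is a generic illustration, not Manen's code: poll_until_found, its poll_interval parameter, and the convention that the lookup returns None when nothing is found are all assumptions made for the example, and Manen may rely on Selenium's own waiting utilities instead.

```python
import time
from typing import Any, Callable


def poll_until_found(
    lookup: Callable[[], Any],
    wait: float = 0,
    default: Any = NotImplemented,
    poll_interval: float = 0.05,
) -> Any:
    """Call lookup repeatedly until it returns a value or the wait elapses."""
    deadline = time.monotonic() + wait
    while True:
        result = lookup()
        # Assumption: the lookup returns None when nothing is found.
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            break
        time.sleep(poll_interval)
    # The wait elapsed: fall back to the default if one was supplied.
    if default is not NotImplemented:
        return default
    raise TimeoutError(f"nothing found within {wait} seconds")


# The element "appears" on the third attempt.
attempts = iter([None, None, "element"])
print(poll_until_found(lambda: next(attempts), wait=1))
```

With wait=0 the loop performs a single attempt, which matches the non-waiting behaviour shown in the earlier cells.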
Re-using the function¶
Some use cases call for re-using the find function with the same arguments. If you omit the selector argument, find returns a new function that behaves like the original but uses the values you passed as defaults.
For example, you can create an equivalent of the find function with a restricted search scope and a default value.
[13]:
a_div = find("div.left-layout__main", inside=driver)
# Definition of our partial function
lookup = find(inside=a_div, many=True, default=[])
# Call the partial function with different selectors
li_elements = lookup("li")
print(f"{len(li_elements)} <li> elements found")
span_elements = lookup("span")
print(f"{len(span_elements)} <span> elements found")
no_element = lookup("i-dont-exist")
assert len(no_element) == 0
20 <li> elements found
63 <span> elements found
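A function that returns a pre-configured version of itself when its main argument is omitted can be sketched with functools.partial. The example below works on a tuple of strings rather than a web page; lookup_words is a hypothetical function, and the use of functools.partial is an assumption about one possible way to get this behaviour, not a description of Manen's internals.

```python
from functools import partial
from typing import Any, Optional


def lookup_words(
    prefix: Optional[str] = None,
    *,
    inside: tuple[str, ...] = (),
    many: bool = False,
    default: Any = NotImplemented,
):
    """Find words starting with a prefix, or return a pre-configured lookup."""
    # Without a prefix, return a partial that remembers the keyword arguments.
    if prefix is None:
        return partial(lookup_words, inside=inside, many=many, default=default)
    matches = [word for word in inside if word.startswith(prefix)]
    if not matches:
        if default is not NotImplemented:
            return default
        raise LookupError(f"no word starts with {prefix!r}")
    return matches if many else matches[0]


# Configure once, then call with different prefixes.
search = lookup_words(
    inside=("selenium", "selenium2", "percy-selenium"), many=True, default=[]
)
print(search("sel"))
print(search("nope"))
```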
That’s it for the find function! Next, we will look at the Manen browser, an enhanced version of Selenium WebDriver with additional features.
[14]:
driver.quit()