XPath selectors


To select HTML elements from your page, you can use XPath selectors: expressions that extract the nodes you need. The nodes are found by following a path through the HTML document, either downwards from a known node (to its descendants) or upwards (to its ancestors). To find elements using XPath, pick the pattern below that suits your search:

  • Select a node by its name: nodename. Prefix the name with // to match that node anywhere in the document. Example: if the node is an h3 element, the selector is: //h3.
  • Select all the nodes that have a certain attribute: //nodename[@attribute]. Example: select all h3 elements that have a class: //h3[@class].
  • Select all the nodes for which a specified attribute has a specified value: //nodename[@attribute='value']. Example: select all h3 elements whose class has the value ‘aClass’: //h3[@class='aClass'].
  • Select the nth child element of a node, having a certain element type: //node/child[n]. Counting starts at 1. Example: select the third ‘li’ child of a ‘ul’ element: //ul/li[3].
  • Select the parent of a node: append /.. to the selector. Example: the parent of a div element that has the class ‘aClass’: //div[@class='aClass']/..
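The patterns above can be tried out with a short script. This is only a minimal sketch: it uses Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath, and the markup in it is made up for the illustration. ElementTree paths are relative to an element, so //h3 is written as .//h3 here.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, invented for this illustration only.
doc = ET.fromstring("""
<body>
  <h3 class="aClass">First</h3>
  <h3>Second</h3>
  <ul>
    <li>one</li>
    <li>two</li>
    <li>three</li>
  </ul>
  <section><div class="aClass">inner</div></section>
</body>
""")

# All h3 elements that have a class attribute.
with_class = doc.findall(".//h3[@class]")

# All h3 elements whose class attribute equals 'aClass'.
a_class = doc.findall(".//h3[@class='aClass']")

# The third li child of a ul element (positions start at 1).
third_li = doc.find(".//ul/li[3]")

# The parent of the div whose class is 'aClass'.
parent = doc.find(".//div[@class='aClass']/..")

print([e.text for e in with_class])
print([e.text for e in a_class])
print(third_li.text)
print(parent.tag)
```

A browser or a full XPath 1.0 engine would accept the selectors exactly as written in the list, with the leading //.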

Let’s apply some of these selectors to a piece of HTML code:

<div class="firstClass">
<li>
<div class="secondClass andThirdClass" >
  <h3>
   <a href="aRandomLink"> randomText</a>
  </h3>
  <div class="fourthClass">
    <div>
      <div class="fifthClass">
        <a class="someSpecialClass" href="#" id="firstId">
        <span>Some text here</span>
        </a>
      </div>
      <span>Yet more text</span>
    </div>
   </div>
  </div>
</li>
<li>...same structure as the first li...</li><li>...same structure as the previous two lis</li>
</div>

 

To select:

  • the h3 element (one is matched in each li): //h3
  • the a element whose class attribute is someSpecialClass: //a[@class='someSpecialClass']
  • the span element that has ‘Some text here’ as label: //div[@class='fifthClass']/a/span
  • the span element that has ‘Yet more text’ as label: //div[@class='fourthClass']/div/span
  • the first li element of the first div: //div[@class='firstClass']/li[1]
  • the second li element of the first div: //div[@class='firstClass']/li[2]
  • all div elements whose class attributes contain the string ‘Class’: //div[contains(@class, 'Class')]
  • all div elements whose class attribute contains both the string ‘second’ and the string ‘Third’: //div[contains(@class, 'second') and contains(@class, 'Third')]
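To check these selectors against the sample, here is a rough sketch using Python's standard xml.etree.ElementTree module. Several things in it are assumptions: the three li elements are built by repeating the first one, a body wrapper is added so that the firstClass div is not the root, and the href quote is written as a plain double quote so the markup parses as XML. ElementTree implements only part of XPath and has no contains(), so the last two selections are emulated in plain Python; a full XPath 1.0 engine (for example the xpath() method in lxml) would accept the expressions as listed.

```python
import xml.etree.ElementTree as ET

# One li from the sample above.
LI = """
<li>
  <div class="secondClass andThirdClass">
    <h3><a href="aRandomLink"> randomText</a></h3>
    <div class="fourthClass">
      <div>
        <div class="fifthClass">
          <a class="someSpecialClass" href="#" id="firstId">
            <span>Some text here</span>
          </a>
        </div>
        <span>Yet more text</span>
      </div>
    </div>
  </div>
</li>
"""

# Assumption: the body wrapper and the three identical li elements.
root = ET.fromstring('<body><div class="firstClass">' + LI * 3 + "</div></body>")

# ElementTree paths are relative, so "//x" is written ".//x".
h3s = root.findall(".//h3")
special_links = root.findall(".//a[@class='someSpecialClass']")
some_text = root.findall(".//div[@class='fifthClass']/a/span")
more_text = root.findall(".//div[@class='fourthClass']/div/span")
first_li = root.findall(".//div[@class='firstClass']/li[1]")
second_li = root.findall(".//div[@class='firstClass']/li[2]")

# Emulation of contains(@class, ...) with ordinary string tests.
class_divs = [d for d in root.iter("div") if "Class" in d.get("class", "")]
second_third = [
    d for d in root.iter("div")
    if "second" in d.get("class", "") and "Third" in d.get("class", "")
]

print(len(h3s), len(special_links), len(first_li), len(second_li))
print(some_text[0].text, "|", more_text[0].text)
print(len(class_divs), len(second_third))
```

Because every li has the same structure, most selectors match once per li; only the li[1] and li[2] selectors single out one element.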