
Element Selection Guide

Naibo Wang edited this page Feb 13, 2023 · 10 revisions

To get data from a web page, the first step is to select an element on the page. Once an element is selected, we can define subsequent operations on it.

Element selection involves the following concepts.

Waiting Element

The Waiting Element is the element that is ready to be selected. Move the mouse pointer over the web page and the element under the pointer is marked with a light-purple background, which means it is currently waiting to be selected:

image

Select Element(s)

To select an element, right-click it or press F7 on the keyboard. The selected element is marked with a bright-blue background:

image

You can then select the 2nd/3rd/.../n-th element in the same way, by right-clicking or pressing F7 again:

image

Operation Toolbox

After an element is selected, the Operation Toolbox shows the available options. The toolbox can be dragged freely:

image

Revoke Selection

If you have selected several elements but no longer want the most recently selected one, click the Revoke selection option on the toolbox to remove only that element from the selection.

For example, suppose you selected three elements on the web page, in this order:

(1) The "Daily Deals" link on the top bar.

(2) The "Shop by category" option.

(3) The "ebay" image (logo).

If you no longer want the "ebay" image (3), click the "Revoke selection" option to deselect it. The other two elements (1 and 2) remain selected:

image

Deselect Element

If you want to cancel the current selection, click the Deselect button on the toolbox to deselect all selected elements:

image

Expand Path

Sometimes you want to select an element, but what actually gets selected is its child element. For example, the XPath of the element you want is /html/body/div/a, but the XPath of the element actually selected is /html/body/div/a/span, and you cannot select the "a" tag directly because it has no width or height of its own. In this case, click the Expand Path button on the toolbox to expand the XPath of the currently selected element to its parent element:

image

The expanded selection area is then marked on the page:

image
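As a rough illustration of what expanding a path means for the XPath itself, the sketch below drops the last location step of a simple absolute XPath. The function name `expand_path` is hypothetical and this is not EasySpider's actual implementation; it also assumes the path contains no "/" inside predicates.

```python
def expand_path(xpath: str) -> str:
    """Return the parent's XPath by dropping the last location step.

    Illustrative helper only (hypothetical name, not an EasySpider API).
    Assumes a simple path such as /html/body/div/a/span.
    """
    trimmed = xpath.rstrip("/")
    parent, _, _ = trimmed.rpartition("/")
    # If nothing is left, we were already at the top-level element.
    return parent or "/"


print(expand_path("/html/body/div/a/span"))  # /html/body/div/a
```

Applying it to the example above turns the path of the "span" into the path of the "a" tag that contains it.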

Auto Similar Element(s) Match

EasySpider can automatically detect similar elements, which is very useful when we want to get data from a list, e.g., all product titles or prices on ebay.

Take the ebay collection task above as an example. First, we select the title of the first product in the list. All other product titles are then marked with a blue border, which means they are "waiting to be selected":

image

Then we can either click the "Select All" option on the toolbox to select all matched elements at once, or select the 2nd/3rd/.../n-th matched element individually by right-clicking it:

image

image
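One way to picture similar-element matching is as a generalization of the selected element's XPath: removing the positional index lets the path match every sibling in the list. The sketch below shows this idea with Python's standard library on a made-up page. The `PAGE` markup and the `generalize` helper are assumptions for illustration, and EasySpider's real detection logic may work differently.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified product list (not real ebay markup).
PAGE = """
<html><body>
  <ul>
    <li><a>Product A</a><span>$10</span></li>
    <li><a>Product B</a><span>$12</span></li>
    <li><a>Product C</a><span>$15</span></li>
  </ul>
</body></html>
"""


def generalize(xpath: str) -> str:
    """Remove positional predicates such as [1] so sibling elements match too."""
    return re.sub(r"\[\d+\]", "", xpath)


root = ET.fromstring(PAGE)          # root is the <html> element
selected = "./body/ul/li[1]/a"      # title of the first product, relative to <html>
pattern = generalize(selected)      # "./body/ul/li/a"
titles = [a.text for a in root.findall(pattern)]
print(titles)  # ['Product A', 'Product B', 'Product C']
```

Selecting only the first title yields a pattern that covers every title in the list, which corresponds to the elements highlighted as "waiting to be selected".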

Manually Adjust Similar Elements

Sometimes EasySpider finds more than one possible set of "Similar Elements". For example, both of the following types of elements can count as similar elements for a link on Google:

image

This is because the elements under (1) are all links, while the elements under (2) share the same parent.

In this case, EasySpider suggests one type of similar elements. If the suggested elements (such as (2)) are not the ones we want, we can select a second element of the intended type on the web page, and EasySpider will change the similar-element pattern accordingly:

image

In another case, EasySpider may detect fewer similar elements than we expect:

image

Here, only the products under the "Score these trending kicks" section are detected, but we also want the products under the "Feel-good fashion at the Brand Outlet" section. To include them, we similarly select one product in the "Feel-good fashion at the Brand Outlet" section by hand, and all similar products are then detected:

image

Then click the "Select All" option in the toolbox to select all of the elements.
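The effect of selecting a second element by hand can be thought of as merging two XPaths: wherever the two paths differ only in a positional index, the index is dropped so the pattern covers both. The following is a minimal sketch of that idea. The helper `merge_xpaths` and the example paths are hypothetical, and this is not taken from EasySpider's source code.

```python
import re
from typing import Optional

# A location step: a tag name with an optional positional index, e.g. "div[2]".
STEP = re.compile(r"^([^\[]+)(?:\[(\d+)\])?$")


def merge_xpaths(first: str, second: str) -> Optional[str]:
    """Build a pattern matching both paths, or return None if they are too different."""
    a_steps = first.strip("/").split("/")
    b_steps = second.strip("/").split("/")
    if len(a_steps) != len(b_steps):
        return None
    merged = []
    for a, b in zip(a_steps, b_steps):
        if a == b:
            merged.append(a)
            continue
        ma, mb = STEP.match(a), STEP.match(b)
        # Steps may differ only in their index, never in the tag name.
        if not ma or not mb or ma.group(1) != mb.group(1):
            return None
        merged.append(ma.group(1))
    return "/" + "/".join(merged)


# Pattern for the first section, plus one product picked by hand in the second section.
print(merge_xpaths("/html/body/div[1]/ul/li/a", "/html/body/div[2]/ul/li[3]/a"))
# /html/body/div/ul/li/a
```

The merged pattern no longer pins the section index, so products in both sections would match it.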

Select Child Elements

An element on a web page may contain many useful child elements. For example, the following block contains the title, price, discount, number of watchers, warranty, etc. of a product:

image

Selecting them by hand one by one is cumbersome, especially when we want to select many elements in a list. EasySpider therefore provides the "Select child elements" option, which selects all child elements with one click. The steps are:

  1. Select one element by right-clicking it or pressing F7.

  2. Click the "Select child elements" option on the toolbox.

All child elements are then selected by EasySpider:

image

To remove child elements you do not need, configure them in the Workflow Manager.
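To make the idea concrete, the sketch below walks a made-up product block with Python's standard library and collects every nested element with its text, then filters out the ones with no text of their own. The `BLOCK` markup is an assumption for illustration, and whether EasySpider collects all descendants or only direct children is not stated here, so treat this as an approximation of the behavior.

```python
import xml.etree.ElementTree as ET

# Hypothetical product block (not real ebay markup).
BLOCK = """
<div>
  <h3>Wireless Headphones</h3>
  <span>$59.99</span>
  <p><b>20% off</b> <i>35 watchers</i></p>
</div>
"""

block = ET.fromstring(BLOCK)

# iter() yields the block itself first, then every descendant in document order.
children = [
    (el.tag, (el.text or "").strip())
    for el in block.iter()
    if el is not block
]

# Drop elements without text of their own, similar in spirit to removing
# unneeded child elements afterwards.
useful = [(tag, text) for tag, text in children if text]
print(useful)
# [('h3', 'Wireless Headphones'), ('span', '$59.99'), ('b', '20% off'), ('i', '35 watchers')]
```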