Have you ever built a list in Octoparse? Have you noticed that a loop mode is selected automatically whenever a loop is created? It happens so quietly that you may stop noticing it once you have used Octoparse for a while. In this article, however, I would like to point out a few scraping scenarios in which you may want to switch manually from one mode to another.

Consider manually selecting or switching a loop mode if you want to:

Speed up an extraction task by splitting it

Search a website with multiple keywords, then extract the search results

Extract from multiple URLs with a similar page layout

Speeding up an extraction with a splittable list (using a Fixed List/URL List)

A variable list follows a single XPath and matches every element on a webpage (however many there are) that meets the criteria defined by that XPath. A variable list cannot be split. By contrast, a fixed list and a URL list can both be split; hence, consider manually changing a variable list into a fixed list or URL list if you need to split a task for faster extraction.

Set up a crawler to first capture the URLs of all the webpages that share a similar structure, then build a second crawler to visit each URL on the list and extract from it using the same configuration (learn how).
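To make the idea of splitting concrete, here is a minimal Python sketch of dividing a list of captured URLs into near-equal chunks, each of which could be handled by a separate run. It only illustrates the concept and is not how Octoparse splits tasks internally; the function name split_list and the example.com URLs are made up for this example.

```python
def split_list(items, parts):
    """Split items into `parts` contiguous chunks of near-equal size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    # The first `extra` chunks each receive one additional item.
    base, extra = divmod(len(items), parts)
    chunks = []
    start = 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


# Hypothetical URLs captured by the first crawler.
urls = [f"https://example.com/item/{n}" for n in range(1, 8)]
chunks = split_list(urls, 3)
for number, chunk in enumerate(chunks, start=1):
    print(f"sub-task {number}: {len(chunk)} URLs")
```

Because a fixed list or URL list is known in full before the run starts, it can be divided this way. A variable list is only resolved by its XPath while the page is loaded, which is why it cannot be split in advance.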

Search a website with different keywords and capture the search results (using a Text List)

If you want to search and extract, you need to provide Octoparse with a list of keywords to search for. This is done by setting up a loop in Text List mode. Once the extraction runs, Octoparse automatically searches the first keyword, captures the search results, searches the second keyword, captures the corresponding results, and so on.

The detailed steps are:

Step 1: Click on the search box

Step 2: Select “Enter text value”

Step 3: Drag a Loop action to the workflow

Step 4: Select “Text List” for loop mode

Step 5: Copy and paste the list of keywords into the text box, then click “Save”

Step 6: Drag the Input Text action into the loop

Step 7: Under Advanced Options, check “Use the text in the loop item to fill in the text box”, then click “Save”
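Before pasting keywords in Step 5, it can help to tidy the list so that each keyword sits on its own line with no blanks or duplicates. The Python sketch below shows one way to do that and then mimics the order in which a text-list loop would work through the items. It assumes one keyword per line; the helper name clean_keywords and the sample keywords are invented for illustration and are not part of Octoparse.

```python
def clean_keywords(raw_text):
    """Return unique, non-empty keywords, one per input line, in original order."""
    seen = set()
    keywords = []
    for line in raw_text.splitlines():
        keyword = line.strip()
        # Compare case-insensitively so "Laptop" and "laptop" count as one keyword.
        if keyword and keyword.lower() not in seen:
            seen.add(keyword.lower())
            keywords.append(keyword)
    return keywords


# Hypothetical raw input with a blank line and a duplicate.
raw = "laptop\n\n Laptop \nred shoes\n"
keywords = clean_keywords(raw)

# Text to paste into the keyword box, one keyword per line.
print("\n".join(keywords))

# The loop handles one keyword per iteration: fill the search box, then capture results.
for position, keyword in enumerate(keywords, start=1):
    print(f"iteration {position}: search for {keyword!r}")
```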

Scrape from a list of URLs following the same webpage structure (using a URL List)

Octoparse extracts data from a webpage by interacting with the website and scanning the page for specific web elements according to the task configuration. Hence, in order to grab data consistently and accurately from multiple pages, it is important that those pages share the same page structure, for example, product detail pages on an e-commerce website (example), business detail pages on a directory website (example) or even user pages on a social media website (example). Pages like these, which essentially “look” the same, can be scraped efficiently with a loop in URL List mode.

The detailed steps are:

Step 1: Drag a Loop action to the workflow

Step 2: Select “URL List” for Loop Mode

Step 3: Copy and paste the pre-aggregated list of URLs into the text box, then click “Save”
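A pre-aggregated list often contains blank lines, duplicates or stray text, so it is worth cleaning it before pasting it in Step 3. The following Python sketch keeps only well-formed http/https URLs, one per line, in their original order. The function name prepare_url_list and the sample URLs are assumptions made for this example, and the check does not verify that the pages actually share the same layout.

```python
from urllib.parse import urlparse


def prepare_url_list(raw_text):
    """Return unique http/https URLs from raw_text, one per input line, in order."""
    seen = set()
    urls = []
    for line in raw_text.splitlines():
        url = line.strip()
        if not url or url in seen:
            continue
        parts = urlparse(url)
        # Keep only entries that have a web scheme and a host name.
        if parts.scheme in ("http", "https") and parts.netloc:
            seen.add(url)
            urls.append(url)
    return urls


# Hypothetical raw list with a duplicate, a blank line and a non-URL entry.
raw = (
    "https://example.com/p/1\n"
    "https://example.com/p/1\n"
    "not a url\n"
    "\n"
    "http://example.com/p/2\n"
)
print("\n".join(prepare_url_list(raw)))
```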