Enter the target URL in the built-in browser (you can search for anything on Jalan.net or copy the example URL here)

Click the "Go" icon to open the webpage

Step 3. Creating a list of items

We can see that all the hotel information sections share a similar layout, which means we can make a list of all these sections.

Move your cursor to the first hotel information section

Click when the highlighted part covers the whole block

If you cannot select the right part, just click anywhere in the first block and keep clicking the “Expand Area” button until the red dotted line encircles the whole block.

Click "Create a list of items"

Click "Add current item to the list"

Now that the first section has been added to the list, we need to add the remaining sections as well.

Click "Continue to edit the list"

Click the second section with similar layout

Click "Add current item to the list" again

Now all the sections have been added to the list.

Click "Finish Creating List"

Click "Loop", which means Octoparse will go through the list to extract data
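
For readers who think in code, the "list plus loop" built in this step is roughly comparable to selecting every element that shares a layout and then iterating over the result. The sketch below uses only Python's standard library on a made-up, well-formed snippet. The `hotel` class name and the markup are assumptions for illustration, not Jalan.net's actual HTML, and this is not a description of how Octoparse works internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for a results page.
PAGE = """
<div>
  <div class="hotel"><h2>Hotel A</h2><span class="price">8,000</span></div>
  <div class="hotel"><h2>Hotel B</h2><span class="price">12,500</span></div>
  <div class="ad">Sponsored</div>
</div>
"""

root = ET.fromstring(PAGE)

# "Create a list of items": collect every block with the same layout
# (here approximated by a shared class attribute).
hotel_blocks = root.findall(".//div[@class='hotel']")

# "Loop": visit each block in the list in turn.
for index, block in enumerate(hotel_blocks, start=1):
    print(index, block.find("h2").text)
```

The block with a different layout (the `ad` element) is left out of the list, which is the same reason only similar sections are added in the steps above.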

Step 4. Select the data to be extracted and rename data fields.

In this step, we begin extracting data from the loop list of hotel information sections. Navigate to the "Extract Data" action and click it. The first information section is now outlined with a green dotted line, which means data will be extracted only from within this section. Note that the extraction action we set up for this section will be applied to every other item in the list. Say we want to capture the hotel name and price:

Click the hotel name

Select "Extract text"

Follow the same steps to extract the other data

Rename any field if necessary

Click "Save"
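
The idea that one extraction rule, defined on the first section, is applied to every item in the list can be sketched as follows. This is a minimal illustration under stated assumptions: the markup, the default names `Field1` and `Field2`, and the renamed fields `hotel_name` and `price` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for a results page.
PAGE = """
<div>
  <div class="hotel"><h2>Hotel A</h2><span class="price">8,000</span></div>
  <div class="hotel"><h2>Hotel B</h2><span class="price">12,500</span></div>
</div>
"""

# "Rename any field": map placeholder field names to descriptive ones.
FIELD_NAMES = {"Field1": "hotel_name", "Field2": "price"}


def extract_block(block):
    """Extract text from one block; both paths are relative to that block."""
    raw = {
        "Field1": block.findtext("h2"),
        "Field2": block.findtext("span[@class='price']"),
    }
    return {FIELD_NAMES[key]: value for key, value in raw.items()}


root = ET.fromstring(PAGE)

# The same rule runs once per item in the list.
rows = [extract_block(block) for block in root.findall(".//div[@class='hotel']")]
print(rows)
```

Because the paths are relative to each block, the rule never needs to know how many hotels are on the page.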

Step 5. Set up pagination

Now we need to page through the results to extract as much data as possible, by setting up a pagination action.

Click “次へ” (the "Next" link)

Choose “Loop click the element”

Because the XPath for “次へ” differs from page to page, we need to modify the XPath so that it locates the link reliably.

Go to the “Advanced Option”

Enter the correct XPath in the “Single Element” box: //*[text()='次へ']

Click “Save”
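
To see why matching on the link text is more robust than matching on position, consider the sketch below. The pager markup is hypothetical. Python's `xml.etree.ElementTree` does not support the XPath `text()` function, so the sketch uses its closest equivalent, `[.='次へ']`, which compares an element's full text content; treat it as an approximation of the expression above rather than the same expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical pager markup: the "next" link is the 3rd link on page 1
# but the 4th link on page 2, because a "previous" link has appeared.
PAGER_PAGE_1 = "<div><a>1</a><a>2</a><a>次へ</a></div>"
PAGER_PAGE_2 = "<div><a>前へ</a><a>1</a><a>2</a><a>次へ</a></div>"


def find_next_link(pager_xml):
    """Return the first element whose text content is exactly 次へ, or None."""
    root = ET.fromstring(pager_xml)
    matches = root.findall(".//*[.='次へ']")
    return matches[0] if matches else None


for pager in (PAGER_PAGE_1, PAGER_PAGE_2):
    by_position = ET.fromstring(pager).find("a[3]").text  # position-based path
    by_text = find_next_link(pager).text                  # text-based path
    print(by_position, by_text)
```

On the second page the position-based path lands on a page number, while the text-based path still finds the “次へ” link.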

Because “次へ” is present even on the last page, the loop will not end by itself. There are 6 pages in total, so we need to end the loop after clicking “次へ” 5 times. Set this number according to how many pages you want to extract.

Go to the “Advanced Option”

Open "End loop when"

Tick "Execution time reach"

Enter "5" in the box

Click "Save"
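
The arithmetic behind the value 5 can be sketched as a loop with an explicit click limit. The `next_page` function below is a hypothetical stand-in for clicking “次へ”; it always succeeds, mirroring the fact that the link is present even on the last page.

```python
TOTAL_PAGES = 6                 # from the tutorial: 6 result pages
MAX_CLICKS = TOTAL_PAGES - 1    # page 1 loads without a click, so 5 clicks


def next_page(current):
    """Stand-in for clicking the "next" link.

    The link always exists, so this call never signals "stop" by itself.
    """
    return min(current + 1, TOTAL_PAGES)


visited = [1]   # the first page is already open
clicks = 0
while clicks < MAX_CLICKS:      # end the loop once 5 clicks have been made
    visited.append(next_page(visited[-1]))
    clicks += 1

print(visited, clicks)
```

Without the click limit the loop condition would have to rely on the link disappearing, which does not happen on this site.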

Step 6. Start running your task

Now we are done configuring the task, and it's time to run it to get the data we want.

Click "Next"

Click "Next"

Click "Local Extraction"

There are two options: Local Extraction and Cloud Extraction (premium plan). With Local Extraction, the task runs on your own machine. With Cloud Extraction, the task runs on the Octoparse Cloud Platform, so you can set it up, turn off your desktop or laptop, and the data will be extracted and saved to the cloud automatically. Features such as scheduled extraction, IP rotation and the API are also supported in the Cloud. Find out more about Octoparse Cloud here.

Step 7. Check and export the data

After the extraction completes, we can check the extracted data or click the "Export" button to export the results to an Excel file, a database or another format and save the file to your computer.
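
As a rough picture of what a tabular export contains, the sketch below writes two rows to CSV text with Python's standard `csv` module. The rows and the field names `hotel_name` and `price` are made-up sample values, and the exact layout of an Octoparse export may differ.

```python
import csv
import io

# Hypothetical extracted rows, one dictionary per hotel.
rows = [
    {"hotel_name": "Hotel A", "price": "8,000"},
    {"hotel_name": "Hotel B", "price": "12,500"},
]


def to_csv(records):
    """Serialize a list of dictionaries to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv(rows)
print(csv_text)
```

Values that contain a comma, such as the prices here, are quoted automatically so that spreadsheet programs read them as a single cell.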

Done!

To learn more about how to crawl data from a website, you can refer to these tutorials: