

There is still more that we can do with movie data, depending on your needs. To help you with data gathering, this article shows how to scrape information from the IMDb Horror movie list, including director information, the cast of actors, and some other important information. In this case, I’ll show you how to scrape information on 134,555 Horror movies from IMDb, using the link:

The goal of this web scraper is to find the films listed on the Horror movie list and obtain the director, the cast of actors, and some other important information for each one.

Before getting started, please download Octoparse V7 on your computer so that you can follow along. It’s also highly recommended to learn the basic logic of using Octoparse.

Step 1: Open the target website in the Octoparse built-in browser.

Simply click “+task” under the Advanced Mode. Then, paste the URL into the box and click the “Save URL” button.

Step 2: Click to build a task to scrape the movie information.

Once the URL is open in the Octoparse built-in browser, we can build a pagination and a loop item to get the data. Simply click the “next>” element in the built-in browser and then click “Loop click selected element” on the Action Tips. We can see that the pagination has been built in the workflow.

If you want Octoparse to recognize the element you selected more precisely, you can simply revise the XPath. Here, we’d better change the XPath that Octoparse generated to //a.

In this case, we need to scrape the data from the movie list, so we can directly create a loop item to extract the data. Select one of the “blocks” in the browser, and Octoparse detects all the data fields in the block you selected. Then, select “Select all sub-elements”. All the needed data are now selected by Octoparse and highlighted in red. Select “Select All” to continue. Finally, select “Extract data in the loop”.

Now, we have both the pagination and the loop item done in Octoparse.
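For readers who find code easier to follow than a click-through, the workflow in this tutorial (a pagination loop that wraps a loop item, with data fields extracted from each block) can be sketched in Python. This is only an illustration of the logic, not how Octoparse works internally. The page contents, the class names `block`, `director`, and `next`, and the function name `scrape` are all made up for the example, and the pages are small in-memory stand-ins rather than real IMDb markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-ins for two result pages (NOT real IMDb markup).
PAGES = {
    "page1": """
    <html><body>
      <div class="block"><h3>Movie A</h3><p class="director">Director A</p></div>
      <div class="block"><h3>Movie B</h3><p class="director">Director B</p></div>
      <a class="next" href="page2">next</a>
    </body></html>""",
    "page2": """
    <html><body>
      <div class="block"><h3>Movie C</h3><p class="director">Director C</p></div>
    </body></html>""",
}


def scrape(start):
    """Follow 'next' links (pagination) and extract fields from every block (loop item)."""
    rows = []
    url = start
    while url is not None:
        root = ET.fromstring(PAGES[url].strip())

        # Loop item: one iteration per movie block on the current page.
        for block in root.findall(".//div[@class='block']"):
            # Sub-elements of the block become the extracted data fields.
            rows.append({
                "title": block.findtext("h3"),
                "director": block.findtext("p[@class='director']"),
            })

        # Pagination: a precise XPath picks out only the "next" link;
        # a broad path such as .//a would match every link on the page.
        next_link = root.find(".//a[@class='next']")
        url = next_link.get("href") if next_link is not None else None
    return rows


for row in scrape("page1"):
    print(row)
```

The outer `while` loop plays the role of “Loop click selected element”, and the inner `for` loop plays the role of the loop item with “Extract data in the loop”. The loop ends on the first page that has no “next” link.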
