It will be a long article, so I added a table of contents 👇 Fancy, right?

Table of Contents:
- Crawl an entire website with Rcrawler
- The INDEX variable
- HTML files
- So how to extract metadata while crawling?
- Extract more data without having to recrawl
- Categorize URLs using...

If you want to crawl a couple of URLs for SEO purposes, there are many ways to do it, but one of the most reliable and versatile packages you can use is rvest. Here is a simple demo from the package documentation using the IMDb website: # Package installation,...
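As a hedged sketch of the kind of rvest scrape the excerpt describes: the URL and CSS selector below are illustrative assumptions, not the exact ones from the original demo, so inspect the live page and adjust them.

```r
# Minimal rvest sketch: pull film titles from an IMDb page.
# install.packages("rvest")  # run once
library(rvest)

page <- read_html("https://www.imdb.com/chart/top/")  # illustrative URL

titles <- page %>%
  html_elements("h3.ipc-title__text") %>%  # selector is an assumption; check with your browser's inspector
  html_text2()                             # clean, whitespace-normalized text

head(titles)
```

The same `read_html()` / `html_elements()` / `html_text2()` pattern works for any page: only the URL and the selector change.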

R and RStudio are great, but sometimes it's better to just export your data to use it elsewhere, or simply to show it to other people. Here is a review of possible techniques: Export your data into a CSV assuming your data is stored inside a df variable, fairly...
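A minimal sketch of the CSV export the excerpt starts describing, using only base R; the `df` data frame here is a toy stand-in for your own data.

```r
# Toy data frame standing in for your own results
df <- data.frame(url = c("/home", "/about"), clicks = c(120, 45))

# Write it to a CSV file; row.names = FALSE keeps the file clean
out <- file.path(tempdir(), "export.csv")
write.csv(df, out, row.names = FALSE)

# Read it back to check the round trip
back <- read.csv(out)
nrow(back)  # 2
```

`write.csv()` needs no extra packages, which makes it the easiest way to hand data to someone working in Excel or Google Sheets.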

Selenium is a classic QA tool, and it can help perform automatic checks on a website. This is an intro on how to use it. The first step is, as always, to install and load the RSelenium package: #install to run once install.packages("RSelenium")...
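A hedged sketch of the basic RSelenium workflow the excerpt introduces; it assumes a Selenium server is already running locally (for example via the `selenium/standalone-firefox` Docker image), and the URL is illustrative.

```r
# install.packages("RSelenium")  # run once
library(RSelenium)

# Connect to a running Selenium server (assumed on localhost:4444)
remDr <- remoteDriver(remoteServerAddr = "localhost",
                      port = 4444L,
                      browserName = "firefox")
remDr$open()

remDr$navigate("https://example.com")  # illustrative URL
title <- remDr$getTitle()[[1]]         # page <title>, a quick SEO sanity check

remDr$close()
```

Because Selenium drives a real browser, it sees the page after JavaScript runs, which is what makes it useful for checks that rvest alone cannot do.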

What the hell is keyword cannibalization? If you publish a lot of articles, at some point some of them will compete with one another for the same keywords in Google results pages. That is what SEO people call 'keyword cannibalization'. Does it...
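One way to spot candidates is to look for queries where more than one page ranks. The sketch below uses only base R on a toy, Search Console-style export (the `gsc` data frame and its values are invented for illustration).

```r
# Toy Search Console-style export: one row per query/page pair
gsc <- data.frame(
  query = c("r seo", "r seo", "rvest tutorial", "xml sitemap"),
  page  = c("/post-a", "/post-b", "/post-c", "/post-d")
)

# Count how many distinct pages show up for each query
pages_per_query <- aggregate(page ~ query, gsc,
                             function(p) length(unique(p)))

# Queries served by more than one page are cannibalization candidates
subset(pages_per_query, page > 1)$query  # "r seo"
```

On a real export you would also weigh clicks and positions, but distinct-pages-per-query is a cheap first filter.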

An XML sitemap is a fantastic tool, but you have to do it properly, otherwise it can definitely backfire. I can't count the number of times, while doing SEO audits, I have discovered completely abandoned XML sitemaps asking Googlebot to index empty or 404 pages. This...
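A hedged sketch of how such an audit can be done in R: list the URLs a sitemap declares and check their HTTP status, so 404s stand out. The sitemap URL is illustrative, and the snippet assumes the xml2 and httr packages.

```r
# install.packages(c("xml2", "httr"))  # run once
library(xml2)
library(httr)

# Fetch and parse the sitemap (illustrative URL)
sitemap <- read_xml("https://example.com/sitemap.xml")

# <loc> entries hold the URLs; local-name() sidesteps the sitemap namespace
urls <- xml_text(xml_find_all(sitemap, "//*[local-name()='loc']"))

# One HEAD request per URL; anything that is not a 200 deserves a look
statuses <- vapply(urls, function(u) status_code(HEAD(u)), integer(1))

urls[statuses == 404]  # the pages Googlebot is wrongly asked to index
```

For large sitemaps you would throttle the requests, but the loop above is enough to catch an abandoned sitemap quickly.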