Blocks a site from being crawled by the Common Crawl crawler (CCBot).
https://commoncrawl.org/
Robots.txt · AI Bot
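As a rough illustration of what such a disallow rule looks like and how it could be checked, the sketch below parses a sample robots.txt with Python's standard `urllib.robotparser` and tests whether the `CCBot` user agent is denied. The sample rules, the `blocks_ccbot` helper, and the `example.com` URL are assumptions made for illustration; they are not taken from this page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that disallows Common Crawl's crawler.
# "CCBot" is the user-agent token the Common Crawl crawler identifies with.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /
"""


def blocks_ccbot(robots_txt: str, url: str = "https://example.com/") -> bool:
    """Return True if the given robots.txt text denies CCBot access to url.

    blocks_ccbot is an illustrative helper, not part of any library.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("CCBot", url)


if __name__ == "__main__":
    print(blocks_ccbot(ROBOTS_TXT))
```

The same helper can be pointed at the text of any site's robots.txt file; a file that only contains a permissive wildcard rule would not count as blocking CCBot.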
Get access to data on 4,796,720 websites that are Common Crawl Bot Disallow customers. We know of 3,722,656 live websites using Common Crawl Bot Disallow, an additional 1,074,064 sites that used it historically, and 2,568,345 websites in the United States.
Get a list of 4,796,720 websites using Common Crawl Bot Disallow, which includes location information, hosting data, and contact details. The list covers 3,722,656 currently live websites, 1,074,064 sites that used this technology previously, and 2,568,345 websites in the United States currently using Common Crawl Bot Disallow. An additional 1,819,486 domains redirect to sites in this list.