Domains Project Bot

Information about our web crawler used for domain discovery.

Domains Project uses a web crawler and DNS checks to discover new domains.

The DNS checks client, called Freya, is in its early stages and is used by a select few. We're working on making it stable and good enough for the general public.

The HTTP crawler, called Idun, is being rewritten as well.

User Agent

A typical user agent for the Domains Project bot looks like this:

Mozilla/5.0 (compatible; Domains Project/1.0.8; +https://domainsproject.org)

Some older versions point to the GitHub repository instead:

Mozilla/5.0 (compatible; Domains Project/1.0.4; +https://github.com/tb0hdan/domains)
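
To spot the bot in server logs, the version can be pulled out of either user agent form shown above. This is a minimal sketch: the function name `domains_project_version` and the regular expression are illustrative assumptions, not part of the project.

```python
import re
from typing import Optional

# Matches the "Domains Project/<version>" token present in both user agent forms.
# The pattern is an assumption based on the two sample strings only.
_BOT_TOKEN = re.compile(r"Domains Project/(\d+(?:\.\d+)*)")


def domains_project_version(user_agent: str) -> Optional[str]:
    """Return the bot version if the user agent contains the bot token, else None."""
    match = _BOT_TOKEN.search(user_agent)
    return match.group(1) if match else None
```

A user agent header is easy to spoof, so a match only suggests that the request came from the bot.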

Technology

All data in this dataset is gathered using the Scrapy and Colly frameworks.

Starting with version 1.0.7, the crawler has partial robots.txt support and rate limiting. Please open an issue if you experience any problems, and don't forget to include your domain.
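
For readers running their own Scrapy-based crawler, robots.txt handling and rate limiting are usually switched on through project settings. The fragment below is a sketch using standard Scrapy setting names. The values, and the `ExampleBot` user agent, are illustrative assumptions and are not taken from the Domains Project crawler configuration.

```python
# settings.py fragment for a Scrapy project (all values are illustrative assumptions)

# Identify the crawler and point site operators to a contact page (hypothetical URL).
USER_AGENT = "Mozilla/5.0 (compatible; ExampleBot/0.1; +https://example.com/bot)"

# Fetch robots.txt for each site and skip disallowed URLs.
ROBOTSTXT_OBEY = True

# Basic rate limiting: wait between requests and limit parallelism per domain.
DOWNLOAD_DELAY = 1.0
CONCURRENT_REQUESTS_PER_DOMAIN = 1

# Let Scrapy adjust the delay based on observed server response times.
AUTOTHROTTLE_ENABLED = True
```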

Disabling Domains Project bot access to your website

Add this to your robots.txt:

User-agent: domainsproject.org
Disallow: /

or this:

User-agent: Domains Project
Disallow: /

The bot checks for both.
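
Before deploying, the rules can be sanity-checked locally with Python's standard `urllib.robotparser`. This is only a sketch: it shows how Python's parser reads the rules, which may differ from how the bot itself matches them, and `https://example.com/` is a placeholder URL. Python's parser compares only the part of the agent name before the first "/", so the bare tokens are passed rather than the full user agent string.

```python
from urllib.robotparser import RobotFileParser

# Both rule groups from this page in a single robots.txt.
ROBOTS_TXT = """\
User-agent: domainsproject.org
Disallow: /

User-agent: Domains Project
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder URL; any path on the site is covered by "Disallow: /".
url = "https://example.com/"

blocked_by_domain_token = not parser.can_fetch("domainsproject.org", url)
blocked_by_name_token = not parser.can_fetch("Domains Project", url)
other_bots_allowed = parser.can_fetch("SomeOtherBot", url)

print(blocked_by_domain_token, blocked_by_name_token, other_bots_allowed)
```

Crawlers without a matching group stay unaffected, since the file contains no `User-agent: *` rule.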