LangParity crawler
User-agent: LocaleDriftBot/1.0 (+https://localedrift.com/bot)
What this bot does
Our crawler fetches publicly available pages of multilingual websites to compare language versions and detect content drift. It reads HTML and does not download images, CSS, or JavaScript unless a page requires rendering. It never accesses authenticated or private areas.
How we crawl
- We respect robots.txt (cached up to 24 hours per domain).
- We send at most one request every two seconds per domain.
- We store extracted text and hashes; raw HTML is pruned after 30 days.
- Reports quote only short evidence snippets (15 words or fewer).
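For readers who want to see how a per-domain request limit of this kind can work, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the bot's actual code: the class name DomainRateLimiter, the reserve method, and the choice to treat a URL's hostname as its "domain" are all hypothetical.

```python
import time
from urllib.parse import urlsplit

# Taken from the policy above: one request every two seconds per domain.
MIN_INTERVAL_SECONDS = 2.0


class DomainRateLimiter:
    """Hypothetical helper that tracks when each host may be requested again."""

    def __init__(self, min_interval=MIN_INTERVAL_SECONDS, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock  # injectable so the logic can be checked without sleeping
        self.next_allowed = {}  # hostname -> earliest permitted request time

    def reserve(self, url):
        """Reserve the next request slot for the URL's host.

        Returns the number of seconds the caller should wait before fetching.
        """
        # Assumption: the hostname stands in for "domain" in this sketch.
        host = urlsplit(url).hostname or ""
        now = self.clock()
        slot = max(now, self.next_allowed.get(host, now))
        self.next_allowed[host] = slot + self.min_interval
        return slot - now
```

A caller would sleep for the returned number of seconds (for example with time.sleep) before fetching the page. Requests to different hosts do not delay each other, because each host keeps its own slot.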
Opt out or request removal
To exclude your domain from crawling, or to have stored snapshots for your domain deleted, contact us and we will honor the request. You can also disallow the LocaleDriftBot user-agent in your robots.txt.
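As an illustration of the robots.txt route, the sketch below embeds a two-line rule that blocks LocaleDriftBot from an entire site and checks it with Python's standard urllib.robotparser module. The example.com URL and the SomeOtherBot name are placeholders, and the check shows how Python's parser reads the rule; it is not a statement about how our crawler parses robots.txt internally.

```python
from urllib.robotparser import RobotFileParser

# The two lines a site owner would add to robots.txt to block this bot everywhere.
ROBOTS_TXT = """\
User-agent: LocaleDriftBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder URL; any path on the site is covered by "Disallow: /".
page = "https://example.com/en/pricing"

# The rule applies to LocaleDriftBot and leaves other crawlers unaffected.
blocked = not parser.can_fetch("LocaleDriftBot/1.0", page)
others_allowed = parser.can_fetch("SomeOtherBot", page)
```

To block only part of a site, replace "/" with the path prefix you want excluded.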
Draft crawler information, pending owner review. Contact details and the DSR flow will be finalized before launch.