How to check that a robot belongs to Yandex

Some robots disguise themselves as Yandex robots by sending a Yandex User-Agent string. You can verify that a robot is authentic with a reverse DNS lookup.

Just follow these steps:

  1. Determine the IP address of the user agent in question using your server logs.
  2. Use a reverse DNS lookup of the IP address to determine the host domain name.
  3. Check whether the host belongs to Yandex. The host names of all Yandex robots end in yandex.ru, yandex.net or yandex.com. If the host name has a different ending, the robot does not belong to Yandex.
  4. Make sure that the name is correct. Use a forward DNS lookup to get the IP address corresponding to the host name. It should match the IP address used in the reverse DNS lookup. If the IP addresses do not match, the host name is fake.
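
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not an official Yandex tool: the function names are hypothetical, only the standard library resolver calls are used, and requiring a dot before the domain ending (so that a host such as fakeyandex.ru is rejected) is an assumption added here rather than something the steps spell out.

```python
import ipaddress
import socket

# Domain endings listed in step 3 for Yandex robot host names.
YANDEX_SUFFIXES = ("yandex.ru", "yandex.net", "yandex.com")


def host_belongs_to_yandex(hostname):
    """Step 3: check the ending of the host name."""
    host = hostname.rstrip(".").lower()
    # Assumption: require a dot before the ending, so "fakeyandex.ru" fails.
    return any(host == s or host.endswith("." + s) for s in YANDEX_SUFFIXES)


def reverse_lookup(ip):
    """Step 2: reverse DNS lookup. Returns the host name, or None on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname):
    """Step 4: forward DNS lookup. Returns the set of addresses for the host."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except OSError:
        return set()
    return {ipaddress.ip_address(info[4][0]) for info in infos}


def is_yandex_robot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Run steps 2-4 for an IP address taken from the server logs (step 1)."""
    hostname = reverse(ip)
    if hostname is None or not host_belongs_to_yandex(hostname):
        return False
    # The forward lookup must return the address we started from.
    return ipaddress.ip_address(ip) in forward(hostname)
```

The reverse and forward parameters exist only so that the resolver can be replaced in tests or wrapped in a cache. A call such as is_yandex_robot("203.0.113.5") performs live DNS queries, so results depend on the resolver available to the server.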

Yandex robots in server logs

A number of Yandex robots download web documents for purposes other than indexing. To avoid unintentional blocking by site owners, they may ignore the restrictive directives in the robots.txt file that are intended for arbitrary robots (User-agent: *).

In addition, robots may ignore some robots.txt restrictions for certain sites if there is an agreement between Yandex and the owners of those sites.

Note. If such a robot downloads a document that the main Yandex robot can't access, this document will never be indexed and won't be found in search results.

To restrict such robots' access to your site, use directives addressed specifically to them, for example:

User-agent: YandexCalendar
Disallow: /

User-agent: YandexMobileBot
Disallow: /private/*.txt$
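
To see which paths the two Disallow rules above cover, here is a small Python sketch. It assumes the common robots.txt reading of the special characters: * stands for any sequence of characters, a trailing $ anchors the rule to the end of the path, and a rule without $ matches as a prefix. The function name rule_matches is hypothetical, and the sketch does not claim to reproduce how Yandex itself evaluates rules.

```python
import re


def rule_matches(pattern, path):
    """Return True if a Disallow pattern covers the given URL path."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal parts and turn each "*" into "any sequence of characters".
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    if anchored:
        # "$" at the end: the whole path must match the pattern.
        return re.fullmatch(regex, path) is not None
    # No "$": the pattern only has to match the beginning of the path.
    return re.match(regex, path) is not None
```

Under these assumptions, rule_matches("/", "/any/page.html") is true, so YandexCalendar is blocked from the whole site, while rule_matches("/private/*.txt$", "/private/notes.txt.bak") is false because the path does not end in .txt.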

The robots use the autonomous systems AS13238 and AS208722 and IP addresses that change frequently, so the list of addresses isn't disclosed.

When the robot accesses the page, your server logs may show the User-agent and version of the browser used for crawling the site. For example, Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.268.

Note. The browser version may change, so we recommend not specifying the version when searching for the User-agent in server logs.
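
One way to follow this recommendation when scanning logs is to match only the robot name and ignore everything after it. The Python sketch below assumes the robot name directly follows "(compatible; " and starts with "Ya", as in the example above. The pattern and the function name are illustrative assumptions, not an official format guarantee.

```python
import re

# Capture the robot name that follows "(compatible; ", without its version.
ROBOT_NAME = re.compile(r"\(compatible; (Ya\w+)")


def yandex_robot_name(user_agent):
    """Return the robot name from a User-agent string, or None if absent."""
    match = ROBOT_NAME.search(user_agent)
    return match.group(1) if match else None
```

Because a User-agent string is easy to forge, a match here only identifies a candidate. The reverse DNS check described at the start of this page is still needed to confirm that the request came from Yandex.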

Robot's full name, including the User agent | Purpose of the robot | Follows the general rules specified in robots.txt
Mozilla/5.0 (compatible; YandexAccessibilityBot/3.0; +http://yandex.com/bots) | YandexAccessibilityBot downloads pages to check their accessibility for users. It sends up to 3 requests to the site per second. The robot ignores the setting in the Yandex.Webmaster interface. | No
Mozilla/5.0 (compatible; YandexAdNet/1.0; +http://yandex.com/bots) | The Yandex advertising network robot. | Yes
Mozilla/5.0 (compatible; YandexBlogs/0.99; robot; +http://yandex.com/bots) | The blog search robot that indexes post comments. | Yes
Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots) | The main indexing robot. | Yes
Mozilla/5.0 (compatible; YandexBot/3.0; MirrorDetector; +http://yandex.com/bots) | Detecting site mirrors. | Yes
Mozilla/5.0 (compatible; YandexCalendar/1.0; +http://yandex.com/bots) | The Yandex.Calendar robot. Downloads calendar files by users' requests. These files are often located in directories prohibited from indexing. | No
Mozilla/5.0 (compatible; YandexDirect/3.0; +http://yandex.com/bots) | Downloads information about the content of Yandex Advertising network partner sites to identify their topic categories to match relevant advertising. | No
Mozilla/5.0 (compatible; YandexDirectDyn/1.0; +http://yandex.com/bots) | Generates dynamic banners. | No
Mozilla/5.0 (compatible; YandexFavicons/1.0; +http://yandex.com/bots) | Downloads the site's favicon file to display in search results. | No
Mozilla/5.0 (compatible; YaDirectFetcher/1.0; Dyatel; +http://yandex.com/bots) | Downloads target pages of ads to check their availability and topic. This is necessary for ad placement in the search results and on the partner sites. | No. The robot doesn't use the robots.txt file and ignores the directives set for it.
Mozilla/5.0 (compatible; YandexForDomain/1.0; +http://yandex.com/bots) | The Yandex.Mail for domain robot used to verify domain ownership rights. | Yes
Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots) | Indexes images to display them in Yandex.Images. | Yes
Mozilla/5.0 (compatible; YandexImageResizer/2.0; +http://yandex.com/bots) | Mobile devices robot. | Yes
Mozilla/5.0 (iPhone; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12B411 Safari/600.1.4 (compatible; YandexMobileBot/3.0; +http://yandex.com/bots) | Defines pages with layout suitable for mobile devices. | No
Mozilla/5.0 (compatible; YandexMarket/1.0; +http://yandex.com/bots) | The Yandex.Market robot. | Yes
Mozilla/5.0 (compatible; YandexMarket/2.0; +http://yandex.com/bots) | | No
Mozilla/5.0 (compatible; YandexMedia/3.0; +http://yandex.com/bots) | Indexes multimedia data. | Yes
Mozilla/5.0 (compatible; YandexMetrika/2.0; +http://yandex.com/bots yabs01) | Downloads site pages to check their availability, including landing pages of the Yandex.Direct ads. | No. The robot doesn't use the robots.txt file and ignores the directives set for it.
Mozilla/5.0 (compatible; YandexMetrika/2.0; +http://yandex.com/bots) | The Yandex.Metrica robot. | No
Mozilla/5.0 (compatible; YandexMetrika/3.0; +http://yandex.com/bots) | | No
Mozilla/5.0 (compatible; YandexMetrika/4.0; +http://yandex.com/bots) | The Yandex.Metrica robot. Downloads and caches the CSS styles to render site pages in Webvisor. | No. The robot doesn't use the robots.txt file and ignores the directives set for it.
Mozilla/5.0 (compatible; YandexMobileScreenShotBot/1.0; +http://yandex.com/bots) | Takes a screenshot of the mobile page. | No
Mozilla/5.0 (compatible; YandexNews/4.0; +http://yandex.com/bots) | The Yandex.News robot. | Yes
Mozilla/5.0 (compatible; YandexOntoDB/1.0; +http://yandex.com/bots) | The object response robot. | Yes
Mozilla/5.0 (compatible; YandexOntoDBAPI/1.0; +http://yandex.com/bots) | The object response robot that downloads dynamic data. | No
Mozilla/5.0 (compatible; YandexPagechecker/1.0; +http://yandex.com/bots) | Accesses the page for validating micro-markup via the Structured data validator. | Yes
Mozilla/5.0 (compatible; YandexPartner/3.0; +http://yandex.com/bots) | Downloads information about the content of Yandex partner sites. | No
Mozilla/5.0 (compatible; YandexRCA/1.0; +http://yandex.com/bots) | Collects data for generating previews. For example, wizard preview. | No
Mozilla/5.0 (compatible; YandexSearchShop/1.0; +http://yandex.com/bots) | Downloads product catalogs in YML files by users' requests. These files are often placed in directories prohibited for indexing. | No
Mozilla/5.0 (compatible; YandexSitelinks; Dyatel; +http://yandex.com/bots) | Checks the availability of pages used as sitelinks. | Yes
Mozilla/5.0 (compatible; YandexSpravBot/1.0; +http://yandex.com/bots) | The Yandex.Business robot. | Yes
Mozilla/5.0 (compatible; YandexTracker/1.0; +http://yandex.com/bots) | The Yandex.Tracker robot. | No
Mozilla/5.0 (compatible; YandexTurbo/1.0; +http://yandex.com/bots) | Crawls the RSS feed created to generate Turbo pages. It sends up to 3 requests to the site per second. The robot ignores the settings in the Yandex.Webmaster interface and the Crawl-delay directive. | Yes
Mozilla/5.0 (compatible; YandexVertis/3.0; +http://yandex.com/bots) | Search verticals robot. | Yes
Mozilla/5.0 (compatible; YandexVerticals/1.0; +http://yandex.com/bots) | The Yandex.Verticals robot: Auto.ru, Yandex.Realty, Yandex.Rabota, Yandex.Reviews. | Yes
Mozilla/5.0 (compatible; YandexVideo/3.0; +http://yandex.com/bots) | Indexes video clips to display in Yandex.Video. | Yes
Mozilla/5.0 (compatible; YandexVideoParser/1.0; +http://yandex.com/bots) | Indexes video clips to display in Yandex.Video. | No
Mozilla/5.0 (compatible; YandexWebmaster/2.0; +http://yandex.com/bots) | The Yandex.Webmaster robot. | Yes
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z* Safari/537.36 (compatible; YandexScreenshotBot/3.0; +http://yandex.com/bots) | Takes a screenshot of the page. | No

* W.X.Y.Z is a placeholder for the Chrome browser version in the user agent, for example 41.0.2272.96.

FAQ

How do I protect myself from fake robots that pretend to be Yandex robots?

To protect yourself against fake robots, use the reverse DNS lookup filter, as described above. This method is preferable to managing access by IP addresses, as it is more resistant to changes in the Yandex internal networks.

There is too much traffic going back and forth between my web server and your robot. Does Yandex support downloading of compressed pages?

Yes, it does. Each time the Yandex robot requests a page, it sends the header "Accept-Encoding: gzip,deflate". This means you can configure your web server to compress responses and reduce the traffic between the server and our robot. However, note that sending compressed content increases CPU usage on your server. If the server is already overloaded, this can cause problems. For gzip and deflate downloads, the robot follows RFC 2616, section 3.5.
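
As a rough illustration of the server side, the Python sketch below picks a content coding from the Accept-Encoding header and compresses the response body with the standard gzip and zlib modules. The function names are hypothetical and the header parsing is deliberately simplified (it only skips codings marked q=0). In practice this is usually handled by the web server's own compression settings rather than by application code.

```python
import gzip
import zlib


def accepted_codings(header):
    """Parse an Accept-Encoding value into a list of acceptable codings."""
    codings = []
    for item in header.split(","):
        token, _, params = item.strip().partition(";")
        token = token.strip().lower()
        if not token:
            continue
        quality = 1.0
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            codings.append(token)
    return codings


def encode_body(body, accept_encoding):
    """Return (payload, content_encoding); content_encoding is None if unchanged."""
    codings = accepted_codings(accept_encoding)
    if "gzip" in codings:
        return gzip.compress(body), "gzip"
    if "deflate" in codings:
        # The HTTP "deflate" coding is the zlib format, which zlib.compress emits.
        return zlib.compress(body), "deflate"
    return body, None
```

For a request carrying "Accept-Encoding: gzip,deflate", encode_body returns gzip-compressed bytes together with "gzip", which the server would then send in the Content-Encoding response header.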