This article is a machine-translated mirror of the original post.


[SEO] Yandex search engine robots.txt Clean-param configuration

Posted on 4/2/2022 8:12:51 PM
Today I received an email from Yandex.Webmaster with the following text:


Some pages with GET parameters in the URL on your site https://down.itsvse.com duplicate the contents of other pages (without GET parameters). For example, https://example.com/tovary?from=mainpage duplicates https://example.com/tovary. Because both pages are crawled, it might take longer for the information about important pages to be added to the search database. This may affect the site's search status.

Here are examples of pages and their duplicate pages with insignificant GET parameters:

ReturnUrl:
https://down.itsvse.com/Account/Index
https://down.itsvse.com/Account/Index?ReturnUrl=%2FUser%2FCollect
ReturnUrl:
https://down.itsvse.com/Account/Index
https://down.itsvse.com/Account/ ... %2FUser%2FResources
ReturnUrl:
https://down.itsvse.com/Account/Index
https://down.itsvse.com/Account/ ... oadLoading%2Fkzkalr
ReturnUrl:
https://down.itsvse.com/Account/Index
https://down.itsvse.com/Account/ ... Fitem%2Fawljnq.html
ReturnUrl:
https://down.itsvse.com/Account/Index
https://down.itsvse.com/Account/ ... loadLoading%2F11820
If these pages are duplicates, we recommend using the Clean-param directive in robots.txt, so that the robot ignores insignificant GET parameters and combines signals from identical pages on the main page.
The general idea: many URLs carry different parameter values but all point to the same page, and no matter how the parameter value changes, that page always displays the same content. For example, when a visitor clicks a page that requires login, we want to redirect back to the original page after a successful login, so the login URL carries a ReturnUrl parameter while the login page itself stays the same.
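To illustrate why Yandex treats these URLs as duplicates, here is a small sketch (not from the original post; `canonicalize` is a hypothetical helper) that strips an insignificant GET parameter and shows the two login URLs collapsing to one canonical URL:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical helper: drop "insignificant" GET parameters so that
# URLs differing only in those parameters map to one canonical URL.
def canonicalize(url: str, insignificant: set) -> str:
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in insignificant]
    return urlunsplit(parts._replace(query=urlencode(kept)))

# Both URLs collapse to the same page once ReturnUrl is ignored,
# which is exactly why Yandex flags them as duplicates.
a = canonicalize("https://down.itsvse.com/Account/Index?ReturnUrl=%2FUser%2FCollect",
                 {"ReturnUrl"})
b = canonicalize("https://down.itsvse.com/Account/Index", {"ReturnUrl"})
print(a == b)  # True
```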

Yandex asks us to add the Clean-param directive for such URLs; the directive is described in the Yandex.Webmaster documentation (the link in the original post is only visible after logging in).

The Clean-param syntax is as follows:
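The syntax block did not survive the mirroring; as given in the Yandex.Webmaster documentation, the general form is:

```text
Clean-param: p0[&p1&p2&..&pn] [path]
```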


In the first field, list the parameters that the bot should ignore, separated by the & character. In the second field, indicate the path prefix of the page to which the rule should be applied.

Examples include:
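A representative example, adapted from the Yandex documentation (the get_book.pl URLs are illustrative):

```text
# Pages such as the following differ only in the "ref" parameter:
#   www.example.com/some_dir/get_book.pl?ref=site_1&book_id=123
#   www.example.com/some_dir/get_book.pl?ref=site_2&book_id=123
# This rule tells Yandex to ignore "ref" on that path:
User-agent: Yandex
Clean-param: ref /some_dir/get_book.pl
```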

Following that format and the examples, we adjusted our robots.txt as follows:
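The final file is not shown in the mirrored post; assuming only the ReturnUrl parameter on /Account/Index needs to be ignored (matching the duplicates Yandex reported above), the added rule would look like:

```text
User-agent: Yandex
Clean-param: ReturnUrl /Account/Index
```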

(End)





Posted on 4/4/2022 11:36:58 AM
Learn to learn
OP | Posted on 3/15/2023 8:07:00 PM
Modified to:
