{"id":10486,"date":"2025-09-13T07:13:59","date_gmt":"2025-09-13T07:13:59","guid":{"rendered":"https:\/\/affinite.io\/cs\/?p=10486"},"modified":"2025-09-13T07:19:31","modified_gmt":"2025-09-13T07:19:31","slug":"jak-spravne-pouzivat-soubor-robots-txt","status":"publish","type":"post","link":"https:\/\/affinite.io\/cs\/jak-spravne-pouzivat-soubor-robots-txt\/","title":{"rendered":"How to Use the robots.txt File Correctly"},"content":{"rendered":"\n
The file This article aims to offer not only an overview of the basic rules but also context, advanced examples, and specific recommendations for different platforms, WordPress in particular.<\/p>\n\n\n\n The file The primary goals of using The file consists of blocks, each of which begins with the directive The file is read block by block, one user agent at a time<\/strong>. It is not possible to combine rules from multiple Wildcards are available for finer-grained access control.<\/p>\n\n\n\n It means \u201cany number of characters\u201d. Example:<\/p>\n\n\n\n This blocks all PDF files in the folder It means \u201cend of URL\u201d and is used to make a rule more precise:<\/p>\n\n\n\n This rule blocks every URL that ends in .pdf<\/strong>, but not, for example, Combining Google and other crawlers can read the directive This covers technical directories, pagination, search, and AJAX functions. Each is discussed in more detail below.<\/p>\n\n\n\n WordPress keeps all of its administrative functions in Internal search queries can generate hundreds or even thousands of URL variants with different parameters. These pages usually contain no unique content and can cause:<\/p>\n\n\n\n Search result pages should be blocked via For example:<\/p>\n\n\n\n These pages should either be blocked via Modern WordPress themes often use the REST API. 
Although these endpoints do not normally return HTML, some crawlers still request them. Unless you have a reason to expose them (e.g. public API documentation), blocking them is recommended.<\/p>\n\n\n\n Technical directories and files should not be open to crawlers: they may contain sensitive data, or slow a crawler down with irrelevant content. Because robots.txt is itself publicly readable and is not an access control mechanism, genuinely sensitive files also need server-side protection.<\/p>\n\n\n\n Some site administrators mistakenly block, for example:<\/p>\n\n\n\n This is a serious mistake<\/strong>. Googlebot needs access to all CSS and JS files required to render the page correctly. If you block them, Google may render the page incompletely and judge it to be poorly optimized, which can hurt its ranking.<\/p>\n\n\n\n The correct approach: do not block WordPress is a very popular CMS, but it comes with a number of SEO problems that Recommended baseline configuration:<\/p>\n\n\n\n When editing robots.txt<\/code> is a simple text file located in the publicly accessible root directory of the website (e.g.
https:\/\/www.example.com\/robots.txt<\/code>). Although its syntax is trivial, it plays a major role in technical SEO. A misconfiguration can have serious consequences, from crawlers ignoring important parts of the site to the site dropping out of search results entirely.<\/p>\n\n\n\n
\n\n\n\nWhat is 
robots.txt<\/code> and why use it<\/h2>\n\n\n\n
robots.txt<\/code> controls how search engine robots (so-called user agents) such as Googlebot, Bingbot, and Yandexbot access individual parts of your website. It is part of the Robots Exclusion Protocol (REP), which was designed for efficient management of website crawling.<\/p>\n\n\n\n
robots.txt<\/code> are:<\/p>\n\n\n\n
\n
\n\n\n\nStructure of the robots.txt file<\/h2>\n\n\n\n
User-agent<\/code>, followed by 
Disallow<\/code>, 
Allow<\/code>, and optionally 
Sitemap<\/code>.<\/p>\n\n\n\n
User-agent: *\nDisallow: \/wp-admin\/\nAllow: \/wp-admin\/admin-ajax.php\nSitemap: https:\/\/www.example.com\/sitemap_index.xml\n<\/code><\/pre>\n\n\n\n
\n
*<\/code> means all robots (user agents).<\/li>\n\n\n\n
User-agent<\/code> sections. Each block is evaluated separately.<\/p>\n\n\n\n
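The block-by-block evaluation described above can be sketched in Python. This is a minimal illustration, not a full parser: it assumes exact, case-insensitive matching on the robot name and a fallback to the * block, and the helper names parse_groups and rules_for are hypothetical.

```python
def parse_groups(text):
    '''Split robots.txt text into (agents, rules) groups.'''
    groups = []
    agents, rules = [], []
    for raw in text.splitlines():
        line = raw.split('#', 1)[0].strip()  # drop comments and whitespace
        if ':' not in line:
            continue
        key, value = (part.strip() for part in line.split(':', 1))
        key = key.lower()
        if key == 'user-agent':
            if rules:  # rules were already collected, so a new group begins
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif key in ('allow', 'disallow') and agents:
            rules.append((key, value))
    if agents:
        groups.append((agents, rules))
    return groups


def rules_for(groups, robot_name):
    '''Return the rules a robot follows: its own groups, else the * groups.'''
    name = robot_name.lower()
    if any(name in agents for agents, _ in groups):
        wanted = name
    else:
        wanted = '*'
    return [rule for agents, rules in groups if wanted in agents for rule in rules]


sample = '''
User-agent: *
Disallow: /wp-admin/

User-agent: Googlebot
Disallow: /search
'''

groups = parse_groups(sample)
print(rules_for(groups, 'Googlebot'))
print(rules_for(groups, 'Bingbot'))
```

In this sketch a robot that has its own block ignores the * block entirely, which is why rules meant for every crawler usually need to be repeated in each named block.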
\n\n\n\nWildcards and special characters<\/h2>\n\n\n\n
The asterisk 
*<\/code><\/h3>\n\n\n\n
Disallow: \/private\/*.pdf\n<\/code><\/pre>\n\n\n\n
\/private\/<\/code>.<\/p>\n\n\n\n
Symbol
$<\/code><\/h3>\n\n\n\n
Disallow: \/*.pdf$\n<\/code><\/pre>\n\n\n\n
\/document.pdf?download=1<\/code>, because that URL does not end in <code>.pdf<\/code>.<\/p>\n\n\n\n
*<\/code> and 
$<\/code> gives precise control over which URL structures a rule matches.<\/p>\n\n\n\n
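To see how the two wildcards behave, the following Python sketch translates a robots.txt path pattern into a regular expression. It is a simplified illustration under the assumption that * matches any run of characters and a trailing $ anchors the end of the URL; the helper names pattern_to_regex and matches are hypothetical, and the test URLs are made up.

```python
import re


def pattern_to_regex(pattern):
    '''Translate a robots.txt path pattern into a compiled regular expression.'''
    anchored = pattern.endswith('$')
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece, then join the pieces with 'any characters'.
    body = '.*'.join(re.escape(part) for part in pattern.split('*'))
    return re.compile(body + ('$' if anchored else ''))


def matches(pattern, url_path):
    '''Return True if the pattern matches from the start of the URL path.'''
    return pattern_to_regex(pattern).match(url_path) is not None


for pattern, url in [
    ('/private/*.pdf', '/private/reports/2024.pdf'),
    ('/*.pdf$', '/docs/file.pdf'),
    ('/*.pdf$', '/document.pdf?download=1'),
]:
    print(pattern, url, matches(pattern, url))
```

Note that the match runs against the path together with the query string, so a rule ending in $ only applies when nothing at all follows the extension.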
\n\n\n\nWhat a robots.txt file should contain<\/h2>\n\n\n\n
1. A sitemap declaration<\/h3>\n\n\n\n
Sitemap<\/code> and then process the file it references.<\/p>\n\n\n\n
Sitemap: https:\/\/www.example.com\/sitemap_index.xml\n<\/code><\/pre>\n\n\n\n
\n
\/sitemap-products.xml<\/code>,
\/sitemap-categories.xml<\/code> etc.).<\/li>\n<\/ul>\n\n\n\n
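As a quick check that the sitemap declarations are readable, Python's standard urllib.robotparser module can list them. The sketch below assumes Python 3.8 or newer (for site_maps()) and uses made-up example.com URLs; it parses lines from memory instead of fetching a live file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content with two sitemap declarations.
robots_lines = [
    'User-agent: *',
    'Disallow: /wp-admin/',
    'Sitemap: https://www.example.com/sitemap-products.xml',
    'Sitemap: https://www.example.com/sitemap-categories.xml',
]

parser = RobotFileParser()
parser.parse(robots_lines)

# site_maps() returns None when no Sitemap line is present.
sitemaps = parser.site_maps() or []
for url in sitemaps:
    print(url)
```

Sitemap lines are independent of the User-agent blocks, so they are picked up regardless of where in the file they appear.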
2. Blocking non-essential sections of the site<\/h3>\n\n\n\n
\n\n\n\nWhat should not be indexed, and why<\/h2>\n\n\n\n
Administrative interface<\/h3>\n\n\n\n
Disallow: \/wp-admin\/\nAllow: \/wp-admin\/admin-ajax.php\n<\/code><\/pre>\n\n\n\n
\/wp-admin\/<\/code>. There is no reason for these pages to be crawled or indexed. The exception is 
admin-ajax.php<\/code>, which some plugins and themes rely on (e.g. WooCommerce, AJAX filtering).<\/p>\n\n\n\n
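Why the Allow line wins over the broader Disallow can be sketched in Python. This is a simplified model that assumes the longest-match precedence defined in RFC 9309 (the most specific rule wins, and Allow wins a tie) and handles plain path prefixes only, without wildcards; is_allowed is a hypothetical helper.

```python
def is_allowed(rules, url_path):
    '''Apply the longest matching rule; on a tie, allow wins.'''
    best_len, allowed = -1, True  # no matching rule means the URL is allowed
    for directive, path in rules:
        if not path or not url_path.startswith(path):
            continue  # an empty path matches nothing
        is_allow = directive == 'allow'
        if len(path) > best_len or (len(path) == best_len and is_allow):
            best_len, allowed = len(path), is_allow
    return allowed


rules = [
    ('disallow', '/wp-admin/'),
    ('allow', '/wp-admin/admin-ajax.php'),
]

for path in ['/wp-admin/admin-ajax.php', '/wp-admin/options.php', '/blog/']:
    print(path, is_allowed(rules, path))
```

Not every parser follows this precedence. Python's own urllib.robotparser, for instance, applies the first matching rule in file order, so results from a test tool should be compared against the crawler you actually care about.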
Internal search<\/h3>\n\n\n\n
Disallow: \/?s=\nDisallow: \/search\n<\/code><\/pre>\n\n\n\n
\n
robots.txt<\/code> or marked 
noindex<\/code>.<\/p>\n\n\n\n
Author, archive, and tag pages<\/h3>\n\n\n\n
Disallow: \/author\/\nDisallow: \/tag\/\nDisallow: \/category\/uncategorized\/\n<\/code><\/pre>\n\n\n\n
\n
\/author\/<\/code> \u2013 these pages often contain no unique content, and on a site with a single author they are needless duplication.<\/li>\n\n\n\n
\/tag\/<\/code> \u2013 unless your tags are consistently structured, these pages are just noise.<\/li>\n\n\n\n
\/category\/uncategorized\/<\/code> \u2013 the default WordPress category, which should be renamed or blocked.<\/li>\n<\/ul>\n\n\n\n
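robots.txt<\/code>, or marked with 
noindex, follow<\/code>. Choose one or the other: a crawler that is blocked by robots.txt never fetches the page and therefore never sees the noindex tag.<\/p>\n\n\n\n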
Parameterized and paginated URLs<\/h3>\n\n\n\n
Disallow: \/*?*\nDisallow: *\/page\/\n<\/code><\/pre>\n\n\n\n
\n
?orderby=<\/code>,
?filter_price=<\/code>,
?color=<\/code>,
?size=<\/code> can generate hundreds of variants of a single page.<\/li>\n\n\n\n
\/page\/2\/<\/code>) can be handled in different ways: either keep the main page indexed and set canonical URLs, or block the paginated pages.<\/li>\n<\/ul>\n\n\n\n
REST API and AJAX endpoints<\/h3>\n\n\n\n
Disallow: \/wp-json\/\nDisallow: \/graphql\/\n<\/code><\/pre>\n\n\n\n
Technical directories<\/h3>\n\n\n\n
Disallow: \/cgi-bin\/\nDisallow: \/temp\/\nDisallow: \/backup\/\nDisallow: \/*.sql$\nDisallow: \/*.zip$\n<\/code><\/pre>\n\n\n\n
\n\n\n\nWhat should not be blocked<\/h2>\n\n\n\n
CSS and JavaScript<\/h3>\n\n\n\n
Disallow: \/wp-includes\/\nDisallow: \/wp-content\/\n<\/code><\/pre>\n\n\n\n
\/wp-content\/<\/code> and 
\/wp-includes\/<\/code><\/strong>, unless they contain items you explicitly do not want crawled. Most of the content in 
\/wp-content\/uploads\/<\/code> (images) should be indexable.<\/p>\n\n\n\n
\n\n\n\nWordPress specifics<\/h2>\n\n\n\n
robots.txt<\/code> can help solve. The WordPress-specific problem areas include:<\/p>\n\n\n\n
\n
\/wp-admin\/<\/code>,
\/wp-includes\/<\/code> \u2013 technical directories<\/li>\n\n\n\n
\/feed\/<\/code>,
\/trackback\/<\/code>,
\/comments\/feed\/<\/code> \u2013 often unwanted content<\/li>\n\n\n\n
?replytocom=<\/code> \u2013 an alternative way of displaying a comment via a URL parameter, which can lead to duplicates<\/li>\n\n\n\n
\/?s=<\/code> \u2013 internal search<\/li>\n<\/ul>\n\n\n\n
User-agent: *\nDisallow: \/wp-admin\/\nAllow: \/wp-admin\/admin-ajax.php\nDisallow: \/?s=\nDisallow: \/search\nDisallow: \/author\/\nDisallow: \/category\/uncategorized\/\nDisallow: \/tag\/\nDisallow: \/feed\/\nDisallow: \/trackback\/\nDisallow: \/comments\/feed\/\nSitemap: https:\/\/www.example.com\/sitemap_index.xml\n<\/code><\/pre>\n\n\n\n
\n\n\n\nChecking and testing<\/h2>\n\n\n\n
robots.txt<\/code>, remember to verify that:<\/p>\n\n\n\n