Closed (moved) Fix broker robots.txt to disallow crawling
Issue created by David Fifield

    From comment:11:ticket:28848 and https://github.com/ahf/snowflake-notes/blob/fb4304a7df08c6ddeeb103f38fc9103721a20cd9/Broker.markdown#the-robotstxt-handler:

    • Was the question about crawling ever answered? I can't think of a very good reason not to allow it. Even if censors were crawling the web for Snowflake brokers, they could get this information much more easily just from the source code.

    I believe the intention behind the robots.txt handler is to prevent search engines from indexing any pages on the site, because there's no permanent information there, not for any security or anti-enumeration reason.

ahf points out that the current robots.txt achieves the opposite: an empty Disallow directive matches nothing, so it permits crawling of every page. Instead of

    User-agent: *
    Disallow:

    it should be

    User-agent: *
    Disallow: /
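
For reference, here is a minimal sketch in Go of a robots.txt handler that serves the corrected policy. This is an illustration only, not the broker's actual code; the handler name and listen address are assumptions.

package main

import (
	"log"
	"net/http"
)

// robotsTxtHandler asks all well-behaved crawlers to stay away from
// every path on the site.
func robotsTxtHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; charset=utf-8")
	// "Disallow: /" matches every path. An empty "Disallow:" matches
	// nothing, which is why the previous file allowed everything.
	w.Write([]byte("User-agent: *\nDisallow: /\n"))
}

func main() {
	http.HandleFunc("/robots.txt", robotsTxtHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

Note that robots.txt is advisory: it only keeps compliant crawlers such as search engine bots from indexing the pages; it does not prevent anyone from fetching them, which is consistent with the point above that this is about indexing, not security or anti-enumeration.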

