[FFmpeg-devel] ffmpeg-web/robots.txt: attempt to keep spiders out of dynamically generated git content

Message ID 20210714185121.24646-1-michael@niedermayer.cc
State New
Series [FFmpeg-devel] ffmpeg-web/robots.txt: attempt to keep spiders out of dynamically generated git content

Checks

Context Check Description
andriy/configure warning Failed to apply patch

Commit Message

Michael Niedermayer July 14, 2021, 6:51 p.m. UTC
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
---
 htdocs/robots.txt | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

Comments

Michael Niedermayer July 14, 2021, 7:04 p.m. UTC | #1
On Wed, Jul 14, 2021 at 08:51:21PM +0200, Michael Niedermayer wrote:
> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> ---
>  htdocs/robots.txt | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/htdocs/robots.txt b/htdocs/robots.txt
> index eb05362..4bbc395 100644
> --- a/htdocs/robots.txt
> +++ b/htdocs/robots.txt
> @@ -1,2 +1,13 @@
>  User-agent: *
> -Disallow:
> +Crawl-delay: 10
> +Disallow: /gitweb/
> +Disallow: /*a=search*
> +Disallow: /*/search/*
> +Disallow: /*a=blobdiff*
> +Disallow: /*/blobdiff/*
> +Disallow: /*a=commitdiff*
> +Disallow: /*/commitdiff/*
> +Disallow: /*a=snapshot*
> +Disallow: /*/snapshot/*
> +Disallow: /*a=blame*
> +Disallow: /*/blame/*

This is based on https://serverfault.com/questions/506613/ideal-robots-txt-for-a-gitweb-installation
i will add this link to robots.txt


[...]
ffmpegandmahanstreamer@lolcow.email July 14, 2021, 8 p.m. UTC | #2
On 2021-07-14 14:51, Michael Niedermayer wrote:
> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> ---
>  htdocs/robots.txt | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/htdocs/robots.txt b/htdocs/robots.txt
> index eb05362..4bbc395 100644
> --- a/htdocs/robots.txt
> +++ b/htdocs/robots.txt
> @@ -1,2 +1,13 @@
>  User-agent: *
> -Disallow:
> +Crawl-delay: 10
> +Disallow: /gitweb/
> +Disallow: /*a=search*
> +Disallow: /*/search/*
> +Disallow: /*a=blobdiff*
> +Disallow: /*/blobdiff/*
> +Disallow: /*a=commitdiff*
> +Disallow: /*/commitdiff/*
> +Disallow: /*a=snapshot*
> +Disallow: /*/snapshot/*
> +Disallow: /*a=blame*
> +Disallow: /*/blame/*
LGTM based on my own personal experiences. But the robots.txt has to be 
applied for git.ffmpeg.org as well, and not just ffmpeg.org. Or else 
they will just do the same for git.ffmpeg.org since they are treated 
separately.
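
To illustrate the point about separate hosts: a crawler looks for robots.txt at the root of whichever host it is fetching from, so ffmpeg.org and git.ffmpeg.org each need their own copy. Below is a minimal sketch of that lookup. The function name `robots_url` and the example page paths are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt location that governs page_url.

    The file is looked up per scheme and host, so a rule set served
    from one host says nothing about a different subdomain.
    """
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


# Example page URLs are made up; only the host names come from the thread.
print(robots_url("https://ffmpeg.org/download.html"))
print(robots_url("https://git.ffmpeg.org/gitweb/ffmpeg.git"))
```

The two calls yield different robots.txt URLs, which is why a change to htdocs/robots.txt on the main site alone would not cover the git host.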
Michael Niedermayer July 14, 2021, 8:40 p.m. UTC | #3
On Wed, Jul 14, 2021 at 04:00:53PM -0400, ffmpegandmahanstreamer@lolcow.email wrote:
> On 2021-07-14 14:51, Michael Niedermayer wrote:
> > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > ---
> >  htdocs/robots.txt | 13 ++++++++++++-
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/htdocs/robots.txt b/htdocs/robots.txt
> > index eb05362..4bbc395 100644
> > --- a/htdocs/robots.txt
> > +++ b/htdocs/robots.txt
> > @@ -1,2 +1,13 @@
> >  User-agent: *
> > -Disallow:
> > +Crawl-delay: 10
> > +Disallow: /gitweb/
> > +Disallow: /*a=search*
> > +Disallow: /*/search/*
> > +Disallow: /*a=blobdiff*
> > +Disallow: /*/blobdiff/*
> > +Disallow: /*a=commitdiff*
> > +Disallow: /*/commitdiff/*
> > +Disallow: /*a=snapshot*
> > +Disallow: /*/snapshot/*
> > +Disallow: /*a=blame*
> > +Disallow: /*/blame/*
> LGTM based on my own personal experiences. But the robots.txt has to be

will apply


> applied for git.ffmpeg.org as well, and not just ffmpeg.org. Or else they
> will just do the same for git.ffmpeg.org since they are treated separately.

was expecting this a bit ...
i will look into that tomorrow or so unless someone else does before me

thx

[...]
Michael Niedermayer July 15, 2021, 2:11 p.m. UTC | #4
On Wed, Jul 14, 2021 at 10:40:53PM +0200, Michael Niedermayer wrote:
> On Wed, Jul 14, 2021 at 04:00:53PM -0400, ffmpegandmahanstreamer@lolcow.email wrote:
> > On 2021-07-14 14:51, Michael Niedermayer wrote:
> > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > > ---
> > >  htdocs/robots.txt | 13 ++++++++++++-
> > >  1 file changed, 12 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/htdocs/robots.txt b/htdocs/robots.txt
> > > index eb05362..4bbc395 100644
> > > --- a/htdocs/robots.txt
> > > +++ b/htdocs/robots.txt
> > > @@ -1,2 +1,13 @@
> > >  User-agent: *
> > > -Disallow:
> > > +Crawl-delay: 10
> > > +Disallow: /gitweb/
> > > +Disallow: /*a=search*
> > > +Disallow: /*/search/*
> > > +Disallow: /*a=blobdiff*
> > > +Disallow: /*/blobdiff/*
> > > +Disallow: /*a=commitdiff*
> > > +Disallow: /*/commitdiff/*
> > > +Disallow: /*a=snapshot*
> > > +Disallow: /*/snapshot/*
> > > +Disallow: /*a=blame*
> > > +Disallow: /*/blame/*
> > LGTM based on my own personal experiences. But the robots.txt has to be
> 
> will apply
> 
> 
> > applied for git.ffmpeg.org as well, and not just ffmpeg.org. Or else they
> > will just do the same for git.ffmpeg.org since they are treated separately.
> 
> was expecting this a bit ...
> i will look into that tomorrow or so unless someone else does before me

done

[...]

Patch

diff --git a/htdocs/robots.txt b/htdocs/robots.txt
index eb05362..4bbc395 100644
--- a/htdocs/robots.txt
+++ b/htdocs/robots.txt
@@ -1,2 +1,13 @@ 
 User-agent: *
-Disallow:
+Crawl-delay: 10
+Disallow: /gitweb/
+Disallow: /*a=search*
+Disallow: /*/search/*
+Disallow: /*a=blobdiff*
+Disallow: /*/blobdiff/*
+Disallow: /*a=commitdiff*
+Disallow: /*/commitdiff/*
+Disallow: /*a=snapshot*
+Disallow: /*/snapshot/*
+Disallow: /*a=blame*
+Disallow: /*/blame/*
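
The Disallow rules in the patch rely on `*` wildcards, which only crawlers with wildcard support will honor. The following is a rough sketch of how such a crawler might evaluate them, assuming the common semantics where `*` matches any run of characters, a trailing `$` anchors the end, and every rule is matched from the start of the path plus query string. The helper names `rule_to_regex` and `is_disallowed` and the sample URLs are hypothetical, not taken from any crawler's code.

```python
import re

# Disallow patterns copied from the patch.
DISALLOW = [
    "/gitweb/",
    "/*a=search*", "/*/search/*",
    "/*a=blobdiff*", "/*/blobdiff/*",
    "/*a=commitdiff*", "/*/commitdiff/*",
    "/*a=snapshot*", "/*/snapshot/*",
    "/*a=blame*", "/*/blame/*",
]


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate one robots.txt path rule into a regex (assumed semantics)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """True if any Disallow rule matches from the start of path."""
    return any(rule_to_regex(rule).match(path) for rule in DISALLOW)


# Sample paths are illustrative only.
for path in ("/gitweb/ffmpeg.git",
             "/?p=ffmpeg.git;a=blame;f=README.md",
             "/ffmpeg.git/commitdiff/abc123",
             "/download.html"):
    print(path, is_disallowed(path))
```

Under these assumptions the first three sample paths are blocked and `/download.html` stays crawlable. A crawler that treats rules as plain prefixes would only honor the `/gitweb/` line.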