[FFmpeg-devel] ipfsgateway: Remove default gateway

Message ID 20220810222708.186270-1-derek.buitenhuis@gmail.com
State Accepted
Commit 412922cc6fa790897ef6bb2be5d6f9a5f030754d
Series [FFmpeg-devel] ipfsgateway: Remove default gateway

Checks

Context Check Description
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Derek Buitenhuis Aug. 10, 2022, 10:27 p.m. UTC
A gateway can see everything, and we should not be shipping a hardcoded
default from a third party company; it's a security risk.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
---
 libavformat/ipfsgateway.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)
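
For readers skimming the thread, the control flow this patch leaves behind can be sketched outside FFmpeg as below. This is a minimal standalone sketch, not the code in libavformat/ipfsgateway.c: `sketch_populate_gateway`, `sketch_translate` and `SKETCH_ERROR` are hypothetical stand-ins for `populate_ipfs_gateway`, `translate_ipfs_to_http` and `AVERROR`, and the `<gateway>/ipfs/<cid>` URL layout is assumed purely for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for FFmpeg's AVERROR(): a negated errno value. */
#define SKETCH_ERROR(e) (-(e))

/*
 * Hypothetical helper: copy an explicitly configured gateway into buf.
 * Returns 1 if a usable gateway was found, 0 otherwise.
 */
static int sketch_populate_gateway(const char *configured, char *buf, size_t size)
{
    int n;

    if (!configured || !*configured)
        return 0;
    n = snprintf(buf, size, "%s", configured);
    return (n >= 0 && (size_t)n < size) ? 1 : 0;
}

/*
 * Build "<gateway>/ipfs/<cid>" into a newly allocated string stored in *out.
 * With no gateway configured this logs an error and fails; it does not
 * fall back to any public gateway.
 */
static int sketch_translate(const char *configured, const char *cid, char **out)
{
    char gateway[256];
    char *url = NULL;
    size_t len;
    int ret;

    *out = NULL;
    ret = sketch_populate_gateway(configured, gateway, sizeof(gateway));
    if (ret < 1) {
        fprintf(stderr, "IPFS does not appear to be running.\n");
        ret = SKETCH_ERROR(EINVAL);
        goto err;
    }

    len = strlen(gateway) + strlen("/ipfs/") + strlen(cid) + 1;
    url = malloc(len);
    if (!url) {
        ret = SKETCH_ERROR(ENOMEM);
        goto err;
    }
    snprintf(url, len, "%s/ipfs/%s", gateway, cid);
    *out = url;
    return 0;

err:
    free(url);
    return ret;
}
```

Calling `sketch_translate(NULL, cid, &url)` returns a negative error and leaves `url` as NULL, which mirrors the intent of the change: the caller has to configure a gateway explicitly before anything is fetched.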

Comments

Timo Rothenpieler Aug. 11, 2022, 12:08 p.m. UTC | #1
On 11/08/2022 00:27, Derek Buitenhuis wrote:
> A gateway can see everything, and we should not be shipping a hardcoded
> default from a third party company; it's a security risk.
> 
> Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
> ---
>   libavformat/ipfsgateway.c | 11 ++++-------
>   1 file changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/libavformat/ipfsgateway.c b/libavformat/ipfsgateway.c
> index 5a5178c563..907b61b017 100644
> --- a/libavformat/ipfsgateway.c
> +++ b/libavformat/ipfsgateway.c
> @@ -240,13 +240,8 @@ static int translate_ipfs_to_http(URLContext *h, const char *uri, int flags, AVD
>           ret = populate_ipfs_gateway(h);
>   
>           if (ret < 1) {
> -            // We fallback on dweb.link (managed by Protocol Labs).
> -            snprintf(c->gateway_buffer, sizeof(c->gateway_buffer), "https://dweb.link");
> -
> -            av_log(h, AV_LOG_WARNING,
> -                   "IPFS does not appear to be running. "
> -                   "You’re now using the public gateway at dweb.link.\n");
> -            av_log(h, AV_LOG_INFO,
> +            av_log(h, AV_LOG_ERROR,
> +                   "IPFS does not appear to be running.\n\n"
>                      "Installing IPFS locally is recommended to "
>                      "improve performance and reliability, "
>                      "and not share all your activity with a single IPFS gateway.\n"
> @@ -259,6 +254,8 @@ static int translate_ipfs_to_http(URLContext *h, const char *uri, int flags, AVD
>                      "3. Define an $IPFS_PATH environment variable "
>                      "and point it to the IPFS data path "
>                      "- this is typically ~/.ipfs\n");
> +            ret = AVERROR(EINVAL);
> +            goto err;
>           }
>       }
>   

ACK, hardcoding a public gateway is a huge NO from me.
Also in strong favour of backporting this to 5.1, since it should
never have made it in there in the first place.
Mark Gaiser Aug. 11, 2022, 4:26 p.m. UTC | #2
Hi all,

On the IPFS side we do have a solution for that with CAR files; you can
read more about that here [1].
Within the scope of this IPFS gateway protocol handler there isn't a
solution yet that uses CAR files. It is on our radar but still in the
discussion phase.

On the cURL side we had this same discussion, with 2 possible solutions [2].
For completeness, I'll list them here in full too:

1. An error message that gives no example but instead points the user to
documentation on how to get it working.
=== cURL example
$ curl ipfs://bafkreicysg23kiwv34eg2d7qweipxwosdo2py4ldv42nbauguluen5v6am
Error: local gateway not found and/or IPFS_GATEWAY is not set
Learn how to run one: https://docs.ipfs.tech/install/command-line/
===

2. An error message that makes the user aware of IPFS and provides a
solution to get it working immediately.
=== cURL example
$ curl ipfs://bafkreicysg23kiwv34eg2d7qweipxwosdo2py4ldv42nbauguluen5v6am
Error: local gateway not found and/or IPFS_GATEWAY is not set.
Try: IPFS_GATEWAY=https://ipfs.io
or run your own: https://docs.ipfs.tech/install/command-line/
===

Within the cURL implementation we're going for point 1.
The same idea can very well apply to ffmpeg too. Different texts that match
the different context, but in the same spirit.

Now ffmpeg is a bit different here. First and foremost because its IPFS
support predates curl's.
But also because the default fallback gateway was an explicitly requested
feature from the ffmpeg side to give an "it always works" feeling.
ffmpeg therefore has a fourth option: Do nothing and keep it as-is.

I'm very much looking forward to hearing which approach is right for the ffmpeg
folks. I will provide a patch implementing that approach.
I'm specifically looking forward to a reply from Michael Niedermayer in
this context.

Best regards,
Mark Gaiser

[1] https://docs.ipfs.tech/reference/http/gateway/#trusted-vs-trustless
[2] https://github.com/curl/curl/pull/8805#issuecomment-1199427911

Timo Rothenpieler Aug. 11, 2022, 4:49 p.m. UTC | #3
On 11.08.2022 18:26, Mark Gaiser wrote:
> But also because the default fallback gateway was an explicitly requested
> feature from the ffmpeg side to give an "it always works" feeling.
> ffmpeg therefore has a fourth option: Do nothing and keep it as-is.

I'm not sure who requested that, but I doubt "tunnel all user traffic
through some random third party's server" was the idea there.

Releases with that hardcoded server in them will be in distributions for
years, potentially over a decade.
Nobody can guarantee that it doesn't turn malicious in the future.
And nobody can fully guarantee what the owner does with all the data today.

This is simply unacceptable and it has to be fixed ASAP.
The approach taken by this patch seems the correct way to deal with it 
to me.
It prints a message informing the user on what to do, akin to what curl 
seems to do.
Mark Gaiser Aug. 11, 2022, 5:21 p.m. UTC | #4
On Thu, Aug 11, 2022 at 6:49 PM Timo Rothenpieler <timo@rothenpieler.org>
wrote:

> I'm not sure who requested that, but I doubt "tunnel all user traffic
> through some random third party's server" was the idea there.
>

Here's the conversation requesting this very feature:
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-March/293835.html

Timo Rothenpieler Aug. 11, 2022, 5:35 p.m. UTC | #5
On 11.08.2022 19:21, Mark Gaiser wrote:
> On Thu, Aug 11, 2022 at 6:49 PM Timo Rothenpieler <timo@rothenpieler.org>
> wrote:
> 
>> I'm not sure who requested that, but I doubt "tunnel all user traffic
>> through some random third party's server" was the idea there.
>>
> 
> Here's the conversation requesting this very feature:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2022-March/293835.html

I generally agree with the points brought up there.
But my conclusion very much is not "just put a somewhat random default
into the code".
Even a list of defaults is not okay.
We can't hardcode "magic servers".

If it's not possible to make the protocol work without them, it likely
shouldn't have been merged in the first place.
Why can't it access the files directly, but only via some magic HTTP
gateway?
Why does it need special code in ffmpeg in the first place, if you can
just access it via that HTTP proxy-gateway anyway?
Mark Gaiser Aug. 11, 2022, 5:56 p.m. UTC | #6
On Thu, Aug 11, 2022 at 7:35 PM Timo Rothenpieler <timo@rothenpieler.org>
wrote:

> I generally agree with the points brought up there.
> But my conclusion very much is not "just put a somewhat random default
> into the code".
> Even a list of defaults is not Okay.
> We can't hardcode "magic servers".
>

This is just your - valued! - opinion, but still just one. I insist on
waiting for Michael's decision on this, mainly because he
was quite persistent in asking for this feature to begin with.
The risks were clear and - somewhat - mentioned in the post I linked to
before, yet the decision was still to proceed.

Between then and now nothing has changed. No exploit was found. The only
thing that happened was a blog post from the cURL maintainer that merely
highlighted this issue. Still no abuse by any means.
That doesn't mean it will never be hacked. As I highlighted in that same
post, as that gateway gets used more and more it simply becomes an
increasingly attractive target for hackers.
And let's not forget that ffmpeg still warns you right now when that
fallback gateway is used.


> If it's not possible to make the protocol work without them, it likely
> shouldn't have been merged in the first place.
> Why can't it access the files directly, but only via some magic http
> gateway?
> Why does it need special code in ffmpeg in the first place, if you can
> just access it via that http proxy-gateway anyway?
>

No, we're not going to have that discussion again.
I outlined this in detail in every single patch round (we had 13 rounds), so
I'd recommend you re-read that:
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-April/295097.html
If that's still unclear, then you can read much more about it here too:
https://blog.ipfs.io/2022-08-01-ipfs-and-ffmpeg/


Derek Buitenhuis Aug. 11, 2022, 7:18 p.m. UTC | #7
On 8/11/2022 6:56 PM, Mark Gaiser wrote:
> This is just your - valued! -  opinion, but still just 1. I insist on
> waiting to hear from Michael to hear a decision on this, mainly because he
> was quite persistent in asking for this feature to begin with.
> The risks were clear and - somewhat - mentioned in the post I linked to
> before yet the decision was still to proceed.

This is a bigger issue than a single person's opinion - it's just plainly an
unacceptable risk. I totally agree with Timo here.

'Just works' isn't an acceptable reason to ship a hard-coded third-party
server as part of a widely used FOSS library and tool.

I admit I was largely absent from the original (many) rounds of review,
but I'm kind of surprised it was merged as-is, to be honest. Frankly, either
it should've been merged as "batteries not included", or it should have been
a whole IPFS implementation.

However, regardless of how this came to be (whether reasonable or not), this
should be removed.

-Derek
Michael Niedermayer Aug. 11, 2022, 8:18 p.m. UTC | #8
On Thu, Aug 11, 2022 at 07:56:04PM +0200, Mark Gaiser wrote:
> On Thu, Aug 11, 2022 at 7:35 PM Timo Rothenpieler <timo@rothenpieler.org>
> wrote:
> 
> > I generally agree with the points brought up there.
> > But my conclusion very much is not "just put a somewhat random default
> > into the code".
> > Even a list of defaults is not Okay.
> > We can't hardcode "magic servers".

I think we really should be looking at first principles here, and not
say what to do and what not to do in isolation.
Especially as some mails in this thread are a bit more emotional than
what I've seen normally.
That said, the concern is very real and valid.

So let's first see why things were done as they are.
* We added IPFS support to (obviously) support IPFS, which is an increasingly
  relevant thing.
* A full self-contained IPFS implementation was not available and may or
  may not be practical (this should be revisited and reconsidered with people
  who know the protocol well).
* The first goal is, if possible, to support it out of the box and on all platforms.
* Asking the user to set up an IPFS gateway, or even to point to one via an ENV variable,
  appeared not really possible on locked-down platforms like phones (maybe there
  is a way that was missed?).
* So that left the choice to either add a default or to drop IPFS support for
  some platforms.
* The patch was on the ML for a long time and no one objected to the simple
  default.
  
Now what is the problem with a single hardcoded default?
(please correct me if I am missing something)
1. It can log you
2. It can man-in-the-middle you
3. It can stop working

If we tell the user to find their own gateway, this does not actually protect
them from these; rather, it makes it "their problem", not ours.
Also, a user who sets an IPFS_GATEWAY pointer will not maintain its security.
A year later, or 5 years later, it will still be there, and that will be a big
security issue too, if a random choice is a big security issue.
So as much as a hardcoded default is bad, this is also bad.

A full IPFS implementation (if this is possible, which I am not sure about)
may be a solution. Running a local IPFS node which receives security updates
should work too. Again, I suspect the latter may be hard on locked-down devices
like phones. (Again, someone who knows this should comment here.)

So which options are there now?
* A full IPFS implementation (gold standard, but maybe impossible)
* A user-set-up IPFS node (probably not possible on some platforms)
* "It's the user's problem" (manually maintaining a link to a secure
  gateway sounds insecure to me with average users)
* Maintain a list of believed-to-be-secure gateways outside the source,
  maybe on https: git.ffmpeg.org (this was not discussed previously).
  If there is no local node/gateway and no IPFS_GATEWAY environment variable,
  the code could fetch a random entry from that gateway list and print info
  notifying the user of the use of the default.

It is quite possible I am missing something, but this last option seems
an improvement over a single default. It also seems more secure to me
for the average user than setting an IPFS_GATEWAY and then forgetting
that it was set for years.

We could also limit such an externally fetched (updatable) list to
platforms where all other options are impossible.
I don't know if that's a good idea or not, I am just throwing that out here.
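
To make the last option concrete, the selection order it implies could look roughly like the sketch below. It is only an illustration under stated assumptions: `sketch_pick_gateway` is a hypothetical function, not FFmpeg code; the list is passed in already fetched, since how it would be downloaded and verified is exactly the open question; and the thread has not agreed that this option is acceptable at all.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical resolution order (not FFmpeg code):
 *   1. an explicit IPFS_GATEWAY value, e.g. obtained by the caller via getenv()
 *   2. a detected local node/gateway
 *   3. a random entry from an externally maintained list, with a notice
 * Returns the chosen gateway string, or NULL if nothing is available.
 */
static const char *sketch_pick_gateway(const char *env_gateway,
                                       const char *local_gateway,
                                       const char *const *list,
                                       size_t list_len)
{
    if (env_gateway && *env_gateway)
        return env_gateway;

    if (local_gateway && *local_gateway)
        return local_gateway;

    if (list && list_len > 0) {
        /* rand() is enough for a sketch; it only spreads load, it is not a
         * security measure. */
        const char *choice = list[(size_t)rand() % list_len];

        fprintf(stderr,
                "No local IPFS node and no IPFS_GATEWAY set; "
                "using listed gateway %s\n", choice);
        return choice;
    }

    return NULL;
}
```

A caller would pass `getenv("IPFS_GATEWAY")` as the first argument and treat a NULL result as a hard error. Restricting step 3 to the platforms where nothing else is possible would amount to passing an empty list everywhere else.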


> >
> 
> This is just your - valued! -  opinion, but still just 1. I insist on
> waiting to hear from Michael to hear a decision on this, mainly because he
> was quite persistent in asking for this feature to begin with.

I am quite happy to leave this discussion to others. Last time it was
just that no one seemed to care, over a really long time, to comment;
now it seems everyone really cares.
I think it's very good that people are thinking about it now. It is a
rather annoying situation, as each option is a tradeoff which sucks in
some form.
Maybe the ultimate best would be a change at the IPFS protocol level
so that lean, light clients could securely use the protocol easily.

thx


[...]
Timo Rothenpieler Aug. 11, 2022, 10:03 p.m. UTC | #9
On 11.08.2022 22:18, Michael Niedermayer wrote:
> On Thu, Aug 11, 2022 at 07:56:04PM +0200, Mark Gaiser wrote:
>> On Thu, Aug 11, 2022 at 7:35 PM Timo Rothenpieler <timo@rothenpieler.org>
>> wrote:
>>
> So which options are there now?
> * A full IPFS implementation (gold standard, but maybe impossible)
> * A user-set-up IPFS node (probably not possible on some platforms)
> * "It's the user's problem" (manually maintaining a link to a secure
>    gateway sounds insecure to me with average users)
> * Maintain a list of believed-to-be-secure gateways outside the source,
>    maybe on https: git.ffmpeg.org (this was not discussed previously).
>    If there is no local node/gateway and no IPFS_GATEWAY environment variable,
>    the code could fetch a random entry from that gateway list and print info
>    notifying the user of the use of the default.
> 
> It is quite possible I am missing something, but this last option seems
> an improvement over a single default. It also seems more secure to me
> for the average user than setting an IPFS_GATEWAY and then forgetting
> that it was set for years.
> 
> We could also limit such an externally fetched (updatable) list to
> platforms where all other options are impossible.
> I don't know if that's a good idea or not, I am just throwing that out here.
> 

I'm aware that it's harsh, but with such limitations to run the protocol 
securely, it probably shouldn't have been merged in the first place.

Any kind of built-in hardcoded server is not acceptable imo.
Even with it pointing to our own infrastructure, we can't really 
guarantee its availability, especially should the protocol gain traction 
and heavy use.

I'm not sure what the correct way forward is.
But the proposed patch here still seems like the best option to me.

>>
>> This is just your - valued! -  opinion, but still just 1. I insist on
>> waiting to hear from Michael to hear a decision on this, mainly because he
>> was quite persistent in asking for this feature to begin with.
> 
> Iam quite happy to leave this discussion to others, last time it was
> just that noone seemed to care over a really long time to comment
> now it seems everyone really cares.
> I think its very good that people are thinking about it now, it is a
> rather annoying situation as each option is a tradeoff which sucks in
> some form
> Maybe the ultimate best would be a change at the IPFS protocol level
> so that lean light clients could securely use the protocol easily


The patch wasn't on my radar at all. I had assumed it was actually 
implementing IPFS in some fashion.
Not via an entirely external HTTP gateway. I'm a bit confused that it's 
a whole protocol of its own.
Derek Buitenhuis Aug. 11, 2022, 10:51 p.m. UTC | #10
On 8/11/2022 11:03 PM, Timo Rothenpieler wrote:
> Any kind of built in hardcoded server is not acceptable imo.
> Even with it pointing to our own infrastructure, we can't really 
> guarantee its availability, specially should the protocol gain traction 
> and heavy use.

I agree... we should never send a user's data through *any* service they
haven't explicitly asked for. Ever. Regardless of who runs it and who
is deemed "trustworthy".
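
To make the "explicitly asked for" rule concrete, here is a minimal sketch
in C. It is not the code in libavformat/ipfsgateway.c: the helper name
pick_gateway() and its return convention are made up for illustration. It
only accepts a gateway the user configured, and refuses otherwise instead
of falling back to anything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not FFmpeg API): copy the gateway the user
 * explicitly configured into out. Returns 0 on success and -1 when
 * nothing was configured or the value does not fit, so the caller can
 * fail loudly instead of silently using a third-party service.
 */
static int pick_gateway(const char *configured, char *out, size_t out_size)
{
    size_t len;

    assert(out && out_size > 0);
    out[0] = '\0';

    if (!configured || !configured[0])
        return -1;                    /* no explicit choice: refuse */

    len = strlen(configured);
    if (len >= out_size)
        return -1;                    /* would be truncated: refuse too */

    memcpy(out, configured, len + 1); /* copy including the terminator */
    return 0;
}
```

A caller could pass in getenv("IPFS_GATEWAY") (the environment variable
mentioned earlier in the thread) and print an error when -1 comes back.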

> The patch wasn't on my radar at all. I had assumed it was actually 
> implementing IPFS in some fashion.

Yes, I had assumed the same too, and thus wasn't following the sets
at all.

As it exists right now though, I don't really see why lavf needs what
amounts to a URL builder for a service as a "protocol" - this is totally
the wrong layer to do that at...

- Derek
Mark Gaiser Aug. 12, 2022, 1:43 p.m. UTC | #11
On Fri, Aug 12, 2022 at 12:51 AM Derek Buitenhuis <
derek.buitenhuis@gmail.com> wrote:

> On 8/11/2022 11:03 PM, Timo Rothenpieler wrote:
> > Any kind of built in hardcoded server is not acceptable imo.
> > Even with it pointing to our own infrastructure, we can't really
> > guarantee its availability, specially should the protocol gain traction
> > and heavy use.
>
> I agree... we should never send a users data through *any* service they
> haven't explicitly asked for. Ever. Regardless of who runs it and who
> is deemed "trustworthy".
>
> > The patch wasn't on my radar at all. I had assumed it was actually
> > implementing IPFS in some fashion.
>
> Yes, I had assumed the same too, and thus wasn't following the sets
> at all.
>
> As it exists right now though, I don't really see why lavf needs what
> amounts to a URL builder for a service as a "protocol" - this totally
> the wrong layer to do that at...
>
> - Derek
>

(not a specific reply to you, Derek, just a reply to the latest message in
this discussion.)

First I'd like to highlight the idea of security here, as that seems to
play a big part in this thread.
The points Michael makes with regards to letting a user find a gateway are
in my opinion valid. If a user googles for a gateway, you hope they would
find/use dweb.link, as it is a safe option. Now I'm not saying other
gateways are "less safe", not at all! But the dweb.link gateway is run by
Protocol Labs themselves and is used a lot. It has a team behind it
monitoring it, with quite a high responsibility for keeping the gateway
functioning properly. Not many gateways have the resources behind them that
dweb.link has. Also, with regards to video playback, that is guaranteed to
work on dweb.link whereas gateways with fewer resources could potentially
throttle such traffic. I won't call out names but I do know of at least 1
very popular one that does this.

I don't know how far I can answer organizational details about dweb.link,
but I do know that it's of high importance for Protocol Labs to make sure
the gateway functions normally.
If there are questions about security audits or guarantees about "user
data" [1] policies then I can forward those to the persons who know or have
the ability to get answers.

That all being said, in the case where the user has no local gateway (still
a likely case) it would be far more secure to have a fallback one
(dweb.link) than to let the user google one.
Just pushing a sensationalized view of "it has to go because of security"
might actually have the opposite effect, so be very careful and constructive
when you say that.

Now to play devil's advocate for a moment. Say dweb.link remains as
default, how likely is it to be hacked and to truly cause harm for ffmpeg
users? A lot of steps have to happen before that harm can really be done;
here's a list of things I can come up with (but it's likely much longer).
- the gateway would have to be hacked and that would not be detected
(unlikely)
- the malicious party would have to change gateway code and still
remain undetected (again unlikely)
- ffmpeg users would actually get malicious data (unlikely)
- that malicious data would have to be able to cause actual harm (yet again
unlikely, ffmpeg is more likely to crash with malicious data [2])

So in realistic terms, how likely is it that dweb.link causes an actual
security risk for the user? The more I think about it, the less likely I
think that's going to be.
On the other hand, removing the default gateway very definitely has a much
higher potential for a security risk.

I hope we can continue this discussion in a more mature way with actual
constructive arguments.
Passionate and emotional calls to remove this "just because" won't move the
discussion forward by any means.
I also ask you to refrain from arguments like "it should never have been
merged and should therefore be removed"; that's a very demotivating response
to even consider responding to.

So keep it nice and constructive and we'll figure out a way that works for
everyone :)

[1] there are no users.. At most you have webserver logging which, with
other data, could perhaps be used for tracking purposes.
[2] for malicious data to be successfully abused there would also have to
be at least 1 severe bug in ffmpeg itself. Probably in the http stack or,
if the data gets past that, in the codec itself.
Vittorio Giovara Aug. 12, 2022, 2:22 p.m. UTC | #12
On Fri, Aug 12, 2022 at 12:51 AM Derek Buitenhuis <
derek.buitenhuis@gmail.com> wrote:

> As it exists right now though, I don't really see why lavf needs what
> amounts to a URL builder for a service as a "protocol" - this totally
> the wrong layer to do that at...
>

Agreed, this protocol should be dropped IMO.
Kieran Kunhya Aug. 12, 2022, 2:30 p.m. UTC | #13
On Fri, 12 Aug 2022 at 15:22, Vittorio Giovara <vittorio.giovara@gmail.com>
wrote:

> On Fri, Aug 12, 2022 at 12:51 AM Derek Buitenhuis <
> derek.buitenhuis@gmail.com> wrote:
>
> > As it exists right now though, I don't really see why lavf needs what
> > amounts to a URL builder for a service as a "protocol" - this totally
> > the wrong layer to do that at...
> >
>
> Agreed, this protocol should be dropped IMO.
>

Agreed. In a similar vein, we don't suddenly decide to pick DNS servers for
the user if they don't have one configured.

Kieran
Mark Gaiser Aug. 12, 2022, 2:34 p.m. UTC | #14
On Fri, Aug 12, 2022 at 4:30 PM Kieran Kunhya <kierank@obe.tv> wrote:

> On Fri, 12 Aug 2022 at 15:22, Vittorio Giovara <vittorio.giovara@gmail.com
> >
> wrote:
>
> > On Fri, Aug 12, 2022 at 12:51 AM Derek Buitenhuis <
> > derek.buitenhuis@gmail.com> wrote:
> >
> > > As it exists right now though, I don't really see why lavf needs what
> > > amounts to a URL builder for a service as a "protocol" - this totally
> > > the wrong layer to do that at...
> > >
> >
> > Agreed, this protocol should be dropped IMO.
> >
>
> Agreed. In a similar vein, we don't suddenly decide to pick DNS servers for
> the user if they don't have one configured.


Great opinion you 2, 0 constructiveness.
That doesn't help at all.

Please try again in a constructive manner that convinces me of your
undoubtedly great arguments!
Kieran Kunhya Aug. 12, 2022, 2:45 p.m. UTC | #15
>
> Great opinion you 2, 0 constructiveness.
> That doesn't help at all.
>
> Please try again in a constructive manner that convinces me of your
> undoubtedly great arguments!
>

Please let me know what IPFS has to do with Interplanetary Communications.

(hint: none).

Kieran
Derek Buitenhuis Aug. 12, 2022, 2:48 p.m. UTC | #16
On 8/12/2022 3:34 PM, Mark Gaiser wrote:
> Great opinion you 2, 0 constructiveness.
> That doesn't help at all.

Your tone has been pretty rude in this whole thread.

> Please try again in a constructive manner that convinces me of your
> undoubtedly great arguments!

You seem to be defining anything that involves removing it as "not
constructive", which is rather convenient.

I would also ask you stop sending unwanted private mails to people on this
thread about the subject.

- Derek
Kieran Kunhya Aug. 12, 2022, 2:50 p.m. UTC | #17
On Fri, 12 Aug 2022 at 15:48, Derek Buitenhuis <derek.buitenhuis@gmail.com>
wrote:

> On 8/12/2022 3:34 PM, Mark Gaiser wrote:
> > Great opinion you 2, 0 constructiveness.
> > That doesn't help at all.
>
> Your tone has been pretty rude in this whole thread.
>
> > Please try again in a constructive manner that convinces me of your
> > undoubtedly great arguments!
>
> You seem to be defining anything that involves removing it as "not
> consructive", which is rather convenient.
>
> I would also ask you stop sending unwanted private mails to people on this
> thread about the subject.
>
> - Derek
>

I would like to refer this issue to the technical committee for a final
decision.
TC is copied into this email.

Kieran
Nicolas George Aug. 12, 2022, 2:55 p.m. UTC | #18
Derek Buitenhuis (12022-08-11):
> I agree... we should never send a users data through *any* service they
> haven't explicitly asked for. Ever. Regardless of who runs it and who
> is deemed "trustworthy".

Absolutely. And Kieran's simile with DNS is very good. It is not just a
question of whether the gateway might turn evil, there are also concerns
of privacy.

> > The patch wasn't on my radar at all. I had assumed it was actually 
> > implementing IPFS in some fashion.
> Yes, I had assumed the same too, and thus wasn't following the sets
> at all.
> 
> As it exists right now though, I don't really see why lavf needs what
> amounts to a URL builder for a service as a "protocol" - this totally
> the wrong layer to do that at...

I also assumed it was a native implementation. If it is just a matter of
translating “ipfs://whatever” into “https://gateway/wHaTeVeR”, a perl
script in tools/ would be a reasonable expedient.
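
For reference, the translation in question is small enough to sketch in a
few lines. The following is C rather than Perl, to match the rest of the
code base, and is only an illustration: the function name
ipfs_uri_to_gateway_url() is made up, and it assumes the path-style
gateway layout "<gateway>/ipfs/<rest>" and "<gateway>/ipns/<rest>". It is
not the code in libavformat/ipfsgateway.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not FFmpeg API): rewrite "ipfs://<rest>" or
 * "ipns://<rest>" into "<gateway>/ipfs/<rest>" or "<gateway>/ipns/<rest>".
 * Returns 0 on success, -1 on an unknown scheme, an empty path or an
 * output buffer that is too small.
 */
static int ipfs_uri_to_gateway_url(const char *gateway, const char *uri,
                                   char *out, size_t out_size)
{
    const char *ns, *rest;
    size_t glen;
    int n;

    assert(gateway && uri && out && out_size > 0);

    if (!strncmp(uri, "ipfs://", 7))
        ns = "ipfs";
    else if (!strncmp(uri, "ipns://", 7))
        ns = "ipns";
    else
        return -1;

    rest = uri + 7;
    if (!rest[0])
        return -1;

    /* Ignore one trailing slash on the gateway to avoid "//ipfs/...". */
    glen = strlen(gateway);
    if (glen > 0 && gateway[glen - 1] == '/')
        glen--;

    n = snprintf(out, out_size, "%.*s/%s/%s", (int)glen, gateway, ns, rest);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With a gateway of "http://localhost:8080" and the placeholder URI
"ipfs://QmExample/video.mp4", this produces
"http://localhost:8080/ipfs/QmExample/video.mp4".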

Native implementations are a huge part of what made FFmpeg great: you
build from source, without a shit-ton of extra libraries that might not
be packaged or recent enough on long-term distributions, and you get
support for most codecs, formats and protocols in the world.

Unfortunately, work on implementing native versions of codecs and
protocols seems to have gotten out of fashion.

For protocols, I can blame the lack of framework: our pedestrian
read/write blocking API is not adapted to modern protocols that require
asynchronous operation.

I had the project of building a new framework for networking and
protocols; in fact I have a large part of how I want to make it work in
the back of my mind already. But considering the shortsightedness of the
leadership of the project these days about framework that is not an
obvious incremental enhancement directly related to existing code, I do
not expect to invest more time into it any time soon. The same goes for
most API enhancement I had promised over the years. Too bad.

Regards,
Michael Niedermayer Aug. 12, 2022, 3:05 p.m. UTC | #19
On Fri, Aug 12, 2022 at 12:03:17AM +0200, Timo Rothenpieler wrote:
> On 11.08.2022 22:18, Michael Niedermayer wrote:
> > On Thu, Aug 11, 2022 at 07:56:04PM +0200, Mark Gaiser wrote:
[...]
> > > 
> > > This is just your - valued! -  opinion, but still just 1. I insist on
> > > waiting to hear from Michael to hear a decision on this, mainly because he
> > > was quite persistent in asking for this feature to begin with.
> > 
> > Iam quite happy to leave this discussion to others, last time it was
> > just that noone seemed to care over a really long time to comment
> > now it seems everyone really cares.
> > I think its very good that people are thinking about it now, it is a
> > rather annoying situation as each option is a tradeoff which sucks in
> > some form
> > Maybe the ultimate best would be a change at the IPFS protocol level
> > so that lean light clients could securely use the protocol easily
> 
> 
> The patch wasn't on my radar at all. I had assumed it was actually
> implementing IPFS in some fashion.
> Not via an entire external http gateway. I'm a bit confused that it's its
> whole own protocol.

Maybe thinking about http is the wrong mindset. Maybe DNS is a better analogy.

To grab data from DNS you can implement a full DNS server which recursively
resolves the request starting from the root name servers (which it needs to have
hardcoded in some form). But this is something no application does, because of
latency and the wide support for easier name resolution on platforms.

So what one does is connect to a local or ISP DNS server which caches results
and resolves from the root servers if needed (either directly or through platform APIs).
The problem with IPFS is that your ISP doesn't have an IPFS server, nor do you
normally have one locally.

Below is how I understand IPFS; please someone correct me if I'm wrong. I'm
listing this here as I think it makes sense for the discussion to better understand
what IPFS is before arguing about it.

IPFS seems closer to DNS in how it works than to how http works.
If a client wants to grab something from IPFS it can't just do it; it needs to
connect to peers and find out which one has the data.
If you start from zero (and some hardcoded peer list) that will take more time
than if there is a running node with active connections.
So for better performance we want to use an IPFS node which persists before and
after the process with libavformat. This is the same as with a DNS server.

I suspect IPFS provides little security against logging.
If you run an IPFS node, others can likely find out what your node cached, because
that's the whole point of caching data: so that others can get it.
If you are concerned the http-ipfs gateway logs you, running your own node might
be worse. IIUC that's like a public caching DNS server.

The other threat, of the http-ipfs gateway modifying data, can possibly be prevented
with some effort.
IPFS URLs IIUC contain the hash of the root of a Merkle tree of the data, so one
can take a subset of the data with some more hashes and verify that the data
matches what the URL refers to. This also makes data immutable. There is
mutable data in IPFS, called IPNS.
IPNS uses a hash of a public key, allowing only the private key owner to modify
the data.
Again, it can in principle be checked that this is all unmodified by any intermediary;
that makes IPFS different from DNS and HTTP(S), which cannot be checked from the
URL alone.
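
To illustrate the "subset of the data with some more hashes" idea, here is
a toy sketch in C. It rests on loud assumptions: it uses a plain binary
Merkle tree and a non-cryptographic stand-in hash (64-bit FNV-1a), whereas
real IPFS, as far as I know, uses multihash identifiers (typically SHA-256)
over a Merkle DAG. The names toy_hash(), toy_hash_pair() and
verify_chunk() are made up. Only the principle carries over: given one
chunk, the hashes of its siblings on the way up, and the root hash taken
from the URL, a client can check the chunk without trusting whoever
delivered it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in hash (64-bit FNV-1a). NOT cryptographic, NOT what IPFS uses. */
static uint64_t toy_hash(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;            /* FNV prime */
    }
    return h;
}

/* Hash of an inner node: hash of (left child hash, right child hash). */
static uint64_t toy_hash_pair(uint64_t left, uint64_t right)
{
    uint64_t pair[2] = { left, right };
    return toy_hash(pair, sizeof(pair));
}

/*
 * Recompute the root from one chunk plus the sibling hashes on the path
 * to the root (siblings[0] is the sibling at leaf level). index is the
 * chunk's position among the leaves; its bits say whether the running
 * hash is the left (0) or right (1) child at each level.
 * Returns 1 if the recomputed root matches, 0 otherwise.
 */
static int verify_chunk(const void *chunk, size_t len, size_t index,
                        const uint64_t *siblings, size_t depth,
                        uint64_t root)
{
    uint64_t h;

    assert(chunk || len == 0);
    assert(siblings || depth == 0);

    h = toy_hash(chunk, len);
    for (size_t i = 0; i < depth; i++) {
        h = (index & 1) ? toy_hash_pair(siblings[i], h)
                        : toy_hash_pair(h, siblings[i]);
        index >>= 1;
    }
    return h == root;
}
```

In this sketch a gateway that altered the chunk would have to find a hash
collision to pass the check, which is why a real implementation needs a
cryptographic hash rather than the stand-in used here.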

Also I hope this whole thread can stay technical, because this is a technical
problem on a technical mailing list and it should have a technical solution.

thx

[...]
Nicolas George Aug. 12, 2022, 5:01 p.m. UTC | #20
Michael Niedermayer (12022-08-12):
> Maybe thinking about http is the wrong mindset. Maybe DNS is a better analog
> 
> to grab data from DNS you can implement a full DNS server which recursivly
> resolves the request starting from the root name servers (which it needs to have
> hardcoded in some form) But this is something no application does because of
> latency and wide support of easier name resolution on platforms
> 
> So what one does is to connect to local of ISP DNS server which caches results
> and does resolve from the root servers if needed (either directly or though platform APIs)
> Problem with IPFS is your ISP doesnt have a IPFS server nor do you have one
> locally normally
> 
> Below is how i understand IPFS, please someone correct me if iam wrong, iam 
> listing this here as i think it makes sense for the dicussion to better understand
> what IPFS is before arguing about it
> 
> IPFS seems closer to DNS in how it works than to how http works
> if you want to grab something from IPFS it cant just do it, it needs to connect
> to peers and find out which has the data. 
> If you start from zero (and some hardcoded peer list) that will take more time
> than if there is a running node with active connections
> So for better performance we want to use a IPFS node which persists before and
> after the process with libavformat. This is the same as with a DNS server.
> 
> I suspect IPFS provides little security against loging,
> If you run a IPFS node, others can likely find out what your node cached because
> thats the whole point, of caching data, so that others can get it.
> If you are concerned the http-ipfs gateway logs you, running your own node might
> be worse. IIUC thats like a public caching DNS server
> 
> the other threat of the http-ipfs gateway modifying data can possible be prevented
> with some effort.
> IPFS urls IIUC contain the hash from a root of a merkle tree of the data so one 
> can take a subset of the data with some more hashes and verify that the data
> matcheswhat the URL refers to. This also makes data immutable. There is
> mutable data in IPFS called IPNS.
> IPNS uses a hash of a public key allowing the private key owner only to modify
> the data.
> again it can in principle be checked that this is all unmodifed by any intermediate
> that makes IPFS different fron DNS and HTTP(S) which cannot be checked from the
> URL alone

All this looks a lot like “magnet:” URLs for torrents, and we do not
consider FFmpeg should support torrents. But the practice can make the
difference: if leeching without seeding at all is supported, then it can
make sense.

The goal that everything works out of the box is limited by the need for
safety for the user, and it is a concern for both a peer-to-peer
protocol and for an external gateway. And it is not limited to technical
security risks, it involves also legal liability: the information that
somebody accessed a resource that is considered illegal in their country
is more likely to leak. Also to consider: if FFmpeg hardcodes a default
gateway, secondary distributors might change that default into a less
trustworthy one.

The simile with DNS has a significant limitation: DNS has been here
since forever, and we can assume it is properly configured everywhere.
In fact, FFmpeg does not use DNS, it uses the libc's resolver, which
could be configured not to use DNS at all. This protocol is a newfangled
thing, so the expectation that it just works is lower.

It brings me to another point: how common is this thing? FFmpeg aims to
support all protocols used in the world, but it is not meant to be a
showcase for somebody's vanity project or some company's new commercial
product. For this issue, I think the criterion the IETF uses to consider
something a standard is a good touchstone: are there several independent
and compatible implementations already out there?

Regards,
Michael Niedermayer Aug. 12, 2022, 5:18 p.m. UTC | #21
On Fri, Aug 12, 2022 at 07:01:49PM +0200, Nicolas George wrote:
> Michael Niedermayer (12022-08-12):
> > Maybe thinking about http is the wrong mindset. Maybe DNS is a better analog
> > 
> > to grab data from DNS you can implement a full DNS server which recursivly
> > resolves the request starting from the root name servers (which it needs to have
> > hardcoded in some form) But this is something no application does because of
> > latency and wide support of easier name resolution on platforms
> > 
> > So what one does is to connect to local of ISP DNS server which caches results
> > and does resolve from the root servers if needed (either directly or though platform APIs)
> > Problem with IPFS is your ISP doesnt have a IPFS server nor do you have one
> > locally normally
> > 
> > Below is how i understand IPFS, please someone correct me if iam wrong, iam 
> > listing this here as i think it makes sense for the dicussion to better understand
> > what IPFS is before arguing about it
> > 
> > IPFS seems closer to DNS in how it works than to how http works
> > if you want to grab something from IPFS it cant just do it, it needs to connect
> > to peers and find out which has the data. 
> > If you start from zero (and some hardcoded peer list) that will take more time
> > than if there is a running node with active connections
> > So for better performance we want to use a IPFS node which persists before and
> > after the process with libavformat. This is the same as with a DNS server.
> > 
> > I suspect IPFS provides little security against loging,
> > If you run a IPFS node, others can likely find out what your node cached because
> > thats the whole point, of caching data, so that others can get it.
> > If you are concerned the http-ipfs gateway logs you, running your own node might
> > be worse. IIUC thats like a public caching DNS server
> > 
> > the other threat of the http-ipfs gateway modifying data can possible be prevented
> > with some effort.
> > IPFS urls IIUC contain the hash from a root of a merkle tree of the data so one 
> > can take a subset of the data with some more hashes and verify that the data
> > matcheswhat the URL refers to. This also makes data immutable. There is
> > mutable data in IPFS called IPNS.
> > IPNS uses a hash of a public key allowing the private key owner only to modify
> > the data.
> > again it can in principle be checked that this is all unmodifed by any intermediate
> > that makes IPFS different fron DNS and HTTP(S) which cannot be checked from the
> > URL alone
> 
> All this looks a lot like “magnet:” URLs for torrents, and we do not
> consider FFmpeg should support torrents. But the practice can make the
> difference: if leeching without seeding at all is supported, then it can
> make sense.
> 
> The goal that everything works out of the box is limited by the need for
> safety for the user, and it is a concern for both a peer-to-peer
> protocol and for an external gateway. And it is not limited to technical
> security risks, it involves also legal liability: the information that
> somebody accessed a resource that is considered illegal in their country
> is more likely to leak. Also to consider: if FFmpeg hardcodes a default
> gateway, secondary distributors might change that default into a less
> trustworthy one.
> 
> The simile with DNS has a significant limitation: DNS has been here
> since forever, and we can assume it is properly configured everywhere.
> In fact, FFmpeg does not use DNS, it uses the libc's resolver, which
> could be configured not to use DNS at all. This protocol is a newfangled
> thing, so the expectation that it just works is lower.
> 

> It brings me to another point: how common is this thing? FFmpeg aims to

This is easy to answer: you can look at Google Trends since 2015, when the
first IPFS release was made

worldwide
https://trends.google.com/trends/explore?date=2015-01-01%202022-08-12&q=ipfs

at that timescale its popularity is going up a lot over time


> support all protocols used in the world, but it is not meant to be a
> showcase for somebody's vanity project or some company's new commercial
> product. For this issue, I think the criterion the IETF uses to consider
> something a standard is a good touchstone: are there several independent
> and compatible implementations already out there?

A GitHub search for ipfs has "10,204 repository results"

first hit is a "IPFS implementation in Go"
3rd is a "IPFS implementation in JavaScript"
looking further i see
"Python implementation of IPFS, the InterPlanetary File System. Not even remotely done yet."
further down
"The Interplanetary File System (IPFS), implemented in Rust"

so I'd say you can have an implementation of some form in every modern language

And i dont think removing IPFS support entirely from FFmpeg is a smart choice.

thx

[...]
Timo Rothenpieler Aug. 12, 2022, 5:21 p.m. UTC | #22
On 12.08.2022 19:18, Michael Niedermayer wrote:
> And i dont think removing IPFS support entirely from FFmpeg is a smart choice.

I wouldn't at all be upset about having proper IPFS support in FFmpeg, 
there's no argument there.

The issue is that this has very little to do with actual/native IPFS 
support; it's just a URL rewriter, which on top of that comes with a 
hardcoded default gateway, run by a company unknown to me, with 
unknown interests.
Michael Niedermayer Aug. 13, 2022, 4:29 p.m. UTC | #23
On Fri, Aug 12, 2022 at 07:21:02PM +0200, Timo Rothenpieler wrote:
> On 12.08.2022 19:18, Michael Niedermayer wrote:
> > And i dont think removing IPFS support entirely from FFmpeg is a smart choice.
> 
> I wouldn't at all be upset about having proper IPFS support in FFmpeg,
> there's no argument there.
> 
> The issue is that this has very little to do with actual/native IPFS
> support, but it's just a url rewriter, which on top of that comes with a
> hardcoded in default gateway. Which is run by a to me unknown company, with
> unknown interests.

I fully support better IPFS support.
What I'm a bit "upset" about is that running an IPFS node is presented as
if that was more private than using a gateway.

If you use a gateway there are 2 options
A. the gateway is honest then you have decent privacy
B. the gateway logs you, in which case you have no privacy

OTOH if you run a node
You have no privacy either way

Consider this:
If I want to know who downloads assetXYZ I can simply create 1000 nodes, each
sharing assetXYZ (this can in reality be 1 node pretending to be 1000).
If you now request assetXYZ from IPFS then the node you use will likely
download it straight from one of my 1000 nodes. I get your IP; yes, we
have an encrypted connection, but that goes straight to my attack nodes.
You notice nothing of this; I log your IP and time.

If you used some public gateway, I would just log the time and IP of that
public gateway.

If you want really private IPFS you need Tor or something
equivalent.
If someone posts a patch to add native Tor support I surely won't be unhappy.
I would also very much welcome more native IPFS support, but that alone does not
fix the privacy / logging issue.

Also I would be VERY happy if I'm wrong and running an IPFS node can be made
100% secure and private.

Independent of this, I would very much welcome the current gateway code
being extended to verify the content so the gateway cannot modify it!
And this should be enabled for non-local gateways by default, I think.

thx

[...]
Timo Rothenpieler Aug. 13, 2022, 7:06 p.m. UTC | #24
On 13.08.2022 18:29, Michael Niedermayer wrote:
> I fully support better IPFS support
> what iam a bit "upset" about is that running a IPFS node is presented as
> if that was more private than using a gateway.

That's not what people are suggesting.
The primary upset is about FFmpeg having hardcoded in a public gateway 
run by some company.
That is unprecedented for FFmpeg.
You have to keep in mind that that code will make it into a ton of 
distros, installed applications and who knows what else, for a very long 
time to come.

What if in 5 years that company goes under, and the domain is sold?
Or it just decides to "become evil"? What if it already is? I don't know 
that company, or how they earn their money with running a public service 
like that.
There are so many issues with hardcoding a domain like that into FFmpeg, 
that I'm surprised really anyone is defending it.


> If you use a gateway there are 2 options
> A. the gateway is honest then you have decent privacy
> B. the gateway logs you, in which case you have no privacy
> 
> OTOH if you run a node
> You have no privacy either way

If you run a node, you have put enough effort in, that you at least 
understand what is happening.
People understand torrents, which have the same issue, and manage to use 
them.

> Consider this:
> If i want to know who downloads assetXYZ i can simple create 1000 nodes each
> sharing assetXYZ. (this can in reality be 1 node pretending to be 1000)
> If you now request assetXYZ from IPFS then the node you use will likely
> download it straight from one of my 1000 nodes, i get your IP, yes we
> have a encrypted connection but that goes straight to my attack nodes
> you notice nothing of this, i log your IP and time.
> 
> If you used some public gateway, i would just log the time and IP of that
> public gateway
> 
> If you want really private IPFS with you need TOR or something
> equivalent.
> If someone posts a patch to add native TOR support i surely wont be unhappy
> I also would very welcome more native IPFS support but that alone does not
> fix the privacy / logging issue
> 
> Also i would be VERY happy if iam wrong and running a IPFS node can be made
> 100% secure and private

I don't really understand how that is at all relevant to the issue at hand:
We have hardcoded a company's server into our main codebase. Thus we 
endorse that company and basically say that we trust it.
Which I for one do not. I don't know it at all.
If it turns out that company is acting badly, it will also reflect badly 
on the project. We, as a project, simply cannot do that.

It's easy to say that "a user will just pick the first gateway found on 
google anyway", but we cannot save users from their own responsibility 
there.
It's our responsibility to be trustworthy. Hardcoding servers like this 
does not instill trust.

Especially if the IPFS project then publishes a big blog post about 
ffmpeg having gained "native" support, which makes the whole effort 
appear even more dubious, since the support that was added is very much 
not native.

> independant of this, i would very much welcome the current gateway code to
> be extended to verify the content so the gateway cannot modify it!
> And this should be enabled for non local gateways by default i think

Seems like a good idea in any case. No idea how IPFS works, but does the 
URL not work as a hash of the content it points to?
Michael Niedermayer Aug. 14, 2022, 6 p.m. UTC | #25
On Sat, Aug 13, 2022 at 09:06:50PM +0200, Timo Rothenpieler wrote:
> On 13.08.2022 18:29, Michael Niedermayer wrote:
> > I fully support better IPFS support
> > what iam a bit "upset" about is that running a IPFS node is presented as
> > if that was more private than using a gateway.
> 
> That's not what people are suggesting.
> The primary upset is about FFmpeg having hardcoded in a public gateway run
> by some company.
> That is unprecedented for FFmpeg.
> You have to keep in mind that that code will make it into a ton of distros,
> installed applications and who knows what else, for a very long time to
> come.
> 
> What if in 5 years that company goes under, and the domain is sold?
> Or it just decides to "become evil"? What if it already is? I don't know
> that company, or how they earn their money with running a public service
> like that.
> There are so many issues with hardcoding a domain like that into FFmpeg,
> that I'm surprised really anyone is defending it.

I think i misunderstood your concern. 


[...]
> > Consider this:
> > If i want to know who downloads assetXYZ i can simple create 1000 nodes each
> > sharing assetXYZ. (this can in reality be 1 node pretending to be 1000)
> > If you now request assetXYZ from IPFS then the node you use will likely
> > download it straight from one of my 1000 nodes, i get your IP, yes we
> > have a encrypted connection but that goes straight to my attack nodes
> > you notice nothing of this, i log your IP and time.
> > 
> > If you used some public gateway, i would just log the time and IP of that
> > public gateway
> > 
> > If you want really private IPFS with you need TOR or something
> > equivalent.
> > If someone posts a patch to add native TOR support i surely wont be unhappy
> > I also would very welcome more native IPFS support but that alone does not
> > fix the privacy / logging issue
> > 
> > Also i would be VERY happy if iam wrong and running a IPFS node can be made
> > 100% secure and private
> 
> I don't really understand how that is at all relevant to the issue at hand:
> We have hardcoded a companies server into our main codebase. Thus we endorse
> that company and basically say that we trust it.

That default was in no way intended as an endorsement. The view that this can
be seen as an endorsement is something I totally missed.


> Which I for one do not. I don't know it at all.

i also dont know it at all


> If it turns out that company is acting badly, it will also reflect badly on
> the project. We, as a project, simply cannot do that.
> 
> It's easy to say that "a user will just pick the first gateway found on
> google anyway", but we cannot safe users from their own responsibility
> there.
> It's our responsibility to be trustworthy. Hardcoding servers like this does
> not instill trust.
> 
> Specially if the IPFS project then publishes a big blog post about ffmpeg
> having gained "native" support, which makes the whole effort appear even
> more dubious, since the support that was added is very much not native.
> 

> > independant of this, i would very much welcome the current gateway code to
> > be extended to verify the content so the gateway cannot modify it!
> > And this should be enabled for non local gateways by default i think
> 
> Seems like a good idea in any case. No idea how ipfs works, but does the url
> not work as hash for the contents it points to?

The way I understand it, it does. But IIRC it's not just a hash, it's the hash of
the root of a Merkle tree. That way partial data can be verified, and this is
important for us.
Consider someone who downloads a 5 GB file from IPFS to play it back: we can't really
wait to receive the whole 5 GB before verifying.
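
As a rough illustration of why a Merkle root allows partial verification, the sketch below checks a single chunk against a trusted root using only the sibling hashes on its path. Everything in it is an assumption made for illustration: `toy_hash`, `hash_pair` and `verify_chunk` are hypothetical names, the tree is a plain binary tree, and FNV-1a stands in for a cryptographic hash such as SHA-256. The real CID and DAG layout used by IPFS is more involved and is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy 64-bit FNV-1a hash. This is a stand-in for a cryptographic hash
 * (e.g. SHA-256) and is NOT secure; it only keeps the sketch self-contained.
 * The seed separates leaf hashes (0) from interior-node hashes (1). */
static uint64_t toy_hash(const uint8_t *data, size_t len, uint64_t seed)
{
    uint64_t h = 0xcbf29ce484222325ULL ^ seed;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Hash of an interior node: the two child hashes, left then right. */
static uint64_t hash_pair(uint64_t left, uint64_t right)
{
    uint8_t buf[16];
    for (int i = 0; i < 8; i++) {
        buf[i]     = (uint8_t)(left  >> (8 * i));
        buf[8 + i] = (uint8_t)(right >> (8 * i));
    }
    return toy_hash(buf, sizeof(buf), 1);
}

/* Verify one chunk against a trusted root hash.
 * siblings[i] is the sibling hash at level i, leaf level first.
 * Bit i of index tells whether our node at level i is the right (1)
 * or left (0) child. Returns 1 if the chunk matches the root, else 0. */
static int verify_chunk(const uint8_t *chunk, size_t len, size_t index,
                        const uint64_t *siblings, size_t depth, uint64_t root)
{
    uint64_t h;

    assert(chunk || len == 0);
    assert(siblings || depth == 0);

    h = toy_hash(chunk, len, 0);
    for (size_t i = 0; i < depth; i++) {
        h = (index & 1) ? hash_pair(siblings[i], h)
                        : hash_pair(h, siblings[i]);
        index >>= 1;
    }
    return h == root;
}
```

With a scheme like this, each chunk could be checked as it arrives rather than after the whole file is downloaded, provided the sender also supplies the sibling hashes for that chunk.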

Also there are v0 and v1 CIDs. Version 1 seems quite rich in what it can
represent; I presume only a small subset matters, but all this would be better
explained by someone who knows this. I am just reading the docs a bit
and I am a bit curious. Everything I say about IPFS internals is based on how
I understand it and could be inaccurate.

thx

[...]
Nicolas George Aug. 15, 2022, 2:09 p.m. UTC | #26
Michael Niedermayer (12022-08-13):
> what iam a bit "upset" about is that running a IPFS node is presented as
> if that was more private than using a gateway.

I do not think it was suggested that it was more private and/or secure.

A more accurate and detailed statement of the issue is that both
solutions raise concerns about security and privacy, but not the same ones,
and users must be allowed a chance to make an informed decision between
using one, the other or none at all depending on their situation.

Furthermore, this protocol has not been around virtually forever, we
cannot assume users just already know about these issues and will think
of them as soon as they see ipfs://. It is an argument for this to NOT
just work out-of-the-box and instead require some setting up by users.

And if users search on the web and take any random gateway they find, or
copy-paste a command-line they found somewhere instead, it is not our
problem, because the web has been around virtually forever, and people
should know better by now than trusting anything they find.

As for what to do now, I suggest we approve Derek's patch removing the
default gateway, and discuss the rest after.

Regards,
Jean-Baptiste Kempf Aug. 15, 2022, 2:27 p.m. UTC | #27
Yo,

On Mon, 15 Aug 2022, at 16:09, Nicolas George wrote:
> As for what to do now, I suggest we approve Derek's patch removing the
> default gateway, and discuss the rest after.

I agree. This is the simplest for now, yes.

What one can do is suggest known examples of gateways in the error log message.

Best,
Michael Niedermayer Aug. 15, 2022, 5:53 p.m. UTC | #28
On Wed, Aug 10, 2022 at 11:27:08PM +0100, Derek Buitenhuis wrote:
> A gateway can see everything, and we should not be shipping a hardcoded
> default from a third party company; it's a security risk.
> 
> Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
> ---
>  libavformat/ipfsgateway.c | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/libavformat/ipfsgateway.c b/libavformat/ipfsgateway.c
> index 5a5178c563..907b61b017 100644
> --- a/libavformat/ipfsgateway.c
> +++ b/libavformat/ipfsgateway.c
> @@ -240,13 +240,8 @@ static int translate_ipfs_to_http(URLContext *h, const char *uri, int flags, AVD
>          ret = populate_ipfs_gateway(h);
>  
>          if (ret < 1) {
> -            // We fallback on dweb.link (managed by Protocol Labs).
> -            snprintf(c->gateway_buffer, sizeof(c->gateway_buffer), "https://dweb.link");
> -
> -            av_log(h, AV_LOG_WARNING,
> -                   "IPFS does not appear to be running. "
> -                   "You’re now using the public gateway at dweb.link.\n");
> -            av_log(h, AV_LOG_INFO,
> +            av_log(h, AV_LOG_ERROR,
> +                   "IPFS does not appear to be running.\n\n"
>                     "Installing IPFS locally is recommended to "
>                     "improve performance and reliability, "
>                     "and not share all your activity with a single IPFS gateway.\n"
> @@ -259,6 +254,8 @@ static int translate_ipfs_to_http(URLContext *h, const char *uri, int flags, AVD
>                     "3. Define an $IPFS_PATH environment variable "
>                     "and point it to the IPFS data path "
>                     "- this is typically ~/.ipfs\n");
> +            ret = AVERROR(EINVAL);
> +            goto err;
>          }
>      }

Before this patch, only "experts" needed to change the IPFS settings.
After this patch everyone who wants to use IPFS needs to change the IPFS
settings.
The printed text is adequate to experts but not to the average user.
It should either explain the privacy & security implications of the
different options or point to some external documentation.
Such external documentation needs to stay available at the given link
also for the lifetime of the releases it is part of

Said differently, a user choosing a gateway needs to understand that this
choice can affect her privacy. Similarly, the choice between gateway and
node affects privacy too.

Please add better documentation to achieve that. (Maybe a separate patch
would be cleanest.)

thx

[...]
Derek Buitenhuis Aug. 15, 2022, 7:35 p.m. UTC | #29
On 8/10/2022 11:27 PM, Derek Buitenhuis wrote:
> A gateway can see everything, and we should not be shipping a hardcoded
> default from a third party company; it's a security risk.
> 
> Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
> ---
>  libavformat/ipfsgateway.c | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)

I've been asked by almost all the active developers on FFmpeg at this
point to push this while we debate the error message, gateway list / solution,
nuking, etc.

I quintuple checked on IRC it was OK for me to push in the meantime. Logs are
there.

Many distros/packagers have already backported this patch to 5.1 themselves,
as they really (shockingly) do not want a default gateway.

So: Pushed.

Let us continue the discussion on the other aspects of this topic.

(Please do not send me hate mail or harassment. I really did check.)

- Derek
James Almer Aug. 15, 2022, 7:37 p.m. UTC | #30
On 8/15/2022 4:35 PM, Derek Buitenhuis wrote:
> On 8/10/2022 11:27 PM, Derek Buitenhuis wrote:
>> A gateway can see everything, and we should not be shipping a hardcoded
>> default from a third party company; it's a security risk.
>>
>> Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
>> ---
>>   libavformat/ipfsgateway.c | 11 ++++-------
>>   1 file changed, 4 insertions(+), 7 deletions(-)
> 
> I've been asked by almost all the active developers on FFmpeg at this
> point to push this while we debate the error message, gateway list / solution,
> nuking, etc.
> 
> I quintuple checked on IRC it was OK for me to push in the meantime. Logs are
> there.

Can attest to that. This patch got more than enough +1s.

> 
> Many distros/packagers have already backported this patch to 5.1 themselves,
> as they really (shockingly) do not want a default gateway.
> 
> So: Pushed.
> 
> Let us continue the discusion on the other aspects of this topic.
> 
> (Please do not send me hate mail or harassment. I really did check.)
> 
> - Derek
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Michael Niedermayer Aug. 15, 2022, 9:47 p.m. UTC | #31
On Mon, Aug 15, 2022 at 08:35:18PM +0100, Derek Buitenhuis wrote:
> On 8/10/2022 11:27 PM, Derek Buitenhuis wrote:
> > A gateway can see everything, and we should not be shipping a hardcoded
> > default from a third party company; it's a security risk.
> > 
> > Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
> > ---
> >  libavformat/ipfsgateway.c | 11 ++++-------
> >  1 file changed, 4 insertions(+), 7 deletions(-)
> 
> I've been asked by almost all the active developers on FFmpeg at this
> point to push this while we debate the error message, gateway list / solution,
> nuking, etc.
> 
> I quintuple checked on IRC it was OK for me to push in the meantime. Logs are
> there.
> 
> Many distros/packagers have already backported this patch to 5.1 themselves,
> as they really (shockingly) do not want a default gateway.
> 
> So: Pushed.
> 
> Let us continue the discusion on the other aspects of this topic.
> 
> (Please do not send me hate mail or harassment. I really did check.)

I'd just like to note that many of these statements are not untrue, but also
not exactly true the way they are written. That begins with the commit message
"it's a security risk.": sure, it is one for some definition of risk, but
after this patch there is more risk in practice for the average end user.

I would be careful with the distros which backported this. Honestly,
I would question these distros' security more than the security of the
gateway, because I don't think they reviewed this.
It says this now:
                   "IPFS does not appear to be running.\n\n"
                   "Installing IPFS locally is recommended to "
                   "improve performance and reliability, "
That removes the gateway (because it can't be trusted) and replaces it
with a literal recommendation to install their software.
It replaces logging with code execution ... either you trust them or not.

No, I am not angry or anything like that at all. I am also not asking for any
revert or anything, and I fully acknowledge that there seems to be a clear majority
for the removal of the default.
I just have to point this out because I think something went a bit wrong here.

Also some of the argumentation today on IRC about crypto & NFTs felt a
bit like the "distant dark past" where random projects and random people were
attacked.
If one was concerned that using a default gateway could be seen
as endorsement by us, please also be concerned about how FFmpeg looks
when its developers attack other projects on its official development IRC channels.

thx


[...]
Nicolas George Aug. 15, 2022, 9:57 p.m. UTC | #32
Michael Niedermayer (12022-08-15):
> It says this now:
>                    "IPFS does not appear to be running.\n\n"
>                    "Installing IPFS locally is recommended to "
>                    "improve performance and reliability, "
> That removes the gateway (because it cant be trusted and replaced it
> by a litteral recommandition to install their software)
> replace loging by code exec ... either you trust them or not

The recommendation was already there, it was not changed by the patch.
And it definitely should not be there, it is an error message, not a
novel. Furthermore, even the actual error message, "IPFS does not appear
to be running", is just wrong.

The error message should be something like "IPFS gateway not
configured." and the rest in the documentation, starting with which
configuration files and environment variables it uses, and including
warnings about security and privacy concerns.

Regards,
Mark Gaiser Aug. 15, 2022, 11:53 p.m. UTC | #33
On Mon, Aug 15, 2022 at 11:57 PM Nicolas George <george@nsup.org> wrote:

> Michael Niedermayer (12022-08-15):
> > It says this now:
> >                    "IPFS does not appear to be running.\n\n"
> >                    "Installing IPFS locally is recommended to "
> >                    "improve performance and reliability, "
> > That removes the gateway (because it cant be trusted and replaced it
> > by a litteral recommandition to install their software)
> > replace loging by code exec ... either you trust them or not
>
> The recommendation was already there, it was not changed by the patch.
> And it definitely should not be there, it is an error message, not a
> novel. Furthermore, even the actual error message, "IPFS does not appear
> to be running", is just wrong.
>
> The error message should be something like "IPFS gateway not
> configured." and the rest in the documentation, starting with which
> configuration files and environment variables it uses, and including
> warnings about security and privacy concerns.
>

I'd like to clarify a few points regarding the gateway. Like many things
said in this thread, it's not all as black and white as it might appear at
first glance.

First, an earlier message from Michael:
> Before this patch, only "experts" needed to change the IPFS settings.
> After this patch everyone who wants to use IPFS needs to change the IPFS
> settings.

This is both true and false; it entirely depends on the situation.
If one is running an IPFS node and if that node is 0.15.0 or up (to be
released in the coming days or weeks) then:
- the gateway file exists as ~/.ipfs/gateway when IPFS is running
- ffmpeg will find it and use it
- the user uses the local gateway

Granted, that is in the happy day scenario. Users who don't have IPFS
running now have to provide the gateway with one of the available options.
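
The lookup described here can be sketched roughly as follows. This is a standalone illustration, not the code in ipfsgateway.c: `resolve_gateway` and `read_gateway_file` are hypothetical names, and the order (option, then `$IPFS_GATEWAY`, then `$IPFS_PATH/gateway`, then `~/.ipfs/gateway`) is an assumption based on the options named in the log message quoted in this thread. The failure text uses the wording suggested earlier in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read the first line of <dir>/gateway into out. Returns 1 on success. */
static int read_gateway_file(const char *dir, char *out, size_t out_size)
{
    char path[1024];
    FILE *f;
    int n, ok = 0;

    n = snprintf(path, sizeof(path), "%s/gateway", dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return 0;
    f = fopen(path, "r");
    if (!f)
        return 0;
    if (fgets(out, (int)out_size, f)) {
        out[strcspn(out, "\r\n")] = '\0';   /* strip the line ending */
        ok = out[0] != '\0';
    }
    fclose(f);
    return ok;
}

/* Hypothetical helper: pick a gateway from the user's configuration.
 * Returns 1 and fills out on success, 0 if nothing is configured.
 * There is deliberately no hardcoded fallback. */
static int resolve_gateway(const char *option, char *out, size_t out_size)
{
    const char *env;
    char dir[1024];
    int n;

    assert(out && out_size > 0);

    /* 1. Explicit option, e.g. the value passed via -gateway <url>. */
    if (option && *option) {
        snprintf(out, out_size, "%s", option);
        return 1;
    }
    /* 2. $IPFS_GATEWAY holds the full HTTP URL of the gateway. */
    env = getenv("IPFS_GATEWAY");
    if (env && *env) {
        snprintf(out, out_size, "%s", env);
        return 1;
    }
    /* 3. $IPFS_PATH points at the IPFS data directory. */
    env = getenv("IPFS_PATH");
    if (env && *env && read_gateway_file(env, out, out_size))
        return 1;
    /* 4. Default data directory, typically ~/.ipfs. */
    env = getenv("HOME");
    if (env && *env) {
        n = snprintf(dir, sizeof(dir), "%s/.ipfs", env);
        if (n >= 0 && (size_t)n < sizeof(dir) &&
            read_gateway_file(dir, out, out_size))
            return 1;
    }
    fprintf(stderr, "IPFS gateway not configured.\n");
    return 0;
}
```

With none of the sources set, the function reports the failure instead of falling back to a public gateway, which is the behaviour this patch aims for.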

Next, the message I am replying to, specifically:
>  "IPFS does not appear to be running", is just wrong.

I suspect the patch didn't do what it originally did when no gateway was
found (which is the new case).
This message was supposed to be shown in that scenario:

            av_log(h, AV_LOG_INFO,
                   "Installing IPFS locally is recommended to "
                   "improve performance and reliability, "
                   "and not share all your activity with a single IPFS gateway.\n"
                   "There are multiple options to define this gateway.\n"
                   "1. Call ffmpeg with a gateway param, "
                   "without a trailing slash: -gateway <url>.\n"
                   "2. Define an $IPFS_GATEWAY environment variable with the "
                   "full HTTP URL to the gateway "
                   "without trailing forward slash.\n"
                   "3. Define an $IPFS_PATH environment variable "
                   "and point it to the IPFS data path "
                   "- this is typically ~/.ipfs\n");

In the new reality that message still wouldn't be as accurate as it can be.
It would miss a line explaining the cause and it would likely be an error
typed message instead.
I'd be happy to review and test a patch that fixes this!

Another quote from an earlier message:
> The printed text is adequate to experts but not to the average user.
> It should either explain the privacy & security implications of the
> different options or point to some external documentation.
> Such external documentation needs to stay available at the given link
> also for the lifetime of the releases it is part of

I'd be happy to help to create log messages and documentation that make
sense to everyone targeted as a user.
However, I do consider both ffmpeg and ipfs as fairly low level tools where
it should be expected that some technical jargon would be acceptable. In
other terms, not foolproof but technically correct and clear enough.
Applications building on top of ffmpeg should probably be the one giving a
end user friendly message (for example, vlc, kodi, mpv, ..)

Just let me know how I can help!

Lastly, I'd really appreciate it if I can get a cc when there is a change
in the ipfsgateway implementation in ffmpeg.
I'm not an ffmpeg dev at all thus it can be assumed that I don't monitor
the ffmpeg mailing lists.
A cc would therefore be really helpful. My name and mail is in the source
file.


> Regards,
>
> --
>   Nicolas George
Michael Niedermayer Aug. 16, 2022, 2:46 p.m. UTC | #34
On Mon, Aug 15, 2022 at 11:47:53PM +0200, Michael Niedermayer wrote:
[...]
> If one was concerned that using a default gateway could be seen
> as endorsment by us. Be concerned please about how FFmpeg looks
> when its developers attack other projects on its official development IRC channels.

I've written a longer argument about crypto and NFTs at
https://guru.multimedia.cx/why-crypto/ giving my humble opinion about it,
in case anyone cares. It kind of fits here, but I didn't want to bore everyone
who doesn't care by posting that long (off-topic drifting) text here.

thx

[...]
Tomas Härdin Aug. 17, 2022, 3:03 p.m. UTC | #35
tor 2022-08-11 klockan 19:35 +0200 skrev Timo Rothenpieler:
> On 11.08.2022 19:21, Mark Gaiser wrote:
> > 
> > Here's the conversation requesting this very feature:
> > https://ffmpeg.org/pipermail/ffmpeg-devel/2022-March/293835.html
> 
> I generally agree with the points brought up there.
> But my conclusion very much is not "just put a somewhat random
> default 
> into the code".
> Even a list of defaults is not Okay.
> We can't hardcode "magic servers".
> 
> If it's not possible to make the protocol work without them, it
> likely 
> shouldn't have been merged in the first place.
> Why can't it access the files directly, but only via some magic http 
> gateway?
> Why does it need special code in ffmpeg in the first place, if you
> can 
> just access it via that http proxy-gateway anyway?

I raised this very point several times when IPFS support was first
suggested, and raised that ipfsgateway.c amounts to business logic that
does not belong in lavf. I see now that others in this thread, like
Derek, agree with me on this, which is nice. IIRC I even suggested that
users should solve this in bash, a suggestion that was shouted down as
being "insecure".

I also suggested we should actually implement ipfs or link to a library
that implements it rather than shoving more string mangling crap into
lavf. A default gateway didn't exist when I last looked at the
patchset. Seems I wasn't vigilant enough.

The correct place to solve this is at the OS level. There should be a
fully fledged protocol handler in the OS that lavf can rely on, and
neither lavf nor any other library should ever try to implement
protocols themselves. But that's more plan9-ish, so it will likely not
happen in the Linux world any time soon.

/Tomas
Michael Niedermayer Aug. 18, 2022, 2:31 p.m. UTC | #36
On Wed, Aug 17, 2022 at 05:03:56PM +0200, Tomas Härdin wrote:
> tor 2022-08-11 klockan 19:35 +0200 skrev Timo Rothenpieler:
> > On 11.08.2022 19:21, Mark Gaiser wrote:
> > > 
> > > Here's the conversation requesting this very feature:
> > > https://ffmpeg.org/pipermail/ffmpeg-devel/2022-March/293835.html
> > 
> > I generally agree with the points brought up there.
> > But my conclusion very much is not "just put a somewhat random
> > default 
> > into the code".
> > Even a list of defaults is not Okay.
> > We can't hardcode "magic servers".
> > 
> > If it's not possible to make the protocol work without them, it
> > likely 
> > shouldn't have been merged in the first place.
> > Why can't it access the files directly, but only via some magic http 
> > gateway?
> > Why does it need special code in ffmpeg in the first place, if you
> > can 
> > just access it via that http proxy-gateway anyway?
> 
> I raised this very point several times when IPFS support was first
> suggested, and raised that ipfsgateway.c amounts to business logic that
> does not belong in lavf. I see now hat others in this thread, like
> Derek, agree with me on this, which is nice. IIRC I even suggested that
> users should solve this in bash, a suggestion that was shouted down as
> being "insecure".

You cannot do it in bash.
Is filter="ipfs://..." on the command line translated or not?
If it's drawtext showing the user a URL on screen, it must not be.
OTOH, if the filter reads from the URL, it has to be.
This just isn't going to work at the bash command-line level; besides, bash is
not even a dependency of FFmpeg.
Not to mention that an ipfs:// link from one container to another container
would never show up on the command line.


> 
> I also suggested we should actually implement ipfs or link to a library
> that implements it rather than shoving more string mangling crap into

for reference, mark replied in that thread:

    A "proper" implementation is unfeasible for ffmpeg purposes because a
    proper implementation would act as an IPFS node.
    That means it would:
    - spin up
    - do it's bootstrapping
    - connect to nodes and find new nodes to connect to
    - find the CID on the network
    - etc...

    This all adds a lot of startup time making it very unfriendly to users.
    In this scenario it could take up to minutes before your video starts
    playing if it doesn't time out.


> lavf. A default gateway didn't exist when I last looked at the
> patchset. Seems I wasn't vigilant enough.

[...]
Tomas Härdin Aug. 19, 2022, 9:15 a.m. UTC | #37
tor 2022-08-18 klockan 16:31 +0200 skrev Michael Niedermayer:
> On Wed, Aug 17, 2022 at 05:03:56PM +0200, Tomas Härdin wrote:
> > tor 2022-08-11 klockan 19:35 +0200 skrev Timo Rothenpieler:
> > > On 11.08.2022 19:21, Mark Gaiser wrote:
> > > > 
> > > > Here's the conversation requesting this very feature:
> > > > https://ffmpeg.org/pipermail/ffmpeg-devel/2022-March/293835.html
> > > 
> > > I generally agree with the points brought up there.
> > > But my conclusion very much is not "just put a somewhat random
> > > default 
> > > into the code".
> > > Even a list of defaults is not Okay.
> > > We can't hardcode "magic servers".
> > > 
> > > If it's not possible to make the protocol work without them, it
> > > likely 
> > > shouldn't have been merged in the first place.
> > > Why can't it access the files directly, but only via some magic
> > > http 
> > > gateway?
> > > Why does it need special code in ffmpeg in the first place, if
> > > you
> > > can 
> > > just access it via that http proxy-gateway anyway?
> > 
> > I raised this very point several times when IPFS support was first
> > suggested, and raised that ipfsgateway.c amounts to business logic
> > that
> > does not belong in lavf. I see now hat others in this thread, like
> > Derek, agree with me on this, which is nice. IIRC I even suggested
> > that
> > users should solve this in bash, a suggestion that was shouted down
> > as
> > being "insecure".
> 
> you cannot do it in bash

Is bash not Turing complete?

> filter="ipfs://..."
> on the command line is translated or not ?
> if its drawtext showing the user on screen a URL it must not be
> OTOH if the filter reads from the URL it has to be
> this just isnt going to work at the bash command line level besides
> bash is
> not even a dependancy of FFmpeg
> not to mention that a ipfs:// link from one container to another
> container
> would never show up on the command line

The point is that this is business logic that belongs elsewhere.

> > 
> > I also suggested we should actually implement ipfs or link to a
> > library
> > that implements it rather than shoving more string mangling crap
> > into
> 
> for reference, mark replied in that thread:
> 
>     A "proper" implementation is unfeasible for ffmpeg purposes
> because a
>     proper implementation would act as an IPFS node.
>     That means it would:
>     - spin up
>     - do it's bootstrapping
>     - connect to nodes and find new nodes to connect to
>     - find the CID on the network
>     - etc...
> 
>     This all adds a lot of startup time making it very unfriendly to
> users.
>     In this scenario it could take up to minutes before your video
> starts
>     playing if it doesn't time out.

Yes that is what implementing ipfs: entails. But ipfsgateway.c is not
actually ipfs: now is it? This would be like gopher.c using overbite as
a gateway instead of actually implementing gopher:

We don't need to shovel everything into this project. We can actually
rely on users being smart. For example vlc could do the business logic
of dealing with IPFS. Or, better yet, the OS.

/Tomas
Mark Gaiser Aug. 19, 2022, 12:52 p.m. UTC | #38
On Fri, Aug 19, 2022 at 11:15 AM Tomas Härdin <tjoppen@acc.umu.se> wrote:

> tor 2022-08-18 klockan 16:31 +0200 skrev Michael Niedermayer:
> > On Wed, Aug 17, 2022 at 05:03:56PM +0200, Tomas Härdin wrote:
> > > tor 2022-08-11 klockan 19:35 +0200 skrev Timo Rothenpieler:
> > > > On 11.08.2022 19:21, Mark Gaiser wrote:
> > > > >
> > > > > Here's the conversation requesting this very feature:
> > > > > https://ffmpeg.org/pipermail/ffmpeg-devel/2022-March/293835.html
> > > >
> > > > I generally agree with the points brought up there.
> > > > But my conclusion very much is not "just put a somewhat random
> > > > default
> > > > into the code".
> > > > Even a list of defaults is not Okay.
> > > > We can't hardcode "magic servers".
> > > >
> > > > If it's not possible to make the protocol work without them, it
> > > > likely
> > > > shouldn't have been merged in the first place.
> > > > Why can't it access the files directly, but only via some magic
> > > > http
> > > > gateway?
> > > > Why does it need special code in ffmpeg in the first place, if
> > > > you
> > > > can
> > > > just access it via that http proxy-gateway anyway?
> > >
> > > I raised this very point several times when IPFS support was first
> > > suggested, and raised that ipfsgateway.c amounts to business logic
> > > that
> > > does not belong in lavf. I see now hat others in this thread, like
> > > Derek, agree with me on this, which is nice. IIRC I even suggested
> > > that
> > > users should solve this in bash, a suggestion that was shouted down
> > > as
> > > being "insecure".
> >
> > you cannot do it in bash
>
> Is bash not Turing complete?
>

I believe we went over this in detail during those patch rounds when this
was brought up (by you?).
I didn't go back in the archives to find it, but some reasons that come to
mind:
- just handling the edge cases would make that bash script complex
- potentially involving regex
- add in gateway detection logic and you'll have a bash script that's
likely more complex than the current ipfsgateway.c code.
- not cross platform by any means

This would require the manual step of installing such a script to get
"ffplay ipfs://<cid>" to work.
Worst of all, it would not be usable for applications building on top of
ffmpeg (vlc, mpv, kodi) and would not be cross platform at all.
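
For reference, the core string translation being argued about can be sketched as below. This is a minimal illustration rather than the ipfsgateway.c code: `ipfs_to_http` is a hypothetical helper, and it assumes the common path-style gateway layout, `<gateway>/ipfs/<cid>` and `<gateway>/ipns/<name>`. Gateway detection and the edge cases mentioned above are deliberately left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrite ipfs://<cid>[/path] or ipns://<name>[/path]
 * into an HTTP URL on the given gateway.
 * Returns 0 on success, -1 on an unsupported URI or a too-small buffer. */
static int ipfs_to_http(const char *gateway, const char *uri,
                        char *out, size_t out_size)
{
    const char *kind, *rest;
    size_t glen;
    int n;

    assert(gateway && uri && out);

    if (!strncmp(uri, "ipfs://", 7))
        kind = "ipfs";
    else if (!strncmp(uri, "ipns://", 7))
        kind = "ipns";
    else
        return -1;

    rest = uri + 7;
    if (!*rest)
        return -1;              /* nothing after the scheme */

    /* Tolerate trailing slashes on the gateway URL. */
    glen = strlen(gateway);
    while (glen > 0 && gateway[glen - 1] == '/')
        glen--;
    if (glen == 0)
        return -1;

    n = snprintf(out, out_size, "%.*s/%s/%s", (int)glen, gateway, kind, rest);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

For example, with the gateway `http://localhost:8080`, the input `ipfs://<cid>/video.mp4` becomes `http://localhost:8080/ipfs/<cid>/video.mp4`.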


> > filter="ipfs://..."
> > on the command line is translated or not ?
> > if its drawtext showing the user on screen a URL it must not be
> > OTOH if the filter reads from the URL it has to be
> > this just isnt going to work at the bash command line level besides
> > bash is
> > not even a dependancy of FFmpeg
> > not to mention that a ipfs:// link from one container to another
> > container
> > would never show up on the command line
>
> The point is that this is business logic that belongs elsewhere.
>
> > >
> > > I also suggested we should actually implement ipfs or link to a
> > > library
> > > that implements it rather than shoving more string mangling crap
> > > into
> >
> > for reference, mark replied in that thread:
> >
> >     A "proper" implementation is unfeasible for ffmpeg purposes
> > because a
> >     proper implementation would act as an IPFS node.
> >     That means it would:
> >     - spin up
> >     - do it's bootstrapping
> >     - connect to nodes and find new nodes to connect to
> >     - find the CID on the network
> >     - etc...
> >
> >     This all adds a lot of startup time making it very unfriendly to
> > users.
> >     In this scenario it could take up to minutes before your video
> > starts
> >     playing if it doesn't time out.
>
> Yes that is what implementing ipfs: entails. But ipfsgateway.c is not
> actually ipfs: now is it? This would be like gopher.c using overbite as
> a gateway instead of actually implementing gopher:
>
> We don't need to shovel everything into this project. We can actually
> rely on users being smart. For example vlc could do the business logic
> of dealing with IPFS. Or, better yet, the OS.
>

.. we should probably fork this out into its own thread, but i do like to
have this discussion! ..

What you describe here is an opinion I hear all too often. 9 out of 10
times the gut reaction for "implementing IPFS support" is any of the above
suggestions you make.
Very understandable for someone with technical skills and just a surface
level overview of what ipfs support means.
Heck, I too would give that response and thought that way when I just
discovered IPFS!

But when you dive deep into this you'll find out that there are no easy
solutions.
Bear in mind, any solution where you'd need an ipfs implementation like
ffmpeg has now in each application is just not scalable. I wanted to add
IPFS support in ffmpeg because it allows for far easier IPFS usage in
anything that uses ffmpeg.
But is that ideal? Depends on the alternatives. Is it the best achievable
at the moment? Probably yes.

I don't understand the argument of "ipfsgateway.c amounts to business logic
that does not belong in lavf". Earlier arguments in this very same thread
argue for far more "full" ipfs support; isn't that contradictory? How can
a "full implementation" then ever be acceptable with that same reasoning? I
honestly don't get that, so please do educate me in this regard.

For the sake of the argument, and because I'm really curious to know too:
say we want to go for OS-level support of IPFS. The very term "OS level"
gives the impression that it would magically make IPFS work everywhere on
the OS. Say, again for the sake of the argument, that actual OS-level IPFS
support does have that magical behavior! Sweet! I want it! How?
- Can it be mounted as special directories? Say for example on /ipfs and
/ipns.. Yes! But that makes this Linux-specific support that won't work
on Windows. Which means this option isn't ideal.
- Can there be a global ipfs:// and ipns:// protocol handler? If that even
exists, that handler would do "something".. how would that be supported
across different applications? Say Chrome, a file browser and ipfs, to name a
few. Or would that magical handler just return a file descriptor, on the
assumption that every application is able to use file descriptors?
- Any more OS-level alternatives?

Do keep the discussion for this going (perhaps in a new thread)! This is
really interesting stuff that could very well influence a path forward.
Provided that there is a credible cross platform way.


>
> /Tomas
>
Tomas Härdin Aug. 22, 2022, 9:12 a.m. UTC | #39
fre 2022-08-19 klockan 14:52 +0200 skrev Mark Gaiser:
> 
> 
> I believe we went over this in detail during those patch rounds when
> this
> was brought up (by you?).
> I didn't go back in the archives to find it, but some reasons that
> come to
> mind:
> - just handing the mere edge cases would make that bash script
> complex
> - potentially involving regex
> - add in gateway detection logic and you'll have a bash script that's
> likely more complex than the current ipfsgateway.c code.
> - not cross platform by any means

Not our problem



> I don't understand the argument of "ipfsgateway.c amounts to business
> logic
> that does not belong in lavf". Earlier arguments in this very same
> thread
> argues for far more "full" ipfs support, isn't that contradicting?
> How can
> a "full implementation" then ever be acceptable with that same
> reasoning? I
> honestly don't get that so please do educate me in this regard.

I'd actually argue that in that case we should link a library that
implements IPFS, not split developer effort by trying to implement it
ourselves.

> 
> For the sake of the argument and because I'm really curious to know
> too.
> Say we want to go for OS level support of IPFS. Just in the mindset
> of the
> meaning of "OS level" gives the impression that it would magically
> make
> IPFS work everywhere on the OS. Say, again for the sake of the
> argument,
> that actual OS level IPFS support does that magical behavior! Sweet!
> I want
> it! How?
> - Can it be mounted as special directories? Say for example on /ipfs
> and
> /ipns.. Yes! But that makes this a linux specific support that won't
> work
> for windows. Which means this option isn't ideal.
> - Can there be a global ipfs:// and ipns:// protocol handler? If that
> even
> exists, thathandler would do "something".. how would that be
> supported
> across different applications? Say chrome, file browser and ipfs to
> name a
> few. Or would that magical handler just return a file descriptor
> where
> every application is supposed to be able to use file descriptors?
> - Any more OS level alternatives?

A better way of handling this could be FUSE. How exactly to implement
protocol handlers at the kernel or (more likely) systemd level is a
discussion for those projects. Probably dbus shenanigans resulting in
file descriptors.
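
As a rough illustration of the FUSE route (a minimal sketch, not lavf
code): assuming an IPFS daemon exposes FUSE mounts at /ipfs and /ipns, an
application would only need to turn an ipfs:// or ipns:// URI into a path
under that mount and open it like any other file. The helper name
ipfs_uri_to_mount_path and the mount points are assumptions made for this
example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of lavf: map "ipfs://<cid>/<path>" or
 * "ipns://<name>/<path>" onto an assumed FUSE mount at /ipfs or /ipns.
 * Returns 0 on success, -1 if the URI uses another scheme, names nothing,
 * or the result does not fit into the output buffer.
 */
static int ipfs_uri_to_mount_path(const char *uri, char *out, size_t out_size)
{
    const char *root, *rest;
    int n;

    if (!strncmp(uri, "ipfs://", 7))
        root = "/ipfs/";
    else if (!strncmp(uri, "ipns://", 7))
        root = "/ipns/";
    else
        return -1;

    rest = uri + 7;
    if (!*rest)
        return -1; /* no CID or name after the scheme */

    n = snprintf(out, out_size, "%s%s", root, rest);
    if (n < 0 || (size_t)n >= out_size)
        return -1; /* output was truncated */

    return 0;
}
```

With something like this, the application passes the resulting path to its
normal file I/O and contains no gateway logic at all; everything
IPFS-specific lives behind the mount.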

I don't hate the current gateway solution enough to want to delete it,
since its current state of not having a default gateway won't cause
unexpected security problems for users. But I also deliberately chose
not to push the patchset because I didn't like it, because the proper
solution belongs elsewhere.

/Tomas
Nicolas George Aug. 22, 2022, 12:52 p.m. UTC | #40
Tomas Härdin (12022-08-22):
> I'd actually argue that in that case we should link a library that
> implements IPFS, not split developer effort by trying to implement it
> ourselves.

Is FFmpeg meant to be just a convenient set of wrappers for existing
libraries, then?

If not, what is your criterion for deciding when including a native version
in FFmpeg is worth splitting developer effort?

Regards,
Ronald S. Bultje Aug. 23, 2022, 12:53 p.m. UTC | #41
Hi,

On Mon, Aug 22, 2022 at 5:52 AM Nicolas George <george@nsup.org> wrote:

> Tomas Härdin (12022-08-22):
> > I'd actually argue that in that case we should link a library that
> > implements IPFS, not split developer effort by trying to implement it
> > ourselves.
>
> Is FFmpeg meant to be just a convenient set of wrappers for existing
> libraries, then?
>
> If not, what is your criterion to decide when splitting developer effort
> is worth including a native version in FFmpeg.
>

I think it usually comes down to a developer wanting to, you know, spend
time on it.

As an example: if Tomas (or anyone) wants to write native IPFS: great.
Otherwise, we can accept an external lib for it.

Ronald
Nicolas George Aug. 23, 2022, 12:55 p.m. UTC | #42
Ronald S. Bultje (12022-08-23):
> I think it usually comes down to a developer wanting to, you know, spend
> time on it.

Sure, it can be that. But not only. I have seen people, including Tomas
himself, oppose code in favor of using a library.

Regards,
Tomas Härdin Aug. 24, 2022, 4:35 p.m. UTC | #43
mån 2022-08-22 klockan 14:52 +0200 skrev Nicolas George:
> Tomas Härdin (12022-08-22):
> > I'd actually argue that in that case we should link a library that
> > implements IPFS, not split developer effort by trying to implement
> > it
> > ourselves.
> 
> Is FFmpeg meant to be just a convenient set of wrappers for existing
> libraries, then?

It Depends.

If we could rid this project of all NIH-isms that would be great. Only
keep that which is strictly better than other existing libraries, for
example higher-performance decoders. For everything else we can do
subtree merges if people still insist the project should build "out of
the box".

One excellent example of this is the recent discussion around libxml2.
I maintain that developer effort should go toward improving libxml2.
Only if that is a lost cause, if libxml2 is hopelessly slow or
irredeemably buggy, only then would a new XML parser be justified. It
seems most developers understand this. But for some reason the notion
that the same applies to *all* parsers, including decoders and
demuxers, is hard to swallow. And similarly for encoders
and muxers. I have yet to see a justification that is anything but
cargo culting.

This goes especially for formats like MXF. I have made the case on here
multiple times that we should not maintain our own decoder for it,
but rather pull in bmx. And every time I have suggested this it has been
made clear that such patches would be rejected. And so MXF developer
effort is split.

Code is a liability. We should seek to have as little of it as
possible.

/Tomas
Michael Niedermayer Aug. 24, 2022, 8:54 p.m. UTC | #44
On Wed, Aug 24, 2022 at 06:35:04PM +0200, Tomas Härdin wrote:
> mån 2022-08-22 klockan 14:52 +0200 skrev Nicolas George:
> > Tomas Härdin (12022-08-22):
> > > I'd actually argue that in that case we should link a library that
> > > implements IPFS, not split developer effort by trying to implement
> > > it
> > > ourselves.
> > 
> > Is FFmpeg meant to be just a convenient set of wrappers for existing
> > libraries, then?
> 
> It Depends.
> 
> If we could rid this project of all NIH:isms that would be great. Only
> keep that which is strictly better than other existing libraries, for
> example higher-performance decoders. For everything else we can do
> subtree merges if people still insist the project should build "out of
> the box".
> 

> One excellent example of this is the recent discussion around libxml2.
> I maintain that developer effort should go toward improving libxml2.
> Only if that is a lost cause, if libxml2 is hopelessly slow or
> irredeemably buggy, only then would a new XML parser be justified. It
> seems most developers understand this. 

> But for some reason the notion
> that the same applies to *all* parsers, including decoders and
> demuxers, this notion is hard to swallow. And similarly for encoders
> and muxers. I have yet to see a justification that is anything but
> cargo culting.

It's not hard to swallow, it simply is wrong.
Why is there Tesla?
To build cars?
No.
"Tesla’s mission is to accelerate the world’s transition to sustainable energy."
They could outsource everything, from chip design to batteries to software to
the car seats and so on, but they don't, because doing these things internally
is the better way to reach their goal.

Replace Tesla with FFmpeg. Now what is our goal?
To create some free multimedia framework?
I would say that is ultimately wrong.
What we have done is to realize people's multimedia needs and dreams.
People wanted to watch all the proprietary formats on free platforms.
Today you can play any multimedia file with FFmpeg on any free platform.

Other things that fit into that mission would be to
implement a streaming tool and infrastructure to replace YouTube & TikTok.
It's multimedia and it's something people want: no advertisements, no
"advertisers first".

Censorship-resistant and private multimedia communication is another
potential goal. Just a few days ago I read about some guy who got
all his accounts terminated because his wife sent a picture of their
son to their doctor using his phone. The police then looked through
the guy's pictures without his knowledge and determined it was all fine, but
the police couldn't even contact him because his phone number and everything
else had been terminated. This is not how communication should work. I mean,
if you send a picture of your son to your doctor, NO ONE should be able
to look at that except your doctor. Not Apple, not the police, not the
government.

These are ideas; they need people to work on them.
I am just saying that IMNSHO these would belong in FFmpeg's mission.
Now, do we need an internal MXF demuxer and muxer for these? It depends on
whichever is more efficient in reaching our goals.
Personally I think internal MXF code is better, but I could be wrong.

I'll send a second mail more specifically about MXF.

[...]
Michael Niedermayer Aug. 24, 2022, 9:03 p.m. UTC | #45
On Wed, Aug 24, 2022 at 06:35:04PM +0200, Tomas Härdin wrote:
[...]
>
> This goes especially for formats like MXF, which I have made the case
> on here multiple times that we should not maintain our own decoder for,
> but rather pull in bmx. And everytime I have suggested this it has been
> made clear that such patches would be rejected. And so MXF developer
> effort is split.

Is there a need for MXF (de)muxing without other containers?
If the answer is (mostly) no, then the problem is not FFmpeg wanting its
own but rather that someone maintains MXF code outside FFmpeg.


>
> Code is a liability. We should seek to have as little of it as
> possible.

Look back at Tesla: "vertical integration is a liability" sounds
wrong. Quite the opposite, companies that split everything out seem to do
significantly worse. It doesn't mean everything should be done internally,
but simply because some external work exists doesn't mean we need to use it
and then maybe have to maintain a codebase that we do not know, that
no one is willing to maintain, and that no one from FFmpeg even has write
access to. Next, some platforms may carry old versions of that external
code, and some might not carry it at all. It can become a mess when we need
a specific feature and when distros like Debian have a policy that requires
shared libs to be used when available. So Debian would remove an internal
copy of a lib and force their shared lib to be used. At least that was
the policy when I last looked years ago.
And if we had an internal copy we would also have to do full security
maintenance of that internal copy. Again, that is code no one in FFmpeg
knows.

For libxml2 these problems are less likely to hit us, as we likely never need a
"new XML feature", but for a (de)muxer we quite likely will need the
latest version on every platform.
Also, we have regression tests; external libs make that impossible,
as the version of an external lib can change the behavior. Again, this
is an issue for MXF, maybe less so for libxml2. You can also see that we have no
tests involving any of the external encoder libs, for that very reason.
With each external lib that is needed for core features this would
become a quickly growing problem.

thx
Kieran Kunhya Aug. 24, 2022, 9:18 p.m. UTC | #46
>
> for libxml2 these problems are less likely to hit us as we likely never
> need a
> "new xml feature" but for a (de)muxer we quite likely will need the
> latests version on every platform.
> Also we have regression tests, external libs make that impossible
> as the version of external libs can change the behavior. Again this
> is a issue for mxf maybe less so libxml. You can also see that we have no
> tests involving  any of the external encoder libs, for that very reason.
> With each external lib that is needed for core features this would
> become a quickly growing problem
>
>

Going back to technical arguments instead of utopian pipedreams (replacing
YouTube and TikTok, lol): you will never fit all the features of complex
containers like MXF, MP4, TS (and for argument's sake XML) inside a
generalised framework like FFmpeg.

Likewise with dav1d, we have seen that an external lib has allowed them to
introduce new paradigms such as mixed frame and slice threads without
having to redo the whole framework of dozens of codecs. There is value to
this. There are also a lot of modern codec features which don't fit easily
into FFmpeg, such as dependent substreams.

Kieran


Michael Niedermayer Aug. 25, 2022, 1:57 p.m. UTC | #47
On Wed, Aug 24, 2022 at 10:18:04PM +0100, Kieran Kunhya wrote:
> >
> > for libxml2 these problems are less likely to hit us as we likely never
> > need a
> > "new xml feature" but for a (de)muxer we quite likely will need the
> > latests version on every platform.
> > Also we have regression tests, external libs make that impossible
> > as the version of external libs can change the behavior. Again this
> > is a issue for mxf maybe less so libxml. You can also see that we have no
> > tests involving  any of the external encoder libs, for that very reason.
> > With each external lib that is needed for core features this would
> > become a quickly growing problem
> >
> >
> 
> Going back to technical arguments instead of utopian pipedreams (replacing
> YouTube and Tiktok lol). 

Well, who would have dreamt that humans would be able to fly?
Maybe if the world's best engineers and scientists and a lot of capital were used?
Was that the Wright brothers? No, actually it was not. The smartest people,
with the most degrees and a lot of government funding, were a competing team whose
name I don't even remember. The Wright brothers had no degrees and
funded their work from the revenue of their bicycle shop.

Or maybe creating a free OS. That Linux pipedream.
Or maybe reverse engineering all codecs and writing free implementations?
You yourself know many of the people who did that. Another pipedream.

Or editing the human genome maybe. You know, when you start with a
single-celled embryo you can... wait, no. There actually are a few people who
had the majority of their liver cells gene-edited in vivo as adults, long after
their birth, to treat a rare disease called ATTR amyloidosis.

...

Now maybe it will always stay a pipedream that all the evil social media
and communication platforms get replaced by things that preserve privacy,
respect free speech and are not controlled by advertisers. Quite possible,
even quite likely.
Then again maybe someone, maybe Elon Musk with Twitter, will take a bite
out of these giants. I have no clue.
My point was just that this sort of stuff would fit into FFmpeg's mission
if someone wanted to do it. It's something I'd be happy to support.


> You will never fit all the features of complex
> containers like MXF, MP4, TS (and for argument's sake XML) inside a
> generalised framework like FFmpeg.

Maybe true, but the reason is not that it can't be done, just that there are
features no one uses and no one needs.
I do know some video codec specs, and there are bizarre things in them that
aren't worth the paper they are written on.
The features that are used or that people need, we must support IMO.


> 
> Likewise with dav1d, we have seen that an external lib has allowed them to
> introduced new paradigms such as mixed frame and sliced threads without
> having to redo the whole framework of dozens of codecs. There is value to
> this. There are also a lot of modern codec features which aren't easily
> fittable into FFmpeg such as dependent substreams.

The framework has evolved, will evolve and must evolve. There's a lot that's
different between FFmpeg 10 years ago and today.
External libs won't fix that, btw. If the framework doesn't handle feature X,
then it also doesn't with an external lib. User apps using the framework have
the problem both ways.

thx

[...]
Kieran Kunhya Aug. 25, 2022, 2:41 p.m. UTC | #48
> > You will never fit all the features of complex
> > containers like MXF, MP4, TS (and for argument's sake XML) inside a
> > generalised framework like FFmpeg.
>
> Maybe true but the reason is not that it cant be dont just that there are
> features noone uses and noone needs.
> I do know some video codec specs and there are bizare things in them that
> arent worth the paper they are written on.
> The features that are used or that people need, we must support IMO.
>

No, it's not that people don't want the features, it's just that they can't
be supported in an API that has been "designed" around AVI and assumes all
formats follow the same paradigm.
Like I said, dependent substreams in a different track can't be done. Neither
can moving between frame-attached closed captions and track-based closed
captions. How do I feed v210 into x264 directly without
having to decode and do two extra memory copies? External libraries support
much more advanced features of the container than can be shoehorned
into a rigid API. And people here will have to just accept the compromise.

I'm not even going to begin to touch the rest of your email with a
bargepole.

Kieran
Tomas Härdin Aug. 27, 2022, 7:05 a.m. UTC | #49
ons 2022-08-24 klockan 22:54 +0200 skrev Michael Niedermayer:
> On Wed, Aug 24, 2022 at 06:35:04PM +0200, Tomas Härdin wrote:
> > But for some reason the notion
> > that the same applies to *all* parsers, including decoders and
> > demuxers, this notion is hard to swallow. And similarly for
> > encoders
> > and muxers. I have yet to see a justification that is anything but
> > cargo culting.
> 
> Its not hard to swallow, it simply is wrong.
> Why is there Tesla ?
> to build cars ?
> no
> "Tesla’s mission is to accelerate the world’s transition to
> sustainable energy."

Tesla's mission is to generate profit, nothing else.

> they could outsource everything, from chip design to batteries to
> software to
> the car seats and so on but they dont because its better to reach
> their goal
> to do them internally

This has everything to do with economies of scale and ultimately
economizing on labour, thus lowering the value of Tesla's lithium-ion
cells and increasing profit.

For software the situation is very different, because the cost of
reproducing a program is effectively zero. All labour goes into
development. The goal of FFmpeg, like every free software project, is to
create use-values. Any labour spent in excess of what is necessary to,
say, be able to play MXF files is simply make-work.

I'm not sure what the bit about censorship has to do with my point.

/Tomas
Tomas Härdin Aug. 27, 2022, 7:29 a.m. UTC | #50
ons 2022-08-24 klockan 23:03 +0200 skrev Michael Niedermayer:
> On Wed, Aug 24, 2022 at 06:35:04PM +0200, Tomas Härdin wrote:
> [...]
> > 
> > This goes especially for formats like MXF, which I have made the
> > case
> > on here multiple times that we should not maintain our own decoder
> > for,
> > but rather pull in bmx. And everytime I have suggested this it has
> > been
> > made clear that such patches would be rejected. And so MXF
> > developer
> > effort is split.
> 
> Is there a need for mxf (de) muxing without other containers ?
> If the awnser is (mostly) no then the problem is not FFmpeg wanting
> its
> own but rather that someone maintains mxf code outside ffmpeg.

I think you missed my point about subtree merges

> > Code is a liability. We should seek to have as little of it as
> > possible.
> 
> Look back at tesla, "vertical integration is a liability", that
> sounds
> wrong. Quite the opposit, companies that split everything out seem to
> do
> significantly worse.

This again has to do with things like economies of scale. When it comes
to inter-company exchange, profit acts as a fetter on productivity.
Tesla has to pay not just the cost of Samsung's 18650 cells but also
Samsung's profit. These are two reasons why vertical integration can
make sense.

If Tesla could copy Samsung's 18650 production line, Samsung's capital,
literally for free then they would have done so from day one. This is
what free software is - a kind of commons capital that can be copied
gratis.

> It doesnt mean everything should be done internally
> but simply because some external work exists doesnt mean we need to
> use it
> and then have to maybe maintain a codebase that we do not know and
> that
> noone is willing to maintain and that noone from FFmpeg even has
> write
> access to.

Obviously we wouldn't pull in things we have zero access to. We could
for example subtree merge specific releases, and set up the build
system so that we can either use that or an equivalent shared system
version. One limitation of this approach is that new features in, say,
bmx can't be merged immediately but have to wait for the next official
release of bmx. But this can be handled by having a branch where
changes in our codebase that depend on changes in bmx coexist, and on
bmx's next release they are merged into master. Pressure could be
applied from our end on bmx to release early if the feature is a
pressing one.

> Next some platforms may carry old versions of that external
> code, some might not carry it at all.

Good thing we could build those dependencies ourselves then, if we set
up the build system correctly.

> Also we have regression tests, external libs make that impossible
> as the version of external libs can change the behavior. Again this
> is a issue for mxf maybe less so libxml. You can also see that we
> have no
> tests involving  any of the external encoder libs, for that very
> reason.
> With each external lib that is needed for core features this would
> become a quickly growing problem

Testing is certainly a challenge, but almost certainly easier than
reimplementing certain formats and codecs poorly, especially MXF.

/Tomas
Paul B Mahol Aug. 27, 2022, 7:53 a.m. UTC | #51
On Sat, Aug 27, 2022 at 9:30 AM Tomas Härdin <tjoppen@acc.umu.se> wrote:

> ons 2022-08-24 klockan 23:03 +0200 skrev Michael Niedermayer:
> > On Wed, Aug 24, 2022 at 06:35:04PM +0200, Tomas Härdin wrote:
> > [...]
> > >
> > > This goes especially for formats like MXF, which I have made the
> > > case
> > > on here multiple times that we should not maintain our own decoder
> > > for,
> > > but rather pull in bmx. And everytime I have suggested this it has
> > > been
> > > made clear that such patches would be rejected. And so MXF
> > > developer
> > > effort is split.
> >
> > Is there a need for mxf (de) muxing without other containers ?
> > If the awnser is (mostly) no then the problem is not FFmpeg wanting
> > its
> > own but rather that someone maintains mxf code outside ffmpeg.
>
> I think you missed my point about subtree merges
>
> > > Code is a liability. We should seek to have as little of it as
> > > possible.
> >
> > Look back at tesla, "vertical integration is a liability", that
> > sounds
> > wrong. Quite the opposit, companies that split everything out seem to
> > do
> > significantly worse.
>
> This again has to do with things like economics of scale. When it comes
> to inter-company exchange, profit acts as a fetter on productivity.
> Tesla has to pay not just the cost of Samsung's 18650 cells but also
> Samsung's profit. These are two reasons why vertical integration can
> make sense.
>
> If Tesla could copy Samsung's 18650 production line, Samsung's capital,
> literally for free then they would have done so from day one. This is
> what free software is - a kind of commons capital that can be copied
> gratis.
>
> > It doesnt mean everything should be done internally
> > but simply because some external work exists doesnt mean we need to
> > use it
> > and then have to maybe maintain a codebase that we do not know and
> > that
> > noone is willing to maintain and that noone from FFmpeg even has
> > write
> > access to.
>
> Obviously we wouldn't pull in things we have zero access to. We could
> for example subtree merge specific releases, and set up the build
> system so that we can either use that or an equivalent shared system
> version. One limitation of this approach is that new features in say
> bmx can't be merged immediately but has to wait for the next official
> release of bmx. But this can be handled by having a branch where
> changes in our codebase that depend on changes in bmx coexist, and on
> bmx's next release they are merged into master. Pressure could be
> applied from our end on bmx to release early if the feature is a
> pressing one.
>
> > Next some platforms may carry old versions of that external
> > code, some might not carry it at all.
>
> Good thing we could build those dependencies ourselves then, if we set
> up the build system correctly.
>
> > Also we have regression tests, external libs make that impossible
> > as the version of external libs can change the behavior. Again this
> > is a issue for mxf maybe less so libxml. You can also see that we
> > have no
> > tests involving  any of the external encoder libs, for that very
> > reason.
> > With each external lib that is needed for core features this would
> > become a quickly growing problem
>
> Testing is certainly a challenge, but almost certainly easier than
> reimplementing certain formats and codecs poorly, especially MXF.
>
>
Then why are you the so-called maintainer of a thing you do not want to maintain?



Looks like some relatively "new" devs are completely failing to see the
point of FFmpeg.

And they are altogether extremely ignorant of the reality they live in.


If for whatever reason you do not want/like some code in FFmpeg then just
leave.





> /Tomas
>
Tomas Härdin Aug. 27, 2022, 11:30 a.m. UTC | #52
lör 2022-08-27 klockan 09:53 +0200 skrev Paul B Mahol:
> On Sat, Aug 27, 2022 at 9:30 AM Tomas Härdin <tjoppen@acc.umu.se>
> wrote:
> 
> > ons 2022-08-24 klockan 23:03 +0200 skrev Michael Niedermayer:
> > > Also we have regression tests, external libs make that impossible
> > > as the version of external libs can change the behavior. Again
> > > this
> > > is a issue for mxf maybe less so libxml. You can also see that we
> > > have no
> > > tests involving  any of the external encoder libs, for that very
> > > reason.
> > > With each external lib that is needed for core features this
> > > would
> > > become a quickly growing problem
> > 
> > Testing is certainly a challenge, but almost certainly easier than
> > reimplementing certain formats and codecs poorly, especially MXF.
> > 
> > 
> Than why you are so called maintainer of thing you do not want to
> maintain?

Me maintaining mxfdec is precisely why I want to switch to bmx

> Looks like some relatively "new" devs are completely failing to see
> the
> point of FFmpeg.

I've been here longer than you

/Tomas
Baptiste Coudurier Aug. 27, 2022, 5:34 p.m. UTC | #53
On Aug 27, 2022, at 4:30 AM, Tomas Härdin <tjoppen@acc.umu.se> wrote:
> 
> lör 2022-08-27 klockan 09:53 +0200 skrev Paul B Mahol:
>> On Sat, Aug 27, 2022 at 9:30 AM Tomas Härdin <tjoppen@acc.umu.se>
>> wrote:
>> 
>>> ons 2022-08-24 klockan 23:03 +0200 skrev Michael Niedermayer:
>>>> Also we have regression tests, external libs make that impossible
>>>> as the version of external libs can change the behavior. Again
>>>> this
>>>> is a issue for mxf maybe less so libxml. You can also see that we
>>>> have no
>>>> tests involving  any of the external encoder libs, for that very
>>>> reason.
>>>> With each external lib that is needed for core features this
>>>> would
>>>> become a quickly growing problem
>>> 
>>> Testing is certainly a challenge, but almost certainly easier than
>>> reimplementing certain formats and codecs poorly, especially MXF.
>>> 
>>> 
>> Than why you are so called maintainer of thing you do not want to
>> maintain?
> 
> Me maintaining mxfdec is precisely why I want to switch to bmx

I strongly oppose that.
Didn’t we just merge IMF support? That seems to indicate that people are fine with FFmpeg's MXF implementation.

If the need arises, I will maintain the MXF code; I use it every day after all :)

— 
Baptiste
Tomas Härdin Aug. 28, 2022, 11:49 a.m. UTC | #54
lör 2022-08-27 klockan 10:34 -0700 skrev Baptiste Coudurier:
> On Aug 27, 2022, at 4:30 AM, Tomas Härdin <tjoppen@acc.umu.se> wrote:
> > 
> > lör 2022-08-27 klockan 09:53 +0200 skrev Paul B Mahol:
> > > On Sat, Aug 27, 2022 at 9:30 AM Tomas Härdin <tjoppen@acc.umu.se>
> > > wrote:
> > > 
> > > > ons 2022-08-24 klockan 23:03 +0200 skrev Michael Niedermayer:
> > > > > Also we have regression tests, external libs make that
> > > > > impossible
> > > > > as the version of external libs can change the behavior.
> > > > > Again
> > > > > this
> > > > > is a issue for mxf maybe less so libxml. You can also see
> > > > > that we
> > > > > have no
> > > > > tests involving  any of the external encoder libs, for that
> > > > > very
> > > > > reason.
> > > > > With each external lib that is needed for core features this
> > > > > would
> > > > > become a quickly growing problem
> > > > 
> > > > Testing is certainly a challenge, but almost certainly easier
> > > > than
> > > > reimplementing certain formats and codecs poorly, especially
> > > > MXF.
> > > > 
> > > > 
> > > Than why you are so called maintainer of thing you do not want to
> > > maintain?
> > 
> > Me maintaining mxfdec is precisely why I want to switch to bmx
> 
> I strongly oppose to that.
> Didn’t we just merge IMF support ? That seems to indicate that people
> are fine with FFmpeg MXF implementation.
> 
> If the need arises, I will maintain the MXF code, I use it everyday
> after all :)

Happy to see you still on this list Baptiste :)

My point is that both of us and the other few people in the free
software scene who care about MXF should focus our attention on bmx.

A similar argument can be made for j2k which I've been working on
lately. While the improvements I have made for the lavc decoder make it
considerably faster, it's still only about half as fast as openjpeg.
The effort currently being spent by the student implementing htj2k
support in our decoder might be better spent merging openjpeg and
implementing codeblock level multithreading like I have done for lavc's
j2k decoder.

/Tomas
Michael Niedermayer Aug. 28, 2022, 2:14 p.m. UTC | #55
On Sat, Aug 27, 2022 at 09:05:06AM +0200, Tomas Härdin wrote:
> ons 2022-08-24 klockan 22:54 +0200 skrev Michael Niedermayer:
> > On Wed, Aug 24, 2022 at 06:35:04PM +0200, Tomas Härdin wrote:
> > > But for some reason the notion that the same applies to *all*
> > > parsers, including decoders and demuxers, is hard to swallow. And
> > > similarly for encoders and muxers. I have yet to see a justification
> > > that is anything but cargo culting.
> > 
> > It's not hard to swallow, it simply is wrong.
> > Why is there Tesla?
> > To build cars?
> > No.
> > "Tesla’s mission is to accelerate the world’s transition to
> > sustainable energy."
> 
> Tesla's mission is to generate profit, nothing else.

I am sorry, but Tesla's mission is what I said, and it's the first line
on their About page:

https://www.tesla.com/about

That's an official statement and as such I would presume we can trust it.
If the first line of their About page were untrue, I presume Tesla would
open itself up to some lawsuits.


> 
> > They could outsource everything, from chip design to batteries to
> > software to the car seats and so on, but they don't because, to reach
> > their goal, it's better to do them internally
> 
> This has everything to do with economies of scale and ultimately
> economizing on labour, thus lowering the value of Tesla's lithium-ion
> cells and increasing profit.
> 
> For software the situation is very different, because the cost of
> reproducing a program is effectively zero. All labour goes into
> development. The goal of FFmpeg like every free software project is to
> create use-values. Any labour spent in excess of what is necessary to,
> say, be able to play MXF files is simply make-work.

With software there is a cost in maintaining external libraries: not just
API, features, distribution issues and security, but also developers being
unhappy, for example. This is not all that different from physical goods.
The terms are different, but a software project that integrates 500
libraries maintained by 500 external teams is going to be a much bigger
pain than if you had the 500 things internally. It's 500 not well known
codebases, and each could become unmaintained, or may be unavailable or
outdated on some platform. That is not that different from a car
manufacturer having to deal with supply chain issues and other things from
500 external companies.

Also, even if the analogy fails, that doesn't make moving core
functionality to externally maintained libs a good idea.

thx

[...]
Patch

diff --git a/libavformat/ipfsgateway.c b/libavformat/ipfsgateway.c
index 5a5178c563..907b61b017 100644
--- a/libavformat/ipfsgateway.c
+++ b/libavformat/ipfsgateway.c
@@ -240,13 +240,8 @@  static int translate_ipfs_to_http(URLContext *h, const char *uri, int flags, AVD
         ret = populate_ipfs_gateway(h);
 
         if (ret < 1) {
-            // We fallback on dweb.link (managed by Protocol Labs).
-            snprintf(c->gateway_buffer, sizeof(c->gateway_buffer), "https://dweb.link");
-
-            av_log(h, AV_LOG_WARNING,
-                   "IPFS does not appear to be running. "
-                   "You’re now using the public gateway at dweb.link.\n");
-            av_log(h, AV_LOG_INFO,
+            av_log(h, AV_LOG_ERROR,
+                   "IPFS does not appear to be running.\n\n"
                    "Installing IPFS locally is recommended to "
                    "improve performance and reliability, "
                    "and not share all your activity with a single IPFS gateway.\n"
@@ -259,6 +254,8 @@  static int translate_ipfs_to_http(URLContext *h, const char *uri, int flags, AVD
                    "3. Define an $IPFS_PATH environment variable "
                    "and point it to the IPFS data path "
                    "- this is typically ~/.ipfs\n");
+            ret = AVERROR(EINVAL);
+            goto err;
         }
     }