Message ID | 20200725024537.97453-1-lq@chinaffmpeg.org |
---|---|
State | New |
Headers | show |
Series | [FFmpeg-devel,v2,1/2] avformat/url: check double dot is not to parent directory | expand |
Context | Check | Description |
---|---|---|
andriy/default | pending | |
andriy/make | success | Make finished |
andriy/make_fate | success | Make fate finished |
Hi Steven, It is better but still not correct. Consider this test: test("http://server/foo/bar", "a/b/../c/d/../e../.../..f/g../h../other/url/a.mp3/..."); It should give " http://server/foo/bar/a/c/e../.../..f/g../h../other/url/a.mp3/...". I think the best would be to use strtok(p, "/") to split the path into the components and for each ".." component remove the previous one (if there are some still). Best regards, Josef On Sat, Jul 25, 2020 at 4:45 AM Steven Liu <lq@chinaffmpeg.org> wrote: > fix ticket: 8814 > if get ".." in the url, check next byte and lead byte by double dot, > it there have no '/' and not root node, it is not used go to directory ".." > > Signed-off-by: Steven Liu <lq@chinaffmpeg.org> > --- > libavformat/url.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/libavformat/url.c b/libavformat/url.c > index 20463a6674..35f27fe3ca 100644 > --- a/libavformat/url.c > +++ b/libavformat/url.c > @@ -97,6 +97,18 @@ static void trim_double_dot_url(char *buf, const char > *rel, int size) > /* set new current position if the root node is changed */ > p = root; > while (p && (node = strstr(p, ".."))) { > + if (strlen(node) > 2 && node[2] != '/') { > + node = strstr(node + 1, ".."); > + if (!node) > + break; > + } > + > + if (p != node && p[node - p - 1] != '/') { > + node = strstr(node + 1, ".."); > + if (!node) > + break; > + } > + > av_strlcat(tmp_path, p, node - p + strlen(tmp_path)); > p = node + 3; > sep = strrchr(tmp_path, '/'); > -- > 2.25.0 > > > > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
在 2020/7/25 下午2:11,“ffmpeg-devel 代表 Zlomek, Josef”<ffmpeg-devel-bounces@ffmpeg.org 代表 josef@pex.com> 写入: Hi Steven, It is better but still not correct. Consider this test: test("http://server/foo/bar", "a/b/../c/d/../e../.../..f/g../h../other/url/a.mp3/..."); It should give " http://server/foo/bar/a/c/e../.../..f/g../h../other/url/a.mp3/...". I think the best would be to use strtok(p, "/") to split the path into the components and for each ".." component remove the previous one (if there are some still). I think you can submit patch if you have full idea for it. Best regards, Josef On Sat, Jul 25, 2020 at 4:45 AM Steven Liu <lq@chinaffmpeg.org> wrote: > fix ticket: 8814 > if get ".." in the url, check next byte and lead byte by double dot, > it there have no '/' and not root node, it is not used go to directory ".." > > Signed-off-by: Steven Liu <lq@chinaffmpeg.org> > --- > libavformat/url.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/libavformat/url.c b/libavformat/url.c > index 20463a6674..35f27fe3ca 100644 > --- a/libavformat/url.c > +++ b/libavformat/url.c > @@ -97,6 +97,18 @@ static void trim_double_dot_url(char *buf, const char > *rel, int size) > /* set new current position if the root node is changed */ > p = root; > while (p && (node = strstr(p, ".."))) { > + if (strlen(node) > 2 && node[2] != '/') { > + node = strstr(node + 1, ".."); > + if (!node) > + break; > + } > + > + if (p != node && p[node - p - 1] != '/') { > + node = strstr(node + 1, ".."); > + if (!node) > + break; > + } > + > av_strlcat(tmp_path, p, node - p + strlen(tmp_path)); > p = node + 3; > sep = strrchr(tmp_path, '/'); > -- > 2.25.0 > > > > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". -- Josef Zlomek _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Steven Liu (12020-07-25): > fix ticket: 8814 > if get ".." in the url, check next byte and lead byte by double dot, > it there have no '/' and not root node, it is not used go to directory ".." > > Signed-off-by: Steven Liu <lq@chinaffmpeg.org> > --- > libavformat/url.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/libavformat/url.c b/libavformat/url.c > index 20463a6674..35f27fe3ca 100644 > --- a/libavformat/url.c > +++ b/libavformat/url.c > @@ -97,6 +97,18 @@ static void trim_double_dot_url(char *buf, const char *rel, int size) > /* set new current position if the root node is changed */ > p = root; > while (p && (node = strstr(p, ".."))) { > + if (strlen(node) > 2 && node[2] != '/') { I have not yet looked at the whole patch, but this strlen() test is useless. And I think you would better rework the complete logic of the test: strstring ".." is a broken method. > + node = strstr(node + 1, ".."); > + if (!node) > + break; > + } > + > + if (p != node && p[node - p - 1] != '/') { > + node = strstr(node + 1, ".."); > + if (!node) > + break; > + } > + > av_strlcat(tmp_path, p, node - p + strlen(tmp_path)); > p = node + 3; > sep = strrchr(tmp_path, '/'); Regards,
刘歧 (12020-07-25): > I think the best would be to use strtok(p, "/") to split the path into the > components and for each ".." component remove the previous one (if there > are some still). > I think you can submit patch if you have full idea for it. strtok() is a terrible API, and better not used, but you are right, the logic of the parsing is flawed. I would suggest to keep walk from / to / (using strchr() probably), and keep a pointer on the last two in the source and the last one in the target IIRC. Regards,
Nicolas George <george@nsup.org> 于2020年7月25日周六 下午4:31写道: > > Steven Liu (12020-07-25): > > fix ticket: 8814 > > if get ".." in the url, check next byte and lead byte by double dot, > > it there have no '/' and not root node, it is not used go to directory ".." > > > > Signed-off-by: Steven Liu <lq@chinaffmpeg.org> > > --- > > libavformat/url.c | 12 ++++++++++++ > > 1 file changed, 12 insertions(+) > > > > diff --git a/libavformat/url.c b/libavformat/url.c > > index 20463a6674..35f27fe3ca 100644 > > --- a/libavformat/url.c > > +++ b/libavformat/url.c > > @@ -97,6 +97,18 @@ static void trim_double_dot_url(char *buf, const char *rel, int size) > > /* set new current position if the root node is changed */ > > p = root; > > while (p && (node = strstr(p, ".."))) { > > > + if (strlen(node) > 2 && node[2] != '/') { > > I have not yet looked at the whole patch, but this strlen() test is > useless. > > And I think you would better rework the complete logic of the test: agreed, I think need lots of testcase for the logic, ../ .../.. dummy../... .../..dummy and so on, maybe need spend some time to do it :D > strstring ".." is a broken method. > > > + node = strstr(node + 1, ".."); > > + if (!node) > > + break; > > + } > > + > > + if (p != node && p[node - p - 1] != '/') { > > + node = strstr(node + 1, ".."); > > + if (!node) > > + break; > > + } > > + > > av_strlcat(tmp_path, p, node - p + strlen(tmp_path)); > > p = node + 3; > > sep = strrchr(tmp_path, '/'); > > Regards, > > -- > Nicolas George > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Steven Liu (12020-07-25): > > And I think you would better rework the complete logic of the test: > agreed, I think need lots of testcase for the logic, ../ .../.. > dummy../... .../..dummy and so on, > maybe need spend some time to do it :D I did not mean the FATE test for this feature, I meant the test of whether something is a parent directory or not. It should parse the path, not fish for a pattern. Regards,
Nicolas George <george@nsup.org> 于2020年7月25日周六 下午4:44写道: > > Steven Liu (12020-07-25): > > > And I think you would better rework the complete logic of the test: > > agreed, I think need lots of testcase for the logic, ../ .../.. > > dummy../... .../..dummy and so on, > > maybe need spend some time to do it :D > > I did not mean the FATE test for this feature, I meant the test of > whether something is a parent directory or not. It should parse the Yes, I mean, not only same as your suggestion, also need make full FATE. > path, not fish for a pattern. > > Regards, > > -- > Nicolas George > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
On Sat, 25 Jul 2020, Zlomek, Josef wrote: > Hi Steven, > > It is better but still not correct. Consider this test: > > test("http://server/foo/bar", > "a/b/../c/d/../e../.../..f/g../h../other/url/a.mp3/..."); > It should give " > http://server/foo/bar/a/c/e../.../..f/g../h../other/url/a.mp3/...". > > I think the best would be to use strtok(p, "/") to split the path into the > components and for each ".." component remove the previous one (if there > are some still). And I also would like to point out that using static strings with MAX_URL_SIZE is not OK. This function supports an arbitrary buffer size, so limiting it to MAX_URL_SIZE is a bug. Regards, Marton > > Best regards, > Josef > > On Sat, Jul 25, 2020 at 4:45 AM Steven Liu <lq@chinaffmpeg.org> wrote: > >> fix ticket: 8814 >> if get ".." in the url, check next byte and lead byte by double dot, >> it there have no '/' and not root node, it is not used go to directory ".." >> >> Signed-off-by: Steven Liu <lq@chinaffmpeg.org> >> --- >> libavformat/url.c | 12 ++++++++++++ >> 1 file changed, 12 insertions(+) >> >> diff --git a/libavformat/url.c b/libavformat/url.c >> index 20463a6674..35f27fe3ca 100644 >> --- a/libavformat/url.c >> +++ b/libavformat/url.c >> @@ -97,6 +97,18 @@ static void trim_double_dot_url(char *buf, const char >> *rel, int size) >> /* set new current position if the root node is changed */ >> p = root; >> while (p && (node = strstr(p, ".."))) { >> + if (strlen(node) > 2 && node[2] != '/') { >> + node = strstr(node + 1, ".."); >> + if (!node) >> + break; >> + } >> + >> + if (p != node && p[node - p - 1] != '/') { >> + node = strstr(node + 1, ".."); >> + if (!node) >> + break; >> + } >> + >> av_strlcat(tmp_path, p, node - p + strlen(tmp_path)); >> p = node + 3; >> sep = strrchr(tmp_path, '/'); >> -- >> 2.25.0 >> >> >> >> >> _______________________________________________ >> ffmpeg-devel mailing list >> ffmpeg-devel@ffmpeg.org >> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel >> >> To unsubscribe, visit link above, or email >> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". > > > > -- > Josef Zlomek > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Marton Balint <cus@passwd.hu> 于2020年7月25日周六 下午5:40写道: > And I also would like to point out that using static strings with > MAX_URL_SIZE is not OK. This function supports an arbitrary buffer size, > so limiting it to MAX_URL_SIZE is a bug. Okay, get it, i'm reworking. Thanks Steven
Marton Balint (12020-07-25): > And I also would like to point out that using static strings with > MAX_URL_SIZE is not OK. This function supports an arbitrary buffer size, so > limiting it to MAX_URL_SIZE is a bug. Excellent point. That would require changing the prototype of a few functions, but they are internal, we can do it. For this, I would like to pitch the AVWriter API once again, because it is made precisely for these cases: when you need to return a string of arbitrary length, but would like to avoid, as much a possible, the overhead of dynamically allocating it and the many error tests that come with it. It is like AVBPrint, but better. Let me elaborate without polluting the discussion by showing the actual implementation. So, let us say that function A calls function B, and function B needs to return a string of arbitrary length but often small (I say string, but the API can deal with binary data). Likely use cases: rewriting an URL like here; escaping special characters; serialization to key=value string; serialization to JSON or XML; character encoding conversion, zlib-style unpacking, etc. First, from the point of view of B, the function that needs to return the string: Instead of either: int B(char *buf, size_t buf_size, other arguments); int B(char **buf, other arguments); request an AVWriter as argument: void B(AVWriter wr, other arguments); Then, just write in it with the various API functions: av_writer_write(), av_writer_printf(), etc. And that's all. Everything is taken care of: the string will grow as needed, the size will be tracked, error checks are made. Nothing to worry about at all. Now, from the point of view of A, the function that calls B and gets a string in return: Instead of declaring a buffer, or a pointer to a buffer, declare: AVWriter wr = av_dynbuf_writer(); and call B with it: B(wr, ...). Then, extract the string, check for error (of course: dynamic allocation can happen and fail; but this error check is the only one necessary), use the string and free it: char *msg = av_dynbuf_writer_get_data(wr, NULL); if (!msg) return AVERROR(ENOMEM); do something with msg av_dynbuf_writer_reset(wr); I concede, for this example, av_dynbuf_writer_get_data() is one more step than using the pointer directly. But as soon as the code becomes more complex, in particular as soon as it uses the same writer twice to put two strings together, this extra step is balanced by the fewer error checks necessary. All said, it makes one more API to learn, but IMHO, it is very simple to learn: seven line of sample code. And the benefits are immediate. Plus: there are many wonderful features that I have not told you about yet. So, Steven, Marton and the others: presented like that, does it look appealing? Shall I elaborate? Regards,
I will submit patch on monday. Josef On Sat, Jul 25, 2020 at 8:27 AM 刘歧 <lq@chinaffmpeg.org> wrote: > > > 在 2020/7/25 下午2:11,“ffmpeg-devel 代表 Zlomek, Josef”< > ffmpeg-devel-bounces@ffmpeg.org 代表 josef@pex.com> 写入: > > Hi Steven, > > It is better but still not correct. Consider this test: > > test("http://server/foo/bar", > "a/b/../c/d/../e../.../..f/g../h../other/url/a.mp3/..."); > It should give " > http://server/foo/bar/a/c/e../.../..f/g../h../other/url/a.mp3/...". > > I think the best would be to use strtok(p, "/") to split the path into > the > components and for each ".." component remove the previous one (if > there > are some still). > I think you can submit patch if you have full idea for it. > > Best regards, > Josef > > On Sat, Jul 25, 2020 at 4:45 AM Steven Liu <lq@chinaffmpeg.org> wrote: > > > fix ticket: 8814 > > if get ".." in the url, check next byte and lead byte by double dot, > > it there have no '/' and not root node, it is not used go to > directory ".." > > > > Signed-off-by: Steven Liu <lq@chinaffmpeg.org> > > --- > > libavformat/url.c | 12 ++++++++++++ > > 1 file changed, 12 insertions(+) > > > > diff --git a/libavformat/url.c b/libavformat/url.c > > index 20463a6674..35f27fe3ca 100644 > > --- a/libavformat/url.c > > +++ b/libavformat/url.c > > @@ -97,6 +97,18 @@ static void trim_double_dot_url(char *buf, const > char > > *rel, int size) > > /* set new current position if the root node is changed */ > > p = root; > > while (p && (node = strstr(p, ".."))) { > > + if (strlen(node) > 2 && node[2] != '/') { > > + node = strstr(node + 1, ".."); > > + if (!node) > > + break; > > + } > > + > > + if (p != node && p[node - p - 1] != '/') { > > + node = strstr(node + 1, ".."); > > + if (!node) > > + break; > > + } > > + > > av_strlcat(tmp_path, p, node - p + strlen(tmp_path)); > > p = node + 3; > > sep = strrchr(tmp_path, '/'); > > -- > > 2.25.0 > > > > > > > > > > _______________________________________________ > > ffmpeg-devel mailing list > > ffmpeg-devel@ffmpeg.org > > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > > > To unsubscribe, visit link above, or email > > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". > > > > -- > Josef Zlomek > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". > > > > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Marton Balint <cus@passwd.hu> 于2020年7月25日周六 下午5:40写道: > > > > On Sat, 25 Jul 2020, Zlomek, Josef wrote: > > > Hi Steven, > > > > It is better but still not correct. Consider this test: > > > > test("http://server/foo/bar", > > "a/b/../c/d/../e../.../..f/g../h../other/url/a.mp3/..."); > > It should give " > > http://server/foo/bar/a/c/e../.../..f/g../h../other/url/a.mp3/...". > > > > I think the best would be to use strtok(p, "/") to split the path into the > > components and for each ".." component remove the previous one (if there > > are some still). > > And I also would like to point out that using static strings with > MAX_URL_SIZE is not OK. This function supports an arbitrary buffer size, > so limiting it to MAX_URL_SIZE is a bug. What about use av_malloc? or bprint? I think use av_malloc maybe easter to me. > > Regards, > Marton > > > Thanks Steven
Nicolas George <george@nsup.org> 于2020年7月25日周六 下午7:03写道: > > Marton Balint (12020-07-25): > > And I also would like to point out that using static strings with > > MAX_URL_SIZE is not OK. This function supports an arbitrary buffer size, so > > limiting it to MAX_URL_SIZE is a bug. > > Excellent point. That would require changing the prototype of a few > functions, but they are internal, we can do it. > > For this, I would like to pitch the AVWriter API once again, because it > is made precisely for these cases: when you need to return a string of > arbitrary length, but would like to avoid, as much a possible, the > overhead of dynamically allocating it and the many error tests that come > with it. It is like AVBPrint, but better. > > Let me elaborate without polluting the discussion by showing the actual > implementation. > > So, let us say that function A calls function B, and function B needs to > return a string of arbitrary length but often small (I say string, but > the API can deal with binary data). > > Likely use cases: rewriting an URL like here; escaping special > characters; serialization to key=value string; serialization to JSON or > XML; character encoding conversion, zlib-style unpacking, etc. > > > First, from the point of view of B, the function that needs to return > the string: > > Instead of either: > > int B(char *buf, size_t buf_size, other arguments); > int B(char **buf, other arguments); > > request an AVWriter as argument: > > void B(AVWriter wr, other arguments); > > Then, just write in it with the various API functions: > av_writer_write(), av_writer_printf(), etc. > > And that's all. Everything is taken care of: the string will grow as > needed, the size will be tracked, error checks are made. Nothing to > worry about at all. > > > Now, from the point of view of A, the function that calls B and gets a > string in return: > > Instead of declaring a buffer, or a pointer to a buffer, declare: > > AVWriter wr = av_dynbuf_writer(); > > and call B with it: B(wr, ...). > > Then, extract the string, check for error (of course: dynamic allocation > can happen and fail; but this error check is the only one necessary), > use the string and free it: > > char *msg = av_dynbuf_writer_get_data(wr, NULL); > if (!msg) > return AVERROR(ENOMEM); > do something with msg > av_dynbuf_writer_reset(wr); > > I concede, for this example, av_dynbuf_writer_get_data() is one more > step than using the pointer directly. But as soon as the code becomes > more complex, in particular as soon as it uses the same writer twice to > put two strings together, this extra step is balanced by the fewer error > checks necessary. > > > All said, it makes one more API to learn, but IMHO, it is very simple to > learn: seven line of sample code. And the benefits are immediate. > > Plus: there are many wonderful features that I have not told you about > yet. > > So, Steven, Marton and the others: presented like that, does it look > appealing? Shall I elaborate? Because language barries. maybe I cannot deep understand, maybe need more code examples. But look interesting, maybe more easy for use. > > Regards, > > -- > Nicolas George > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Steven Liu (12020-07-27): > What about use av_malloc? or bprint? > I think use av_malloc maybe easter to me. For internal APIs, definitely BPrint over naive av_malloc(). But this is unrelated to the issue at hand: the size limit exists in the caller of this function, not in this function. Regards,
On Mon, 27 Jul 2020, Steven Liu wrote: > Marton Balint <cus@passwd.hu> 于2020年7月25日周六 下午5:40写道: >> >> >> >> On Sat, 25 Jul 2020, Zlomek, Josef wrote: >> >> > Hi Steven, >> > >> > It is better but still not correct. Consider this test: >> > >> > test("http://server/foo/bar", >> > "a/b/../c/d/../e../.../..f/g../h../other/url/a.mp3/..."); >> > It should give " >> > http://server/foo/bar/a/c/e../.../..f/g../h../other/url/a.mp3/...". >> > >> > I think the best would be to use strtok(p, "/") to split the path into the >> > components and for each ".." component remove the previous one (if there >> > are some still). >> >> And I also would like to point out that using static strings with >> MAX_URL_SIZE is not OK. This function supports an arbitrary buffer size, >> so limiting it to MAX_URL_SIZE is a bug. > What about use av_malloc? or bprint? > I think use av_malloc maybe easter to me. It should be implemented in a way which does not use a temporary buffer. Seems doable. Regards, Marton
On Sat, 25 Jul 2020, Nicolas George wrote: > Marton Balint (12020-07-25): >> And I also would like to point out that using static strings with >> MAX_URL_SIZE is not OK. This function supports an arbitrary buffer size, so >> limiting it to MAX_URL_SIZE is a bug. > > Excellent point. That would require changing the prototype of a few > functions, but they are internal, we can do it. If I understand correctly, you are speaking about URL handling in general all over the codebase, and not only ff_make_absolute_url. > > For this, I would like to pitch the AVWriter API once again, I think we understand the concept. It just seems a tiny bit heavy weight, or at least that was the primary concern as far as I remember. Regards, Marton > is made precisely for these cases: when you need to return a string of > arbitrary length, but would like to avoid, as much a possible, the > overhead of dynamically allocating it and the many error tests that come > with it. It is like AVBPrint, but better. > > Let me elaborate without polluting the discussion by showing the actual > implementation. > > So, let us say that function A calls function B, and function B needs to > return a string of arbitrary length but often small (I say string, but > the API can deal with binary data). > > Likely use cases: rewriting an URL like here; escaping special > characters; serialization to key=value string; serialization to JSON or > XML; character encoding conversion, zlib-style unpacking, etc. > > > First, from the point of view of B, the function that needs to return > the string: > > Instead of either: > > int B(char *buf, size_t buf_size, other arguments); > int B(char **buf, other arguments); > > request an AVWriter as argument: > > void B(AVWriter wr, other arguments); > > Then, just write in it with the various API functions: > av_writer_write(), av_writer_printf(), etc. > > And that's all. Everything is taken care of: the string will grow as > needed, the size will be tracked, error checks are made. Nothing to > worry about at all. > > > Now, from the point of view of A, the function that calls B and gets a > string in return: > > Instead of declaring a buffer, or a pointer to a buffer, declare: > > AVWriter wr = av_dynbuf_writer(); > > and call B with it: B(wr, ...). > > Then, extract the string, check for error (of course: dynamic allocation > can happen and fail; but this error check is the only one necessary), > use the string and free it: > > char *msg = av_dynbuf_writer_get_data(wr, NULL); > if (!msg) > return AVERROR(ENOMEM); > do something with msg > av_dynbuf_writer_reset(wr); > > I concede, for this example, av_dynbuf_writer_get_data() is one more > step than using the pointer directly. But as soon as the code becomes > more complex, in particular as soon as it uses the same writer twice to > put two strings together, this extra step is balanced by the fewer error > checks necessary. > > > All said, it makes one more API to learn, but IMHO, it is very simple to > learn: seven line of sample code. And the benefits are immediate. > > Plus: there are many wonderful features that I have not told you about > yet. > > So, Steven, Marton and the others: presented like that, does it look > appealing? Shall I elaborate? > > Regards, > > -- > Nicolas George >
Steven Liu (12020-07-27): > Because language barries. maybe I cannot deep understand, maybe need > more code examples. > But look interesting, maybe more easy for use. All right, here is a practical albeit fictional example. Let us say we want av_metadata_to_user_string() that turns an AVDictionary into a string for users, and show_metadata_in_dialog() to print the metadata of all the streams in a file in a GUI dialog. av_metadata_to_user_string() using naïve av_malloc(): char *av_metadata_to_user_string(AVDictionary *meta) { size_t size = 1, len; AVDictionaryEntry *entry; char *str, *cur; entry = NULL; while ((entry = av_dict_get(meta, "", entry, AV_DICT_IGNORE_SUFFIX))) { len = strlen(entry->key); if (len > SIZE_MAX - size) return NULL; size += len; len = strlen(entry->value); if (len > SIZE_MAX - size) return NULL; size += len; if (3 > SIZE_MAX - size) return NULL; size += 3; } str = av_malloc(size); if (!str) return AVERROR(ENOMEM); cur = str; entry = NULL; while ((entry = av_dict_get(meta, "", entry, AV_DICT_IGNORE_SUFFIX))) { cur += av_strlcatf(cur, "%s: %s\n", entry->key, entry->value, str + size - cur); } av_assert0(cur == str + size - 1); return str; } av_metadata_to_user_string() using AVBPrint: void av_metadata_to_user_string(AVWriter wr, AVDictionary *meta) { AVDictionaryEntry *entry; while ((entry = av_dict_get(meta, "", entry, AV_DICT_IGNORE_SUFFIX))) { av_writer_printf(wr, entry->key, entry->value, str + size - cur); } } show_metadata_in_dialog(), without AVBPrint, but relying on Gtk+'s features (I want to show AVWriter is good, it would be dishonest not to use all the tools to make the other side look good): void show_metadata_in_dialog(AVFormatContext *ctx, GtkLabel *label) { GString *str = g_string_new(); unsigned i; for (i = 0; i < ctx->nb_streams; i++) { char *meta = av_metadata_to_user_string(ctx->streams->metadata); if (!meta) fatal_error("out of memory"); g_string_printf("Stream #%d:\n%s\n", i, meta); av_free(meta); } gtk_label_set_text(label, str->str); g_string_free(str); } show_metadata_in_dialog() using AVBPrint: void show_metadata_in_dialog(AVFormatContext *ctx, GtkLabel *label) { AVWriter = av_dynbuf_writer(); unsigned i; for (i = 0; i < ctx->nb_streams; i++) { av_writer_printf(wr, "Stream #%d:\n", i); av_metadata_to_user_string(wr, ctx->streams->metadata); g_string_print(wr, "\n"); } if (av_dynbuf_writer_get_error(wr)) fatal_error("out of memory"); gtk_label_set_text(label, av_dynbuf_writer_get_data(wr, NULL)); av_dynbuf_writer_reset(wr); } Now, imagine if a lot of FFmpeg functions did use the same AVWriter, and therefore worked well together. Also, not visible in the code: unless the metadata is big, the AVWriter version will keep all in the stack. Regards,
Marton Balint (12020-07-27): > If I understand correctly, you are speaking about URL handling in general > all over the codebase, and not only ff_make_absolute_url. I thought that was what you were speaking in the first place. ff_make_absolute_url() only does what the caller wants, the caller decides the size of the buffer, and we can be sure it is shorter than the concatenation of both parts. > I think we understand the concept. It just seems a tiny bit heavy weight, or > at least that was the primary concern as far as I remember. That's because I made the mistake of showing the implementation and talking about the benefits. The implementation is my problem (and whoever would succeed me if I were to stop maintaining it, but if it is accepted, I do not intend to). The benefits are for everybody. Please have a look at the other mail I just sent, where I show the benefits and keep the implementation for myself. Regards,
On Sat, 25 Jul 2020, at 13:03, Nicolas George wrote: > And that's all. Everything is taken care of: the string will grow as > needed, the size will be tracked, error checks are made. Nothing to > worry about at all. How is that different from open_memstream, which is done for this exact purpose? -- Jean-Baptiste Kempf - President +33 672 704 734
On Mon, Jul 27, 2020 at 8:57 PM Marton Balint <cus@passwd.hu> wrote: > It should be implemented in a way which does not use a temporary buffer. > Seems doable. > You are right. I will submit a new version. Best regards, Josef
Jean-Baptiste Kempf (12020-07-28):
> How is that different from open_memstream, which is done for this exact purpose?
I am not sure what you are referring to, I do not find a function by
that name in FFmpeg.
Are you referring to avio_open_dyn_buf()?
If so, there are three differences:
- avio_open_dyn_buf() resides in libavformat, and therefore is only
available for a tiny portion of the project. I had a vague project to
move part of AVIO to lavu, but it is huge work.
- avio_open_dyn_buf() always uses dynamic allocation. AVWriter starts
entirely on the stack (or whatever we want), and only resorts to
dynamic allocation if the buffer becomes too large. As a corollary,
avio_open_dyn_buf() requires an error check at the beginning, which
AVWriter does not.
- AVWriter is polymorphic. I have shown only av_dynbuf_writer(), but
there are others, and applications can implement their own. That is
the key to the "wonderful features" I have evoked. I would be happy to
tell more about them.
Regards,
> On Jul 28, 2020, at 4:05 PM, Nicolas George <george@nsup.org> wrote: > > Jean-Baptiste Kempf (12020-07-28): >> How is that different from open_memstream, which is done for this exact purpose? > > I am not sure what you are referring to, I do not find a function by > that name in FFmpeg. I think jb is referring to FILE *open_memstream(char **bufp, size_t *sizep); https://linux.die.net/man/3/open_memstream <https://linux.die.net/man/3/open_memstream> VLC has a wrapper for it: https://code.videolan.org/videolan/vlc/-/blob/master/include/vlc_memstream.h <https://code.videolan.org/videolan/vlc/-/blob/master/include/vlc_memstream.h> > > Are you referring to avio_open_dyn_buf()? > > If so, there are three differences: > > - avio_open_dyn_buf() resides in libavformat, and therefore is only > available for a tiny portion of the project. I had a vague project to > move part of AVIO to lavu, but it is huge work. > > - avio_open_dyn_buf() always uses dynamic allocation. AVWriter starts > entirely on the stack (or whatever we want), and only resorts to > dynamic allocation if the buffer becomes too large. As a corollary, > avio_open_dyn_buf() requires an error check at the beginning, which > AVWriter does not. > > - AVWriter is polymorphic. I have shown only av_dynbuf_writer(), but > there are others, and applications can implement their own. That is > the key to the "wonderful features" I have evoked. I would be happy to > tell more about them. > > Regards, > > -- > Nicolas George > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
zhilizhao (12020-07-28): > I think jb is referring to > > FILE *open_memstream(char **bufp, size_t *sizep); > https://linux.die.net/man/3/open_memstream <https://linux.die.net/man/3/open_memstream> > > VLC has a wrapper for it: > > https://code.videolan.org/videolan/vlc/-/blob/master/include/vlc_memstream.h <https://code.videolan.org/videolan/vlc/-/blob/master/include/vlc_memstream.h> Oh. Thanks. I had not realized such an useful function had been standardized. That means the argument about it being in lavf is wotrhless. But these two arguments: > > - avio_open_dyn_buf() always uses dynamic allocation. AVWriter starts > > entirely on the stack (or whatever we want), and only resorts to > > dynamic allocation if the buffer becomes too large. As a corollary, > > avio_open_dyn_buf() requires an error check at the beginning, which > > AVWriter does not. > > > > - AVWriter is polymorphic. I have shown only av_dynbuf_writer(), but > > there are others, and applications can implement their own. That is > > the key to the "wonderful features" I have evoked. I would be happy to > > tell more about them. apply to open_memstream() too. Please ask me about the wonderful features ;) Regards,
> On Jul 28, 2020, at 8:02 PM, Nicolas George <george@nsup.org> wrote: > > zhilizhao (12020-07-28): >> I think jb is referring to >> >> FILE *open_memstream(char **bufp, size_t *sizep); >> https://linux.die.net/man/3/open_memstream <https://linux.die.net/man/3/open_memstream> >> >> VLC has a wrapper for it: >> >> https://code.videolan.org/videolan/vlc/-/blob/master/include/vlc_memstream.h <https://code.videolan.org/videolan/vlc/-/blob/master/include/vlc_memstream.h> > > Oh. Thanks. I had not realized such an useful function had been > standardized. > > That means the argument about it being in lavf is wotrhless. But these > two arguments: > >>> - avio_open_dyn_buf() always uses dynamic allocation. AVWriter starts >>> entirely on the stack (or whatever we want), and only resorts to >>> dynamic allocation if the buffer becomes too large. As a corollary, >>> avio_open_dyn_buf() requires an error check at the beginning, which >>> AVWriter does not. fmemopen() can work with memory on stack or on heap, although it doesn’t support switch between the two dynamically. It’s a good to have feature like the small string optimization in std::string, but i’m not too keen on it. >>> >>> - AVWriter is polymorphic. I have shown only av_dynbuf_writer(), but >>> there are others, and applications can implement their own. That is >>> the key to the "wonderful features" I have evoked. I would be happy to >>> tell more about them. > > apply to open_memstream() too. > > Please ask me about the wonderful features ;) Please, what wonderful features are in your plan? > > Regards, > > -- > Nicolas George > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
On 2020-07-28 20:43 +0800, zhilizhao wrote: > > On Jul 28, 2020, at 8:02 PM, Nicolas George <george@nsup.org> wrote: > > > > zhilizhao (12020-07-28): > >> I think jb is referring to > >> > >> FILE *open_memstream(char **bufp, size_t *sizep); > >> https://linux.die.net/man/3/open_memstream <https://linux.die.net/man/3/open_memstream> > >> > >> VLC has a wrapper for it: > >> > >> https://code.videolan.org/videolan/vlc/-/blob/master/include/vlc_memstream.h <https://code.videolan.org/videolan/vlc/-/blob/master/include/vlc_memstream.h> > > > > Oh. Thanks. I had not realized such an useful function had been > > standardized. > > > > That means the argument about it being in lavf is wotrhless. But these > > two arguments: > > > >>> - avio_open_dyn_buf() always uses dynamic allocation. AVWriter starts > >>> entirely on the stack (or whatever we want), and only resorts to > >>> dynamic allocation if the buffer becomes too large. As a corollary, > >>> avio_open_dyn_buf() requires an error check at the beginning, which > >>> AVWriter does not. > > fmemopen() can work with memory on stack or on heap, although it doesn’t > support switch between the two dynamically. It’s a good to have feature like > the small string optimization in std::string, but i’m not too keen on it. > > >>> > >>> - AVWriter is polymorphic. I have shown only av_dynbuf_writer(), but > >>> there are others, and applications can implement their own. That is > >>> the key to the "wonderful features" I have evoked. I would be happy to > >>> tell more about them. > > > > apply to open_memstream() too. > > > > Please ask me about the wonderful features ;) > > Please, what wonderful features are in your plan? I would say FILE * is polymorphic as well. Maybe we are not in control to provide more implementations, the question still would be which implementations we would want. Alexander
zhilizhao (12020-07-28): > fmemopen() can work with memory on stack or on heap, although it doesn’t > support switch between the two dynamically. It’s a good to have feature like > the small string optimization in std::string, but i’m not too keen on it. I think it is an important optimization. This kind of string operation can happen once per frame, and with low-latency codecs or devices, once per frame can be hundreds or thousand of times per second. Note that we are actively trying to prevent memory allocations that can happen once per frame, to the point of adding pools to avoid them. This optimization is both much simpler and a bit more efficient than using pools. And even for cases that are not once per frame, this is interesting: because this optimization is there, we do not have to hesitate to use this API even if we know that the string is short and a fixed buffer suffices. This is interesting because the more parts of FFmpeg use this API, the more convenient it becomes. > Please, what wonderful features are in your plan? ;) I may have exaggerated a little bit about wonderful: anything is possible with normal string buffers anyway. But AVWriter allows nice optimizations, avoiding many intermediary buffers. A few examples: If you want to send something directly to a file or the network, you can have av_avio_writer() that does that with an AVIO context. If the thing is big but constructed by parts, we avoid needing a big buffer, and serialization and writing can happen in parallel. If we have to pile string processing on top of each other, for example first serializing something and then escaping special characters in the resulting string, AVWriter can be made to do it on the fly, without an intermediate buffer in the middle. If an application works with other kinds of memory allocation, for example a Java application using StringBuffer and such, AVWriter can be made to use directly these instead of first using av_malloc() for a buffer and then converting it to what the application needs. I think that gives a good idea of what is possible. Most importantly, none of these feature makes anything more complicated in our code: using AVWriter is still as simple as I have shown earlier ( https://ffmpeg.org/pipermail/ffmpeg-devel/2020-July/266897.html ). Everything complex is in the implementation, and it not that complex. Regards,
> On Jul 30, 2020, at 11:42 PM, Nicolas George <george@nsup.org> wrote: > > zhilizhao (12020-07-28): >> fmemopen() can work with memory on stack or on heap, although it doesn’t >> support switch between the two dynamically. It’s a good to have feature like >> the small string optimization in std::string, but i’m not too keen on it. > > I think it is an important optimization. This kind of string operation > can happen once per frame, and with low-latency codecs or devices, once > per frame can be hundreds or thousand of times per second. > > Note that we are actively trying to prevent memory allocations that can > happen once per frame, to the point of adding pools to avoid them. This > optimization is both much simpler and a bit more efficient than using > pools. > > And even for cases that are not once per frame, this is interesting: > because this optimization is there, we do not have to hesitate to use > this API even if we know that the string is short and a fixed buffer > suffices. This is interesting because the more parts of FFmpeg use this > API, the more convenient it becomes. > >> Please, what wonderful features are in your plan? > > ;) I may have exaggerated a little bit about wonderful: anything is > possible with normal string buffers anyway. But AVWriter allows nice > optimizations, avoiding many intermediary buffers. > > A few examples: > > If you want to send something directly to a file or the network, you can > have av_avio_writer() that does that with an AVIO context. If the thing > is big but constructed by parts, we avoid needing a big buffer, and > serialization and writing can happen in parallel. > > If we have to pile string processing on top of each other, for example > first serializing something and then escaping special characters in the > resulting string, AVWriter can be made to do it on the fly, without an > intermediate buffer in the middle. > > If an application works with other kinds of memory allocation, for > example a Java application using StringBuffer and such, AVWriter can be > made to use directly these instead of first using av_malloc() for a > buffer and then converting it to what the application needs. > > I think that gives a good idea of what is possible. Most importantly, > none of these feature makes anything more complicated in our code: using > AVWriter is still as simple as I have shown earlier ( > https://ffmpeg.org/pipermail/ffmpeg-devel/2020-July/266897.html > ). Everything complex is in the implementation, and it not that complex. > So we can get a writer with AVIO context as backend, with custom stream filters. It do sound interesting, as long as it doesn't go too far like io in Java and C++. It's valuable to support user allocated memory, however, I'm afraid string is only a small part of the use cases. > Regards, > > -- > Nicolas George > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Zhao Zhili (12020-07-31): > So we can get a writer with AVIO context as backend, with custom stream filters. > It do sound interesting, as long as it doesn't go too far like io in Java and C++. The maze of twisty little APIs all alike is one of the reasons I try to minimize my interactions with C++ and Java. In particular, I do not plan on any kind of built-in inheritance, which is the doorway to this nightmare world. > It's valuable to support user allocated memory, however, I'm afraid string is only > a small part of the use cases. I can reassure you: I wrote string, but everything is designed to be completely 8-bit clean, and can be used for any kind of binary data. Regards,
diff --git a/libavformat/url.c b/libavformat/url.c index 20463a6674..35f27fe3ca 100644 --- a/libavformat/url.c +++ b/libavformat/url.c @@ -97,6 +97,18 @@ static void trim_double_dot_url(char *buf, const char *rel, int size) /* set new current position if the root node is changed */ p = root; while (p && (node = strstr(p, ".."))) { + if (strlen(node) > 2 && node[2] != '/') { + node = strstr(node + 1, ".."); + if (!node) + break; + } + + if (p != node && p[node - p - 1] != '/') { + node = strstr(node + 1, ".."); + if (!node) + break; + } + av_strlcat(tmp_path, p, node - p + strlen(tmp_path)); p = node + 3; sep = strrchr(tmp_path, '/');
fix ticket: 8814 if get ".." in the url, check next byte and lead byte by double dot, it there have no '/' and not root node, it is not used go to directory ".." Signed-off-by: Steven Liu <lq@chinaffmpeg.org> --- libavformat/url.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)