diff mbox series

[FFmpeg-devel] http: support retry on connection error

Message ID AM0PR04MB6467B92899C0EDF919C4447DF5180@AM0PR04MB6467.eurprd04.prod.outlook.com
State New
Headers show
Series [FFmpeg-devel] http: support retry on connection error
Related show

Checks

Context Check Description
andriy/x86_make success Make finished
andriy/x86_make_fate success Make fate finished
andriy/PPC64_make success Make finished
andriy/PPC64_make_fate success Make fate finished

Commit Message

Eran Kornblau Oct. 25, 2020, 1:40 p.m. UTC
Hi,

This patch adds 2 options to http:
- reconnect_on_status - a list of http status codes that should be retried. the list can contain explicit status codes or the strings 4xx/5xx.
- reconnect_on_err - reconnects on arbitrary errors during connect, e.g. ECONNRESET/ETIMEDOUT.

The retry employs the same exponential backoff logic as the existing reconnect/reconnect_at_eof flags.

Related tickets:
https://trac.ffmpeg.org/ticket/6066
https://trac.ffmpeg.org/ticket/7768

Thanks!

Eran
Subject: [PATCH] http: support retry on connection error

added 2 new options:
- reconnect_on_status - a list of http status codes that should be
    retried. the list can contain explicit status codes / the strings
    4xx/5xx.
- reconnect_on_err - reconnects on arbitrary errors during connect, e.g.
    ECONNRESET/ETIMEDOUT

the retry employs the same exponential backoff logic as the existing
reconnect/reconnect_at_eof flags.

related tickets:
https://trac.ffmpeg.org/ticket/6066
https://trac.ffmpeg.org/ticket/7768
---
 libavformat/http.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 60 insertions(+), 2 deletions(-)

Comments

Eran Kornblau Nov. 2, 2020, 6:49 a.m. UTC | #1
Ping

Thanks,
Eran

From: Eran Kornblau
Sent: Sunday, October 25, 2020 3:40 PM
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: [PATCH] http: support retry on connection error

Hi,

This patch adds 2 options to http:
- reconnect_on_status - a list of http status codes that should be retried. the list can contain explicit status codes or the strings 4xx/5xx.
- reconnect_on_err - reconnects on arbitrary errors during connect, e.g. ECONNRESET/ETIMEDOUT.

The retry employs the same exponential backoff logic as the existing reconnect/reconnect_at_eof flags.

Related tickets:
https://trac.ffmpeg.org/ticket/6066
https://trac.ffmpeg.org/ticket/7768

Thanks!

Eran
Eran Kornblau Nov. 8, 2020, 6:58 a.m. UTC | #2
Pinging again...

Thanks,
Eran

From: Eran Kornblau
Sent: Sunday, October 25, 2020 3:40 PM
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org<mailto:ffmpeg-devel@ffmpeg.org>>
Subject: [PATCH] http: support retry on connection error

Hi,

This patch adds 2 options to http:
- reconnect_on_status - a list of http status codes that should be retried. the list can contain explicit status codes or the strings 4xx/5xx.
- reconnect_on_err - reconnects on arbitrary errors during connect, e.g. ECONNRESET/ETIMEDOUT.

The retry employs the same exponential backoff logic as the existing reconnect/reconnect_at_eof flags.

Related tickets:
https://trac.ffmpeg.org/ticket/6066
https://trac.ffmpeg.org/ticket/7768

Thanks!

Eran
Eran Kornblau Nov. 17, 2020, 8:52 p.m. UTC | #3
Another ping... would be great to have some feedback on this...

Eran

From: Eran Kornblau
Sent: Sunday, November 8, 2020 8:59 AM
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: RE: [PATCH] http: support retry on connection error

Pinging again...

Thanks,
Eran

From: Eran Kornblau
Sent: Sunday, October 25, 2020 3:40 PM
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org<mailto:ffmpeg-devel@ffmpeg.org>>
Subject: [PATCH] http: support retry on connection error

Hi,

This patch adds 2 options to http:
- reconnect_on_status - a list of http status codes that should be retried. the list can contain explicit status codes or the strings 4xx/5xx.
- reconnect_on_err - reconnects on arbitrary errors during connect, e.g. ECONNRESET/ETIMEDOUT.

The retry employs the same exponential backoff logic as the existing reconnect/reconnect_at_eof flags.

Related tickets:
https://trac.ffmpeg.org/ticket/6066
https://trac.ffmpeg.org/ticket/7768

Thanks!

Eran
Marton Balint Nov. 18, 2020, 8:46 p.m. UTC | #4
On Tue, 17 Nov 2020, Eran Kornblau wrote:

> Hi,
>
> This patch adds 2 options to http:
> - reconnect_on_status - a list of http status codes that should be retried. the list can contain explicit status codes or the strings 4xx/5xx.
> - reconnect_on_err - reconnects on arbitrary errors during connect, e.g. ECONNRESET/ETIMEDOUT.
>
> The retry employs the same exponential backoff logic as the existing reconnect/reconnect_at_eof flags.
>
> Related tickets:
> https://trac.ffmpeg.org/ticket/6066
> https://trac.ffmpeg.org/ticket/7768

> From f412ca316f26f8c52d9763a33703ab06134feb37 Mon Sep 17 00:00:00 2001
> From: erankor <eran.kornblau@kaltura.com>
> Date: Sun, 25 Oct 2020 15:25:13 +0200
> Subject: [PATCH] http: support retry on connection error
> 
> added 2 new options:
> - reconnect_on_status - a list of http status codes that should be
>     retried. the list can contain explicit status codes / the strings
>     4xx/5xx.

Maybe better name this option reconnect_on_http_error. Also this is a 
flags-like option, so use AV_OPT_TYPE_FLAGS with 4xx and 5xx as flags.


> - reconnect_on_err - reconnects on arbitrary errors during connect, e.g.
>     ECONNRESET/ETIMEDOUT

Maybe reconnect_on_network_error better reflects that this reconnects on 
underlying protocol (TCP/TLS) error.

Anyhow, the new options should be added to docs/protocols.texi.

> 
> the retry employs the same exponential backoff logic as the existing
> reconnect/reconnect_at_eof flags.
> 
> related tickets:
> https://trac.ffmpeg.org/ticket/6066
> https://trac.ffmpeg.org/ticket/7768
> ---
>  libavformat/http.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 60 insertions(+), 2 deletions(-)
> 
> diff --git a/libavformat/http.c b/libavformat/http.c
> index 3d25d652d3..ea14ef0c47 100644
> --- a/libavformat/http.c
> +++ b/libavformat/http.c
> @@ -119,8 +119,10 @@ typedef struct HTTPContext {
>      char *method;
>      int reconnect;
>      int reconnect_at_eof;
> +    int reconnect_on_err;
>      int reconnect_streamed;
>      int reconnect_delay_max;
> +    char *reconnect_on_status;
>      int listen;
>      char *resource;
>      int reply_code;
> @@ -164,6 +166,8 @@ static const AVOption options[] = {
>      { "method", "Override the HTTP method or set the expected HTTP method from a client", OFFSET(method), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, D | E },
>      { "reconnect", "auto reconnect after disconnect before EOF", OFFSET(reconnect), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, D },
>      { "reconnect_at_eof", "auto reconnect at EOF", OFFSET(reconnect_at_eof), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, D },
> +    { "reconnect_on_err", "auto reconnect in case of tcp/tls error during connect", OFFSET(reconnect_on_err), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, D },
> +    { "reconnect_on_status", "list of http status codes to reconnect on. the list can include specific status codes / 4xx / 5xx", OFFSET(reconnect_on_status), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, D },

AV_OPT_TYPE_FLAGS, as I explained above.

>      { "reconnect_streamed", "auto reconnect streamed / non seekable streams", OFFSET(reconnect_streamed), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, D },
>      { "reconnect_delay_max", "max reconnect delay in seconds after which to give up", OFFSET(reconnect_delay_max), AV_OPT_TYPE_INT, { .i64 = 120 }, 0, UINT_MAX/1000/1000, D },
>      { "listen", "listen on HTTP", OFFSET(listen), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 2, D | E },
> @@ -258,21 +262,75 @@ static int http_open_cnx_internal(URLContext *h, AVDictionary **options)
>      return location_changed;
>  }
> 
> +static int http_should_reconnect(HTTPContext *s, int err)
> +{
> +    const char *status_group;
> +    char http_code[4];
> +
> +    switch (err) {
> +    case AVERROR_HTTP_BAD_REQUEST:
> +    case AVERROR_HTTP_UNAUTHORIZED:
> +    case AVERROR_HTTP_FORBIDDEN:
> +    case AVERROR_HTTP_NOT_FOUND:
> +    case AVERROR_HTTP_OTHER_4XX:
> +        status_group = "4xx";
> +        break;
> +
> +    case AVERROR_HTTP_SERVER_ERROR:
> +        status_group = "5xx";
> +        break;
> +
> +    default:
> +        return s->reconnect_on_err;
> +    }
> +
> +    if (!s->reconnect_on_status) {
> +        return 0;
> +    }
> +
> +    if (av_match_list(status_group, s->reconnect_on_status, ',') > 0) {
> +        return 1;
> +    }
> +
> +    snprintf(http_code, sizeof(http_code), "%d", s->http_code);
> +
> +    return av_match_list(http_code, s->reconnect_on_status, ',') > 0;
> +}
> +
>  /* return non zero if error */
>  static int http_open_cnx(URLContext *h, AVDictionary **options)
>  {
>      HTTPAuthType cur_auth_type, cur_proxy_auth_type;
>      HTTPContext *s = h->priv_data;
>      int location_changed, attempts = 0, redirects = 0;
> +    int reconnect_delay = 0;
> +    uint64_t off;
> +
>  redo:
>      av_dict_copy(options, s->chained_options, 0);
>
>      cur_auth_type       = s->auth_state.auth_type;
>      cur_proxy_auth_type = s->auth_state.auth_type;
> 
> +    off = s->off;
>      location_changed = http_open_cnx_internal(h, options);

Are you sure you get AVERROR_HTTP_* here? Also if you follow the code 
there is special handling of 401/407, that should be done first before 
your checks, no?

> -    if (location_changed < 0)
> -        goto fail;
> +    if (location_changed < 0) {
> +        if (!http_should_reconnect(s, location_changed) ||
> +            reconnect_delay > s->reconnect_delay_max)
> +            goto fail;
> +
> +        av_log(h, AV_LOG_WARNING, "Will reconnect at %"PRIu64" in %d second(s).\n", off, reconnect_delay);
> +        location_changed = ff_network_sleep_interruptible(1000U * 1000 * reconnect_delay, &h->interrupt_callback);
> +        if (location_changed != AVERROR(ETIMEDOUT))
> +            goto fail;
> +        reconnect_delay = 1 + 2 * reconnect_delay;
> +
> +        /* restore the offset (http_connect resets it) */
> +        s->off = off;
> +
> +        ffurl_closep(&s->hd);
> +        goto redo;
> +    }
>
>      attempts++;
>      if (s->http_code == 401) {

Another comment is that if you want to reconnect based on errors 
classified as 4xx or 5xx, then probably you should revisit 
ff_http_averror(400, xxx) calls and check if they are indeed caused by 
client behaviour. Because if the server is violating the protocol, then 
they should be ff_http_averror(500, xxx) IMHO.

Regards,
Marton
diff mbox series

Patch

diff --git a/libavformat/http.c b/libavformat/http.c
index 3d25d652d3..ea14ef0c47 100644
--- a/libavformat/http.c
+++ b/libavformat/http.c
@@ -119,8 +119,10 @@  typedef struct HTTPContext {
     char *method;
     int reconnect;
     int reconnect_at_eof;
+    int reconnect_on_err;
     int reconnect_streamed;
     int reconnect_delay_max;
+    char *reconnect_on_status;
     int listen;
     char *resource;
     int reply_code;
@@ -164,6 +166,8 @@  static const AVOption options[] = {
     { "method", "Override the HTTP method or set the expected HTTP method from a client", OFFSET(method), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, D | E },
     { "reconnect", "auto reconnect after disconnect before EOF", OFFSET(reconnect), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, D },
     { "reconnect_at_eof", "auto reconnect at EOF", OFFSET(reconnect_at_eof), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, D },
+    { "reconnect_on_err", "auto reconnect in case of tcp/tls error during connect", OFFSET(reconnect_on_err), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, D },
+    { "reconnect_on_status", "list of http status codes to reconnect on. the list can include specific status codes / 4xx / 5xx", OFFSET(reconnect_on_status), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, D },
     { "reconnect_streamed", "auto reconnect streamed / non seekable streams", OFFSET(reconnect_streamed), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, D },
     { "reconnect_delay_max", "max reconnect delay in seconds after which to give up", OFFSET(reconnect_delay_max), AV_OPT_TYPE_INT, { .i64 = 120 }, 0, UINT_MAX/1000/1000, D },
     { "listen", "listen on HTTP", OFFSET(listen), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 2, D | E },
@@ -258,21 +262,75 @@  static int http_open_cnx_internal(URLContext *h, AVDictionary **options)
     return location_changed;
 }
 
+static int http_should_reconnect(HTTPContext *s, int err)
+{
+    const char *status_group;
+    char http_code[4];
+
+    switch (err) {
+    case AVERROR_HTTP_BAD_REQUEST:
+    case AVERROR_HTTP_UNAUTHORIZED:
+    case AVERROR_HTTP_FORBIDDEN:
+    case AVERROR_HTTP_NOT_FOUND:
+    case AVERROR_HTTP_OTHER_4XX:
+        status_group = "4xx";
+        break;
+
+    case AVERROR_HTTP_SERVER_ERROR:
+        status_group = "5xx";
+        break;
+
+    default:
+        return s->reconnect_on_err;
+    }
+
+    if (!s->reconnect_on_status) {
+        return 0;
+    }
+
+    if (av_match_list(status_group, s->reconnect_on_status, ',') > 0) {
+        return 1;
+    }
+
+    snprintf(http_code, sizeof(http_code), "%d", s->http_code);
+
+    return av_match_list(http_code, s->reconnect_on_status, ',') > 0;
+}
+
 /* return non zero if error */
 static int http_open_cnx(URLContext *h, AVDictionary **options)
 {
     HTTPAuthType cur_auth_type, cur_proxy_auth_type;
     HTTPContext *s = h->priv_data;
     int location_changed, attempts = 0, redirects = 0;
+    int reconnect_delay = 0;
+    uint64_t off;
+
 redo:
     av_dict_copy(options, s->chained_options, 0);
 
     cur_auth_type       = s->auth_state.auth_type;
     cur_proxy_auth_type = s->auth_state.auth_type;
 
+    off = s->off;
     location_changed = http_open_cnx_internal(h, options);
-    if (location_changed < 0)
-        goto fail;
+    if (location_changed < 0) {
+        if (!http_should_reconnect(s, location_changed) ||
+            reconnect_delay > s->reconnect_delay_max)
+            goto fail;
+
+        av_log(h, AV_LOG_WARNING, "Will reconnect at %"PRIu64" in %d second(s).\n", off, reconnect_delay);
+        location_changed = ff_network_sleep_interruptible(1000U * 1000 * reconnect_delay, &h->interrupt_callback);
+        if (location_changed != AVERROR(ETIMEDOUT))
+            goto fail;
+        reconnect_delay = 1 + 2 * reconnect_delay;
+
+        /* restore the offset (http_connect resets it) */
+        s->off = off;
+
+        ffurl_closep(&s->hd);
+        goto redo;
+    }
 
     attempts++;
     if (s->http_code == 401) {