From patchwork Thu Jul 30 15:10:08 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nicolas George
X-Patchwork-Id: 21385
From: Nicolas George
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 30 Jul 2020 17:10:08 +0200
Message-Id: <20200730151009.118835-1-george@nsup.org>
X-Mailer: git-send-email 2.27.0
Subject: [FFmpeg-devel] [PATCH 1/2] lavf/url: add ff_url_decompose().
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"

Signed-off-by: Nicolas George
---
 libavformat/tests/url.c | 34 +++++++++++++++++++
 libavformat/url.c       | 74 +++++++++++++++++++++++++++++++++++++++++
 libavformat/url.h       | 41 +++++++++++++++++++++++
 tests/ref/fate/url      | 45 +++++++++++++++++++++++++
 4 files changed, 194 insertions(+)

I chose to keep only pointers to the beginnings despite Marton's
suggestion because I find it plays better with reconstructing the URL
afterwards. The property that each char of the string belongs to one and
only one part is a good invariant, and the delimiting characters are
clearly documented and make it possible to check whether a field is
present.

Compared with the previous iteration, I added a few macros and the
handling of NULL.
diff --git a/libavformat/tests/url.c b/libavformat/tests/url.c
index 1d961a1b43..e7d259ab7d 100644
--- a/libavformat/tests/url.c
+++ b/libavformat/tests/url.c
@@ -21,6 +21,31 @@
 #include "libavformat/url.h"
 #include "libavformat/avformat.h"
 
+static void test_decompose(const char *url)
+{
+    URLComponents uc;
+    int len, ret;
+
+    printf("%s =>\n", url);
+    ret = ff_url_decompose(&uc, url, NULL);
+    if (ret < 0) {
+        printf(" error: %s\n", av_err2str(ret));
+        return;
+    }
+#define PRINT_COMPONENT(comp) \
+    len = uc.url_component_end_##comp - uc.comp; \
+    if (len) printf(" "#comp": %.*s\n", len, uc.comp);
+    PRINT_COMPONENT(scheme);
+    PRINT_COMPONENT(authority);
+    PRINT_COMPONENT(userinfo);
+    PRINT_COMPONENT(host);
+    PRINT_COMPONENT(port);
+    PRINT_COMPONENT(path);
+    PRINT_COMPONENT(query);
+    PRINT_COMPONENT(fragment);
+    printf("\n");
+}
+
 static void test(const char *base, const char *rel)
 {
     char buf[200], buf2[200];
@@ -51,6 +76,15 @@ static void test2(const char *url)
 
 int main(void)
 {
+    printf("Testing ff_url_decompose:\n\n");
+    test_decompose("http://user:pass@ffmpeg:8080/dir/file?query#fragment");
+    test_decompose("http://ffmpeg/dir/file");
+    test_decompose("file:///dev/null");
+    test_decompose("file:/dev/null");
+    test_decompose("http://[::1]/dev/null");
+    test_decompose("http://[::1]:8080/dev/null");
+    test_decompose("//ffmpeg/dev/null");
+
     printf("Testing ff_make_absolute_url:\n");
     test(NULL, "baz");
     test("/foo/bar", "baz");
diff --git a/libavformat/url.c b/libavformat/url.c
index 20463a6674..26aaab4019 100644
--- a/libavformat/url.c
+++ b/libavformat/url.c
@@ -78,6 +78,80 @@ int ff_url_join(char *str, int size, const char *proto,
     return strlen(str);
 }
 
+static const char *find_delim(const char *delim, const char *cur, const char *end)
+{
+    while (cur < end && !strchr(delim, *cur))
+        cur++;
+    return cur;
+}
+
+int ff_url_decompose(URLComponents *uc, const char *url, const char *end)
+{
+    const char *cur, *aend, *p;
+
+    if (!url) {
+        URLComponents nul = { 0 };
+        *uc = nul;
+        return 0;
+    }
+    if (!end)
+        end = url + strlen(url);
+    cur = uc->url = url;
+
+    /* scheme */
+    uc->scheme = cur;
+    p = find_delim(":/", cur, end); /* lavf "schemes" can contain options */
+    if (*p == ':')
+        cur = p + 1;
+
+    /* authority */
+    uc->authority = cur;
+    if (end - cur >= 2 && cur[0] == '/' && cur[1] == '/') {
+        cur += 2;
+        aend = find_delim("/?#", cur, end);
+
+        /* userinfo */
+        uc->userinfo = cur;
+        p = find_delim("@", cur, aend);
+        if (*p == '@')
+            cur = p + 1;
+
+        /* host */
+        uc->host = cur;
+        if (*cur == '[') { /* hello IPv6, thanks for using colons! */
+            p = find_delim("]", cur, aend);
+            if (*p != ']')
+                return AVERROR(EINVAL);
+            if (p + 1 < aend && p[1] != ':')
+                return AVERROR(EINVAL);
+            cur = p + 1;
+        } else {
+            cur = find_delim(":", cur, aend);
+        }
+
+        /* port */
+        uc->port = cur;
+        cur = aend;
+    } else {
+        uc->userinfo = uc->host = uc->port = cur;
+    }
+
+    /* path */
+    uc->path = cur;
+    cur = find_delim("?#", cur, end);
+
+    /* query */
+    uc->query = cur;
+    if (*cur == '?')
+        cur = find_delim("#", cur, end);
+
+    /* fragment */
+    uc->fragment = cur;
+
+    uc->end = end;
+    return 0;
+}
+
 static void trim_double_dot_url(char *buf, const char *rel, int size)
 {
     const char *p = rel;
diff --git a/libavformat/url.h b/libavformat/url.h
index de0d30aca0..ae27da5c73 100644
--- a/libavformat/url.h
+++ b/libavformat/url.h
@@ -344,4 +344,45 @@ const AVClass *ff_urlcontext_child_class_iterate(void **iter);
 const URLProtocol **ffurl_get_protocols(const char *whitelist,
                                         const char *blacklist);
 
+typedef struct URLComponents {
+    const char *url;        /**< whole URL, for reference */
+    const char *scheme;     /**< possibly including lavf-specific options */
+    const char *authority;  /**< "//" if it is a real URL */
+    const char *userinfo;   /**< including final '@' if present */
+    const char *host;
+    const char *port;       /**< including initial ':' if present */
+    const char *path;
+    const char *query;      /**< including initial '?' if present */
+    const char *fragment;   /**< including initial '#' if present */
+    const char *end;
+} URLComponents;
+
+#define url_component_end_scheme         authority
+#define url_component_end_authority      userinfo
+#define url_component_end_userinfo       host
+#define url_component_end_host           port
+#define url_component_end_port           path
+#define url_component_end_path           query
+#define url_component_end_query          fragment
+#define url_component_end_fragment       end
+#define url_component_end_authority_full path
+
+#define URL_COMPONENT_HAVE(uc, component) \
+    ((uc).url_component_end_##component > (uc).component)
+
+/**
+ * Parse an URL to find the components.
+ *
+ * Each component runs until the start of the next component,
+ * possibly including a mandatory delimiter.
+ *
+ * @param uc   structure to fill with pointers to the components.
+ * @param url  URL to parse.
+ * @param end  end of the URL, or NULL to parse to the end of string.
+ *
+ * @return  >= 0 for success or an AVERROR code, especially if the URL is
+ *          malformed.
+ */
+int ff_url_decompose(URLComponents *uc, const char *url, const char *end);
+
 #endif /* AVFORMAT_URL_H */
diff --git a/tests/ref/fate/url b/tests/ref/fate/url
index 533ba2cb1e..84cf85abdd 100644
--- a/tests/ref/fate/url
+++ b/tests/ref/fate/url
@@ -1,3 +1,48 @@
+Testing ff_url_decompose:
+
+http://user:pass@ffmpeg:8080/dir/file?query#fragment =>
+ scheme: http:
+ authority: //
+ userinfo: user:pass@
+ host: ffmpeg
+ port: :8080
+ path: /dir/file
+ query: ?query
+ fragment: #fragment
+
+http://ffmpeg/dir/file =>
+ scheme: http:
+ authority: //
+ host: ffmpeg
+ path: /dir/file
+
+file:///dev/null =>
+ scheme: file:
+ authority: //
+ path: /dev/null
+
+file:/dev/null =>
+ scheme: file:
+ path: /dev/null
+
+http://[::1]/dev/null =>
+ scheme: http:
+ authority: //
+ host: [::1]
+ path: /dev/null
+
+http://[::1]:8080/dev/null =>
+ scheme: http:
+ authority: //
+ host: [::1]
+ port: :8080
+ path: /dev/null
+
+//ffmpeg/dev/null =>
+ authority: //
+ host: ffmpeg
+ path: /dev/null
+
 Testing ff_make_absolute_url:
 (null) baz => baz
 /foo/bar baz => /foo/baz