From patchwork Fri Jan 26 13:04:27 2024
X-Patchwork-Submitter: Martin Storsjö
X-Patchwork-Id: 45841
From: Martin Storsjö
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 26 Jan 2024 15:04:27 +0200
Message-Id: <20240126130427.2159537-1-martin@martin.st>
X-Mailer: git-send-email 2.34.1
Subject: [FFmpeg-devel] [PATCH] x86: Remove inline MMX assembly that
 clobbers the FPU state

These inline implementations of AV_COPY64, AV_SWAP64 and AV_ZERO64
are known to clobber the FPU state - which has to be restored
with the 'emms' instruction afterwards.

This was known and signaled with the FF_COPY_SWAP_ZERO_USES_MMX
define, which calling code seems to have been supposed to check,
in order to call emms_c() after using them. See
0b1972d4096df5879038f0af776f87f41e90ebd4,
29c4c0886d143790fcbeddbe40a23dfc6f56345c and
df215e575850e41b19aeb1fd99e53372a6b3d537 for history on earlier
fixes in the same area.

However, new code can use these AV_*64() macros without knowing
about the need to call emms_c().

Just get rid of these dangerous inline assembly snippets; this
doesn't make any difference for 64 bit architectures anyway.

Signed-off-by: Martin Storsjö
---
 libavcodec/dca_core.c        | 16 ----------------
 libavutil/x86/intreadwrite.h | 36 ------------------------------------
 2 files changed, 52 deletions(-)

diff --git a/libavcodec/dca_core.c b/libavcodec/dca_core.c
index 60508fabb9..5dd727fc72 100644
--- a/libavcodec/dca_core.c
+++ b/libavcodec/dca_core.c
@@ -770,10 +770,6 @@ static void erase_adpcm_history(DCACoreDecoder *s)
     for (ch = 0; ch < DCA_CHANNELS; ch++)
         for (band = 0; band < DCA_SUBBANDS; band++)
             AV_ZERO128(s->subband_samples[ch][band] - DCA_ADPCM_COEFFS);
-
-#ifdef FF_COPY_SWAP_ZERO_USES_MMX
-    emms_c();
-#endif
 }
 
 static int alloc_sample_buffer(DCACoreDecoder *s)
@@ -837,10 +833,6 @@ static int parse_frame_data(DCACoreDecoder *s, enum HeaderType header, int xch_b
         }
     }
 
-#ifdef FF_COPY_SWAP_ZERO_USES_MMX
-    emms_c();
-#endif
-
     return 0;
 }
 
@@ -1283,10 +1275,6 @@ static void erase_x96_adpcm_history(DCACoreDecoder *s)
     for (ch = 0; ch < DCA_CHANNELS; ch++)
         for (band = 0; band < DCA_SUBBANDS_X96; band++)
             AV_ZERO128(s->x96_subband_samples[ch][band] - DCA_ADPCM_COEFFS);
-
-#ifdef FF_COPY_SWAP_ZERO_USES_MMX
-    emms_c();
-#endif
 }
 
 static int alloc_x96_sample_buffer(DCACoreDecoder *s)
@@ -1516,10 +1504,6 @@ static int parse_x96_frame_data(DCACoreDecoder *s, int exss, int xch_base)
         }
     }
 
-#ifdef FF_COPY_SWAP_ZERO_USES_MMX
-    emms_c();
-#endif
-
     return 0;
 }
 
diff --git a/libavutil/x86/intreadwrite.h b/libavutil/x86/intreadwrite.h
index 40f375b013..5e57d6a8cd 100644
--- a/libavutil/x86/intreadwrite.h
+++ b/libavutil/x86/intreadwrite.h
@@ -27,42 +27,6 @@
 
 #if HAVE_MMX
 
-#if !HAVE_FAST_64BIT && defined(__MMX__)
-
-#define FF_COPY_SWAP_ZERO_USES_MMX
-
-#define AV_COPY64 AV_COPY64
-static av_always_inline void AV_COPY64(void *d, const void *s)
-{
-    __asm__("movq %1, %%mm0 \n\t"
-            "movq %%mm0, %0 \n\t"
-            : "=m"(*(uint64_t*)d)
-            : "m" (*(const uint64_t*)s)
-            : "mm0");
-}
-
-#define AV_SWAP64 AV_SWAP64
-static av_always_inline void AV_SWAP64(void *a, void *b)
-{
-    __asm__("movq %1, %%mm0 \n\t"
-            "movq %0, %%mm1 \n\t"
-            "movq %%mm0, %0 \n\t"
-            "movq %%mm1, %1 \n\t"
-            : "+m"(*(uint64_t*)a), "+m"(*(uint64_t*)b)
-            ::"mm0", "mm1");
-}
-
-#define AV_ZERO64 AV_ZERO64
-static av_always_inline void AV_ZERO64(void *d)
-{
-    __asm__("pxor %%mm0, %%mm0 \n\t"
-            "movq %%mm0, %0 \n\t"
-            : "=m"(*(uint64_t*)d)
-            :: "mm0");
-}
-
-#endif /* !HAVE_FAST_64BIT && defined(__MMX__) */
-
 #ifdef __SSE__
 
 #define AV_COPY128 AV_COPY128
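
Note (illustration only, not part of the patch): with the MMX versions
gone, 64-bit copy/swap/zero can be done in plain C, which never touches
the MMX registers and so needs no emms_c() afterwards. Below is a minimal
sketch under that assumption. The function names (copy64_sketch and
friends) are hypothetical, and the generic fallbacks that FFmpeg actually
uses in libavutil/intreadwrite.h may well be written differently.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for AV_COPY64: copy 8 bytes from s to d.
 * memcpy through a temporary avoids alignment and aliasing issues. */
static inline void copy64_sketch(void *d, const void *s)
{
    uint64_t v;
    memcpy(&v, s, sizeof(v));
    memcpy(d, &v, sizeof(v));
}

/* Hypothetical stand-in for AV_SWAP64: exchange the 8 bytes at a and b. */
static inline void swap64_sketch(void *a, void *b)
{
    uint64_t va, vb;
    memcpy(&va, a, sizeof(va));
    memcpy(&vb, b, sizeof(vb));
    memcpy(a, &vb, sizeof(vb));
    memcpy(b, &va, sizeof(va));
}

/* Hypothetical stand-in for AV_ZERO64: clear the 8 bytes at d. */
static inline void zero64_sketch(void *d)
{
    memset(d, 0, sizeof(uint64_t));
}
```

Compilers typically lower these fixed-size memcpy/memset calls to plain
integer loads and stores, so on 32-bit x86 this would presumably become
two 32-bit moves per 64-bit value, leaving the x87/MMX state untouched.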