From patchwork Thu Nov 7 09:25:14 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kyosuke Kawakami
X-Patchwork-Id: 52626
From: Kyosuke Kawakami
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 7 Nov 2024 18:25:14 +0900
Message-ID: <20241107092533.1113300-2-kawakami150708@gmail.com>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20241107092533.1113300-1-kawakami150708@gmail.com>
References: <20241107092533.1113300-1-kawakami150708@gmail.com>
Subject: [FFmpeg-devel] [PATCH 2/2] avcodec/x86/diracdsp: Migrate last remaining MMX function to SSE2
Cc: Kyosuke Kawakami

The add_dirac_obmc8_mmx function was the only MMX function left. This
patch migrates it to SSE2.

Signed-off-by: Kyosuke Kawakami
---
 libavcodec/x86/diracdsp.asm    |  4 +---
 libavcodec/x86/diracdsp_init.c | 10 +++-------
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
index e5e2b11846..d438b668cf 100644
--- a/libavcodec/x86/diracdsp.asm
+++ b/libavcodec/x86/diracdsp.asm
@@ -247,14 +247,12 @@ cglobal add_dirac_obmc%1_%2, 6,6,5, dst, src, stride, obmc, yblen
     RET
 %endm

-INIT_MMX
-ADD_OBMC 8, mmx
-
 INIT_XMM
 PUT_RECT sse2
 ADD_RECT sse2

 HPEL_FILTER sse2
+ADD_OBMC 8, sse2
 ADD_OBMC 32, sse2
 ADD_OBMC 16, sse2

diff --git a/libavcodec/x86/diracdsp_init.c b/libavcodec/x86/diracdsp_init.c
index 6a31d3921f..ef01ebdf2e 100644
--- a/libavcodec/x86/diracdsp_init.c
+++ b/libavcodec/x86/diracdsp_init.c
@@ -24,8 +24,7 @@

 void ff_add_rect_clamped_sse2(uint8_t *, const uint16_t *, int, const int16_t *, int, int, int);

-void ff_add_dirac_obmc8_mmx(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
-
+void ff_add_dirac_obmc8_sse2(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
 void ff_add_dirac_obmc16_sse2(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
 void ff_add_dirac_obmc32_sse2(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);

@@ -89,15 +88,12 @@ void ff_diracdsp_init_x86(DiracDSPContext* c)
 #if HAVE_X86ASM
     int mm_flags = av_get_cpu_flags();

-    if (EXTERNAL_MMX(mm_flags)) {
-        c->add_dirac_obmc[0] = ff_add_dirac_obmc8_mmx;
-    }
-
     if (EXTERNAL_SSE2(mm_flags)) {
         c->dirac_hpel_filter = dirac_hpel_filter_sse2;
         c->add_rect_clamped = ff_add_rect_clamped_sse2;
         c->put_signed_rect_clamped[0] = (void *)ff_put_signed_rect_clamped_sse2;

+        c->add_dirac_obmc[0] = ff_add_dirac_obmc8_sse2;
         c->add_dirac_obmc[1] = ff_add_dirac_obmc16_sse2;
         c->add_dirac_obmc[2] = ff_add_dirac_obmc32_sse2;

@@ -111,5 +107,5 @@ void ff_diracdsp_init_x86(DiracDSPContext* c)
         c->dequant_subband[1] = ff_dequant_subband_32_sse4;
         c->put_signed_rect_clamped[1] = ff_put_signed_rect_clamped_10_sse4;
     }
-#endif
+#endif // HAVE_X86ASM
 }
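
Note for readers unfamiliar with the dispatch pattern this patch touches: the init function fills a table of function pointers, one slot per block width, and overrides the defaults when the CPU reports a feature. The following is a minimal, standalone sketch of that idea and is not FFmpeg code. Every `demo_`/`DEMO_` name is hypothetical, the "simd" functions are plain C stand-ins for the assembly routines, and the scalar loop (including the weight row advancing by 32) is an assumption made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bit; the patch itself tests EXTERNAL_SSE2(mm_flags). */
#define DEMO_CPU_FLAG_SSE2 0x1

typedef void (*demo_obmc_fn)(uint16_t *dst, const uint8_t *src, int stride,
                             const uint8_t *obmc_weight, int yblen);

typedef struct DemoDSPContext {
    demo_obmc_fn add_obmc[3]; /* slot 0: 8 wide, slot 1: 16 wide, slot 2: 32 wide */
} DemoDSPContext;

/* Scalar weighted accumulate: dst[x] += src[x] * obmc_weight[x] on each row.
 * The row layout (weights advancing by 32) is an assumption of this sketch. */
#define DEMO_ADD_OBMC(name, width)                                         \
    static void name(uint16_t *dst, const uint8_t *src, int stride,        \
                     const uint8_t *obmc_weight, int yblen)                \
    {                                                                      \
        while (yblen-- > 0) {                                              \
            for (int x = 0; x < (width); x++)                              \
                dst[x] = (uint16_t)(dst[x] + src[x] * obmc_weight[x]);     \
            dst         += stride;                                         \
            src         += stride;                                         \
            obmc_weight += 32;                                             \
        }                                                                  \
    }

DEMO_ADD_OBMC(demo_add_obmc8_c,   8)
DEMO_ADD_OBMC(demo_add_obmc16_c, 16)
DEMO_ADD_OBMC(demo_add_obmc32_c, 32)

/* Stand-ins for the assembly versions; a real build would link SIMD code here. */
DEMO_ADD_OBMC(demo_add_obmc8_simd,   8)
DEMO_ADD_OBMC(demo_add_obmc16_simd, 16)
DEMO_ADD_OBMC(demo_add_obmc32_simd, 32)

void demo_dsp_init(DemoDSPContext *c, int cpu_flags)
{
    assert(c != NULL);

    /* Portable defaults, always valid. */
    c->add_obmc[0] = demo_add_obmc8_c;
    c->add_obmc[1] = demo_add_obmc16_c;
    c->add_obmc[2] = demo_add_obmc32_c;

    /* All three widths sit behind a single feature check, mirroring how the
     * patch moves slot 0 into the SSE2 block next to slots 1 and 2. */
    if (cpu_flags & DEMO_CPU_FLAG_SSE2) {
        c->add_obmc[0] = demo_add_obmc8_simd;
        c->add_obmc[1] = demo_add_obmc16_simd;
        c->add_obmc[2] = demo_add_obmc32_simd;
    }
}
```

A caller would run `demo_dsp_init(&ctx, flags)` once and then invoke `ctx.add_obmc[i](dst, src, stride, weights, rows)` without caring which variant was selected. Keeping every slot under one feature check means there is no separate MMX branch left to maintain.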