[FFmpeg-devel,1/5] lavf/mxfdec: Speed up klv_read_packet()

Message ID 1de0719456ef2f015b83d214d73334853af26612.camel@haerdin.se
State New

Checks

Context                          Check    Description
yinshiyou/make_loongarch64       success  Make finished
yinshiyou/make_fate_loongarch64  success  Make fate finished
andriy/make_x86                  success  Make finished
andriy/make_fate_x86             success  Make fate finished

Commit Message

Tomas Härdin Sept. 22, 2024, 4:33 p.m. UTC
This patchset speeds up mxfdec in various ways. The test file has been
generated with

 ffmpeg -t 10000 -f lavfi -i testsrc -s 160x120 out2.mxf

Performance is measured with callgrind using the command

 valgrind --tool=callgrind ./ffmpeg_g -loglevel quiet -i out2.mxf -codec copy -f null -

The callgraph has then been inspected using kcachegrind. The results
are as follows:

task_wrapper
5 812 306 937 a577d31
5 669 552 343 Speed up klv_read_packet()
5 648 440 947 Add and use IS_KLV_KEY_FAST() in some places
5 633 846 074 Add and use mxf_is_encrypted_triplet_key()
3 667 721 703 Speed up mxf_edit_unit_absolute_offset()
3 587 869 726 Remove a call to avio_tell() in klv_read_packet()

mxf_read_packet (250 001 calls)
3 821 662 859 a577d31
3 665 058 265 Speed up klv_read_packet()
3 647 320 931 Add and use IS_KLV_KEY_FAST() in some places
3 624 081 036 Add and use mxf_is_encrypted_triplet_key()
1 660 495 552 Speed up mxf_edit_unit_absolute_offset()
1 592 469 709 Remove a call to avio_tell() in klv_read_packet()

The biggest difference is made by speeding up
mxf_edit_unit_absolute_offset(). Here's how many cycles it uses before
and after patch 4:

mxf_edit_unit_absolute_offset.constprop.31
2 076 774 255 Add and use mxf_is_encrypted_triplet_key()
  105 047 878 Speed up mxf_edit_unit_absolute_offset()

Of the remaining cycles in mxf_edit_unit_absolute_offset(), 67 882 294
are spent in mxf_absolute_bodysid_offset() (272 cycles per call). Since
it already does a binary search it didn't seem worthwhile to mess with.
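
To make the figures above easier to compare, here is a minimal sketch of the arithmetic behind them. The helper names percent_saved() and implied_calls() are hypothetical and not part of FFmpeg; the inputs are the numbers quoted in this message.

```c
#include <stdint.h>

/* Hypothetical helper: relative saving, in percent, between two cycle counts. */
static double percent_saved(uint64_t before, uint64_t after)
{
    return 100.0 * (double)(before - after) / (double)before;
}

/* Hypothetical helper: number of calls implied by a cycle total and a
 * (rounded) per-call cost. Integer division, so the result is approximate. */
static uint64_t implied_calls(uint64_t total_cycles, uint64_t cycles_per_call)
{
    return total_cycles / cycles_per_call;
}
```

With the mxf_read_packet totals, percent_saved(3821662859, 1592469709) comes out at roughly 58%, and implied_calls(67882294, 272) gives about 249 567 calls, in line with the 250 001 calls to mxf_read_packet.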

Patches 2 and 3 are somewhat dubious, but I've included them anyway to
get some feedback. We could both speed up the demuxer and cut down on
.text by omitting the first 4 bytes of every key.

/Tomas

Comments

Tomas Härdin Sept. 27, 2024, 1:22 p.m. UTC | #1
I'll push patches 1, 4 and 5 in a few days.

/Tomas
Tomas Härdin Oct. 1, 2024, 5:20 p.m. UTC | #2
On Fri, 2024-09-27 at 15:22 +0200, Tomas Härdin wrote:
> I'll push patches 1, 4 and 5 in a few days.

Pushed

/Tomas
Patch

From da4daac750955ccdf578c703fca7a90c93f7a1a8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= <git@haerdin.se>
Date: Sat, 14 Sep 2024 11:48:09 +0200
Subject: [PATCH 1/5] lavf/mxfdec: Speed up klv_read_packet()

---
 libavformat/mxfdec.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/libavformat/mxfdec.c b/libavformat/mxfdec.c
index 24f4ed1c33..99bf352e00 100644
--- a/libavformat/mxfdec.c
+++ b/libavformat/mxfdec.c
@@ -458,10 +458,26 @@  static int mxf_read_sync(AVIOContext *pb, const uint8_t *key, unsigned size)
     return i == size;
 }
 
+// special case of mxf_read_sync for mxf_klv_key
+static int mxf_read_sync_klv(AVIOContext *pb)
+{
+    uint32_t key = avio_rb32(pb);
+    // key will never match mxf_klv_key on EOF
+    if (key == AV_RB32(mxf_klv_key))
+        return 1;
+
+    while (!avio_feof(pb)) {
+        key = (key << 8) | avio_r8(pb);
+        if (key == AV_RB32(mxf_klv_key))
+            return 1;
+    }
+    return 0;
+}
+
 static int klv_read_packet(MXFContext *mxf, KLVPacket *klv, AVIOContext *pb)
 {
     int64_t length, pos;
-    if (!mxf_read_sync(pb, mxf_klv_key, 4))
+    if (!mxf_read_sync_klv(pb))
         return AVERROR_INVALIDDATA;
     klv->offset = avio_tell(pb) - 4;
     if (klv->offset < mxf->run_in)
@@ -3982,6 +3998,7 @@  static int mxf_read_packet(AVFormatContext *s, AVPacket *pkt)
             ret = klv_read_packet(mxf, &klv, s->pb);
             if (ret < 0)
                 break;
+            // klv.key[0..3] == mxf_klv_key from here forward
             max_data_size = klv.length;
             pos = klv.next_klv - klv.length;
             PRINT_KEY(s, "read packet", klv.key);
-- 
2.39.2
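
The idea behind mxf_read_sync_klv() in the patch above can be tried outside FFmpeg. The following is a minimal sketch, assuming an in-memory buffer instead of an AVIOContext. The names klv_prefix, load_be32() and find_klv_sync() are hypothetical stand-ins for mxf_klv_key, AV_RB32() and the patched function; they are not FFmpeg API.

```c
#include <stddef.h>
#include <stdint.h>

/* The 4-byte prefix the demuxer syncs on (same bytes as mxf_klv_key). */
static const uint8_t klv_prefix[4] = { 0x06, 0x0e, 0x2b, 0x34 };

/* Big-endian 32-bit load, standing in for AV_RB32(). */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Scan buf for the prefix by keeping the last four bytes in one 32-bit
 * window, so each step is one shift, one OR and one compare.
 * Returns the offset just past the first match, or -1 if there is none.
 */
static ptrdiff_t find_klv_sync(const uint8_t *buf, size_t size)
{
    const uint32_t want = load_be32(klv_prefix);
    uint32_t window;
    size_t i;

    if (size < 4)
        return -1;

    window = load_be32(buf);
    for (i = 4; ; i++) {
        if (window == want)
            return (ptrdiff_t)i;              /* i bytes consumed so far */
        if (i == size)
            return -1;                        /* ran out of input */
        window = (window << 8) | buf[i];      /* slide the window one byte */
    }
}
```

Subtracting 4 from the returned offset gives the start of the key, much as klv_read_packet() computes klv->offset from avio_tell(pb) - 4.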