Message ID | DB6PR0101MB22142A7A49F01D9EB61518608FBA9@DB6PR0101MB2214.eurprd01.prod.exchangelabs.com |
---|---|
State | Accepted |
Commit | 72d5ce9fa6cd67497fbfffc97031cb6b7ab23c6d |
Headers | show |
Series | [FFmpeg-devel,v2,01/18] avcodec/pthread_slice: Don't reinitialise initialised mutex | expand |
Context | Check | Description |
---|---|---|
yinshiyou/make_loongarch64 | success | Make finished |
yinshiyou/make_fate_loongarch64 | success | Make fate finished |
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
Quoting Andreas Rheinhardt (2022-07-01 00:29:40) > diff --git a/libavcodec/hevcdec.h b/libavcodec/hevcdec.h > index e2dba54f26..b97b7d1354 100644 > --- a/libavcodec/hevcdec.h > +++ b/libavcodec/hevcdec.h > @@ -226,6 +226,11 @@ enum ScanType { > SCAN_VERT, > }; > > +typedef struct HEVCCABACState { > + uint8_t state[HEVC_CONTEXTS]; > + uint8_t stat_coeff[HEVC_STAT_COEFFS]; > +} HEVCCABACState; Might be good to mention why this is allocated separately from HEVCContext. Otherwise patch looks good.
Quoting Andreas Rheinhardt (2022-07-01 00:29:40) > The HEVC decoder has both HEVCContext and HEVCLocalContext > structures. The latter is supposed to be the structure > containing the per-slicethread state. > > Yet that is not how it is handled in practice: Each HEVCLocalContext > has a unique HEVCContext allocated for it and each of these > coincides with the main HEVCContext except in exactly one field: > The corresponding HEVCLocalContext. > This makes it possible to pass the HEVCContext everywhere where > logically a HEVCLocalContext should be used. > > This led to confusion in the first version of what eventually became > commit c8bc0f66a875bc3708d8dc11b757f2198606ffd7: > Before said commit, the initialization of the Rice parameter derivation > state was incorrect; the fix for single-threaded as well as > frame-threaded decoding was to add backup stats to HEVCContext > that are used when the cabac state is updated*, see > https://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268861.html > Yet due to what has been said above, this does not work for > slice-threading, because the each HEVCLocalContext has its own > HEVCContext, so the Rice parameter state would not be transferred > between threads. > > This is fixed in c8bc0f66a875bc3708d8dc11b757f2198606ffd7 > by a hack: It rederives what the previous thread was and accesses > the corresponding HEVCContext. > > Fix this by treating the Rice parameter state the same way > the ordinary CABAC parameters are shared between threads: > Make them part of the same struct that is shared between > slice threads. This does not cause races, because > the parts of the code that access these Rice parameters > are a subset of the parts of code that access the CABAC parameters. > > *: And if the persistent_rice_adaptation_enabled_flag is set. > > Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> > --- > libavcodec/hevc_cabac.c | 17 ++++++++--------- > libavcodec/hevcdec.c | 10 +++++----- > libavcodec/hevcdec.h | 10 +++++++--- > 3 files changed, 20 insertions(+), 17 deletions(-) > > diff --git a/libavcodec/hevc_cabac.c b/libavcodec/hevc_cabac.c > index a194f8a02a..985c97ef2a 100644 > --- a/libavcodec/hevc_cabac.c > +++ b/libavcodec/hevc_cabac.c > @@ -453,19 +453,18 @@ void ff_hevc_save_states(HEVCContext *s, int ctb_addr_ts) > (ctb_addr_ts % s->ps.sps->ctb_width == 2 || > (s->ps.sps->ctb_width == 2 && > ctb_addr_ts % s->ps.sps->ctb_width == 0))) { > - memcpy(s->cabac_state, s->HEVClc->cabac_state, HEVC_CONTEXTS); > + memcpy(s->cabac->state, s->HEVClc->cabac_state, HEVC_CONTEXTS); So if I'm reading this right, this copies the per-slice-context state into the decoder-global state. And it's done from slice threads with no locks. So how is this not racy?
Anton Khirnov: > Quoting Andreas Rheinhardt (2022-07-01 00:29:40) >> The HEVC decoder has both HEVCContext and HEVCLocalContext >> structures. The latter is supposed to be the structure >> containing the per-slicethread state. >> >> Yet that is not how it is handled in practice: Each HEVCLocalContext >> has a unique HEVCContext allocated for it and each of these >> coincides with the main HEVCContext except in exactly one field: >> The corresponding HEVCLocalContext. >> This makes it possible to pass the HEVCContext everywhere where >> logically a HEVCLocalContext should be used. >> >> This led to confusion in the first version of what eventually became >> commit c8bc0f66a875bc3708d8dc11b757f2198606ffd7: >> Before said commit, the initialization of the Rice parameter derivation >> state was incorrect; the fix for single-threaded as well as >> frame-threaded decoding was to add backup stats to HEVCContext >> that are used when the cabac state is updated*, see >> https://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268861.html >> Yet due to what has been said above, this does not work for >> slice-threading, because the each HEVCLocalContext has its own >> HEVCContext, so the Rice parameter state would not be transferred >> between threads. >> >> This is fixed in c8bc0f66a875bc3708d8dc11b757f2198606ffd7 >> by a hack: It rederives what the previous thread was and accesses >> the corresponding HEVCContext. >> >> Fix this by treating the Rice parameter state the same way >> the ordinary CABAC parameters are shared between threads: >> Make them part of the same struct that is shared between >> slice threads. This does not cause races, because >> the parts of the code that access these Rice parameters >> are a subset of the parts of code that access the CABAC parameters. >> >> *: And if the persistent_rice_adaptation_enabled_flag is set. >> >> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> >> --- >> libavcodec/hevc_cabac.c | 17 ++++++++--------- >> libavcodec/hevcdec.c | 10 +++++----- >> libavcodec/hevcdec.h | 10 +++++++--- >> 3 files changed, 20 insertions(+), 17 deletions(-) >> >> diff --git a/libavcodec/hevc_cabac.c b/libavcodec/hevc_cabac.c >> index a194f8a02a..985c97ef2a 100644 >> --- a/libavcodec/hevc_cabac.c >> +++ b/libavcodec/hevc_cabac.c >> @@ -453,19 +453,18 @@ void ff_hevc_save_states(HEVCContext *s, int ctb_addr_ts) >> (ctb_addr_ts % s->ps.sps->ctb_width == 2 || >> (s->ps.sps->ctb_width == 2 && >> ctb_addr_ts % s->ps.sps->ctb_width == 0))) { >> - memcpy(s->cabac_state, s->HEVClc->cabac_state, HEVC_CONTEXTS); >> + memcpy(s->cabac->state, s->HEVClc->cabac_state, HEVC_CONTEXTS); > > So if I'm reading this right, this copies the per-slice-context state > into the decoder-global state. And it's done from slice threads with no > locks. So how is this not racy? > a) I am not claiming that this is not racy; I am merely claiming that it does not introduce a new race, because the parts of the code that access these Rice parameters are a subset of the parts of code that access the CABAC parameters and tsan has never shown a race for me when updating cabac state or the rice parameters in general. b) (i) I readily admit that HEVC is not my forte*. WPP is supposed to be as follows: Given that HEVC needs the upper right and upper block/ctu for prediction, each row can only start decoding after the first two ctus of the row above it have been decoded. And then the cabac state of the row below is initialized from the cabac state of the row above after decoding its first two ctus. Therefore only one cabac state needs to be cached at any given time. (ii) You can see this in ff_hevc_save_states, where it is only saving the state after the second row (this presumes that the initial ctb_addr_ts in hls_decode_entry_wpp is a multiple of ctb_width which seems to be the case when tiles are disabled; I don't know what happens when both tiles and wpp are enabled. According to https://github.com/ultravideo/kvazaar/issues/201#issuecomment-391329526 this is not even allowed in any currently legal profile, but I don't think our decoder checks for that). (iii) After having saved the state, it is signalling that it is done with this ctu via ff_thread_report_progress2. The next thread waits for this event via ff_thread_await_progress2 and initializes its cabac state (if necessary). So there is your synchronization. (iv) Looking at ff_hevc_cabac_init, one can see that the first branch is always true when run from the first job; whereas ctb_addr_ts % s->ps.sps->ctb_width == 0 is true when tiles are disabled and when one is decoding the first ctu of a row (I don't know what happens in case tiles are enabled; probably mayhem. It seems kvazaar can produce such files, see above link.). c) The current state of affairs is btw weird: Given that the secondary HEVCContexts are overwritten by the main HEVCContext in hls_slice_data_wpp, the rice state that every HEVCContext starts with is given by the state of the first HEVCContext. And which row (of the last picture) this corresponds to depends upon the number of slice threads in use. This might cause problems if dependent_slice_segment_flag is enabled. d) See also the comment to patch #2. - Andreas *: I only wanted to share the common SEI parts of H.264 and HEVC due to softworkz's horrible way of sharing it.
Andreas Rheinhardt: > Anton Khirnov: > c) The current state of affairs is btw weird: Given that the secondary > HEVCContexts are overwritten by the main HEVCContext in > hls_slice_data_wpp, the rice state that every HEVCContext starts with is > given by the state of the first HEVCContext. And which row (of the last > picture) this corresponds to depends upon the number of slice threads in > use. This might cause problems if dependent_slice_segment_flag is enabled. > kvazaar allows to create files with dependent_slice_segment_flag == 1; but it puts one row into one slice and so the above issue would probably not manifest itself. The caveat "probably" is necessary, because decoding such files with slice threading leads to "entry_point_offset table is corrupted" errors (which is probably a decoder bug). - Andreas
Andreas Rheinhardt: > Andreas Rheinhardt: >> Anton Khirnov: >> c) The current state of affairs is btw weird: Given that the secondary >> HEVCContexts are overwritten by the main HEVCContext in >> hls_slice_data_wpp, the rice state that every HEVCContext starts with is >> given by the state of the first HEVCContext. And which row (of the last >> picture) this corresponds to depends upon the number of slice threads in >> use. This might cause problems if dependent_slice_segment_flag is enabled. >> > > kvazaar allows to create files with dependent_slice_segment_flag == 1; > but it puts one row into one slice and so the above issue would probably > not manifest itself. The caveat "probably" is necessary, because > decoding such files with slice threading leads to "entry_point_offset > table is corrupted" errors (which is probably a decoder bug). > The first slice segment header of a picture has entry points for all rows, even for those that are actually in other slice segments. I don't know whether this is how it is supposed to be. - Andreas
diff --git a/libavcodec/hevc_cabac.c b/libavcodec/hevc_cabac.c index a194f8a02a..985c97ef2a 100644 --- a/libavcodec/hevc_cabac.c +++ b/libavcodec/hevc_cabac.c @@ -453,19 +453,18 @@ void ff_hevc_save_states(HEVCContext *s, int ctb_addr_ts) (ctb_addr_ts % s->ps.sps->ctb_width == 2 || (s->ps.sps->ctb_width == 2 && ctb_addr_ts % s->ps.sps->ctb_width == 0))) { - memcpy(s->cabac_state, s->HEVClc->cabac_state, HEVC_CONTEXTS); + memcpy(s->cabac->state, s->HEVClc->cabac_state, HEVC_CONTEXTS); if (s->ps.sps->persistent_rice_adaptation_enabled_flag) { - memcpy(s->stat_coeff, s->HEVClc->stat_coeff, HEVC_STAT_COEFFS); + memcpy(s->cabac->stat_coeff, s->HEVClc->stat_coeff, HEVC_STAT_COEFFS); } } } -static void load_states(HEVCContext *s, int thread) +static void load_states(HEVCContext *s) { - memcpy(s->HEVClc->cabac_state, s->cabac_state, HEVC_CONTEXTS); + memcpy(s->HEVClc->cabac_state, s->cabac->state, HEVC_CONTEXTS); if (s->ps.sps->persistent_rice_adaptation_enabled_flag) { - const HEVCContext *prev = s->sList[(thread + s->threads_number - 1) % s->threads_number]; - memcpy(s->HEVClc->stat_coeff, prev->stat_coeff, HEVC_STAT_COEFFS); + memcpy(s->HEVClc->stat_coeff, s->cabac->stat_coeff, HEVC_STAT_COEFFS); } } @@ -508,7 +507,7 @@ static void cabac_init_state(HEVCContext *s) s->HEVClc->stat_coeff[i] = 0; } -int ff_hevc_cabac_init(HEVCContext *s, int ctb_addr_ts, int thread) +int ff_hevc_cabac_init(HEVCContext *s, int ctb_addr_ts) { if (ctb_addr_ts == s->ps.pps->ctb_addr_rs_to_ts[s->sh.slice_ctb_addr_rs]) { int ret = cabac_init_decoder(s); @@ -525,7 +524,7 @@ int ff_hevc_cabac_init(HEVCContext *s, int ctb_addr_ts, int thread) if (s->ps.sps->ctb_width == 1) cabac_init_state(s); else if (s->sh.dependent_slice_segment_flag == 1) - load_states(s, thread); + load_states(s); } } } else { @@ -556,7 +555,7 @@ int ff_hevc_cabac_init(HEVCContext *s, int ctb_addr_ts, int thread) if (s->ps.sps->ctb_width == 1) cabac_init_state(s); else - load_states(s, thread); + load_states(s); } } } diff --git a/libavcodec/hevcdec.c b/libavcodec/hevcdec.c index 33c9d74e5f..7d4de414cd 100644 --- a/libavcodec/hevcdec.c +++ b/libavcodec/hevcdec.c @@ -2501,7 +2501,7 @@ static int hls_decode_entry(AVCodecContext *avctxt, void *isFilterThread) y_ctb = (ctb_addr_rs / ((s->ps.sps->width + ctb_size - 1) >> s->ps.sps->log2_ctb_size)) << s->ps.sps->log2_ctb_size; hls_decode_neighbour(s, x_ctb, y_ctb, ctb_addr_ts); - ret = ff_hevc_cabac_init(s, ctb_addr_ts, 0); + ret = ff_hevc_cabac_init(s, ctb_addr_ts); if (ret < 0) { s->tab_slice_address[ctb_addr_rs] = -1; return ret; @@ -2579,7 +2579,7 @@ static int hls_decode_entry_wpp(AVCodecContext *avctxt, void *input_ctb_row, int return 0; } - ret = ff_hevc_cabac_init(s, ctb_addr_ts, thread); + ret = ff_hevc_cabac_init(s, ctb_addr_ts); if (ret < 0) goto error; hls_sao_param(s, x_ctb >> s->ps.sps->log2_ctb_size, y_ctb >> s->ps.sps->log2_ctb_size); @@ -3601,7 +3601,7 @@ static av_cold int hevc_decode_free(AVCodecContext *avctx) av_freep(&s->md5_ctx); - av_freep(&s->cabac_state); + av_freep(&s->cabac); for (i = 0; i < 3; i++) { av_freep(&s->sao_pixel_buffer_h[i]); @@ -3655,8 +3655,8 @@ static av_cold int hevc_init_context(AVCodecContext *avctx) s->HEVClcList[0] = s->HEVClc; s->sList[0] = s; - s->cabac_state = av_malloc(HEVC_CONTEXTS); - if (!s->cabac_state) + s->cabac = av_malloc(sizeof(*s->cabac)); + if (!s->cabac) return AVERROR(ENOMEM); s->output_frame = av_frame_alloc(); diff --git a/libavcodec/hevcdec.h b/libavcodec/hevcdec.h index e2dba54f26..b97b7d1354 100644 --- a/libavcodec/hevcdec.h +++ b/libavcodec/hevcdec.h @@ -226,6 +226,11 @@ enum ScanType { SCAN_VERT, }; +typedef struct HEVCCABACState { + uint8_t state[HEVC_CONTEXTS]; + uint8_t stat_coeff[HEVC_STAT_COEFFS]; +} HEVCCABACState; + typedef struct LongTermRPS { int poc[32]; uint8_t poc_msb_present[32]; @@ -482,8 +487,7 @@ typedef struct HEVCContext { int width; int height; - uint8_t *cabac_state; - uint8_t stat_coeff[HEVC_STAT_COEFFS]; + HEVCCABACState *cabac; /** 1 if the independent slice segment header was successfully parsed */ uint8_t slice_initialized; @@ -601,7 +605,7 @@ int ff_hevc_frame_rps(HEVCContext *s); int ff_hevc_slice_rpl(HEVCContext *s); void ff_hevc_save_states(HEVCContext *s, int ctb_addr_ts); -int ff_hevc_cabac_init(HEVCContext *s, int ctb_addr_ts, int thread); +int ff_hevc_cabac_init(HEVCContext *s, int ctb_addr_ts); int ff_hevc_sao_merge_flag_decode(HEVCContext *s); int ff_hevc_sao_type_idx_decode(HEVCContext *s); int ff_hevc_sao_band_position_decode(HEVCContext *s);
The HEVC decoder has both HEVCContext and HEVCLocalContext structures. The latter is supposed to be the structure containing the per-slicethread state. Yet that is not how it is handled in practice: Each HEVCLocalContext has a unique HEVCContext allocated for it and each of these coincides with the main HEVCContext except in exactly one field: The corresponding HEVCLocalContext. This makes it possible to pass the HEVCContext everywhere where logically a HEVCLocalContext should be used. This led to confusion in the first version of what eventually became commit c8bc0f66a875bc3708d8dc11b757f2198606ffd7: Before said commit, the initialization of the Rice parameter derivation state was incorrect; the fix for single-threaded as well as frame-threaded decoding was to add backup stats to HEVCContext that are used when the cabac state is updated*, see https://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268861.html Yet due to what has been said above, this does not work for slice-threading, because the each HEVCLocalContext has its own HEVCContext, so the Rice parameter state would not be transferred between threads. This is fixed in c8bc0f66a875bc3708d8dc11b757f2198606ffd7 by a hack: It rederives what the previous thread was and accesses the corresponding HEVCContext. Fix this by treating the Rice parameter state the same way the ordinary CABAC parameters are shared between threads: Make them part of the same struct that is shared between slice threads. This does not cause races, because the parts of the code that access these Rice parameters are a subset of the parts of code that access the CABAC parameters. *: And if the persistent_rice_adaptation_enabled_flag is set. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> --- libavcodec/hevc_cabac.c | 17 ++++++++--------- libavcodec/hevcdec.c | 10 +++++----- libavcodec/hevcdec.h | 10 +++++++--- 3 files changed, 20 insertions(+), 17 deletions(-)