-
Notifications
You must be signed in to change notification settings - Fork 41
Improve use of layered images. #118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Found a typo which I commented on inline. Do we also want to add recommendations, or do we do that as part of #104 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. My only concern is for the case where you have two output images (for instance two viewpoints) sharing a base layer. If you don't use lsel
, a decoder may try to use the layers for progressive refinement which is going to look a bit weird. On the other hand, the spec does note that if you want full control over what should happen you should be using lsel
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, minor comments inline.
Sorry for the delay, must have missed the notification that you updated the review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good. Some minor questions inline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I suggest a few small changes.
index.bs
Outdated
|
||
[[!AV1]] supports encoding a frame using multiple spatial layers. A spatial layer may improve the resolution or quality of the image decoded based on one or more of the previous layers. A layer may also provide an image that does not depend on the previous layers. Additionally, not all layers are expected to produce an image meant to be rendered. Some decoded images may be used only as intermediate decodes. Finally, layers are grouped into one or more [=Operating Points=]. The [=Sequence Header OBU=] defines the list of [=Operating Points=], provides required decoding capabilities, and indicates which layers form each [=Operating Point=]. | ||
|
||
[[!AV1]] delegates the selection of which [=Operating Point=] to process to the application. AVIF defines the [=OperatingPointSelectorProperty=] to control this selection. In the absence of an [=OperatingPointSelectorProperty=] associated with an [=AV1 Image Item=], the AVIF renderer is free to process any [=Operating Point=] present in the [=AV1 Image Item Data=]. In particular, when the [=AV1 Image Item=] is composed of a unique [=Operating Point=], the [=OperatingPointSelectorProperty=] should not be present. If an [=OperatingPointSelectorProperty=] is associated with an [=AV1 Image Item=], the op_index field indicates which [=Operating Point=] is expected to be presented for this item.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Just a comment. No change requested.] Just FYI: The op_index
field here corresponds to the operatingPoint
variable in the AV1 spec. There is also an OperatingPointIdc
variable in the AV1 spec, but OperatingPointIdc
is a bit mask, not an index. Assuming "Idc" stands for "index", I find the names used in the AV1 spec very confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the semantics of op_index
indicate that it gives the value of the operatingPoint
variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the semantics of
op_index
indicate that it gives the value of theoperatingPoint
variable?
If it is not too verbose, we can mention that OperatingPointSelectorProperty implements the choose_operating_point()
function and op_index
gives the value of the operatingPoint
variable in the AV1 spec. In other words, this is equivalent to the following statement in Section. 5.5.1 of the AV1 spec:
operatingPoint = choose_operating_point( )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Cyril: The new version of the PR looks good! I noticed a contradiction with HEIF's requirement on whether the 'lsel' property may be absent for a layered image item.
index.bs
Outdated
|
||
NOTE: When an author wants to offer the ability to render multiple [=Operating Points=] from the same AV1 image (e.g. in the case of multi-view images), multiple [=AV1 Image Items=] can be created that share the same [=AV1 Image Item Data=] but have different [=OperatingPointSelectorProperty=]s. | ||
|
||
[[!AV1]] expects the renderer to display only one frame within the selected [=Operating Point=], which should be the highest spatial layer that is both within the [=Operating Point=] and present within the temporal unit, but [[!AV1]] leaves the option for other applications to set their own policy about which frames are output. AVIF sets a different policy, and defines how the 'lsel' property (defined in [[!HEIF]]) is used to control which layer is rendered. In the absence of a 'lsel' property associated with an [=AV1 Image Item=], the renderer is free to render either: only the output image of the highest spatial layer, or to render all output images of all the intermediate layers, resulting in a form of progressive decoding. If a 'lsel' property is associated with an [=AV1 Image Item=], the renderer is expected to render only the output image for that layer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first sentence says "..., but [[!AV1]] leaves the option for other applications to set their own policy about which frames are output." We could note that this refers to the second paragraph in Section 7.18.1 of the AV1 spec:
If scalability is being used (OperatingPointIdc not equal to 0), an
application-specific function is called to decide whether this frame
will be output. If this function returns a value equal to 0, then this
process terminates immediately.
I think this cross reference would be useful because that "application-specific function" does not have a name and it is hard to find that paragraph when one reads the AV1 spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we want to add a reference to a specific section, but I added an indication that it's in the "general output process" section.
closes #103
Preview | Diff