Deciding on how to track media format information

sheppy · February 6, 2019, 12:46pm

To be fair, the situation with GroupData is in the process of being resolved at last, thanks to @wbamberg’s work to write a schema and tests. Though this may get a revisit during the work on the sidebar integration into the platform; if sidebar content comes from a data store of some kind, would that be GroupData, another data store separate from GroupData, or would we replace GroupData with a combined sidebar/API group data store? But that’s a separate topic from this discussion.

So I agree that we do need to be sure we have an ownership and maintenance plan in place for the mdn/media data store for codecs and containers, but I’m confident that such a thing is needed. We have to be able to describe codecs and containers (what they are and what they can do) separately from compatibility information.

I would like to see if anyone has thoughts on the structure I proposed for the BCD-ified version of the media container/codec compatibility data, too.

teoli · February 6, 2019, 1:08pm

Hi, Sheppy!

In browsers, media formats are not supported in a binary way:
everywhere or nowhere.

For example, the set of codecs supported for WebRTC is not the same as
those accepted in .

How do you plan to represent such differences?

Cheers,

sheppy · February 6, 2019, 9:51pm

That’s an excellent question, and I’m glad you asked. I hadn’t decided on the way I feel is best for that yet. Let me offer a couple of options and see what folks think.

Option 1: Using an `apis` record

The first option: add an apis record to each container and codec record to indicate browser support for the container or codec in that API. Something like the following:

    "mp4": {
      ...
      "apis": {
        "html": {
          "__compat": {
            "support": {
              "firefox": {
                "version_added": "48"
              }
            },
            "status": {
              "experimental": false,
              "standard_track": true,
              "deprecated": false
            }
          }
        },
        "webaudio": {
          "__compat": {
            "support": {
              "firefox": {
                "version_added": "50"
              }
            },
            "status": {
              "experimental": false,
              "standard_track": true,
              "deprecated": false
            }
          }
        },
        "webrtc": {
          "__compat": {
            "support": {
              "firefox": {
                "version_added": "59"
              }
            },
            "status": {
              "experimental": false,
              "standard_track": true,
              "deprecated": false
            }
          }
        }
      },

This indicates that MP4 files work in HTML media elements in Firefox 48, in Web Audio in Firefox 50, and in WebRTC in 59. This could be added to other levels of the JSON hierarchy to indicate that specific component (or codec) capabilities were introduced later in one API than in others.
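
As a rough illustration of how a consumer might read the `apis` record, here is a minimal Python sketch. The `version_added_in` helper and the trimmed-down `MP4` record are hypothetical; only the `apis` / `__compat` / `support` nesting comes from the proposal above.

```python
# Trimmed copy of the record proposed above (status blocks omitted).
MP4 = {
    "apis": {
        "html": {"__compat": {"support": {"firefox": {"version_added": "48"}}}},
        "webaudio": {"__compat": {"support": {"firefox": {"version_added": "50"}}}},
        "webrtc": {"__compat": {"support": {"firefox": {"version_added": "59"}}}},
    }
}


def version_added_in(record, api, browser):
    """Return version_added for `browser` in `api`, or None if no data exists.

    Hypothetical helper: not part of BCD or any MDN tooling.
    """
    compat = record.get("apis", {}).get(api, {}).get("__compat", {})
    return compat.get("support", {}).get(browser, {}).get("version_added")


print(version_added_in(MP4, "webaudio", "firefox"))
print(version_added_in(MP4, "webrtc", "chrome"))
```

A missing browser or API yields `None` rather than an error, so "no data" stays distinguishable from an explicit `false`.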

Option 2: Separate BCD files for each API

The second option is to make media the top level of a set of directories that cover the various media APIs’ support for media capabilities:

media/: Contains a folder for each API, such as…
- media/html: Contains JSON files covering compatibility of containers and codecs in HTML elements
  - media/html/mp4.json (etc)
- media/webaudio: JSON files covering containers and codecs that work in Web Audio
- media/webrtc: Same for WebRTC
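
To make the trade-off of this layout a little more concrete, here is a small Python sketch of how a tool might locate the files. The helper names are made up for illustration; only the `media/<api>/<container>.json` layout comes from the list above.

```python
from pathlib import PurePosixPath

APIS = ("html", "webaudio", "webrtc")  # the three APIs named in this thread


def compat_file(api, container, root="media"):
    """Build the path of the per-API compat file for one container.

    Hypothetical helper; the directory layout is the one listed above.
    """
    return PurePosixPath(root) / api / f"{container}.json"


def files_for(container):
    """List every file a tool must open to get one container's full picture."""
    return [str(compat_file(api, container)) for api in APIS]


print(files_for("mp4"))
```

With this layout, answering "where does MP4 work?" means opening one file per API, whereas option 1 keeps everything in a single record.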

exe-boss · February 7, 2019, 2:33am

I think option 1 might be slightly better.

sheppy · February 7, 2019, 1:05pm

I think option 1 is easier and less error-prone. It may not be as flexible. But I’m not sure how much that loss of flexibility really means to us at this level of detail.

I will say that I would love it if we could extend the BCD syntax slightly by allowing the support object to be, optionally, an array of objects rather than a single object, to allow for support set definitions by context.

For example:

    {
      "container": {
        "mp4": {
          "__compat": {
            "mdn_url": "https://developer.mozilla.org/docs/Web/Media/Formats/Containers/MP4",
            "support": [
              {
                "html": {
                  "firefox": {
                    "version_added": "48"
                  }
                }
              },
              {
                "webaudio": {
                  "firefox": {
                    "version_added": "55"
                  },
                  "chrome": {
                    "version_added": "70"
                  }
                }
              }
            ],
            "status": {
              "experimental": false,
              "standard_track": true,
              "deprecated": false
            }
          }
        }
      }
    }

Here, support is an array of named objects; each object provides compatibility information for the container being described within the context of a given API or technology: HTML, Web Audio, etc. The syntax of each of the individual context objects is identical to the existing support object.

This would maximize both flexibility and capability while minimizing the amount of data entry, and it would keep all compatibility information in the same place as the item being described.
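
For anyone wondering what consuming the array form would involve, here is a minimal Python sketch that folds the array into a dictionary keyed by context. The function name is hypothetical, and the duplicate check is my own assumption about how conflicts should be handled.

```python
# The "support" array from the example above.
SUPPORT = [
    {"html": {"firefox": {"version_added": "48"}}},
    {
        "webaudio": {
            "firefox": {"version_added": "55"},
            "chrome": {"version_added": "70"},
        }
    },
]


def support_by_context(support):
    """Merge the array of named context objects into one dict keyed by context.

    Hypothetical helper. Raises ValueError if a context appears twice,
    which is an assumption; the proposal does not say what should happen.
    """
    merged = {}
    for entry in support:
        for context, browsers in entry.items():
            if context in merged:
                raise ValueError(f"duplicate context: {context}")
            merged[context] = browsers
    return merged


by_context = support_by_context(SUPPORT)
print(sorted(by_context))
print(by_context["webaudio"]["chrome"]["version_added"])
```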

sheppy · February 7, 2019, 2:01pm

Better yet, add a contexts object, just like my support array form above, that would provide context-specific compat information.

exe-boss · February 7, 2019, 5:28pm

That would also work for the gap properties:

sheppy · February 7, 2019, 6:20pm

I just had an even better idea for an amendment to BCD:

    {
      "container": {
        "mp4": {
          "__compat": {
            "mdn_url": "https://developer.mozilla.org/docs/Web/Media/Formats/Containers/MP4",
            "support": {
              "context": {
                "html": {
                  "firefox": {
                    "version_added": "48"
                  }
                },
                "webaudio": {
                  "firefox": {
                    "version_added": "55"
                  },
                  "chrome": {
                    "version_added": "70"
                  }
                }
              }
            },
            "status": {
              "experimental": false,
              "standard_track": true,
              "deprecated": false
            }
          }
        }
      }
    }

This moves the context objects to be inside the support objects, so that contexts are a component of what’s supported instead of the other way around.

It’s possible (or likely, even) that the word “context” conflicts with existing BCD content, so we would either have to cope with that or rename it to something like __context.
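
Here is a minimal Python sketch of how a consumer could handle records with and without the context layer. The `resolve_support` function is hypothetical, and the `key` parameter exists only so the name can be switched to `__context` if that is what gets chosen.

```python
# An ordinary record, and one using the proposed "context" layer.
PLAIN = {"support": {"firefox": {"version_added": "48"}}}
WITH_CONTEXT = {
    "support": {
        "context": {
            "html": {"firefox": {"version_added": "48"}},
            "webaudio": {
                "firefox": {"version_added": "55"},
                "chrome": {"version_added": "70"},
            },
        }
    }
}


def resolve_support(compat, context, key="context"):
    """Return the browser support map that applies in `context`.

    Hypothetical helper. Records without the context layer are returned
    unchanged, so existing data keeps working.
    """
    support = compat["support"]
    if key not in support:
        return support
    # Unknown contexts yield an empty map, i.e. "no data".
    return support[key].get(context, {})


print(resolve_support(PLAIN, "html"))
print(resolve_support(WITH_CONTEXT, "webaudio"))
```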

sheppy · February 7, 2019, 9:21pm

Hoping that by early to mid next week we will have a decision made on the format for this data so I can begin to assemble an initial set of files for committing to the appropriate repositories. Need additional feedback from @fscholz in particular, though any other input is welcome.

Basically, the open question is: could we amend BCD to support the option of defining compatibility for a feature conditional upon the context in which it’s used, such as by altering the structure of a BCD record to look something like what’s seen in the previous comment on this thread?

If we can amend BCD to support usage contexts, steps going forward are:

1. Make a decision on the best way to implement contexts. A couple of proposals are shown above; the one in comment 26 is my personal favorite so far, although we could swap the support and context layers.
2. Decide on a final name for the context object. The name "context" almost certainly collides with API members, so my feeling is that since we already have __compat, we might as well go with __context.
3. Update the BCD schema to match the new design. Since the context object is optional, no existing files should need to be updated.
4. Update any tests to handle the revised BCD structure, including testing both with and without the context object being present.
5. Update the compatibility table macro(s) to support contexts; the table generated should be based upon the data in the context matching that of the current page. The context could come from a tag or from a parameter passed to the macro (or maybe parameter, falling back to tags?).
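
The "parameter, falling back to tags" idea for the macro could look roughly like the following Python sketch. Everything here is hypothetical: the function name, the assumption that page tags match context names case-insensitively, and the list of known contexts (taken from the examples in this thread).

```python
# Context names used in the examples in this thread; an assumed list.
KNOWN_CONTEXTS = ("html", "webaudio", "webrtc")


def pick_context(param=None, tags=()):
    """Choose the context for a compat table.

    An explicit macro parameter wins; otherwise the first page tag that
    names a known context is used; otherwise None (no context).
    """
    if param:
        return param.lower()
    for tag in tags:
        if tag.lower() in KNOWN_CONTEXTS:
            return tag.lower()
    return None


print(pick_context("WebRTC", ["HTML"]))
print(pick_context(None, ["Reference", "HTML"]))
print(pick_context())
```

Returning `None` when nothing matches would let the macro fall back to rendering an ordinary table for records that have no context object.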

If we decide against amending BCD, then we will go with adding a context block as if it were a feature of an interface, property, method, etc. This would be like the apis object used in earlier proposals above, but renamed because the term “contexts” applies better to the concept than “APIs.”

fscholz · February 8, 2019, 9:39am

Amending the BCD schema is a complex task and I don’t think we have resources allocated right now to change the schema at its core. The current goal around BCD is to improve the data quality. I think that goal has higher priority than adding media formats in ways you propose. @chrisdavidmills and @atopal please chime in, if I’m wrong about this.

The context idea needs proper research and shouldn’t only be seen in the light of media formats, but should generally be useful for BCD. Again, such a data design change requires research, time and work on the data side of things but then also on the data consumer side. So as you say, we also need engineering on the compat tables on MDN.

All this said, my vote is to take a step back and have something working with the current BCD architecture (which might not be perfect) and the current {{compat}} table displaying. If, however, we decide to make major architectural changes, I would like to have proper resources allocated for it and have buy-in from MDN leadership, because I think you highly underestimate what is involved here to implement your proposal properly.

chrisdavidmills · February 8, 2019, 10:48am

I think Florian’s response is spot on here.

teoli · February 8, 2019, 11:14am

I think there is an alternative way of storing this data that is worth investigating here.

The main use case I see for media support information is answering the question “Can I use this media (Container/Codecs) in this context?”

For example: I am using a <video> element and I would like to see if my ogg videos (with av1 and opus codecs) will be played correctly.

So I think we can put this information in the video element as a subfeature:

    "html": {
      "element": {
        "video": {
          "ogg container": {
            "av1 codec": {
              browser compat for av1 in ogg for <video>
            },
            "opus codec": {
              browser compat for opus in ogg for <video>
            }
          }
        }
      }
    }

And the same for all contexts where it can be used.

That way, all the information is stored, and it is accessible in each of the relevant contexts, without having to update the BCD schema, its semantics, or the BCD tests; also, no change to any MDN macros is needed.
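
To show how the "Can I use this media in this context?" question would be answered with this layout, here is a minimal Python sketch. The `support_for` helper is hypothetical, and the `version_added` values are placeholders, not real compatibility data.

```python
# Subfeature layout from the suggestion above. The version_added values
# are placeholders for illustration only, NOT real compatibility data.
BCD = {
    "html": {
        "element": {
            "video": {
                "ogg container": {
                    "av1 codec": {
                        "__compat": {"support": {"firefox": {"version_added": None}}}
                    },
                    "opus codec": {
                        "__compat": {"support": {"firefox": {"version_added": None}}}
                    },
                }
            }
        }
    }
}


def support_for(tree, path):
    """Walk `path` (a sequence of keys) and return that feature's support map.

    Hypothetical helper; raises KeyError if the feature is not recorded.
    """
    node = tree
    for key in path:
        node = node[key]
    return node["__compat"]["support"]


path = ("html", "element", "video", "ogg container", "opus codec")
print(support_for(BCD, path))
```

Because the keys are ordinary BCD subfeatures, the lookup is the same walk any existing consumer already performs.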

What about spending 1 or 2 days to collect the data and put it in BCD that way?

sheppy · February 8, 2019, 4:42pm

I actually agree that amending BCD at this time is not a good idea, even though it would solve certain problems in an efficient way.

sheppy · February 8, 2019, 5:33pm

This is an interesting idea and has a lot of merit. It does not solve all the use cases I need to solve, though, since the documentation I’m currently working on also needs to be able to present information about the capabilities of each codec and container outside the context of the specific element or API. But it does solve a significant part of the problem, and it’s true that we should have this information directly in the context of the video and audio elements.

So I’ve slightly tweaked the suggestion made by @teoli and come up with the following. The __containers object is located at the same level as the __compat object and all of the element’s attributes.

        "__containers": {
          "mp4": {
            "__compat": {
              "support": {},
              "status": {
                "experimental": false,
                "standard_track": true,
                "deprecated": false
              }
            },
            "h265": {
              "__compat": {
                "support": {
                  "firefox": {
                    "version_added": false
                  }
                },
                "status": {
                  "experimental": false,
                  "standard_track": true,
                  "deprecated": false
                }
              }
            },
            "mpeg2": {
              "__compat": {
                "support": {
                  "firefox": {
                    "version_added": "50"
                  }
                },
                "status": {
                  "experimental": false,
                  "standard_track": true,
                  "deprecated": false
                }
              }
            },
            "asp": {
              "__compat": {
                "support": {
                  "firefox": {
                    "version_added": false
                  }
                },
                "status": {
                  "experimental": false,
                  "standard_track": true,
                  "deprecated": false
                }
              }
            },
            "h264": {
              "__compat": {
                "support": {
                  "firefox": {
                    "version_added": true
                  }
                },
                "status": {
                  "experimental": false,
                  "standard_track": true,
                  "deprecated": false
                }
              }
            }
          },
          "webm": {
            "__compat": {
              "support": {},
              "status": {
                "experimental": false,
                "standard_track": true,
                "deprecated": false
              }
            },
            "vp8": {
              "__compat": {
                "support": {
                  "firefox": {
                    "version_added": "45"
                  }
                },
                "status": {
                  "experimental": false,
                  "standard_track": true,
                  "deprecated": false
                }
              }
            },
            "vp9": {
              "__compat": {
                "support": {
                  "firefox": {
                    "version_added": "47"
                  }
                },
                "status": {
                  "experimental": false,
                  "standard_track": true,
                  "deprecated": false
                }
              }
            }
          }
        }

I’ve posted a complete JSON file with fake data for MP4 and WebM containers on the <video> element. In Firefox at least, the built-in JSON viewer should make it pretty easy to look that over and get a better idea for the structure of the entire object.

If that passes muster for adding the information to BCD for now, then we’re in a good place to get a start. It doesn’t solve the problem of going from the name of a container and a codec to knowing whether or not it works in a particular browser, but it can answer what’s needed in the context of the element’s docs. That leaves me in the position of needing to find the answer to the other question still, but it’s a start…
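
One possible way to get from a container and codec name back to browser support is to build a reverse index over the element-scoped data. This is only a sketch: the `index_codecs` function is hypothetical, and the sample reuses the fake WebM values from the mockup above.

```python
# Trimmed copy of the <video> mockup above (fake data, status omitted).
VIDEO = {
    "__containers": {
        "webm": {
            "__compat": {"support": {}},
            "vp8": {"__compat": {"support": {"firefox": {"version_added": "45"}}}},
            "vp9": {"__compat": {"support": {"firefox": {"version_added": "47"}}}},
        }
    }
}


def index_codecs(elements):
    """Map (container, codec) to {element name: support map}.

    Hypothetical helper. `elements` maps an element name to its record.
    """
    index = {}
    for element, record in elements.items():
        for container, entry in record.get("__containers", {}).items():
            for codec, data in entry.items():
                if codec == "__compat":
                    continue  # the container's own compat block, not a codec
                support = data["__compat"]["support"]
                index.setdefault((container, codec), {})[element] = support
    return index


index = index_codecs({"video": VIDEO})
print(index[("webm", "vp9")])
```

The index would have to be rebuilt whenever the data changes, so it works around the missing lookup rather than replacing a dedicated data store.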

exe-boss · February 9, 2019, 2:39am

I am more in favour of putting media information into a top‑level media.containers namespace, but apart from that, it would be exactly the same as in the above comment.

sheppy · February 11, 2019, 10:29pm

I will put together a PR that implements this, at least in a basic form, so we can start a more formal review process. I will hopefully do that tomorrow (February 12).

sheppy · February 11, 2019, 10:43pm

By this, do you mean that you would rather have the container and codec compatibility information entirely separate from the media elements?

My feeling is that we can have the compatibility information here, in the media elements, as the latest mockup I’ve shared, but add to a media.containers namespace in the MDN data repository, much like I discussed early on in this topic. That would provide information about the per-specification capabilities of each container and codec.

It could also provide some level of “is this available in browser X?” information if we really wanted it to. Even just a simple binary “yes/no” on whether or not it’s available in the latest edition of the browser could be useful.

sheppy · February 12, 2019, 8:35pm

Here’s an interesting question…

Do we have to reproduce the same set of data in the BCD for both <video> and HTMLMediaElement? They support the exact same set of media (since video elements can also play audio-only content). And of course the media formats supported by <audio> are also supported by HTMLMediaElement. Having to maintain all of that in multiple places seems like an invitation for mistakes to be made.
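
If the data does end up duplicated, a test could at least flag when the copies drift apart. Here is a minimal Python sketch of such a check; the function name is hypothetical and the two sample trees are made up purely to show a mismatch.

```python
def diff_trees(a, b, path=()):
    """Yield the key paths at which two nested dicts differ.

    Hypothetical helper for comparing two __containers objects.
    """
    for key in sorted(set(a) | set(b)):
        here = path + (key,)
        if key not in a or key not in b:
            yield here  # present on one side only
        elif isinstance(a[key], dict) and isinstance(b[key], dict):
            yield from diff_trees(a[key], b[key], here)
        elif a[key] != b[key]:
            yield here  # same key, different value


# Made-up sample data: the second copy has drifted from the first.
video = {"webm": {"vp9": {"version_added": "47"}}}
media = {"webm": {"vp9": {"version_added": "48"}, "vp8": {"version_added": "45"}}}

for mismatch in diff_trees(video, media):
    print(".".join(mismatch))
```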

Especially curious to hear what @fscholz, @chrisdavidmills, and @exe-boss have to say since they’re either deeply involved in BCD or have at least given me good advice in the past, but open to feedback from anyone!

exe-boss · February 12, 2019, 11:38pm

I’m in favour of just sticking everything in media.containers (and maybe also media.codecs). And then link to the relevant MDN page on the codec from the “Browser compatibility” or “See also” section.

sheppy · February 13, 2019, 6:02pm

Right, but given that the decision was previously made not to do it that way, I’m hoping for feedback on the PR I submitted under the assumption that this is how we will proceed with implementing things.

I personally agree that it would be easier and more efficient to have the media information in one place and let that be referenced by the documentation as need be. But right now, we’re evaluating the implementation that puts the information directly into the media elements and APIs that use media.
