OK, I've done a rough pass at the JSON structure for the data. Here's what I have; feedback is appreciated. The next step will probably be to actually create the files in the repos and start putting data in to see how it goes.
As before, there are four files: two in mdn/data and two in mdn/browser-compat-data. Those files are:
- mdn/data
  - MediaContainers.json: List of media container formats that are or may be available on the web
  - MediaCodecs.json: List of media codecs that are or may be available
- mdn/browser-compat-data
  - ContainerSupport.json: For each browser, information on which container formats they support
  - CodecSupport.json: For each browser, information about codecs they support and the containers that they're allowed in
In all cases, these formats are set up to be expandable so we can add more details about each format as we feel the need to do so.
Container format info: MediaContainers.json
The file MediaContainers.json provides information about the set of container formats (file types) that may be supported by browsers. The JSON looks like this:
{
  "container-id": {
    "short_name": string,
    "full_name": string,
    "specs": url or array of urls,
    "license": {
      "id": spdx license identifier,
      [optional] "notes": string or array of strings
    },
    "audio": {
      "mimetypes": array of strings,
      "extensions": array of strings,
      "codecs": {
        "codec-id1": boolean or {
          [optional] "mimetypes": array of strings,
          [optional] "extensions": array of strings
        },
        ...
        "codec-idN": {
          ...
        }
      }
    },
    "video": {
      ... same structure as "audio"
    },
    [optional] "preferred_codecs": preferred-codecs-descriptor,
    [optional] "notes": string or array of strings
  },
  ... more containers until all described
}
- The license.id is an ID from the SPDX license list at https://spdx.org/licenses/
- Each container has two parts, audio and video, to provide any details for each of these, including the codecs permitted by the spec for each. There is room to add more details here over time.
- The codecs lists contain one entry for each codec supported by the spec; if a codec isn't supported, it should not be in the list. Each entry's key is the ID of a codec from the MediaCodecs.json file, and the value is either the boolean true (a value of false should not be used) or an object providing additional details about the codec's capabilities when used with that container.
- Each container has a list of MIME types and filename extensions. Each codec can optionally have its own lists, which override the container's defaults when that codec is used. For example, the normal MIME type for an MPEG audio file is audio/mpeg, but if the codec is MPEG-1 Audio Layer III, the MIME type becomes audio/mp3 and the extension becomes mp3.
- The video block is structurally identical to the audio one, at least for now. If we start adding more details, like whether or not VBR is allowed or whether the container supports HDR, then they'll start to diverge.
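To make the override rule concrete, here's a minimal sketch in Python of how a consumer might resolve the MIME types or extensions for a codec inside a container. The resolve() helper, the "mpeg" and "mp2" IDs, and the default extension list are all hypothetical and only there for illustration; the audio/mpeg and audio/mp3 values come from the example above.

```python
# Hypothetical MediaContainers.json fragment; IDs and default extensions are
# illustrative assumptions, not real data.
MEDIA_CONTAINERS = {
    "mpeg": {
        "short_name": "MPEG",
        "audio": {
            "mimetypes": ["audio/mpeg"],
            "extensions": ["mpeg"],
            "codecs": {
                # True: the codec is allowed and inherits the container defaults.
                "mp2": True,
                # Object: the codec overrides the container defaults.
                "mp3": {"mimetypes": ["audio/mp3"], "extensions": ["mp3"]},
            },
        },
        "video": {"mimetypes": [], "extensions": [], "codecs": {}},
    }
}


def resolve(containers, container_id, kind, codec_id, field):
    """Return the `field` list ("mimetypes" or "extensions") for a codec.

    A codec-level list overrides the container-level default. A codec that is
    not listed raises KeyError, since unlisted codecs are not supported.
    """
    block = containers[container_id][kind]
    entry = block["codecs"][codec_id]
    if isinstance(entry, dict) and field in entry:
        return entry[field]
    return block[field]


print(resolve(MEDIA_CONTAINERS, "mpeg", "audio", "mp3", "mimetypes"))  # ['audio/mp3']
print(resolve(MEDIA_CONTAINERS, "mpeg", "audio", "mp2", "mimetypes"))  # ['audio/mpeg']
```

Raising on an unlisted codec is just one possible choice; a real consumer might prefer to return an empty list instead.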
preferred-codecs-descriptor
A container definition can optionally include a preferred_codecs object that specifies the best, most common, or ideal codecs to use inside that container. Its value, a preferred-codecs-descriptor, is an object like this:
{
  "audio_only": string,
  "audio_video": {
    "audio": string,
    "video": string
  }
}
- audio_only: The codec-id of a codec that's preferred for audio-only media inside this container
- audio_video: An object specifying the preferred audio and video codecs for this container; applies to both video-only and audio-video media.
The idea here is that some containers have codecs that are most commonly used, or are generally considered the best choice. This block would let those codecs be identified (by their codec ID string) for both audio-only and audio/video files.
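As a rough sketch of how a consumer might read this block (Python, with a hypothetical preferred() helper and placeholder codec IDs that don't refer to real codecs):

```python
# Hypothetical container record; the codec IDs are placeholders.
CONTAINER = {
    "short_name": "Example",
    "preferred_codecs": {
        "audio_only": "codec-a",
        "audio_video": {"audio": "codec-b", "video": "codec-v"},
    },
}


def preferred(container, has_video):
    """Return the preferred codec IDs for a container, or None if unspecified.

    For media with a video track, the audio_video object is returned as-is.
    For audio-only media, the result is {"audio": <codec-id>}.
    """
    prefs = container.get("preferred_codecs")
    if prefs is None:
        return None
    if has_video:
        return prefs.get("audio_video")
    audio = prefs.get("audio_only")
    return {"audio": audio} if audio is not None else None


print(preferred(CONTAINER, has_video=False))  # {'audio': 'codec-a'}
print(preferred(CONTAINER, has_video=True))   # {'audio': 'codec-b', 'video': 'codec-v'}
```

This assumes either field of the descriptor may be missing, which the proposal doesn't actually say one way or the other.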
Codec info: MediaCodecs.json
The MediaCodecs.json file provides lists of the audio and video codecs that may be available on the web. The file has two lists: one of audio codecs and one of video codecs; each codec object has as its key a unique identifying string.
The records use a few secondary objects, which are described after the primary objects:
- compression-descriptor: An object providing basic information about the codec's compression
- best-for-value: A string or array of strings indicating types of content the codec is good for; see the list farther down for what these values can be
- license-id: A license ID string taken from the SPDX license list at https://spdx.org/licenses/
- spec-descriptor: An object providing information about a specification
Main content
{
  "audio": {
    "codec-id1": {
      "short_name": string,
      "long_name": string,
      "compression": compression-descriptor,
      [optional] "best_for": one or more of best-for-value,
      "max_bitrate": number,
      "min_channels": integer,
      "max_channels": integer,
      "license": {
        "id": spdx license ID string,
        [optional] "notes": string or array of strings
      },
      "specs": array of spec-descriptor,
      [optional] "notes": string or array of strings
    },
    "codec-id2": ...
  },
  "video": {
    "codec-id1": {
      "short_name": string,
      "long_name": string,
      "compression": compression-descriptor,
      [optional] "max_width": integer,
      [optional] "max_height": integer,
      "max_bitrate": number,
      [optional] "framerates": array of number,
      -- OR --
      [optional] "min_framerate": number,
      [optional] "max_framerate": number,
      "license": {
        "id": spdx license ID string,
        [optional] "notes": string or array of strings
      },
      "specs": array of spec-descriptor,
      [optional] "notes": string or array of strings
    },
    "codec-id2": ...
  }
}
- The audio object lists all of the audio codecs available to the docs; the video object does the same for video codecs.
- The codec-id values are strings based on the codec's short name, with no punctuation and all lower case. A codec named H.265 would be given the ID h265, for instance.
- A codec's short_name is an abbreviation with appropriate capitalization and punctuation, suitable for human reading, such as "MP3" or "H.265". The long_name is the full name, such as "MPEG-1 Audio Layer III" for MP3.
- max_bitrate is the maximum bitrate of media using the codec, in bits/sec.
- max_channels and min_channels indicate the largest and smallest number of audio channels supported by an audio codec.
- Video codecs may include information about supported frame rates, either as an array of the specific frame rates that are permitted, called framerates, or as a minimum and maximum frame rate (min_framerate and max_framerate) in FPS.
- To indicate that a codec has no restriction for a given property, set its value to false. For example, if a codec specifies no maximum width, set max_width to false.
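Two of the rules above are easy to sketch in Python: deriving a codec-id from a short name, and treating false as "no restriction". Both helper names are hypothetical, and stripping spaces along with punctuation is my assumption; the proposal only mentions punctuation and case.

```python
import re


def codec_id(short_name):
    """Derive a codec ID: lower-case the short name and drop everything
    except ASCII letters and digits (assumption: spaces are dropped too)."""
    return re.sub(r"[^a-z0-9]", "", short_name.lower())


def within_limit(value, limit):
    """Check a value against a max_* property.

    A limit of False means the codec imposes no restriction. `is False` is
    used so that a numeric limit of 0 is not mistaken for "no restriction".
    """
    if limit is False:
        return True
    return value <= limit


print(codec_id("H.265"))          # h265
print(codec_id("MP3"))            # mp3
print(within_limit(7680, False))  # True
print(within_limit(4096, 1920))   # False
```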
compression-descriptor
An object providing information about the compression used by a codec.
{
  "lossy": boolean,
  [optional] "ratio_min": number,
  [optional] "ratio_typical": number,
  [optional] "ratio_max": number,
  [optional] "notes": string or array of strings
}
The only required value is lossy, which is true if the codec's compression is lossy. The ratio_* values provide hints as to how much the codec typically compresses data. These values are only estimates of the minimum, typical, and maximum compression ratios.
Need to define how the ratios are calculated
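Until the ratio calculation is pinned down, a simple sanity check on the descriptor's shape is still possible. Here's a sketch in Python; check_compression() is a hypothetical helper, and the ordering check (min <= typical <= max) is my assumption rather than something the proposal states.

```python
RATIO_KEYS = ("ratio_min", "ratio_typical", "ratio_max")


def check_compression(desc):
    """Return a list of problems found in a compression-descriptor dict."""
    problems = []
    if not isinstance(desc.get("lossy"), bool):
        problems.append("'lossy' is required and must be a boolean")

    ratios = []
    for key in RATIO_KEYS:
        if key not in desc:
            continue  # all ratio_* fields are optional
        value = desc[key]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"'{key}' must be a number")
        else:
            ratios.append(value)

    # Assumption: whichever ratios are present should be in ascending order.
    if ratios != sorted(ratios):
        problems.append("expected ratio_min <= ratio_typical <= ratio_max")
    return problems


print(check_compression({"lossy": True, "ratio_typical": 10}))  # []
```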
best-for-value
A string or array of strings identifying the type or types of content that the codec is best for. Possible values:
- Audio-only values
- Video-only values
  - animation
  - action
  - bright
  - dark
- Allowed for both audio and video
spec-descriptor
Describes a single specification. Many containers and/or codecs are defined in part across multiple specifications, so the specs property is an array of spec-descriptor objects.
{
  "name": string,
  "url": url of specification
}
Containers supported by each browser: ContainerSupport.json
This file, which lives in BCD, contains a group of objects, one for each browser, each identified by the same string we use in BCD's existing data.
{
  "browser-name1": {
    "audio": {
      "container-id1": boolean or {
        [optional] "notes": string or array of strings,
        // Details about supported container features
        // could go here eventually
      },
      ...
      "container-idN": boolean or {
        [optional] "notes": string or array of strings,
        // Details about supported container features
        // could go here eventually
      }
    },
    "video": {
      "container-id1": boolean or {
        [optional] "notes": string or array of strings,
        // Details about supported container features
        // could go here eventually
      },
      ...
      "container-idN": boolean or {
        [optional] "notes": string or array of strings,
        // Details about supported container features
        // could go here eventually
      }
    }
  },
  "browser-name2": ...
}
- For each browser, we separately list the containers supported for audio and video files. Each container support object that's present indicates that the browser supports that container; each container is identified by the same container ID used in the MediaContainers.json file.
- If the value for a container is a boolean, it simply indicates whether or not the container is known by the browser. This is the minimum information that can be given.
- Eventually, though, a container ID's value could be an object with additional information, with fields for things like notes and which of a container's features are supported, along with any restrictions or compatibility issues.
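Here's a minimal sketch, in Python, of how a lookup against this file could treat the "boolean or object" value. The container_supported() helper, the browser name, and the sample entries are all hypothetical.

```python
# Hypothetical ContainerSupport.json fragment; names and values are illustrative.
CONTAINER_SUPPORT = {
    "browser-a": {
        "audio": {
            "mpeg": True,
            "ogg": {"notes": "Illustrative note"},
            "wav": False,
        },
        "video": {"mp4": True},
    }
}


def container_supported(support, browser, kind, container_id):
    """Return True if the browser is recorded as supporting the container.

    A boolean value is returned as-is; an object value counts as support,
    since a support object only exists for supported containers. Anything
    missing (browser, kind, or container) counts as unsupported.
    """
    entry = support.get(browser, {}).get(kind, {}).get(container_id, False)
    if isinstance(entry, bool):
        return entry
    return True


print(container_supported(CONTAINER_SUPPORT, "browser-a", "audio", "ogg"))  # True
print(container_supported(CONTAINER_SUPPORT, "browser-a", "audio", "wav"))  # False
```

Treating a missing entry as unsupported rather than unknown is a simplification; BCD itself distinguishes the two.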
Codecs supported by each browser: CodecSupport.json
This file lists, for each browser, the audio and video codecs it supports and, for each codec, the containers in which that codec can be used.
{
  "browser-name1": {
    "audio": {
      "codec-id1": boolean or {
        "containers": array of container-ids,
        [optional] "notes": string or array of strings,
        // eventually things like what features work, etc
      },
      ...
      "codec-idN": boolean or {
        "containers": array of container-ids,
        [optional] "notes": string or array of strings,
        // eventually things like what features work, etc
      }
    },
    "video": {
      "codec-id1": boolean or {
        "containers": array of container-ids,
        [optional] "notes": string or array of strings,
        // eventually things like what features work, etc
      },
      ...
      "codec-idN": boolean or {
        "containers": array of container-ids,
        [optional] "notes": string or array of strings,
        // eventually things like what features work, etc
      }
    }
  },
  "browser-name2": ...
}
- Each browser has a sub-object, audio, listing the audio codecs, and another, video, listing video codecs. Currently the formats of these are the same, but that will change as details like supported features are added.
- The codecs are identified by the same ID string used elsewhere. Those IDs are used as the keys for objects providing information about the browser's support for each codec.
- Each codec record has a containers property, which is an array of the container IDs for the containers the browser is able to use the codec in. For instance, if a browser supports MPEG-1 Audio Layer III in MPEG/MPEG2 files, MP4 files, and ubermovie[^1] formats:
"containers": [ "mpeg", "mp4", "uber" ]
[^1]: That's not a real movie format.
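To round this out, here's a sketch in Python of reading the containers list for a codec. The codec_containers() helper, the browser name, and the "codec-x" ID are hypothetical; the mp3 container list is the example above.

```python
# Hypothetical CodecSupport.json fragment; "uber" is still not a real format.
CODEC_SUPPORT = {
    "browser-a": {
        "audio": {
            "mp3": {"containers": ["mpeg", "mp4", "uber"]},
            "codec-x": True,  # supported, but no container details recorded
        },
        "video": {},
    }
}


def codec_containers(support, browser, kind, codec_id):
    """Return the container IDs a browser can use a codec in.

    Returns [] if the codec is unsupported or not listed, and None if the
    codec is marked supported with a bare True (containers not recorded).
    """
    entry = support.get(browser, {}).get(kind, {}).get(codec_id, False)
    if entry is False:
        return []
    if entry is True:
        return None
    return list(entry.get("containers", []))


print(codec_containers(CODEC_SUPPORT, "browser-a", "audio", "mp3"))  # ['mpeg', 'mp4', 'uber']
print(codec_containers(CODEC_SUPPORT, "browser-a", "audio", "codec-x"))  # None
```

Returning None for a bare true keeps "supported, containers unknown" distinct from "not supported", which seemed worth preserving.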