Adding information to libosinfo
Some weeks back, Marc-Andre told me that it would probably help potential contributors if I wrote a blog post explaining how to add new information to libosinfo (the library Boxes relies on for information about various operating systems and their installation media), so here I am doing just that. Currently there are two types of information you can add: devices and operating systems. Usually it'll be the latter that you want to add (e.g. your favorite OS just made an awesome new release and you want libosinfo to know about it), but for the sake of completeness, I'll describe both.
Libosinfo keeps its information database in a bunch of XML files. In theory there could be just one XML file, but it would be huge and therefore very hard to edit and maintain, so we keep each OS distro and device class in its own XML file.
Libosinfo recursively traverses the following locations, assuming the application lets libosinfo load its default DB (which Boxes, at least, does):
- ${pkgdatadir}/libosinfo/db, where pkgdatadir is typically ${prefix}/share. This can be overridden at runtime by setting the OSINFO_DATA_DIR environment variable to the path of your custom DB.
- ${sysconfdir}/libosinfo/db, where sysconfdir typically is ${prefix}/etc or /etc.
- ${HOME}/.config/libosinfo/db
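To make the list above concrete, here is a minimal Python sketch that computes the three directories. It is only an illustration, not how libosinfo itself is implemented: the function name and the default values for prefix and sysconfdir are made up for the example, and it assumes that OSINFO_DATA_DIR, when set, is used as-is in place of the first location.

```python
import os
from pathlib import PurePosixPath


def candidate_db_dirs(prefix="/usr", sysconfdir="/etc", home=None, environ=None):
    """Return the three DB locations from the list above, in the order listed."""
    environ = os.environ if environ is None else environ
    home = environ.get("HOME", "") if home is None else home

    # Assumption: OSINFO_DATA_DIR, when set, replaces the first location as-is.
    pkg_db = environ.get("OSINFO_DATA_DIR") or str(
        PurePosixPath(prefix) / "share" / "libosinfo" / "db"
    )
    return [
        pkg_db,
        str(PurePosixPath(sysconfdir) / "libosinfo" / "db"),
        str(PurePosixPath(home) / ".config" / "libosinfo" / "db"),
    ]
```

Calling candidate_db_dirs(home="/home/me", environ={}) gives you the default trio of paths to look in when you wonder where a given XML file came from.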
The schema of these XML files is pretty straightforward, so just looking at the existing XML files under data/devices and data/oses in the libosinfo source tree will tell you almost everything you need to know about it.
Adding a new device
Before you do that, you'll need to gather the following data about the device in question:
- Type: Qemu or virtio. If it's not the latter, it's the former.
- Bus type: usually USB or PCI.
- Class: video, audio, block, input, net, watchdog, filesystem and memory.balloon are the currently recognised values.
- Vendor name and ID
- Device name and ID
(device id="http://pciids.sourceforge.net/v2.2/pci.ids/10ec/8029")
  (name)ne2k_pci(/name)
  (bus-type)pci(/bus-type)
  (class)net(/class)
  (vendor)Realtek Semiconductor Co., Ltd.(/vendor)
  (vendor-id)10ec(/vendor-id)
  (device)RTL-8029(AS)(/device)
  (device-id)8029(/device-id)
(/device)
The 'id' is created simply by combining the URL of the appropriate ID database (for PCI devices, http://pciids.sourceforge.net/v2.2/pci.ids, as in the example above) with the vendor and device IDs.
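If you have a few devices to add, a tiny helper saves some copy-paste mistakes. This is just a sketch of the rule described above; the function name is made up for the example.

```python
def make_device_id(db_url, vendor_id, device_id):
    """Join the ID database URL with the vendor and device IDs using slashes."""
    return "/".join([db_url.rstrip("/"), vendor_id, device_id])


# The ne2k_pci device from the example above.
NE2K_ID = make_device_id("http://pciids.sourceforge.net/v2.2/pci.ids", "10ec", "8029")
```

NE2K_ID comes out identical to the 'id' attribute used in the device example.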
Adding a new OS
This one is better explained by showing you some real examples:
(os id="http://fedoraproject.org/fedora/17")
  (short-id)fedora17(/short-id)
  (name)Fedora 17(/name)
  (version)17(/version)
  (vendor)Fedora Project(/vendor)
  (family)linux(/family)
  (distro)fedora(/distro)
  (codename)Beefy Miracle(/codename)
  (upgrades id="http://fedoraproject.org/fedora/16"/)
  (derives-from id="http://fedoraproject.org/fedora/16"/)
(/os)
The 'id' here could really be anything you like, but if you are adding a new variant or version of an OS to an existing file for the appropriate family, it's good to follow the conventions already used in that file. The same goes for 'short-id'. The 'upgrades' and 'derives-from' entries are optional. While the former is not really used for anything useful yet, the latter is meant to avoid some duplication.
The most common example of such duplication is the list of devices the OS in question supports out of the box. Notice that we didn't list any devices in the example above. The reason is not that Fedora 17 doesn't support any devices but that it inherits all device support from its parent and grandparents. To list devices supported by the OS, you add simple entries like this:
(os id="http://fedoraproject.org/fedora/17")
  ..
  (devices)
    (device id="http://pciids.sourceforge.net/v2.2/pci.ids/1b36/0100"/) (!-- QXL --)
    (device id="http://pciids.sourceforge.net/v2.2/pci.ids/8086/2415"/) (!-- AC97 --)
  (/devices)
(/os)
In this case, the 'id' attributes must match the ID of either an existing device in libosinfo's default database or a device you have added to your custom database. If your OS supports the devices listed above, for example, and you don't list them here (or under any parent OS), applications like Boxes might not add these devices to the virtual machines they create, and you'll end up with very crappy graphics and no sound in VMs created for that OS.
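To picture how device support is inherited through 'derives-from', here is a toy Python model. It is not libosinfo's actual code or data: the example.org OS IDs and the dictionary layout are invented for illustration, and only the two device IDs come from the example above.

```python
QXL = "http://pciids.sourceforge.net/v2.2/pci.ids/1b36/0100"
AC97 = "http://pciids.sourceforge.net/v2.2/pci.ids/8086/2415"

# Hypothetical database: OS 3 lists no devices of its own.
TOY_DB = {
    "http://example.org/toyos/1": {"devices": [QXL]},
    "http://example.org/toyos/2": {
        "derives-from": "http://example.org/toyos/1",
        "devices": [AC97],
    },
    "http://example.org/toyos/3": {"derives-from": "http://example.org/toyos/2"},
}


def supported_devices(os_id, db):
    """Collect device IDs for os_id, following derives-from links upward."""
    devices = []
    seen = set()  # guards against accidental derives-from cycles
    while os_id is not None and os_id not in seen:
        seen.add(os_id)
        entry = db[os_id]
        for dev in entry.get("devices", []):
            if dev not in devices:
                devices.append(dev)
        os_id = entry.get("derives-from")
    return devices
```

Asking for the devices of the third toy OS returns both AC97 (from its parent) and QXL (from its grandparent), even though its own entry lists nothing.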
Another important piece of information is resource requirements and recommendations. It's rather straightforward as well:
(os id="http://fedoraproject.org/fedora/17")
  ..
  (resources arch="all")
    (minimum)
      (n-cpus)1(/n-cpus)
      (ram)671088640(/ram)
      (storage)94371840(/storage)
    (/minimum)
    (recommended)
      (cpu)4000000000(/cpu)
      (ram)1207959552(/ram)
      (storage)9663676416(/storage)
    (/recommended)
  (/resources)
(/os)
The 'arch' attribute is usually just 'all', unless the OS in question has different requirements or recommendations for different architectures. The unit for cpu is Hz; the unit for ram and storage is bytes.
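Since raw byte and Hz counts are easy to get wrong by a factor of 1024, a few small helpers can produce the numbers for you. This is only a convenience sketch with made-up helper names; the values below are the ones from the example above, which happen to be exact multiples of binary units.

```python
MIB = 1024 ** 2   # bytes in one mebibyte
GIB = 1024 ** 3   # bytes in one gibibyte
GHZ = 10 ** 9     # Hz in one gigahertz


def mib(n):
    """Return n mebibytes expressed in bytes."""
    return n * MIB


def gib(n):
    """Return n gibibytes expressed in bytes."""
    return n * GIB


def ghz(n):
    """Return n gigahertz expressed in Hz."""
    return n * GHZ


MIN_RAM = mib(640)          # 671088640
REC_RAM = mib(1152)         # 1207959552
REC_STORAGE = gib(9)        # 9663676416
REC_CPU = ghz(4)            # 4000000000
```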
One last piece of information you really want to add is about the installation and live media. While in the future we might use it for things like presenting downloadable OSs in Boxes (and other apps), for now we use this information mainly to detect the OS (along with other properties) from a given medium (ISO, USB stick or CD-ROM). Here is what that looks like:
(os id="http://fedoraproject.org/fedora/17")
  ..
  (media arch="x86_64")
    (url)http://download.fedoraproject.org/pub/fedora/linux/releases/16/Fedora/x86_64/iso/Fedora-16-x86_64-DVD.iso(/url)
    (iso)
      (volume-id)Fedora 16 x86_64 (DVD|Disc)(/volume-id)
      (system-id)LINUX(/system-id)
    (/iso)
    (kernel)isolinux/vmlinuz(/kernel)
    (initrd)isolinux/initrd.img(/initrd)
  (/media)
  (media arch="i686" live="true")
    (url)http://download.fedoraproject.org/pub/fedora/linux/releases/16/Live/i686/Fedora-16-i686-Live-Desktop.iso(/url)
    (iso)
      (volume-id)Fedora-16-i686-Live(-KDE)?(/volume-id)
      (system-id)LINUX(/system-id)
    (/iso)
    (kernel)isolinux/vmlinuz0(/kernel)
    (initrd)isolinux/initrd0.img(/initrd)
  (/media)
(/os)
The 'live' attribute marks (as you guessed) media that can simply be booted so the user can try the OS without having to install it first. If the media in question does not provide an installer at all, you should explicitly set the 'installer' attribute to 'false'.
The data under the 'iso' element is what enables us to detect the media. You can get this information from a medium using the `isoinfo -d -i /path/to/iso/or/devicenode` command. I should make it clear at this point that the values of the 'volume-id' and 'system-id' nodes are not exact copies of the actual volume and system IDs but rather regular expressions.
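Before committing a new 'iso' entry, it can help to check your regular expressions against what isoinfo reports. The Python sketch below is one way to do that. It makes a few assumptions: the sample text is hand-written rather than real isoinfo output, the "Volume id" and "System id" labels are taken to be how isoinfo prints those fields, and Python's re.fullmatch is used for matching, which may not behave exactly like the regex matching libosinfo does internally.

```python
import re

# Hand-written stand-in for the relevant lines of `isoinfo -d` output.
SAMPLE = """\
System id: LINUX
Volume id: Fedora 16 x86_64 DVD
"""


def parse_isoinfo(text):
    """Extract the 'Volume id' and 'System id' fields from isoinfo-style text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("Volume id", "System id"):
            fields[key.strip()] = value.strip()
    return fields


def media_matches(fields, volume_pattern, system_pattern):
    """Return True if both IDs match their patterns in full."""
    volume_ok = re.fullmatch(volume_pattern, fields.get("Volume id", ""))
    system_ok = re.fullmatch(system_pattern, fields.get("System id", ""))
    return volume_ok is not None and system_ok is not None
```

With the sample above, the DVD pattern from the example, "Fedora 16 x86_64 (DVD|Disc)", matches, while the live media pattern, "Fedora-16-i686-Live(-KDE)?", does not.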
If you are adding this information to libosinfo's default database and hope to contribute it upstream, we'd very much like you to add it to our tests as well (you don't want us to break support for your favourite OS at some point, do you?). It's very easy: you just put the output of the isoinfo command I mentioned into a file named $FILENAME_OF_YOUR_ISO.txt under test/isodata/$DISTRO/$SHORT_ID_OF_OS/ in the source directory.
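As a small illustration of that naming scheme, this sketch builds the path of the test data file. The function name is made up, and it assumes $FILENAME_OF_YOUR_ISO means the full ISO file name including its .iso extension; check the existing files under test/isodata before relying on that.

```python
from pathlib import PurePosixPath


def isodata_path(distro, short_id, iso_filename):
    """Return the test data path for an ISO, relative to the source directory."""
    # Assumption: the .txt suffix is appended to the complete ISO file name.
    return str(PurePosixPath("test/isodata") / distro / short_id / (iso_filename + ".txt"))


EXAMPLE_PATH = isodata_path("fedora", "fedora16", "Fedora-16-x86_64-DVD.iso")
```

You would then redirect the isoinfo output for that ISO into the file at EXAMPLE_PATH.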
As you probably guessed, the 'kernel' and 'initrd' elements are completely optional and you only need to specify them for Linux-based operating systems. If you are adding information about a proprietary OS, you will probably also need to skip the 'url' element.
That's it! Happy hacking!
Comments
The syntax looks very similar to S-expressions but with the verbosity of XML. Is there a reason why you do:
(os) ... (/os)
instead of
(os
...
)
?
I knew someone would point that out. :) The reason is that I was already frustrated with the inability to paste verbatim XML and didn't want to waste more time on it, so I mostly just filtered all the XML I pasted through `sed -e 's/</(/g' -e 's/>/)/g'`.
Since this is not going to work for proprietary OSs (as you said yourself) and given the amount of work (not to mention fights) involved in the alternative you are asking for, I must ask what is wrong with maintaining a centralized database?
Anyways, we are more than happy to drop most of our data and help you in any way we can if you could kickstart this ambitious (IMO) project.
but here is some really useful information:
http://dcos.net/projects/FOSS-TREE--ISO-PVDs--dcos.net-private-archive-nov-2013.text
ps. i dont use boxes. i use VMM.
UPDATED 2016!:
http://dcos.net/projects/FOSS-TREE--ISO-PVDs--dcos.net-private-archive-april-2016.text