The days of endlessly grepping source code for license and copyright information may be coming to a close. If the new Software Package Data Exchange (SPDX) project becomes popular, such information will be commonly available in a standard format that can be read by developers and business executives alike. Currently in late beta, the first version of the SPDX specification is scheduled for release next month.
What's the need for SPDX? To start with, as Kate Stewart, a project coordinator and leader of the SPDX technical team, observes, many commercial products require an accurate bill of materials. When free software is bundled with products, developers must describe the contents accurately to avoid combining incompatible licenses and to ensure that the copyright notices required by each license are included in the product. In the case of software licensed under the GNU General Public License and other copyleft licenses, companies also need to know what source code they are obligated to supply, so that they can decide how they will provide it.
These requirements can be onerous by themselves, especially when a company lacks experience with free software. Difficulties increase when a bundle carries one license and the software it contains carries individual licenses of its own. Just knowing the bundle's license may not be enough, says Kim Weins, vice president of marketing at OpenLogic and leader of the SPDX business team.
To make matters worse, Weins says, additional software can be added all along the supply chain. For example, a chip set provider, device developer, and distributor may all add software contributions to a product. This complexity means not only that some software can be missed – especially when some providers record the information in an external spreadsheet and others in a text file – but also that the same audits need to be done at every step in the supply chain.
"SPDX is designed to streamline the process," Weins says. In theory, by providing consistent formats, SPDX should make the task of auditing free software easier and more efficient.
The SPDX project represents an alliance of common interests. Since at least 2007, people such as Karen Copenhaver of The Linux Foundation and Esteban Rockett and John Ellis of Motorola have discussed the legal need for such a specification. At the same time, Kate Stewart noticed the need for a common format for embedded development while working at embedded software developer Freescale and talking with colleagues at other companies. Corporate representatives in FOSSBazaar, a community concerned with the governance of free and open source software in business, also recognized the need.
Following discussions at LinuxCon in 2009, FOSSBazaar launched SPDX as a work group in early 2010, coordinated by Phil Odence of Black Duck Software and Kate Stewart. Soon after, the group gained the support of the Linux Foundation Member Counsel.
The SPDX project comprises business, legal, and technical teams, each of which focuses on one perspective of the group effort. "The goal here is literally just to get people to write things up," Stewart says, and to make the creation of SPDX files an accepted part of packaging. SPDX has been coordinating its efforts with Debian and Red Hat to standardize the presentation of license and copyright information in their package formats. Other groups, including the Apache Software Foundation and the Open Source Initiative, have also expressed an interest in SPDX.
"The pressure for this is coming from the business side," Stewart says. "Others in the community, though, recognize the importance of this, and they've been helping out." Because participants perceive a similar need, Stewart says, SPDX development has involved little argument about the project's basic goals. So far, it has been mostly a discussion of details and standardization: what business, legal, and technical users are likely to need, and what must be stated explicitly and what can be inferred.
For example, one requirement was a standard list of short names for at least the most common free software licenses. This list was developed in consultation with Debian's efforts at standardizing this information to ensure compatibility.
Another concern is to create a web of trust and allow verification of review efforts. For this reason, in addition to fields for recording licenses and copyrights, the SPDX standard includes fields for listing the company and individual who created the SPDX file, as well as those who have reviewed it. In these areas, SPDX resembles a software package that lists its maintainer in a distribution's repository.
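To make the kinds of fields described above concrete, here is a minimal Python sketch of such a record. It is an illustration only: the class names, field names, and sample values are hypothetical and are not taken from the SPDX draft specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Reviewer:
    """A person who reviewed the record, and the organization they belong to."""
    name: str
    organization: str


@dataclass
class SpdxRecord:
    """Hypothetical, simplified stand-in for an SPDX file's core fields."""
    package_name: str
    declared_license: str          # a short license name, e.g. "MIT"
    copyright_text: str
    creator_person: str            # individual who created the record
    creator_organization: str      # company that created the record
    reviewers: List[Reviewer] = field(default_factory=list)

    def is_reviewed(self) -> bool:
        """True once at least one reviewer has been recorded."""
        return len(self.reviewers) > 0


# Illustrative values only.
record = SpdxRecord(
    package_name="example-lib",
    declared_license="MIT",
    copyright_text="Copyright (c) 2011 Example Corp.",
    creator_person="A. Developer",
    creator_organization="Example Corp.",
)
record.reviewers.append(Reviewer("B. Auditor", "Downstream Inc."))
```

Keeping the creator and the reviewers as separate fields is what lets a downstream company see who vouched for the information, which is the basis of the web of trust.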
During beta testing, participants have raised additional points of discussion. For example, does SPDX need separate fields to distinguish the original supplier of software from the entity that passed it along the supply chain? Are fields for a package checksum and verification code enough, or should SPDX support encryption in fields and digital signatures?
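As a rough illustration of what a package checksum and a verification code could look like, the Python sketch below hashes a whole archive and, separately, derives a single code from the hashes of a package's individual files. The function names and the choice of SHA-1 over sorted per-file hashes are assumptions made for this example, not details quoted from the draft specification.

```python
import hashlib
from typing import Dict


def package_checksum(archive_bytes: bytes) -> str:
    """SHA-1 of the packaged archive as a whole (assumed algorithm)."""
    return hashlib.sha1(archive_bytes).hexdigest()


def verification_code(files: Dict[str, bytes]) -> str:
    """One code derived from the contents of every file in the package.

    Assumed scheme: hash each file, sort the hex digests so the result
    does not depend on file order or names, then hash the concatenation.
    """
    file_hashes = sorted(hashlib.sha1(data).hexdigest() for data in files.values())
    return hashlib.sha1("".join(file_hashes).encode("ascii")).hexdigest()


# Hypothetical package contents.
files = {
    "src/main.c": b"int main(void) { return 0; }\n",
    "LICENSE": b"MIT License\n",
}
# Same files, listed in the opposite order.
reordered = dict(reversed(list(files.items())))
```

A checksum of the archive changes whenever the package is repacked, while a code built from file contents stays the same, so `verification_code(files)` and `verification_code(reordered)` agree. Neither value proves who produced the record, which is why digital signatures come up as a separate question.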
Some of the feedback from beta testing may find its way into the general release. Other parts may be delayed until the next version of SPDX. Stewart says, "The expectation is additional fields, rather than reworking what we've got."
At the heart of the general release will be the specification itself, which runs more than 60 pages in the latest draft. This standard can be implemented in an external spreadsheet or in a file written in SPDX's custom RDF format. Another part of the release will be a collection of tools: a Java-based SPDXViewer for the command line, SpreadsheetToRDF for converting an SPDX spreadsheet to an RDF file, RDFToSpreadsheet for converting RDF files to spreadsheets, and LicenseRDFaGenerator for converting a spreadsheet file to HTML pages.
Getting the general release out the door is only the first step for SPDX. The next step, even before planning for the next version of the standard, is to encourage and promote its use.
As Weins points out, the further upstream that SPDX is used, the more accurate it is likely to be and the easier to use. "It's important especially to think of the community because, as SPDX grows, we'd like to have the community that is the origin for most of the software kicking off the process by creating SPDX-formatted information about their code and licenses."
However, adoption of the new standard faces its own challenges. For instance, Weins worries that some companies may have trouble trusting the reviews listed in the SPDX files. At the same time, the fact that SPDX is more driven by business concerns than development issues may make some programmers resist adopting it. Faced with a new requirement like SPDX, Weins says, many programmers are likely to respond with some variation of, "'You know, I like this idea, but I'm concerned that it's going to cost me more work and time.' We don't see so much opponents as people who are concerned that, if it becomes a standard, it becomes a standard that they can use."
Still, SPDX has a jump start in its effort, since its participants represent several dozen of the largest companies working with free software. Using the standard may require some rethinking for everybody, but in the end, the hardest task the project faces may not be writing a new standard, but seeing it put to the purpose it was meant for.