In the first piece on open source auditing, I demonstrated the need for an open source audit for companies that are using any open source software and what you can expect out of an audit. But we've yet to go into detail regarding how an open source audit works. This time, I'd like to provide insight into how OpenLogic performs an open source software audit and how we train our customers to perform their own audits using our scanning tools. These tips will help you ensure a successful audit whether doing it yourself with scanning tools or using an outside audit vendor such as OpenLogic.
Let's start with preparing for the audit. Conducting an open source audit requires preparation, just like any other form of audit. Here, though, you won't be gathering receipts and expense reports; you'll be gathering the source code for the software that's in use.
You should start by getting a list from your developers of all of the open source software they have used. As discussed in Part 1, this list is likely to be incomplete, but it will help you as you proceed with the audit. Next, you need to get together the code for any software that you're building and distributing or are likely to distribute. You should include all code that will be distributed as part of the product or with the product, whether modified or not. Remember that you're not only looking for software that may have problematic licensing or compliance issues. You also want a full assessment of the software you depend on, so you can confirm there are no other business concerns.
You can exclude anything that is not shipped or distributed. This means, for example, that if you're using GCC to build your software, it's not necessary to audit GCC or your build system unless you're also distributing them. Note that you may need to supply custom tools to comply with some licenses (like the GPL) if they're required to build the final package.
Next, you need to get your team together to participate in the audit. This is crucial, because you want the right people with the necessary expertise involved.
First, you're going to need one or more members of the engineering team who can answer the technical questions: how components are linked, how the build environment works, which specific versions of software are in use, and so on.
You will also need legal expertise as part of the audit. If your organization has an internal legal team, choose someone from that team who is familiar with open source software licensing. If your business doesn't have a legal team, or your legal team is not familiar with open source licensing, you may need to engage outside counsel.
You'll also want representation from someone in management who understands the business issues and can set the expectations for the results of the audit. Maybe the primary focus of the audit is license compliance. Maybe the primary focus is on discovering exactly which technologies are in use and how to ensure policy compliance. More likely, it is a mix of all of the above. An audit may uncover issues where there is no clear black-and-white answer, so your management representative should work with legal to assess the company's risk profile in order to make decisions about how to respond to the results of the audit.
Once you have your team put together, you can begin the audit. As mentioned in Part 1, you have the option to audit source code yourself or to hire a vendor to do the audit for you as a service. In either case, you or the vendor will likely want to make use of automated scanning tools. One thing that many teams want to know is how automated scanning tools work, since it's a bit different than self-reporting or doing an audit "by hand" by just looking through source code.
Most scanning tools use a number of methods to see if code in your product matches known open source projects or libraries. A scanner does this by comparing your code to a large repository of "fingerprints" of hundreds of thousands of known open source projects or libraries. No repository is going to have everything, but the size of a tool's "fingerprint repository" is one factor in the completeness of the audit.
Another important factor is the set of techniques used to find the open source code. For example, at OpenLogic, the scanning tools that we sell and use in our audit services can identify open source that is used regardless of whether it is an entire project, a single file or a snippet of source code. The tools can even detect source that has been modified, for example by removing file headers or by deleting or changing code. Our tools will search for things like the names of files in the project, pathnames, license text, or names in source code. In addition, the tools look for hash codes for files and check whether blocks of source code match known projects.
The flip side of trying to find all possible places where your code contains open source code is ensuring that there are as few false positives as possible. Because open source projects often reuse libraries and code from other open source projects, automated scanning tools may not be able to perfectly identify the original provenance of open source code. As a result, some scanners can produce a large number of matches, many of which are incorrect or redundant. Too many false positives means a lot of wasted time in reviewing and understanding the scan results. The scanning tools we use at OpenLogic help address these issues by using a variety of "noise reduction" techniques that help you zero in on the correct matches.
When doing a scan, you may also run into situations where some code may not be licensed at all or may not have an obvious license. The reverse is true as well: scanning may turn up licenses that don't seem to be assigned to code at all. In other cases, you can find multiple licenses within an open source project that are in conflict with one another; that is, you cannot meet the requirements of both licenses. These situations will require additional research and investigation to determine the licensing for the code. In some cases you may want to contact developers from the original project to clarify the licensing.
If you're using a service or vendor to assist with the audit, you'll want to know about the tools and process that vendor follows. You may have the option to do an audit at your business location or to provide the code to the vendor to complete the audit in their offices. In the case of an audit done at the vendor location, you'll want to understand the "chain of custody" for handling the code while it's being examined. If the audit was triggered by an external event, like an acquisition or a request to provide results to an OEM partner, you'll also want to specify who should see the results of the audit. For example, in the case of an acquisition, you may want the acquiring company to receive a copy of the audit as well. Lastly, if you are using a service provider for the audit, you'll want to understand what warranty or indemnification is provided to back up the report.
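To make the file-hash and license-text matching described above a little more concrete, here is a minimal sketch in Python. It is not how OpenLogic's tools (or any particular scanner) are implemented. The fingerprint index, the project name "examplelib 1.0", the marker phrases and the function names are all assumptions made up for illustration. It only does exact whole-file hash matching, so unlike the tools described above it would not catch snippets or modified files.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest, used here as a whole-file 'fingerprint'."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical fingerprint index: digest of a known file -> project name.
# A real fingerprint repository would cover hundreds of thousands of projects.
KNOWN_FILE = b"int add(int a, int b) { return a + b; }\n"
FINGERPRINTS = {sha256_of(KNOWN_FILE): "examplelib 1.0"}

# Illustrative phrases that suggest license text is present in a file.
LICENSE_MARKERS = {
    "GNU General Public License": "GPL family",
    "Permission is hereby granted, free of charge": "MIT-style",
}


def scan_tree(root: Path) -> list[dict]:
    """Report files that match a known fingerprint or contain license text."""
    findings = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        project = FINGERPRINTS.get(sha256_of(data))  # exact-match only
        text = data.decode("utf-8", errors="replace")
        licenses = sorted(
            label for marker, label in LICENSE_MARKERS.items() if marker in text
        )
        if project or licenses:
            findings.append(
                {
                    "file": path.relative_to(root).as_posix(),
                    "project": project,
                    "licenses": licenses,
                }
            )
    return findings


def demo() -> list[dict]:
    """Build a tiny throwaway source tree and scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "vendor").mkdir()
        (root / "vendor" / "add.c").write_bytes(KNOWN_FILE)
        (root / "util.c").write_text(
            "/* Permission is hereby granted, free of charge */\n"
        )
        (root / "main.c").write_text("int main(void) { return 0; }\n")
        return scan_tree(root)
```

Note that `util.c` in the demo is flagged for license text without matching any known project. That is the kind of finding the article says needs follow-up research rather than an automatic conclusion.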
Once the scanning and discovery phase of the audit is completed, it's time to sort through the licenses and determine what terms are triggered. This is where the legal team is going to need to look over the licenses that have been identified and discuss with engineers how the projects are used. When analyzing a license, you can break the license down into a series of "if-then" statements. For example, a license may include something like the following: if you distribute this open source software, then you must also distribute a copy of the license. Your legal team and development experts can then determine if you are using the open source software in a way that triggers a particular requirement.
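The "if-then" breakdown can be sketched as a small rule table. The license name "Example-License", the trigger labels and the second rule are hypothetical and simplified for illustration. Only the first rule mirrors the example in the text, and none of this is a substitute for reading the actual license with your legal team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obligation:
    trigger: str      # the "if" part: a fact about how the software is used
    requirement: str  # the "then" part: what the license asks for


# Hypothetical, simplified rules for an imaginary license.
RULES = {
    "Example-License": [
        Obligation("distributed", "distribute a copy of the license"),
        Obligation("modified", "mark modified files as changed"),  # made-up rule
    ],
}


def triggered_requirements(license_name: str, usage: set[str]) -> list[str]:
    """Return the requirements whose trigger appears in the usage facts."""
    return [
        rule.requirement
        for rule in RULES.get(license_name, [])
        if rule.trigger in usage
    ]
```

For instance, `triggered_requirements("Example-License", {"distributed"})` returns only the license-copy requirement, because nothing in the usage facts says the code was modified. The usage facts are what engineering supplies, and the rule table is what legal supplies.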
Once you know that a license obligation applies to your particular use of the software, you must determine how to fulfill the obligation. Sometimes, the devil really is in the details. For example, you might know you need to distribute the source code, but the license may require the source code to be provided in a particular way. Making the source code available, but failing to do so in the way dictated by the license, may still be considered non-compliance. In some cases, the meaning of certain clauses may be open to interpretation.
Although many open source licenses are fairly straightforward, they also present their own set of challenges. Most open source licenses were not written by attorneys and do not track typical statutory or contract language. Some license requirements are triggered by particular engineering scenarios, requiring both a legal and a developer perspective to ascertain the meaning. Although lawsuits have been filed regarding compliance, almost all have settled. Consequently, we have no judicial opinions regarding the interpretation of the more vexing license compliance issues.
There may also be a variety of opinions from the open source ecosystem on particular interpretations or expectations about what a license means. The FSF, for example, has offered a lot of guidance on the GPL licenses, but that's no guarantee that a court will agree with the FSF's interpretation. However, understanding the viewpoints of a particular community that holds copyrights on the open source code is still an important consideration for license compliance.
After the audit is completed, you'll have the information about what open source projects are used and the applicable licenses. You'll also have analyzed which obligations in the license are applicable based on your use case. As the last step, you'll need to determine whether you are in compliance with each of the license obligations and whether the use of open source is in keeping with company policies.
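As a rough illustration of that last step, the policy side of the review can be sketched as a simple sort of audit findings into those that already fit company policy and those that need a closer look. The component names, license labels and the idea of a flat "approved" list are assumptions for the example. Real policies usually depend on how each component is used, not only on its license.

```python
def policy_review(findings: dict[str, str], approved: set[str]) -> dict[str, list[str]]:
    """Split audited components by whether their license is on the approved list.

    findings maps a component name to the license identified in the audit.
    """
    report: dict[str, list[str]] = {"approved": [], "needs_review": []}
    for component, license_name in sorted(findings.items()):
        bucket = "approved" if license_name in approved else "needs_review"
        report[bucket].append(component)
    return report


# Hypothetical audit output and policy, for illustration only.
example_findings = {
    "libfoo": "MIT",
    "libbar": "GPL-3.0",
    "libbaz": "unknown",
}
example_report = policy_review(example_findings, approved={"MIT", "BSD-3-Clause"})
```

Anything that lands in `needs_review`, including components with no identifiable license, goes back to the legal and engineering members of the audit team.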
As you begin to audit your software for open source, you will quickly realize that your audits will go more smoothly if you put in place some basic compliance processes.
Companies embarking on an open source audit may find that there's a lot of unfamiliar territory, but this should not inhibit or dissuade companies from using open source. It's simply a good idea to be aware of what the use of open source in a business entails, and to handle it responsibly. Just as you need to have a process to comply with the terms of proprietary software licenses, you also need to have a process to comply with open source licenses. An effective audit process will help achieve this.