What is open source?
By Paul Sawers
It’s difficult to overstate the role that open source plays in today’s technology-centric world, given that it intersects with just about every piece of software. Data from Synopsys, the company behind open source security management platform Black Duck, indicates that 98% of codebases contain at least some open source code.
But open source software is not only eating the world, as the popular expression goes, it’s also devouring the solar system — recent GitHub data showed that almost 12,000 GitHub developers contributed to the code that made the very first Martian helicopter flight possible on April 19, 2021. Chances are, though, most weren’t even aware that their contributions were used in the historic NASA mission.
But the path of open source is not always smooth to travel. The sheer number and variety of open source software packages make it difficult for even the largest of businesses to determine what’s best for their needs, while the myriad license types and associated commercial interests create friction and uncertainty. But even before we get to all these complexities, it’s worth taking a step back to establish what open source actually is.
So … what is open source?
“Open source” in its purest form refers to software that is made freely available for anyone to access, copy, modify, and redistribute. It’s ultimately a collaborative, community-led approach that lowers the bar to entry and the cost of building software. This can be contrasted with proprietary “closed source” software that is built internally by a commercial company, and which usually won’t be truly “free” in any sense of the word; it can’t be inspected, modified, or redistributed, and it will likely cost something — if not hard cash, then by targeting you with ads.
However, more than a little nuance permeates all of this. As noted, most software relies on at least some open source components, including software created by major companies such as Google, Microsoft, and Facebook. Many of these companies also release some of their own internal technologies as open source software.
The term “open source” has emerged as an umbrella term under which many subsets exist, some of which aren’t as free and flexible as others, with different restrictions and licensing (see below) in place. And many debates ensue over whether something really is “open source,” as it claims to be.
Thus we now have “free and open source software” (FOSS) to describe what many would argue is the truest and purest form of open source: that which can be used for any purpose without restrictions including being able to distribute copies to others, while the source code is also fully available to anyone wishing to modify and improve it. Any modifications, however, should be made available to other users too.
This is not to be confused with freeware, which is simply software that does not come with a direct financial cost attached to it, but which likely won’t have any of the freedoms associated with open source software. Richard Stallman, founder of the Free Software Foundation, famously helped define what is meant by “free” in FOSS when he said:
Free software is a matter of liberty, not price. To understand the concept, you should think of ‘free’ as in free speech, not as in free beer.
To emphasize that the “free” denotes liberty rather than financial cost, the term FLOSS (“free/libre open source software”) is often used instead. But for all intents and purposes, FOSS and FLOSS mean the same thing.
Deviating from the pure FOSS (or FLOSS) ethos leads to other terms that you will often come across in the open source sphere. “Source available” refers to software whose source code is available to view and, in some instances, to modify, though the license won’t give full rights to share and modify. Source available should not be confused with actual open source software.
A good example of this is Lumberyard, a free game engine that Amazon launched back in 2016 to help developers create cross-platform games. The company eventually made the source code available to anyone under a proprietary license, allowing developers to customize their project using GitHub, while Amazon also accepts code contributions from the community. But the T&Cs are quite clear that this is no open source project, given that users cannot publicly release the Lumberyard engine source code anywhere outside of its official GitHub home and, crucially, cannot create their own game engine off the back of Lumberyard.
In making Lumberyard source code available (and free to use), Amazon wants to make the platform stickier for game developers, luring them away from rivals such as Unity and Unreal. And integrating Lumberyard tightly into its ecosystem, including Twitch and its AWS cloud platform, is where Amazon ultimately earns its coin.
Commercialization and licensing
None of this is to say that commercialization and open source software can’t be friends; far from it. Commercial open source software (COSS) companies are abundant. Typically, a COSS company is a single company that monetizes an open source project by selling additional services or add-ons, such as analytics or security, that appeal to bigger businesses. Often, but by no means always, these businesses are also the open source project’s chief developers; that is, they are in charge of maintaining the project and committing code changes to the main codebase.
This is where it’s worth paying attention to the various licensing arrangements that different open source software projects employ. “While very permissive licenses are common, the ecosystem is made up of many different types of licenses,” Facebook’s head of open source Kathy Kam told VentureBeat. “It’s important to understand what stipulations come along with different pieces of software. For example, a license may restrict the commercial uses, preventing you from using that software to offer a particular service. Or it may require that derivative products are also open sourced. It’s important to read the fine print, especially as it may relate to what your business can or cannot do.”
MySQL, for example, is an open source relational database management system that Oracle releases under a dual license — one a GNU General Public License (GPL), the other proprietary. The former affords most of the freedoms one would expect from FOSS, though the license is what is known as copyleft, which means that any derivative software must be issued under similar license terms. In other words, new software built from the open source software must be released under a similar open source license.
Oracle’s secondary license is how it commercializes MySQL, selling it under the MySQL Enterprise Edition banner, which offers additional services not included in the GPL license, such as a fully managed database service; an enterprise-grade data backup service; a document store; and security smarts such as encryption and a firewall. Also, companies holding the commercial license are allowed to sell MySQL-based products without making the derivative product open source.
In contrast to copyleft licenses such as GPL, so-called permissive software licenses such as the MIT License, GNU All-permissive License, and the Apache license don’t impose derivative software restrictions, making it easier for a private company to repurpose it as part of a proprietary product. In fact, they could also re-license their new software under a GPL license if they wish.
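For a team auditing its dependencies, the copyleft/permissive split described above can be sketched as a simple lookup. This is a minimal illustration only, assuming dependencies are tagged with SPDX-style license identifiers; the `LICENSE_FAMILY` table and the `classify` and `needs_review` helpers are hypothetical names invented for this example, and the table covers just a handful of the licenses named in this article.

```python
from enum import Enum


class Family(Enum):
    PERMISSIVE = "permissive"
    COPYLEFT = "copyleft"
    UNKNOWN = "unknown"


# Hypothetical lookup table keyed by SPDX license identifier.
# Deliberately tiny: real audits need a complete, reviewed list.
LICENSE_FAMILY = {
    "MIT": Family.PERMISSIVE,
    "Apache-2.0": Family.PERMISSIVE,
    "FSFAP": Family.PERMISSIVE,  # GNU All-permissive License
    "GPL-2.0-only": Family.COPYLEFT,
    "GPL-3.0-only": Family.COPYLEFT,
}


def classify(spdx_id: str) -> Family:
    """Map a license identifier to its family, or UNKNOWN if unlisted."""
    return LICENSE_FAMILY.get(spdx_id, Family.UNKNOWN)


def needs_review(spdx_ids):
    """Return, sorted, every identifier that is not clearly permissive."""
    return sorted(i for i in spdx_ids if classify(i) is not Family.PERMISSIVE)
```

Calling `needs_review(["MIT", "GPL-3.0-only"])` would flag only the GPL entry, which is the kind of dependency a company shipping a proprietary product would want to look at before release. Anything the table does not recognize is flagged as well, on the assumption that an unknown license is safer reviewed than ignored.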
Today, React is one of the top open source projects by just about every metric, and it should perhaps come as little surprise that the most popular open source projects, including Kubernetes, TensorFlow, Vue, and React, have all been released under a permissive license such as MIT or Apache. Developers — particularly commercial ones — don’t like restrictions.
This helps to highlight how licensing has emerged as a contentious issue in the open source world. Oftentimes a company will switch an open source project to a far less permissive license to protect its investment. Earlier this year, Amazon’s AWS revealed it was forking Elasticsearch, the Java-based open source engine for storing, searching, and analyzing large volumes of data, alongside the associated data visualization dashboard Kibana.
Forking, for the uninitiated, is the process of taking the original source code and developing a new program that’s independent from the original. It is generally considered a bad thing in the OSS realm as it tends to lead to tension between the communities that develop each incarnation, and causes general friction. However, forking is usually deemed the only way forward for an OSS project when two (or more) different entities develop different priorities.
In the case of Amazon and Elasticsearch, the move came after Elastic, the company that spearheads Elasticsearch and Kibana, confirmed it was transitioning from a permissive Apache License to dual licensing under the source-available Server Side Public License (SSPL) and a proprietary Elastic License. The main purpose of this change, Elastic said, was to prevent cloud service providers such as AWS from offering Elasticsearch as a service themselves. In other words, Elastic wanted to stop major technology companies from benefiting from its hard work without investing sufficient resources back into the project itself.
Although Elastic insinuated that Elasticsearch was still an open source project (e.g. it said it was “doubling down on open”), the Open Source Initiative (OSI) doesn’t recognize the SSPL license as open source. The OSI board wrote:
Outside contributors donated time and energy with the understanding that their work was going towards the greater good, the public software commons. Now, instead, their contributions are embedded in a proprietary product. If they want to enjoy the fruits of their own and their co-contributors’ labor, they have to agree to a proprietary license or fork.
Aside from commercial open source software, you might also encounter similar-sounding terminology that describes essentially the same thing. Red Hat, the gargantuan commercial open source software purveyor acquired by IBM for $34 billion back in 2018, refers to its software as “enterprise open source.” In a recent interview with VentureBeat, Red Hat technology evangelist Gordon Haff distinguished its products from a typical commercial open source product by describing it as something that offers “a hardened product for the enterprise, including added security, [and] vendor support 24/7.”
It’s probably just semantics, but a multi-billion dollar company looking to dive headfirst into open source software might be more enticed by something called “enterprise open source.”
Finally, another term you’re likely to encounter in the open source realm is “open core.” This is somewhat similar to the dual-licensing model that some companies have followed to commercialize open source projects, though many have asserted that “open core has nothing to do with open source.”
Open core is a way of commercializing open source software by offering a limited set of features in the free edition, while selling add-ons as premium features. This certainly intersects with the principles of dual licensing, but open core perhaps leans toward more of a freemium business model — a basic set of features is available to everyone under a free license, but all the juicy useful stuff has to be bought under a proprietary license.
“The ‘core’ could be licensed under a permissive license that doesn’t place any restrictions on how the software can be used, but generally lacks all the ‘bells and whistles’ of a commercial offering,” said Martin Traverso, one of the cocreators of the distributed SQL query engine Presto, which is now known as Trino. “This model tends to be more compatible with projects that are independent or not owned by a single company.”
With a dual licensing model, a company may also sell additional features or services under a proprietary license, but a typical differentiator here is that the proprietary license will often remove any copyleft restrictions (i.e., the customer can then sell their software without having to make it open source).
“Dual licensing also typically requires that the copyright to the software be owned by a single entity that has the power to make those licensing decisions,” Traverso added.
So while the distinctions can get a little muddy, open core is perhaps more to do with feature availability than the licensing per se.
Depending on who you speak to, open source software is more secure than proprietary software because its code is in the “open” for anyone to analyze, or it’s less secure for that same reason given that anyone can easily access it. Recent data from Synopsys found that 84% of codebases contain at least one open source vulnerability.
“There’s always a potential for cybercriminals to try and leverage openness for malicious purposes,” noted Facebook product manager for open source Michael Cheng. “However, we’ve found that the advantages and benefits of open source greatly outweigh this risk. We have observed that good communities are quick to respond to security vulnerabilities — contributors amass support very quickly to patch these kinds of issues. Thus assessing the health of a community is just as important as the technological merits of a particular open source project.”
And this is a key point worth picking up on — the community is integral to any open source project, and the number of active contributors is indicative of its overall health. But just because something is open source doesn’t mean that it is inherently more secure. If a project isn’t actively maintained and hasn’t been updated in 2 years, then it’s likely to be less secure than a proprietary equivalent that is regularly updated. And the exact same principle applies the other way around.
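The two-year rule of thumb above is easy to turn into a quick check. The snippet below is a rough sketch, assuming you already know the date of a project’s most recent update (how you obtain it is left out); `STALE_AFTER` and `is_stale` are hypothetical names for this example, and the threshold is simply the figure mentioned in this article rather than any industry standard.

```python
from datetime import datetime, timedelta, timezone

# Roughly two years, matching the rule of thumb in the text.
STALE_AFTER = timedelta(days=2 * 365)


def is_stale(last_update, now=None, threshold=STALE_AFTER):
    """Return True if the last update is older than the threshold.

    Both datetimes are expected to be timezone-aware.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_update > threshold
```

A single date is a crude signal, of course. As the quotes here suggest, the number of active contributors and how quickly they respond to vulnerabilities say more about a project’s health than the timestamp of its last commit.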
“There is no security through obscurity — security depends on understanding your work and your security model, and this can be achieved with open source and proprietary systems,” said Oskari Saarenmaa, cofounder and CEO at Aiven, a company that manages open source data infrastructure on all the major clouds.
However, open source has the potential to be more secure than proprietary software. If you take two pieces of software, one open source and one proprietary and both receiving active updates, the open source incarnation may be more secure by virtue of the fact that it can be inspected by thousands of eyes.
“Open source gives you and the community a better chance of understanding the overall quality and maturity of a software system, and also allows you to repair and modify systems as needed,” Saarenmaa added. “Whether or not proprietary software packages are properly maintained can be much harder to assess.”
Now that we have a better idea of what open source is at a fundamental level, it’s worth looking at some more perspectives from those within the industry, in particular those companies responsible for commercializing and reaping the benefits of open source software.