Open Source Software: Licenses, Compliance, and Risk

Somewhere in your company's products, right now, there is code that you did not write, did not buy, and never negotiated for. It arrived for free, attached to a license that nobody on the deal team read, and it is doing real work — encrypting a password, parsing a date, drawing a button. Multiply that by a few thousand and you have a modern software product. The industry's own audits make the point with unsettling consistency: year after year, the security firms that scan commercial codebases find open source components in well over ninety percent of them, frequently making up the majority of the lines of code. The proprietary software you paid a vendor seven figures for is, under the hood, mostly open source software you got for nothing.

That is either the greatest bargain in the history of technology or a vast pile of unexamined legal obligation, and the honest answer is that it is both. Open source is the reason a two-person startup can ship a product that would have required a hundred engineers a generation ago. It is also the reason that same startup can, without anyone noticing, find itself contractually obligated to publish its crown-jewel source code to the world — or watch an acquisition collapse in diligence because nobody can prove what is inside the build.

This guide is a map of that terrain. We will define what "open source" actually means (it is a precise legal term, not a vibe), walk the license families from the permissive to the fiercely reciprocal, dig into the two hardest questions in the field — when does using open source code force you to open your code, and when may two licenses lawfully be mixed at all — and trace how courts came to treat these volunteer-drafted licenses as enforceable law. Then we turn to the modern practice: how organizations build open source compliance programs around a "software bill of materials," how open source becomes the make-or-break issue in mergers and acquisitions, and how the ground is shifting in 2026 under a wave of companies abandoning open source for "source-available" licenses, alongside the unresolved collision between open source and generative artificial intelligence. A judge, a general counsel, a software engineer, and an intelligent layperson should all be able to follow every step. Every term of art gets defined the first time it appears.

For a sense of where this fits in the broader picture, open source is one slice of the much larger question of how software is protected and transacted at all. If you want the companion pieces, our overview of the legal protection of software explains the copyright/patent/trade-secret/contract framework that open source licensing sits on top of, and our primers on software licensing agreements and software transactions cover the proprietary side of the coin.

What "Open Source" Actually Means

Start with the most common misconception. "Open source" does not mean "free of charge," "public domain," "no rules," or "do whatever you want." Open source software is copyrighted software (protected by copyright from the moment it is fixed in a tangible medium, 17 U.S.C. § 102(a)) that the author chooses to license to the entire world on standardized, generous terms. The magic is not the absence of legal rights; it is the deliberate, conditional grant of them. Every open source license is, at bottom, an exercise of copyright: the author owns the exclusive rights set out in 17 U.S.C. § 106 (to reproduce, prepare derivative works, distribute, and so on) and then hands many of them out on conditions. Ignore those conditions and you are back to ordinary copyright infringement under 17 U.S.C. § 501. That insight, that open source runs on copyright rather than against it, is the key that unlocks everything else in this article.

A quick vocabulary note that the rest of the article depends on. Source code is the human-readable version a programmer writes and edits, full of words like "add," "move," and "print." A compiler or interpreter translates it into object code (also called a binary), the machine-runnable string of ones and zeros that a person cannot practically read or modify. The entire open source bargain turns on this distinction: shipping only the binary, while keeping the source secret, is the opposite of "open." That is also why the disclosure obligations in copyleft licenses (below) are about source, not binaries — the source is the thing that lets the next developer actually study, fix, and build on the code.
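The source/binary distinction can be seen in miniature with a few lines of Python. This is only an analogy, offered as a rough sketch: Python translates source into bytecode for its own virtual machine rather than into native machine code, and the function name `add` is made up for the illustration. The point is simply that the translated form is a byte string no one would want to edit by hand.

```python
# Human-readable source: the form a programmer reads and edits.
source = "def add(a, b):\n    return a + b\n"

# compile() translates the source text; exec() runs the result,
# which defines the function `add` inside the namespace dict.
namespace = {}
exec(compile(source, "<example>", "exec"), namespace)
add = namespace["add"]

# The translated form of the function body is a raw byte string.
# (Assumption for the analogy: this stands in for a compiled binary.)
bytecode = add.__code__.co_code

print(source)          # the readable form
print(bytecode.hex())  # the translated form, shown as hexadecimal
print(add(2, 3))       # both forms describe the same behavior
```

Anyone handed only the hexadecimal string could run the function but would struggle to study or improve it, which is why open source obligations attach to the source.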

So what makes a license "open source" rather than just generous? The authoritative answer comes from the Open Source Initiative (OSI), the nonprofit founded in 1998 by Bruce Perens, Eric Raymond, and others to put a practical, business-friendly face on what had been a philosophically charged "free software" movement. The OSI maintains the Open Source Definition (OSD), a ten-point test that a license must satisfy before the OSI will certify it as "open source." A license that fails the test is not open source, no matter what its author calls it — a distinction that matters enormously when we reach the 2023–2025 relicensing wave below, because several prominent products left the open source world precisely by adopting licenses the OSI will not approve.

The ten criteria, in plain terms, are:

1. Free redistribution. The license cannot stop anyone from giving the software away or selling it as part of a larger bundle, and cannot demand a royalty for doing so.
2. Source code. The program must include its human-readable source code, or make it freely available. Shipping only the compiled binary is disqualifying.
3. Derived works. The license must allow modifications and let people distribute those modified versions.
4. Integrity of the author's source code. A license may require that changes be shipped as separate "patch files" rather than altered source, so that everyone can tell the original author's work from later edits, but only if it still permits modified builds.
5. No discrimination against persons or groups. You cannot write a license that excludes, say, a particular company or country.
6. No discrimination against fields of endeavor. A "free for non-commercial use only" license is, by this rule, not open source. This is the criterion that trips up well-meaning "ethical" licenses and, as we will see, the relicensing wave.
7. Distribution of license. The rights travel with the code automatically; a downstream recipient needs no separate signed agreement.
8. License must not be specific to a product. The rights cannot depend on the code being part of one particular distribution.
9. License must not restrict other software. It cannot demand that everything else on the same disk also be open source.
10. License must be technology-neutral. No clause may require a specific technology or interface style (for instance, no "click-to-accept-only" requirement).

Read those together and a theme emerges. Open source is defined by what the licensor may not take away — the right to use, study, modify, and redistribute — far more than by any single business model. Criterion 6 is the quiet hero: by forbidding field-of-use restrictions, it draws the bright line between genuine open source and the much larger world of "source-available" code you can read but not freely reuse.

It helps to know that two organizations, with two different philosophies, gave the movement its vocabulary. The OSI's pragmatic OSD grew out of the older Free Software Definition from Richard Stallman's Free Software Foundation (FSF), founded in 1985. The FSF frames things as four freedoms — the freedom to run the program for any purpose (freedom 0), to study and modify it (freedom 1), to redistribute copies (freedom 2), and to distribute your modified versions (freedom 3). Stallman's famous line is that "free" here means free as in free speech, not free as in free beer. The two definitions overlap almost entirely; the practical difference is one of emphasis and, sometimes, of a fifth concern the FSF cares about intensely and the OSD treats as optional — copyleft, to which we now turn.

A last grounding fact for the skeptic who suspects this is all fringe software. The most-used programs on earth are open source: the Linux kernel runs most of the world's servers and sits under every Android phone; the Apache and nginx web servers move much of the internet's traffic; Firefox and Chromium (the open source project underlying Google Chrome) are open; and even Bitcoin launched as open source code. This is the infrastructure of modern computing, not a hobbyist sideshow.

A Field Guide to Open Source Licenses

There are, by the OSI's own count, on the order of a hundred OSI-approved licenses, and many more "source-available" licenses floating around outside that list. You do not need to memorize them. You need to understand the two great families and the spectrum between them, because the family a license belongs to determines the single most important fact about it: what it requires you to give back.

Permissive licenses: take it and go

A permissive license says, in effect: here is the code, do almost anything you like with it, including putting it inside your closed, proprietary, paid product — just keep our copyright notice and license text attached, and don't sue us if it breaks. Permissive licenses impose minimal obligations and, crucially, do not require you to release the source code of whatever you build on top. They are "compatible" with proprietary development in a way copyleft licenses are not.

The three you will meet constantly:

The MIT License is the minimalist's favorite — a few sentences. Keep the notice; the software comes "as is." That's nearly the whole thing. Its brevity and clarity have made it the most popular license on code-hosting platforms by a wide margin.
The BSD licenses (from the Berkeley Software Distribution, the academic Unix lineage) are close cousins of MIT. The common variants are the "2-clause" and "3-clause" BSD; the 3-clause adds a no-endorsement clause forbidding you from using the authors' names to promote your product.
The Apache License 2.0 is the corporate-grade permissive license. It does everything MIT does and adds two things lawyers care about: an express patent grant (every contributor licenses their relevant patents to users; more on why that matters below) and a patent retaliation clause (file a patent suit claiming the licensed work infringes and your patent license to it terminates). Google chose Apache 2.0 for most of Android above the GPLv2 Linux kernel, and this lawyerly completeness is a large part of the license's corporate appeal.

If your only goal is to use someone's code and ship a product, permissive licenses are about as easy as the law gets. The obligations are real but light: preserve attribution and license text, and (under Apache) be aware of the patent terms. The one trap people forget is the attribution itself — strip the MIT notice and you have, by the logic of Jacobsen v. Katzer below, exceeded the license and committed copyright infringement. Our deeper treatment of the corporate pitfalls lives in open source licensing landmines in enterprise software development.

Copyleft licenses: pay it forward, by force

A copyleft license flips the deal. It uses copyright to achieve the opposite of copyright's usual purpose: instead of locking the code up, it locks it open. The bargain is reciprocal — you may use, modify, and redistribute the code freely, but if you distribute a work built from it, you must release your work under the same open terms, source code and all. The freedoms are not just granted; they are made contagious, which is why copyleft is so often (and so loosely) called "viral." (Lawyers should resist the metaphor in any document that might be read by a judge: copyleft does not spread by contact like a disease; it is a condition on a grant that attaches only when you choose to distribute a combined work. "Reciprocal" is the accurate adjective.)

The archetype is the GNU General Public License (GPL), written by Stallman and the FSF, now most often seen as version 2 (1991) and version 3 (2007). The operational heart of GPLv2 is its Section 2(b), which requires that any distributed work that "in whole or in part contains or is derived from the Program" must be "licensed as a whole at no charge to all third parties" under the GPL. Translation: combine GPL code into your product and distribute the result, and the whole result may have to be GPL'd. The Linux kernel — the core of the operating system that runs most of the internet's servers, every Android phone, and the top of the supercomputer list — is licensed under GPLv2. So is an enormous amount of foundational infrastructure.

Copyleft comes in strengths:

Strong copyleft (GPL). Reciprocity reaches broadly into combined and derivative works. This is the "if you ship it, you open it" license.
Weak copyleft (LGPL, MPL). A deliberate compromise. The GNU Lesser General Public License (LGPL) was designed for software libraries — reusable bundles of code that other programs call on. Its bargain: you must keep the library itself (and your changes to it) open, but merely linking your proprietary program to an unmodified LGPL library does not force you to open your program. The Mozilla Public License (MPL) 2.0 draws the line at the file: changes to MPL-covered files must be shared, but you can combine those files with proprietary files in a "Larger Work" without infecting the proprietary parts. Weak copyleft is the engineering of "give back your improvements to the shared component, but keep your own product."
Network copyleft (AGPL). The GNU Affero General Public License (AGPL) closes what its drafters saw as the great loophole of the cloud era. The GPL's reciprocity is triggered by distribution — handing someone a copy. But if you run modified GPL software on a server and let the public use it over a network without ever shipping them a copy (the entire software-as-a-service model), classic GPL is never triggered and you owe nothing. AGPL's Section 13 plugs the gap: if users interact with your modified AGPL program over a network, you must offer them its corresponding source. This single clause is why AGPL is both beloved by free-software purists and quietly banned by the open source policies of many large companies, Google among them.

The spectrum, in one sentence each

If you remember nothing else: MIT/BSD say "keep our name on it." Apache 2.0 adds "and here are the patent rules." LGPL/MPL say "share back changes to our part." GPL says "share back the whole combined work you distribute." AGPL says "share back even if you only run it on a server." The further along that list you go, the more the license requires you to give back to the commons, and the more carefully a business must think before mixing it into a proprietary product.
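For readers who think in tables, the spectrum can be written down as a small lookup. This is a simplified sketch of the one-sentence summaries above, not a statement of what any license text actually requires; the names `Obligation`, `SPECTRUM`, and `must_offer_source` are invented for the illustration, and the function only answers whether some source-sharing obligation is triggered, not how far it reaches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obligation:
    family: str            # where the license sits on the spectrum
    share_back: str        # what must be shared, per the summaries above
    network_trigger: bool  # True if serving users over a network also counts


# Hypothetical table: a paraphrase of this section, not of the license texts.
SPECTRUM = {
    "MIT": Obligation("permissive", "nothing beyond notices", False),
    "BSD": Obligation("permissive", "nothing beyond notices", False),
    "Apache-2.0": Obligation("permissive", "nothing beyond notices; patent terms apply", False),
    "LGPL": Obligation("weak copyleft", "changes to the library", False),
    "MPL-2.0": Obligation("weak copyleft", "changes to covered files", False),
    "GPL": Obligation("strong copyleft", "the whole combined work", False),
    "AGPL": Obligation("network copyleft", "the whole combined work", True),
}


def must_offer_source(license_id: str, distributed: bool,
                      serves_modified_over_network: bool) -> bool:
    """Return True if a source-sharing obligation is triggered (simplified)."""
    ob = SPECTRUM[license_id]
    if ob.family == "permissive":
        return False
    # Copyleft is triggered by distribution; AGPL adds network interaction.
    return distributed or (serves_modified_over_network and ob.network_trigger)


# A modified program offered only as a hosted service, never shipped:
print(must_offer_source("GPL", distributed=False, serves_modified_over_network=True))
print(must_offer_source("AGPL", distributed=False, serves_modified_over_network=True))
```

The two calls at the end capture the software-as-a-service gap: the same hosted deployment triggers nothing under the GPL entry and triggers a source offer under the AGPL entry.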

A word every one of them shares: no warranty

Notice what all of these licenses have in common, in capital letters. Open source code is given "AS IS," with warranties of merchantability and fitness for a particular purpose expressly disclaimed, and with no indemnification. There is no vendor on the hook if the code is defective, infringes a third party's patent, or carries a security hole. For a buyer used to commercial software contracts — where the vendor warrants title, promises non-infringement, and indemnifies you against IP claims — this is a jarring reallocation of risk. It is one more reason the compliance and provenance work below is not optional: with open source, you are the last line of defense, because the "supplier" has contracted all of its liability away. (This gap is exactly what the open source warranties in acquisition agreements, discussed below, are written to backfill — by shifting the risk onto the seller of the company, since the code's authors cannot be reached.)

The First Hard Question: What Is a "Derivative Work"?

Here is where the field gets genuinely difficult, where reasonable lawyers and brilliant engineers disagree, and where the largest dollar amounts are at stake. Copyleft's entire mechanism depends on a phrase borrowed from copyright law: the derivative work. Under the U.S. Copyright Act, a derivative work is one "based upon" one or more preexisting works — a translation, an adaptation, a work "recast, transformed, or adapted," 17 U.S.C. § 101. The copyright owner's exclusive right to make derivative works (17 U.S.C. § 106(2)) is precisely the lever copyleft pulls: by licensing that right only on condition that you keep the result open, the GPL makes its reciprocity bite. So the trillion-dollar question becomes: when does combining your code with GPL code create a "derivative work" of the GPL code — thereby pulling your code into the GPL's orbit?

The maddening truth is that there is very little authoritative case law squarely answering this for software combinations, and the leading voices disagree. The FSF takes an expansive view: if two pieces of code are tightly bound — sharing data structures in the same memory space, calling each other intimately — they form a single combined work, and the proprietary half is "derived from" the GPL half. Linus Torvalds and many engineers take a narrower, function-focused view. Adding to the uncertainty, some scholars argue the GPL's reach is partly a matter of contract (the license's own definition of what it covers) rather than purely the statutory derivative-work line — which can be broader or narrower than § 101 depending on how a court reads the text. The most practically important fault lines:

Static linking (where the GPL library's compiled code is copied bodily into your single executable at build time) is widely treated, including by the FSF, as creating a combined/derivative work. This is the high-risk case: statically link GPL code into your product and you very likely owe the GPL on the whole thing.
Dynamic linking (where your program calls a separate library file at runtime rather than baking it in) is genuinely contested. The FSF still considers it derivative; many lawyers and engineers think a mere runtime call is too thin a thread. This is unsettled law. The LGPL was invented specifically to make this case safe: dynamically link an unmodified LGPL library and your own program can stay proprietary, so long as you meet the LGPL's obligations for the library itself.
System calls and "arm's-length" interfaces. There is broad consensus that a program merely making system calls to the operating-system kernel through its documented interface is not thereby a derivative of the kernel. The Linux kernel even carries an explicit note from Torvalds that user programs using normal system calls are not considered derivative works. Communicating with a separate program over a pipe, socket, or network protocol is, in the prevailing view, also too arm's-length to trigger copyleft.
The "mere aggregation" safe harbor. The GPL itself says that merely putting a GPL program and an unrelated program together on the same disk or download — "mere aggregation" — does not make the unrelated program subject to the GPL. The trick is telling aggregation from combination.

A worked example makes the stakes concrete. Suppose Acme Robotics builds a warehouse robot. (This hypothetical is illustrative only.) Its proprietary navigation software is its crown jewel — the thing that justifies the company's valuation. Acme's engineers, racing to ship, grab a slick GPL-licensed mapping library and statically link it into the navigation binary, then sell the robots. Under the prevailing reading of GPL Section 2(b), Acme has just distributed a combined work that "contains or is derived from" GPL code — which means Acme may be obligated to release the source code of its crown-jewel navigation software to every customer, who may in turn redistribute it freely. The entire moat evaporates. Now change one fact: Acme's engineers had instead found the same functionality under the LGPL and dynamically linked to it as a separate library. Now Acme owes only the library's source (and any changes Acme made to the library) and keeps its navigation code proprietary. Same goal, same afternoon of coding, wildly different legal universe — decided entirely by which license the library carried and how the code was combined. This is not a theoretical risk; it is the single most common way companies blunder into copyleft exposure, and it is exactly why the compliance programs discussed below exist.

Because so much rides on these distinctions, and because the law is thin, the discipline of the field is avoidance over litigation: identify what you have, understand its license, and architect the combination so the question never has to be answered in court.
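The fault lines in this section can be turned into a first-pass triage rule of the kind an engineering team might run before asking counsel. What follows is a rough sketch that encodes only the positions described above, under the assumption that anything the section does not address goes to human review; `Combination` and `copyleft_exposure` are hypothetical names, and the labels are risk flags, not legal conclusions.

```python
from enum import Enum


class Combination(Enum):
    STATIC_LINK = "static link"
    DYNAMIC_LINK = "dynamic link"
    ARMS_LENGTH = "system call, pipe, socket, or network protocol"
    MERE_AGGREGATION = "same disk or download, otherwise unrelated"


def copyleft_exposure(license_id: str, how: Combination) -> str:
    """Rough risk flag for proprietary code combined with a GPL or LGPL component."""
    if license_id not in ("GPL", "LGPL"):
        return "needs review"  # outside what this sketch covers
    if how in (Combination.ARMS_LENGTH, Combination.MERE_AGGREGATION):
        return "low"           # prevailing view: too arm's-length to trigger copyleft
    if license_id == "GPL":
        # Static linking is widely treated as a combined work; dynamic is unsettled.
        return "high" if how is Combination.STATIC_LINK else "contested"
    if how is Combination.DYNAMIC_LINK:
        return "low"           # the case the LGPL was designed to make safe
    return "needs review"      # e.g. static linking to an LGPL library


# The two versions of the Acme hypothetical:
print(copyleft_exposure("GPL", Combination.STATIC_LINK))
print(copyleft_exposure("LGPL", Combination.DYNAMIC_LINK))
```

A rule like this is useful mainly for routing: "high" and "contested" results are the ones to redesign around or escalate, which is the avoidance-over-litigation discipline in practice.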

The Second Hard Question: License Compatibility

There is a quieter problem that ambushes even careful developers: license compatibility. It is one thing to combine your proprietary code with one open source component. It is another to combine two open source components with each other and redistribute the result — because their licenses may flatly contradict one another, and there may be no lawful way to ship the combination at all.

The mechanics are pure logic. Copyleft licenses do two things at once: they grant you rights, and they impose conditions on the combined work. If License A says "the whole combined work must be distributed under License A's exact terms" and License B imposes a condition that License A forbids, you cannot satisfy both at the same time, so you cannot distribute the combination. The result is a one-way street. Permissive code can almost always be folded into a copyleft project (the copyleft simply absorbs it), but copyleft code generally cannot be folded into a permissive project without converting the whole thing to copyleft.

The notorious real-world example is the historic incompatibility between GPLv2 and the Apache License 2.0. Apache 2.0's patent-termination and indemnity-style provisions impose conditions that GPLv2 (which forbids "further restrictions" on the granted rights) does not permit. So, by the FSF's own analysis, you cannot combine GPLv2-only code with Apache-2.0 code into a single distributed work. (GPLv3 was deliberately drafted to be compatible with Apache 2.0, in the sense that Apache-2.0 code can be folded into a GPLv3 work, which is one reason many projects moved to v3.) Other classic mismatches: GPLv2 and GPLv3 are themselves incompatible with each other unless a project is licensed "GPLv2 or later"; and several older licenses (the original 4-clause BSD with its "advertising clause," for instance) are incompatible with the GPL.

A short worked example. Suppose a startup wants to ship a single binary that statically links Library X (GPLv2-only) and Library Y (Apache 2.0). Each library, taken alone, is fine. Combined and distributed, the result is unshippable: the GPLv2 forbids the extra conditions Apache 2.0 attaches, and Apache 2.0's terms cannot simply be discarded. The startup's options are to find a differently licensed substitute for one library, to keep the two at genuine arm's length so no combined work is created, or to obtain a different license grant from one author. None of those is obvious to an engineer who simply saw two "open source" libraries and assumed they would play together. Compatibility, in short, is not a property of any single license — it is a property of the combination, and it has to be checked every time pieces are joined.

This is why mature compliance programs maintain a license compatibility matrix, not just a list of approved licenses, and why automated tooling now flags incompatible combinations at build time. The lesson for counsel: "all the components are open source" tells you almost nothing about whether the product can lawfully be distributed.
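A compatibility matrix of this kind can be sketched in a few lines. The version below is a minimal illustration that lists only the mismatches named in this section, using SPDX-style identifiers; the function name `find_conflicts` and the component names are hypothetical, and a pair missing from the table is simply unknown to the sketch, not cleared as compatible.

```python
from itertools import combinations

# Unordered license pairs this section describes as incompatible when
# combined into a single distributed work. Deliberately incomplete.
INCOMPATIBLE = {
    frozenset({"GPL-2.0-only", "Apache-2.0"}),
    frozenset({"GPL-2.0-only", "GPL-3.0-only"}),
    frozenset({"GPL-2.0-only", "BSD-4-Clause"}),
    frozenset({"GPL-3.0-only", "BSD-4-Clause"}),
}


def find_conflicts(components: dict[str, str]) -> list[tuple[str, str]]:
    """Return pairs of component names whose licenses appear in INCOMPATIBLE.

    `components` maps a component name to its license identifier and is
    assumed to describe pieces that end up in one combined work.
    """
    conflicts = []
    for (name_a, lic_a), (name_b, lic_b) in combinations(sorted(components.items()), 2):
        if frozenset({lic_a, lic_b}) in INCOMPATIBLE:
            conflicts.append((name_a, name_b))
    return conflicts


# The startup's binary from the worked example above:
build = {"library-x": "GPL-2.0-only", "library-y": "Apache-2.0"}
print(find_conflicts(build))
```

Keying the table on pairs rather than on single licenses reflects the point made above: compatibility is a property of the combination, so the check has to run over every pair that ends up in the same distributed work.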

How These Licenses Became Enforceable Law: The Cases

For years, skeptics asked an obvious question: are these things even enforceable? A volunteer drafted the GPL; nobody signs it; no money changes hands. Is it a real, binding instrument or just an aspirational document? The answer, settled over two decades of litigation, is that open source licenses are emphatically enforceable — and the theory of enforcement turns out to matter as much as the result.

Jacobsen v. Katzer: the foundation stone

The landmark is Jacobsen v. Katzer, 535 F.3d 1373 (Fed. Cir. 2008). Robert Jacobsen released model-train software under the Artistic License, a permissive open source license that required users to keep attribution and note their changes. Matthew Katzer's company copied the code into a commercial product and stripped out the attribution — ignoring the license's conditions. The district court shrugged: it treated the license terms as mere contractual covenants, meaning Jacobsen's only remedy was a breach-of-contract suit for (essentially nominal) damages, with no copyright injunction available.

The Federal Circuit reversed, and the distinction it drew is the doctrinal cornerstone of open source enforcement. The court held that the Artistic License's requirements were not just contract promises but conditions on the copyright grant itself. Because the license permitted copying only if the user complied with the attribution and notice terms, a user who ignored those terms was acting outside the scope of the license — and copying outside the scope of a license is plain copyright infringement. That unlocks the copyright owner's full arsenal, most importantly the injunction (a court order to stop) and statutory damages, rather than the feeble contract remedies the district court had offered. The court memorably observed that open source licensing has "substantial economic value" even when no money changes hands, because attribution, community growth, and reputation are real economic interests. Jacobsen is the reason a GPL or MIT violation can be a copyright case with teeth, not just a polite contract dispute. (On remand, the case ultimately settled, but the appellate holding endures and is cited in essentially every serious treatment of the field.)

The "condition versus covenant" line Jacobsen drew is more than academic. If a license term is a condition on the grant, breaking it puts you outside the license and into copyright infringement (injunction, statutory damages, attorney's fees under 17 U.S.C. § 505). If a term is a mere covenant — an independent promise — breaking it is only a breach of contract (actual damages, no injunction as of right). Drafters of open source licenses now write conditions deliberately, and litigants fight hard over which bucket a given clause falls into.

SFC v. Vizio: the third-party-beneficiary turn

If Jacobsen anchored copyright enforcement, the most important recent case reopens the contract front from a surprising angle. In Software Freedom Conservancy, Inc. v. Vizio, Inc. (filed 2021; removed to and remanded from the Central District of California, No. 8:21-cv-01943), the Software Freedom Conservancy sued the television maker Vizio not as a copyright holder — the Conservancy holds no copyright in the GPL'd Linux code inside Vizio's smart TVs — but as a third-party beneficiary of the GPL and LGPL. The theory: the licenses are contracts between the code's authors and Vizio, and the people the licenses are meant to benefit are the end users who are promised access to the corresponding source code. Vizio, the suit alleges, shipped Linux-based TVs without making that source available, breaching the deal for the benefit of the public the GPL exists to serve.

This matters because it would give recipients of a product — not just copyright owners — standing to demand GPL compliance as a contract right. A federal court in 2022 declined to keep the case (it remanded to California state court, finding the claim sounded in contract rather than presenting a federal copyright question that supported removal), and the litigation has proceeded in state court on that contract theory. Win or lose on the merits, Vizio signals that the enforcement toolkit is broadening: the source-availability promise embedded in copyleft may be enforceable by the very users it was written for, dramatically expanding who can knock on a non-compliant company's door.

Neo4j v. PureThink: you cannot quietly add restrictions to an open license

A different fight clarified the other direction — what a licensor (or downstream redistributor) may and may not do to an open source grant. In Neo4j, Inc. v. PureThink, LLC (N.D. Cal.), affirmed in relevant part by the Ninth Circuit in 2022, the graph-database company Neo4j had released software under the AGPL, then layered on a "Commons Clause" that purported to forbid commercial resale. A downstream party removed the added restriction, redistributed the software, and marketed it as "free and open source." The Ninth Circuit held two practically important things: a licensee may not unilaterally strip restrictions that the licensor lawfully added to the license text it received, and — relevant to false-advertising and trademark exposure — calling AGPL-plus-Commons-Clause software "free and open source" can be actionably misleading, because the Commons Clause's commercial restriction means the software is not, in fact, open source by the OSD's lights. The takeaway pairs neatly with the relicensing wave below: the words "open source" are not a marketing flourish you can apply at will, and the modifications a licensor bolts onto a recognized license can have real teeth.

Artifex v. Hancom: license and contract both

Filling out the picture, Artifex Software, Inc. v. Hancom, Inc. (N.D. Cal. 2017) opened a complementary contract front years before Vizio. Artifex makes Ghostscript (a PDF/PostScript engine) under a dual model: free under the GPL, or for a fee under a commercial license for those who don't want to comply with copyleft. Hancom embedded Ghostscript in its office software, did not pay for the commercial license, and did not comply with the GPL's source-disclosure obligations. Artifex sued — and significantly, also pleaded breach of contract. The court refused to dismiss the contract claim, reasoning that the GPL could be enforced as a contract where the parties' conduct showed offer, acceptance, and consideration (Hancom got the software; the "price" was GPL compliance). The case settled, but it stands for the proposition that an open source license can be both a copyright license and an enforceable contract — which can matter for remedies, because contract law sometimes reaches conduct (and damages measures) that copyright does not.

The SCO-Linux saga: a cautionary epic

No survey is complete without the SCO litigation, the long, strange war that hung over Linux for most of the 2000s. The SCO Group claimed it owned the copyrights to the original Unix operating system and that IBM and the broader Linux community had improperly poured SCO's proprietary Unix code into Linux. SCO sued IBM for billions and rattled its sabre at Linux users generally, casting a fear-uncertainty-and-doubt cloud over enterprise Linux adoption at exactly the moment it was going mainstream. The plot twist came in SCO Group v. Novell: a federal court in Utah held that Novell, not SCO, actually owned the Unix copyrights — knocking the legs out from under SCO's entire theory. SCO slid into bankruptcy and the litigation dragged on for years more, a kind of zombie lawsuit. The lasting lessons are twofold: first, provenance matters — knowing who actually owns the code in your stack is not paranoia but basic hygiene; and second, the open source ecosystem responded to the SCO scare by building defensive infrastructure, most notably the Open Invention Network (OIN), a patent non-aggression pool formed by IBM, Red Hat, Sony, Philips, and others (later joined by Microsoft with its large patent portfolio) to shield Linux from patent attack.

Enforcement beyond U.S. courts

It is worth knowing that much open source enforcement happens outside litigation entirely — and much of the litigation that does happen has occurred in Germany. The activist project gpl-violations.org, founded by developer Harald Welte, won a series of German injunctions in the 2000s (against Sitecom, D-Link, Skype, and others) for shipping Linux-based devices without the required source code, establishing early that GPL terms are enforceable in European courts too. In the United States, the Software Freedom Conservancy and the FSF have historically preferred negotiated compliance to lawsuits, reserving litigation (such as the long-running BusyBox cases, and now Vizio) for repeat or stubborn offenders. The cultural norm in the open source world is genuinely compliance-first: the goal is to get companies into compliance, not to extract verdicts. But Jacobsen and its progeny guarantee that the courthouse is always available if persuasion fails.

Patents, Termination, and the Quiet Clauses That Bite

Copyright is only half the legal story of open source; patents are the other half, and they hide in clauses most people never read. A copyright license lets you copy the code. It does not, by itself, promise that using the code won't infringe someone's patent. Early permissive licenses (MIT, BSD) are silent on patents, which leaves a theoretical gap: a contributor could give you the code under MIT and then sue you for patent infringement for running it. Modern licenses close that gap deliberately.

Express patent grants. Apache 2.0, GPLv3, MPL 2.0, and others contain explicit clauses in which contributors license to users any of their patents that the contributed code necessarily infringes. This is a major reason sophisticated enterprises prefer Apache 2.0 over MIT for patent-sensitive projects.
Patent retaliation / defensive termination. These same licenses typically add a trap for the litigious: if you file a patent suit alleging that the open source project infringes your patents, your patent license (and sometimes your whole license) to that project automatically terminates. The clause turns the project's own users into a mutual-defense pact.
GPLv3's anti-Tivoization and anti-deal clauses. GPL version 3 was written in part to fix two perceived end-runs around GPLv2. "Tivoization" — named for TiVo's set-top boxes, which shipped GPL'd Linux but used hardware signature checks to block users from running modified versions — is countered by GPLv3's requirement that consumer devices ship the "installation information" needed to actually run modified builds. And GPLv3's patent provisions were drafted to neutralize private patent-license deals like the controversial 2006 Microsoft–Novell arrangement, extending any patent license a distributor grants to all downstream recipients.

These clauses tend to stay out of sight until litigation or M&A diligence drags them into view, at which point they can be decisive. A patent-retaliation clause can vaporize a company's license to a project, and with it the company's freedom to operate, the moment it files a patent suit against that project. For the broader picture of how patents interact with software and standards, see our discussion of standard-essential patents and FRAND licensing and the foundational role of open standards in POSIX and POSIX standards.

Building an Open Source Compliance Program

If the law is this nuanced and the components this numerous, how does any company manage the risk? The answer is a compliance program — a set of policies, tools, and habits that make sure the organization knows what open source it uses, complies with each license's terms, and avoids the architectural blunders that trigger unwanted copyleft. This is no longer optional hygiene; for any company that ships software or expects to be acquired, it is table stakes. Practical Law's practice notes on open source use and compliance make the point bluntly: undocumented OSS can affect the monetary value of company-owned IP, reduce operational flexibility, expose proprietary source to competitors, and derail an acquisition. Our companion resource, the open source compliance checklists collection, turns the principles below into operational checklists.

A mature program rests on a few pillars:

Policy and governance. Someone owns the issue. Many organizations establish an Open Source Program Office (OSPO) or at least an open source review board, and adopt a written policy stating which licenses are pre-approved (typically permissive and weak-copyleft), which require legal review (strong copyleft like GPL/AGPL), and which are forbidden outright. The policy should also address inbound approval workflow (how a developer requests permission to use a component), outbound contributions (because pushing the company's code into an open source project is itself a licensing decision that can implicate the employer's IP), and a clear escalation path to legal for the hard linking-and-derivation questions.
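To make the three-tier policy concrete, here is a minimal Python sketch of how an inbound approval workflow might triage a requested component. Everything in it is an illustrative assumption: the POLICY table, the tier assignments, the triage function, and the choice to put the SSPL on the forbidden list are hypothetical, and the keys simply borrow SPDX-style license identifiers. A real policy is a legal judgment, not a lookup table.

```python
from enum import Enum


class Decision(Enum):
    APPROVED = "pre-approved"
    LEGAL_REVIEW = "requires legal review"
    FORBIDDEN = "forbidden"


# Hypothetical policy table keyed by SPDX-style license identifier.
# The tier assignments are illustrative only, not a recommendation.
POLICY = {
    "MIT": Decision.APPROVED,
    "BSD-3-Clause": Decision.APPROVED,
    "Apache-2.0": Decision.APPROVED,
    "LGPL-2.1-only": Decision.APPROVED,
    "MPL-2.0": Decision.APPROVED,
    "GPL-2.0-only": Decision.LEGAL_REVIEW,
    "GPL-3.0-only": Decision.LEGAL_REVIEW,
    "AGPL-3.0-only": Decision.LEGAL_REVIEW,
    "SSPL-1.0": Decision.FORBIDDEN,
}


def triage(license_id: str) -> Decision:
    """Classify a requested license; unknown licenses go to legal review."""
    return POLICY.get(license_id, Decision.LEGAL_REVIEW)


for requested in ["MIT", "AGPL-3.0-only", "Some-Custom-License"]:
    print(requested, "->", triage(requested).value)
```

The design choice worth copying is the default: a license the table has never seen falls into legal review rather than slipping through as approved.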

Inventory and the SBOM. You cannot comply with licenses you don't know you have. The central artifact of modern compliance is the Software Bill of Materials (SBOM) — a complete, machine-readable inventory of every component in a piece of software, including its version, its origin, and its license. Think of it as the nutrition label for software: just as a packaged food lists every ingredient, an SBOM lists every open source (and proprietary) component baked into a build, including the transitive dependencies — the libraries your libraries depend on, several layers deep, where most surprises lurk. SBOMs are produced in standardized formats (the two leading ones are SPDX, now an ISO/IEC standard, and CycloneDX, an OWASP project) by automated Software Composition Analysis (SCA) tools that scan a codebase and flag every component and license.
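The transitive-dependency point is easier to see in code. The Python sketch below walks a small invented dependency graph and lists every component a product pulls in, direct or indirect. The Component class, the CATALOG data, and the package names are all hypothetical, and the output is a simplified inventory, not a conforming SPDX or CycloneDX document; real SCA tools derive this information from lockfiles and package metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    version: str
    license_id: str          # SPDX-style identifier
    depends_on: tuple = ()   # names of direct dependencies


# Hypothetical dependency data, invented for illustration.
CATALOG = {
    "app": Component("app", "1.0.0", "LicenseRef-Proprietary",
                     ("web-lib", "date-lib")),
    "web-lib": Component("web-lib", "2.3.1", "MIT", ("parser-lib",)),
    "date-lib": Component("date-lib", "0.9.0", "Apache-2.0"),
    "parser-lib": Component("parser-lib", "4.1.0", "GPL-3.0-only"),
}


def inventory(root: str, catalog: dict) -> list:
    """Return every component reachable from root, including transitive ones."""
    seen = {}
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # already recorded; also guards against cycles
        comp = catalog[name]
        seen[name] = comp
        stack.extend(comp.depends_on)
    return sorted(seen.values(), key=lambda c: c.name)


for comp in inventory("app", CATALOG):
    print(f"{comp.name} {comp.version} {comp.license_id}")
```

In this invented graph the product never names parser-lib directly, yet it appears in the inventory with a GPL license because web-lib depends on it. That is the several-layers-deep surprise an SBOM exists to expose.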

The SBOM has rocketed from best-practice to near-mandate in just a few years. The pivotal moment was the 2021 federal cybersecurity executive order (Executive Order 14028), issued in May 2021 in the wake of the SolarWinds supply-chain compromise; the Log4j vulnerability, disclosed that December, reinforced the urgency. The order directed the National Telecommunications and Information Administration (NTIA), with later work by the Cybersecurity and Infrastructure Security Agency (CISA), to define and promote SBOMs, and it set the federal government on a course of requiring them from the software vendors it buys from. NTIA's "minimum elements" guidance and CISA's subsequent SBOM work (refreshed in CISA's 2025 update to the minimum-elements framework) have made the SBOM the lingua franca of software supply-chain transparency. Congress has reinforced the trend through defense-authorization provisions directing the Department of Defense toward SBOM requirements. By 2026, an SBOM is a routine contractual deliverable in enterprise and government software deals, and it is the foundational document of any credible open source compliance program. (Note the dual purpose: SBOMs serve both license compliance and security/vulnerability management, since knowing you ship Log4j version X is the only way to know you're exposed when a vulnerability in version X is announced.)

License obligation tracking and the notice file. For each component, the program must capture and satisfy the obligations: preserve copyright notices, include the required license texts (the ubiquitous "third-party notices" or NOTICE file in software products exists for exactly this reason), and — for copyleft components — provide source code through the channels the license specifies (whether by shipping it, including a written offer good for three years under GPLv2, or hosting it for download). The compliance work is mostly clerical, but the consequences of skipping it are not.
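Because this work is clerical, it is a natural candidate for automation. The following Python sketch shows one way a build step might assemble a third-party notices file from an inventory. The COMPONENTS entries, their field names, and the build_notice function are hypothetical, and the sketch prints only a license identifier where a real notice file would carry the full license text each license requires.

```python
# Hypothetical inventory entries; names, versions, and copyright lines
# are invented for illustration.
COMPONENTS = [
    {"name": "example-json", "version": "1.4.2", "license": "MIT",
     "copyright": "Copyright (c) 2020 Example Author"},
    {"name": "example-crypto", "version": "3.0.1", "license": "Apache-2.0",
     "copyright": "Copyright 2019 Example Contributors"},
]


def build_notice(product: str, components: list) -> str:
    """Render a plain-text third-party notices file from an inventory."""
    lines = [f"Third-party notices for {product}", ""]
    for comp in sorted(components, key=lambda c: c["name"]):
        lines.append(f"{comp['name']} {comp['version']}")
        lines.append(f"License: {comp['license']}")
        lines.append(comp["copyright"])
        # A real notice file would append the full license text here.
        lines.append("")
    return "\n".join(lines)


print(build_notice("Demo Product", COMPONENTS))
```

Generating the file from the same inventory that feeds the SBOM keeps the two from drifting apart, which is the usual way a hand-maintained notice file goes stale.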

Architecture and compatibility review. The highest-value legal work happens before code is written: deciding how copyleft components may be combined with proprietary code, and checking that the open source components chosen are even compatible with each other (the second hard question above). A good program routes "we want to use this GPL/AGPL library" requests to review, where the linking-and-derivation analysis and the compatibility-matrix check actually get done.
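A compatibility-matrix check can be sketched in a few lines of Python. The version below is deliberately tiny and rests on stated assumptions: KNOWN_INCOMPATIBLE and find_conflicts are hypothetical names, the table holds only the GPLv2-only/Apache 2.0 mismatch discussed in this article plus the GPLv2-only/GPLv3-only pairing the FSF describes as incompatible, and a pair's absence from the table does not prove the two licenses are compatible.

```python
from itertools import combinations

# Illustrative, incomplete table of license pairs treated as incompatible
# when combined in one distributed work. Identifiers are SPDX-style.
KNOWN_INCOMPATIBLE = {
    frozenset({"GPL-2.0-only", "Apache-2.0"}),
    frozenset({"GPL-2.0-only", "GPL-3.0-only"}),
}


def find_conflicts(license_ids) -> list:
    """Return each known-incompatible pair present among the given licenses."""
    unique = sorted(set(license_ids))
    return [
        (a, b)
        for a, b in combinations(unique, 2)
        if frozenset({a, b}) in KNOWN_INCOMPATIBLE
    ]


# Licenses of the components proposed for a single binary (hypothetical).
proposed = ["MIT", "Apache-2.0", "GPL-2.0-only"]
for a, b in find_conflicts(proposed):
    print(f"Escalate to legal: {a} and {b} cannot be combined")
```

The check looks at pairs rather than individual licenses because compatibility is a property of the combination. A result with no conflicts means only that nothing in the table matched; the linking-and-derivation analysis still needs a human reviewer.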

Remediation and the "tainted code" problem. When an audit finds a copyleft component improperly fused into proprietary code, the fixes range from cheap (swap in a permissively licensed equivalent; isolate the component behind an arm's-length interface) to expensive (rewrite the affected code from scratch — a "clean room" reimplementation) to drastic (open-source the affected module). Catching the problem early is the difference between a half-day refactor and a strategic crisis.

Open Source in Mergers and Acquisitions

Nowhere does open source get more attention — or kill more deals — than in M&A due diligence, where a buyer's lawyers and technical advisers pull apart the target's codebase before the buyer pays for it. The reason is simple and brutal: a software company's value lives in the assumption that it owns and controls its proprietary code. Open source can quietly undermine that assumption in two ways, and an acquirer will hunt for both.

First, the copyleft contamination risk. If the target statically linked GPL or AGPL code into its flagship product and distributed it, the buyer may be acquiring a product whose source code is arguably required to be public — meaning the "proprietary moat" the buyer is paying a premium for may not be defensible. A single careless import statement can knock millions off a valuation or trigger a price retrade.

Second, the provenance and obligation risk. Even permissive open source carries obligations (attribution, notices), and undocumented components carry unknown obligations and unknown security vulnerabilities. Buyers increasingly demand an SBOM as a condition of the deal and run an independent SCA scan, often through specialized open source audit firms, precisely because the target's own representations cannot be taken on faith.

The practical upshot for any company that hopes to be acquired (or to sell software, or to raise a serious financing round) is that compliance is not a back-office nicety — it is enterprise value. The transactional representations matter too: software acquisition agreements and license agreements now routinely include detailed open source warranties — the target represents that it has complied with all applicable open source licenses; that its use of open source has not subjected, and will not subject, any proprietary code to a copyleft disclosure obligation; and that it has not distributed proprietary code under any open source license. Because the upstream authors disclaimed all warranties, these seller-side representations (backed by indemnities and, increasingly, escrow holdbacks tied to clean-up of identified issues) are how a buyer recovers the risk the licenses themselves refuse to bear. For how these representations slot into larger deals, see software transactions--an overview and the practical drafting guidance in software license agreement review checklists.

The Relicensing Wave: When "Open Source" Companies Walk Away

For most of open source's history, the gravity ran one direction: toward openness. The dramatic story of the 2020s is a series of prominent companies running the other way — abandoning genuine open source licenses for restrictive "source-available" licenses that let you read the code but not freely use it. Understanding why requires understanding the business problem, and understanding the licenses requires keeping the OSI Definition firmly in mind, because the entire controversy turns on which licenses are, and are not, actually open source.

The business problem: the cloud free-rider. Picture Beacon Data, a startup that builds a brilliant open source database under a permissive or weak-copyleft license, funds development by selling support and a hosted version, and gathers a thriving community. Then a hyperscale cloud provider — call it Titan Cloud — takes Beacon's freely licensed code, wraps it in a managed cloud service, sells it at enormous scale, contributes little back, and captures most of the revenue the open product generates. Beacon did the R&D; Titan monetized it. This "strip-mining" dynamic, entirely lawful under permissive and even most copyleft licenses, is the grievance that drove the relicensing wave. The companies that relicensed were, almost uniformly, trying to stop the hyperscalers from reselling their work as a service.

The new licenses:

The Server Side Public License (SSPL). Introduced by MongoDB in 2018, the SSPL is a modified AGPL that goes one giant step further: if you offer the software as a service, you must open-source not just the software but essentially the entire service stack used to provide it — management tools, automation, the works. That obligation is so sweeping that the OSI declined to recognize the SSPL as open source (MongoDB ultimately withdrew the submission amid OSI opposition), on the view that it discriminates against a field of endeavor (commercial SaaS providers) and so fails the OSD. SSPL is "source-available," not open source. Elastic (the company behind Elasticsearch) adopted the SSPL in 2021 in its public fight with Amazon — then, in a notable plot twist, returned to genuine open source by adding the AGPL as an option in 2024, an unusually candid reversal.
The Business Source License (BUSL/BSL). Pioneered by MariaDB and adopted famously by HashiCorp (Terraform, Vault, and others) in 2023, the BUSL is a delayed-open model. The source is available and broadly usable, except that you may not use it to compete with the licensor's commercial offering, and the whole thing automatically converts to a true open source license (typically Apache 2.0 or MPL) on a set "change date," which under BUSL 1.1 falls no later than four years after a given version's first public release. The BUSL is explicitly not open source during the restricted window because of the non-compete field-of-use restriction (OSD criterion 6 again), but it eventually becomes open source by its own terms. HashiCorp's move was contentious enough that the community forked Terraform into a permanently open project, OpenTofu, under the stewardship of the Linux Foundation. The fork is a vivid demonstration of open source's ultimate check on licensors: if you take the code proprietary, the community can take the last open version and keep going without you. Redis made a similar SSPL/source-available move in 2024 (prompting a fork, Valkey, backed by the Linux Foundation), and then, like Elastic, announced a return to an open source option (AGPL) in 2025.

The "open core" model. Distinct from relicensing is open core, a business model (not a license) in which a company offers a genuinely open source "core" product and sells proprietary "enterprise" add-ons — security, scale, support, advanced features — on top. Open core lets a company stay honestly open at the foundation while monetizing the edges, and it remains the most common commercial open source business model. The relicensing wave is, in part, a confession that for some products (databases especially) open core alone could not hold off the hyperscalers.

The throughline for a lawyer advising a client: do not assume that a project you adopted years ago is still open source. A license can change with a new version. A component your product depends on may have relicensed under SSPL or BUSL while you weren't looking, quietly converting a compliance question into a competitive-use question — and the BUSL's anti-competition clause is a contract restriction, enforceable on its own terms, that no amount of "but it's open source" will wish away. (Recall Neo4j: you cannot simply delete a restriction you dislike from a license and proceed as if the software were freely open.) This is precisely the kind of moving target that makes the inventory-and-monitoring discipline of a compliance program indispensable.

Open Source Meets Artificial Intelligence

The newest and least-settled frontier is the collision of open source with generative AI, and it cuts in several directions at once. A practitioner in 2026 should understand the questions even though many answers are still pending in court.

AI trained on open source code. The large language models that power AI coding assistants were trained, in substantial part, on the vast corpus of public open source code. That raises an unresolved copyright question: does training a model on GPL- or MIT-licensed code, and then having the model emit suggestions, comply with — or even implicate — those licenses? When an AI assistant reproduces a chunk of someone's GPL'd code into a user's proprietary project, has it stripped the attribution and copyleft obligations the license requires? This is the core grievance in the Doe v. GitHub (Copilot) litigation in the Northern District of California, No. 4:22-cv-06823, where developers allege that AI code-generation tools reproduce their open source code without honoring license terms; the case has been substantially narrowed through motion practice, but the central tension — open source licenses assume attribution and license-text propagation that AI output does not naturally carry — remains live and is part of the broader wave of copyright infringement claims against generative AI.

The "open source AI" definition problem. A parallel fight concerns what it even means for an AI model to be open source. The OSI published its Open Source AI Definition (OSAID) in late 2024, attempting to extend open source principles to models (covering the model weights, the training and inference code, and sufficiently detailed information about the training data). The definition is contested — purists argue that a model is not truly open unless the training data is itself open, while several companies that label their models "open" release the weights under custom licenses that restrict certain uses (which, by the field-of-use logic above, are not open source in the classical sense). The vocabulary problem of "source-available vs. open source" has thus reappeared one level up, now about model weights instead of source code — and Neo4j's lesson that "open source" is not a free-floating marketing label applies here with full force.

Output, provenance, and SBOMs for AI. Practically, AI raises the stakes on everything earlier in this article. Code that flows out of an AI assistant has uncertain provenance — you may not know whether a generated snippet carries copyleft obligations — which makes the inventory-and-SBOM discipline more important, not less. Expect "AI bill of materials" (AIBOM) concepts to extend the SBOM idea to models and training data, and expect open source policies to add explicit rules about AI-generated code (mandatory scanning of generated snippets, suppression of large verbatim suggestions, and human review before AI output is committed). For the wider legal landscape, our overview of artificial intelligence key legal issues and the question of who owns what the machine creates provide the surrounding doctrine.

Putting It to Work: A Short Practical Walkthrough

Let's tie the threads together with one more invented company. Helios Health is a digital-health startup building a patient-monitoring platform it intends to sell to hospitals and, eventually, to sell itself to a larger acquirer. (Helios is hypothetical.) What should Helios actually do?

First, Helios adopts a written open source policy and assigns ownership — a single accountable person or small review board. The policy pre-approves MIT, BSD, and Apache 2.0; routes LGPL and MPL to a quick architecture check; requires legal sign-off before any GPL or AGPL component goes into a distributed product; and maintains a compatibility matrix so engineers cannot unknowingly combine, say, GPLv2-only and Apache-2.0 code in one binary. Helios is shipping software to hospitals (distribution), so copyleft is a real concern; if it were running purely as a hosted service, AGPL would move to the top of the worry list because of network copyleft.

Second, Helios wires an SCA tool into its build pipeline so that every build automatically generates an SBOM. Now Helios always knows, in machine-readable form, exactly what is inside its product and under what licenses — including the transitive dependencies where surprises hide.

Third, Helios honors obligations mechanically: a generated NOTICE file ships with the product, carrying every required attribution and license text, and source-code offers are prepared for any weak-copyleft components. Because Helios handles protected health information, its open source program runs alongside its security and privacy programs — the same SBOM that proves license compliance also drives vulnerability response, which dovetails with its broader privacy compliance program.

Fourth, Helios monitors for relicensing. The database it adopted in year one could relicense under SSPL or BUSL in year three; Helios's review board re-checks key dependencies on a schedule so a competitive-use restriction never sneaks in unnoticed.
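A scheduled re-check of this kind can be as simple as comparing two inventory snapshots. The Python sketch below is a minimal illustration under stated assumptions: the snapshot format (component name mapped to an SPDX-style license identifier), the license_changes and needs_escalation helpers, and the component names are hypothetical, and SOURCE_AVAILABLE lists only the two licenses discussed in this article.

```python
# Licenses treated as source-available for this sketch (not exhaustive).
SOURCE_AVAILABLE = {"SSPL-1.0", "BUSL-1.1"}


def license_changes(previous: dict, current: dict) -> list:
    """Report components whose license differs between two snapshots."""
    changes = []
    for name in sorted(current):
        old = previous.get(name)
        new = current[name]
        if old is not None and old != new:
            changes.append((name, old, new))
    return changes


def needs_escalation(changes: list) -> list:
    """Keep only the changes that moved to a source-available license."""
    return [c for c in changes if c[2] in SOURCE_AVAILABLE]


# Hypothetical snapshots taken two years apart.
year_one = {"beacon-db": "Apache-2.0", "date-lib": "MIT"}
year_three = {"beacon-db": "BUSL-1.1", "date-lib": "MIT"}

for name, old, new in needs_escalation(license_changes(year_one, year_three)):
    print(f"{name}: {old} -> {new}; check for competitive-use restrictions")
```

In practice a license change arrives with a new version of the component, so a fuller snapshot would record the version alongside the license and let the review board decide whether to stay on the last open release.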

Finally, when an acquirer comes calling, Helios hands over a clean SBOM, a documented compliance history, and confident open source warranties — and the diligence that sinks careless targets becomes a non-event. The lesson is that open source discipline is cheap and continuous when done from the start, and ruinously expensive when reconstructed under deal pressure. For the law firm's own technology choices, the same philosophy informs our pieces on running an open source law firm and the broader case for open tooling in running a law firm on open source technology.

Key Takeaways

Open source software is not free of legal rights — it is copyrighted software licensed to the world on conditions, and those conditions are enforceable. The OSI's ten-point Open Source Definition is the authoritative test for what counts as open source, and its no-field-of-use-restriction rule (criterion 6) is the bright line that separates true open source from the larger world of "source-available" code. And every open source license disclaims warranties and indemnity, which is why provenance and compliance work falls on the user.

The license families matter more than any single license. Permissive licenses (MIT, BSD, Apache 2.0) let you build proprietary products on top with light obligations; copyleft licenses (the GPL family) require you to share back, with strength ranging from weak (LGPL, MPL — share back changes to the shared component) to strong (GPL — share back the whole distributed work) to network-reaching (AGPL — share back even for server-side use).

Two questions are genuinely hard and unsettled: what combination of code creates a "derivative work" that triggers copyleft (static linking is high-risk; dynamic linking is contested; system calls and arm's-length interfaces generally do not, on their own, trigger it), and whether two open source licenses are even compatible with each other in a single distributed work (GPLv2-only and Apache 2.0 are the classic mismatch). Because the case law is thin, the discipline is architecture, compatibility checking, and avoidance, not litigation.

Jacobsen v. Katzer, 535 F.3d 1373 (Fed. Cir. 2008) established that open source license terms are conditions on the copyright grant, so violating them is copyright infringement with full remedies — the legal foundation that makes every other open source license enforceable. Artifex v. Hancom added that such licenses may also be enforceable contracts; SFC v. Vizio is testing whether end users can enforce the GPL's source-disclosure promise as third-party beneficiaries; Neo4j v. PureThink held that you cannot strip restrictions a licensor lawfully added or mislabel restricted code as "open source"; and the SCO-Linux saga taught the enduring lesson that provenance matters.

A real compliance program — policy, an OSPO, an SBOM generated by automated scanning, obligation tracking, and architecture/compatibility review — is now table stakes, not a luxury, especially because open source is the issue most likely to derail an acquisition.

The field is moving fast: the SSPL/BUSL relicensing wave means a project that was open source when you adopted it may not be open source today, sometimes carrying competitive-use restrictions; and generative AI has thrown open hard, unresolved questions about training on open source code, attribution in AI output, and what "open source" even means for an AI model.

Frequently Asked Questions

Is open source software really free? Can my company just use it without paying? You can almost always use it without paying money — that is the whole point — but "free of charge" is not the same as "free of obligations." Every open source license imposes conditions, ranging from light (keep the attribution notice) to heavy (release your own source code). The cost of open source is paid in compliance, not dollars, and ignoring the conditions can convert a free component into a copyright lawsuit.

What's the difference between a "permissive" and a "copyleft" license, in one breath? A permissive license (MIT, BSD, Apache 2.0) lets you put the code inside a closed, proprietary product as long as you keep the notices. A copyleft license (the GPL family) requires that if you distribute something built from the code, you release that under the same open terms — source code and all. Permissive gives back nothing but credit; copyleft gives back the code.

If I use one GPL library, do I really have to open-source my entire product? Possibly — and that is exactly why this matters. If you distribute a product that combines your code with GPL code into a derivative work (statically linking is the classic trigger), the GPL's reciprocity can reach your code too. You can often avoid this by using a permissively or weak-copyleft-licensed alternative, by linking dynamically to an LGPL version, or by keeping the GPL component at genuine arm's length. The analysis is genuinely tricky and unsettled; get it reviewed before you ship, not after.

Can I just combine any two open source components in my product? Not necessarily. Some licenses are incompatible with each other — most famously GPLv2-only and Apache 2.0 — meaning there is no lawful way to distribute a single work that combines them. Permissive code can almost always go into a copyleft project, but copyleft code generally cannot go into a permissive one without converting the whole thing. Compatibility is a property of the combination, not of any single license, so it has to be checked every time pieces are joined.

What is an SBOM and do I actually need one? A Software Bill of Materials is a complete, machine-readable list of every component in your software, with versions and licenses — a "nutrition label" for code. You increasingly do need one: it is the foundation of license compliance, it is essential for responding to security vulnerabilities, it is now expected in enterprise and government contracts (driven by the 2021 federal cybersecurity executive order and CISA's continuing SBOM work), and acquirers will demand it in diligence.

A project I depend on just switched to the SSPL or BUSL. Is it still open source, and what should I do? Probably not — the SSPL and the BUSL are "source-available" licenses, not OSI-approved open source, because they restrict certain uses (commercial SaaS for SSPL; competing with the licensor for BUSL). Practically: read the new license carefully for use restrictions, check whether they affect your use, watch whether the BUSL's eventual conversion date helps you, and consider whether a community fork of the last open version (like OpenTofu for Terraform or Valkey for Redis) is a better long-term bet. Do not assume you can simply ignore a restriction you dislike; Neo4j v. PureThink says you can't. This is exactly why you monitor your dependencies rather than adopt-and-forget.

Are open source licenses actually enforceable in court? Yes. Jacobsen v. Katzer (Fed. Cir. 2008) held that open source license terms are conditions on the copyright grant, so violating them is copyright infringement — which means injunctions and statutory damages, not just nominal contract damages. Artifex v. Hancom added that they can be enforced as contracts too, and SFC v. Vizio is testing whether end users can enforce them as third-party beneficiaries. German courts have enforced the GPL repeatedly. The enforcement culture favors negotiated compliance over lawsuits, but the courthouse is fully available.

Open source comes with no warranty. Who is on the hook if it breaks or infringes someone's patent? Essentially no one upstream — open source licenses disclaim all warranties and provide no indemnity, so the authors are not liable if the code is defective, insecure, or infringing. That risk lands on you. In commercial deals it gets reallocated by contract: buyers extract open source warranties and indemnities from the seller of the company (not from the code's authors, who can't be reached), backed by escrow or holdbacks. It is one more reason knowing exactly what you ship is non-negotiable.

Can I get in trouble for code an AI assistant generated for me? This is unresolved, and that uncertainty is itself the risk. AI coding tools were trained on open source code and can reproduce snippets without carrying the attribution or copyleft obligations the original licenses require (the issue at the heart of the Doe v. GitHub/Copilot litigation). Until the law settles, treat AI-generated code as having uncertain provenance: track it, scan it, suppress large verbatim suggestions, and don't assume it is obligation-free.

My company only runs software as a cloud service and never "distributes" anything. Does copyleft still apply? For the classic GPL, often not — its reciprocity is triggered by distribution, and pure SaaS may never distribute a copy. But the AGPL was written precisely to close that loophole: if users interact with your modified AGPL software over a network, you must offer them the source. And the SSPL goes even further for service providers. So "we only run it as a service" is a reason to scrutinize AGPL and SSPL components especially carefully, not a reason to relax.

This article provides general information only and is not legal advice. Open source licensing and software supply-chain law are nuanced, fact-specific, and fast-moving; the application of any license, case, or compliance practice to your situation may differ. Consult qualified counsel before making decisions about open source adoption, license compliance, or any transaction involving software.