Picture the moment a competitor lifts your website wholesale. The product descriptions you sweated over, the photographs you paid a studio to shoot, the carefully written "About Us" story, the custom illustrations, even the layout that took your designer three rounds to get right—all of it suddenly reappears on a rival's domain, sometimes word for word, sometimes with your company name clumsily swapped out. You feel robbed, and in a meaningful sense you are. So you call a lawyer and say the obvious thing: "I want to sue them for stealing my website."

And the lawyer asks a question that sounds almost rude: "Which part?"

That question is the whole subject of this guide. Because here is the truth that surprises nearly everyone the first time they hear it: there is no such thing as a copyright in "a website." The Copyright Act—Title 17 of the United States Code—does not list "website" anywhere as a category of protectable work. The U.S. Copyright Office will not let you write "website" on the line of the application that asks what kind of authorship you are claiming. A website, in the eyes of copyright law, is not a work. It is a container for works—a bundle of text, images, audiovisual material, sound, and computer code, each of which is (or is not) protected on its own terms, owned by its own owner, and registered through its own door.

Once you internalize that, everything else about website copyright snaps into focus. You stop asking "how do I register my website?" and start asking the questions that actually have answers: Who wrote this text and who owns it? Is this photograph mine or the freelancer's? Is the blog post "published" or "unpublished"? Can I sweep ten thousand product pages into one filing, or do I need ten thousand filings? And when someone copies it, what registration do I need in hand before I can walk into federal court?

This guide answers all of that. We will work through the Copyright Office's official framework—set out in Chapter 1000 of the Compendium of U.S. Copyright Office Practices (3d ed. 2021), the agency's published rulebook, supplemented by Circular 66, Copyright Registration of Websites and Website Content—and then connect it to the parts of website life that registration alone does not solve: the constant-updating problem, user-generated content, the DMCA's notice-and-takedown safe harbor under 17 U.S.C. § 512, the new small-claims forum at the Copyright Claims Board, the terms-of-service license that quietly governs everything your users post, and the live battleground of data scraping. Along the way we will use a running cast of invented businesses—"Brewmosa Coffee Co." (a craft-coffee retailer), "Pixelwright Studio" (a small design shop), and "Ferndale Forums" (a hobbyist community site)—so the abstractions stay concrete. Everything attributed to these three is hypothetical.

If you would like the broader picture of how registration works for any kind of work, our companion piece Copyright Registration—A Comprehensive Guide is the master hub, and the step-by-step mechanics of the online application live in How to Register a Copyright with the U.S. Copyright Office. For the foundational vocabulary, Copyright FAQs—Answers to Common Copyright Questions is a useful primer. This article is the website-specific deep dive.

Why bother registering at all? The federal-court ticket and the damages multiplier

Before we get into the how, let's be clear about the why, because website owners often assume copyright registration is a bureaucratic nicety. It is not. For United States works, registration is closer to a license to fight.

Copyright protection itself is automatic. The instant an original work is "fixed in a tangible medium of expression"—the moment your blog post is saved, your photo is captured, your code is written to disk—copyright exists (17 U.S.C. § 102(a); for the timing rule, see § 302(a)). You do not need to register, publish, or even add a © notice to own the copyright (registration is permissive under § 408(a)). So far, so good. But owning a copyright and being able to enforce it are two different things, and that gap is where registration earns its keep.

Registration is a precondition to suing. Section 411(a) of the Copyright Act says that no civil infringement action "shall be instituted until . . . registration of the copyright claim has been made." For years courts split on what "made" meant—was it enough to file the application, or did you have to wait for the Office to act? The Supreme Court settled it in Fourth Estate Public Benefit Corp. v. Wall-Street.com, LLC, 586 U.S. 296 (2019), holding unanimously that registration is "made" only when the Copyright Office actually acts on the application—registering the claim or refusing it—not merely when the applicant submits the paperwork. Translation: if you have not registered, and the alleged infringement is unfolding now, you may be stuck waiting months for the Office to process a fresh application before a court will even hear you, all while the copying continues. (There is one narrow lifeline: under § 411(a), if the Office refuses registration, you may still sue so long as you serve notice and a copy of the complaint on the Register of Copyrights, who may intervene on the registrability question. And you can pay for expedited "special handling" to register in connection with litigation—but it is not cheap.) The practical lesson is to register before you have a problem, not after.

Registration unlocks the remedies that make infringement worth fighting. Under 17 U.S.C. § 412, you can recover statutory damages and attorney's fees only if the work was registered before the infringement began or, for published works, within three months after first publication. This matters enormously. Without timely registration, you are limited to "actual damages and profits," and proving that a competitor's lift of your product page cost you a specific dollar amount is often nearly impossible, especially for a small site. With timely registration, you can elect statutory damages of up to $30,000 per work infringed, rising to $150,000 per work for willful infringement (17 U.S.C. § 504(c)), and ask the court to award your legal fees (17 U.S.C. § 505). For a website with dozens of separately registered photographs or articles, that math gets the other side's attention fast. (We unpack the full enforcement toolkit in What Are the Consequences of Pirating Intellectual Property?)
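The arithmetic behind that leverage can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a litigation calculator: the helper names (`registration_is_timely`, `max_statutory_exposure`) are invented for this guide, the three-month window is approximated with calendar-month arithmetic, and the number of separately infringed works is taken as a given even though counting "works" can itself be contested.

```python
import calendar
from datetime import date
from typing import Optional

# Per-work ceilings described in the text (17 U.S.C. § 504(c)).
ORDINARY_CAP = 30_000
WILLFUL_CAP = 150_000


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months, clamping to the month's last day."""
    index = d.month - 1 + months
    year, month = d.year + index // 12, index % 12 + 1
    return date(year, month, min(d.day, calendar.monthrange(year, month)[1]))


def registration_is_timely(registered: date,
                           infringement_began: date,
                           first_published: Optional[date] = None) -> bool:
    """Simplified reading of the § 412 timing rule (hypothetical helper)."""
    if registered <= infringement_began:
        return True  # registered before the infringement began
    if first_published is not None and infringement_began >= first_published:
        # Grace period for published works: three months from first publication.
        return registered <= add_months(first_published, 3)
    return False


def max_statutory_exposure(works_infringed: int, willful: bool = False) -> int:
    """Upper bound only; a court may award far less per work."""
    return works_infringed * (WILLFUL_CAP if willful else ORDINARY_CAP)
```

On these assumptions, forty timely registered photographs put the ceiling at 40 × $30,000 = $1,200,000 for ordinary infringement and 40 × $150,000 = $6,000,000 if willfulness is proved. The same forty photographs registered late put it at zero statutory damages, which is the whole point of the timing rule.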

Registration is prima facie evidence. A certificate of a registration made before or within five years after first publication constitutes prima facie evidence of the validity of the copyright and of the facts stated in the certificate (17 U.S.C. § 410(c)). That shifts the burden to the infringer to come forward and prove your copyright is invalid, a meaningful head start in litigation. A related but separate tool is the copyright notice: where a proper notice appears on the published copies the defendant had access to, the defendant's "innocent infringement" argument in mitigation of damages is given no weight (17 U.S.C. §§ 401(d), 402(d)). That benefit comes from the notice, not from the registration, so a site that wants both should do both.

Registration opens two newer doors. First, you can record your registration with U.S. Customs and Border Protection to block the importation of infringing copies—useful if a counterfeiter offshore is duplicating your catalog. Second, and more important for small online businesses, registration (or a pending application) is the entry ticket to the Copyright Claims Board (CCB), the new copyright "small-claims court" created by the Copyright Alternative in Small-Claims Enforcement (CASE) Act of 2020 and operating inside the Copyright Office since 2022. The CCB hears claims up to $30,000 without the expense of federal litigation—but you must have a registration or a pending application to file, and the Office must actually issue the registration before the Board can render a final determination (17 U.S.C. §§ 1504–1505). For a Brewmosa-sized business facing a Brewmosa-sized infringer, the CCB can be the difference between a remedy and a shrug. We discuss the forum-selection tradeoffs in What Are the Consequences of Pirating Intellectual Property?.

So the case for registering your website's content is overwhelming. The only real questions are what to register and how.

A website is a bundle of works: the foundational concept

Let's go back to that container metaphor and make it precise, because everything follows from it.

The Copyright Office defines a website, for registration purposes, as "a webpage or set of interconnected webpages, including a home page, located on the same computer or server, and prepared and maintained as a collection of information by a person, group, or organization." That is a description of a place, not a work. The works are the things you put in that place.

Consider Brewmosa Coffee Co.'s site. A single product page might contain:

  • Literary works — the tasting notes, the brand story, the FAQ, the brewing instructions (text under 17 U.S.C. § 102(a)(1));
  • Pictorial and graphic works — the product photographs, the hand-drawn coffee-cherry illustrations, the logo, the iconography (§ 102(a)(5));
  • Audiovisual works — a short "how we roast" video embedded on the page (§ 102(a)(6));
  • Sound recordings — background audio or a podcast clip (§ 102(a)(7));
  • A computer program — the JavaScript, the back-end code, the custom plugin that runs the subscription box (software is a "literary work" under the Act); and
  • An arrangement — the original editorial judgment in selecting and ordering all of the above.

Each of those bullet points is a different kind of authorship, governed by different deposit rules, sometimes owned by a different person, and—critically—registered through a different application or group option. The site as a whole is just the membrane holding them together.

This is why the Office tells applicants, in no uncertain terms, not to list "website" as the authorship type. You register the contents according to what they predominantly are. A text-heavy blog post is a literary work. A gallery of product shots is a set of photographs. A how-to video is an audiovisual work. The software running the site is a computer program—and registration of code has its own quirks (deposit rules, trade-secret redaction options, the source-versus-object-code question), all covered in our companion guide, Copyright Registration of Computer Programs, and from the broader strategic angle in Legal Protection of Software—Copyrights, Patents, Trade Secrets, and Contracts.

There is one important wrinkle, and it is where the "register the website itself" instinct finds its only real home. You can register the website as a compilation or collective work—but only the thin layer of authorship that consists of the original selection, coordination, and arrangement of the content, and only under specific conditions. We'll get to that, but hold the thought: registering "the website as a compilation" protects your editorial arrangement, not the underlying photographs and text, unless you own all of those too.

What on a website is even copyrightable—and what isn't

Copyright protects original expression, and "original" means only that the work was independently created and possesses at least a minimal "spark" of creativity. That is a low bar—the Supreme Court in Feist Publications, Inc. v. Rural Telephone Service Co., 499 U.S. 340 (1991), said the requisite level of creativity is "extremely low; even a slight amount will suffice." But Feist also drew the floor: facts are not copyrightable, and neither is the "sweat of the brow" you expended gathering them. A bare alphabetical phone directory was held unprotectable because selecting and arranging names in alphabetical order is "garden-variety," "mechanical," and "devoid of even the slightest trace of creativity." That holding echoes loudly on the web, where so much "content" is functional data.

Here is the rough map of the copyrightable and the not-copyrightable on a typical website.

Generally protectable: original written content (articles, posts, descriptions with genuine expression), original photographs and artwork, custom illustrations, original videos and animations, original music and sound recordings, and human-authored computer code with creative expression. Also potentially protectable is the original selection and arrangement of all that content as a compilation, if there is real editorial creativity in it.

Generally not protectable on its own:

  • Ideas, plans, concepts, and "the idea for a website." Copyright protects expression, never the idea behind it (17 U.S.C. § 102(b))—the bedrock idea/expression dichotomy that the Supreme Court traced back to Baker v. Selden, 101 U.S. 99 (1879). Your brilliant concept for a coffee-subscription platform is not copyrightable; the specific words and images you use to express it are.
  • Domain names and URLs. brewmosa.com is not a copyrightable work (it may, however, function as a trademark—see the boundary in Copyright vs. Trademark—What Is the Difference?).
  • The "look and feel," layout, and format of a page. The general visual layout—where the menu sits, that there is a hero image up top—is treated as an unprotectable idea or a functional design element. This is one of the most misunderstood points in all of website copyright. You cannot register "the look and feel."
  • Functional design elements. Elements dictated by how the site has to work are filtered out as functional, not expressive. This principle is a close cousin of the "merger" and scènes à faire doctrines courts apply to software interfaces.
  • Common, unoriginal material. Familiar symbols, generic icons, standard typefaces, short phrases, names, titles, and slogans. (A slogan may be a trademark, but it is not copyrightable.)
  • Cascading Style Sheets (CSS). Because CSS largely encodes formatting and layout—the unprotectable stuff—the Office generally refuses claims consisting solely of style-sheet languages.
  • Mere facts and data. Per Feist, the prices, the addresses, the specs, and the raw data points are facts. The original way you select and arrange them might support a thin compilation copyright, but the facts themselves are free for the taking.

A useful way to hold this: the more your content reflects human creative choice (what to say, how to say it, what to show, how to compose the shot), the more solidly it is protected. The more it reflects function, convention, or fact (how the layout has to work, what the standard icon looks like, what the address is), the less protectable it becomes.

A word on the newest frontier: AI-generated content. The Copyright Office's position, set out in its 2023 registration guidance and reinforced in Thaler v. Perlmutter, 687 F. Supp. 3d 140 (D.D.C. 2023), aff'd, 130 F.4th 1039 (D.C. Cir. 2025), is that purely machine-generated material lacks the human authorship copyright requires. If Brewmosa's product blurbs are spun up wholesale by a generative model with no meaningful human authorship, those blurbs may not be registrable at all, and any registration must disclaim the AI-generated portions while claiming only the human-authored expression. As more sites are populated by AI tools, this caveat will increasingly shape what is actually registrable—a theme we develop in Artificial Intelligence Key Legal Issues—A Comprehensive Overview for Businesses and Legal Professionals.

HTML: a special case

People often want to register "the HTML" of their site, imagining it captures everything. It mostly doesn't. HTML (HyperText Markup Language) is a markup language that tells a browser how to display content. Much of it is auto-generated by website-building software, and the parts that merely dictate formatting and layout are not protectable (same reason as CSS). The Office will register HTML as a literary work only if it was written by a human and contains a sufficient amount of creative expression—and even then, the registration covers only the human-authored code text, not the visual formatting it produces and not any audio, image, or video content that the HTML merely references. In nearly all cases, if you want to protect what users see on the page, you should deposit the content as it appears, not the underlying HTML.

The website as a compilation or collective work

Now to the one route by which something like "the website itself" can be registered. The Copyright Act recognizes two related forms of authorship that exist on top of other works:

  • A compilation is "a work formed by the collection and assembling of preexisting materials or of data that are selected, coordinated, or arranged in such a way that the resulting work as a whole constitutes an original work of authorship" (17 U.S.C. § 101). The "preexisting materials" can be anything, including raw facts. The copyright is in the selection, coordination, and arrangement—nothing more.
  • A collective work is a particular kind of compilation: "a work, such as a periodical issue, anthology, or encyclopedia, in which a number of contributions, constituting separate and independent works in themselves, are assembled into a collective whole" (17 U.S.C. § 101). Think of a magazine: each article is its own work, and the issue as a whole is a collective work. (For the special rights this creates, see Contributions to a Collective Work, which digs into 17 U.S.C. § 201(c) and the freelancer-rights problem the Supreme Court tackled in New York Times Co. v. Tasini, 533 U.S. 483 (2001)—a case that arose precisely from the migration of print content into online databases.)

A website can qualify as a compilation or collective work when there is enough original creativity in how its content is selected, coordinated, or arranged—including, the Office notes, creativity in the overall hierarchy of the site, such as the way pages are connected and linked. If Pixelwright Studio curates a portfolio site where the choice of which projects to feature, in which order, grouped into which themed galleries, reflects genuine editorial judgment, that arrangement can be registered as a compilation.

But two limits cabin this route hard, and you must respect both.

First, the thin-copyright limit. A compilation registration protects only the selection-and-arrangement layer. It does not protect the underlying works unless you separately own and claim them. And the protection is famously "thin": a competitor who uses different content arranged in a similar overall way may not infringe, because what you own is the specific creative arrangement of your specific materials, not the layout concept. This is the lesson of Feist applied forward—the more your "arrangement" is dictated by convention or function, the thinner (or nonexistent) the protection. The compilation claim also does not cover the general layout or format of a page, and—again—the Office will refuse a claim built solely on style sheets.

Second, the ownership limit. When you register a website as a compilation or collective work, the registration can cover both the arrangement and the individual works appearing on the site—but only if the claimant fully owns the copyrights in both the compilation and the underlying works at the time of registration. This is the make-or-break condition. If Brewmosa owns all its own text, all its own photos, and the arrangement, it can file a single collective-work application that reaches the whole bundle. But if even one featured photo belongs to a freelancer who never assigned it, or one testimonial belongs to the customer who wrote it, that material has to be excluded from the claim. A claim in an entire website never extends to third-party material the claimant doesn't own—and, the Office specifically warns, it never extends to externally linked content. You don't own what lives on someone else's server just because you linked to it.

This is why ownership housekeeping (the next section) is not a side issue. It is the gate that determines whether you get to register the whole site in one efficient filing or have to carve it into pieces.

Who owns what: authorship on a website

To register anything, you must name the author and the copyright owner. The initial owner of any copyright is the work's author (17 U.S.C. § 201(a)). On a website, authorship can come from at least four directions, and getting this right is half the battle.

1. You (or your company), directly. If you personally wrote the copy and shot the photos, you are the author and owner. Simple.

2. Your employees, via work made for hire. Under 17 U.S.C. § 101 and § 201(b), a work prepared by an employee within the scope of employment is a "work made for hire," and the employer is deemed the author and owner from the start—not merely the licensee. If Brewmosa's in-house marketing manager writes the product copy as part of her job, Brewmosa owns it automatically. Whether someone is a true "employee" turns on the multifactor common-law agency test the Supreme Court laid out in Community for Creative Non-Violence v. Reid, 490 U.S. 730 (1989): the hiring party's right to control the manner and means of creation, the skill required, the source of tools, the location of the work, the duration of the relationship, the provision of employee benefits, and the tax treatment of the worker, among others. The label the parties use is not controlling; the substance of the relationship is.

3. Independent contractors—the great trap. Here is where countless website owners discover, too late, that they don't own their own site. If you hire a freelance designer, photographer, or developer as an independent contractor, the default rule flips: the contractor is the author and owner, not you—even though you paid for the work. A commissioned work qualifies as a "work made for hire" only if (a) it falls into one of nine specifically enumerated statutory categories (such as a contribution to a collective work, a part of an audiovisual work, a compilation, or an instructional text) and (b) the parties sign a written agreement expressly saying it is a work made for hire (17 U.S.C. § 101, second definition). General website copy and standalone graphic design often don't fit the nine categories, so the work-for-hire label may not even be available no matter what the contract says. The reliable fix is a written, signed assignment transferring all rights to you (which, under § 204(a), must be a writing signed by the rights-holder). If Pixelwright builds Brewmosa's site as a contractor and there is no assignment, Pixelwright owns the design—and Brewmosa is left with, at best, an implied, non-exclusive license to use it for its intended purpose. The lesson is blunt: get the assignment in writing, before the work starts, every time. (Our guides on Drafting Enforceable Non-Disclosure Agreements for Technology Transactions and Software Licensing Agreements cover the contract mechanics in depth.)

4. Third parties and user-generated content (UGC). Many sites display content created by people who are neither employees nor contractors: customer reviews, forum posts, uploaded photos, comments. By default, that content belongs to whoever created it. You cannot register it as your own unless ownership was validly transferred to you in writing—and the usual vehicle for that transfer is your terms of service. The Office will accept an application claiming UGC if you identify the authors and confirm that copyright in that content was transferred to you (the claimant). It also offers a sensible accommodation for sites with mountains of UGC: if the content was created by a large number of authors, you may provide a representative number of author names plus the number of additional contributors, rather than listing all ten thousand forum members. We return to the UGC-and-terms-of-service question below, because it is where copyright registration, contract law, and the DMCA all collide.

Worked example (hypothetical). Brewmosa's site has four content streams: (1) in-house-written marketing copy (Brewmosa owns it as employee work made for hire); (2) product photos shot by a freelance photographer (the photographer owns them unless the contract assigned them); (3) a logo and homepage design created by Pixelwright under a contract that includes a clean assignment clause (Brewmosa owns it); and (4) customer reviews (the customers own them unless the terms of service transfer copyright). If Brewmosa wants to register "the whole site" as a collective work, it can sweep in only streams (1) and (3), because those are the parts it fully owns. The photos and reviews must be excluded—or separately secured by written assignment first. One missing assignment shrinks the registration.
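The same sorting exercise can be written down as a short Python sketch. It is an illustration of the ownership rules described above, under simplifying assumptions: the `ContentStream` type and its field names are invented here, every employee is assumed to be working within the scope of employment, and "written_transfer" stands in for any valid signed assignment (including one effected through terms of service). The nine commissioned-work categories are those listed in the second definition of "work made for hire" in 17 U.S.C. § 101.

```python
from dataclasses import dataclass
from typing import List, Tuple

# The nine categories eligible for commissioned work-made-for-hire treatment.
COMMISSIONED_WFH_CATEGORIES = frozenset({
    "contribution to a collective work",
    "part of a motion picture or other audiovisual work",
    "translation",
    "supplementary work",
    "compilation",
    "instructional text",
    "test",
    "answer material for a test",
    "atlas",
})


@dataclass(frozen=True)
class ContentStream:
    label: str
    source: str                        # "employee", "contractor", or "user"
    category: str = ""                 # only relevant to commissioned work
    signed_wfh_agreement: bool = False
    written_transfer: bool = False     # signed assignment to the claimant


def claimant_owns(s: ContentStream) -> bool:
    if s.source == "employee":
        return True  # assumption: created within the scope of employment
    commissioned_wfh = (s.source == "contractor"
                        and s.category in COMMISSIONED_WFH_CATEGORIES
                        and s.signed_wfh_agreement)
    return commissioned_wfh or s.written_transfer


def split_claim(streams: List[ContentStream]) -> Tuple[List[str], List[str]]:
    """Return (labels the claim may include, labels it must exclude)."""
    included = [s.label for s in streams if claimant_owns(s)]
    excluded = [s.label for s in streams if not claimant_owns(s)]
    return included, excluded


BREWMOSA_STREAMS = [
    ContentStream("marketing copy", "employee"),
    ContentStream("product photos", "contractor", category="photograph"),
    ContentStream("logo and homepage design", "contractor",
                  category="graphic design", written_transfer=True),
    ContentStream("customer reviews", "user"),
]
```

Calling `split_claim(BREWMOSA_STREAMS)` keeps the marketing copy and the Pixelwright design in the claim and pushes the photos and reviews out. Note that flipping `signed_wfh_agreement` to `True` on the photo stream changes nothing, because a standalone photograph is not in the nine categories; only a written transfer moves it across the line.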

Is online content "published" or "unpublished"? (And why you must care)

Almost every website registration runs into one deceptively simple question on the application: Is the work published or unpublished? People assume "it's on the internet, of course it's published." That assumption is frequently wrong, and getting it wrong can complicate—or in some cases invalidate—a registration. So let's slow down.

Under the Copyright Act, "publication" is "the distribution of copies . . . of a work to the public by sale or other transfer of ownership, or by rental, lease, or lending" (17 U.S.C. § 101). It also includes offering to distribute copies to a group of persons for purposes of further distribution, public performance, or public display. The crucial element is distribution of copies that the public is authorized to retain. And here is the counterintuitive part the Office stresses, and the statute itself says: merely displaying or performing a work online does not, by itself, publish it.

Work through the Office's own guidelines:

  • Streaming-only material is generally a public performance, not publication. When a user streams your video and retains no copy, no "copies" have been distributed. So a stream alone usually doesn't publish the work.
  • If downloading is expressly prohibited (e.g., the terms of service say "no downloading, copying, or printing"), the work may be unpublished—because users aren't authorized to retain copies.
  • Content posted by someone who lacks authority to do so is not "published" by that act. A pirate posting your photo where it can be downloaded doesn't publish it on your behalf; you never authorized retention.
  • If downloading is expressly authorized—a "Download Now" button, a "save this PDF" link—that work is published. But that authorization may be limited to the particular work and doesn't necessarily publish the whole site.
  • If the same work also exists in tangible copies (CDs, printed booklets, DVDs), it's published, regardless of how it appears online.
  • Implied-license cases are the murky middle. If there's no clear statement either way, but the site actively helps users download, copy, forward, or reproduce content, an implied license to retain copies may exist—and then the work is published.
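One way to keep those guidelines straight is to write them as an ordered checklist. The Python below is a deliberately simplified sketch of the factors just listed, not the Office's test: the function name, the parameter names, the `Status` labels, and the order in which the factors are checked are all assumptions made for illustration, and real cases turn on facts a few booleans cannot capture.

```python
from enum import Enum


class Status(Enum):
    PUBLISHED = "published"
    LIKELY_PUBLISHED = "likely published (implied license)"
    LIKELY_UNPUBLISHED = "likely unpublished"
    NOT_PUBLISHED_BY_THIS_POSTING = "not published by this posting"


def publication_status(*,
                       posted_with_authority: bool,
                       tangible_copies_distributed: bool,
                       download_policy: str,
                       site_facilitates_copying: bool) -> Status:
    """download_policy is 'authorized', 'prohibited', or 'silent'."""
    if download_policy not in {"authorized", "prohibited", "silent"}:
        raise ValueError(f"unknown download_policy: {download_policy!r}")
    if tangible_copies_distributed:
        return Status.PUBLISHED          # CDs, booklets, DVDs, and so on
    if not posted_with_authority:
        return Status.NOT_PUBLISHED_BY_THIS_POSTING
    if download_policy == "authorized":
        return Status.PUBLISHED          # e.g. a "Download Now" button
    if download_policy == "prohibited":
        return Status.LIKELY_UNPUBLISHED
    if site_facilitates_copying:
        return Status.LIKELY_PUBLISHED   # the murky implied-license middle
    return Status.LIKELY_UNPUBLISHED     # display or streaming only
```

For a display-only blog post under terms that forbid copying, the sketch returns `Status.LIKELY_UNPUBLISHED`; for a whitepaper behind a "Download" button it returns `Status.PUBLISHED`. Run it per work, not per site, since authorization for one file does not necessarily publish everything else.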

Because this is genuinely hard and the consequences are real—publication affects deposit requirements, the availability of certain group options, and the running of the § 412 statutory-damages clock—the Office largely lets the applicant decide whether a given work is published or unpublished, guided by the factors above. Decide carefully and document your reasoning. A common safe practice: if your blog posts are display-only with terms of service that prohibit copying, you may reasonably treat them as unpublished; if you offer a downloadable PDF whitepaper behind a "Download" button, that whitepaper is published the day you make it available.

Why does the published/unpublished line matter so much for strategy, beyond filling in a box? Two reasons. First, several of the most powerful group registration options (which let you register many works in one filing for one fee) are available only for unpublished works, or only for works published within a tight window. Second, publication status changes what you must deposit. So the answer to "published or not?" cascades into your entire filing plan—which brings us to the heart of the matter.

The constant-updating problem and the limits of group registration

Here is the practical nightmare that makes website registration different from registering a novel. A book is finished. A website is never finished. You add a blog post on Monday, swap a photo on Tuesday, rewrite the homepage on Wednesday, and your developer pushes code on Thursday. Copyright law's basic unit is the discrete, fixed work; a living website is a moving target. How do you register something that changes every day?

Two doctrines collide here, and you need to understand both.

Every new version is a new work (the derivative-version rule)

The Office treats each materially changed version of a website as a separate work for registration purposes. A registration for "Version A" of your site covers the new authorship in Version A—the changes, revisions, and additions the author contributed to that version. It does not cover earlier or later versions, and it does not cover preexisting material embedded in the site. Specifically, a registration for a particular version does not reach:

  • previously published material;
  • previously registered material;
  • material in the public domain; or
  • copyrightable material owned by a third party.

There is a narrow exception: a single registration can cover both new material and preexisting material if the preexisting material was never published or registered before and the claimant owns both. Outside that lane, you must exclude the old stuff and claim only the new.
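The exclusion rule and its narrow exception reduce to a small decision function. The sketch below is illustrative only; the `Material` type, its field names, and the sample labels in the usage note are hypothetical, and it assumes you have already sorted each piece of content into new versus preexisting material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Material:
    label: str
    new_in_this_version: bool
    previously_published: bool = False
    previously_registered: bool = False
    public_domain: bool = False
    owned_by_claimant: bool = True


def claimable_in_version(m: Material) -> bool:
    """Can this material be claimed in the registration for this version?"""
    if m.public_domain or not m.owned_by_claimant:
        return False
    if m.new_in_this_version:
        return True
    # Narrow exception: preexisting material the claimant owns may ride along
    # only if it was never published and never registered before.
    return not (m.previously_published or m.previously_registered)
```

So a rewritten homepage (`new_in_this_version=True`) is claimable, last year's already registered brand story is not, and an old draft FAQ that was never published or registered can be swept in alongside the new material, provided the claimant owns it.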

This is the same logic that governs sequels, translations, and software updates: each new version is, in copyright terms, a derivative work whose copyright protects only its own new contribution (17 U.S.C. § 103(b)). If you publish a frequently revised site, the topic deserves a closer look; see Copyright Registration for Derivative Works for how to identify and exclude preexisting material on the application form so your claim is accurate and your registration is defensible. (Inaccuracies that the applicant knew were inaccurate and that would have mattered to the Register can jeopardize a registration under 17 U.S.C. § 411(b), the safe-harbor-and-fraud provision the Supreme Court construed in Unicolors, Inc. v. H&M Hennes & Mauritz, L.P., 595 U.S. 178 (2022), so accuracy here is not a formality.)

Taken alone, the derivative-version rule implies something grim: to fully cover a daily-updated site, you'd file a new application for every meaningful update. That's impractical, which is exactly the problem group registration exists to solve—within limits.

Group registration: the efficiency tools and their ceilings

A group registration lets you register multiple works with a single application and a single fee under § 408(c) of the Copyright Act (37 C.F.R. § 202.4). The Copyright Office offers several group options, each tightly defined by regulation. For websites, the relevant ones are:

  • Group of unpublished works (GRUW). You can register up to ten unpublished works by the same author in one application, provided the same person or organization owns the copyright in all of them. This is the workhorse for a site full of unpublished, display-only content of mixed types—but note the hard cap of ten works per filing and the requirement that everything be unpublished. (GRUW replaced the older "unpublished collection" option.)
  • Group registration of photographs. Two related options exist: a group of published photographs (GRPPH) and a group of unpublished photographs (GRUPH), each allowing up to 750 photographs in a single application (the published version requires that all photos be published within the same calendar year). For an image-heavy site—a photographer's portfolio, a product catalog—these are enormously valuable. (See How to Register a Copyright with the U.S. Copyright Office for the per-photo metadata the Office now requires.)
  • Group registration of short online literary works (GRTX). Designed precisely for the web, GRTX lets an author register between 2 and 50 short online literary works in one application, where each work contains 50 to 17,500 words, all of the works were first published online within the same three-calendar-month period, and every work was written by the same individual author (or the same joint authors), who must also be named as the copyright claimant. Works made for hire are not eligible, so a company's employee-written posts fall outside this option. This is the option built for bloggers, columnists, and online writers who publish many short text pieces and would otherwise drown in separate filings.
  • Group registration of serials and of newspapers/newsletters. A "serial" is a work issued in successive parts at intervals (newspapers, newsletters, journals, magazines, annuals). Frequently updated online publications that genuinely operate like serials may be able to use these group serial options. The fit is technical and the eligibility narrow; if your site is structured like a periodical, this route is worth exploring with counsel, but don't assume an ordinary blog qualifies as a "serial."

Now the limits, stated plainly, because this is where website owners get tripped up.

  1. The caps are real. GRUW tops out at 10 works. GRTX tops out at 50 short literary works and excludes anything outside the 50–17,500-word band (so a 30-word product blurb doesn't qualify, and neither does a 25,000-word treatise). Photo groups cap at 750. You cannot register infinite content in one filing.
  2. Eligibility conditions are strict and unforgiving. Single-author/single-owner requirements, publication-window requirements, work-type restrictions, and specific deposit-format rules apply to each option, and a defect can sink the whole group claim. Each work in a group must independently qualify—mix in one ineligible item and you risk the batch.
  3. No single "register the whole website forever" option exists. There is no group registration that captures a living, multi-author, mixed-media website across all its past and future versions in one stroke. The collective-work route (above) can capture a snapshot you fully own; the group options can capture batches of qualifying components; but nothing captures the perpetual, evolving whole.
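The GRTX ceilings above lend themselves to a pre-filing sanity check. The following is a minimal sketch only: the `TextWork` record and the `grtx_problems` function are hypothetical names invented for illustration, the numeric limits are the ones stated in this guide, and the check deliberately ignores the publication-window, ownership, and deposit-format conditions, which need human review.

```python
from dataclasses import dataclass

# Limits as stated in this guide; confirm against current regulations before filing.
GRTX_MIN_WORKS, GRTX_MAX_WORKS = 2, 50
GRTX_MIN_WORDS, GRTX_MAX_WORDS = 50, 17_500


@dataclass(frozen=True)
class TextWork:
    """Hypothetical record for one short online text piece."""
    title: str
    author: str
    word_count: int


def grtx_problems(works: list[TextWork]) -> list[str]:
    """Return reasons a proposed GRTX batch fails the limits above (empty if none)."""
    problems = []
    if not GRTX_MIN_WORKS <= len(works) <= GRTX_MAX_WORKS:
        problems.append(
            f"batch has {len(works)} works; GRTX needs {GRTX_MIN_WORKS}-{GRTX_MAX_WORKS}"
        )
    if len({w.author for w in works}) > 1:
        problems.append("works have more than one author")
    for w in works:
        if not GRTX_MIN_WORDS <= w.word_count <= GRTX_MAX_WORDS:
            problems.append(f"'{w.title}' has {w.word_count} words, outside the band")
    return problems
```

Because one ineligible item can put the whole batch at risk, a check like this is most useful when it lists every problem at once rather than stopping at the first.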

So how do real businesses cope? With a registration cadence, not a one-and-done filing. A sensible program looks like this:

  • Photographs: batch them and file a group photo registration periodically (quarterly, say), sweeping in everything new—up to 750 at a time.
  • Blog posts and short articles: if they're short text works by a single author, batch them into GRTX filings (up to 50 each) on a regular schedule; if unpublished and display-only, GRUW (up to 10) may fit.
  • The site as a collective work or compilation: if you fully own everything on a given snapshot, register that snapshot to protect the arrangement plus the owned components—then re-register meaningful new snapshots periodically as derivative versions, excluding what's already registered.
  • The code: register the software separately as a computer program, re-registering major versions as derivative works (Copyright Registration of Computer Programs).
  • Crown-jewel content: anything especially valuable (your signature photo, your most-copied article) deserves its own dedicated registration, filed promptly, so that statutory damages and attorney's fees under § 412 are available against any later infringement.
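The batching arithmetic behind this cadence can be sketched in a few lines. This is an illustrative helper under stated assumptions, not an Office tool: `batch` is a hypothetical function, the caps passed to it are the ones given in this guide, and it only splits a list by size without checking any eligibility condition.

```python
def batch(items: list, cap: int, minimum: int = 1) -> tuple[list[list], list]:
    """Split items into filings of at most `cap`; return (filings, leftovers).

    A trailing chunk smaller than `minimum` is returned as leftovers to hold
    for the next cycle rather than filed as an undersized group.
    """
    chunks = [items[i:i + cap] for i in range(0, len(items), cap)]
    leftovers = []
    if chunks and len(chunks[-1]) < minimum:
        leftovers = chunks.pop()
    return chunks, leftovers


# 1,600 new photos at a cap of 750 become three group filings.
photo_filings, _ = batch([f"photo-{n}.jpg" for n in range(1600)], cap=750)

# 51 short posts at a cap of 50 and a minimum of 2: one filing, one post held over.
post_filings, held_over = batch([f"post-{n}" for n in range(51)], cap=50, minimum=2)
```

The `minimum` parameter exists because a group option with a floor (GRTX needs at least two works) cannot absorb a single leftover piece; that piece either waits for the next batch or gets its own application.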

The goal isn't to register every pixel of every version—that's impossible. The goal is to ensure that the content most likely to be stolen, and most valuable to protect, is registered timely—before infringement, or within three months of publication—so the statutory-damages-and-fees remedy is on the table when you need it.
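The timing test in that last sentence can be expressed as a small date check. This is a rough sketch, assuming the three-month period is counted in calendar months from first publication and that the registration date used is its effective date; both function names are hypothetical, and real cases turn on facts that no date comparison captures.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Return d shifted forward by whole months, clamped to the month's last day."""
    index = d.month - 1 + months
    year, month = d.year + index // 12, index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def registered_timely(registration: date, infringement_began: date,
                      first_publication: Optional[date] = None) -> bool:
    """True if registration predates the infringement, or falls within
    three months after first publication (the two routes described above)."""
    if registration < infringement_began:
        return True
    if first_publication is None:
        # Unpublished work: only the "before infringement" route applies.
        return False
    return registration <= add_months(first_publication, 3)
```

For example, a post first published on January 15 and registered on April 1 passes the check even if copying began in February, while the same post registered on June 1 does not.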

Deposit requirements: what you actually send the Copyright Office

A registration application has three pieces: the form, the fee, and the deposit—a copy of the work being registered, which the Office keeps (it is nonreturnable). For websites, the deposit rules have some specific contours worth knowing.

  • Deposit the content as it actually appears online. If you're registering an individual work that lives on the site, the deposit must show the work in the context in which it appears—how a user would perceive it when accessing it online. You generally do not submit the underlying HTML (unless, as discussed, you specifically want to register human-written HTML as a literary work).
  • To register an entire website, deposit all the pages as they appear—regardless of volume. A claim in an entire website does not extend to content that can't be seen in the deposit, so embedded audio or video that doesn't render in a static capture isn't covered by a whole-site deposit; register those separately.
  • PDF is the Office's preferred format for website deposits. A PDF "package" can capture an entire site while preserving its organization and navigation. This is the cleanest way to fix a moving target in a single dated snapshot.
  • No links allowed. The Office will not accept a URL or a link to your live site as a deposit—the live site changes, and the Office needs a fixed copy. Likewise, don't submit loose files stripped of their on-site context; deposit the content as assembled and as it appeared.
  • Match the deposit to the publication date. If you've claimed the work as published, make sure the deposit reflects the content as it existed on the date of publication you listed. A mismatch between your claimed date and your deposited snapshot is a common, avoidable defect.
  • The mandatory-deposit wrinkle. Separate from registration, the Copyright Act imposes a "mandatory deposit" obligation for works published in the U.S. (17 U.S.C. § 407), feeding the Library of Congress's collection. That obligation generally does not apply to works published in the U.S. only online—with a limited exception for certain electronic-only works (such as online-only serials) that the Office has specifically demanded. Most websites won't trigger it, but news-style online serials should be aware of it.
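Several of the deposit rules above are mechanical enough to check before upload. The sketch below is illustrative only: the `Deposit` record, its fields, and `deposit_warnings` are hypothetical names, and the checks cover just the link, format, context, and date-matching points from this list.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Deposit:
    """Hypothetical description of a planned website deposit."""
    file_name: str                              # e.g. "site-snapshot-2024-03-01.pdf"
    snapshot_date: date                         # the day the capture reflects
    claimed_publication: Optional[date] = None  # None if claimed as unpublished
    shows_on_site_context: bool = True


def deposit_warnings(dep: Deposit) -> list[str]:
    """Return warnings for the deposit pitfalls listed above (empty if none)."""
    warnings = []
    name = dep.file_name.lower()
    if name.startswith(("http://", "https://", "www.")):
        warnings.append("deposit is a link; the Office needs a fixed copy")
    elif not name.endswith(".pdf"):
        warnings.append("PDF is the preferred format for website deposits")
    if not dep.shows_on_site_context:
        warnings.append("content is stripped of its on-site context")
    if dep.claimed_publication and dep.snapshot_date != dep.claimed_publication:
        warnings.append("snapshot date does not match claimed publication date")
    return warnings
```

A clean result here says nothing about whether embedded audio or video is visible in the capture; that still has to be confirmed by opening the file.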

Practical tip: a contemporaneous, dated deposit is also evidence. If you ever have to authenticate what your site looked like on a given day—a recurring problem in litigation—your registration deposit is a clean, third-party-held snapshot. (On the broader evidentiary problem, see Capturing the Web—A Practitioner's Guide to Authenticating Website Screenshots as Evidence in Federal Court.)

From registration to enforcement: the DMCA Section 512 safe harbor

Registration tells you what you own and lets you into court. But a huge amount of real-world website copyright life happens outside court, through the notice-and-takedown machinery of the Digital Millennium Copyright Act. Every website owner sits on one side or the other of this system—often both—so it deserves real explanation.

The relevant law is Title II of the DMCA, formally the Online Copyright Infringement Liability Limitation Act (OCILLA), codified at 17 U.S.C. § 512. Congress passed it in 1998 to solve a genuine dilemma: as the internet exploded, online platforms faced potentially ruinous copyright liability for infringing material that their users uploaded, even though the platform never chose to post it. If every host could be sued into oblivion for a single infringing user upload, the open internet couldn't function. But copyright owners needed a fast, cheap way to get infringing copies down. Section 512 strikes the bargain: it shields qualifying providers from monetary liability—for both direct and secondary infringement (Viacom Int'l, Inc. v. YouTube, Inc., 676 F.3d 19 (2d Cir. 2012); Perfect 10, Inc. v. Amazon.com, Inc., 508 F.3d 1146 (9th Cir. 2007))—in exchange for cooperation.

The four safe harbors

Section 512 creates four "safe harbors" that shield qualifying online service providers (OSPs) from monetary liability for their users' infringement, each covering a different function:

  1. § 512(a) — Transitory digital network communications (acting as a mere conduit, routing data—think an ISP passing packets).
  2. § 512(b) — System caching (temporarily storing content to speed delivery).
  3. § 512(c) — Storage at the direction of a user (hosting user-uploaded material—this is the big one for forums, social platforms, and any site with UGC).
  4. § 512(d) — Information location tools (linking and search—directing users to infringing material via links or a search engine).

For a community site like Ferndale Forums, where members upload posts and images, the § 512(c) hosting safe harbor is the lifeline. If it applies, Ferndale isn't liable in damages when a member uploads someone else's copyrighted photo—provided Ferndale meets the conditions.

The threshold conditions

Safe-harbor protection is not automatic; it is earned by compliance. Across the board, an OSP must:

  • Adopt and reasonably implement a repeat-infringer policy that terminates the accounts of repeat infringers in appropriate circumstances, and inform users of it (§ 512(i)). Courts take "reasonably implement" seriously—a policy on paper that's never enforced won't do, as BMG Rights Management (US) LLC v. Cox Communications, Inc., 881 F.3d 293 (4th Cir. 2018), made painfully clear when Cox lost the safe harbor for failing to actually terminate known repeat infringers.
  • Accommodate standard technical measures used by copyright owners to identify and protect works, and not interfere with them.
  • Designate an agent to receive takedown notices, register that agent with the Copyright Office through the online DMCA Designated Agent Directory (37 C.F.R. § 201.38), and post the agent's contact information on the site. (This applies to every safe harbor except the § 512(a) conduit harbor.)

That last requirement is easy to overlook and fatal to miss: if you host user content and you have not registered a current designated agent with the Copyright Office, you may forfeit the § 512(c) safe harbor entirely. The Office requires the provider's full legal name and a physical street address (no P.O. box), and the registration must be renewed every three years. It is a simple, cheap filing; do it, and keep it current.

The § 512(c) hosting conditions specifically

For the storage and information-location harbors, the provider must additionally:

  • Lack actual knowledge that the material is infringing, and not be aware of facts or circumstances from which infringement is apparent—the so-called "red-flag" knowledge standard. The Second Circuit in Viacom v. YouTube drew a careful line: red-flag knowledge requires awareness of specific infringing material, not merely a general awareness that infringement is rampant on the platform. The court also confirmed that the common-law doctrine of willful blindness—deliberately avoiding confirming an obvious fact—can defeat the safe harbor.
  • Act expeditiously to remove or disable access to infringing material once it gains knowledge or receives a proper takedown notice.
  • Not receive a financial benefit directly attributable to the infringing activity in a case where it has the right and ability to control that activity. (Courts read "right and ability to control" to require something more than the mere capacity to remove material—typically substantial influence over what users post.)
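The threshold and hosting conditions can be collected into a self-assessment checklist. This is a simplified sketch, not a legal test: `HostPractices` and `safe_harbor_gaps` are hypothetical names, each boolean stands in for a fact-intensive question a court would examine closely, and the knowledge standard is folded into a single "removes expeditiously" answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostPractices:
    """Hypothetical self-assessment answers for a site that hosts user uploads."""
    repeat_infringer_policy_enforced: bool
    accommodates_standard_technical_measures: bool
    agent_registered_and_posted: bool
    removes_expeditiously_on_knowledge_or_notice: bool
    direct_financial_benefit: bool
    right_and_ability_to_control: bool


def safe_harbor_gaps(p: HostPractices) -> list[str]:
    """List the conditions described above that the answers fail to satisfy."""
    gaps = []
    if not p.repeat_infringer_policy_enforced:
        gaps.append("repeat-infringer policy is missing or not actually enforced")
    if not p.accommodates_standard_technical_measures:
        gaps.append("standard technical measures are not accommodated")
    if not p.agent_registered_and_posted:
        gaps.append("designated agent is not registered and posted")
    if not p.removes_expeditiously_on_knowledge_or_notice:
        gaps.append("infringing material is not removed expeditiously")
    # The financial-benefit condition bites only when both prongs are present.
    if p.direct_financial_benefit and p.right_and_ability_to_control:
        gaps.append("direct financial benefit combined with right and ability to control")
    return gaps
```

An empty list means only that no gap was self-reported; it is a prompt for review with counsel, not a finding that the safe harbor applies.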

A useful guardrail case for hosts is Hendrickson v. eBay, Inc., 165 F. Supp. 2d 1082 (C.D. Cal. 2001): a takedown notice that fails to substantially comply with the statutory elements—for example, by not adequately identifying the infringing items and their location—does not trigger the host's removal obligation, and the host does not lose the safe harbor for declining to act on a defective notice. In other words, the burden is on the copyright owner to send a proper notice.

Notice, takedown, and counter-notice

The operational heart of § 512 is its notice-and-takedown framework:

  1. A copyright owner who finds its work posted on a platform sends a takedown notice (§ 512(c)(3)) containing the statutorily required elements: identification of the copyrighted work, identification of the infringing material and its location (enough that the host can find it), the complainant's contact information, a statement of good-faith belief that the use is unauthorized, a statement under penalty of perjury that the information is accurate and the complainant is authorized to act, and a physical or electronic signature.
  2. The OSP, to keep its safe harbor, expeditiously removes or disables access to the identified material and notifies the user who posted it.
  3. The user, if she believes the takedown was mistaken or the use was lawful (say, fair use), may send a counter-notification (§ 512(g)) with its own required elements, including a statement under penalty of perjury and consent to federal jurisdiction.
  4. If a valid counter-notice arrives, the OSP must notify the original complainant and restore the material no sooner than 10 and no later than 14 business days after receiving the counter-notice, unless the complainant first tells the OSP that it has filed suit over the material. This puts the burden back on the copyright owner to go to court if it really means it.
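The 10-to-14-business-day window in step 4 can be computed with a simple calendar helper. This is a minimal sketch under stated assumptions: business days are treated as Monday through Friday, public holidays are ignored, counting starts the day after the counter-notice is received, and both function names are hypothetical.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Advance `days` weekdays past `start` (Mon-Fri only; holidays ignored)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current


def restoration_window(counter_notice_received: date) -> tuple[date, date]:
    """Return the (earliest, latest) restoration dates for a counter-notice."""
    return (add_business_days(counter_notice_received, 10),
            add_business_days(counter_notice_received, 14))


# A counter-notice received on Monday, March 4, 2024:
earliest, latest = restoration_window(date(2024, 3, 4))
```

A platform that tracks these two dates per counter-notice knows both when it may restore the material and the deadline by which the complainant must act to keep it down.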

There are guardrails against abuse. Section 512(f) imposes liability on anyone who knowingly materially misrepresents that material is infringing (or that it was wrongly removed). And the Ninth Circuit made clear in Lenz v. Universal Music Corp., 815 F.3d 1145 (9th Cir. 2016), cert. denied, 137 S. Ct. 2263 (2017), the famous "dancing baby" case, that a copyright owner must consider whether the challenged use is a fair use before sending a takedown notice; firing off automated takedowns without considering fair use can itself raise a triable § 512(f) misrepresentation claim. (Not every court has embraced Lenz's reasoning: the District of Massachusetts declined to follow an earlier Lenz district ruling in Tuteur v. Crosley-Corcoran, 961 F. Supp. 2d 333 (D. Mass. 2013). But Lenz is the leading appellate authority.) The DMCA is powerful, but it is not a license to censor lawful uses.

If you're on either side of this process (wanting to get your stolen photo off a competitor's site, or running a platform that just received a notice), our step-by-step guide How to File a DMCA Takedown Notice and Respond to One walks through the exact elements and the strategy. The key registration connection: while you do not need a copyright registration to send a DMCA takedown notice (the notice process is separate from litigation), you do need a registration to file the lawsuit that a counter-notice can force you into, and a registration or at least a pending application to bring a claim before the Copyright Claims Board. So the two systems are complementary: takedown handles the easy cases fast; registration arms you for the hard ones.

Worked example (hypothetical). A Ferndale Forums member uploads a professional landscape photo without permission. The photographer emails Ferndale a takedown notice, but it just says "you have my pictures, take them down" with no URLs and no perjury statement. Under Hendrickson, that defective notice doesn't trigger Ferndale's removal duty or cost it the safe harbor. The photographer sends a corrected notice identifying the exact thread and image; Ferndale removes it within hours and notifies the member; the member files no counter-notice. Because Ferndale had a registered designated agent, a real repeat-infringer policy it actually enforces, and acted expeditiously on a proper notice, it keeps its § 512(c) protection—and the photographer, who had registered the photo within three months of first publication, retains the option to pursue the uploader for statutory damages.
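The difference between the photographer's first and second notices can be sketched as a completeness check against the § 512(c)(3) elements listed earlier. This is an illustration only: the dictionary keys and the `missing_elements` function are hypothetical, the example values are invented, and the check tests mere presence, whereas the statute asks whether a notice substantially complies.

```python
# Keys are hypothetical field names; labels paraphrase the elements listed earlier.
REQUIRED_ELEMENTS = {
    "copyrighted_work": "identification of the copyrighted work",
    "infringing_material_location": "identification of the infringing material and its location",
    "contact_information": "complainant's contact information",
    "good_faith_statement": "statement of good-faith belief that the use is unauthorized",
    "accuracy_and_authority_statement": "statement of accuracy and authority under penalty of perjury",
    "signature": "physical or electronic signature",
}


def missing_elements(notice: dict[str, str]) -> list[str]:
    """Return the labels of required elements that are absent or blank."""
    return [label for key, label in REQUIRED_ELEMENTS.items()
            if not notice.get(key, "").strip()]


# The vague first email: contact details and a loose description, nothing more.
first_notice = {
    "copyrighted_work": "my pictures",
    "contact_information": "photographer@example.com",
}

# The corrected notice fills in every element (values are invented placeholders).
corrected_notice = {
    **first_notice,
    "copyrighted_work": "landscape photograph, described and attached",
    "infringing_material_location": "https://forums.example.com/thread/123#image-1",
    "good_faith_statement": "I believe in good faith the use is not authorized.",
    "accuracy_and_authority_statement": "Under penalty of perjury, this notice is accurate.",
    "signature": "/s/ Photographer",
}
```

Running `missing_elements(first_notice)` flags the absent location, statements, and signature, which is the gap Hendrickson treats as the copyright owner's problem to cure.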

Terms of service: the license that governs everything users touch

We've mentioned terms of service (TOS) several times because they quietly do an enormous amount of legal work on a content-driven website. A well-drafted TOS is where three threads we've been pulling come together.

It transfers (or licenses) user-generated content to you. Recall that UGC belongs to its creator by default. If you want any rights to use the reviews, photos, and posts your users contribute—and to register them—you need them to grant you rights. The standard mechanism is a clause in the TOS by which the user grants the platform a broad license (or, more rarely, assigns ownership) in what they upload. Most platforms take a license—typically a worldwide, royalty-free, sublicensable, perpetual license to host, display, reproduce, and adapt the content—rather than full ownership, because users won't tolerate signing away their copyrights and because a license is enough to run the service. But—and this matters for registration—a mere license is not ownership, so a licensed-but-not-assigned piece of UGC generally can't be registered by the platform as its own work. To register UGC as the claimant, you need an actual transfer of the copyright (a § 204(a) writing), which the Office will recognize if your terms effect that transfer and you identify the authors. Draft deliberately: decide whether you need a license (usually enough) or an assignment (only if you truly need to own and register the content).

It sets the publication and download rules that determine "published" status. As we saw, whether your content is "published" can turn on what your TOS says about downloading and copying. A TOS that prohibits copying supports an unpublished characterization (and unlocks unpublished group options); a TOS or interface that invites downloading points toward published. So your TOS isn't just legal boilerplate—it's an input to your registration strategy.

It governs scraping and reuse by third parties. A TOS typically prohibits automated harvesting of the site's content. Whether that prohibition is enforceable, and against whom, is one of the most contested questions in internet law right now—which is our last big topic.

For the contract-drafting mechanics—how to make a TOS enforceable (clickwrap versus browsewrap), what the UGC license clause should say, how it interacts with the DMCA policy—see Software Licensing Agreements and the platform-side discussion in Social Media Law Basics. On who bears the burden when users infringe through your platform, see also Section 230 Reform and Platform Liability for User-Generated IP Infringement—bearing in mind that Section 230 famously does not immunize intellectual-property claims, which is exactly why the DMCA's separate § 512 regime matters so much for copyright.

Scraping, copying, and the limits of website copyright

Suppose a competitor (or an AI company, or a data broker) doesn't manually copy your site but instead deploys a bot to scrape it—automatically downloading your text, images, prices, and listings at scale. Is that copyright infringement? The honest answer is: it depends, and the law is genuinely unsettled.

Start with what copyright does and doesn't reach. The expressive content—your original articles, your photographs—is protected, and copying it (even by bot) can infringe. But two big limits bite hard in the scraping context:

  • Facts and data aren't protected. Per Feist, the facts on your site—prices, addresses, specifications, the data points—are free. A scraper that harvests only factual data may take nothing copyrightable at all. Your original selection and arrangement of that data may carry a thin compilation copyright, but a scraper that extracts the underlying facts and rearranges them often slips past it.
  • Copyright is only one of several possible claims, and not always the strongest. Scraping disputes typically braid together copyright, breach of the website's terms of service (a contract claim), and the federal Computer Fraud and Abuse Act (CFAA, 18 U.S.C. § 1030)—an anti-hacking statute aimed at "unauthorized access."

The landmark modern case is hiQ Labs, Inc. v. LinkedIn Corp., which wound through the Ninth Circuit and reshaped the landscape (see 938 F.3d 985 (9th Cir. 2019), and 31 F.4th 1180 (9th Cir. 2022) after remand from the Supreme Court in light of Van Buren v. United States, 593 U.S. 374 (2021)). The headline holding: scraping publicly available data—data not behind a login or password—likely does not violate the CFAA, because accessing a public website isn't "access without authorization" in the hacking sense. That blunted the CFAA as an all-purpose anti-scraping weapon for public pages. But hiQ did not bless scraping across the board—on remand the parties ultimately settled after the court signaled that LinkedIn's breach-of-contract theory had teeth, because hiQ had agreed to LinkedIn's user agreement prohibiting scraping. The upshot: against public data, your strongest tools may be contract (your TOS) and, where original expression is actually copied, copyright—not the CFAA. The full analysis, including how copyright, contract, and computer-fraud claims interact, is in our dedicated piece Data Scraping After hiQ v. LinkedIn—Copyright, Contract, and Computer Fraud Claims.

For website owners, the practical takeaways are: (1) register your genuinely original content so a copyright claim is available against wholesale copying; (2) draft a TOS that clearly prohibits scraping and make it enforceable (clickwrap where possible), since contract is often your best lever against public-page scraping; (3) recognize that you generally can't copyright-protect bare data; and (4) use technical measures (rate limiting, CAPTCHAs, login walls) as part of a layered defense, since putting data behind authentication changes the legal analysis. This is also the live frontier of generative-AI training disputes, where the same copyright/contract/CFAA questions are being litigated at enormous scale—see Copyright Infringement Claims Against Generative AI—The New York Times, Getty, and What Comes Next.
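Of the technical measures in point (4), rate limiting is the simplest to sketch. The class below is a minimal in-memory sliding-window limiter, assuming a single process and a caller-supplied client identifier such as an IP address; `SlidingWindowLimiter` is a hypothetical name, and a production deployment would more likely rely on the web server, a CDN, or a shared store.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within any `window_seconds` span."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        """Record a request and report whether it is within the limit."""
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Example: at most 60 page requests per client per minute.
limiter = SlidingWindowLimiter(max_requests=60, window_seconds=60.0)
```

A limiter like this slows bulk harvesting but does not change who may lawfully access public pages; it is one layer alongside the terms of service and registrations, not a substitute for them.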

A practical registration playbook for a typical website

Let's pull the strands together into something you can actually act on. Here is a sensible program for a content-driven business site like Brewmosa's.

  1. Audit ownership first. Map every category of content—copy, photos, illustrations, logo, video, code, UGC—and confirm who owns each. Get written assignments from every contractor and freelancer (designer, photographer, developer). Fix the ownership gaps before you register, because you can only register what you own.
  2. Get the TOS right. Make sure your terms (a) license or assign UGC to you as needed, (b) state your download/copy rules (which affects published/unpublished status), and (c) prohibit scraping, in an enforceable clickwrap form.
  3. Register a DMCA agent and post a DMCA policy. If you host any user content, file your designated agent with the Copyright Office, keep the registration current, and post the agent's information on the site. Pair it with a written repeat-infringer policy you actually enforce. This protects your § 512(c) safe harbor.
  4. Register crown-jewel content immediately and individually. Your most valuable, most copyable assets get their own registrations so that § 412 statutory damages and fees are preserved from the outset and a CCB or federal claim is always available.
  5. Set a batch cadence for the rest. Quarterly group photo registrations (up to 750 each); GRTX filings for short articles (up to 50 each, 50–17,500 words); GRUW for unpublished display-only batches (up to 10 each).
  6. Register the software separately as a computer program, re-registering major versions as derivative works.
  7. Consider a collective-work snapshot. If you fully own everything on the live site, register a dated snapshot as a collective work (preferably as a PDF package) to protect the arrangement plus owned components—then re-register meaningful new snapshots as derivative versions, excluding previously registered material.
  8. Keep records. Track what you've registered, when, and which version, so future derivative-version filings can accurately exclude prior material—and so your claims stay accurate under § 411(b) and Unicolors.
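The record-keeping in step 8 can be as simple as a ledger keyed by content fingerprint. The following is a hypothetical sketch: `RegistrationLedger`, `fingerprint`, and the `registration_ref` labels are invented for illustration (the labels are internal notes, not real registration numbers), and hash matching only catches byte-identical content, so edited pages still need human judgment about what is new.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import date


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest identifying byte-identical content."""
    return hashlib.sha256(content).hexdigest()


@dataclass
class RegistrationLedger:
    """Hypothetical in-house log of what has been registered and when."""
    entries: dict[str, tuple[str, date]] = field(default_factory=dict)

    def record(self, content: bytes, registration_ref: str, filed: date) -> None:
        """Log content as covered by a filing; the earliest filing wins."""
        self.entries.setdefault(fingerprint(content), (registration_ref, filed))

    def split_snapshot(self, snapshot: dict[str, bytes]) -> tuple[list[str], list[str]]:
        """Return (new_items, previously_registered) by item name."""
        new, prior = [], []
        for name, content in snapshot.items():
            (prior if fingerprint(content) in self.entries else new).append(name)
        return new, prior
```

Before a derivative-version filing, passing the current snapshot to `split_snapshot` yields a first-draft exclusion list (the previously registered items) and a first-draft claim (the new ones).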

Done consistently, this turns an unregisterable "website" into a well-protected portfolio of registered works—exactly what you'll want in hand the day a competitor decides your content is theirs for the taking.

Key takeaways

  • There is no "website" copyright. A website is a bundle of separate works—text, images, video, sound, code, and arrangement—each registered through its own door according to what it predominantly is.
  • You can register the site as a compilation or collective work, but that protects only the thin selection-and-arrangement layer, and it reaches the underlying works only if you fully own all of them.
  • Ownership is the gatekeeper. Employees create work-for-hire you own; contractors own their work unless they sign an assignment; users own their UGC unless your terms transfer it. Fix ownership before registering.
  • "Online" doesn't automatically mean "published." Publication turns on whether you authorized the public to retain copies—and the answer drives your deposit rules and which group options you can use.
  • Group registration has hard ceilings: up to 10 unpublished works (GRUW); 2–50 short online literary works of 50–17,500 words each (GRTX); up to 750 photographs per group. No option captures a perpetually evolving multi-author site, so register on a cadence.
  • Each new version is a derivative work that protects only its new material; exclude previously published, registered, public-domain, and third-party content from your claim, and keep the claim accurate (Unicolors).
  • Register before infringement (or within three months of publication) to unlock statutory damages and attorney's fees under § 412—and remember Fourth Estate: you generally can't sue until the Office acts. Registration also opens the Copyright Claims Board small-claims route.
  • DMCA § 512 governs the day-to-day: host UGC responsibly (registered agent, enforced repeat-infringer policy, expeditious takedown) to keep your safe harbor, and use notice-and-takedown as your fast remedy.
  • Copyright is only one tool against scraping; facts aren't protected, so your terms of service (contract) and technical measures often matter as much as your registrations.

Frequently asked questions

Can I register my entire website with a single copyright application? Not as "a website"—there's no such registration class. You can, however, register a snapshot of the site as a compilation or collective work in one application if you fully own every piece of content on it. You can also batch many qualifying components into single group filings (up to 750 photographs, up to 50 short online text works, up to 10 unpublished works). But because sites constantly change and often contain third-party material, most businesses register on an ongoing cadence rather than in one filing. See Copyright Registration—A Comprehensive Guide.

My web designer built my site. Don't I own it because I paid for it? Probably not—at least not the copyright. An independent contractor owns the copyright in what they create unless they sign a written assignment (or a valid work-made-for-hire agreement, which is available only for limited categories of commissioned work). Paying the invoice typically buys you a license to use the site, not ownership of its copyright. Always get a signed assignment of all rights from designers, photographers, and developers before the work begins.

Do I need to register before I can send a DMCA takedown notice? No. The DMCA notice-and-takedown process under 17 U.S.C. § 512 doesn't require a registration; you can send a proper takedown notice for any work you own. But you do need a registration to file a copyright lawsuit (under Fourth Estate, a pending application is not enough), and a registration or pending application to bring a claim before the Copyright Claims Board, which matters if the alleged infringer files a counter-notice and forces the issue. So takedowns handle the easy cases; registration arms you for litigation. See How to File a DMCA Takedown Notice and Respond to One.

Who owns the reviews, photos, and comments my users post? By default, the users do. To use that user-generated content—or to register it as your own—you need them to grant you rights, almost always through your terms of service. Most platforms take a broad license (enough to run the service); a license is not ownership, so to register UGC as the claimant you generally need an actual transfer of copyright, plus identification of the authors. Draft your terms with this distinction in mind. See Social Media Law Basics.

Is my content "published" just because it's on the internet? No. Publication means distributing copies the public is authorized to retain; merely displaying or streaming a work online usually isn't publication. If your terms prohibit downloading, the content may be unpublished; if you offer a "Download" button, it is likely published. This matters because it drives your deposit rules and which group registration options you can use.

Can someone legally scrape my website? It depends. Scraping publicly available data likely doesn't violate the Computer Fraud and Abuse Act after hiQ v. LinkedIn and Van Buren, and bare facts aren't copyrightable under Feist. But scraping can still breach an enforceable terms-of-service contract, and copying your original expression (articles, photos) can infringe copyright. Your best defenses are a strong, enforceable TOS, registration of your original content, and technical measures like login walls. See Data Scraping After hiQ v. LinkedIn.

How often should I register if my site changes constantly? Treat registration as a recurring program, not a one-time event. Register crown-jewel content individually and immediately; batch photographs and short articles into group filings on a regular schedule (quarterly works well); register software separately and re-register major versions; and consider periodic collective-work snapshots of content you fully own. The aim is timely registration of valuable content—before infringement or within three months of publication—so statutory damages and fees stay available.

What deposit do I send for a website? Send the content as it actually appears to users—PDF is the Office's preferred format, and a PDF "package" can capture a whole site while preserving navigation. Don't submit a link to your live site (the Office won't accept it), and don't strip content out of its on-site context. If you claim the work as published, make sure the deposit matches your stated publication date.

What is the Copyright Claims Board, and how does it fit in? The CCB is a copyright "small-claims tribunal" inside the Copyright Office, created by the CASE Act, that hears claims up to $30,000 without full federal litigation. It is well suited to a small online business with a clear infringement but limited litigation budget. You must have a registration or a pending application to file, and the Office must actually register the work before the Board issues a final determination. Participation is voluntary—a respondent can opt out—but for many website owners it is a realistic enforcement path that registration unlocks.

This article provides general information and is not legal advice. Copyright law is fact-specific and evolving; for guidance on your particular website and content, consult qualified counsel.