A trademark survey's value lives entirely in its methodology: the universe must be right, the format must fit, the questions must be neutral, the administration must be blind, the conditions realistic, and the confusion netted against a control before any percentage means anything. This checklist tracks the design choices in the order they matter—which is also the order a cross-examiner attacks them. For the doctrine, see Consumer Survey Expert Methodology in Trademark Cases; for admissibility, Daubert Challenges to Consumer Survey Experts.

Phase 1 — Decide whether to commission a survey at all

  • Confirm the issue genuinely turns on consumer perception.
  • Confirm the budget and schedule permit a rigorous design (often a five- or six-figure, multi-month undertaking).
  • Confirm a competent expert believes a defensible survey can actually be built for these marks and this market.
  • Retain an experienced, independent survey expert early.

WHY / traps. A mediocre survey is often worse than none: it hands the opponent a Daubert target and may produce a number that undercuts the party that paid for it. A survey is not mandatory; likelihood of confusion can be proven through the other factors, including evidence of actual confusion in the marketplace. If a credible expert cannot in good conscience design a survey that finds what you hope, that is itself useful information about the case.

Phase 2 — Define the universe (the load-bearing decision)

  • Identify the theory of confusion: forward, reverse, sponsorship, etc.
  • Set the universe for forward confusion = prospective purchasers of the junior user's (defendant's) goods/services.
  • Flip the universe for reverse confusion = prospective purchasers of the senior user's goods/services.
  • For fame/dilution, use the broad general consuming public; for secondary meaning, the relevant purchasers.
  • Draft screening (filter) questions that actually capture that universe—right product category, timeframe, and price range.

WHY / traps. The wrong universe is usually fatal and cannot be cured by cross-examination or argument: you measured the wrong people. Two named errors: over-inclusive (sweeps in non-purchasers, diluting and distorting the result) and under-inclusive (excludes relevant purchasers, so the result cannot be generalized). The format does not change the universe; a Squirt survey on the wrong universe is exactly as defective as an Eveready one. Screeners that are too loose or too tight are a back door through which a universe problem walks right back in.
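
As a rough illustration of how screening questions map onto the universe definition, the sketch below qualifies a respondent only if all three screens named above (product category, timeframe, price range) are satisfied. Every field name, the six-month window, and the price band are hypothetical placeholders chosen for the example, not values drawn from any actual survey.

```python
from dataclasses import dataclass


@dataclass
class ScreenerAnswers:
    # Hypothetical screener items, for illustration only.
    likely_to_buy_category: bool  # respondent expects to buy in the product category
    months_until_purchase: int    # self-reported purchase horizon, in months
    expected_price: float         # what the respondent expects to pay


def qualifies(answers: ScreenerAnswers,
              max_months: int = 6,       # assumed timeframe
              min_price: float = 20.0,   # assumed lower price bound
              max_price: float = 60.0) -> bool:  # assumed upper price bound
    """Return True only if the respondent falls inside the defined universe."""
    return (
        answers.likely_to_buy_category
        and 0 <= answers.months_until_purchase <= max_months
        and min_price <= answers.expected_price <= max_price
    )
```

Loosening any one of the three conditions makes the sample over-inclusive; tightening one beyond the real market makes it under-inclusive.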

Phase 3 — Choose the format

  • Choose Eveready when the senior mark is strong/famous: show only the junior mark/product and ask open-ended questions (who puts this out, what makes you say so, what else do they make, do they need permission).
  • Choose Squirt when the senior mark is weaker or consumers genuinely encounter both marks together: show both marks and ask a neutral same-source/affiliation question.
  • For a Squirt design, build a lineup/array that salts the test marks among unrelated marks to avoid a naked two-way face-off.
  • Match the format to the facts (the senior mark's strength and how consumers actually encounter the marks); do not default to habit.

WHY / traps. Eveready relies on memory and minimizes suggestion, which makes it the gold standard for strong marks; run on an obscure mark, it under-counts confusion because respondents have never heard of the plaintiff. Squirt risks suggestion: juxtaposing two marks nudges respondents to find a relationship no shopper would experience. If the expert for a plainly famous mark chose Squirt, ask why; the answer is sometimes that Eveready showed too little confusion.

Phase 4 — Build the control and plan the net

  • Add a control cell identical to the test cell in every respect except the allegedly infringing feature (a plainly non-infringing mark or the feature altered/removed).
  • Plan to report net confusion = test cell − control cell, not the raw figure.
  • Make the control especially robust for Squirt designs, where juxtaposition inflates raw confusion.

WHY / traps. Without a control a reported confusion percentage is essentially uninterpretable—every survey generates background "noise" (guessing, pre-existing beliefs, instrument artifacts). The first question a sophisticated reader asks is not "what was the rate?" but "what was the control, and what is the net?" A Squirt survey with no control is one of the most exploitable artifacts in trademark litigation.
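
The netting arithmetic can be sketched as below. The function name and the respondent counts are hypothetical, chosen only to make the subtraction concrete.

```python
def net_confusion(test_confused: int, test_n: int,
                  control_confused: int, control_n: int) -> float:
    """Net confusion in percentage points: test-cell rate minus control-cell rate."""
    if test_n <= 0 or control_n <= 0:
        raise ValueError("cell sizes must be positive")
    test_rate = 100.0 * test_confused / test_n
    control_rate = 100.0 * control_confused / control_n
    return test_rate - control_rate


# Hypothetical counts: 56 of 200 test respondents and 18 of 200 control
# respondents were coded as confused.
print(round(net_confusion(56, 200, 18, 200), 1))  # 28.0 - 9.0 = 19.0
```

In this made-up example the raw 28% would overstate the case; the 9% control rate is noise that has to come out before the figure is reported.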

Phase 5 — Draft questions, stimuli, and administration

  • Use open-ended, non-leading questions; avoid demand effects that telegraph the desired answer.
  • Offer a genuine "don't know / no opinion" option so uncertainty is not coded as confusion.
  • Rotate stimulus and answer-choice order to neutralize order effects.
  • Probe spontaneous answers neutrally ("What makes you say that?").
  • Replicate marketplace realism: show full trade dress, packaging, color, and point-of-sale context—not bare word marks on a white screen.
  • Administer double-blind: neither interviewer nor respondent knows the sponsor or purpose.

WHY / traps. Leading questions and demand effects generate confusion that exists only in the instrument. Single-blind is not enough—an interviewer who knows the "right" answer can signal it unconsciously. An artificial side-by-side no shopper would ever see measures something other than real-world confusion; this has special bite in trade dress cases.
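
One way to mechanize the rotation step is sketched below. It assumes respondents are randomly assigned to the test or control cell and that the "don't know" option stays pinned in last position while the substantive choices are shuffled. Both are assumptions made for this example, and the function names are hypothetical.

```python
import random
from typing import List


def assign_cell(rng: random.Random) -> str:
    """Randomly assign a respondent to the test or control cell."""
    return rng.choice(["test", "control"])


def rotated_choices(choices: List[str], rng: random.Random,
                    pinned_last: str = "Don't know / no opinion") -> List[str]:
    """Shuffle the substantive answer choices; keep the don't-know option last."""
    substantive = [c for c in choices if c != pinned_last]
    rng.shuffle(substantive)
    tail = [pinned_last] if pinned_last in choices else []
    return substantive + tail


# Each respondent gets an independently rotated order.
rng = random.Random()
order = rotated_choices(
    ["Same company", "Different companies", "Don't know / no opinion"], rng
)
```

Logging the order actually shown to each respondent lets the report demonstrate that rotation occurred.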

Phase 6 — Sample, code, and report

  • Choose the sampling method (probability ideal; non-probability online panel or mall intercept acceptable if properly executed and justified).
  • Size each cell (test and control) for a usable margin of error; report cell sizes, margin of error, and confidence level (commonly 95%).
  • Code open-ended responses objectively—blind coders, a written protocol, an inter-coder reliability check—and preserve and produce the verbatims.
  • Document the design rationale in writing and build the survey to survive a Daubert/Rule 702 challenge from day one.
  • Interpret the net figure against rough benchmarks (≈15% or more tends to support confusion; below ≈10% tends to negate it; 10–15% is a gray zone), and treat the survey as one factor among the Polaroid/Sleekcraft set.
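
To make the cell-sizing point concrete, the sketch below computes a margin of error at the 95% confidence level using the textbook normal approximation for a proportion, assuming simple random sampling. Non-probability samples do not strictly satisfy that assumption, so a qualified expert may report precision differently. The function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal critical value for a 95% confidence level


def margin_of_error(rate: float, n: int, z: float = Z_95) -> float:
    """Half-width of the normal-approximation interval for one cell's proportion."""
    return z * math.sqrt(rate * (1.0 - rate) / n)


def net_margin_of_error(p_test: float, n_test: int,
                        p_control: float, n_control: int,
                        z: float = Z_95) -> float:
    """Half-width for the difference between two independent cell proportions."""
    variance = (p_test * (1.0 - p_test) / n_test
                + p_control * (1.0 - p_control) / n_control)
    return z * math.sqrt(variance)
```

With hypothetical cells of 200 respondents each, a 28% test rate carries a margin of roughly 6 percentage points, and the net against a 9% control carries roughly 7; cutting the cells to 50 respondents each about doubles both.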

WHY / traps. Too small a sample yields a margin of error so wide the number is meaningless. Coding is where mischief hides: "I'm not really sure" can be coded as confused or uncertain depending on who holds the pen. A report that will not produce its verbatims is signaling something. Under amended Rule 702 (2023), the proponent must show that the reliability requirements are more likely than not satisfied, and the opinion must reflect a reliable application of the method; fundamental defects (wrong universe, no control, leading questions, unrealistic stimulus) increasingly draw exclusion rather than mere "weight."
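
For the inter-coder reliability check, one commonly used statistic is Cohen's kappa, which compares two coders' observed agreement with the agreement expected by chance. The checklist does not prescribe a particular statistic, so treat this as one illustrative option; the labels in the example are hypothetical.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Cohen's kappa for two coders labeling the same set of responses."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("both coders must label the same non-empty set of responses")
    n = len(coder_a)
    # Share of responses on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for four verbatims.
a = ["confused", "confused", "not_confused", "not_confused"]
b = ["confused", "not_confused", "not_confused", "not_confused"]
print(cohens_kappa(a, b))  # 0.5
```

Here the coders agree on three of four responses (75%), but half of that agreement would be expected by chance, so kappa is 0.5.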

Surveys for other issues

  • Secondary meaning: test whether a descriptive term identifies a single source; control for respondents who would associate any term with one source.
  • Fame (dilution): sample the general consuming public, not a niche buyer pool.
  • Genericness — Teflon: tutor respondents on brand vs. common names, then have them classify a salted list (courts generally prefer this for cleaner data).
  • Genericness — Thermos: ask what respondents would call the product to capture spontaneous generic use.

WHY / traps. Survey methodology is a single discipline applied to different questions—each format must still satisfy the same universe, neutrality, execution, and coding standards.

Common mistakes

  • The wrong universe (the single most common fatal flaw).
  • No control, so the raw figure overstates confusion.
  • Wrong format for the mark's strength.
  • Leading questions and demand effects.
  • Artificial side-by-side stimuli divorced from the marketplace.
  • Single-blind (or unblinded) administration.
  • Tiny samples and unreported margins of error.
  • Subjective coding and withheld verbatims.

Primary authority

  • Rules: Fed. R. Evid. 702 (as amended 2023) and 703; Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579 (1993); Kumho Tire Co. v. Carmichael, 526 U.S. 137 (1999); Gen. Elec. Co. v. Joiner, 522 U.S. 136 (1997).
  • Formats: Union Carbide Corp. v. Ever-Ready, Inc., 531 F.2d 366 (7th Cir. 1976) (Eveready); SquirtCo v. Seven-Up Co., 628 F.2d 1086 (8th Cir. 1980) (Squirt); E.I. DuPont de Nemours & Co. v. Yoshida Int'l, Inc., 393 F. Supp. 502 (E.D.N.Y. 1975) (Teflon); King-Seeley Thermos Co. v. Aladdin Indus., 321 F.2d 577 (2d Cir. 1963) (Thermos).
  • Universe: Bristol-Myers Squibb Co. v. McNeil-P.P.C., Inc., 973 F.2d 1033 (2d Cir. 1992).
  • References: Federal Judicial Center, Reference Manual on Scientific Evidence (Reference Guide on Survey Research, Shari Seidman Diamond); McCarthy on Trademarks §§ 32:158 et seq.; INTA survey guidelines.

Survey design and admissibility are fact-specific; consult a qualified survey expert and trademark counsel.

This checklist is general information, not legal advice. Consult qualified trademark litigation counsel and a qualified survey expert about any particular matter.