Carbon Copy Testing

December 20, 2021

Another jab against test coverage cargo culting, I suppose.

Teams that cargo cult TDD or code coverage metrics tend to end up writing tests that provide little to no regression protection or refactorability. There’s a case to be made for writing more tests; not writing enough of them is a problem that many teams have. And then there’s the problem where teams that strive to do this try to quantify it (after all, how can you improve something you cannot measure?) and fall into the trap of doing things like enforcing a minimum code coverage threshold in the CI pipeline¹.
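In Ruby terms, that gate usually looks something like a SimpleCov threshold wired into the test suite (a hypothetical sketch; the 90% figure is arbitrary and not from the original post):

# spec/spec_helper.rb
require 'simplecov'

SimpleCov.start do
  # Fail the suite (and therefore the CI job) if total coverage drops below 90%.
  minimum_coverage 90
end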

Let’s say we have a User model defined like this:

class User
  attr_reader :name, :age

  def initialize(name, age)
    @name = name
    @age = age
  end
end

How would we test a piece of code like this? Maybe let’s try initializing it and see if we can read back the data we provided:

require 'user'

describe User do
  context 'Given User model' do
    it 'can initialize correctly' do
      name = 'John'
      age = 30
      user = User.new(name, age)
      expect(user).to be_a User
      expect(user.name).to eq name
      expect(user.age).to eq age
    end
  end
end

At first glance, this doesn’t seem like too bad an example. But it sits at a point along a spectrum, and the extreme end of that spectrum looks something like this:

describe User do
  context 'Given User model' do
    it 'can initialize correctly' do
      source = File.read('lib/user.rb')
      expect(source).to eq <<~EOS
        class User
          attr_reader :name, :age

          def initialize(name, age)
            @name = name
            @age = age
          end
        end
      EOS
    end
  end
end

I’d say the first example isn’t very far on the spectrum from the second example. What additional benefit does writing the test the first way give us, as opposed to just writing it the second way? Maybe it gives us some representational independence and enforces the duck typing contract, i.e. if it has accessible name and age fields, it’s a User. If we ever wanted to refactor the User class and implement it in some other way that maintains this contract, we could.
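For what it’s worth, the contract is loose enough that a purely cosmetic swap would keep the spec green. A hypothetical sketch (the Struct-backed version is mine, not from the original post):

# A Struct-backed User still exposes readable name and age fields,
# and its instances are still instances of User, so the original
# spec passes without modification.
User = Struct.new(:name, :age)

That’s about as far as the representational independence goes, though.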

Can you think of a way to improve or refactor the code without changing the test? Unless you made an actual mistake like misnaming the field to begin with, probably not.

Personally, I’d say the ROI on writing this test is at best slightly higher than zero, and at worst negative. Someone has to actually write the test², and now anyone who’s trying to add a new field through the application’s call path has to do double the work. If your goal is to punish your engineers for testing non-business-logic-related data piping, the second method achieves the same outcome.
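To make the “double the work” part concrete, here’s a hypothetical sketch of what adding an email field would look like (the field is my own example, not from the original post). The model and the spec end up restating exactly the same facts:

class User
  attr_reader :name, :age, :email

  def initialize(name, age, email)
    @name = name
    @age = age
    @email = email
  end
end

describe User do
  it 'can initialize correctly' do
    user = User.new('John', 30, 'john@example.com')
    expect(user.name).to eq 'John'
    expect(user.age).to eq 30
    # One more line of production code, one more line of spec.
    expect(user.email).to eq 'john@example.com'
  end
end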

After all, if you’re making the same mistake twice, I mean, you must have a good reason for it, right?

I’ve come up with a term for these kinds of tests: carbon copy testing³.


Of course, if there is business logic in your models, then it’s worth testing. Let’s say we need to implement a method on the User class to generate a stable string representation of a user. We could do something like this:

class User
  attr_reader :name, :age

  def initialize(name, age)
    @name = name
    @age = age
  end

  def to_s
    "User:#{name}:#{age}"
  end
end

And test it like this:

describe User do
  context 'Given User model' do
    it 'should have the same stable string representation for two similar users' do
      user = User.new('John', 30)
      user2 = User.new('John', 30)
      expect(user.to_s).to eq(user2.to_s)
    end
  end
end

If we ever wanted to refactor this to use MD5, we could do so:

require 'digest'

class User
  attr_reader :name, :age

  def initialize(name, age)
    @name = name
    @age = age
  end

  def to_s
    Digest::MD5.hexdigest("User:#{name}:#{age}")
  end
end

And the test should still continue to pass, as we expect, without any modifications.


Using a language with static types makes it significantly less likely that carbon copy tests will appear, but if engineers are still forced to comply with code coverage metrics, or to work within a code review culture that expects tests for every file changed, they may still have no choice but to resort to writing them.
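The closest Ruby equivalent would be something along the lines of a Sorbet-annotated model (a hypothetical sketch, not from the original post); once srb tc has checked the signatures, the “can initialize correctly” spec has nothing left to verify:

# typed: true
require 'sorbet-runtime'

class User
  extend T::Sig

  sig { returns(String) }
  attr_reader :name

  sig { returns(Integer) }
  attr_reader :age

  sig { params(name: String, age: Integer).void }
  def initialize(name, age)
    @name = T.let(name, String)
    @age = T.let(age, Integer)
  end
end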


  1. See Goodhart’s Law.
  2. GitHub Copilot dramatically lowers the amount of time required for this, especially (and ironically so) for carbon copy tests.
  3. Perhaps similar in spirit to snapshot testing in the context of frontend testing, but I think the use case for that is more legitimate.