Putting experimental code on trial

Rick Powell
7 min read · Apr 10, 2021

The code referenced in the following article can be found here.

Can we just do this one thing quickly?

It’s a common request that can instil a sense of dread and defeat in a developer, often because the word ‘quick’ is associated with resulting code that ends up being ‘dirty’, ‘hacky’ or ‘risky’. Doing things quickly can mean compromising on well-thought-out principles refined through years of experience. It can mean coming back to the code six months later and cursing your former self as you struggle to comprehend the coded representation of a previously clear and obvious requirement. It’s knowing that these changes are committed to history through source code: an archive of regrettable decision making, freely accessible to the very people you wouldn’t want to see it, your peers.

There is some comfort in recognising that the problem of doing things quickly isn’t unique to the world of software engineering. Regardless of industry or problem, faster completion is a much-cherished goal. Taking a simple example of a coffee shop, wouldn’t it be great if coffee was served just that little bit more quickly to caffeine-deprived customers?

If we were to try and achieve this, what steps would we take? One obvious step might be to hire more staff, but this comes with a financial cost. You could offset this cost with a cheaper blend, but that could compromise on taste. At some point, a balance needs to be struck across taste, quality of service and cost. Yet even when all those constraints are set, there is still the opportunity to deliver more quickly by focussing on the efficiency of the whole operation.

A fast feedback loop

If you’re opening a coffee shop, you want the coffee to be loved by customers. And as the owner, you’d most likely enlist the help of friends to find a great taste. This is not an exact science — you wouldn’t start selling coffee based on theory alone, where the correct temperatures and pressures have been calculated and the recommended blend has been supplied. And that’s because trialling your coffee on others is the real proof of the pudding. If you get positive feedback from these trials, this would give you more confidence than anything else to start selling to the public.

The same line of thinking can be applied when it comes to building software. Trialling software allows businesses to obtain tangible customer feedback, and in turn gives confidence that your software is providing a service that customers want or need. CI/CD processes really make the dream of build-measure-learn a reality, where faster iterations of this cycle are the ultimate goal. Making the ‘build’ part of the build-measure-learn loop efficient is really the responsibility of software developers, so is asking for something to be done quickly really such an unreasonable request?

Feature switches can be a really powerful tool in reducing the build time in this cycle. They allow experimental code to remain dormant until the switch is turned on. As a consequence:

  • There’s minimal risk in deploying this code to a live environment, as long as the feature switch is off. This means the feature code can make it into a release branch ASAP, even before it’s ready to be trialled, which reduces the development cycle time.
  • If the feature doesn’t work out and there’s no need to pursue it further, we can turn off the switch and remove it, along with the feature code.
  • If the feature does work out, we can promote the code to be a first-class part of the code-base through refactoring, and just remove the switch.

Introducing service replacements with decorators

As feature switches are developed, it’s always good to bear their purpose in mind. Remembering that the code is just part of a trial, we should ensure that:

  • The new feature can be added with no impact on existing operations
  • The new feature can be removed with no impact on existing operations

This is intrinsic in greenfield development. There are no prior dependencies or integration points to worry about. By definition, greenfield development is isolated from any existing code. When development is brownfield, however, and we are looking to trial a potential replacement of existing code, this goal is a little more complex. Through the use of decorators, we can implement a feature switch that introduces service replacements without impacting any existing code.

As an example, we can model a coffee shop with a coded representation of its operation.

Coffee can only be purchased where there is stock available

Making a purchase is not going to be possible if the weight of coffee required (based on the coffee strength) exceeds the weight of coffee in stock for a particular blend. This is a business rule, defended by throwing an exception in the domain but also validated prior to the purchase request.
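A minimal sketch of how such a rule might look, assuming a hypothetical CoffeeStock class, made-up Blend and Strength enums, and an invented figure for the weight of coffee needed per unit of strength. The domain throws if the rule is broken, and CanPurchase lets callers validate beforehand:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names and figures, for illustration only.
public enum Blend { House, Decaf }
public enum Strength { Weak = 1, Medium = 2, Strong = 3 }

public class CoffeeStock
{
    // Assumed: each unit of strength uses 0.01 kg of coffee.
    private const decimal KgPerStrengthUnit = 0.01m;
    private readonly Dictionary<Blend, decimal> _kgInStock;

    public CoffeeStock(Dictionary<Blend, decimal> kgInStock)
    {
        _kgInStock = kgInStock;
    }

    public static decimal WeightRequired(Strength strength)
    {
        return (int)strength * KgPerStrengthUnit;
    }

    // Validation that calling code can run before requesting a purchase.
    public bool CanPurchase(Blend blend, Strength strength)
    {
        return _kgInStock.TryGetValue(blend, out var kg)
            && kg >= WeightRequired(strength);
    }

    // The domain defends the same rule itself by throwing.
    public void Purchase(Blend blend, Strength strength)
    {
        if (!CanPurchase(blend, strength))
            throw new InvalidOperationException($"Not enough {blend} in stock.");

        _kgInStock[blend] -= WeightRequired(strength);
    }
}
```

Checking the rule in both places means a caller that forgets to validate still cannot drive the stock below zero.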

Unfortunately, the supplier delivering the coffee beans is unable to deliver and there’s no coffee left.

This simple implementation of a SupplierService represents the problem at hand: there is 0 kg of coffee available for every blend. Making coffee purchases is therefore no longer possible.
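As a rough sketch, such a service could be as small as the following. The GetAvailableKg method name and the Blend enum are assumptions made for illustration; only the SupplierService and ISupplierService names come from the article:

```csharp
// Hypothetical blends, for illustration only.
public enum Blend { House, Decaf }

public interface ISupplierService
{
    // Assumed method name: kilograms of the given blend the supplier can deliver.
    decimal GetAvailableKg(Blend blend);
}

// The original supplier, which currently has nothing to deliver.
public class SupplierService : ISupplierService
{
    public decimal GetAvailableKg(Blend blend)
    {
        return 0m;
    }
}
```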

Caffeine-deficient customers are getting impatient and the manager needs to act quickly! They decide to ring another supplier to try and get some more coffee beans.

The decision to use another temporary supplier is recorded as the feature switch “UseTemporarySupplier”, but we don’t know what the finer details are yet…

The supplier agrees to deliver coffee, but this supplier is unproven because the coffee shop has never used them before. Therefore, the decision comes with a risk, but it’s better than having no coffee at all.

The unproven supplier can supply 100 kg of coffee for every blend. This supplier represents a different implementation of how supply is obtained, so it has its own UnprovenSupplierService.

Thinking a bit further ahead, there are still a few questions up in the air. When will the original supplier be back in action? Will the coffee shop continue to use this new supplier or switch back to the old one? Until those questions are answered, we’re trialling a new supplier.

Adding a decorator to the ISupplierService service keeps the door open for using both the normal and the trial coffee suppliers, based on the UseTemporarySupplier switch we made earlier. We can leave our original supplier code inactive, but still available. It also leaves any calling code unaffected as the decorator acts as an ISupplierService to be resolved through the DI container.
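One way this decorator might be put together is sketched below. The class names follow the article, but the GetAvailableKg method, the Blend enum and the use of a Func<bool> to read the UseTemporarySupplier switch are all assumptions:

```csharp
using System;

public enum Blend { House, Decaf } // hypothetical blends

public interface ISupplierService
{
    decimal GetAvailableKg(Blend blend); // assumed method name
}

// Original supplier: nothing to deliver at the moment.
public class SupplierService : ISupplierService
{
    public decimal GetAvailableKg(Blend blend) { return 0m; }
}

// Supplier on trial: 100 kg of every blend.
public class UnprovenSupplierService : ISupplierService
{
    public decimal GetAvailableKg(Blend blend) { return 100m; }
}

// Decorator: looks like any other ISupplierService to calling code, and only
// diverts to the trial supplier while the switch is on.
public class UnprovenSupplierServiceDecorator : ISupplierService
{
    private readonly ISupplierService _original;
    private readonly ISupplierService _unproven;
    private readonly Func<bool> _useTemporarySupplier; // assumed switch lookup

    public UnprovenSupplierServiceDecorator(
        ISupplierService original,
        ISupplierService unproven,
        Func<bool> useTemporarySupplier)
    {
        _original = original;
        _unproven = unproven;
        _useTemporarySupplier = useTemporarySupplier;
    }

    public decimal GetAvailableKg(Blend blend)
    {
        return _useTemporarySupplier()
            ? _unproven.GetAvailableKg(blend)
            : _original.GetAvailableKg(blend);
    }
}
```

Reading the switch on every call, rather than once in the constructor, means the trial can be turned on or off without rebuilding the object graph. How the decorator is registered depends on the DI container in use.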

The code changes required to trial the new supplier are the addition of a new UnprovenSupplierService, UnprovenSupplierServiceDecorator and FeatureSwitch. Importantly, with these changes, the method for purchasing coffee has been left untouched. Because of this, and the fact that our new code is not active yet, we can release straight to live with confidence, even whilst the feature code is under development!

Once trialled, if the new supplier doesn’t work out and the coffee shop continues to use the original supplier, it’s simple enough to remove these bits of code. If it does work out and becomes a permanent replacement, it’s easy enough to remove the decorator and replace the SupplierService with the UnprovenSupplierService.

Introducing method replacements with delegates

Using the decorator pattern to implement feature switches is quite a limited approach in a lot of cases. Most business logic doesn’t need to be shoe-horned into services and more clarity can often be obtained in classes with domain-rich behaviour.

In our coffee shop, we can model an example of where this is apparent. Currently, the cost of a coffee is calculated based on the strength requested by the customer:
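A sketch of what that costing method might look like, assuming a hypothetical Coffee class and made-up prices:

```csharp
using System;

public enum Strength { Weak, Medium, Strong } // hypothetical strengths

public class Coffee
{
    private readonly Strength _strength;

    public Coffee(Strength strength)
    {
        _strength = strength;
    }

    // Prices are invented for illustration.
    public decimal GetCost()
    {
        switch (_strength)
        {
            case Strength.Weak: return 2.00m;
            case Strength.Medium: return 2.50m;
            case Strength.Strong: return 3.00m;
            default: throw new ArgumentOutOfRangeException(nameof(_strength));
        }
    }
}
```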

The manager wants to experiment with the pricing by reducing the cost of the weakest coffee, but increasing the cost of the strongest coffee. One way to implement this might be to do the following:
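A sketch of that quick approach, with the feature switch threaded straight into the existing method. The Coffee class and all prices are assumptions:

```csharp
using System;

public enum Strength { Weak, Medium, Strong } // hypothetical strengths

public class Coffee
{
    private readonly Strength _strength;

    public Coffee(Strength strength)
    {
        _strength = strength;
    }

    // The original method has been edited in place: old and new pricing
    // are now interleaved, so the existing behaviour is no longer untouched.
    public decimal GetCost(bool useExperimentalPricing)
    {
        switch (_strength)
        {
            case Strength.Weak: return useExperimentalPricing ? 1.50m : 2.00m;
            case Strength.Medium: return 2.50m;
            case Strength.Strong: return useExperimentalPricing ? 3.50m : 3.00m;
            default: throw new ArgumentOutOfRangeException(nameof(_strength));
        }
    }
}
```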

Quick and easy. But when it comes to releasing the change, suddenly it becomes very difficult to look your QA team in the eye and say “this code change comes with no risk so there’s no need to test”. Furthermore, what if the experimental pricing works out and we want to remove the feature switch? Is that another completely risk-free release? The risk comes from the fact that the original implementation of the costing method has been modified to account for changes to the pricing.

In order to leave the original implementation unaltered, we can introduce a layer of delegation into the domain:
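A minimal sketch of such a delegation layer, using Func<decimal> as the delegate type. The method names follow the article, while the Coffee class, the Strength enum and the prices are assumptions:

```csharp
using System;

public enum Strength { Weak, Medium, Strong } // hypothetical strengths

public class Coffee
{
    private readonly Strength _strength;

    public Coffee(Strength strength)
    {
        _strength = strength;
    }

    // Public entry point: returns a delegate to whichever costing method
    // the feature switch selects, without running any pricing logic itself.
    public Func<decimal> GetCost(bool useExperimentalPricing)
    {
        if (useExperimentalPricing)
            return GetExperimentalCost;

        return GetCost; // binds to the parameterless private overload
    }

    // Original costing method, untouched apart from its visibility.
    private decimal GetCost()
    {
        switch (_strength)
        {
            case Strength.Weak: return 2.00m;
            case Strength.Medium: return 2.50m;
            case Strength.Strong: return 3.00m;
            default: throw new ArgumentOutOfRangeException(nameof(_strength));
        }
    }

    // Copy of the original with the trial prices applied (invented figures).
    private decimal GetExperimentalCost()
    {
        switch (_strength)
        {
            case Strength.Weak: return 1.50m;
            case Strength.Medium: return 2.50m;
            case Strength.Strong: return 3.50m;
            default: throw new ArgumentOutOfRangeException(nameof(_strength));
        }
    }
}
```

A caller obtains the delegate and invokes it in one expression, for example `new Coffee(Strength.Weak).GetCost(useExperimentalPricing)()`.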

Let’s break down these changes:

  • GetCost has been duplicated and renamed to GetExperimentalCost, before implementing the new pricing model changes. This represents the alternative costing method to be used when the feature switch is enabled.
  • Both GetCost and GetExperimentalCost have then been set to private. This encapsulates the feature switch decision internally, so that client code referencing the class does not have to make it.
  • A new GetCost method has been introduced as the public entry point for calculating the cost of a coffee. This returns a delegate to one of the two costing methods, based on a feature switch that is passed in.

Aside from not altering the original implementation, it’s also a lot easier to see the differences that the feature switch introduces with a side-by-side method comparison.

The code consuming GetCost also looks pretty clean. In the case that the GetCost method signatures for the old and new implementation match, it’s an easy way of keeping the feature switch out of the method parameters so they don’t have to change.
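A sketch of what the consuming code might look like, assuming a hypothetical CoffeeShop class with a Takings total. The Coffee class here is a compact stand-in with made-up prices:

```csharp
using System;

public enum Strength { Weak, Medium, Strong } // hypothetical strengths

// Compact stand-in for the domain class; prices are invented.
public class Coffee
{
    private readonly Strength _strength;

    public Coffee(Strength strength) { _strength = strength; }

    public Func<decimal> GetCost(bool useExperimentalPricing)
    {
        if (useExperimentalPricing)
            return GetExperimentalCost;

        return GetCost;
    }

    private decimal GetCost()
    {
        return _strength == Strength.Weak ? 2.00m
             : _strength == Strength.Medium ? 2.50m
             : 3.00m;
    }

    private decimal GetExperimentalCost()
    {
        return _strength == Strength.Weak ? 1.50m
             : _strength == Strength.Medium ? 2.50m
             : 3.50m;
    }
}

public class CoffeeShop
{
    public decimal Takings { get; private set; }

    // The switch is only passed along; no pricing decision is made here.
    public void MakePurchase(Strength strength, bool useExperimentalPricing)
    {
        var coffee = new Coffee(strength);
        Takings += coffee.GetCost(useExperimentalPricing)();
    }
}
```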

You could argue that there has been a change to the implementation of the calling method (MakePurchase), and you’d be correct. However, the change is limited to passing a boolean “useExperimentalPricing” parameter through a series of chained method calls. Any breakages with this can be caught at compile time and, importantly, this doesn’t equate to any change in logical behaviour directly within this method.

Once the experiment is finished and the manager is ready to go with one costing model or the other, we can remove the delegating method and publicly expose the winning costing method (and rename if required).

Conclusion

Neither of the techniques outlined above is particularly fancy or ground-breaking, but they do allow experimental code to be introduced as just that, with minimal impact on the existing code that should continue to work as expected. Service and method replacements also have the fortunate side-effect of being largely self-documenting, making the code easier for peers to comprehend and maintain.
