Building a Delivery Ecosystem: Part 1

Women Who Code
7 min read · Aug 19, 2020

Written by Paula Paul (she/her/hers)

Are you confused about which teams need DevOps engineers? Even more confused about whether you need a standalone DevOps team, or perhaps a DevSecOps, NetDevOps, or SecDevNetOps team? Do you have FOMO (fear of missing out) over all things mesh?

We enjoy increasingly diverse ways to create value using technology, but how we deliver software, configuration, and data can be complicated. Engineers are more successful when they share an understanding of what they build and operate, along with their dependencies on value delivery processes. In this post, we refer to the collection of products and services that an engineering team relies on to publish products or deliver value to their customers as a delivery ecosystem.

What is a Delivery Ecosystem?

Imagine a shipping company. In order to deliver packages in an accurate and timely fashion, this company requires physical infrastructure around the country or world, including sorting centers, warehouses, major hubs, and airports. In addition, the company builds transportation capabilities using planes, trains, trucks, drivers, workers, and more to move goods.

If one sorting center follows a different procedure than the others, then it may send a package to the wrong recipient. If there are not enough drivers to deliver presents for the holiday season, then packages are delivered two months later than expected. If a storm sweeps through a region and routes cannot be diverted, packages may be delayed or even worse, lost. Without consistent and well-organized infrastructure, the company cannot support scalable, resilient delivery that can accommodate weather or customer demand. Such errors in delivery introduce customer dissatisfaction and affect the shipping company’s growth.

Similarly, a delivery ecosystem consists of the set of products and capabilities that software engineering teams rely on to deliver and operate services and applications. Many vendors in the business of providing cloud computing services offer a delivery ecosystem. In these ecosystems, engineers have a one-click experience to create and deploy software — for example, using Google App Engine, Amazon Web Services Elastic Beanstalk, Azure App Service, Heroku, and more.

Organizations managing their own engineering teams and data centers are similarly pursuing efficient, scalable, resilient value delivery. These organizations invest time and effort to procure hardware, set it up, ensure it is compliant and secure, and customize it for engineering and operational needs. Over time, this may result in varied and unique hardware or server configurations. As with our shipping company scenario, this can lead to delayed delivery of business value, lack of resilience to unexpected load patterns or system chaos, or poor security and risk management.

Characteristics of the Delivery Ecosystem

Engineers rely on two key characteristics of their delivery ecosystem to create value:

  • Self-service and programmable so that engineering teams can obtain resources on-demand with minimal friction.
  • Continuously delivered for quickly applying changes and avoiding or promptly recovering from downtime.

While we often associate these characteristics with IaaS and SaaS, we can have a self-service and continuously delivered ecosystem in the data center. For example, when legacy applications are tied to data center capabilities that are not self-service or continuously delivered, value delivery is constrained. If a modernization initiative focuses solely on refactoring or decomposing the legacy application without addressing delivery ecosystem constraints, the business value of modernization is limited. Exploring infrastructure-as-code for automation and configuration management can enable self-service and continuous delivery in both data center and cloud environments. These capabilities may be custom-built or provisioned through Infrastructure-as-a-Service (IaaS) or Software-as-a-Service (SaaS) providers.
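To make "self-service and programmable" concrete, here is a minimal sketch of an on-demand provisioning request validated against a compliance policy. The spec format, policy rules, and `provision()` helper are all hypothetical illustrations of the idea, not the API of any real IaC tool:

```python
# Hypothetical guardrail policy an operations team might publish.
POLICY = {"allowed_regions": {"us-east", "eu-west"}, "max_instances": 10}

def provision(spec: dict) -> str:
    """Validate a declarative, self-service request, then 'provision' it.

    A real implementation would call a cloud API or apply an
    infrastructure-as-code template; this sketch only shows the
    self-service shape of the interaction.
    """
    if spec["region"] not in POLICY["allowed_regions"]:
        raise ValueError(f"region {spec['region']!r} violates policy")
    if spec["instances"] > POLICY["max_instances"]:
        raise ValueError("instance count exceeds policy limit")
    return f"provisioned {spec['instances']}x {spec['service']} in {spec['region']}"

# An engineering team obtains compliant resources on demand, with no ticket queue.
print(provision({"service": "hello-world", "region": "us-east", "instances": 2}))
```

Because the request is data, the same guardrails apply whether the backing capability lives in a data center or a public cloud.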

Role of the Delivery Ecosystem

Business value is best measured through a holistic view of the entire technology value chain.

From the theory of constraints, we know that focusing on only one section of the value chain, such as the creation of services and applications, is not an effective strategy for accelerating value delivery. For example, we might redesign our systems architecture to support microservices, but that local optimization does nothing to improve agility in provisioning core compute, network policy, or container orchestration. Without aligning them with business value, investments in delivery ecosystem improvements become difficult to justify and measure. How can we include and optimize the delivery ecosystem holistically, as part of the technology value chain?
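The theory-of-constraints point can be sketched numerically: end-to-end throughput is capped by the slowest stage, so optimizing any other stage changes nothing. The stage names and throughput figures below are invented for illustration:

```python
# Hypothetical throughput (work items per day) for each stage of a
# technology value chain; all numbers are invented.
stages = {
    "design": 40,
    "build services": 25,
    "provision infrastructure": 5,   # the delivery ecosystem
    "operate": 30,
}

# The chain delivers no faster than its slowest stage (the constraint).
# Doubling "build services" to 50 would leave end-to-end throughput at 5.
constraint = min(stages, key=stages.get)
end_to_end = stages[constraint]
print(constraint, end_to_end)  # provision infrastructure 5
```

In this toy example the delivery ecosystem is the constraint, so that is where improvement effort pays off first.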

Delivery Ecosystem Journeys and Value Metrics

Managing a delivery ecosystem as software allows us to apply agile approaches and build the ecosystem quickly and incrementally. To do this, we describe the primary user journeys for the delivery ecosystem, define key metrics, and use the journeys to refine and prioritize delivery product backlogs. Delivery product boundaries and considerations are discussed in part two of this article. Primary user journeys for the delivery ecosystem include:

  • Path to Production (P2P): The P2P journey drives delivery product backlogs related to build pipelines and code promotion environments for content delivery, compute and storage, container registries, compute clusters, and network connectivity. For example, a typical P2P MVP enables code promotion of a simple ‘hello world’ application through a pre-production or isolated production environment. P2P metrics may include cycle time for production deployment of a new service or application, such as ‘time to hello world,’ and deployment failure rates.
  • Path to Repair (P2R): The P2R journey prioritizes features for log aggregation, service observability, auto-scaling, circuit-breaking, metrics, and alerting needed to support resilience, scalability, and an efficient debugging experience, including build monitors and operations dashboards. P2R metrics may include MTTR (Mean Time To Repair), SLO/SLA/SLI, and measured outcomes of resilience activities such as load or chaos testing.
  • Path to Compliance (P2C): P2C features enable authentication and authorization (IAM), vulnerability management, secrets management and rotation, network policy, business metrics, and regulatory automation to support audit requirements that vary by industry. P2C metrics may include audit metrics and MTTC (Mean Time To Compliance), the time needed to detect and address breaches and vulnerabilities.
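Several of these journey metrics can be computed directly from delivery event logs. The sketch below derives deployment failure rate and mean cycle time (P2P) and MTTR (P2R) from a tiny set of invented events; the event shapes and timestamps are hypothetical:

```python
from datetime import datetime, timedelta

# Hypothetical deployment events: (started, finished) with None marking a failure.
deployments = [
    (datetime(2020, 8, 1, 9), datetime(2020, 8, 1, 10)),
    (datetime(2020, 8, 2, 9), None),                       # failed deployment
    (datetime(2020, 8, 3, 9), datetime(2020, 8, 3, 9, 30)),
]
# Hypothetical incidents: (detected, repaired).
incidents = [(datetime(2020, 8, 2, 9), datetime(2020, 8, 2, 11))]

# P2P metrics: deployment failure rate and mean cycle time to production.
failure_rate = sum(1 for _, done in deployments if done is None) / len(deployments)
cycle_times = [done - start for start, done in deployments if done is not None]
mean_cycle = sum(cycle_times, timedelta()) / len(cycle_times)

# P2R metric: Mean Time To Repair across incidents.
mttr = sum((fixed - found for found, fixed in incidents), timedelta()) / len(incidents)

print(f"deployment failure rate: {failure_rate:.0%}")  # 33%
print(f"mean cycle time: {mean_cycle}")                # 0:45:00
print(f"MTTR: {mttr}")                                 # 2:00:00
```

Tracking these numbers over time gives the delivery product backlog a measurable definition of "better."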

These primary journeys could be represented as themes or epics, with features refined and prioritized within delivery product boundaries, and as a way to scale and coordinate teams when evaluating, creating, or renovating a delivery ecosystem.

Getting Started: Key Decisions

Typically, the first step in constructing a delivery ecosystem is choosing a service provider, whether within the data center or a public cloud provider. We can bucket initial considerations into three types: evolvability, autonomy, and situational. These three types of considerations often come into conflict with each other.

Evolvability should take precedence, as it accounts for changes in technology. These decisions often come down to building certain capabilities versus buying a product. While building a capability seems attractive, it requires resources to invest in construction and ongoing operation. When purchasing from a provider, we must consider vendor lock-in. Both build and buy decisions can increase change friction: the effort needed to evolve a system or embrace new vendors or technologies.

Autonomy refers to the on-demand provisioning of compliant resources. An engineering team may not concern itself directly with security or compliance implementation as long as they are provided a means to ensure and verify that security or compliance is in place. By receiving these resources on-demand in a self-service form, teams can focus on development. However, inflexible templating or too many restrictions on engineering may affect evolvability and increase change friction.

Finally, situational considerations such as cost, vendor contracts, and international or regional factors are often hard constraints that cannot be ignored. Ideally, they contribute to the decision but are not the primary deciding factors. In practice, they often conflict with interests in evolvability and autonomy. If situational considerations must take precedence, it’s best to include their potential effects on evolvability, change friction, and autonomy when calculating the total cost of ownership.
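One way to keep change friction in the conversation is to fold it into a total-cost-of-ownership estimate. The figures and the `tco()` helper below are invented for illustration; real estimates would come from your own procurement and operations data:

```python
# A toy total-cost-of-ownership comparison. "change_friction_per_year"
# estimates the yearly cost of evolving or replacing the capability,
# per the evolvability discussion above. All figures are invented.
def tco(upfront_cost, operate_cost_per_year, change_friction_per_year, years=3):
    return upfront_cost + years * (operate_cost_per_year + change_friction_per_year)

build = tco(upfront_cost=500_000, operate_cost_per_year=200_000,
            change_friction_per_year=50_000)
buy = tco(upfront_cost=100_000, operate_cost_per_year=300_000,
          change_friction_per_year=150_000)
print(build, buy)  # 1250000 1450000
```

In this made-up scenario the cheaper-looking "buy" option ends up costlier over three years once lock-in friction is priced in; with different inputs the conclusion could easily reverse.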

After choosing a candidate service provider (or providers), the next step is to examine their technologies and tooling. The diversity of available technologies and tooling typically leads to the question:

“Should we assemble ‘best-of-breed’ technologies (à la carte), or exploit the provider’s integrated ecosystem?”

For organizations taking their first steps into cloud adoption as a delivery ecosystem, leveraging integrated capabilities can reduce procurement complexity and time to market for initial delivery ecosystem capabilities. A vendor’s integrated capabilities may not be considered industry best-of-breed, but aligning with a single vendor’s technology stack can reduce ramp-up fatigue for software delivery and operations teams. Examining provider capabilities in terms of the prioritized user journeys and metrics for your delivery ecosystem can help focus build-vs-buy decisions.

An increasing number of delivery ecosystem technologies are open source or vendor-supported open source, and many support built-in integrations with other open-source tooling (such as the Cloud Native Computing Foundation projects). It is challenging to balance the best individual features from many tools against a less mature but integrated feature set from a single provider. Consider the following when choosing a vendor’s capabilities:

  1. Debug-ability: how do we debug across many tools? How do we debug with one?
  2. Improvability: are product teams or communities amenable to the improvement of features?
  3. Responsibility: is it self-hosted/deployed or a software-as-a-service (SaaS)?
  4. Complexity: how do we support, procure, and integrate changes to these tools in the future?
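These four considerations can be turned into a simple weighted decision matrix. The weights, candidate names, and 1–5 scores below are invented for illustration; the point is the method, not the numbers:

```python
# Weights over the four considerations above (invented; must sum to 1.0).
weights = {"debug-ability": 0.3, "improvability": 0.2,
           "responsibility": 0.2, "complexity": 0.3}

# Hypothetical scores (1 = poor, 5 = excellent) for two sourcing strategies.
candidates = {
    "best-of-breed mix": {"debug-ability": 2, "improvability": 5,
                          "responsibility": 3, "complexity": 2},
    "integrated vendor stack": {"debug-ability": 4, "improvability": 3,
                                "responsibility": 4, "complexity": 4},
}

# Weighted sum per candidate; highest total wins under these assumptions.
scores = {name: sum(weights[c] * s for c, s in crit.items())
          for name, crit in candidates.items()}
best = max(scores, key=scores.get)
print(scores)
print(best)  # integrated vendor stack
```

The output depends entirely on the weights you choose, which is the useful part: the matrix forces the team to state which consideration matters most before arguing about tools.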

Once these considerations are taken into account, what are the starting products and capabilities we need from a delivery ecosystem to enable initial delivery of value? We suggest some product boundaries in Part 2 of this article.

This article was originally written and published by WWCode member Paula Paul on her blog.


Women Who Code

We are a 501(c)(3) non-profit organization dedicated to inspiring women to excel in technology careers. https://www.womenwhocode.com/