Saturday, June 15, 2024

Service Mesh

Before service meshes existed, engineering teams on the cutting edge of microservices architecture built in-house components to address the challenges they faced in the early days of microservices. As projects grew, growing pains emerged, and teams found themselves needing basic service discovery, security, traffic management, and observability.

In a monolithic architecture, similar problems are resolved by encapsulating cross-cutting concerns, that is, features that multiple components need to share. In either architecture, microservices or monolithic, the absence of a shared implementation of such a feature inevitably leads to a bigger problem: code duplication.

For example, in a monolithic architecture, logging and error-handling code may need to be duplicated across several modules. Similarly, in a microservices architecture, each service may implement its own authentication and authorization logic, leading to code maintenance challenges and inconsistencies.

Problems to Solve

It doesn't take hundreds of services for this need to become apparent. Avoiding duplicate code becomes difficult even when running fewer than twenty services.


In fact, the questions arise even in a minimal two-service environment, where service A calls service B. How does service A find service B? What if service B moves? If a call fails, how many times should service A retry? How does service B know that it is actually service A calling? Should service B always accept requests from service A? If a new version of service B is released, can service A remain compatible with the previous version? What happens if both services log and trace calls independently?
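To make the first few questions concrete, here is a minimal sketch of the discovery and retry logic service A might hand-roll without a mesh. Everything in it is an assumption for illustration: the in-memory REGISTRY, the addresses, the call_with_retry helper, and the choice of three attempts with exponential backoff. The point is that every service calling another service would need its own copy of something like this.

```python
import time

# Hypothetical in-memory registry: service name -> known addresses.
# A real system would query a discovery service instead.
REGISTRY = {"service-b": ["10.0.0.5:8080", "10.0.0.6:8080"]}


class ServiceUnavailable(Exception):
    """Raised when no address is known or every attempt failed."""


def call_with_retry(service, send, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Look up `service`, then try its addresses in turn.

    `send` is a caller-supplied function that takes an address and either
    returns a response or raises ConnectionError. Between failed attempts
    the delay doubles (exponential backoff).
    """
    addresses = REGISTRY.get(service, [])
    if not addresses:
        raise ServiceUnavailable(f"no known address for {service}")

    last_error = None
    for attempt in range(max_attempts):
        # Rotate through the known addresses on each attempt.
        address = addresses[attempt % len(addresses)]
        try:
            return send(address)
        except ConnectionError as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)

    raise ServiceUnavailable(
        f"{service} failed after {max_attempts} attempts"
    ) from last_error
```

Service A would call this as call_with_retry("service-b", send_request), where send_request is its own (here hypothetical) function for making the network call. Authentication, versioning, and tracing would each need similar hand-written code.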

Solutions

To solve most of these problems, service meshes introduce an infrastructure layer called a data plane, in which a sidecar (proxy) is paired with each service instance. Sidecars keep the cross-cutting concerns separate from the application code. They intercept all inbound and outbound traffic to and from services, and are guided by system-wide policy provided by a control plane.
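The sidecar idea can be sketched in a few lines. This is an in-process simulation under stated assumptions, not how any particular mesh implements its proxy: the Policy and Sidecar classes, the allowed_callers mapping, and the status codes are all hypothetical names chosen for illustration. What it shows is that the application handler contains no access-control or metrics code, because the sidecar handles both before the request reaches it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set, Tuple


@dataclass
class Policy:
    """Hypothetical system-wide policy a control plane might distribute."""
    # destination service -> set of services allowed to call it
    allowed_callers: Dict[str, Set[str]] = field(default_factory=dict)


@dataclass
class Sidecar:
    """Simulated proxy sitting in front of one service instance."""
    service: str
    app: Callable[[str], str]  # the application's own request handler
    policy: Policy
    request_count: int = 0     # simple observability counters
    denied_count: int = 0

    def inbound(self, caller: str, payload: str) -> Tuple[int, str]:
        """Intercept an inbound request before the application sees it."""
        self.request_count += 1
        allowed = self.policy.allowed_callers.get(self.service, set())
        if caller not in allowed:
            self.denied_count += 1
            return (403, "caller not allowed")
        # Only authorized traffic reaches the application code.
        return (200, self.app(payload))
```

For example, with Policy({"service-b": {"service-a"}}) and a Sidecar wrapping service B's handler, a request from service-a is passed through, while a request from any other caller is rejected without touching the application.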


The control plane can also be thought of as a mesh coordinator. It keeps track of where service instances run and propagates configuration changes to the entire system. The control plane sends service discovery information to sidecars, enabling them to route traffic correctly. It also enables sidecars to establish a secure communication channel by issuing and rotating certificates. 
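As a rough sketch of the coordinator role, the following simulation shows a control plane tracking where instances run and pushing each change to every connected sidecar. The ControlPlane and SidecarConfig classes, their method names, and the version counter are assumptions made for illustration; certificate issuance and rotation are left out to keep the example short.

```python
class SidecarConfig:
    """Local copy of the discovery data held by one sidecar."""

    def __init__(self):
        self.endpoints = {}  # service name -> list of addresses
        self.version = 0     # version of the last snapshot received

    def update(self, endpoints, version):
        # Copy the snapshot so later control-plane changes
        # only arrive through an explicit push.
        self.endpoints = {name: list(addrs) for name, addrs in endpoints.items()}
        self.version = version


class ControlPlane:
    """Tracks service instances and propagates changes to all sidecars."""

    def __init__(self):
        self._endpoints = {}
        self._sidecars = []
        self._version = 0

    def connect(self, sidecar):
        """Attach a sidecar and send it the current snapshot."""
        self._sidecars.append(sidecar)
        sidecar.update(self._endpoints, self._version)

    def register(self, service, address):
        """Record a new instance of `service` and notify every sidecar."""
        self._endpoints.setdefault(service, []).append(address)
        self._push()

    def deregister(self, service, address):
        """Remove an instance (for example, one that moved or stopped)."""
        self._endpoints[service].remove(address)
        self._push()

    def _push(self):
        self._version += 1
        for sidecar in self._sidecars:
            sidecar.update(self._endpoints, self._version)
```

If service B moves, the old address is deregistered and the new one registered, and every connected sidecar sees the change without any application code being touched.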

Service mesh architecture offers enormous advantages in managing and securing microservices. By abstracting network and security concerns into sidecars, service meshes like Istio provide centralized control, policy enforcement, and observability. 

A downside is the complexity of deploying and managing a service mesh. Service meshes require upfront consideration of cloud resource utilization (memory, CPU, bandwidth), and teams should think through the specific needs of the application architecture to justify the added complexity.

When implemented correctly, service meshes give engineering teams superpowers by improving overall system security, resilience, and scalability. As an added advantage, a service mesh simplifies application maintenance by helping teams avoid code duplication.