Paper Title
Dissecting Service Mesh Overheads
Paper Authors
Paper Abstract
Service meshes play a central role in the modern application ecosystem by providing an easy and flexible way to connect different services that form a distributed application. However, because of the way they interpose on application traffic, they can substantially increase application latency and resource consumption. We develop a decompositional approach and a tool, called MeshInsight, to systematically characterize the overhead of service meshes and to help developers quantify overhead in deployment scenarios of interest. Using MeshInsight, we confirm that service meshes can have high overhead -- up to 185% higher latency and up to 92% more virtual CPU cores for our benchmark applications -- but the severity is intimately tied to how they are configured and the application workload. The primary contributors to overhead vary based on the configuration too. IPC (inter-process communication) and socket writes dominate when the service mesh operates as a TCP proxy, but protocol parsing dominates when it operates as an HTTP proxy. MeshInsight also enables us to study the end-to-end impact of optimizations to service meshes. We show that not all seemingly-promising optimizations lead to a notable overhead reduction in realistic settings.
