QStack and 3 test tools for C10M



About

An open-source, high-concurrency user-space network stack and three test tools for C10M

The TCP/IP network stack is irreplaceable for Web services on datacenter front-end servers, and demand for it is growing rapidly, especially for emerging high-concurrency network service applications (the Internet, the Internet of Things, the mobile Internet, etc.). This led industry to pose the C10M problem: how to enable a single commercial server to handle millions of clients simultaneously and support tens of millions of concurrent connections. Existing network stack designs often face a dilemma between high concurrency and low tail latency for application services. Our research group breaks this dilemma with QStack, a flexible architectural design, as a solution to the C10M problem. This report covers QStack and three test tools (MCC, LightShaper, and HCMonitor), all of which are open source. As the architecture shows, these tools can work together to test high concurrency. They can also be used separately with other tools; the only drawback is that, when they are combined with other tools, some functions or performance, such as concurrency, may be compromised.

We also propose MCCBench (published in Bench'22), an abstract benchmark for C10M that defines the methodology for 1) load generation, 2) service function, and 3) service performance evaluation. In addition, we provide a C10M test case, MCCBench-IoT, which is designed from a real scenario at a well-known IoT company and built on our open-source tools. MCCBench-IoT has completed a C10M test on one server and a 100-million-concurrency test on multiple servers.

Please feel free to contact us at zhangwl@ict.ac.cn if you have any questions. We also sincerely invite anyone interested to communicate, study, and work with us.

QStack

QStack is a user-space TCP/IP network stack that enables highly concurrent network services, with up to 10 million concurrent connections on a single server, while keeping a good user experience (i.e., low tail latency).
Contributions:
Full-datapath zero-copy and full-stack lock-free design
        Low-overhead processing in user space
Application definable full-datapath priority
        Low tail latency, with request feature labels guiding cross-layer prioritization at low overhead
Elastic framework
        High CPU efficiency and high concurrency by adaptively adjusting the CPU resources used by the stack, from as few as one core up to the whole server, for fluctuating datacenter workloads. When scaled to the whole server, C10M is supported
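
To make the idea of application-definable priority concrete, here is a minimal Python sketch. It is not QStack's API: the `classify` rule, the `CTRL` payload prefix, and the `PriorityDatapath` class are hypothetical names chosen for illustration. It only shows the general pattern of labeling each request once and then serving higher-priority requests first.

```python
from collections import deque

HIGH, LOW = 0, 1  # hypothetical priority labels, lower value is served first


def classify(payload: bytes) -> int:
    """Hypothetical application-defined rule: treat 'CTRL' requests as urgent."""
    return HIGH if payload.startswith(b"CTRL") else LOW


class PriorityDatapath:
    """Toy two-level queue standing in for a prioritized datapath."""

    def __init__(self):
        self.queues = (deque(), deque())  # index 0 = HIGH, index 1 = LOW

    def enqueue(self, payload: bytes) -> None:
        # Label the request once, then keep it in the matching queue.
        self.queues[classify(payload)].append(payload)

    def dequeue(self):
        # Always drain the high-priority queue before the low-priority one.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None


def drain(datapath: PriorityDatapath) -> list:
    """Return all queued requests in the order they would be served."""
    served = []
    while (payload := datapath.dequeue()) is not None:
        served.append(payload)
    return served
```

In this sketch, urgent requests that arrive after bulk requests are still served first, which is the property that helps keep tail latency low for the labeled traffic.
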
Source code:
GitHub https://github.com/acs-network/qstack
Gitee https://gitee.com/acs-dcn/qstack

MCC

MCC is a distributed load generator that simulates massive numbers of clients, enabling high concurrency of up to 10 million on one server.
Contributions:
High concurrency
        Kernel bypass with a lightweight user-level stack (mTCP)
Scalability in multi-core and multi-server systems
        Shared-nothing architecture, distributed, with a multi-threaded model
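
The shared-nothing idea can be sketched as a static partition of simulated clients across workers, so that no worker ever touches another worker's connections. The function below is an illustrative assumption, not MCC's actual code; `partition_clients` and its parameters are hypothetical names.

```python
from typing import List


def partition_clients(total_clients: int, servers: int,
                      threads_per_server: int) -> List[range]:
    """Split client IDs into disjoint ranges, one per worker thread.

    Each worker owns its range exclusively, so no locks or shared
    connection tables are needed (the shared-nothing property).
    """
    workers = servers * threads_per_server
    base, extra = divmod(total_clients, workers)
    ranges = []
    start = 0
    for i in range(workers):
        # The first `extra` workers take one additional client each.
        size = base + (1 if i < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges
```

For example, 10 million clients on one server with 8 threads gives each thread 1,250,000 clients; adding servers only increases the number of disjoint ranges.
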
Source code:
GitHub https://github.com/acs-network/mcc
Gitee https://gitee.com/acs-dcn/mcc

LightShaper

LightShaper is a low-cost, pure-software network traffic transformation tool based on DPDK that serves as an optional auxiliary to the network load generator.
Contributions:
Waveform shaping, speed control, and simulation of WAN traffic characteristics (e.g., out-of-order delivery, high latency, packet drops) for network load
        Fills in placeholder packets to achieve microsecond accuracy in packet interval control
Decouples traffic feature management from the load generator for independent regulation
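
The source does not spell out how placeholder packets translate into timing, so the following is only a back-of-the-envelope sketch under one assumption: the link transmits back to back, so the gap between two real packets is set by how many placeholder bytes are sent in between. The function names are hypothetical, and Ethernet preamble and inter-frame gap are ignored for simplicity.

```python
def placeholder_bytes(gap_us: float, link_gbps: float) -> int:
    """Bytes of placeholder traffic that occupy `gap_us` microseconds on the wire.

    1 Gbps carries 1000 bits per microsecond, so a 10 Gbps link
    carries 10,000 bits (1250 bytes) per microsecond.
    """
    bits = gap_us * link_gbps * 1000
    return int(bits // 8)


def frame_time_us(frame_bytes: int, link_gbps: float) -> float:
    """Time in microseconds that one frame of `frame_bytes` occupies the link."""
    return frame_bytes * 8 / (link_gbps * 1000)
```

Under these assumptions, a 1 microsecond gap on a 10 Gbps link corresponds to 1250 placeholder bytes, and a single 64-byte placeholder frame adjusts the interval in steps of about 0.05 microseconds, which is consistent with microsecond-level control.
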
Source code:
GitHub https://github.com/acs-network/lightshaper
Gitee https://gitee.com/acs-dcn/lightshaper

HCMonitor

HCMonitor is a full-traffic monitoring system that supports high concurrency of up to 10 million concurrent TCP connections.
Contributions:
Accurate latency based on full traffic
        Computes server-side latency, excluding client-side queuing delay, etc.
        Full-traffic statistics instead of sampling, accurate to the latency of each request
High concurrency monitoring
        Lock-free, pipelined processing with multiple threads based on DPDK, displaying results (including the latency CDF and concurrency) in real time
Transparent to network services
        Uses switch port mirroring
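
As a rough illustration of per-request latency measured from mirrored traffic, the sketch below pairs each request with the next response on the same connection and then reads percentiles from the full set of latencies. The packet tuple layout, the `request_latencies` and `percentile` names, and the one-outstanding-request-per-connection simplification are all assumptions made for this example, not HCMonitor's implementation.

```python
import math
from typing import Iterable, List, Tuple

# Hypothetical record: (timestamp in microseconds, connection id, "req" or "resp"),
# as it might be observed at the mirror port next to the server.
Packet = Tuple[int, str, str]


def request_latencies(packets: Iterable[Packet]) -> List[int]:
    """Match each request to the next response on the same connection."""
    pending = {}    # connection id -> timestamp of the outstanding request
    latencies = []
    for timestamp, conn, kind in packets:
        if kind == "req":
            pending[conn] = timestamp
        elif kind == "resp" and conn in pending:
            latencies.append(timestamp - pending.pop(conn))
    return latencies


def percentile(latencies: List[int], p: float) -> int:
    """Nearest-rank percentile over every measured request (no sampling)."""
    ordered = sorted(latencies)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]
```

Because both timestamps are taken on the server side of the network, client-side queuing does not enter the measurement, and evaluating `percentile` at many values of `p` yields the latency CDF.
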
Source code:
GitHub https://github.com/acs-network/hcmonitor
Gitee https://gitee.com/acs-dcn/hcmonitor