Security researchers have discovered that a core component of the Apache Hadoop stack can cause a server’s performance to suffer when multiple instances of a single application are run on it.
The flaw affects Apache Heterogeneous Computing, the core software that powers the Hadoop server, as well as Apache Storm, Apache Spark, Apache Mesos, Apache Kafka, and Apache Cassandra.
The researchers say they were able to reproduce the issue in three instances of their Hadoop-powered cluster using Apache Storm.
Heterogeneous Computing’s main component, the “HPC-on-Demand” scheduler, is responsible for maintaining the cluster’s total CPU load.
It was running as part of Apache Storm when the flaw was discovered.
HPC-on-Demand is an automatic scheduler that runs threads from different kinds of processes stacked on top of one another in order to maintain the overall load of the cluster.
This design allows a large number of simultaneous HPC instances to run at once, and that kind of workload can cause the system to consume more CPU than the work is worth.
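To make the oversubscription effect concrete, the following minimal sketch spawns far more busy worker threads than the machine has cores and watches the reported system load climb past what the hardware can usefully absorb. It is illustrative only, not the HPC-on-Demand scheduler itself; the class name, thread multiplier, and sampling loop are arbitrary choices made for the demonstration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Illustrative sketch of CPU oversubscription, not the HPC-on-Demand scheduler.
public class OversubscriptionDemo {
    public static void main(String[] args) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        int workers = cores * 8; // deliberately oversubscribe, as stacked instances would

        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                // Busy loop standing in for one scheduled process/instance.
                long x = 0;
                while (!Thread.currentThread().isInterrupted()) {
                    x += System.nanoTime() % 7;
                }
            });
            t.setDaemon(true);
            t.start();
        }

        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        for (int i = 0; i < 10; i++) {
            Thread.sleep(1000);
            // The load average climbs well past the core count, i.e. the node
            // takes on more CPU work than it can usefully complete.
            System.out.printf("cores=%d, load=%.2f%n", cores, os.getSystemLoadAverage());
        }
    }
}
```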
But the problem lies in the HPC-on-Demand scheduler itself, and the researchers say it can easily be fixed by removing the scheduler.
The team published a detailed analysis of the bug on GitHub.
“Heterogeneous computing can only maintain the workload that it can sustain,” said Ben Shatz, one of the report’s authors.
“It is always going to take the worst case scenario, and if you remove it from the cluster, the worst-case scenario is going to be that you will see less CPU usage.”
The researchers discovered the flaw after running their HPC cluster with Apache Storm on a single instance of HPC-on-Demand.
This means that the Apache Storm application will use a large portion of the CPU already available on the cluster when it runs as many simultaneous instances.
The researchers note that Apache HPC, as well as Apache Storm, could be affected by the bug, and that HPC would still be able to maintain a consistent load despite the reduced CPU usage.
The issue can be exploited to perform denial-of-service attacks against large servers, because such schedulers are often used for workloads that require more than one worker performing tasks simultaneously.
The problem can be fixed by removing the HPC-on-Demand scheduler from Apache Storm in the Heterogeneous Computing stack.
Apache Storm can be configured to perform automatic, per-server performance checks, which are enforced by the HPC-on-Demand scheduler when it is running.
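As a hedged illustration of how an operator might bound CPU consumption without relying on the scheduler, the sketch below uses per-topology caps that Storm exposes in org.apache.storm.Config (setNumWorkers, setMaxTaskParallelism, setMaxSpoutPending). It is not the researchers’ actual mitigation, and the class name and specific limits are assumptions made for the example.

```java
import org.apache.storm.Config;

// A minimal sketch of capping a Storm topology's footprint; not the
// researchers' mitigation and not tied to the HPC-on-Demand scheduler.
public class CappedTopologyConfig {
    public static void main(String[] args) {
        Config conf = new Config();

        // Bound the number of worker JVMs, so stacked instances of a single
        // application cannot claim every core on a node.
        conf.setNumWorkers(2);

        // Bound total executor parallelism across all components.
        conf.setMaxTaskParallelism(8);

        // Throttle in-flight tuples per spout task to keep load predictable.
        conf.setMaxSpoutPending(1000);

        // In a real deployment this map would be passed to
        // StormSubmitter.submitTopology(name, conf, topology).
        System.out.println(conf);
    }
}
```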
The HPC team is expected to post patches fixing the issue in the next few days.