Introduction
On August 31, 2024, an independent security researcher demonstrated a successful supply chain attack on Azure Karpenter Provider, an open-source project maintained by Microsoft. The attack exploited a vulnerable GitHub Actions workflow. The researcher gained access to the workflow's GITHUB_TOKEN, which had the "id-token: write" permission. That permission lets a workflow request an OpenID Connect (OIDC) token, so the stolen credentials could have been used to access the cloud resources that the workflow had access to.
Because Azure Karpenter Provider was using StepSecurity Harden-Runner, the attack was detected in real time. StepSecurity Harden-Runner provides network egress control and CI/CD infrastructure security for GitHub-hosted and self-hosted environments.
StepSecurity reported the detection to the Microsoft Security Response Center (MSRC) within an hour of the exploit. We are proud to share that StepSecurity has been acknowledged in the MSRC acknowledgment portal for detecting and reporting this issue. The portal recognizes individuals and companies who have contributed to making Microsoft’s online services safer by privately disclosing and assisting in remediating security vulnerabilities.
For an executive summary, check out the video below.
What was the vulnerability?
A GitHub Actions workflow in the Microsoft Azure Karpenter Provider project was misconfigured to download and run untrusted code in CI/CD with elevated permissions. Any GitHub user could have exploited this vulnerability to steal the workflow’s CI/CD credentials.
The vulnerability was in the e2e.yml reusable workflow, which checks out code using an explicit ref provided via an input parameter, and runs in a privileged context.
This ref comes from an artifact created in a previous workflow. The resolve-args.yml reusable workflow is responsible for extracting the ref from the artifact.
The ApprovalComment workflow stores the commit ID in the artifact. This commit is later used to check out the code, which allows an attacker’s code to run in the privileged e2e.yml workflow.
The privileged e2e.yml workflow is triggered automatically when the ApprovalComment workflow completes successfully, as defined in the E2EMatrixTrigger workflow.
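The core problem in this chain is that a privileged job trusts a ref read from an artifact that a less trusted workflow produced. As a minimal sketch of treating that value as untrusted input, the Python below checks the ref before it is used for a checkout. It is illustrative only and is not the fix the maintainers applied. The function name `validate_artifact_ref`, the single-line artifact format, and the `trusted_commits` set (for example, commits already reviewed by a maintainer) are all assumptions.

```python
from __future__ import annotations

import re

# A full-length Git commit SHA-1: exactly 40 lowercase hex characters.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def validate_artifact_ref(raw: str, trusted_commits: set[str]) -> str:
    """Return the ref only if it is a full SHA that is on the trusted list.

    `raw` is the untrusted text read from the artifact (hypothetical format:
    one line containing a commit SHA). `trusted_commits` is a hypothetical
    set of commits that a maintainer has already approved.
    """
    ref = raw.strip()
    if not FULL_SHA.fullmatch(ref):
        raise ValueError("artifact ref is not a full commit SHA")
    if ref not in trusted_commits:
        raise ValueError("artifact ref is not an approved commit")
    return ref
```

A privileged job built this way would fail closed: an artifact carrying an unexpected commit ID raises an error before any checkout happens.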
How was the vulnerability exploited?
The exploit began when the researcher created a pull request from their fork that modified the ApprovalComment workflow to store a commit ID of their choice in the artifact.
In the malicious commit, which was part of the pull request, the researcher added a command that runs a malicious script hosted on a gist.
The modified ApprovalComment workflow saved the malicious commit information in the artifact. You can see this in this build log.
Finally, the vulnerable workflow checks out and runs code from the researcher’s forked repository. The malicious script, inserted earlier, is executed, demonstrating arbitrary code execution in a privileged context, as seen in this build log.
The researcher’s exploit code (the poc.sh file from the gist) is designed to exfiltrate credentials from the privileged workflow. The YOUR_EXFIL variable is set to an external domain, where the extracted data will be sent.
What could have happened in a real malicious attack?
This could have caused a Codecov-style software supply chain attack. An adversary could have exfiltrated CI/CD secrets to access the cloud environments that this workflow had access to.
How did StepSecurity detect the attack?
All workflows in this repository had been configured to use Harden-Runner in audit mode since January 2024.
This is the egress traffic from a non-malicious run of the e2e.yml workflow. As expected, the only outbound calls are made to api.github.com in the commit-status/start step, which are marked as allowed.
When the exploit occurred, Harden-Runner detected suspicious outbound traffic. As you can see, there are now anomalous calls to gist.githubusercontent.com and the exfiltration domain n6uw1ivl7vxwuf43sas4hjs17sdj19py.oastify.com.
These calls were flagged as anomalous because they had never been observed in previous runs and were not part of the baseline. The call to gist was part of the researcher’s modified action, and the exfiltration domain was used to steal secrets from the workflow.
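The idea of flagging endpoints that are absent from a baseline can be sketched in a few lines of Python. This is a simplified illustration of baseline comparison, not Harden-Runner's actual detection logic. The function name `find_anomalous_endpoints` and the data shapes are assumptions; the domain names are the ones mentioned above.

```python
def find_anomalous_endpoints(baseline_runs, current_run):
    """Return endpoints in the current run that no earlier run contacted.

    `baseline_runs` is an iterable of endpoint collections, one per
    previous run; `current_run` is the collection for the run under review.
    """
    baseline = set()
    for run in baseline_runs:
        baseline.update(run)
    return sorted(set(current_run) - baseline)


# Hypothetical history: earlier runs only called api.github.com.
previous = [{"api.github.com"}, {"api.github.com"}]
observed = [
    "api.github.com",
    "gist.githubusercontent.com",
    "n6uw1ivl7vxwuf43sas4hjs17sdj19py.oastify.com",
]
print(find_anomalous_endpoints(previous, observed))
# ['gist.githubusercontent.com', 'n6uw1ivl7vxwuf43sas4hjs17sdj19py.oastify.com']
```

A plain set difference is enough here because the baseline for this workflow is small and stable; a workflow with more variable egress would need a larger history before new endpoints are meaningful.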
This is the StepSecurity Insights page for the malicious run:
Remediation
To remediate the vulnerability, the developers first removed the ApprovalComment trigger from the workflow configuration. This was the trigger that allowed the exploit to run once the ApprovalComment workflow completed.
Additionally, the developers configured Harden-Runner to operate in block mode instead of audit mode. In audit mode, outbound traffic is recorded but allowed through. In block mode, outbound traffic to endpoints that are not explicitly allowed is blocked, which prevents exfiltration attempts like the one seen in the exploit.
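The practical difference between the two modes can be modeled with a short Python sketch. It is a simplified model under stated assumptions, not Harden-Runner's implementation: `EgressDecision`, `evaluate_egress`, and the exact-match allowlist are hypothetical names and simplifications chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressDecision:
    endpoint: str
    permitted: bool  # does the connection go through?
    flagged: bool    # is the endpoint outside the allowlist?


def evaluate_egress(endpoint, allowed_endpoints, mode):
    """Decide what happens to one outbound connection.

    In "audit" mode every connection is permitted, and endpoints outside
    the allowlist are only flagged. In "block" mode those same endpoints
    are flagged and denied.
    """
    if mode not in ("audit", "block"):
        raise ValueError("mode must be 'audit' or 'block'")
    on_allowlist = endpoint in allowed_endpoints
    permitted = on_allowlist or mode == "audit"
    return EgressDecision(endpoint, permitted, flagged=not on_allowlist)
```

With an allowlist of `{"api.github.com"}`, a call to `gist.githubusercontent.com` is flagged in both modes, but it is permitted in audit mode and denied in block mode. That is why the exploit was detected but not stopped.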
Conclusion
A security researcher demonstrated a supply chain attack on Azure Karpenter Provider, detected by StepSecurity Harden-Runner, highlighting the need to secure CI/CD pipelines from emerging threats. We commend Azure Karpenter Provider maintainers for their vigilance and use of StepSecurity to protect workflows. Kudos to the researcher for ethically testing vulnerabilities, raising CI/CD security awareness, and enhancing the open-source ecosystem.