How to Install Apache Airflow on Windows: A Beginner’s Guide to WSL

Tolulade Ademisoye
6 min read · Jun 3, 2024


Data Orchestration & Workflows

I’m writing this installation guide to help rookie engineers start their journey into Apache Airflow, just like I did. After stalling for a long time, I finally installed Airflow on Windows, and I want to share my experience and the resources I used during the setup.


What is Apache Airflow?

Apache Airflow is an open-source platform built on Python. It enables you to schedule and monitor batch-oriented workflows in your project or company. The emphasis here is on the word “batch.”

Current Limitations on Windows

Apache Airflow currently has limited support for Windows: it is built for POSIX systems such as Linux and macOS, so installing it natively on Windows can be challenging. Running it in Docker is a known workaround, but that still means running Airflow inside Linux rather than directly on Windows.

When I tried to install Airflow directly using Windows PowerShell, I encountered a peculiar issue: the `airflow.cfg` file, which should appear after installation, didn’t show up in my Windows folder. Despite multiple attempts (around 2–4 times), I had the same result each time.

This error eventually led me to use a Linux environment for the installation, which worked well.

Try Semis today — Get a tech mentor or become one

Steps in Installing Apache Airflow on Windows

This guide provides two methods for installing Apache Airflow on Windows. The first method involves installing it directly from your Command Prompt or PowerShell, and the second method uses the Windows Subsystem for Linux (WSL). The second method worked for me.

Installing Directly on Windows

While this method doesn’t currently work well on Windows, I’ve included it here in case future updates improve the process.

Please note: If this direct approach doesn’t work, follow these steps to clean up:

1. Delete the Apache folder from your PC.
2. Delete the environment variable for Apache:
- Open the Windows Start menu.
- Type “Edit environment variables” and open the matching result.
- Look for the entry you created for Airflow (such as AIRFLOW_HOME) and delete it.

1. Direct Installation Steps:

Open Command Prompt or Windows PowerShell on your computer.

Navigate to the directory where you want to install Airflow. You can use cd .. to move up one level, or cd \ to go to the root of the current drive.

For example, to install in Program Files, create a folder there (this guide uses one named Apache_Airflow) and navigate into it. Writing to Program Files usually requires a terminal opened as administrator:

cd "C:\Program Files\Apache_Airflow"

To make sure pip is up to date, run:

python -m pip install --upgrade pip

Create a virtual environment using conda or venv. This keeps Airflow’s dependencies separate from the rest of your system:

python -m venv myenv

Navigate into the new environment’s directory (e.g., if you created myenv inside C:\Program Files\Apache_Airflow, run):

cd "C:\Program Files\Apache_Airflow\myenv"

Activate the virtual environment (if using the Command Prompt terminal):

Scripts\activate

Activate the virtual environment (if using Windows PowerShell):

.\Scripts\Activate.ps1

When activating the virtual environment in PowerShell, you might encounter a permission error caused by PowerShell’s execution policy. To relax the policy temporarily, for the current PowerShell session only, run the following command and then activate again:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope Process


After activating your virtual environment, your command prompt/Shell will change to indicate that the virtual environment is active. To deactivate it after your work, simply run:

deactivate
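Besides the changed prompt, Python itself can report whether a virtual environment is active. The sketch below is a minimal check, assuming the environment was created with venv (conda environments behave differently); the function name in_virtualenv is my own and not part of any tool.

```python
import sys


def in_virtualenv(prefix: str = sys.prefix, base_prefix: str = sys.base_prefix) -> bool:
    """Return True when the interpreter is running from a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the Python it was created from.
    return prefix != base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Run it with the same python command you plan to use for pip, so the answer applies to the interpreter that will receive Airflow.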

Set up the Airflow home directory in Windows. This directory will be used to store logs, configuration files, and the Airflow SQLite database. Create a directory and point the AIRFLOW_HOME environment variable at it. For example:

mkdir airflow

$env:AIRFLOW_HOME="C:\Program Files\Apache_Airflow\myenv\airflow"

This command sets the `AIRFLOW_HOME` environment variable to the path `C:\Program Files\Apache_Airflow\myenv\airflow` for the current PowerShell session only. If you want to make this change permanent for the current user account, use the [System.Environment]::SetEnvironmentVariable method instead:

[System.Environment]::SetEnvironmentVariable("AIRFLOW_HOME", "C:\Program Files\Apache_Airflow\myenv\airflow", "User")

The permanent setting is picked up by terminals you open afterwards, not by the one that is already running.

To confirm that the environment variable is set, you can run the following command:

Get-ChildItem Env: | Where-Object {$_.Name -eq 'AIRFLOW_HOME'}

or

echo $env:AIRFLOW_HOME
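The same check can be done from Python, which is handy because it shows what the interpreter inside your virtual environment actually sees. This is a small sketch, assuming Airflow’s documented default of ~/airflow when the variable is unset; resolve_airflow_home is a hypothetical helper name.

```python
import os
from pathlib import Path


def resolve_airflow_home(env=None) -> Path:
    """Return AIRFLOW_HOME if it is set, otherwise the default ~/airflow."""
    env = os.environ if env is None else env
    value = env.get("AIRFLOW_HOME")
    # An unset or empty variable falls back to the default location.
    return Path(value).expanduser() if value else Path.home() / "airflow"


if __name__ == "__main__":
    print(resolve_airflow_home())
```

If the printed path is not the folder you created, the variable was probably set in a different terminal session.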

Next, install Apache Airflow.

With the virtual environment activated, install Airflow into it:

pip install apache-airflow

This command will download and install the latest version of Apache Airflow along with its dependencies. The installation process may take a few minutes to complete.
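Installing the latest version without pins can occasionally pull in dependency versions that do not work together. The Airflow project publishes constraints files for each release to avoid this. The sketch below only builds the pip arguments for a pinned install; the version 2.9.1 is an example I picked, and pip_install_args is a hypothetical helper, so substitute the release and Python version you actually use.

```python
import sys

# URL pattern of the constraints files published in the apache/airflow repository.
CONSTRAINTS_URL = (
    "https://raw.githubusercontent.com/apache/airflow/"
    "constraints-{airflow}/constraints-{python}.txt"
)


def pip_install_args(airflow_version: str, python_version: str) -> list[str]:
    """Build pip arguments that pin Airflow and apply its constraints file."""
    url = CONSTRAINTS_URL.format(airflow=airflow_version, python=python_version)
    return ["install", f"apache-airflow=={airflow_version}", "--constraint", url]


if __name__ == "__main__":
    py = f"{sys.version_info.major}.{sys.version_info.minor}"
    # 2.9.1 is only an example release; replace it with the one you want.
    print("pip " + " ".join(pip_install_args("2.9.1", py)))
```

The script prints the command rather than running it, so you can review it before pasting it into the terminal.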

If successful, you should see an output indicating the successful installation.

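Next, initialise the Airflow database.

Before starting Airflow, you need to initialise the backend database by running:

airflow db init

On Airflow 2.7 and later, this command is deprecated in favour of airflow db migrate, so use that instead if airflow db init prints a deprecation warning or is not recognised.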

This command should be run only once when you’re setting up Apache Airflow for the first time or after resetting the database.
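Since a missing airflow.cfg was the symptom I hit on Windows, it helps to check the home directory right after initialisation. This is a rough sketch, assuming Airflow 2.x with the default SQLite backend (which is where airflow.db comes from); missing_airflow_files is a hypothetical helper name.

```python
from pathlib import Path


def missing_airflow_files(home: Path) -> list[str]:
    """List the expected files that are absent from the Airflow home directory."""
    # airflow.db only exists when the default SQLite backend is in use.
    expected = ["airflow.cfg", "airflow.db"]
    return [name for name in expected if not (home / name).is_file()]


if __name__ == "__main__":
    home = Path.home() / "airflow"  # adjust if AIRFLOW_HOME points elsewhere
    print("missing:", missing_airflow_files(home))
```

An empty list suggests the initialisation wrote its files where you expected them.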

If the initialisation doesn’t work for you, as happened in my case, switch to WSL (Linux), which I cover in the second method below.

If the database initialisation worked, the next step is to start the Airflow web server:

airflow webserver -p 8080

This command will start the web server on port 8080, allowing you to access the Airflow user interface through your web browser. You can replace 8080 with the desired port number if needed.
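If the page does not load in the browser, a quick way to tell whether anything is listening on the port is a plain TCP connection attempt. The sketch below assumes the web server runs on the same machine on port 8080; is_port_open is a hypothetical helper name.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("web server reachable:", is_port_open("127.0.0.1", 8080))
```

A result of False before you start the web server is also useful: it tells you the port is free and no other program is already using it.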

Next, open another terminal, activate the virtual environment (the command below is for Command Prompt), and start the scheduler:

"C:\Program Files\Apache_Airflow\myenv\Scripts\activate"
airflow scheduler

When you are done working, deactivate the virtual environment:

deactivate

The virtual environment (venv) ensures that your project dependencies are isolated and managed effectively.

2. Installing via WSL on Windows (this worked for me)

Before proceeding, ensure that WSL is installed on your Windows machine. You can use PowerShell to install it. Visit the Microsoft guide for more details: Microsoft WSL Installation Guide.

You can run the following command in PowerShell or cmd to install WSL:

wsl --install

Enable virtualization:

Next, enable virtualization in your PC’s BIOS. Restart your computer, enter the BIOS setup, and look under the Advanced, CPU Configuration, or Security tab for Intel Virtualization Technology (VT-x), AMD-V, or SVM Mode, then enable it.

Enable Virtual Machine Platform in Windows:

  • Open the Windows Start menu and search for “Turn Windows features on or off.”
  • In the list of features, locate “Virtual Machine Platform” and “Windows Subsystem for Linux.”
  • Check the box next to each of them. Hyper-V itself is not required for WSL 2 and is not available on Windows Home editions.
  • Click “OK” and restart your computer for the changes to take effect.
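Once a Linux terminal is open, you can confirm that it really is running under WSL rather than some other Linux setup. The check below is a heuristic sketch, assuming the WSL kernel release string contains the word “microsoft”, which is the case for the WSL 1 and WSL 2 kernels I know of; is_wsl is a hypothetical helper name.

```python
import platform


def is_wsl(release: str = "") -> bool:
    """Guess whether a kernel release string belongs to WSL."""
    # Fall back to the running kernel when no string is supplied.
    release = release or platform.uname().release
    return "microsoft" in release.lower()


if __name__ == "__main__":
    print("running under WSL:", is_wsl())
```

Run it with python3 inside Ubuntu; on native Windows or a regular Linux machine it should report False.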


Setting Up Apache Airflow in WSL

After completing the virtualization steps above, open Ubuntu from the Start menu and run the following commands one after the other in the Linux terminal.

sudo apt update   # update the package list

sudo apt install python3 python3-venv python3-pip   # install Python, venv support and pip

python3 -m venv airflow_venv   # create the virtual environment

source airflow_venv/bin/activate   # activate the virtual environment

Set the AIRFLOW_HOME environment variable:

export AIRFLOW_HOME=~/airflow

To make the AIRFLOW_HOME environment variable persistent, you can add it to your .bashrc file. Here’s how:

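Open the file in an editor:

nano ~/.bashrc

Paste the following line at the end of the file and save it:

export AIRFLOW_HOME=~/airflow

The setting takes effect in new terminals. Run source ~/.bashrc to apply it to the current one.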
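If you repeat the setup a few times, as I did, it is easy to end up with the same export line pasted into .bashrc more than once. The sketch below appends the line only when it is not already present. It demonstrates on a throwaway file; ensure_line is a hypothetical helper, and pointing it at your real .bashrc is left to you.

```python
import tempfile
from pathlib import Path

EXPORT_LINE = "export AIRFLOW_HOME=~/airflow"


def ensure_line(rc_file: Path, line: str = EXPORT_LINE) -> bool:
    """Append line to rc_file unless it is already there; return True if appended."""
    existing = rc_file.read_text().splitlines() if rc_file.exists() else []
    if line in existing:
        return False
    with rc_file.open("a") as handle:
        # Start on a fresh line in case the file does not end with a newline.
        handle.write(("\n" if existing else "") + line + "\n")
    return True


if __name__ == "__main__":
    # Throwaway demo file; use Path.home() / ".bashrc" for real use.
    with tempfile.TemporaryDirectory() as tmp:
        rc = Path(tmp) / "bashrc"
        print(ensure_line(rc), ensure_line(rc))
```

The second call returns without writing anything, which is what makes the helper safe to run repeatedly.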

Then install Apache Airflow:

pip install apache-airflow   # install Airflow in the virtual environment

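Initialise the Airflow database:

airflow db init

Start the web server:

airflow webserver -p 8080

Then, in a second terminal with the virtual environment activated, start the scheduler:

airflow scheduler

Alternatively, for testing purposes, run the single command below instead of airflow db init, airflow webserver, and airflow scheduler. It initialises the database and starts the components together:

airflow standalone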

I hope these guides are helpful! Don’t forget to subscribe to my Medium for more tech guides. If you’re looking for a tech mentor or want to become one, check out Semis. We offer great mentorship opportunities.


References:

Written with support from AI tools: OpenAI ChatGPT, Google Gemini, and Anthropic Claude.


Tolulade Ademisoye

I build enterprise AI & data for the world at Reispar Technologies