Technical tidbits from the sysadmin world...

Azure DevOps Self-Hosted Agent in Kubernetes with Workload Identity or Application Registration

There are more than a few blog posts about running the Azure DevOps agent in a container (and, by extension, Kubernetes). There are even a handful of examples that use an Entra Application Registration or Workload Identity, but they miss a critical step: the cleanup on exit needs a new token, because the access tokens are not long-lived!
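
To see how close a token is to expiring, you can decode the `exp` claim from the access token itself. This is only a minimal sketch, assuming the token is a standard three-part JWT with a numeric `exp` claim; `token_seconds_left` is a hypothetical helper name and is not part of the official script.

```shell
#!/usr/bin/env bash
# Print the number of seconds until a JWT expires (negative once it has expired).
# Usage: token_seconds_left <jwt> [now_epoch_seconds]
token_seconds_left() {
  local token="$1" now="${2:-$(date +%s)}" payload exp
  # The payload is the second dot-separated segment, base64url-encoded.
  payload="$(printf '%s' "$token" | cut -d. -f2 | tr '_-' '/+')"
  # Restore the '=' padding that base64url encoding strips.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  # Pull the numeric "exp" claim out of the decoded JSON.
  exp="$(printf '%s' "$payload" | base64 -d \
    | sed -n 's/.*"exp"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p')"
  echo $(( exp - now ))
}
```

Calling `token_seconds_left "$(cat /azp/.token)"` after a long build should show how little time (if any) the startup token has left.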

Both the examples below use the Dockerfile and script from the official docs as a base.
Example 1 - Use AKS Workload Identity to authenticate and register the build agent
- Create a managed identity and add it to Azure DevOps as a 'Basic' user. Give it "Read" permissions to all build agents and Admin to the pool it will use.
- Set up the standard Workload Identity configuration on the managed identity and deploy the required service account in AKS.
- Configure the script as shown below. The main difference from the official script is that it reads the Workload Identity token file and uses it to get an access token.
- To ensure the container can clean up on exit, it needs to get a new access token for the removal step. Every other example I've seen misses this: if you don't do it, the token acquired at startup has very likely expired, and the agent can't exit gracefully.
          set -e
          jwt=$(cat ${AZURE_FEDERATED_TOKEN_FILE})
          access_token=$(curl -X POST -d "scope=${scope}&grant_type=client_credentials&client_id=${AZURE_CLIENT_ID}&client_assertion_type=${clientAssertionType}&client_assertion=${jwt}"${AZURE_TENANT_ID}/oauth2/v2.0/token --http1.1 | jq '.access_token' | sed -e 's/^"//' -e 's/"$//')
          echo $access_token > /azp/.token
          if [ -z "${AZP_URL}" ]; then
            echo 1>&2 "error: missing AZP_URL environment variable"
            exit 1
          fi
          if [ -n "${AZP_WORK}" ]; then
            mkdir -p "${AZP_WORK}"
          fi
          cleanup() {
            trap "" EXIT
            if [ -e ./ ]; then
              print_header "Cleanup. Removing Azure Pipelines agent..."
              # Get a new access token, because the one acquired at startup has likely expired
              jwt_exit=$(cat "${AZURE_FEDERATED_TOKEN_FILE}")
              access_token_exit=$(curl -X POST -d "scope=${scope}&grant_type=client_credentials&client_id=${AZURE_CLIENT_ID}&client_assertion_type=${clientAssertionType}&client_assertion=${jwt_exit}"${AZURE_TENANT_ID}/oauth2/v2.0/token --http1.1 | jq '.access_token' | sed -e 's/^"//' -e 's/"$//')
              echo $access_token_exit > /azp/.token_exit
              # If the agent has some running jobs, the configuration removal process will fail.
              # So, give it some time to finish the job.
              while true; do
                ./ remove --unattended --auth "PAT" --token $(cat "/azp/.token_exit") && break
                echo "Retrying in 30 seconds..."
                sleep 30
              done
            fi
          }
          print_header() {
            echo -e "\n${lightcyan}$1${nocolor}\n"
          }
          # Let the agent ignore the token env variables
          print_header "1. Determining matching Azure Pipelines agent..."
          AZP_AGENT_PACKAGES=$(curl -LsS \
          -u user:$(cat "/azp/.token") \
          -H "Accept:application/json" \
          AZP_AGENT_PACKAGE_LATEST_URL=$(echo "${AZP_AGENT_PACKAGES}" | jq -r ".value[0].downloadUrl")
          if [ -z "${AZP_AGENT_PACKAGE_LATEST_URL}" -o "${AZP_AGENT_PACKAGE_LATEST_URL}" == "null" ]; then
            echo 1>&2 "error: could not determine a matching Azure Pipelines agent"
            echo 1>&2 "check that account '${AZP_URL}' is correct and the token is valid for that account"
            exit 1
          fi
          print_header "2. Downloading and extracting Azure Pipelines agent..."
          curl -LsS "${AZP_AGENT_PACKAGE_LATEST_URL}" | tar -xz & wait $!
          source ./
          trap "cleanup; exit 0" EXIT
          trap "cleanup; exit 130" INT
          trap "cleanup; exit 143" TERM
          print_header "3. Configuring Azure Pipelines agent..."
          ./ --unattended \
          --agent "${AZP_AGENT_NAME:-$(hostname)}" \
          --url "${AZP_URL}" \
          --auth "PAT" \
          --token $(cat "/azp/.token") \
          --pool "${AZP_POOL:-Default}" \
          --work "${AZP_WORK:-_work}" \
          --replace \
          --acceptTeeEula & wait $!
          print_header "4. Running Azure Pipelines agent..."
          chmod +x ./
          # To be aware of TERM and INT signals call ./
          # Running it with the --once flag at the end will shut down the agent after the build is executed
          ./ --once "$@" & wait $!
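
The token request used twice in the script above can be pulled into one function, so startup and cleanup share the same code. This is a rough sketch, not the script's actual implementation. The values of `scope` and `clientAssertionType` and the `login.microsoftonline.com` host are assumptions: the script above does not show how they are set, so the usual Azure DevOps resource ID and the public-cloud Entra endpoint are used here. `get_workload_identity_token` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Assumed values: not shown in the script above.
scope="499b84ac-1321-427f-aa17-267ca6975798/.default"  # Azure DevOps resource ID
clientAssertionType="urn:ietf:params:oauth:client-assertion-type:jwt-bearer"

# Exchange the projected service account token for an Entra access token.
get_workload_identity_token() {
  local jwt
  # Re-read the file on every call, because the projected token is rotated.
  jwt="$(cat "${AZURE_FEDERATED_TOKEN_FILE}")"
  # --data-urlencode encodes each value and implies a POST request.
  curl -sS \
    --data-urlencode "scope=${scope}" \
    --data-urlencode "grant_type=client_credentials" \
    --data-urlencode "client_id=${AZURE_CLIENT_ID}" \
    --data-urlencode "client_assertion_type=${clientAssertionType}" \
    --data-urlencode "client_assertion=${jwt}" \
    "https://login.microsoftonline.com/${AZURE_TENANT_ID}/oauth2/v2.0/token" \
    | jq -r '.access_token'
}
```

With this in place, startup would run `get_workload_identity_token > /azp/.token` and the cleanup function would run `get_workload_identity_token > /azp/.token_exit`, which guarantees the exit path never reuses the startup token.
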
Example 2 - Use an Application Registration to authenticate and register the build agent

- Create an Entra Application Registration and add it to Azure DevOps as a 'Basic' user. Give it "Read" permissions to all build agents and Admin to the pool it will use.
- Generate a client secret for the Application Registration.
- Configure the script as shown below. The main difference from the official script is that it takes the App Registration details and uses them to get an access token.
- To ensure the container can clean up on exit, it needs to get a **new** access token for the removal step. Every other example I've seen misses this: if you don't do it, the token acquired at startup has very likely expired, and the agent can't exit gracefully.
              set -e
              if [ -z "${AZP_URL}" ]; then
                echo 1>&2 "error: missing AZP_URL environment variable"
                exit 1
              fi
              if [ -n "$AZP_CLIENTID" ]; then
                AZP_TOKEN=$(curl -X POST -d "grant_type=client_credentials&client_id=$AZP_CLIENTID&client_secret=$AZP_CLIENTSECRET&resource=$resource"$AZP_TENANTID/oauth2/token | jq -r '.access_token')
              fi
              if [ -z "$AZP_TOKEN_FILE" ]; then
                if [ -z "$AZP_TOKEN" ]; then
                  echo 1>&2 "error: missing AZP_TOKEN environment variable"
                  exit 1
                fi
                # The rest of the script reads the token from /azp/.token
                AZP_TOKEN_FILE=/azp/.token
                echo -n $AZP_TOKEN > "$AZP_TOKEN_FILE"
              fi
              unset AZP_TOKEN
              if [ -n "${AZP_WORK}" ]; then
                mkdir -p "${AZP_WORK}"
              fi
              cleanup() {
                trap "" EXIT
                if [ -e ./ ]; then
                  print_header "Cleanup. Removing Azure Pipelines agent..."
                  # If the agent has some running jobs, the configuration removal process will fail.
                  # So, give it some time to finish the job.
                  while true; do
                    # Generate a new access token on every attempt
                    AZP_TOKEN_EXIT=$(curl -X POST -d "grant_type=client_credentials&client_id=$AZP_CLIENTID&client_secret=$AZP_CLIENTSECRET&resource=$resource"$AZP_TENANTID/oauth2/token | jq -r '.access_token')
                    echo -n $AZP_TOKEN_EXIT > /azp/.token_exit
                    ./ remove --unattended --auth "PAT" --token $(cat "/azp/.token_exit") && break
                    echo "Retrying in 30 seconds..."
                    sleep 30
                  done
                fi
              }
              print_header() {
                echo -e "\n${lightcyan}$1${nocolor}\n"
              }
              # Let the agent ignore the token env variables
              print_header "1. Determining matching Azure Pipelines agent..."
              AZP_AGENT_PACKAGES=$(curl -LsS \
              -u user:$(cat "/azp/.token") \
              -H "Accept:application/json" \
              AZP_AGENT_PACKAGE_LATEST_URL=$(echo "${AZP_AGENT_PACKAGES}" | jq -r ".value[0].downloadUrl")
              if [ -z "${AZP_AGENT_PACKAGE_LATEST_URL}" -o "${AZP_AGENT_PACKAGE_LATEST_URL}" == "null" ]; then
                echo 1>&2 "error: could not determine a matching Azure Pipelines agent"
                echo 1>&2 "check that account '${AZP_URL}' is correct and the token is valid for that account"
                exit 1
              fi
              print_header "2. Downloading and extracting Azure Pipelines agent..."
              curl -LsS "${AZP_AGENT_PACKAGE_LATEST_URL}" | tar -xz & wait $!
              source ./
              trap "cleanup; exit 0" EXIT
              trap "cleanup; exit 130" INT
              trap "cleanup; exit 143" TERM
              print_header "3. Configuring Azure Pipelines agent..."
              ./ --unattended \
              --agent "${AZP_AGENT_NAME:-$(hostname)}" \
              --url "${AZP_URL}" \
              --auth "PAT" \
              --token $(cat "/azp/.token") \
              --pool "${AZP_POOL:-Default}" \
              --work "${AZP_WORK:-_work}" \
              --replace \
              --acceptTeeEula & wait $!
              print_header "4. Running Azure Pipelines agent..."
              chmod +x ./
              # To be aware of TERM and INT signals call ./
              # Running it with the --once flag at the end will shut down the agent after the build is executed
              ./ --once "$@" & wait $!
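
The important detail in this cleanup loop is that the token is requested inside the loop, so every removal attempt uses a fresh one, even if a long-running job keeps the loop retrying past the token lifetime. That pattern can be sketched on its own as below. `retry_with_fresh_token` and its arguments are hypothetical names for illustration; the caller supplies the function that fetches a token and the function that uses it.

```shell
#!/usr/bin/env bash
# Run "<action> <token>" until it succeeds, fetching a new token for each attempt.
# Usage: retry_with_fresh_token <fetch_token_fn> <action_fn> [delay_seconds]
retry_with_fresh_token() {
  local fetch_token="$1" action="$2" delay="${3:-30}" token
  while true; do
    # Fetch inside the loop so a long wait never leaves us with a stale token.
    if token="$("$fetch_token")" && "$action" "$token"; then
      break
    fi
    echo "Retrying in ${delay} seconds..." >&2
    sleep "$delay"
  done
}
```

In the script above, the fetch function would wrap the `curl ... | jq -r '.access_token'` call and the action function would wrap the `remove --unattended --auth "PAT" --token` call, taking the token as its first argument.
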
Running in Kubernetes
The best way to run this in Kubernetes is as a StatefulSet, which gives the agent pods static names. When the container exits after a build, Kubernetes replaces it.
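
As a starting point, a StatefulSet for the Workload Identity example might look something like the sketch below. Every name in it (`azp-agent`, the image, the organisation URL, the pool, the grace period) is a placeholder I've assumed for illustration, and it presumes a service account already annotated for Workload Identity and a matching headless service. `render_agent_statefulset` is a hypothetical helper that simply prints the manifest.

```shell
#!/usr/bin/env bash
# Print a minimal StatefulSet manifest; all names and values are placeholders.
render_agent_statefulset() {
  cat <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: azp-agent
spec:
  serviceName: azp-agent
  replicas: 2
  selector:
    matchLabels:
      app: azp-agent
  template:
    metadata:
      labels:
        app: azp-agent
        # Asks the Workload Identity webhook to inject the AZURE_* variables.
        azure.workload.identity/use: "true"
    spec:
      serviceAccountName: azp-agent
      # Leave time for the cleanup trap to fetch a new token and deregister.
      terminationGracePeriodSeconds: 120
      containers:
        - name: agent
          image: registry.example.com/azp-agent:latest
          env:
            - name: AZP_URL
              value: https://dev.azure.com/your-organisation
            - name: AZP_POOL
              value: your-pool
EOF
}
```

Apply it with `render_agent_statefulset | kubectl apply -f -`. Because the script falls back to `$(hostname)` for the agent name, the agents register under the pod names (`azp-agent-0`, `azp-agent-1`), which stay the same across restarts.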