Deprecation of Multi-Step Web Tests

We have relied on Azure multi-step web tests for quite some time. They let us track the availability of our services and notify us quickly of any downtime by alerting our action groups and texting us during a production outage. Microsoft has announced that multi-step web tests will be deprecated on August 31, 2024. Its recommended replacement is to run the tests on any compute you want (I use Azure Functions in this blog post), in conjunction with TrackAvailability.

Our Solution (Azure Functions)

To keep our multi-step web tests and write them in C#, we went with a timer-triggered Azure function that calls the services under test and writes their availability results to a separate Application Insights instance from the one our function app uses. This lets us create separate action groups that monitor availability test results and alert on them, while still collecting Application Insights data on the function app itself, which also hosts functions other than the availability tests.

This requires two separate telemetry setups in our function app: one for insights on the function app itself, and one for writing availability results. I demonstrate how we did that below.

Code Examples

We start with one function per availability test. In this case, we are testing an addresses controller on one of our web APIs. The shared test logic lives in BaseAvailabilityTestFunction, and the authentication work lives in BaseApiServiceTest.

Availability Test

using System.Threading.Tasks;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;
using AzureFunction.Shared.AvailabilityTests.Tests;

namespace AzureFunction.Shared.AvailabilityTests.Functions
{
    public class AvailabilityTestsAddressesControllerFunction : BaseAvailabilityTestFunction
    {
        [FunctionName("ApiServiceAddressesControllerAvailabilityTest")]
        public async Task Run(
            [TimerTrigger("%AvailabilityTests:AvailabilityTestsTimerInterval%", RunOnStartup = true)] TimerInfo myTimer,
            ILogger log,
            ExecutionContext executionContext)
        {
            var testName = "API Service - Addresses Controller";
            await ExecuteAvailabilityTest(new AddressesControllerAvailabilityTest(), testName, log);
        }
    }
}

Base Availability Test Function

BaseAvailabilityTestFunction is where we keep the logic shared by all availability tests: test timing, Polly retries, logging, and writing telemetry with our telemetry client (more on the telemetry client later in this post). In our case, we want availability tests to retry automatically for a while, using Polly, to rule out transient network errors, which unfortunately can happen given our setup.

using Polly;
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.Extensions.Logging;
using AzureFunction.Shared.AvailabilityTests.Tests;

namespace AzureFunction.Shared.AvailabilityTests.Functions
{
    public class BaseAvailabilityTestFunction
    {
        private AvailabilityTelemetry GetAvailabilityTelemetry(string availabilityName)
        {
            string location = Environment.GetEnvironmentVariable("AvailabilityTests:RegionName");

            var availability = new AvailabilityTelemetry
            {
                Name = availabilityName,
                RunLocation = location,
                Success = false
            };
            return availability;
        }

        public async Task ExecuteAvailabilityTest(BaseApiServiceTest ApiServiceTest, string testName, ILogger log)
        {
            Stopwatch stopwatch = new Stopwatch();
            AvailabilityTelemetry availability = GetAvailabilityTelemetry(testName);

            //retry up to 15 times, waiting 10 seconds between retries, to rule out brief network blips
            var retryPolicy = Policy
                .Handle<Exception>()
                .WaitAndRetryAsync(15, i => TimeSpan.FromSeconds(10), (exception, timeSpan, retryCount, context) =>
                {
                    log.LogError($"Failed calling {testName} - Availability Test - Retry Count: {retryCount} Exception: {exception.Message}");
                });

            try
            {
                await retryPolicy.ExecuteAsync(async () =>
                {
                    using var activity = new Activity("AvailabilityContext");
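                    // Start() is a no-op while the stopwatch is already running, so after a
                    // failed attempt the measured duration also covers the retries and waits.
                    stopwatch.Start();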
                    activity.Start();
                    availability.Context.Operation.Id = Activity.Current.RootId;
                    await ApiServiceTest.RunTest(
                        AvailabilityTestsTelemetryClientFactory.AvailabilityTestsTelemetryClient,
                        availability.Id, Activity.Current.RootId);
                    stopwatch.Stop();
                    availability.Success = true;
                    log.LogInformation(
                        $"{testName} - Availability Test Run completed successfully at: {DateTime.Now}");
                });
            }
            catch (Exception ex)
            {
                log.LogError(ex, $"{testName} - Availability Test Run failed at: {DateTime.Now}");
                availability.Success = false;
                availability.Message = ex.Message;
                throw;
            }
            finally
            {
                availability.Duration = stopwatch.Elapsed;
                availability.Timestamp = DateTimeOffset.UtcNow;
                AvailabilityTestsTelemetryClientFactory.AvailabilityTestsTelemetryClient.TrackAvailability(availability);
                AvailabilityTestsTelemetryClientFactory.AvailabilityTestsTelemetryClient.Flush();
            }
        }
    }
}
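
Because the retry policy above keeps a failing test alive for a while, it can help to work out how long a single run may take before choosing a timer interval. The following is a minimal sketch of that arithmetic and is not part of our function app: RetryBudget, WorstCase, and the perAttemptTimeout parameter are hypothetical names introduced only for illustration.

```csharp
using System;

// Hypothetical helper, not part of the original function app.
public static class RetryBudget
{
    // Worst-case wall-clock time of one test run when every attempt fails:
    // (retryCount + 1) attempts, with a fixed wait before each retry.
    public static TimeSpan WorstCase(int retryCount, TimeSpan waitBetweenRetries, TimeSpan perAttemptTimeout)
    {
        int attempts = retryCount + 1; // the initial attempt plus each retry
        long waitTicks = waitBetweenRetries.Ticks * retryCount;
        long attemptTicks = perAttemptTimeout.Ticks * attempts;
        return TimeSpan.FromTicks(waitTicks + attemptTicks);
    }
}
```

With the values used above (15 retries, 10 seconds apart), RetryBudget.WorstCase(15, TimeSpan.FromSeconds(10), TimeSpan.Zero) gives 150 seconds for the waits alone. Passing a per-attempt bound adds the time the attempts themselves can take.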

Validating Test Runs and Writing Telemetry

This is what the actual test code looks like. It obtains a bearer token, uses that token in the subsequent API call, and finally validates that the API returned what we expect.

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
using Microsoft.ApplicationInsights;

namespace AzureFunction.Shared.AvailabilityTests.Tests
{
    class AddressesControllerAvailabilityTest : BaseApiServiceTest
    {
        public override async Task RunTest(TelemetryClient telemetryClient, string parentId, string operationId)
        {
            var bearerToken = await GetBearerTokenAsync(telemetryClient, parentId, operationId);

            using var httpClient = new HttpClient();
            
            string endpoint = Environment.GetEnvironmentVariable("AvailabilityTests:AddressApiEndpoint");

            httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", bearerToken);

            httpClient.DefaultRequestHeaders.Add("ApiKey",Environment.GetEnvironmentVariable("AvailabilityTests:ApiServiceApiKey"));
            httpClient.DefaultRequestHeaders.Add("Accept", "application/vnd.ApiService.addresses.v1+json");

            var controllerResponse = await ExecuteHttpGetRequest(telemetryClient, parentId, operationId, httpClient, endpoint);

            var controllerResponseContents = await controllerResponse.Content.ReadAsStringAsync();

            ValidateControllerResponse(Environment.GetEnvironmentVariable("AvailabilityTests:AddressesControllerAvailabilityTestExpectedResponse"), controllerResponseContents);
        }
    }
}
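
The test above reads all of its configuration from environment variables, and Environment.GetEnvironmentVariable returns null when a variable is missing, so a typo in a setting name only surfaces later as a less obvious error. One option is a small guard that fails fast and names the missing setting. This is a sketch under that assumption: the Settings class and its Required method are hypothetical and are not part of the code in this post.

```csharp
using System;

// Hypothetical helper, not part of the original function app.
public static class Settings
{
    // Returns the value of the named environment variable,
    // or throws with the setting name if it is missing or blank.
    public static string Required(string name)
    {
        string value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException($"Missing required setting: {name}");
        }
        return value;
    }
}
```

In the test, a call such as Settings.Required("AvailabilityTests:AddressApiEndpoint") could then stand in for the direct Environment.GetEnvironmentVariable call.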

Base API Service Test

This class holds the shared API service test logic: obtaining bearer tokens, writing telemetry data, and setting correlation IDs and parent IDs correctly, so that availability test results look right in Application Insights and include the dependency data for each test.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.DataContracts;
using Newtonsoft.Json;

namespace AzureFunction.Shared.AvailabilityTests.Tests
{
    public abstract class BaseApiServiceTest
    {
        private const string GrantType = "password";
        private const string Scope = "ApiServiceStore";

        private readonly string _identityServerAuthorizeEndpoint = Environment.GetEnvironmentVariable("AvailabilityTests:IdentityServerAuthorizeEndpoint");
        private readonly string _identityServerAuthorizationHeaderBasicValue = Environment.GetEnvironmentVariable("AvailabilityTests:IdentityServerAuthorizationHeaderBasicValue");
        
        public abstract Task RunTest(TelemetryClient telemetryClient, string parentId, string operationId);

        protected void ValidateControllerResponse(string expectedControllerResponseContent,
            string customerResponseContents)
        {
            if (!customerResponseContents.Contains(expectedControllerResponseContent))
            {
                throw new Exception($"Response did not contain expected string: {expectedControllerResponseContent}");
            }
        }

        public async Task<string> GetBearerTokenAsync(TelemetryClient telemetryClient, string parentId, string operationId)
        {
            string username = Environment.GetEnvironmentVariable("AvailabilityTests:IdentityServerUsername");
            string password = Environment.GetEnvironmentVariable("AvailabilityTests:IdentityServerPassword");

            using var httpClient = new HttpClient();
            Dictionary<string, string> tokenEndpointParameters =
                new Dictionary<string, string>
                {
                    {"grant_type", GrantType},
                    {"username", username},
                    {"password", password},
                    {"scope", Scope}
                };

            var tokenEndpointEncodedContent = new FormUrlEncodedContent(tokenEndpointParameters);

            httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", _identityServerAuthorizationHeaderBasicValue);
            
            var stopWatch = new Stopwatch();
            stopWatch.Start();

            var tokenResponse =
                await httpClient.PostAsync(_identityServerAuthorizeEndpoint, tokenEndpointEncodedContent);

            stopWatch.Stop();

            RequestTelemetry rt = new RequestTelemetry
            {
                Name = "POST " + _identityServerAuthorizeEndpoint,
                Url = new Uri(_identityServerAuthorizeEndpoint),
                Duration = stopWatch.Elapsed,
                Timestamp = DateTimeOffset.UtcNow,
                ResponseCode = ((int)tokenResponse.StatusCode).ToString(),
                Success = tokenResponse.IsSuccessStatusCode,
            };

            foreach (var responseHeader in tokenResponse.Headers)
            { 
                rt.Properties.Add($"{responseHeader.Key}", $"{String.Join(",", responseHeader.Value)}");
            }

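            // On success the token response body contains the access token,
            // so only record the body when the request failed.
            if (!tokenResponse.IsSuccessStatusCode)
            {
                rt.Properties.Add("Response Body", await tokenResponse.Content.ReadAsStringAsync());
            }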
            rt.Context.Operation.Id = operationId;
            rt.Context.Operation.ParentId = parentId;
            telemetryClient.TrackRequest(rt);

            if (tokenResponse.StatusCode == HttpStatusCode.OK)
            {
                var tokenResponseContents = await tokenResponse.Content.ReadAsStringAsync();
                dynamic tokenResponseJson = JsonConvert.DeserializeObject(tokenResponseContents);
                var bearerToken = tokenResponseJson.access_token.Value;
                return bearerToken;
            }

            throw new Exception(@"Unable to obtain bearer token from Identity Server for availability test.");
        }

        protected async Task<HttpResponseMessage> ExecuteHttpGetRequest(TelemetryClient telemetryClient, string parentId, string operationId,
            HttpClient httpClient, string urlEndpoint)
        {
            var stopWatch = new Stopwatch();
            stopWatch.Start();

            var getResponse = await httpClient.GetAsync(urlEndpoint);

            stopWatch.Stop();

            RequestTelemetry rt = new RequestTelemetry
            {
                Name = "GET " + urlEndpoint,
                Url = new Uri(urlEndpoint),
                Duration = stopWatch.Elapsed,
                Timestamp = DateTimeOffset.UtcNow,
                ResponseCode = ((int)getResponse.StatusCode).ToString(),
                Success = getResponse.IsSuccessStatusCode,
            };

            foreach (var responseHeader in getResponse.Headers)
            {
                rt.Properties.Add($"{responseHeader.Key}", $"{String.Join(",", responseHeader.Value)}");
            }

            rt.Properties.Add("Response Body", await getResponse.Content.ReadAsStringAsync());
            rt.Context.Operation.Id = operationId;
            rt.Context.Operation.ParentId = parentId;
            telemetryClient.TrackRequest(rt);
            return getResponse;
        }
        
        protected async Task<HttpResponseMessage> ExecuteHttpPostRequest(TelemetryClient telemetryClient, string parentId, string operationId,
            HttpClient httpClient, string urlEndpoint, HttpContent httpContentParameter=null)
        {
            var stopWatch = new Stopwatch();
            stopWatch.Start();

            var response =
                await httpClient.PostAsync(urlEndpoint, httpContentParameter);

            stopWatch.Stop();

            RequestTelemetry rt = new RequestTelemetry
            {
                Name = "POST " + urlEndpoint,
                Url = new Uri(urlEndpoint),
                Duration = stopWatch.Elapsed,
                Timestamp = DateTimeOffset.UtcNow,
                ResponseCode = ((int)response.StatusCode).ToString(),
                Success = response.IsSuccessStatusCode,
            };

            foreach (var responseHeader in response.Headers)
            {
                rt.Properties.Add($"{responseHeader.Key}", $"{String.Join(",", responseHeader.Value)}");
            }

            rt.Properties.Add("Response Body", await response.Content.ReadAsStringAsync());
            rt.Context.Operation.Id = operationId;
            rt.Context.Operation.ParentId = parentId;
            telemetryClient.TrackRequest(rt);
            return response;
        }
    }
}
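
The three HTTP helpers above build their RequestTelemetry in the same way. To illustrate the correlation step on its own, the fields that tie a request to its availability test could be set in one place. This is a sketch rather than our production code: CorrelatedRequest and Create are hypothetical names, while the RequestTelemetry properties are the same ones used above.

```csharp
using System;
using System.Net.Http;
using Microsoft.ApplicationInsights.DataContracts;

// Hypothetical helper, not part of the original function app.
public static class CorrelatedRequest
{
    public static RequestTelemetry Create(string verb, string url, HttpResponseMessage response,
        TimeSpan duration, string parentId, string operationId)
    {
        var rt = new RequestTelemetry
        {
            Name = $"{verb} {url}",
            Url = new Uri(url),
            Duration = duration,
            Timestamp = DateTimeOffset.UtcNow,
            ResponseCode = ((int)response.StatusCode).ToString(),
            Success = response.IsSuccessStatusCode,
        };

        // Same operation ID as the availability result, with the availability
        // telemetry's ID as the parent, so the request is grouped under the test run.
        rt.Context.Operation.Id = operationId;
        rt.Context.Operation.ParentId = parentId;
        return rt;
    }
}
```

Each helper would then call Create with its own verb and response and pass the result to telemetryClient.TrackRequest.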

Startup of Function App

We use a startup class in our function app to create a singleton instance of our telemetry client. Reusing one client for every invocation, instead of creating a new one each time, is what keeps the client from leaking memory.


using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Hosting;
using AzureFunction.Shared;
using AzureFunction.Shared.AvailabilityTests.Functions;

[assembly: WebJobsStartup(typeof(Startup))]
namespace AzureFunction.Shared
{
    public class Startup : IWebJobsStartup
    {
        public void Configure(IWebJobsBuilder builder)
        {
           AvailabilityTestsTelemetryClientFactory.CreateSingletonAvailabilityTestsTelemetryClient();
        }
    }
}

Telemetry Client Factory

This is the code that creates the singleton telemetry client used to write availability test results.

using System;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.Extensibility;

namespace AzureFunction.Shared.AvailabilityTests.Functions
{
    /// <summary>
    /// One common instance of the telemetry client just for availability tests
    /// </summary>
    public static class AvailabilityTestsTelemetryClientFactory
    {
        private static TelemetryClient _availabilityTestsTelemetryClient = null;

        public static void CreateSingletonAvailabilityTestsTelemetryClient()
        {
            if (_availabilityTestsTelemetryClient == null)
            {
                var telemetryConfiguration = new TelemetryConfiguration()
                {
                    ConnectionString =
                        Environment.GetEnvironmentVariable("AvailabilityTests:ApplicationInsightsConnectionString")
                };
                _availabilityTestsTelemetryClient = new TelemetryClient(telemetryConfiguration);
            }
        }
        public static TelemetryClient AvailabilityTestsTelemetryClient => _availabilityTestsTelemetryClient;
    }
}
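
The null check above is enough when the client is created once at startup. If the client could ever be requested from several threads before it exists, one alternative is to let Lazy<T> handle the one-time creation. This is a sketch of that alternative, not the code we run: LazyAvailabilityTelemetryClient and Instance are hypothetical names, and the configuration mirrors the factory above.

```csharp
using System;
using System.Threading;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.Extensibility;

// Hypothetical alternative to the factory above.
public static class LazyAvailabilityTelemetryClient
{
    // ExecutionAndPublication makes sure the factory delegate runs at most once,
    // even when several threads read Instance at the same time.
    private static readonly Lazy<TelemetryClient> Client = new Lazy<TelemetryClient>(
        () =>
        {
            var telemetryConfiguration = new TelemetryConfiguration
            {
                ConnectionString =
                    Environment.GetEnvironmentVariable("AvailabilityTests:ApplicationInsightsConnectionString")
            };
            return new TelemetryClient(telemetryConfiguration);
        },
        LazyThreadSafetyMode.ExecutionAndPublication);

    public static TelemetryClient Instance => Client.Value;
}
```

With this variant the client is created on first use, so the explicit call in Startup would no longer be needed.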

These are the basics needed to track an end-to-end availability test result using a timer-triggered Azure function. The same approach lets you write telemetry to two different telemetry clients within a single function: we gather Application Insights data on our function app while writing availability test results to a separate Application Insights instance.