Delivering a Customizable, Graphical Insight into Azure VM Security, Health and Connectivity Using Several Azure Services Together

In this blog I want to walkthrough a solution I recently architected and implemented along with a two other MTC architects to deliver a solution we needed for two reasons:

  1. To provide insight into the VMs hosted in Azure across the global Microsoft Technology Center environment
  2. Showcase the use of some key Microsoft cloud technologies

The Requirement

The global MTC organization is made up of around 30 offices which each have several Azure subscriptions to host the projects they are working on and environments used in customer activities. Additionally, there are several global, shared Azure subscriptions that host core infrastructure and experiences. These subscriptions are tied to various Azure AD tenants depending on requirements. The primary subscription for each MTC also hosts a virtual network that is part of a global IP space that is connected via one of four regional ExpressRoute circuits to the MTC worldwide VPN that provides connectivity between all MTC offices.

While there is a standard governance and process guide each MTC has control of their own subscriptions and resources however from a central MTC organization perspective insight into several key factors was required.

  • Are the VMs registered with the central Log Analytics instance to report inventory and patch state. Log Analytics is part of the Operations Management Suite and is used to accept log information from almost any sort and then provides power analytical capabilities to use that information to provide insight into the environment. A number of solutions are included that provide visibility into best practices, patch status, anti-malware status and much more. For OS instance visibility Log Analytics uses the Microsoft Monitoring Agent (MMS) which is the same agent used by System Center Operations Manager.
  • What is the current patch status of the VM. This is provided by information to Log Analytics and to Azure Security Center if registered. Azure Security Center (ASC) provides a central security posture location for Azure resources including VM health, network health, storage health and more.
  • Is the VM connected to ExpressRoute. This can be found by checking the virtual network a VM is attached to and if that virtual network has an ExpressRoute Gateway connected.
  • Does the VM have a public IP and is it health. Public IP existence can be found through the properties of VM IP configurations and the health which is based on use of Network Security Groups to lock down communication through ASC.
  • Is the VM older than 30 days. Object creations are logged in Azure. By default, these are kept for 60 days which enables a search of the logs for the VM creation. If not found it would mean the VM is older than 60 days and if found the exact age can be determined. The age is useful as short-term VMs do not have the same levels of reporting requirements, i.e. does not have to be registered to OMS.

The insight into the health needed to be in the form to provide easy overall insight while allowing detail to be exposed through drilling down into the data.

The Solution

I started off crafting a solution in PowerShell through which I can access the full knowledge of the Azure Resource Manager via the AzureRM module and also other solutions such as Log Analytics, Azure Security Center and Azure Storage.

If you like to read the end of the book below is the final solution and what I will walk through is some of the detail you see in the picture.

The first challenge was the context to run the script under since multiple Azure AD tenants were utilized and I didn’t want to have to manage multiple credentials. Therefore, Azure AD B2B (business to business) was utilized. A single identity in the main Azure AD tenant was created and then a communication sent to each MTC to add that identity via Azure AD B2B to any local Azure AD tenant instances and then to give that account Read permissions to all subscriptions. This enabled a single credential to be used across every subscription, regardless of the Azure AD tenant the subscription was tied to. This same credential was also give rights to the Log Analytics instance all VMs reported to which enables queries to be run.

Now the access was available the next step was the actual PowerShell to gather the required information. A storage account was created that would be used to store the output of the execution which would be a basic execution report and two JSON files that contained custom objects representing the VM state and Azure subscription information.

The basic PowerShell flow is as follows:

  • Import the ASC and Log Analytics PowerShell modules
  • Access the credential that will be used
  • Connect to Azure using the credential
  • Store a list of every subscription associated to the credential in an array
  • Connect to the Azure Storage account to create a context for BLOB storage
  • Connect to the Log Analytics workspace and trigger two queries whose results were stored in two arrays
    • List of all machines that report to the instance that are stored in Azure
    • List of all machines that are missing patches that are stored in Azure
  • Create three files that contained todays date; log files, VM JSON files and subscription JSON file
  • Create two empty arrays that will store custom objects for VM state and subscription information
  • For every subscription perform the following:
    • List the administrators and write to the log
    • Retrieve the ASC status for the subscription and store in an array
    • For every Resource Group
      • Find the virtual networks connected to ExpressRoute gateway and store in an array
      • For every VM in the Resource Group
        • Find the creation time by scanning the operational log of Azure. Save the creation time if found and if older than 30 days or report older than 30 days if no log found
        • For each NIC inspect the IP configurations
          • Is it connected to a virtual network that has ExpressRoute connectivity
          • Does it have a public IP address and if so what is the health of that public IP based on information previously saved from ASC
        • Is the VM registered in OMS
        • Is the VM missing patches based on information from OMS or ASC
        • Create a custom object using a hash table with all desired information about the VM and add to the VM object array
    • Add a subscription information custom object to the subscription array
  • Upload the three data files generated to the Azure storage account as BLOBs

To actually run the PowerShell I used Azure Automation which not only provided a resilient engine to run the code but capabilities such as credentials which could securely store the identity that was used removing any need to hardcode it in the script itself. The schedule capability was used to trigger the runbook (the container for the PowerShell in Azure Automation) to right daily at 11pm.

At this point in an Azure Storage account was a report and two JSON files with one of them, the VM state JSON file, the most useful which enabled all information to be queried easily however the goal was to have it more easily digestible which meant PowerBI and ideally getting the data more easily available to everyone, e.g. Teams along with a notification that the nights execution was successful.

The solution was to use a Logic App (created by Ali Mazaheri, https://blogs.msdn.com/alimaz) which enables activities to be chained together using various connectors which include Azure Storage, Teams and SharePoint. The Logic App was designed with a recurrence trigger (but could also trigger based on object creations and other triggers) and to then perform the following:

  • List the blobs in the azurescan container (a container is like a folder in Azure Store)
  • For each object that is not empty
    • Get the BLOB content
    • Create a file containing that content in SharePoint
    • Copy the BLOB to an archive BLOB
    • Delete the original BLOB

 

  • Write a message to a team’s channel that the log migration was completed (Or send an email, notification to phone, etc.)

A great feature of Logic Apps is that they are implemented by adding the built-in connectors or your own API apps, Azure Functions and then graphically laying out the flow using conditions, branches and those connectors by passing output as an input for next connectors and in this case some custom expressions. Below is the key content of the Logic App (as an alternative we could have also used Azure Functions and EventGrid to achieve the same goal).

The final step was the Power BI portion to read in the file from SharePoint and provide a visualization of the data contained in the JSON. David Browne created this powerful dashboard that enabled various visualizations of the data and easy access to change the criteria of the data contained.

The Power BI Service can connect directly to SharePoint Online to read the files.  Power Query in Power BI is used to identify the latest data files, convert them from JSON to a tabular format and to clean the data.  The data is then loaded into an in-memory Tabular Model hosted by Power BI and configured for daily refresh.

Leave a Reply