Hell’s Cloud Ops

Been watching Hell’s Kitchen in the background while working on some projects and I think it would make an awesome cloud operations show and a fun way to communicate some core concepts. Imagine…..

Chef in calm voice – OK team, today we are working on providing a tasty SQL service for our customer that will be used from a fairly basic application. Off you go.

<contestants scurry off to their workstation areas>

<chef wanders over to Bob>

Chef angry – Bob, WHAT ARE YOU DOING?

Bob – I’m creating each VM that will be part of the SQL cluster I’m building

Chef furious – You’re creating each VM one at a time in the portal???? Oh my god! Is your computer made of red and yellow plastic with “My first” written on the top of it? At least I see you’re using Availability Sets for some resiliency, but this is ridiculous. How will you ensure consistency? How will you scale to creating 50 instances of this? How would this integrate with DevOps? Start again, use Infrastructure as Code, and if I see you in a portal that mouse will be going where the sun doesn’t shine.

Bob – Yes chef!

<15 minutes later Bob presents his template>

Chef – OK, nice template, good resources. Oh no no no no. What have you done????? WHY HAVE YOU HARD CODED values in the resources section??? WHERE IS THE PARAMETER FILE?? How are you going to change control this? How will you deploy this to different environments, to different instances? You donkey! Take environment-specific values out of the template and get them in a parameter file! Then you have one, change-controlled template. Environment- and instance-specific values are completely separate! IDIOT! FIX THIS!

<5 minutes later Bob returns>

Chef – Let’s see. Good parameter use, let’s look at the parameter file. DONKEY! Are you here to destroy the company??? WHYYYYY do you have the administrator password in the parameter file???

Bob – I needed it to join the machines to the domain via the domain join extension, chef

Chef – And you felt the best way to do that was to place that password in the file that you then uploaded to a repository??? Your company’s most important password is now known to everyone, a group of teenagers has taken over your company, your wife has left you, and your kids pretend they’re adopted, they’re so embarrassed. Good luck stocking vending machines after destroying your company. IDIOT! Where would be a better place, do you think? CAN YOU THINK?

Bob – Azure Key Vault chef

Chef – Can you do that? Are you capable? DO IT! And heaven help you if you forget to update the vault’s advanced access policy to allow use of the secret from ARM template deployments.

<5 minutes and Bob returns>

Chef – Let’s see how you can ruin my day now. This is acceptable. It will work well. Nice use of secrets. I see you even created a release pipeline. Now tell me, why didn’t you just use Azure SQL Database?

<A small tear rolls down Bob’s cheek and credits roll>

Deploying Agents to Azure IaaS VMs using the Custom Script Extension

In an ideal world, organizations should try to avoid creating custom images with their own special agents and configurations. Custom images mean a lot of image management, as each time an agent is updated the image has to be updated, in addition to the normal patching of OS instances. The Azure marketplace has a large number of OS images that are kept up to date, which should be used where possible, with any customization performed on top of them. I recently had a Proof of Concept where a number of agents needed to be deployed post VM deployment, along with other configurations. Items such as domain join can be done with the domain join extension, but for the other agent installs we decided to use the Custom Script Extension to call a bootstrap script that would do nothing other than pull down all content from a certain container using azcopy.exe and then launch a master script. The master script would be part of the downloaded content and would then perform all the silent installations and customizations required.

A storage account is utilized with two containers:

  • Artifacts – This contains the master script and all the agent installers, etc. A zip file could be used to maintain a folder structure for the various agents, which the master script could unzip at the start
  • Bootstrap – This contains azcopy.exe (in my case version 10) and the bootstrap.ps1 file that does nothing other than call azcopy to copy everything from the artifacts container to the c:\ root, then launch the master script from the local copy

Below is my example bootstrap.ps1 file. Notice it has one parameter: the URI of the container, which includes the shared access signature enabling access.
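A minimal sketch of what such a bootstrap script can look like, assuming azcopy.exe is downloaded alongside it by the extension and the master script is named master.ps1 (both names illustrative):

```powershell
# bootstrap.ps1 - copy all artifacts locally and hand off to the master script
param(
    [Parameter(Mandatory=$true)]
    [string]$artifactsURI   # SAS URI of the artifacts container
)

# azcopy.exe (v10) sits next to this script, having been downloaded with it
& "$PSScriptRoot\azcopy.exe" copy $artifactsURI 'C:\' --recursive

# The container is copied as a folder under C:\, so launch the master script from there
& 'C:\artifacts\master.ps1'
```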

Azcopy.exe was downloaded from https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10 and copied to the bootstrap container along with the bootstrap.ps1 file. In my case there is nothing sensitive in either file, so I made the container public. This avoids having to include an access key as part of the ARM template that will ultimately call this script.

All the installers and the master script were uploaded to the artifacts container. For this container I wanted a shared access signature (SAS) that would give read and list rights. The idea was that some automation would generate a new SAS each week and write it to a secret in Key Vault that only the people who should deploy had access to. The SAS would have a lifetime of 2 weeks to give an overlap with the newly generated one. In addition to generating and storing the complete SAS, I needed a second version that was escaped for cmd.exe. This is because the SAS contains & characters, which were being interpreted during my testing and breaking its use. I tried the stop-parsing symbol (--%), but this did not work since the command is ultimately run by cmd.exe; the fix is to escape each & as ^&. The script below generates the SAS and the escaped SAS and writes both versions as secrets to Key Vault.
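A sketch of that generation script, assuming the AzureRM and Azure.Storage modules and placeholder storage account, container and vault names:

```powershell
# Build a storage context for the account holding the artifacts container
$storageAccount = Get-AzureRmStorageAccount -ResourceGroupName 'RG-Infra' -Name 'sabootstrap'
$ctx = $storageAccount.Context

# Generate a read/list SAS for the artifacts container with a 2 week lifetime
$sasToken = New-AzureStorageContainerSASToken -Name 'artifacts' -Permission rl `
    -ExpiryTime (Get-Date).AddDays(14) -Context $ctx
$sasURI = "https://sabootstrap.blob.core.windows.net/artifacts$sasToken"

# Escape each & so the SAS survives being passed through cmd.exe
$sasURIEscaped = $sasURI.Replace('&','^&')

# Write both versions to Key Vault as secrets
Set-AzureKeyVaultSecret -VaultName 'KV-Deploy' -Name 'ArtifactsSAS' `
    -SecretValue (ConvertTo-SecureString $sasURI -AsPlainText -Force) | Out-Null
Set-AzureKeyVaultSecret -VaultName 'KV-Deploy' -Name 'ArtifactsSASEscaped' `
    -SecretValue (ConvertTo-SecureString $sasURIEscaped -AsPlainText -Force) | Out-Null
```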

Once this was done I had a SAS available in Key Vault that would give read and list access to the artifacts container. Remember to configure the Access Policy on the vault to enable use of secrets from ARM template deployments (under advanced settings) and additionally to give the relevant users/groups access to the secret. A test of this process from my local machine worked, i.e.
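Something along these lines (vault and secret names are the placeholders from the sketch above):

```powershell
# Pull the (unescaped) SAS URI from Key Vault and run the bootstrap locally
$sasURI = (Get-AzureKeyVaultSecret -VaultName 'KV-Deploy' -Name 'ArtifactsSAS').SecretValueText
.\bootstrap.ps1 -artifactsURI $sasURI
```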

Next I tried calling it the way the Custom Script Extension would, which worked great with the escaped version (note it’s the escaped URI, as this will get expanded in the template).

Initially my test was against an existing Azure VM, so I used the following (note I’m getting the escaped version of the secret from Key Vault):
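A sketch of that call with Set-AzureRmVMCustomScriptExtension (resource group, VM, location, storage and vault names are placeholders):

```powershell
# Fetch the cmd.exe-escaped SAS URI from Key Vault
$sasEscaped = (Get-AzureKeyVaultSecret -VaultName 'KV-Deploy' -Name 'ArtifactsSASEscaped').SecretValueText

# The extension downloads the bootstrap files from the public bootstrap container
$fileUris = 'https://sabootstrap.blob.core.windows.net/bootstrap/bootstrap.ps1',
            'https://sabootstrap.blob.core.windows.net/bootstrap/azcopy.exe'

# Apply the Custom Script Extension to the existing VM, passing the SAS URI to bootstrap.ps1
Set-AzureRmVMCustomScriptExtension -ResourceGroupName 'RG-Test' -VMName 'TestVM01' `
    -Location 'eastus' -Name 'BootstrapCSE' -FileUri $fileUris `
    -Run 'bootstrap.ps1' -Argument "-artifactsURI $sasEscaped"
```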

Once this worked, I finally created an ARM template that referenced the secret, and everything worked as planned.

The parameter file (note I also get a secret to join the domain even though I’m not using the domain join extension in this example):
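A sketch of the shape of that parameter file (subscription ID, vault and secret names are placeholders), showing the Key Vault references:

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "vmName": { "value": "TestVM01" },
    "artifactsSAS": {
      "reference": {
        "keyVault": {
          "id": "/subscriptions/<subscription-id>/resourceGroups/RG-Infra/providers/Microsoft.KeyVault/vaults/KV-Deploy"
        },
        "secretName": "ArtifactsSASEscaped"
      }
    },
    "domainJoinPassword": {
      "reference": {
        "keyVault": {
          "id": "/subscriptions/<subscription-id>/resourceGroups/RG-Infra/providers/Microsoft.KeyVault/vaults/KV-Deploy"
        },
        "secretName": "DomainJoinPassword"
      }
    }
  }
}
```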

The actual template (note in the CSE extension at the end I need single quotes around the URI or it once again tries to interpret it; inside an ARM template expression you have to use two, i.e. '', to get one ' when it actually executes):
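A sketch of just the CSE resource at the end of such a template (API version, names and URIs are illustrative), showing the doubled single quotes around the URI inside the concat expression:

```json
{
  "type": "Microsoft.Compute/virtualMachines/extensions",
  "name": "[concat(parameters('vmName'),'/BootstrapCSE')]",
  "apiVersion": "2018-06-01",
  "location": "[resourceGroup().location]",
  "dependsOn": [
    "[resourceId('Microsoft.Compute/virtualMachines', parameters('vmName'))]"
  ],
  "properties": {
    "publisher": "Microsoft.Compute",
    "type": "CustomScriptExtension",
    "typeHandlerVersion": "1.9",
    "autoUpgradeMinorVersion": true,
    "settings": {
      "fileUris": [
        "https://sabootstrap.blob.core.windows.net/bootstrap/bootstrap.ps1",
        "https://sabootstrap.blob.core.windows.net/bootstrap/azcopy.exe"
      ]
    },
    "protectedSettings": {
      "commandToExecute": "[concat('powershell.exe -ExecutionPolicy Unrestricted -File bootstrap.ps1 -artifactsURI ''', parameters('artifactsSAS'), '''')]"
    }
  }
}
```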

And the execution (note the network and RG already existed in my environment).
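A sketch of that deployment (resource group, deployment and file names are placeholders):

```powershell
# Deploy the template into the existing resource group using the parameter file
New-AzureRmResourceGroupDeployment -ResourceGroupName 'RG-Test' -Name 'VMBootstrapDeploy' `
    -TemplateFile .\azuredeploy.json -TemplateParameterFile .\azuredeploy.parameters.json
```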

Hope this helps!

New Free Data Courses on Pluralsight Available!

Over the past two months I’ve been busy on some Data in Azure courses for Microsoft and Pluralsight. These are free; you just need to sign up for a free account on Pluralsight. They will shortly be available via Azure as well, but are available now through Pluralsight.

Azure Stack Marketplace Management

Been doing some work with Azure Stack and wanted a simple script to easily update all the Microsoft-provided extensions and a set of core images whenever new versions are available.

The script is available at https://github.com/johnthebrit/AzureStack/blob/master/azurestackmarketplace.ps1. Simply run it; after it downloads the assets it will check whether older versions exist and prompt you to delete the old ones.

Lots of new Azure Design and Identity free training available

I may have seemed very quiet over the past few months, but that’s because I’ve been working pretty much every night and weekend on 11 new courses for azure.com that will shortly be available via the site but are immediately available for free via Pluralsight. If you don’t have an account, simply sign up for a free one and you can then access my (and other people’s) tracks.

Planning Microsoft Azure Identity and Security

Planning Microsoft Azure Infrastructure

The identity track looks at identity management before diving into authentication, authorization, auditing, monitoring and risk. The infrastructure track looks at compute, storage, networking and monitoring.

I hope you find these courses useful and there are more to come.

On a side note, I’m trying to raise money for Cure Childhood Cancer as part of my Ironman Chattanooga on 9/30/2018. This will be my 5th Ironman this year and 12th overall. If you can help even a little, please head over to https://www.firstgiving.com/fundraiser/john-savill/IM2018; your company may also match donations, which helps even more. I’ll be trackable on the day via https://bat.live/track/imchattanooga2018?bib=356.

Thank you!

Delivering a Customizable, Graphical Insight into Azure VM Security, Health and Connectivity Using Several Azure Services Together

In this blog I want to walk through a solution I recently architected and implemented, along with two other MTC architects, that we needed for two reasons:

  1. To provide insight into the VMs hosted in Azure across the global Microsoft Technology Center environment
  2. Showcase the use of some key Microsoft cloud technologies

The Requirement

The global MTC organization is made up of around 30 offices which each have several Azure subscriptions to host the projects they are working on and environments used in customer activities. Additionally, there are several global, shared Azure subscriptions that host core infrastructure and experiences. These subscriptions are tied to various Azure AD tenants depending on requirements. The primary subscription for each MTC also hosts a virtual network that is part of a global IP space that is connected via one of four regional ExpressRoute circuits to the MTC worldwide VPN that provides connectivity between all MTC offices.

While there is a standard governance and process guide, each MTC has control of its own subscriptions and resources; however, from a central MTC organization perspective, insight into several key factors was required:

  • Are the VMs registered with the central Log Analytics instance to report inventory and patch state? Log Analytics is part of the Operations Management Suite and is used to accept log information from almost any source, and then provides powerful analytical capabilities to use that information to provide insight into the environment. A number of solutions are included that provide visibility into best practices, patch status, anti-malware status and much more. For OS instance visibility Log Analytics uses the Microsoft Monitoring Agent (MMA), which is the same agent used by System Center Operations Manager.
  • What is the current patch status of the VM? This is provided by information sent to Log Analytics and to Azure Security Center, if registered. Azure Security Center (ASC) provides a central security posture location for Azure resources, including VM health, network health, storage health and more.
  • Is the VM connected to ExpressRoute? This can be found by checking the virtual network a VM is attached to and whether that virtual network has an ExpressRoute gateway connected.
  • Does the VM have a public IP, and is it healthy? Public IP existence can be found through the properties of the VM’s IP configurations, and the health, which is based on the use of Network Security Groups to lock down communication, comes through ASC.
  • Is the VM older than 30 days? Object creations are logged in Azure. By default these logs are kept for 60 days, which enables a search of the logs for the VM creation. If not found, the VM is older than 60 days; if found, the exact age can be determined. The age is useful as short-term VMs do not have the same levels of reporting requirements, i.e. they do not have to be registered to OMS.

The insight needed to be in a form that provides an easy overall view of health while allowing detail to be exposed by drilling down into the data.

The Solution

I started off crafting a solution in PowerShell, through which I can access the full capabilities of Azure Resource Manager via the AzureRM module, as well as other services such as Log Analytics, Azure Security Center and Azure Storage.

If you like to read the end of the book first, below is the final solution; what I will walk through is some of the detail you see in the picture.

The first challenge was the context to run the script under, since multiple Azure AD tenants were utilized and I didn’t want to have to manage multiple credentials. Therefore, Azure AD B2B (business to business) was utilized. A single identity in the main Azure AD tenant was created, and then a communication was sent to each MTC to add that identity via Azure AD B2B to any local Azure AD tenant instances and to give that account Read permissions on all their subscriptions. This enabled a single credential to be used across every subscription, regardless of the Azure AD tenant the subscription was tied to. This same credential was also given rights to the Log Analytics instance all VMs report to, which enables queries to be run.
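As a rough sketch of what each MTC would run (the guest identity and subscription ID are placeholders; the invitation uses the AzureAD module and the role assignment uses AzureRM):

```powershell
# Invite the central scanning identity into the local Azure AD tenant as a B2B guest
Import-Module AzureAD
Connect-AzureAD
New-AzureADMSInvitation -InvitedUserEmailAddress 'mtcscan@contoso.com' `
    -InviteRedirectUrl 'https://portal.azure.com' -SendInvitationMessage $false

# Grant that guest account Reader rights on the local subscription
Connect-AzureRmAccount
New-AzureRmRoleAssignment -SignInName 'mtcscan@contoso.com' `
    -RoleDefinitionName 'Reader' -Scope '/subscriptions/<subscription-id>'
```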

Now that the access was available, the next step was the actual PowerShell to gather the required information. A storage account was created to store the output of each execution: a basic execution report and two JSON files containing custom objects representing the VM state and Azure subscription information.

The basic PowerShell flow is as follows:

  • Import the ASC and Log Analytics PowerShell modules
  • Access the credential that will be used
  • Connect to Azure using the credential
  • Store a list of every subscription associated to the credential in an array
  • Connect to the Azure Storage account to create a context for BLOB storage
  • Connect to the Log Analytics workspace and trigger two queries whose results are stored in two arrays (see the sketch after this list)
    • List of all machines that report to the instance that are stored in Azure
    • List of all machines that are missing patches that are stored in Azure
  • Create three files whose names contain today’s date: a log file, a VM JSON file and a subscription JSON file
  • Create two empty arrays that will store custom objects for VM state and subscription information
  • For every subscription perform the following:
    • List the administrators and write to the log
    • Retrieve the ASC status for the subscription and store in an array
    • For every Resource Group
      • Find the virtual networks connected to ExpressRoute gateway and store in an array
      • For every VM in the Resource Group
        • Find the creation time by scanning the operational log of Azure. If found, save the creation time and whether the VM is older than 30 days; if no log entry is found, report the VM as older than 30 days
        • For each NIC inspect the IP configurations
          • Is it connected to a virtual network that has ExpressRoute connectivity
          • Does it have a public IP address and if so what is the health of that public IP based on information previously saved from ASC
        • Is the VM registered in OMS
        • Is the VM missing patches based on information from OMS or ASC
        • Create a custom object using a hash table with all desired information about the VM and add to the VM object array
    • Add a subscription information custom object to the subscription array
  • Upload the three data files generated to the Azure storage account as BLOBs
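As a minimal sketch of the two Log Analytics queries in the flow above (workspace details and the exact query text are illustrative):

```powershell
Import-Module AzureRM.OperationalInsights

$workspace = Get-AzureRmOperationalInsightsWorkspace -ResourceGroupName 'RG-Monitor' -Name 'MTC-LogAnalytics'

# All Azure-hosted machines reporting to the workspace
$reportingVMs = (Invoke-AzureRmOperationalInsightsQuery -Workspace $workspace `
    -Query 'Heartbeat | where ComputerEnvironment == "Azure" | distinct Computer').Results

# All machines with patches still needed
$missingPatchVMs = (Invoke-AzureRmOperationalInsightsQuery -Workspace $workspace `
    -Query 'Update | where UpdateState == "Needed" | distinct Computer').Results
```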

To actually run the PowerShell I used Azure Automation, which not only provides a resilient engine to run the code but also capabilities such as credential assets, which securely store the identity that was used and remove any need to hardcode it in the script itself. The schedule capability was used to trigger the runbook (the container for the PowerShell in Azure Automation) to run daily at 11pm.

At this point an Azure Storage account held a report and two JSON files, with the VM state JSON file being the most useful as it enabled all the information to be queried easily. However, the goal was to make the data more easily digestible, which meant Power BI, and ideally to make it more easily available to everyone, e.g. via Teams, along with a notification that the night’s execution was successful.

The solution was to use a Logic App (created by Ali Mazaheri, https://blogs.msdn.com/alimaz), which enables activities to be chained together using various connectors, including Azure Storage, Teams and SharePoint. The Logic App was designed with a recurrence trigger (though it could also trigger based on object creation and other events) and then performs the following:

  • List the blobs in the azurescan container (a container is like a folder in Azure Storage)
  • For each object that is not empty
    • Get the BLOB content
    • Create a file containing that content in SharePoint
    • Copy the BLOB to an archive BLOB
    • Delete the original BLOB
  • Write a message to a Teams channel that the log migration completed (or send an email, a notification to a phone, etc.)

A great feature of Logic Apps is that they are implemented by adding built-in connectors, or your own API apps and Azure Functions, and then graphically laying out the flow using conditions, branches and those connectors, passing the output of one connector as the input to the next, in this case with some custom expressions. Below is the key content of the Logic App (as an alternative we could also have used Azure Functions and Event Grid to achieve the same goal).

The final step was the Power BI portion, which reads in the file from SharePoint and provides a visualization of the data contained in the JSON. David Browne created this powerful dashboard, which enables various visualizations of the data and easy access to change the criteria applied to it.

The Power BI Service can connect directly to SharePoint Online to read the files.  Power Query in Power BI is used to identify the latest data files, convert them from JSON to a tabular format and to clean the data.  The data is then loaded into an in-memory Tabular Model hosted by Power BI and configured for daily refresh.

Using the Azure PS Drive

If you leverage the Azure Cloud Shell in the Azure portal, it’s a very convenient way to manage Azure resources using PowerShell and the CLI, but you may also have noticed an actual Azure drive, i.e. Set-Location azure:, which lets you navigate around your Azure resources (this is actually the default location when the Cloud Shell opens). At the top level are subscriptions, and you can then navigate to resource groups, VMs, WebApps and more.

The Azure drive is provided via the Simple Hierarchy in PowerShell (SHiPS) provider which you can see via Get-PSProvider.

The actual functionality is evolving; it’s a project on GitHub at https://github.com/PowerShell/SHiPS, which also means you can run the same provider outside of the Azure Cloud Shell.

You need to ensure you are running the latest version of the AzureRM module, then download and install the provider, add an Azure account and create the drive:
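A sketch of those steps, assuming the SHiPS-based AzurePSDrive module from the PowerShell Gallery:

```powershell
# Update AzureRM and install the SHiPS-based Azure drive provider
Install-Module AzureRM -Force
Install-Module AzurePSDrive -Force

# Sign in and create the Azure: drive
Import-Module AzurePSDrive
Add-AzureRmAccount
New-PSDrive -Name Azure -PSProvider SHiPS -Root 'AzurePSDrive#Azure'

# Navigate it just like in the Cloud Shell
Set-Location Azure:
Get-ChildItem
```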

You can now navigate to Azure: and enjoy the same experience as in the Azure Cloud Shell.

Note this is completely different from the Azure Cloud Drive, which is the persistent file storage you have in the Azure Cloud Shell, backed by Azure Files, that enables data to be saved and used between sessions. Use Get-CloudDrive to see the current configuration; if you wish to change it, simply run Dismount-CloudDrive, then restart the shell and select the advanced options to customize the location.

Writing to files with Azure Automation

Azure Automation enables PowerShell (and more) to be executed as runbooks by runbook workers hosted in Azure. Additionally, Azure Automation accounts bring capabilities such as credential objects to securely store credentials, plus variables, scheduling and more. When a runbook executes, it runs in a temporary environment that does not have any persistent state, so if you want to work with files you need to save them somewhere, for example to an Azure storage account as a blob, before the runbook completes.

You can actually create and use files as normal using the default path within PowerShell during execution; just remember to save the files externally before the script completes.

For example, create a file as usual:
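A minimal sketch (the path and content are arbitrary):

```powershell
# Create a working file in the runbook's temporary environment
$reportFile = Join-Path -Path $env:TEMP -ChildPath 'report.txt'
"Runbook executed at $(Get-Date)" | Out-File -FilePath $reportFile
Add-Content -Path $reportFile -Value 'More output gathered during execution'
```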

Then, before the PowerShell ends, copy it to a blob (as an example storage location):
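A sketch of saving it off, assuming AzureRM and placeholder storage account, Automation variable and container names:

```powershell
# Build a storage context (the account key is held in an Automation variable in this sketch)
$storageKey = Get-AutomationVariable -Name 'ReportStorageKey'
$ctx = New-AzureStorageContext -StorageAccountName 'sareports' -StorageAccountKey $storageKey

# Upload the file as a blob before the runbook completes and the environment is discarded
Set-AzureStorageBlobContent -File $reportFile -Container 'runbookoutput' -Context $ctx -Force
```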


Easily create multiple subnets in an Azure Virtual Network

I recently needed to create a whole set of subnets in a large number of virtual networks of various sizes. I thought some variables would be a great way to quickly create the set of subnets in each virtual network. Each virtual network was a /20 in a shared Class B IP space, which allows 16 virtual networks per Class B. The goal was to show that each subnet didn’t need to be a full Class C (/24); instead we could use smaller subnets based on the number of hosts actually required. I’ve included comments that explain the subnets created and the number of hosts supported in each.
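A minimal sketch of the approach (virtual network name, address space, subnet names and sizes are all illustrative):

```powershell
# The virtual network is a /20 inside a shared Class B space, e.g. 10.1.0.0/20
$vnet = Get-AzureRmVirtualNetwork -ResourceGroupName 'RG-Network' -Name 'vnet01'

# Subnet sizing: Azure reserves 5 IPs per subnet, so
# /24 = 251 usable hosts, /25 = 123, /26 = 59, /27 = 27
$subnets = @(
    @{Name='Infra'; Prefix='10.1.0.0/26'},    # 59 hosts
    @{Name='App';   Prefix='10.1.0.64/26'},   # 59 hosts
    @{Name='Data';  Prefix='10.1.0.128/27'},  # 27 hosts
    @{Name='Mgmt';  Prefix='10.1.0.160/27'}   # 27 hosts
)

foreach ($subnet in $subnets)
{
    Add-AzureRmVirtualNetworkSubnetConfig -Name $subnet.Name `
        -AddressPrefix $subnet.Prefix -VirtualNetwork $vnet | Out-Null
}

# Commit the subnet additions to the virtual network
Set-AzureRmVirtualNetwork -VirtualNetwork $vnet | Out-Null
```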