Data Analytics Blog

Welcome to the Data Analytics blog of Pivotal BI. As an independent consultancy, we write about a broad spectrum of related subjects, technologies and providers. We hope you find the content insightful, enjoyable and of benefit on your data analytics journey.

By James Pretorius

Azure Data Factory Development with GitFlow – Part 1

If you’re using or planning to use Git integration with Azure Data Factory, then it’s well worth taking the time to define a suitable branching model for managing the various life-cycles of your project (think feature, hotfix and release). In this short series, I’ll discuss how we can structure our Azure Data Factory development with GitFlow, an easy-to-comprehend branching model for managing development and release processes.

Part 1: Overview

Part 2: Implementation Detail

In part 1 of this series, you’ll get an overview of the various components which make up the solution. I’ll follow this up in part 2 with the implementation detail on how to deploy and configure your Data Factory environments to tie them in with the workflow. If all goes to plan, we should end up with something along the lines of the diagram below:

[Diagram: Azure Data Factory Development with GitFlow]

Component Overview

Now that we sort of know where we’re heading, let’s take a closer look at a few of the components that will make up the solution, namely:

  1. Git integration in Azure Data Factory UI for maintaining our source code
  2. GitFlow branching model, a structured approach to align our source code with the project life-cycle
  3. Continuous integration and delivery (CI/CD) in Azure Data Factory for the deployment of Data Factory entities (pipelines, datasets, linked services, triggers etc.) across multiple environments
  4. Azure DevOps for additional controls not available in the Azure Data Factory user interface
  5. Azure Data Factory for obvious reasons

Git integration in Azure Data Factory UI

With Azure Data Factory Git integration you can source control your Data Factory entities from within the Azure Data Factory UI. Unfortunately, it does come with a few bugbears:

  1. Changes to the Data Factory automatically result in a commit, e.g. new ADF entity (pipeline, connection etc.) = new commit
  2. Assigning Azure DevOps work items to a commit is not possible (use Azure DevOps instead)
  3. Merging between branches is only possible as part of a pull request
  4. The ability to publish a Data Factory is available from the collaboration branch only

For a more detailed look at the Git integration functionality in Azure Data Factory, have a read through the official documentation.

GitFlow branching model

First introduced by Vincent Driessen, GitFlow is a branching model for Git repositories which defines a method for managing the various project life-cycles. As with all things in life, there are lovers and haters, but personally I’m very fond of the approach. Having used it successfully on numerous projects, I can vouch that, on more than one occasion, it has saved me from a merge scenario not too dissimilar to Swindon’s Magic Roundabout.

For those of you not familiar with GitFlow, it’s well worth spending a few minutes reading through the details at nvie.com. In summary, and for the purpose of this post, it uses a number of branches to manage the development life-cycle, namely master, develop, release, hotfix and feature. Each branch maintains clean, easy-to-interpret code which is representative of a phase within the project life-cycle.


Continuous integration and delivery (CI/CD) in Azure Data Factory

Continuous integration and delivery, in the context of Azure Data Factory, means shipping Data Factory pipelines from one environment to another (development -> test -> production) using Azure Resource Manager (ARM) templates.

ARM templates can be exported directly from the ADF UI alongside a configuration file containing all the Data Factory connection strings and parameters. These parameters and connection strings can be adjusted when importing the ARM template to the target environment. With Azure Pipelines in Azure DevOps, it is possible to automate this deployment process – that’s possibly a topic for a future post.

For a more detailed look at the CI/CD functionality in Azure Data Factory, have a read through the official documentation.

Azure DevOps

Azure DevOps is a SaaS development collaboration tool providing source control, Agile project management, Kanban boards and various other development features which are far beyond the scope of this blog. For the purpose of this two-part post, we’ll primarily be using Azure DevOps to manage the areas where the ADF UI Git integration is lacking, for example pull requests on non-collaboration destination branches and branch merges.

Azure Data Factory

To support the structured development of a Data Factory pipeline in accordance with a GitFlow branching model more than one Data Factory will be required:

  • Feature and hotfix branch development will take place on separate data factories, each of which will be Git connected.
  • Testing releases for production (both features and hotfixes) will be carried out on a third Data Factory, using ARM templates for Data Factory entity deployment without the need for Git integration.
  • For production pipelines, we’ll use a fourth Data Factory. Again, as per release testing, using ARM templates for Data Factory entity deployment without the need for Git integration.

Of course, there are no hard and fast rules on the above. You can get away with using fewer deployments if you’re willing to chop and change the Git repository associated with the Data Factory. There is a charge for inactive pipelines but it’s fairly small and, in my opinion, not worth considering if additional deployments are going to make your life easier.

…to be continued

That covers everything we need. I hope you’ve got a good overview of the implementation and have already formed an opinion on whether this is appropriate for your project. Thanks for reading. Come back soon for part 2!

By James Pretorius

Dynamically Flatten a Parent-Child Hierarchy using Power Query M

Introduction

For those of you familiar with recursive common table expressions in SQL, iterating through a parent-child hierarchy of data is a fairly straightforward process. There are several examples available which demonstrate how one could approach this problem in Power Query M. In this post, we’ll use recursion and dynamic field retrieval to loop through and dynamically flatten a parent-child hierarchy using Power Query M.

Show and Tell

Before we begin, let’s take a quick look at an example of the function being invoked in Power BI. This should hopefully give you some idea of where we’re heading with this.

[Image: example of the fFlattenHierarchy function being invoked in Power BI]

Sample Data

Let’s look at some sample data to get a feel for the type of data the function requires as input and the resultant dataset the function will output.

Input

| ParentNodeID | ParentNodeName | ChildNodeID | ChildNodeName |
|---|---|---|---|
| 100 | Stringer | 2 | Shamrock |
| 200 | Avon | 201 | Levy |
| 200 | Avon | 202 | Brianna |
| 200 | Avon | 203 | Wee-Bey |
| 2 | Shamrock | 3 | Slim Charles |
| 3 | Slim Charles | 51 | Bodie |
| 3 | Slim Charles | 52 | Poot |
| 3 | Slim Charles | 53 | Bernard |
| 51 | Bodie | 61 | Sterling |
| 51 | Bodie | 62 | Pudding |
| 52 | Poot | 61 | Sterling |
| 52 | Poot | 62 | Pudding |

Output

| ParentNodeID | ChildNodeID1 | ChildNodeID2 | ChildNodeID3 | ChildNodeID4 | ParentNodeName | ChildNodeName1 | ChildNodeName2 | ChildNodeName3 | ChildNodeName4 | HierarchyLevel | HierarchyPath | IsLeafLevel | HierarchyNodeID |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 |  |  |  |  | Stringer |  |  |  |  | 1 | 100 | false | 100 |
| 100 | 2 |  |  |  | Stringer | Shamrock |  |  |  | 2 | 100\|2 | false | 2 |
| 100 | 2 | 3 |  |  | Stringer | Shamrock | Slim Charles |  |  | 3 | 100\|2\|3 | false | 3 |
| 100 | 2 | 3 | 51 |  | Stringer | Shamrock | Slim Charles | Bodie |  | 4 | 100\|2\|3\|51 | false | 51 |
| 100 | 2 | 3 | 51 | 61 | Stringer | Shamrock | Slim Charles | Bodie | Sterling | 5 | 100\|2\|3\|51\|61 | true | 61 |
| 100 | 2 | 3 | 51 | 62 | Stringer | Shamrock | Slim Charles | Bodie | Pudding | 5 | 100\|2\|3\|51\|62 | true | 62 |
| 100 | 2 | 3 | 52 |  | Stringer | Shamrock | Slim Charles | Poot |  | 4 | 100\|2\|3\|52 | false | 52 |
| 100 | 2 | 3 | 52 | 61 | Stringer | Shamrock | Slim Charles | Poot | Sterling | 5 | 100\|2\|3\|52\|61 | true | 61 |
| 100 | 2 | 3 | 52 | 62 | Stringer | Shamrock | Slim Charles | Poot | Pudding | 5 | 100\|2\|3\|52\|62 | true | 62 |
| 100 | 2 | 3 | 53 |  | Stringer | Shamrock | Slim Charles | Bernard |  | 4 | 100\|2\|3\|53 | true | 53 |
| 200 |  |  |  |  | Avon |  |  |  |  | 1 | 200 | false | 200 |
| 200 | 202 |  |  |  | Avon | Brianna |  |  |  | 2 | 200\|202 | true | 202 |
| 200 | 201 |  |  |  | Avon | Levy |  |  |  | 2 | 200\|201 | true | 201 |
| 200 | 203 |  |  |  | Avon | Wee-Bey |  |  |  | 2 | 200\|203 | true | 203 |

Power Query M Function

The fFlattenHierarchy function consists of an outer function and a recursive inner function.

The outer function will:

  1. Identify the top level parents
  2. Trigger the flattening
  3. Bolt on a few metadata columns

The inner function will:

  1. Self-join onto the input hierarchy table to identify the next level of children within the hierarchy
  2. Filter out rows where the parent = child, to avoid infinite recursion

Flatten the hierarchy

The code below builds the basic hierarchy structure:

(
  hierarchyTable as table
  ,parentKeyColumnIdentifier as text
  ,parentNameColumnIdentifier as text
  ,childKeyColumnIdentifier as text
  ,childNameColumnIdentifier as text
) as table =>
  let
    #"Get Root Parents" = Table.Distinct(
      Table.SelectColumns(
        Table.NestedJoin(hierarchyTable
          ,parentKeyColumnIdentifier
          ,hierarchyTable
          ,childKeyColumnIdentifier
          ,"ROOT.PARENTS"
          ,JoinKind.LeftAnti
        )
        ,{
          parentKeyColumnIdentifier
          ,parentNameColumnIdentifier
        }
      )
    ),
    #"Generate Hierarchy" = fGetNextHierarchyLevel(
      #"Get Root Parents"
      ,parentKeyColumnIdentifier
      ,1
    ),
    fGetNextHierarchyLevel = (
      parentsTable as table
      ,nextParentKeyColumnIdentifier as text
      ,hierarchyLevel as number
    ) =>
      let 
        vNextParentKey = childKeyColumnIdentifier & Text.From(hierarchyLevel),
        vNextParentName = childNameColumnIdentifier & Text.From(hierarchyLevel),
        #"Left Join - hierarchyTable (Get Children)" = Table.NestedJoin(parentsTable
          ,nextParentKeyColumnIdentifier
          ,hierarchyTable
          ,parentKeyColumnIdentifier
          ,"NODE.CHILDREN"
          ,JoinKind.LeftOuter
        ),
        #"Expand Column - NODE.CHILDREN" = Table.ExpandTableColumn(#"Left Join - hierarchyTable (Get Children)"
          ,"NODE.CHILDREN"
          ,{
              childKeyColumnIdentifier
              ,childNameColumnIdentifier
          },{
              vNextParentKey
              ,vNextParentName
          }
        ),
        #"Filter Rows - Parents with Children" = Table.SelectRows(#"Expand Column - NODE.CHILDREN"
          ,each Record.Field(_,vNextParentKey) <> null 
              and Record.Field(_,vNextParentKey) <> Record.Field(_,nextParentKeyColumnIdentifier)
        ),
        #"Generate Next Hierarchy Level" = if Table.IsEmpty(#"Filter Rows - Parents with Children")
          then parentsTable 
          else Table.Combine(
            {
              parentsTable
              ,@fGetNextHierarchyLevel(
                #"Filter Rows - Parents with Children"
                ,vNextParentKey
                ,hierarchyLevel + 1
              )
            }
        )
      in 
        #"Generate Next Hierarchy Level"
  in
     #"Generate Hierarchy"

Add additional metadata

Additional metadata columns can be added to the hierarchy using the code below. My original approach was to update these columns in each iteration of the recursion; although that code was slightly more digestible than the version below, it came at the expense of performance.

#"Add Column - HierarchyPath" = Table.AddColumn(#"Generate Hierarchy",
      "HierarchyPath"
      ,each Text.Combine(
        List.Transform(
          Record.FieldValues(
            Record.SelectFields(_,
              List.Select(Table.ColumnNames(#"Generate Hierarchy")
                ,each Text.StartsWith(_,childKeyColumnIdentifier) 
                  or Text.StartsWith(_,parentKeyColumnIdentifier)
              )
            )
          )
          ,each Text.From(_)
        )
        ,"|"
      )
      ,type text
    ),
    #"Add Column - HierarchyNodeID" = Table.AddColumn(#"Add Column - HierarchyPath",
      "HierarchyNodeID"
      ,each List.Last(Text.Split([HierarchyPath],"|"))
      ,type text
    ),
    #"Add Column - HierarchyLevel" = Table.AddColumn(#"Add Column - HierarchyNodeID",
      "HierarchyLevel"
      ,each List.Count(Text.Split([HierarchyPath],"|"))
      ,Int64.Type
    ),
    #"Add Column - IsLeafLevel" = Table.AddColumn(#"Add Column - HierarchyLevel",
      "IsLeafLevel"
      ,each List.Contains(
          List.Transform(
            Table.Column(
              Table.NestedJoin(hierarchyTable
              ,childKeyColumnIdentifier
              ,hierarchyTable
              ,parentKeyColumnIdentifier
              ,"LEAFLEVEL.CHILDREN"
              ,JoinKind.LeftAnti
              )
              ,childKeyColumnIdentifier
            )
            ,each Text.From(_)
          )
        ,List.Last(Text.Split([HierarchyPath],"|"))
      )
      ,type logical
    )

The whole caboodle

Below you will find the full code for the function, along with some documentation towards the end. You can plug the code below straight into a blank query in Power BI and reference it from your hierarchy query to flatten it.

let 
  fFlattenHierarchy = (
    hierarchyTable as table
    ,parentKeyColumnIdentifier as text
    ,parentNameColumnIdentifier as text
    ,childKeyColumnIdentifier as text
    ,childNameColumnIdentifier as text
  ) as table =>
  let
    #"Get Root Parents" = Table.Distinct(
      Table.SelectColumns(
        Table.NestedJoin(hierarchyTable
          ,parentKeyColumnIdentifier
          ,hierarchyTable
          ,childKeyColumnIdentifier
          ,"ROOT.PARENTS"
          ,JoinKind.LeftAnti
        )
        ,{
          parentKeyColumnIdentifier
          ,parentNameColumnIdentifier
        }
      )
    ),
    #"Generate Hierarchy" = fGetNextHierarchyLevel(
      #"Get Root Parents"
      ,parentKeyColumnIdentifier
      ,1
    ),
    fGetNextHierarchyLevel = (
      parentsTable as table
      ,nextParentKeyColumnIdentifier as text
      ,hierarchyLevel as number
    ) =>
      let 
        vNextParentKey = childKeyColumnIdentifier & Text.From(hierarchyLevel),
        vNextParentName = childNameColumnIdentifier & Text.From(hierarchyLevel),
        #"Left Join - hierarchyTable (Get Children)" = Table.NestedJoin(parentsTable
          ,nextParentKeyColumnIdentifier
          ,hierarchyTable
          ,parentKeyColumnIdentifier
          ,"NODE.CHILDREN"
          ,JoinKind.LeftOuter
        ),
        #"Expand Column - NODE.CHILDREN" = Table.ExpandTableColumn(#"Left Join - hierarchyTable (Get Children)"
          ,"NODE.CHILDREN"
          ,{
              childKeyColumnIdentifier
              ,childNameColumnIdentifier
          },{
              vNextParentKey
              ,vNextParentName
          }
        ),
        #"Filter Rows - Parents with Children" = Table.SelectRows(#"Expand Column - NODE.CHILDREN"
          ,each Record.Field(_,vNextParentKey) <> null 
              and Record.Field(_,vNextParentKey) <> Record.Field(_,nextParentKeyColumnIdentifier)
        ),
        #"Generate Next Hierarchy Level" = if Table.IsEmpty(#"Filter Rows - Parents with Children")
          then parentsTable 
          else Table.Combine(
            {
              parentsTable
              ,@fGetNextHierarchyLevel(
                #"Filter Rows - Parents with Children"
                ,vNextParentKey
                ,hierarchyLevel + 1
              )
            }
        )
      in 
        #"Generate Next Hierarchy Level",
    #"Add Column - HierarchyPath" = Table.AddColumn(#"Generate Hierarchy",
      "HierarchyPath"
      ,each Text.Combine(
        List.Transform(
          Record.FieldValues(
            Record.SelectFields(_,
              List.Select(Table.ColumnNames(#"Generate Hierarchy")
                ,each Text.StartsWith(_,childKeyColumnIdentifier) 
                  or Text.StartsWith(_,parentKeyColumnIdentifier)
              )
            )
          )
          ,each Text.From(_)
        )
        ,"|"
      )
      ,type text
    ),
    #"Add Column - HierarchyNodeID" = Table.AddColumn(#"Add Column - HierarchyPath",
      "HierarchyNodeID"
      ,each List.Last(Text.Split([HierarchyPath],"|"))
      ,type text
    ),
    #"Add Column - HierarchyLevel" = Table.AddColumn(#"Add Column - HierarchyNodeID",
      "HierarchyLevel"
      ,each List.Count(Text.Split([HierarchyPath],"|"))
      ,Int64.Type
    ),
    #"Add Column - IsLeafLevel" = Table.AddColumn(#"Add Column - HierarchyLevel",
      "IsLeafLevel"
      ,each List.Contains(
          List.Transform(
            Table.Column(
              Table.NestedJoin(hierarchyTable
              ,childKeyColumnIdentifier
              ,hierarchyTable
              ,parentKeyColumnIdentifier
              ,"LEAFLEVEL.CHILDREN"
              ,JoinKind.LeftAnti
              )
              ,childKeyColumnIdentifier
            )
            ,each Text.From(_)
          )
        ,List.Last(Text.Split([HierarchyPath],"|"))
      )
      ,type logical
    )
  in
    #"Add Column - IsLeafLevel",
  //Documentation
  fFlattenHierarchyType = type function (
    hierarchyTable as (type table meta [
      Documentation.FieldCaption = "Hierarchy"
      ,Documentation.LongDescription = "A table containing a parent-child hierarchy"
      ]
    )
    ,parentKeyColumnIdentifier as (type text meta [
      Documentation.FieldCaption = "Parent Key Column Identifier"
      ,Documentation.LongDescription = "The name of the column used to identify the key of the parent node in the hierarchy"
      ,Documentation.SampleValues = { "ParentID" }
      ]
    )
    ,parentNameColumnIdentifier as (type text meta [
      Documentation.FieldCaption = "Parent Name Column Identifier"
      ,Documentation.LongDescription = "The name of the column used to identify the name of the parent node in the hierarchy"
      ,Documentation.SampleValues = { "ParentName" }
      ]
    )
    ,childKeyColumnIdentifier as (type text meta [
      Documentation.FieldCaption = "Child Key Column Identifier"
      ,Documentation.LongDescription = "The name of the column used to identify the key of the child node in the hierarchy"
      ,Documentation.SampleValues = { "ChildID" }
      ]
    )
    ,childNameColumnIdentifier as (type text meta [
      Documentation.FieldCaption = "Child Name Column Identifier"
      ,Documentation.LongDescription = "The name of the column used to identify the name of the child node in the hierarchy"
      ,Documentation.SampleValues = { "ChildName" }
      ]
    )
  ) as table meta [
    Documentation.Name = "fFlattenHierarchy"
    ,Documentation.LongDescription = "Returns a flattened hierarchy table from a parent-child hierarchy table input."
      & "The number of columns returned is based on the depth of the hierarchy. Each child node will be prefixed"
      & "with the value specified for the childNameColumnIdentifier parameter"
    ,Documentation.Examples = {
      [
      Description = "Returns a flattened hierarchy table from a parent-child hierarchy table"
      ,Code = "fFlattenHierarchy(barksdaleOrganisation, ""ParentNodeID"", ""ParentNodeName"", ""ChildNodeID"", ""ChildNodeName"")"
      ,Result = "{100,2,3,51,62,""Stringer"",""Shamrock"",""Slim Charles"",""Bodie"",""Pudding"",5,""100|2|3|51|62"",TRUE,62}"
        & ",{100,2,3,51,""Stringer"",""Shamrock"",""Slim Charles"",""Bodie"",4,""100|2|3|51"",FALSE,51}"
      ]
    }
  ]
in
  Value.ReplaceType(fFlattenHierarchy, fFlattenHierarchyType)
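To use the function, paste the full code above into a blank query (named fFlattenHierarchy) and invoke it from a new query. As a hypothetical example, assuming your hierarchy sits in a query named Hierarchy with the sample columns shown earlier, the invoking query would look something like this:

let
  Source = fFlattenHierarchy(
    Hierarchy
    ,"ParentNodeID"
    ,"ParentNodeName"
    ,"ChildNodeID"
    ,"ChildNodeName"
  )
in
  Source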

Final Thoughts

It looks fairly verbose, but in reality it’s just 10 steps to flatten a hierarchy and bolt on the metadata columns; you could probably blame my code formatting for thinking otherwise. Anyone who is used to seeing common table expressions in SQL will hopefully find the logic defined in the function familiar.

In terms of performance, I’ve not pitted this against other methods, nor done any formal performance testing, but I’ve executed it against a ragged hierarchy with 11 levels spanning several thousand rows and it spat out results more or less instantaneously.

By Christo Olivier

Automate Your Power BI Dataset Refresh with Python

Introduction

Most people using Power BI normally do so with Microsoft technology at the core of their business and IT operations. Power BI is rapidly adding connectors for non-Microsoft technologies and new capabilities on a monthly basis. The combination of new sources, excellent visualisation and modelling, and a low price point is leading to it being used with technologies other than the Microsoft data platform.

This blog is the result of one such project. I am currently using Power BI as the main reporting tool on a Google Cloud Platform (GCP) data project. There are lots of interesting topics to discuss given the technologies being used. However, this post is going to focus on programmatically refreshing your Power BI datasets using Python. “Why Python?” I hear you say. Well, the workflow tool used on GCP is none other than Apache Airflow, implemented as Cloud Composer. Apache Airflow is written in Python and you create all of your workflows using Python.

When I looked at the Power BI REST API documentation, all of the examples were in C#. After a lot of reading and experimenting (aka hitting my head against my desk), I had the process running the way I wanted. This post is my attempt at creating the document I wish existed when I started.

The main steps

It is important to understand the main steps involved in this process before we get into the detail. Having this high-level process clearly defined was one of the things missing from the information online.

  1. The first step is to create an App registration for your application. This will provide you with a client id for your app which is needed for authentication later on.
  2. Next you will grant your App the permissions it requested in Azure AD.
  3. Using the information from the previous steps, you acquire an authentication token from Azure AD in your Python code.
  4. Finally, you use the authentication token to call the Power BI REST API.

Registering your App

The first and most important part of this entire process is to create a Power BI app registration. There are multiple ways of doing this and this video from Guy in a Cube will give you all of the information you need. No matter how you choose to do your app registration there are three main things you need to know.

1. The Power BI REST API only supports delegated permissions. That means that you need to run any calls to the REST API in the context of a user. For unattended applications, such as our data pipeline step, you need to register your app as a Native app; you only receive a client id when you register it as a Native app. Server-side Web apps receive both a client id and a client secret, but this is the wrong type of app for our use case. When you authenticate from your code you will need the client id together with the username and password of the account that has delegated these permissions to the app.

2. You need to ensure you select the correct permissions when registering your app. For our purposes we need access to Read and write all datasets. As always, take the approach of providing the minimum permissions needed. You can always add more permissions later in the Azure Portal.

3. This brings us to the most overlooked yet important point, which is granting permissions to the app in the Azure Portal. You need to log into the Azure Portal with the account that will be delegating the permissions to the app. This is the account whose username and password you will pass, along with the client id, to authenticate against Azure AD. If you do not perform this step you will end up with authentication errors. (You might get an authentication token when you authorise, but it won’t contain the correct scope.)

Python and the Power BI REST API

Interacting with the Power BI REST API requires a two-step process.

1. Acquiring an access token from Azure AD by supplying your client id, username and password

2. Calling the desired REST API using the access token received in step 1.

There is a choice in how you can perform step 1. You can choose to either use the ADAL Python library or pure REST calls to obtain the access token.

Before we continue, a word of caution. In the examples provided below we have the client id, username and password entered directly in the code. This is for demonstration purposes only, I cannot stress this enough. For production solutions you should follow the credential management best practice of your platform, one option for which is sketched below.
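As a minimal sketch, assuming you can set environment variables for the process (the variable names here are hypothetical):

import os

# Hypothetical variable names; populate these from your platform's
# secret store rather than hardcoding credentials in source.
client_id = os.environ['PBI_CLIENT_ID']
username = os.environ['PBI_USERNAME']
password = os.environ['PBI_PASSWORD']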

ADAL Library

The adal library for Python is the official Microsoft Azure Active Directory authentication library. It provides you with everything you need to authenticate against Azure AD using Python. Below is an example of the code you will use to authenticate and get your access token. Keep in mind that we have to pass the username and password along with the Client ID. No other way of authenticating will give us a token with the scopes we need to call the Power BI REST API successfully.

import adal

authority_url = 'https://login.windows.net/common'
resource_url = 'https://analysis.windows.net/powerbi/api'

client_id = '<INSERT CLIENT ID>'
username = '<INSERT USERNAME>'
password = '<INSERT PASSWORD>'

context = adal.AuthenticationContext(authority=authority_url,
                                     validate_authority=True,
                                     api_version=None)

token = context.acquire_token_with_username_password(resource=resource_url,
                                                     client_id=client_id,
                                                     username=username,
                                                     password=password)

access_token = token.get('accessToken')

As you can see, the adal library makes it extremely easy to authenticate against Azure AD and get the token you need.

REST

An alternative to the ADAL library is making plain REST calls to obtain the token. Below is an example of the code you would use with the requests library to make your REST calls.

import requests

url = 'https://login.microsoftonline.com/common/oauth2/token'
data = {
        'grant_type': 'password',
        'scope': 'https://api.powerbi.com',
        'resource': 'https://analysis.windows.net/powerbi/api',
        'client_id': '<INSERT CLIENT ID>',
        'username': '<INSERT USERNAME>',
        'password': '<INSERT PASSWORD>'
       }

r = requests.post(url, data=data)
access_token = r.json().get('access_token')

Refreshing the Power BI Dataset

So now that we have our access token, we can move on to the next step, which is refreshing our dataset in Power BI. In order to do this, you will need one or two keys, depending on where your dataset is located.

If your dataset is in the workspace of the account under which your app will be running, the “My Workspace” of that account, then you only need the dataset key of your dataset. If, however, your dataset is located in an app workspace, you will need both the group id and the dataset key.

You get these values from the URL in your browser when you navigate to the settings of the dataset; the endpoint shape for each scenario is sketched below.

Once you have these keys you are ready to construct your API call to refresh your dataset. The Power BI REST API documentation shows the two different API calls.
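As a rough sketch, following the URL pattern used in the full example further down, the two calls look like this (the placeholder values are the keys described above):

# Dataset in "My Workspace" - only the dataset key is required:
dataset_key = '<INSERT DATASET KEY>'
my_workspace_url = f'https://api.powerbi.com/v1.0/myorg/datasets/{dataset_key}/refreshes'

# Dataset in an app workspace - both the group id and the dataset key are required:
group_id = '<INSERT GROUP ID>'
app_workspace_url = f'https://api.powerbi.com/v1.0/myorg/groups/{group_id}/datasets/{dataset_key}/refreshes'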

Below is an example of the code you would use to refresh a dataset in an App Workspace.

import adal
import requests

authority_url = 'https://login.windows.net/common'
resource_url = 'https://analysis.windows.net/powerbi/api'

client_id = '<INSERT CLIENT ID>'
username = '<INSERT USERNAME>'
password = '<INSERT PASSWORD>'

context = adal.AuthenticationContext(authority=authority_url,
                                     validate_authority=True,
                                     api_version=None)

token = context.acquire_token_with_username_password(resource=resource_url,
                                                     client_id=client_id,
                                                     username=username,
                                                     password=password)

access_token = token.get('accessToken')

refresh_url = 'https://api.powerbi.com/v1.0/myorg/groups/<INSERT GROUP ID>/datasets/<INSERT DATASET KEY>/refreshes'

header = {'Authorization': f'Bearer {access_token}'}

r = requests.post(url=refresh_url, headers=header)

r.raise_for_status()
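One thing worth noting: the refresh call is asynchronous, so a successful response means the refresh was accepted, not that it has completed. If you want to confirm the outcome, the same refreshes endpoint supports GET requests for refresh history. A minimal sketch, reusing refresh_url and header from the example above:

import time

# Poll the refresh history until the most recent refresh leaves the
# in-progress ('Unknown') state. $top=1 returns only the latest refresh.
while True:
    history = requests.get(url=refresh_url + '?$top=1', headers=header)
    history.raise_for_status()
    status = history.json()['value'][0]['status']
    if status != 'Unknown':
        print(f'Refresh finished with status: {status}')
        break
    time.sleep(30)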

Conclusion

As you can see, it is extremely easy to refresh your datasets in Power BI using Python and the Power BI REST API. The most common mistakes people make, which account for the majority of forum posts online, are:

  1. Registering their app as the wrong type, a server-side Web app, instead of the correct type, which is a Native app.
  2. Not granting the app permissions in the Azure Portal using the account under whose context the app will run.
  3. Trying to acquire the Azure AD token using client credentials (client id and client secret) instead of acquiring it with a username and password.

Get these steps right and you will have no problem automating your Power BI tasks with Python or any other programming language you choose to use.

By James Pretorius

Alias Azure Analysis Services using Proxies in Azure Functions

Introduction

This post details how to alias Azure Analysis Services using proxies in Azure Functions: a cost-effective, flexible and codeless solution for managing link:// protocol redirects for Azure Analysis Services.

Let’s have a quick recap of the aliasing functionality in Azure Analysis Services before we dive into the implementation detail for this solution.

Connecting to Azure Analysis Services from Power BI Desktop, Excel and other client applications requires end users to specify the Analysis Services server name. For example, when connecting to a server called myanalysisservices, in the UK South region, you would use the address asazure://uksouth.asazure.windows.net/myanalysisservices. As you can see, it’s a fairly unwieldy and hard-to-remember approach for connecting to the server from client tools.

An alternative approach is to use a shorter server alias, defined using the link:// protocol, e.g. link://<myfriendlyname>. The endpoint defined by the shorter server alias simply returns the real Analysis Services server name, allowing client tools to connect. A shorter server alias will (amongst other benefits):

  • Provide end users with a friendly name for connecting to the Azure Analysis Services server
  • Ease the migration of tabular models between servers by removing any impact on end users and the need for large amounts of manual updates to client tool content post-migration, e.g. changing the data source in all Power BI reports and Excel connected workbooks

Any HTTPS endpoint that returns a valid Azure Analysis Services server name can provide the aliasing capability. The endpoint must support HTTPS over port 443 and the port must not be specified in the URI.

Additional information on aliasing Azure Analysis Services can be found in the following Microsoft documentation: https://docs.microsoft.com/en-us/azure/analysis-services/analysis-services-server-alias.

Implementation Overview

Components

Implement aliasing for Azure Analysis Services by deploying and configuring the following components:

  • CNAME DNS record – Provides your end users with a friendly server name for Analysis Services
  • Azure Function App with proxy capabilities – Passes the Analysis Services server name to the client tools

The CNAME destination will be configured to point to the Azure Function App. The Azure Function App will be configured with a proxy entry to serve up the connection information for the Azure Analysis Services server.

Alias multiple servers using one CNAME record by configuring the destination Azure Function App with multiple proxy entries. Each of these proxy entries must be configured with a different route template, e.g. link://<myCNAME>/<myProxyEntryRouteTemplate>. Please see the implementation detail below for additional information.

Implementation steps

Deploy Azure Analysis Services aliasing by following the steps below (covered in detail later in this post):

  1. Provision an Azure Function App
  2. Create a CNAME record for the Azure Function App
  3. Add the CNAME record to the Azure Function App
  4. Create proxy entries for each Azure Analysis Services server

Costs

1 million executions per month and 400K GB-s worth of resource consumption are provided free as part of the Azure Functions Consumption pricing plan. Aliasing provided by the proxy service should fall well within the minimum billed execution time (100 ms) and memory consumption (128 MB) thresholds of this plan: at those minimums, 1 million executions works out at 1,000,000 × 0.1 s × 0.125 GB = 12,500 GB-s of resource consumption, comfortably inside the free grant. After this, charges are £0.15 per 1 million executions (again assuming the minimum resource consumption of 100 ms runtime at 128 MB memory).

There are alternative pricing plans available, detailed information on Azure Functions pricing can be found here: https://azure.microsoft.com/en-gb/pricing/details/functions/

Implementation Detail

Provision an Azure Function App

  1. Login to https://portal.azure.com
  2. Click on Create a resource
  3. Search for and select Function App
  4. Click on Create
  5. Complete the necessary fields on the new Function App blade to create a new Azure Function App.

Take note of the app name for the Azure Function App as you’ll need to use this when creating the CNAME record.

Please note that the name/URL of the Azure Function App is not the HTTPS endpoint which will be used as the alias by your end users. Feel free to name the Azure Function App in accordance with your naming conventions, as the CNAME record will provide the friendly name.

It is possible to configure delegated administration for managing the Azure Function App proxy capability by assigning the Website Contributor role on the Azure Function App access control list.

Create a new CNAME record for the Azure Function App

Create a new CNAME record with your domain service provider and set the destination for the record to the URL of the newly created Function App, e.g. myaliashandler.azurewebsites.net.

Add the CNAME record to the Azure Function App

  1. Login to https://portal.azure.com
  2. Open the Azure Functions blade
  3. Load the blade for the newly created Azure Function App resource
  4. Click on Platform Features > Custom Domains
  5. Click on Add Hostname
  6. Add the new hostname
  7. Click on Validate
  8. Once validated, click on Add Hostname
  9. Add the SSL binding by clicking on Add Binding next to the new hostname
  10. Select the relevant private certificate and click on Add Binding

Upload a private certificate to manage the SSL binding for the new hostname

  1. Load the blade for the newly created Azure Function App resource
  2. Click on Platform Features > SSL
  3. Choose the menu option, Private Certificate > Upload Certificate to upload your private certificate
  4. Load the blade for the newly created Azure Function App resource
  5. Click on Platform Features > Custom Domains
  6. Add the SSL Binding by clicking on Add Binding next to the new hostname
  7. Select the relevant private certificate and click on Add Binding

Create an alias entry for Azure Analysis Services on the Azure Function App

  1. Login to https://portal.azure.com
  2. Open the Function Apps blade
  3. Click on the newly created Azure Function App resource
  4. Select the + symbol next to Proxies and complete the new proxy entry as follows:
     • Name: a meaningful name for the proxy
     • Route template: the path which will invoke this proxy entry
     • Allowed HTTP methods: GET
     • Under Response Override, Status code: 200 and Status message: accept
     • Under Headers, Content-Type: text/plain
     • Body: the address of the Azure Analysis Services server, e.g. asazure://<region>.asazure.windows.net/<server>
  5. On completion, click Create

You’ve now successfully configured an alias for your Azure Analysis Services server.

Testing

Azure Function App testing can be carried out in a web browser by connecting to https://<myFunctionApp>/<myRouteTemplate>, e.g. https://myaliashandler.azurewebsites.net/finance. The fully qualified Analysis Services server name should appear in the web browser.

Further testing can be conducted using the CNAME entry: in your web browser, try connecting to https://<myCNAME.myDomain.com>/<myRouteTemplate>, e.g. https://data.mydomain.com/finance
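If you’d rather script these checks than use a browser, a quick sanity test with Python’s requests library (using the same hypothetical URLs as above) might look like this:

import requests

# Each alias endpoint should return HTTP 200 with the fully qualified
# Analysis Services server name as a plain-text body,
# e.g. asazure://<region>.asazure.windows.net/<server>
for url in ['https://myaliashandler.azurewebsites.net/finance',
            'https://data.mydomain.com/finance']:
    r = requests.get(url)
    print(url, r.status_code, r.text)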

Finally, from within a client tool which supports the link protocol, connect using link://myCNAME.myDomain.com/<myRouteTemplate>, e.g. link://data.mydomain.com/finance.

Further Reading

https://docs.microsoft.com/en-us/azure/azure-functions/functions-proxies

https://docs.microsoft.com/en-us/azure/analysis-services/analysis-services-server-alias