Data Transformation In Power BI – Part 4 – Complex Power Query / M Techniques

Requirements

I have project schedule data in MS Excel format. This is an enhanced version of the Excel Project template spreadsheet.

ExcelProjectSample

The core data sections that needed to be imported into Power BI is as below.

Untitled

The Setions 2 and 3 can be calculated in Power BI based on the data in Section 5 (as in Excel) but the idea here is to demonstrate how such complex structure could be transformed into a Power BI table using M (Power Query). Further, the Project Manager has the ability change information in Sections 2 and 3 based on his estimates. Therefore it is required capture these data from the source.

The above Excel data needs to be transformed into a single Power BI table format – with data in sections 1, 2, 3 and 4 tranformed into columns and be part of the data in section5 – as below.

Capture

There will be different project schedule files for different projects. All the project shcedule files will follow the same template.

These multiple project schedule files will be stored in a common folder. All the project schedule files in the folder should be imported into Power BI to create a Program Plan.

 

Required Final Output

Program Plan

Untitled

There will be other reports for individual projects and Resource Workload and Planning reports.

Untitled1

 

Data Transformation in M / Power Query

Step 1 – Setup file import from a folder into Power BI

To read about how to import files from a folder into Power BI check the article Filter different types of files from a folder and merge into a single table in Power BI.

When setting up Power BI to import files from a folder, I usually use a Sample/Template file to do all the data transformations and then exclude the Sample/Template file from the combined data table.

Untitled

Untitled

Untitled1.jpg

Step 2 – M / Power Query Transformations

In Power Query Edit window you should see new Query Groups (folder icon).

Capture

We need to do the required transformation on the table “Transform Sample File from Projects Data” under “Sample Query [2]” query group. Do not edit the function file (fx icon). If you need to edit the M query, click the Advance Editor of the “Transform Sample File from Projects Data” and do all edits there. This will be automatically reflected in the function file. These data tranformations are done to the file(s) BEFORE combining them together.

In other words, Power BI will do all the data transformation steps defined for “Transform Sample File from Projects Data” table for EACH file in the folder and then combine the transformed files together as “Projects Data” table under “Other Queries [1]” query group.

Untitled2

Note: If you need to do any data transformation on the combined data (AFTER all files are combined together) then you need to select the “Projects Data” combined table under “Other Queries [1]” query group and do the transformations for that table.

 

Final M/ Power Query Code

Download the M code text file here.

let
    //Source table
    Source = Excel.Workbook(#”Sample File Parameter1″, null, true),
    #”Project Planner_Sheet” = Source{[Item=”Project Planner”,Kind=”Sheet”]}[Data],
    #”Removed Columns1-1″ = Table.RemoveColumns(#”Project Planner_Sheet”,{“Column11”, “Column12”, “Column13”, “Column14”, “Column19”, “Column20”, “Column21”, “Column22”, “Column27”, “Column28”, “Column29”, “Column30”, “Column31”, “Column32”, “Column33”, “Column34”, “Column35”, “Column36”, “Column37”, “Column38”, “Column39”, “Column40”, “Column41”, “Column42”}),
    #”Removed Other Columns1-2″ = Table.SelectColumns(#”Removed Columns1-1″,{“Column1”, “Column2”, “Column3”, “Column4”, “Column5”, “Column6”, “Column7”, “Column8”, “Column9”, “Column10”, “Column15”, “Column16”, “Column17”, “Column18”, “Column23”, “Column24”, “Column25”, “Column26”, “Column43”, “Column44”, “Column45”, “Column46”}),
    #”Added Index1-3″ = Table.AddIndexColumn(#”Removed Other Columns1-2″, “Index-1”, 1, 1),

    //Transformation to get left side headers – Table 01
    #”Kept First Rows2-1″ = Table.FirstN(#”Removed Other Columns1-2″,5),
    #”Removed Columns2-2″ = Table.RemoveColumns(#”Kept First Rows2-1″,{“Column1”, “Column3”, “Column5”, “Column6”, “Column7”, “Column8”, “Column9”, “Column10”, “Column15”, “Column16”, “Column17”, “Column18”, “Column23”, “Column24”, “Column25”, “Column26”, “Column43”, “Column44”, “Column45”, “Column46”}),
    #”Pivoted Column2-3″ = Table.Pivot(#”Removed Columns2-2″, List.Distinct(#”Removed Columns2-2″[Column2]), “Column2”, “Column4”),
    #”Renamed Columns2-4″ = Table.RenameColumns(#”Pivoted Column2-3″,{{“Report Date :”, “Report Date”}, {“Start Date     :”, “Start Date”}, {“End Date      :”, “End Date”}, {“Duration       :”, “Duration”}, {“Effort           :”, “Effort”}}),
    #”Added Index2-5″ = Table.AddIndexColumn(#”Renamed Columns2-4″, “Index-2”, 1, 1),

    //Transformation to get right side headers – Table 02
    #”Removed Columns3-1″ = Table.RemoveColumns(#”Project Planner_Sheet”,{“Column1”, “Column2”, “Column3”, “Column4”, “Column5”, “Column6”, “Column7”, “Column9”, “Column10”, “Column11”, “Column12”, “Column13”, “Column14”, “Column15”, “Column16”, “Column17”, “Column18”, “Column19”, “Column20”, “Column21”, “Column22”, “Column23”}),
    #”Removed Other Columns3-2″ = Table.SelectColumns(#”Removed Columns3-1″,{“Column8”, “Column24”}),
    #”Kept First Rows3-3″ = Table.FirstN(#”Removed Other Columns3-2″,5),
    #”Filtered Rows3-4″ = Table.SelectRows(#”Kept First Rows3-3″, each ([Column24] <> null)),
    #”Pivoted Column3-5″ = Table.Pivot(#”Filtered Rows3-4″, List.Distinct(#”Filtered Rows3-4″[Column8]), “Column8”, “Column24”),
    #”Renamed Columns3-6″ = Table.RenameColumns(#”Pivoted Column3-5″,{{“Schedule Health   :”, “Schedule Health”}, {“Organisation        : “, “Organisation”}, {“Project Manager  :”, “Project Manager”}}),
    #”Added Index3-7″ = Table.AddIndexColumn(#”Renamed Columns3-6″, “Index-3”, 1, 1),

    // Merge Table 01 to Table 02 = [Table 04]
    #”Merged Queries4-1″ = Table.NestedJoin(#”Added Index2-5″,{“Index-2″},#”Added Index3-7”,{“Index-3″},”Added Index2-5”,JoinKind.LeftOuter),
    #”Expanded Added Index4-2″ = Table.ExpandTableColumn(#”Merged Queries4-1″, “Added Index2-5”, {“Schedule Health”, “Organisation”, “Project Manager”, “Index-3”}, {“Schedule Health”, “Organisation”, “Project Manager”, “Index-3”}),


    //Transformation to get the Project Name – Table 05
    #”Removed Columns5-1″ = Table.RemoveColumns(#”Project Planner_Sheet”,{“Column1”, “Column2”, “Column3”, “Column4”, “Column5”, “Column6”}),
    #”Removed Other Columns5-2″ = Table.SelectColumns(#”Removed Columns5-1″,{“Column7”}),
    #”Kept First Rows5-3″ = Table.FirstN(#”Removed Other Columns5-2″,1),
    #”Renamed Columns5-4″ = Table.RenameColumns(#”Kept First Rows5-3″,{{“Column7”, “Project Name”}}),
    #”Added Index5-5″ = Table.AddIndexColumn(#”Renamed Columns5-4″, “Index-5”, 1, 1),


    // Merge Table 04 to Table 05 = [Table 06]
    #”Merged Queries6-1″ = Table.NestedJoin(#”Expanded Added Index4-2″,{“Index-3″},#”Added Index5-5”,{“Index-5″},”Expanded Added Index4-2”,JoinKind.LeftOuter),
    #”Expanded Expanded Added Index6-2″ = Table.ExpandTableColumn(#”Merged Queries6-1″, “Expanded Added Index4-2”, {“Project Name”, “Index-5”}, {“Project Name”, “Index-5”}),
    #”Added Index6-3″ = Table.AddIndexColumn(#”Expanded Expanded Added Index6-2″, “Index-6”, 1, 1),

    // Transformation to get Progress data = [Table 9]
    #”Removed Columns9-1″ = Table.RemoveColumns(#”Project Planner_Sheet”,{“Column1”, “Column2”, “Column3”, “Column4”, “Column5”, “Column6”}),
    #”Removed Other Columns9-2″ = Table.SelectColumns(#”Removed Columns9-1″,{“Column7”}),
    #”Kept First Rows9-3″ = Table.FirstN(#”Removed Other Columns9-2″,3),
    #”Removed Top Rows9-4″ = Table.Skip(#”Kept First Rows9-3″,1),
    #”Split Column by Delimiter9-5″ = Table.SplitColumn(#”Removed Top Rows9-4″, “Column7”, Splitter.SplitTextByEachDelimiter({“:”}, QuoteStyle.Csv, false), {“Column7.1”, “Column7.2”}),
    #”Changed Type9-6″ = Table.TransformColumnTypes(#”Split Column by Delimiter9-5″,{{“Column7.1”, type text}, {“Column7.2”, Percentage.Type}}),
    #”Pivoted Column9-7″ = Table.Pivot(#”Changed Type9-6″, List.Distinct(#”Changed Type9-6″[Column7.1]), “Column7.1”, “Column7.2”),
    #”Added Index9-8″ = Table.AddIndexColumn(#”Pivoted Column9-7″, “Index-9”, 1, 1),

    // Merge Table 06 to Table 09 = [Table 10]
    #”Merged Queries10-1″ = Table.NestedJoin(#”Added Index6-3″,{“Index-6″},#”Added Index9-8”,{“Index-9″},”Added Index6-3”,JoinKind.LeftOuter),
    #”Expanded Added Index10-2″ = Table.ExpandTableColumn(#”Merged Queries10-1″, “Added Index6-3”, {“Actual Progress      “, “Planned Progress   “, “Index-9”}, {“Actual Progress      “, “Planned Progress   “, “Index-9”}),
    #”Added Index10-3″ = Table.AddIndexColumn(#”Expanded Added Index10-2″, “Index-10”, 1, 1),


    // Remove top 6 rows to get the schedule data = Table 07
    #”Removed Top Rows7-1″ = Table.Skip(#”Added Index1-3″,6),
    #”Promoted Headers7-2″ = Table.PromoteHeaders(#”Removed Top Rows7-1″, [PromoteAllScalars=true]),
    #”Removed Columns7-3″ = Table.RemoveColumns(#”Promoted Headers7-2″,{“7”}),
    #”Added Index7-4″ = Table.AddIndexColumn(#”Removed Columns7-3″, “Index-7”, 1, 1),

    // Merge Table 06 to Table 07 = [Table 08]
    #”Merged Queries8-1″ = Table.NestedJoin(#”Added Index7-4″,{“Index-7″},#”Added Index10-3”,{“Index-10″},”Added Index7-4”,JoinKind.LeftOuter),
    #”Expanded Added Index8-2″ = Table.ExpandTableColumn(#”Merged Queries8-1″, “Added Index7-4”, {“Report Date”, “Start Date”, “End Date”, “Duration”, “Effort”, “Index-2”, “Schedule Health”, “Organisation”, “Project Manager”, “Index-3”, “Project Name”, “Index-5”, “Index-6”, “Actual Progress      “, “Planned Progress   “, “Index-9”, “Index-10”}, {“Report Date”, “Start Date”, “End Date”, “Duration”, “Effort”, “Index-2”, “Schedule Health”, “Organisation”, “Project Manager”, “Index-3”, “Project Name”, “Index-5”, “Index-6”, “Actual Progress      “, “Planned Progress   “, “Index-9”, “Index-10”}),
 
   //Fill down the header information and remove all Index columns
    #”Filled Down8-5″ = Table.FillDown(#”Expanded Added Index8-2″,{“Report Date”, “Start Date”, “End Date”, “Duration”, “Effort”, “Index-2”, “Schedule Health”, “Organisation”, “Project Manager”, “Index-3”, “Project Name”, “Index-5”, “Index-6”, “Actual Progress      “, “Planned Progress   “}),
    #”Removed Columns8-6″ = Table.RemoveColumns(#”Filled Down8-5″,{“Index-5”, “Index-6”, “Index-3”, “Index-2”, “Index-7”, “Index-9”, “Index-10”}),
    #”Added Index8-7″ = Table.AddIndexColumn(#”Removed Columns8-6″, “Index”, 1, 1)

in
    #”Added Index8-7″

 

 

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by WordPress.com.

Up ↑

%d bloggers like this: