Introduction

Hello friend,

My name is Annie Xu. I am a IT girl who loves fitness. I love to share and love to help people.

Since 2013, I started my journey creating Business Intelligent (BI) solutions and found myself very passionate about this area. Inside the circle of technologies, knowledge shared by others is extremely helpful to me.

That’s why I started my very first blog site to share my knowledge and tips during my journey of  learning Business Intelligence.

Hopefully, you find some useful information and feel free to send me your comments, thoughts, or questions at anniexu1990@gmail.com

post

Featured post

You may need a table variable for IRR and NPV for DAX

I finished my Fundamental Finance Accounting course last week. I learned 2 awesome functions which helps which which projects to undertaken. Frist is NPV net present value and the second is IRR(Internal Rate of Return). I found those two are so powerful to use helping to decide which investment to choice so I want to implement in DAX. Fortunately, in DAX there are two functions XNPV and XIRR which I can choose to use. I found that XNPV and XIRR are also in excel and would be a replacement of IRR and NPV.  They have slightly different calculation details but the purpose are the same. You can find http://www.differencebetween.net/business/finance-business-2/difference-between-npv-and-xnpv/ and https://corporatefinanceinstitute.com/resources/knowledge/modeling/xirr-vs-irr-excel-financial-model/  for more details.

The Excel XNPV function is a financial function that calculates the net present value (NPV) of an investment using a discount rate and a series of cash flows that occur at irregular intervals. Calculate net present value for irregular cash flows. Net present value. =XNPV (rate, values, dates)

The Excek XIRR(Internal Rate of Return) is the discount rate which sets the Net Present Value (XNPV) of all future cash flow of an investment to zero.  If the NPV of an investment is zero it doesn’t mean it’s a good or bad investment, it just means you will earn the IRR (discount rate) as your rate of return. =XIRR(values,dates,guess)

The challenge in DAX, I found, is that , we may not have the date field. For example, we may just calculate a intial investiment and an estimate assuming paying amount every year or every month in the future. Something like this:

Assumption - IRR NPV

So in DAX, we can create table variable in the Measure to leverage XIRR and XNVP.

XIRR =
VAR cashflowtable =
    UNION (
        GENERATE (
            GENERATESERIES ( YEAR ( NOW () )YEAR ( NOW () ) ),
            VAR inc = [Value]
            RETURN
                ROW (
                    "date"DATE ( inc11 ),
                    "Values"SUM ( 'Considered Project'[Initial Inventestment] )
                )
        ),
        GENERATE (
            GENERATESERIES ( YEAR ( NOW () ) + 1YEAR ( NOW () ) + [Investment Years] ),
            VAR inc = [Value]
            RETURN
                ROW (
                    "date"DATE ( inc11 ),
                    "Values"SUM ( 'Considered Project'[Annual Cash flow] )
                )
        ),
        GENERATE (
            GENERATESERIES (
                YEAR ( NOW () ) + [Investment Years],
                YEAR ( NOW () ) + [Investment Years]
            ),
            VAR inc = [Value]
            RETURN
                ROW (
                    "date"DATE ( inc11 ),
                    "Values"SUM ( 'Considered Project'[Exit Value] )
                )
        )
    )
RETURN
    XIRR ( cashflowtable, [Values], [date] )
XNVP =
VAR cashflowtable =
UNION (
GENERATE (
GENERATESERIES ( YEAR ( NOW () ), YEAR ( NOW () ) ),
VAR inc = [Value]
RETURN
ROW (
"date", DATE ( inc, 1, 1 ),
"Values", SUM ( 'Considered Project'[Initial Inventestment] )
)
),
GENERATE (
GENERATESERIES ( YEAR ( NOW () ) + 1, YEAR ( NOW () ) + [Investment Years] ),
VAR inc = [Value]
RETURN
ROW (
"date", DATE ( inc, 1, 1 ),
"Values", SUM ( 'Considered Project'[Annual Cash flow] )
)
),
GENERATE (
GENERATESERIES (
YEAR ( NOW () ) + [Investment Years],
YEAR ( NOW () ) + [Investment Years]
),
VAR inc = [Value]
RETURN
ROW (
"date", DATE ( inc, 1, 1 ),
"Values", SUM ( 'Considered Project'[Exit Value] )
)
)
)
RETURN
XNPV ( cashflowtable, [Values], [date], [Interest Rate] )

Hopefully, you find this helpful.

Thanks,

your friend, Annie

Migrating Local Power Query to PowerBI Data Flow and link it to Azure Data Lake Storage account

Within Microsoft lastest PowerBI update, the one exites me most is that we can link PowerBI data flow to Azure data lake storage account.

Benefits are:

  1. When you have a PBIX imported mode reaching more than 1 GB and you are not Primier memeber you can not publish to PowerBI.com. However, with DataFlow in PowerBi you can save upto 10 GB data.
  2. Your workspace members are able to reuse the same data flow as souce of their PowerBI file.
  3. When liked to Azure Data Lake Sotrage Gen 2 you can use data got from PowerBI Dataflows as data source across your Azure applications.14a64eeb-b1a7-4a12-889f-8b573c98ef20[1]

To try this, you need to have the following things

You have PowerBI-Pro license and above.

You are an admin to your Azure portal. (Or you are good friend of your admin ^_^)

There are few things I found out when I did my experiment and I think you might face are:

  1. You can use blank query to copy M Codes from your Power Queries
  2. If you have a table function to transfer and you don’t have primier account, you can still do so by select “do not load”
  3. To configure Azure Data Lake Gen 2 connect to PowerBI, follow this document from microsoft. https://docs.microsoft.com/en-us/power-bi/service-dataflows-connect-azure-data-lake-storage-gen2
  4. When you configured everything right according to the above microsoft document. You may still have issue like I did. Please follow the following communication string  to see the lastest udpate on the PowerBI community discussing. https://community.powerbi.com/t5/Service/Power-BI-Dataflow-to-Azure-Data-Lake-Gen2-Setup/td-p/583564

Latest update, Microsoft team has fixed the above. Now I have tested and works fine.

Thanks,

Your friend, Annie

How to calculate PMT and CUMPRINC in DAX

PMT is a function in excel for the periodic payment for a loan.

In excel, The Syntax is PMT(rate, nper, pv, [fv], [type])

  • Rate    Required. The interest rate for the loan.
  • Nper    Required. The total number of payments for the loan.
  • Pv    Required. The present value, or the total amount that a series of future payments is worth now; also known as the principal.

Yet, in DAX, there is no built in function yet for PMT. To calculate this we need to reapply for its mathmatic logic.

So, you can use

DAXVersionPMT = ((Pv*Rate))/(1-((1+Rate))^(-1*Nper))))*(-1) 
// Remind: if you are trying to calculate monthly payment but rate is annual rate, then you need to convert to monthly rate by devide to 12
CUMPRINC is the accumuated principle amount for a loan 

In excel, the syntax is CUMPRINC(rate, nper, pv, start_period, end_period, type)

The CUMPRINC function syntax has the following arguments:

  • Rate    Required. The interest rate.
  • Nper    Required. The total number of payment periods.
  • Pv    Required. The present value.
  • Start_period    Required. The first period in the calculation. Payment periods are numbered beginning with 1.
  • End_period    Required. The last period in the calculation.
  • Type    Required. The timing of the payment.

In DAX right now, there is no built in function fo rCUMPRINC yet.
So, you can use the follow DAX
Accumulated Principle = var p=[PV]

var i=[Rate]

var n=[Nper]

var b=[Start_period]

var e=[End_period]

var m=[DAXVersionPMT]*(-1)

var q=12 // if rate is annual rate and you calculated you PMT as monthly period, then you need to convert to monthly rate by devide to 12, if the rate is for monthly rate, then you just need to use q=1

Return ((p-m*q/i)*(1+(i/q))^(e)+m*q/i)-((p-m*q/i)*(1+(i/q))^(b-1)+m*q/i)

Thanks,
Your friend,Annie

An easy way to compare all tables’ DDL between two SQL environments

To continue my story about data warehouse migration from UAT to PROD environment.

The DDL difference between the same source loading table from two environemnts will cause my data warehouse load fail, so I need to find an easy to compare the two for all my sourcing tables . There are a lot of good tools with free trail out there to choose. However, there are some regulation from my client side to installing third party tool so I need to create a query to help me to do so instead.

Inspired by an answer from this Post which compares one table. I generated the following query which can loop through all of my targeted sourcing tables.

DECLARE @Table VARCHAR(100);
DECLARE @i int=1;
While @i<= (select max([RANKColumn]) from [dbo].[EvaluationTableList] ) — this is the list table which contains all the tables you want to validate
Begin
SET @Table=(select [TABLENAME] from [dbo].[EvaluationTableList] where [RANKColumn]=@i)
insert into [dbo].[EnviromentObjectDifference] — (this is the output table)
SELECT Table1.ServerName,
Table1.DBName,
Table1.SchemaName,
Table1.TableName,
Table1.ColumnName,
Table1.name DataType,
Table1.Length,
Table1.Precision,
Table1.Scale,
Table1.Is_Identity,
Table1.Is_Nullable,
Table2.ServerName as T2ServerName,
Table2.DBName as T2DBName,
Table2.SchemaName as T2SchemaName,
Table2.TableName as T2TableName,
Table2.ColumnName as T2ColumnName,
Table2.name as T2DataType,
Table2.Length as T2Length,
Table2.Precision as T2Precision,
Table2.Scale as T2Scale,
Table2.Is_Identity as T2Is_Identity,
Table2.Is_Nullable as T2Is_Nullable
FROM
(SELECT @Server1 ServerName,
@DB1 DbName,
SCHEMA_NAME(t.schema_id) SchemaName,
t.Name TableName,
c.Name ColumnName,
st.Name,
c.Max_Length Length,
c.Precision,
c.Scale,
c.Is_Identity,
c.Is_Nullable
FROM [ServerName01].[DatabaseNameO1].sys.tables t
INNER JOIN [ServerName01].[DatabaseNameO1].sys.columns c ON t.Object_ID = c.Object_ID
INNER JOIN sys.types st ON St.system_type_id = c.System_Type_id AND st.user_type_id = c.user_type_id
WHERE t.Name =@Table) Table1
FULL OUTER JOIN
(SELECT @Server2 ServerName,
@DB2 DbName,
SCHEMA_NAME(t.schema_id) SchemaName,
t.name TableName,
c.name ColumnName,
st.Name,
c.max_length Length,
c.Precision,
c.Scale,
c.Is_Identity,
c.Is_Nullable
FROM [ServerName02].[DatabaseNameO2].sys.tables t
INNER JOIN [ServerName02].[DatabaseNameO2].sys.columns c
ON t.Object_ID = c.Object_ID
INNER JOIN sys.types st
ON St.system_type_id = c.System_Type_id
AND st.user_type_id = c.user_type_id
WHERE t.Name = @Table) Table2
ON Table1.ColumnName = Table2.ColumnName
where Table1.ColumnName is null or Table2.ColumnName is null;
SET @i=@i+1
END

Hopefully, you find this helpful.

Thanks,

Your friend, Annie

How to use T-SQL to validate all SSIS packages at once

Recently, I need to do a data warehouse migration for a client. Since there might be some difference between the Dev environment source databases and Prod environment source databases. The migrated SSIS packages for building data warehouse might have some failures because of the changes. So the challenge is how can I validate all my DW packages (100 +) all at once.

There are a good amount of posts out there to do one package validation at a time using the Management Studio GUI (just right click on the package under SSIS Catalog)

validate SSIS package

Or using T-SQL store procedure [SSISDB].[catalog].[validate_package]. Here is the explanation of this store procedure from Microsoft.

To get the validation all at once, we can use a loop query to execute [SSISDB].[catalog].[validate_package] one by one.

Here is the code I created and would love to share.

DECLARE @i INT= 1;
DECLARE @validation_id BIGINT;
DECLARE @packageName NVARCHAR(250);
WHILE @i <= (SELECT MAX(package_id)
FROM SSISDB.catalog.packages)
BEGIN
SET @packageName = (SELECT concat(NAME, ‘.dtsx’)
FROM SSISDB.catalog.packages
WHERE package_ID = @i);
EXEC [SSISDB].[catalog].[validate_package]
@package_name = @packageName,
@folder_name = N’(the folder name which the packages are in)‘,
@project_name = N’(the project name which the packages are in)‘,
@use32bitruntime = False,
@environment_scope = S,
@reference_id = 2,   /* this is the reference id of the environment the package may use. If there is no environment reference, you can assign NULL’ */
@validation_id = @validation_id OUTPUT;
SET @i = @i + 1;
END;
After you ran the above statements, you can when use the following query to check the validation result.
SELECT  object_name,
                CASE status
                    WHEN 4
                    THEN ‘Fail’
                    WHEN 7
                    THEN ‘Success’
                END AS ‘Status’
FROM SSISDB.catalog.validations;
Thanks for viewing and I would love to have any feedbacks.
Your friend, Annie

Create Dynamics CRM Online – User Activity PowerBI Report

Couple days ago, a client of mine is asking me to build a Dynamics CRM Online user activities dashboard. In Dynamics CRM On-premises version, I know, this information resides in audit table. However, with Dynamics CRM online, I am not able to access directly to the backend database. Fortunately, there are few PowerBI content packs for Dynamics CRM online out there in market place so that I was able to read through the PBIX file and find out how PowerBI is connected to the Dynamics CRM Online data.

PowerBI is using OData as data source to connect to Dynamics CRM online web services which gives you almost all data available (There are still some important fields and tables which are in On-premises version and causing the challenge of build reports effectively) . You just need to change the highlighted piece below to your organization CRM web name.

CRM Online Audit Report snapshots

 

Odata Tables available

I did find the ‘Audit’ table available. However, there are few challenges to interpret the data. I am listing the challenges below and the steps I handled it.

  1. Initially, the ‘Audit’ table is blank.
    • This is because in CRM online you need to enable the setting of Audit Tracking. You can follow this blog to enable it.
  2. Every field is in its code or id format. And, there is no mapping table in Odata connection to map between field id and field value.
    • I have check the Microsoft provided content pack and found out those mapping table are hard coded. So I have to google the CRM Audit Entity documentation in Microsoft to get the hard coded mapping for the following fields.
      • Operation – The action that causes the audit – It will be create, delete, or update
      • Action – Actions the user can perform that cause a change
      • Audit Table fields
  3. You can expand the userID field to get the information  (such as full name, login account, address, etc.) of user who caused a change. This however, only provide information for users who make changes like ‘create’, ‘update’, and ‘delete’. For ‘access’ status, this userid only reflect as ‘SYSTEM’ as the user.
    • To get the information about the user’s name who access CRM, we need to link the ‘_objectid_value’ field from Audit table to the ‘UserId’ field from systemusers table.
  4. Missing ‘ObjectTypeCode‘ field. This is the key field is missing from this table for the Odata connection data pull. Thus, we are not able to find which entities the changes were made on.
    • Unfortunately, I have not find a solution for this unless CRM team reveal this field for the CRM Audit Entity.
    • If this field is available in the future, we can use this field in combine with ‘Attributemask’ fields to get the column names of the objects which has been changed. So that we can build further analysis report.

For now, with the information we can get, we can start to build dashboard with user login activities.

Final Result.PNG

Thanks.

PowerBI – two ways to dynamically change measures in the same visual

I love that you can always find multiple ways to solve one problem. Inspired by the Youtube video by Guys in the cube and blog reading from SQLBI.com, I found there are two ways to dynamically change measures in the same visual. The first way is more towards front end report developer using bookmark and the second way is more towards data modeler using a technique called disconnected tables.

Bookmarks method:

Bookmarks

The basic steps are:

  1. Enable Bookmarks under File ->’Options and Settings’ ->Options->Preview Features->’Bookmarks’
  2. Create one visual for example a bar chart with first measure, for example Revenue by Sales Territory
  3. and copy paste the visual 2 times. in the coped visuals replace the measure you want to replace. For example cost by sales territory.
  4. Overlay the three visuals together.
  5. Go to ‘View’ on the top tool panel and make sure ‘Bookmarks Pane’ and and ‘Selection Pane’ are checked
  6. Add bookmarks with selection of visuals to be visible associated with the bookmark you created
  7. .Bookmark setting
  8. Add the images/shapes which represent the measure you want to select on and link it to the bookmark you created.
  9. Image link

Disconnected Table method:

This method is more towards PowerBI modelers. Basically, the idea is to have a Field in a independent table (no relationship to other tables) as Slicer with your measure choice and then create a measure using SELECTEDVALUE function to have the measure dynamically switch referring measures based on the choice made on the slicer.

Step 1. Created an new table called ‘Measurechoice’ like image below.

Discounnected table

Step 2. Create a measure called ‘Selected Value’

Selected Value = SWITCH(SELECTEDVALUE(‘Measurechoice'[Index]),1,[Total Revenue],2,[Total Cost],3,[Total Revenue],BLANK())

Selected Value Measures

Step 3. Create a slicer using ‘MeasureChoice’ from the new table created.

Step 4. Create a visual use the ‘Selected Value’ as Value

Selection Result

Personally, I like the disconnected table method because I don’t need to spend time on creating images and link bookmarks. However, the bookmarks choice can help dynamically change the title of the visual and give you the ability to change the layout of your bookmark views. Now it is your choice to choose either one to fit your reporting needs.

Thanks for spending time reading this blog.

Database table partitioning – Build filegroup and partitions

Step 1

CREATE PARTITION FUNCTION PF_Fact_test (int)

AS RANGE

LEFT FOR VALUES

(20140101,20150101,20160101);

]

Step 2 

CREATE PARTITION SCHEME [PS_iGPT_FACT] AS PARTITION [PF_iGPT_FACT] TO ([iGPT2014], [iGPT2015], [iGPT2016],[iGPT2017])

Step 3

 Assign Partition Scheme and Partition Funtion to a table

USE [iGPT]

GO

BEGIN TRANSACTION

CREATE CLUSTERED INDEX [ClusteredIndex_on_PS_Fact_Test_636351103815297616] ON [dbo].[iGPTextraInfo]

([MaxDateAces])WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PS_Fact_Test]([MaxDateAces])

DROP INDEX [ClusteredIndex_on_PS_Fact_Test_636351103815297616] ON [dbo].[iGPTextraInfo]

COMMIT TRANSACTION

Step 4

Load data into the partitioned table

 –The following query returns one or more rows if the table PartitionTable is partitioned

SELECT *

FROM sys.tables AS t

JOIN sys.indexes AS i

ON t.[object_id] = i.[object_id]

AND i.[type] IN (0,1)

JOIN sys.partition_schemes ps

ON i.data_space_id = ps.data_space_id

WHERE t.name = ‘GPTClaimPendedFact’;

 

 

— The following query returns the boundary values for each partition in the PartitionTable table.

SELECT t.name AS TableName, i.name AS IndexName, p.partition_number,p.rows AS Rows, p.partition_id, i.data_space_id, f.function_id, f.type_desc, r.boundary_id, r.value AS BoundaryValue

FROM sys.tables AS t

JOIN sys.indexes AS i

ON t.object_id = i.object_id

JOIN sys.partitions AS p

ON i.object_id = p.object_id AND i.index_id = p.index_id

JOIN sys.partition_schemes AS s

ON i.data_space_id = s.data_space_id

JOIN sys.partition_functions AS f

ON s.function_id = f.function_id

LEFT JOIN sys.partition_range_values AS r

ON f.function_id = r.function_id and r.boundary_id = p.partition_number

WHERE t.name = ‘iGPTInquiryFact_WGS’ AND i.type <= 1

ORDER BY p.partition_number;

 

 

–The following query returns the name of the partitioning column for table. PartitionTable.

SELECT

t.[object_id] AS ObjectID

, t.name AS TableName

, ic.column_id AS PartitioningColumnID

, c.name AS PartitioningColumnName

FROM sys.tables AS t

JOIN sys.indexes AS i                           ON t.[object_id] = i.[object_id]  AND i.[type] <= 1 — clustered index or a heap

JOIN sys.partition_schemes AS ps  ON ps.data_space_id = i.data_space_id

JOIN sys.index_columns AS ic             ON ic.[object_id] = i.[object_id]  AND ic.index_id = i.index_id   AND ic.partition_ordinal >= 1 — because 0 = non-partitioning column

JOIN sys.columns AS c                           ON t.[object_id] = c.[object_id]   AND ic.column_id = c.column_id

WHERE t.name = ‘GPTClaimPendedFact’ ;

 

 

 

SELECT row_count              ,*

FROM sys.dm_db_partition_stats

WHERE object_id = OBJECT_ID(‘dbo.GPTClaimPendedFact’) –and row_count > 0

order by partition_number

SELECT * FROM sys.dm_db_partition_stats;

SELECT SUM(used_page_count) AS total_number_of_used_pages,

SUM (row_count) AS total_number_of_rows

FROM sys.dm_db_partition_stats

WHERE object_id=OBJECT_ID(‘dbo.GPTClaimPendedFact’)    AND (index_id=0 or index_id=1);

New DAX Functions only in SSAS 2016 and Above

SQL Server 2016 Analysis Services (SSAS)*, Power Pivot in Excel 2016, and Power BI Desktop include the following new Data Analysis Expressions (DAX) functions:

Date and Time Functions

CALENDAR Function
CALENDARAUTO Function
DATEDIFF Function

Information Functions

ISEMPTY Function
ISONORAFTER Function

Filter Functions

ADDMISSINGITEMS Function
SUBSTITUTEWITHINDEX Function

Math and Trig Functions

ACOS Function
ACOSH Function
ASIN Function
ASINH Function
ATAN Function
ATANH Function
COMBIN Function
COMBINA Function
COS Function
COSH Function
DEGREES Function
EVEN Function
EXP Function
GCD Function
ISO.CEILING Function
LCM Function
MROUND Function
ODD Function
PI Function
PRODUCT Function
PRODUCTX Function
QUOTIENT Function
RADIANS Function
SIN Function
SINH Function
SQRTPI Function
TAN Function
TANH Function

Statistical Functions

BETA.DIST Function
BETA.INV Function
CHISQ.INV Function
CHISQ.INV.RT Function
CONFIDENCE.NORM Function
CONFIDENCE.T Function
EXPON.DIST Function
GEOMEAN Function
GEOMEANX Function
MEDIAN Function
MEDIANX Function
PERCENTILE.EXC Function
PERCENTILE.INC Function
PERCENTILEX.EXC Function
PERCENTILEX.INC Function
SELECTCOLUMNS Function
XIRR Function
XNPV Function

Text Functions

CONCATENATEX Function

Other Functions

GROUPBY Function
INTERSECT Function
NATURALINNERJOIN Function
NATURALLEFTOUTERJOIN Function
SUMMARIZECOLUMNS Function
UNION Function
VAR

Blog at WordPress.com.

Up ↑