top of page

Automated Data Dictionary via Excel

MK

Two of my previous posts are related to creating an automated data dictionary - something I believe is essential to inform users about your model and to promote a more optimal use of your model. The first post showed how to do this using DMVs & SQL. The second post showed how to do this using C# and SQL. Both are completely viable solutions. However, what does one do if they do not have access to SQL or a proper data warehouse? In this post I will share a third option which can be used to solve this scenario.


The solution


Not everyone has access to SQL, but pretty much everyone has access to Excel. Therefore, Excel seems like a good option for solving this problem. Another idea would be to query the Analysis Services instance directly within the model itself however this is not really viable as directly joining DMVs is not feasible.


When I mention Excel as the solution, let me explain what I mean. We can run a C# script that will export the data dictionary metadata to an Excel file, create a Data Dictionary table within the model, and connect the table within the model to the Excel file. In fact, I have created a C# script which does exactly that.


Before we get to the script, let me mention that this solution is viable for all varieties of tabular modeling: SSAS, Azure AS and Power BI Premium (using the XMLA R/W endpoint).


Running the Script


Follow these steps to create your Data Dictionary in this method:

  1. Open your model in Tabular Editor.

  2. Download the C# script from GitHub.

  3. Paste the C# script into the Advanced Scripting window in Tabular Editor.

  4. Update the filePath parameter to be the location where you want to store the Data Dictionary Excel file (do NOT include the .xlsx suffix!).

  5. Update the dataSourceM parameter to 'true' if you want the data source for the Data Dictionary table to use the M language (otherwise it will create a 'legacy' partition.

  6. Click the 'Play' button or press 'F5'.


After running this script, you will see the Excel file containing your Data Dictionary within the filePath location. Additionally, you will see that a new data source has been created as well as a new table called 'Data Dictionary'. All of the columns and their properties have been added and the table in the model has been connected to the aforementioned data source.


Legacy vs M Partitions


As implied in the steps above, this script is able to generate SQL or M when creating the partition for the Data Dictionary table within the model. Both work equally well and it should be noted that using the 'legacy' partition option which generates SQL does not actually necessitate SQL Server. It's just the traditional way that a tabular data model reads from an Excel file. Generally speaking, one would specify to use an M-partition if the model already uses M-partitions for other tables.


Assembly References


If you look at the top of this script you will see references (denoted with #r) to .NET assemblies. This was made possible as of Tabular Editor 2.8.6 and is officially documented here. In this case, the ability to make assembly references allows Tabular Editor to interact with Excel and the File Class (System.IO). There are many other assembly references which can be added using this method and each such references gives new abilities to the Advanced Scripting functionality within Tabular Editor.


#r "System.IO"
#r "Microsoft.Office.Interop.Excel"

using System.IO;
using Excel = Microsoft.Office.Interop.Excel;

Additional Notes


Running this script takes just a few seconds - even for a large model with many tables, columns and measures. One reason it is so fast is because the metadata is collected into a single tab-delimited string, saved into a text file, and then copied into an Excel file (the text file is deleted afterwards as it is no longer necessary). The other option would be to copy the metadata directly into Excel but this process takes much longer. Speed is ultimately very important when it comes to programming so I'm happy to say this script is quite effective in that area.


Conclusion


If your team has a data warehouse, I recommend following the methods in my previous post. This method is strictly for folks without a data warehouse. Hopefully this post shows that there's always another option out there!


Lastly, I strongly encourage developers to not just execute these scripts but to read through them and use them as examples from which to learn. One year ago I knew literally nothing about C#. However, after defining problems for myself and solving them bit by bit (literally!) it is now simple to manipulate a model in any way - and even make various tools and applications. The amount of things you can do with C# is literally limitless - plus I think it's great fun!

3 comentários


Garrett Fox
Garrett Fox
29 de abr. de 2022

Hi Michael, my dataset is inside of Power BI Premium workspace, but my organization has not turned on XMLA read-write and is hesitating to turn it on for some reason. Is there a way to run a script to create an Automated Data Dictionary in an Excel file without needing XMLA read-write? This would make my life much easier.

Curtir
Garrett Fox
Garrett Fox
29 de abr. de 2022
Respondendo a

There also appear to be downsides to editing a dataset using XMLA read-write because then you are not allowed to download that dataset from the workspace as a .pbix file anymore. Just one more reason why I would like to export a Data Dictionary to Excel without needing to resort to XMLA read-write..

Curtir

Just Thorning Blindbæk
Just Thorning Blindbæk
22 de dez. de 2021

I had some trouble with special danish characters in the export, but figured out I could change line 264 where the Workbooks.OpenText has a long range of parameters. The second one is "Origin" that can either be set to a constant of 1, 2, 3 or to the specific codepage. I sat it to 65001 and it's working. Might be useful to others.

Curtir

©2021 by Elegant BI

bottom of page