This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.
Should we use Dual mode for ADX dimensions?
In one word
With some more details
I discussed in the past different ways to create Power BI models using Multiple ADX tables.
Assuming the fact table is always in Direct Query (DQ), the question is what about the dimension tables ?
In the past I compared the pros ad cons two options:
- Importing the dimensions
- No KQL queries are sent when slicers/filters are interacting between themselves
- Fast search
- No joins in the KQL statements
- Filters are implemented as in (Key1,key2…)
- Filters are implemented as in (Key1,key2…). Could be a problem if the list of keys is large
- Need to refresh the dimensions.
- Fast changing dimensions are not up to date.
- Using DQ also for the dimensions
- Dimensions are always up to date without a need to refresh.
- Joins are used to filter the fact table.
Joins are reliably fast (As long as you remember to declare as dimension)
- Search is by default only on full words and case sensitive (Can be changed by setting)
- Search on a slicer creates multiple queries.
- Interactions between slicers/filters create many queries to ADX.
I want to add another option which provides almost all pros and very few cons : Dual Mode
In Dual mode, the table is imported into the memory database and can be used in DQ mode.
Search is done as in import, very fast, using substring by default , not case sensitive and not sending any queries.
Interactions between slicers/filters are also using the imported copy of the dimension.
Only when interacting with the fact table which is in DQ (In general any table that is in DQ) , the dual mode dimension will behave as a DQ dimension and will create joins between the fact and the dimension(s).
It means that the setting of IsDimension=true is still needed for creating efficient joins
What about fast changing dimensions
When using dual mode, the table is imported and used in many cases as an imported table.
It means that values in the dimension table which were added since the last refresh, will not show in slicers/filters based on the dimension.
This is not necessarily a big problem.
When joining the fact and the dimension, DQ is used so an inner- join can be safely used as long as the dimension table is in sync with the fact on ADX.
You can control if inner-join or right outer-join will be used by the setting of assume referential integrity on the relationship.
The content of slicers/filters will show values from the last import.
This is also usually not a huge problem because slicers usually are based on categories and not on the most detailed keys.
You are analyzing data from wind turbines in wind farms.
You’ll miss a turbine in the slicer only if this turbine was added since the last refresh.
You have a customer’s dimension. If you want to filter an individual customers from a list, you miss customers who registered after the last refresh.
If you filter on customers in a city, you have a problem only if a city appeared for the first time after the last refresh which is a rare event.
You can schedule multiple refresh times per day and because dimension tables are usually not very big, the refresh will not take a long time.
If a dimension is very volatile and you must see items that were added minutes ago, you better use DQ for these dimensions while using Dual Mode for other dimensions.
There are two pbix files atachhed. They use the same data and same report.
The only difference is that in one all dimensions are in Dual Mode and in the other in DQ mode.
Using the performance analyzer, you can see the difference in the number of queries and the time to refresh slicers.