This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.
If you are developing applications with Azure Digital Twins (ADT) that materialize large twin graphs, this article is for you. Read on to learn how we accelerated services that need to traverse these graphs and how you can do the same with the right caching strategy.
As part of the CSE global engineering organization at Microsoft, our team developed an ADT-based solution together with a customer.
An essential requirement was low-latency responses when materializing graphs of several thousand nodes that are infrequently updated. We achieved this goal by reducing the time to traverse a 3,000-node graph from ~10 seconds to under a second.
ADT offers a powerful SQL-like query language to retrieve data out of a twin graph.
Traversing large graph sections requires executing many ADT queries. This blog post presents the in-memory caching solution that we used to improve the performance of twin graph traversals.
Prerequisites
- .NET Core 3.1 on your development machine.
- Familiarity with C#, Azure Digital Twins and Azure Digital Twins Explorer.
Create your Azure Digital Twins graph
We want to represent the factories of a company named Contoso.
- Create an Azure Digital Twins instance and make sure you have the Azure Digital Twins Data Owner role.
- Open the Azure Digital Twins Explorer.
- Download the contoso-tree.zip provided in the attachments and import contoso-tree.json into ADT: select the import graph icon in the explorer, choose the file, and then save the graph.
You should see the following tree in the explorer.
The Azure Digital Twins Explorer shows that our Contoso company has two factories. Each factory is composed of rooms, and each room can contain machines.
Get the children of a twin
A common use case is the need to retrieve the children of a twin. For example, we want to be able to list the rooms of a factory.
You can display the children of Factory1 by running the following query:
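A query of roughly the following shape lists the direct children of a twin. This is a minimal sketch: the relationship name `contains` is an assumption about how the Contoso tree is modeled, and `TwinQueries` is a hypothetical helper that only builds the query text. The resulting string can be pasted into the explorer's query box or passed to the SDK.

```csharp
using System;

public static class TwinQueries
{
    // Builds an ADT query that returns the twins a parent points to through one
    // relationship. "contains" is an assumed relationship name; replace it with
    // the name used in your own models.
    public static string ChildrenOf(string parentId, string relationship = "contains")
    {
        if (parentId == null) throw new ArgumentNullException(nameof(parentId));

        // Backslashes and single quotes must be escaped inside a query string literal.
        string escapedId = parentId.Replace("\\", "\\\\").Replace("'", "\\'");

        return "SELECT C FROM DIGITALTWINS P " +
               $"JOIN C RELATED P.{relationship} " +
               $"WHERE P.$dtId = '{escapedId}'";
    }
}
```

Calling `TwinQueries.ChildrenOf("Factory1")` produces a JOIN query filtered on the parent's `$dtId`, which ADT requires for this kind of query.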
You should see the following two twins.
The Azure Digital Twins Explorer helps you visualize the twins. When we develop an application, however, we need to retrieve the twins programmatically. Let’s try to retrieve the children of a node with C#.
You can start by creating a new Console Application project and include the packages Azure.DigitalTwins.Core, System.Linq.Async, Azure.Identity.
Then we can create a simple AzureDigitalTwinsRepository class. It will use the DigitalTwinsClient to query ADT.
Add a method to get the children of a twin in the AzureDigitalTwinsRepository class.
We can use our AzureDigitalTwinsRepository to display the children of Factory1:
Get the subtree of a twin
Imagine that our application needs to retrieve the subtree of a twin: the twin itself, its descendants and the relationships between these twins. We cannot achieve that with a single ADT query. Instead, we must traverse the tree, for example with a breadth-first search.
Add a method to get the subtree of a node in the AzureDigitalTwinsRepository class:
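The traversal can be sketched as a breadth-first search. To keep the example self-contained, the ADT call is abstracted as a delegate: `getChildrenAsync` is a hypothetical stand-in for the repository method that queries ADT for the children of one twin, and the `Subtree` type is likewise an illustration, not the article's original class.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class Subtree
{
    // Twin IDs in the order they were visited, root first.
    public List<string> TwinIds { get; } = new List<string>();

    // One (parent, child) pair per relationship found inside the subtree.
    public List<(string ParentId, string ChildId)> Relationships { get; } =
        new List<(string ParentId, string ChildId)>();
}

public static class TreeTraversal
{
    // getChildrenAsync stands in for the ADT-backed lookup (hypothetical signature).
    public static async Task<Subtree> GetSubtreeAsync(
        string rootId,
        Func<string, Task<IReadOnlyList<string>>> getChildrenAsync)
    {
        var subtree = new Subtree();
        var visited = new HashSet<string> { rootId };
        var queue = new Queue<string>();
        queue.Enqueue(rootId);

        while (queue.Count > 0)
        {
            string current = queue.Dequeue();
            subtree.TwinIds.Add(current);

            // One round trip per visited twin: this is the cost that grows with the tree.
            foreach (string child in await getChildrenAsync(current))
            {
                subtree.Relationships.Add((current, child));
                if (visited.Add(child)) queue.Enqueue(child);
            }
        }

        return subtree;
    }
}
```

Because every dequeued twin triggers its own lookup, a subtree of N twins costs N sequential calls in this sketch.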
When traversing the tree, we make several consecutive queries to ADT, which slows down the entire operation. To make the operation faster, let's see how we can cache the tree in-memory.
Caching
A secondary datastore can serve as a data cache to accelerate application operations while avoiding the need to query ADT multiple times in complex operations.
We decided to implement a simple in-memory cache because the data we were interested in was small enough to fit in memory and is infrequently updated. This relatively simple caching approach let us avoid adding infrastructure complexity.
The cache must contain a subset of the twin graph transformed into a data structure appropriate for the problem at hand. Depending on the use case, it might be necessary to store data as a subgraph of the twin graph. Still, there might be other situations where simpler data structures like lists or maps simplify the cache implementation. We used a simple in-memory adjacency list.
We want to store the Contoso tree in-memory as an adjacency-list.
Let's create a caching repository. The caching repository uses the AzureDigitalTwinsRepository that we implemented to reload the cache.
We can add a GetSubtree method that will traverse the in-memory graph instead of making several requests to ADT. The only difference with the previous implementation is that we get the digital twin and its children from the in-memory graph.
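A minimal sketch of such a cache is shown here, assuming the graph can be loaded as a flat list of (parent, child) pairs. `CachedTwinGraph` and `loadRelationshipsAsync` are hypothetical names; the delegate stands in for the ADT-backed repository, and this version returns twin IDs only, not full twin documents.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class CachedTwinGraph
{
    // Adjacency list: twin ID -> IDs of its direct children.
    private volatile Dictionary<string, List<string>> _children =
        new Dictionary<string, List<string>>();

    // loadRelationshipsAsync is assumed to return every (parent, child) pair of
    // the graph; in a real service it would call the ADT-backed repository.
    public async Task ReloadAsync(
        Func<Task<IEnumerable<(string ParentId, string ChildId)>>> loadRelationshipsAsync)
    {
        var fresh = new Dictionary<string, List<string>>();
        foreach (var (parentId, childId) in await loadRelationshipsAsync())
        {
            if (!fresh.TryGetValue(parentId, out var list))
                fresh[parentId] = list = new List<string>();
            list.Add(childId);
        }

        // Swap the reference in one step so readers see either the old or the new graph.
        _children = fresh;
    }

    // Breadth-first traversal where each lookup is a dictionary read, not a network call.
    public List<string> GetSubtree(string rootId)
    {
        var graph = _children; // Work on one snapshot for the whole traversal.
        var result = new List<string>();
        var visited = new HashSet<string> { rootId };
        var queue = new Queue<string>();
        queue.Enqueue(rootId);

        while (queue.Count > 0)
        {
            string current = queue.Dequeue();
            result.Add(current);
            if (!graph.TryGetValue(current, out var children)) continue; // Leaf twin.
            foreach (string child in children)
                if (visited.Add(child)) queue.Enqueue(child);
        }

        return result;
    }
}
```

Building a fresh dictionary and swapping it in, instead of mutating the live one, is a design choice that keeps reads lock-free while a reload is in progress.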
We can measure the duration of the two GetSubtree implementations.
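One way to take such a measurement is with `System.Diagnostics.Stopwatch`. The `Timing` helper below is a hypothetical illustration that times any asynchronous operation once; a single run is only a rough indication, so repeat the measurement before drawing conclusions.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class Timing
{
    // Runs the operation once and returns the elapsed wall-clock time.
    public static async Task<TimeSpan> MeasureAsync(Func<Task> operation)
    {
        var stopwatch = Stopwatch.StartNew();
        await operation();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Assuming `adtRepository` and `cachingRepository` are instances of the two repositories (hypothetical variable names), each can be timed with a call such as `await Timing.MeasureAsync(() => adtRepository.GetSubtree("Factory1"))` and the two results compared.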
Cache loading and invalidation
You can preload the cache when the service starts and invalidate it when the graph is updated.
Event notifications can be a great trigger for that. Azure Digital Twins provides different types of events.
We wanted to avoid additional dependencies, so we created an extra twin in ADT and used it as an indicator to keep track of the last graph update. The service updates the indicator twin whenever it modifies the graph in ADT. Our service then periodically checks whether the indicator twin has been updated and, if so, refreshes the cache.
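The periodic check can be sketched as follows. This assumes the indicator twin exposes a last-update timestamp, which is an assumption about its shape; `readIndicatorAsync` and `reloadCacheAsync` are hypothetical delegates standing in for the ADT read and the cache reload.

```csharp
using System;
using System.Threading.Tasks;

public sealed class IndicatorCacheRefresher
{
    private readonly Func<Task<DateTimeOffset>> _readIndicatorAsync;
    private readonly Func<Task> _reloadCacheAsync;
    private DateTimeOffset? _lastSeen; // null until the first successful reload

    public IndicatorCacheRefresher(
        Func<Task<DateTimeOffset>> readIndicatorAsync,
        Func<Task> reloadCacheAsync)
    {
        _readIndicatorAsync = readIndicatorAsync ?? throw new ArgumentNullException(nameof(readIndicatorAsync));
        _reloadCacheAsync = reloadCacheAsync ?? throw new ArgumentNullException(nameof(reloadCacheAsync));
    }

    // Intended to be called on a timer. Returns true when the cache was reloaded.
    public async Task<bool> RefreshIfChangedAsync()
    {
        DateTimeOffset current = await _readIndicatorAsync();
        if (_lastSeen.HasValue && _lastSeen.Value == current)
            return false; // Graph unchanged since the last reload.

        await _reloadCacheAsync();

        // Record the value only after a successful reload so a failed reload is retried.
        _lastSeen = current;
        return true;
    }
}
```

Each check costs a single small read against ADT, and the full reload only runs when the indicator value differs from the one seen at the previous reload.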
Conclusion
Azure Digital Twins is a powerful tool to create a digital representation of an environment, and we have seen how caching can be used to enhance the performance of twin graph traversals.
An additional advantage is ADT cost optimization. ADT pricing includes a cost per query unit, so serving traversals from a cache may reduce the number of query units your system consumes. Reloading the cache consumes query units too, however, so the net saving depends on how expensive the queries you avoid are compared with the cost of each refresh. That's why you need to analyze your own workload to understand whether this type of strategy helps in your case.
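The trade-off can be written down as a simple back-of-the-envelope estimate. The helper below is a hypothetical sketch: all four inputs are values you would measure or estimate for your own system over some period, and it ignores every other pricing dimension.

```csharp
public static class QueryUnitEstimate
{
    // Query units saved over a period: units avoided by serving traversals from
    // the cache, minus units spent reloading it. A negative result means the
    // cache costs more query units than it saves.
    public static double NetSavings(
        double traversalsServedFromCache,
        double unitsPerTraversal,
        double cacheReloads,
        double unitsPerReload)
    {
        return traversalsServedFromCache * unitsPerTraversal
             - cacheReloads * unitsPerReload;
    }
}
```

The estimate turns positive only when the traversals served from the cache outweigh the reloads, which is why a frequently read, infrequently updated graph is a good fit.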
Now you can try this strategy and share your experience and workarounds in the comments below!
Contributors
Marc Gomez
Alexandre Gattiker
Christopher Lomonico
Izabela Kulakowska
Max Zeier
Peeyush Chandel