Profiling Xamarin.Android Startup

This post has been republished via RSS; it originally appeared at: Microsoft Developer Blogs.

Poor performance can be a huge detriment to a mobile application's success. Users don't want to see a splash screen for 10 seconds when using any app. They may get frustrated, abandon their shopping cart, or uninstall the app completely if kept waiting for too long.

You can measure performance in Xamarin.Android applications in the same way as if developed with Java or Kotlin in Android Studio. Additionally, Xamarin.Android has specific tooling for measuring the performance of managed (C#) code. We'll look at a few examples of diagnosing performance problems.

Profiling Xamarin.Android Startup

Measuring Overall Startup Time

If your goal is to improve your mobile application's performance, the very first step should be to actually measure where things are. Blindly making performance changes may not make as much difference as you hoped, and complicated performance improvements can even hurt the maintainability of your codebase. The process should be: measure, make changes, measure again, repeat.

On Android, the ActivityManager system process displays a "time to initial display" log message that gives the best picture of overall startup time. You can use adb logcat to quickly view the Android device log at the command-line. Alternatively, you can use the Android Debug Log in Visual Studio.

For example, you could run the following on macOS:

$ adb logcat -d | grep Displayed
ActivityTaskManager: Displayed com.companyname.myapp/.MainActivity: +2s494ms

On Windows, you could instead run the following powershell:

> adb logcat -d | Select-String Displayed
ActivityTaskManager: Displayed com.companyname.myapp/.MainActivity: +2s494ms

Note for Windows users: you will either need adb.exe in your %PATH% or add a powershell alias.

The above log messages were captured while debugging an application from Visual Studio on an x86 Android emulator. There is some overhead in starting/connecting the debugger, as well as missing optimizations from a Debug build.

If we simply switch to a Release configuration and deploy and run the app again:

ActivityTaskManager: Displayed com.companyname.myapp/.MainActivity: +806ms

Now consider enabling AOT and Startup Tracing, and we can get an even better startup time:

ActivityTaskManager: Displayed com.companyname.myapp/.MainActivity: +445ms

Selecting Startup Tracing checkbox is probably the easiest win for your Xamarin.Android application, if you aren't using it already.

Finally, if we measure the application on a Pixel 3 XL device:

ActivityTaskManager: Displayed com.companyname.myapp/.MainActivity: +340ms

It is important to always measure what happens on physical hardware. The emulator is great for development, but doesn't match what your users see in the wild.

To recap:

Configuration Running on Startup Time (ms)
Debug x86 emulator 2,494ms
Release x86 emulator 806ms
Release + AOT x86 emulator 445ms
Release + AOT Pixel 3 XL 340ms

Diagnosing the Problem

OK my app is slow. Now what do I do?

For illustration purposes, let's create some slow methods in your MainActivity's OnCreate method:

protected override void OnCreate(Bundle savedInstanceState)
{
    base.OnCreate(savedInstanceState);
    //...

    SlowMethod1();
    for (int i = 0; i < 100; i++)
    {
        SlowMethod2();
    }
}

void SlowMethod1() => Thread.Sleep(1000);

void SlowMethod2() => Thread.Sleep(10);

Even when using Startup Tracing, your startup time will look something like:

ActivityTaskManager: Displayed com.companyname.myapp/.MainActivity: +2s691ms

We need to actually profile our C# code to understand which methods are taking a long time and need improvement.

Switch back to a Debug configuration, and enable the Mono profiler by running:

$ adb shell setprop debug.mono.profile log:calls

adb shell runs a single shell command on the Android device or emulator. setprop sets an Android system property, which is the similar to an environment variable on other platforms.

Then simply force-quit and restart the application. On the next startup, Mono will save a profile.mlpd file in the Android app's local directory.

Unfortunately, the file can only be accessed by the app itself, so we must use the run-as command to locate the file:

$ adb shell run-as com.companyname.myapp ls -l files/.__override__
-rw-rw-r-- 1 u0_a411 u0_a411 515303 2020-07-27 09:29 profile.mlpd

Where com.companyname.myapp is the package name set in your AndroidManifest.xml.

To get the file off the device, I used a known writable directory such as /sdcard/Download/:

$ adb shell run-as com.companyname.myapp cp files/.__override__/profile.mlpd /sdcard/Download/

Once the file is copied, you can use adb pull to get the profile.mlpd file to your desktop computer:

$ adb pull /sdcard/Download/profile.mlpd
/sdcard/Download/profile.mlpd: 1 file pulled, 0 skipped. 162.7 MB/s (515303 bytes in 0.003s)

profile.mlpd is a binary file that can be read on macOS by running the mprof-report command-line utility:

$ mprof-report profile.mlpd
Mono log profiler data
    Profiler version: 3.0
    Data version: 17
    Arguments: log:calls,output=/data/user/0/com.companyname.myapp/files/.__override__/profile.mlpd
...

Note that Windows users will need to have Mono installed in Windows Subsystem for Linux to run mprof-report. This works great; it's what I do!

The output is quite long, so you might consider opening the file in Visual Studio Code via the code command:

$ mprof-report profile.mlpd | code -
Reading from stdin via: code-stdin-kms.txt

Note that you will need to hit Ctrl+C to cancel the code - command.

With the above C# code in place, some interesting numbers pop out:

Method call summary
Total(ms) Self(ms)      Calls Method name
    2662        4          1 MyApp.MainActivity:OnCreate (Android.OS.Bundle)
    1299        2        100 MyApp.MainActivity:SlowMethod2 ()
    1017        2          1 MyApp.MainActivity:SlowMethod1 ()

Note that the Xamarin Profiler is also an option. I have found that using the Mono profiler directly does a better job at diagnosing startup performance on Android.

Making Improvements

As an experiment, let's assume the work done in SlowMethod2() can be done once and cached. We don't need it to do the work 100 times over!

Let's change it to:

bool initialized = false;

void SlowMethod2()
{
    if (!initialized)
    {
        initialized = true;
        Thread.Sleep(10);
    }
}

If we profile startup again, SlowMethod2 is dramatically better:

Method call summary
Total(ms) Self(ms)      Calls Method name
      53       17        100 MyApp.MainActivity:SlowMethod2 ()

Now let's assume there isn't much we can do to make SlowMethod1() faster. We might instead consider running it on a background thread:

// Show an empty UI or spinner
await Task.Factory.StartNew (SlowMethod1);
// Populate the UI

But notice after this change, that SlowMethod1 is roughly the same:

Method call summary
Total(ms) Self(ms)      Calls Method name
    1013        0          1 MyApp.MainActivity:SlowMethod1 ()

Moving this work to the background thread didn't actually speed anything up. It just improved the user experience so a nice UI can be displayed while the work happens in the background.

If we compare a Release build with these improvements:

-ActivityTaskManager: Displayed com.companyname.myapp/.MainActivity: +2s691ms
+ActivityTaskManager: Displayed com.companyname.myapp/.MainActivity: +554ms

Even though there is still ~1 second of delay, getting something in front of the user is better than no UI at all.

Before and After

Notes About the Mono Profiler

Setting the debug.mono.profile system property will slowdown your app's startup. The Mono runtime is performing additional work to instrument every method call and write this information to a file.

Take the Total(ms) timings displayed with a grain of salt. They are likely not the true time experienced by your users with a Release build of your app. This is why it is important to measure things again with a Release build and the profiler turned off.

Android system properties do not reset until you reboot the device, so make sure to clear debug.mono.profile when you are finished profiling:

$ adb shell setprop debug.mono.profile ""

Real-World Examples

Ok, Thread.Sleep() is definitely slow. But what do real problems in Xamarin.Android applications look like?

Example 1, Logging

Let's say your application has lots of logging throughout startup, and you have a method such as:

void Log (string format, params object[] args)
{
    Android.Util.Log.Debug("MyApp", format, args);
}

Let's say this is called 100 times on startup. At first glance, this doesn't seem like an issue, but the reality is that this method is not "free".

Several factors can contribute to poor performance here:

  1. An actual string.Format() call creates a string as large as the log message. This is extra memory usage and GC pressure.
  2. An object[] is created with size equal to number of args.
  3. C# to Java interop occurs. This is a p/invoke into JNI. This is pretty fast, but not as fast as calling a plain C# method.
  4. The Android OS actually writes the log message to the log. You might compare the performance of this operation to writing to a file. This also takes time.

These are multiplied by 100x if there are 100 calls. Note that if there were only one call, it probably wouldn't be worth doing anything here at all. A single Log ("Hello!") is not going to noticeably impact startup.

Let's add this to OnCreate() and profile again:

for (int i = 0; i < 100; i++)
{
    Log("This is a log message {0} {1}", i, i * 2);
}

This time, let's also enable memory allocation tracking in the Mono profiler including log:alloc:

$ adb shell setprop debug.mono.profile log:calls,alloc

As expected, we can see the method taking time on each call:

Method call summary
Total(ms) Self(ms)      Calls Method name
     782        5        100 MyApp.MainActivity:Log (string,object[])
     775        3        100 Android.Util.Log:Debug (string,string,object[])
     634       10        100 Android.Util.Log:Debug (string,string)

The winning combination to look for here, is a large number of calls combined with Total(ms) time taken. This indicates we could probably improve the individual calls and get 100x (for each call) for any improvements we make.

log:alloc also lets us see memory allocations; a few are directly related to the Log() calls:

Allocation summary
     Bytes      Count  Average Type name
     41784        839       49 System.String
      4280        144       29 System.Object[]

The best way to improve things here is to not call Log() at all. We really only need these log messages when developing the app, right?

Let's take advantage of the ConditionalAttribute such as:

[Conditional("DEBUG")]
void Log (string format, params object[] args)

Once this is applied, Release builds of the app will have all calls to Log() completely removed.

If we profiled the app again with the method removed, the method calls are gone as well as improved memory usage:

Allocation summary
     Bytes      Count  Average Type name
     34120        725       47 System.String
      1800         41       43 System.Object[]

Memory allocation in .NET is pretty fast, but no allocations are always faster!

Example 2, HTTP Requests

Let's say you need to make five requests to a web service when you app starts up. This is a reasonable thing to happen in a real-world application.

If we write some code to make an HTTP request and parse the results with Json.NET:

// At the top
using System.Net.Http;
using Newtonsoft.Json;
//...
readonly HttpClient httpClient = new HttpClient();

async Task <Response> GetSlides ()
{
    string json = await httpClient.GetStringAsync("https://httpbin.org/json");
    return JsonConvert.DeserializeObject<Response>(json);
}

Then call it five times in OnCreate():

for (int i = 0; i < 5; i++)
{
    var response = await GetSlides();
    // Do something with slides
}

Note that it is important to be using async/await here, and HttpClient makes this straightforward. Network requests should not be happening synchronously on the UI thread.

We can see these five calls can take quite some time:

Method call summary
Total(ms) Self(ms)      Calls Method name
   14692        4         10 MyApp.MainActivity/<GetSlides>d__6:MoveNext ()
    4073       11          5 MyApp.MainActivity:GetSlides ()
    3557        0          5 Newtonsoft.Json.JsonConvert:DeserializeObject<T_REF> (string)

And we can see memory allocations such as:

     Bytes      Count  Average Type name
   3762640      12935      290 System.String
   1397624        895     1561 System.Char[]

Note that you can pass --traces to mprof-report if you need to see which methods these allocations are coming from.

There are lots of options to improve things here:

  1. Can we cache the results of the web requests? Can we load past calls from a file on disk? Maybe they could be valid for 24 hours.
  2. Do the calls depend on each other? Can we make the five calls in parallel?
  3. On the server side, can we make a new API call that returns the data for all five calls in one request? This assumes you have control of what happens on the server.

These are all great ideas, but let's see what we can do to simply improve the C# code here.

Instead of creating a string with the contents of the response, let's read from a Stream directly. We can also use the new System.Text.Json library for more performant JSON parsing:

// At the top
using System.Text.Json;
//...
async Task<Response> GetSlides()
{
    var response = await httpClient.GetAsync("https://httpbin.org/json");
    response.EnsureSuccessStatusCode();
    using (var stream = await response.Content.ReadAsStreamAsync())
    {
        return await JsonSerializer.DeserializeAsync<Response>(stream);
    }
}

Reviewing a diff of the method calls, we can see a clear benefit:

Method call summary
Total(ms) Self(ms)      Calls Method name
-  14692        4         10 MyApp.MainActivity/<GetSlides>d__6:MoveNext ()
-   4073       11          5 MyApp.MainActivity:GetSlides ()
-   3557        0          5 Newtonsoft.Json.JsonConvert:DeserializeObject<T_REF> (string)
+  10381        6         10 MyApp.MainActivity/<GetSlides>d__6:MoveNext ()
+   4150       11          5 MyApp.MainActivity:GetSlides ()
+    617        1          5 System.Text.Json.JsonSerializer:DeserializeAsync<TValue_REF> ()

We can also see a ~42K difference in memory allocations for string:

Allocation summary
     Bytes      Count  Average Type name
-  3762640      12935      290 System.String
-  1397624        895     1561 System.Char[]
+  3730632      12149      307 System.String
+  1386704        879     1577 System.Char[]

Conclusion

Performance is hard. Our general idea of what parts of a codebase are slow might not actually be slow at all! The best way to make an impact with performance is to measure, measure, and then measure again. Then I'd even measure again after the phase of the moon has changed.

Don't take my word for anything in this article. Make sure to measure it yourself! For further reading, see the profiling guide in the Xamarin.Android source code on Github.

REMEMBER: these articles are REPUBLISHED. Your best bet to get a reply is to follow the link at the top of the post to the ORIGINAL post! BUT you're more than welcome to start discussions here:

This site uses Akismet to reduce spam. Learn how your comment data is processed.