App size reduction at Microsoft SwiftKey

This post has been republished via RSS; it originally appeared at: Android@Microsoft - Medium.

This is based on a talk by Olli Jones, Praveen Kumar Pendyala, Abdelhak Attou and Beatriz Viñal Murciano given at Londroid (July 2019), Droidcon Lisbon (September 2019) and Droidcon Madrid (December 2019), which we’d now like to share in more detail.

The SwiftKey team is one of the many teams working on Android at Microsoft. We build the SwiftKey Keyboard app for Android and iOS. This is an overview of the Android app:

Last year we worked on making our Android app smaller, learnt a lot about how to do this and achieved great results. We’d like to take you through our journey, from when we started analysing our app to the changes we made and the improvements we got. We’re sharing this with you in the hope that some of the techniques we describe are applicable for you and your apps too.

For more general information about shrinking apps, the official Android documentation is a good place to start.

What is app size?

One of the first things we noticed when we started thinking about app size is that no single number represents “app size”.

We talk about three different metrics when understanding our app size:

  • The APK size is the size of the APK file when you build it. This matters when OEMs pre-install the app on a device.
  • The download size is the amount of data users need to download the app from Google Play. It’s reported in the Google Play listing.
  • The install size is the space the app takes up on the device once it is installed. It is reported through the Android OS (in the settings) and it can vary over time (e.g. if an app stores a lot of data).

This is an example of how APK size, download size and install size are reported on Android. We took these screenshots part way through this project, so they don’t reflect the sizes of SwiftKey today.

Screenshots showing SwiftKey’s APK size, download size and install size

You can track download and install size changes over time in the app size report in the Play Console.

Why download size matters

Research from Google, which matches our own findings, suggests that making an app smaller increases the install conversion rate (the percentage of people who install the app out of those who see its listing on Google Play).

In this map, taken from their article, we can see that people tend to download apps of different sizes in different parts of the world because, aside from cultural preferences, internet connectivity and data storage affordability vary a lot. In the areas in red, the median download size is around 5 MB, while in the areas in green it’s just under 25 MB. We can sadly see how SwiftKey was above the median even in the green areas when we started this project.

Map showing the app median download size per country, from the Google Play blog

We’re always trying to empower people around the world to achieve more, and this project was a nice reminder that in order to achieve this we need to make sure we’re reaching all users, not just optimising within a bubble of devices with almost permanent connectivity, large internal storage space and the latest APIs.

Why APK size and install size matter

Once people have our app installed on their devices, either because they downloaded it from Google Play or because it was pre-installed on their devices when they bought them, we need to keep the storage space the app takes up to a minimum.

This is the right thing to do: we should be respectful of our users’ device storage and only use the resources we need (especially when we’re pre-installed and can’t be uninstalled).

We also do it for our own benefit because Android is very good at telling users to declutter their devices. We don’t want them to be prompted to uninstall our app because it’s too big. Android Go also prompts users to uninstall apps they don’t use.

Screenshot of Android prompting users to uninstall SwiftKey

Google Play tells you what percentage of your users have <1 GB storage and how many times more likely they are to uninstall (in Android vitals > App size).

Screenshot from Google Play showing that 13.4% of active devices have <1GB free and they’re 1.5x more likely to uninstall
Screenshot from Google Play with information about users’ free storage and uninstall rate

Google Play enforces maximum compressed download sizes of 100 MB for APKs and 150 MB for AABs. We are very far from these limits, but other apps may struggle to meet the requirements.

Reducing install size

We chose to focus first on reducing our install size. We’ve outlined all the things we’ve tried; the ones that worked and the ones that had us scratching our heads. Whilst some of the techniques outlined here also reduce the APK size, and therefore also reduce the download size, we’ve included another section where we focused on download size exclusively.

What makes up install size

Install size is important to us because we want our users to have plenty of storage space for their personal content without having to uninstall their keyboard. Before looking at how to reduce it, we should first understand exactly which things contribute to the install size.

You may have noticed that install size is the biggest of the 3 metrics we are using to measure app size. In our case the APK size accounted for roughly 75% of the install size and the remaining 25% came from the VDEX file and extracted native libraries.

We can start visualising this by taking the APK size as a reference:

Bar graph showing the APK size as 100%

When building the APK, all the Java and Kotlin byte code is compiled into DEX files which will be part of the APK file in compressed form. At installation, the Android system will extract and verify the DEX files to create a VDEX file. This will be counted as part of the install size.

Bar graph showing the APK size as 100% and the VDEX size as 25%

If we are using the default behaviour, the Android system will also extract the native libraries from the APK. This includes native libraries we include directly and those that come in indirectly as part of a library we depend on. This is also counted against install size.

Bar graph showing the APK size as 100%, the VDEX size as 25% and the extracted native libraries size as 11%

That is just the beginning, right after users have installed our app.

After the app has been used for a few days, the system uses an ahead-of-time compiler to optimise the most common code paths and generates the ODEX file (optimized DEX).

Bar graph showing the APK size as 100%, the VDEX size as 25%, the extracted native libraries size as 11% and the ODEX size as 23%

From Android Pie, this ODEX file may also be generated at the time of installation thanks to Google’s optimized Cloud profiles.
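As a rough worked example, the proportions from the graphs above can be added up with a few lines of shell. The percentages (25%, 11% and 23% of the APK size) are the ones quoted for SwiftKey; the function name install_share is made up for this sketch.

```shell
# install_share APK VDEX LIBS [ODEX]
# Prints the total install size and the APK's share of it.
# All arguments use the same unit (here: percent of the APK size).
install_share() {
    awk -v apk="$1" -v vdex="$2" -v libs="$3" -v odex="${4:-0}" 'BEGIN {
        total = apk + vdex + libs + odex
        printf "total=%d apk_share=%.1f%%\n", total, 100 * apk / total
    }'
}

# Right after installation: APK + VDEX + extracted native libraries.
install_share 100 25 11
# Once the system has also generated the ODEX file.
install_share 100 25 11 23
```

With these inputs the APK is about 73.5% of the install size right after installation (close to the rough 75% mentioned earlier) and about 62.9% once the ODEX file exists.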

The following commands were useful to estimate what each part was taking up:

The APK size is simply the size of the APK file in bytes:

cat <app.apk> | wc -c

The download size is roughly the size of the gzip-compressed APK, with compression set to the highest level:

gzip --keep -c -9 <app.apk> | wc -c

The VDEX size is roughly the same as the size of the decompressed DEX files in the APK:

unzip -p <app.apk> "classes*.dex" | wc -c

The extracted library size is the decompressed size of all the .so files in the APK:

unzip -p <app.apk> "**/*.so" | wc -c

The following command forces Android to optimise an app, after which the app size shown in the system settings includes the optimised machine code generated ahead-of-time (AOT) (other compilation modes are also available):

adb shell cmd package compile -m speed-profile -f <package>
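The per-part estimates above can be collected into one helper. This is a minimal sketch, assuming gzip and Info-ZIP's unzip are installed; the function name apk_size_report and the output format are made up for illustration, and the native libraries are matched with the pattern lib/*.so (APKs keep them under lib/<abi>/).

```shell
# apk_size_report APK
# Prints rough estimates, in bytes, for the sizes discussed above.
apk_size_report() {
    apk=$1
    if [ ! -f "$apk" ]; then
        echo "apk_size_report: no such file: $apk" >&2
        return 1
    fi

    # APK size: the size of the file itself.
    apk_bytes=$(wc -c < "$apk" | tr -d ' ')
    # Download size estimate: the APK gzip-compressed at the highest level.
    download_bytes=$(gzip -c -9 "$apk" | wc -c | tr -d ' ')
    # VDEX size estimate: the uncompressed DEX files.
    dex_bytes=$(unzip -p "$apk" 'classes*.dex' 2>/dev/null | wc -c | tr -d ' ')
    # Extracted native libraries: the uncompressed .so files under lib/.
    so_bytes=$(unzip -p "$apk" 'lib/*.so' 2>/dev/null | wc -c | tr -d ' ')

    echo "apk=$apk_bytes"
    echo "download=$download_bytes"
    echo "dex=$dex_bytes"
    echo "native_libs=$so_bytes"
}

# Example (app-release.apk is a placeholder file name):
# apk_size_report app-release.apk
```

These are estimates only: the numbers Google Play and the device report will differ, for example because of the optimised code generated after installation.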

When we started thinking about how to make our app smaller, we came to see the Java or Kotlin byte code as “triple taxed”: once as classes.dex in the APK, a second time as the VDEX file, and once more in the ODEX file. We say “triple taxed”, but the actual increase in size could well be higher than 3x.

Native libraries could also incur “double costs” — once in the APK and then again if they are extracted.

Resources can also take up unnecessary space.

For these reasons, we targeted DEX code first, native libraries next, and resources last.

Reduce, optimise and remove code with ProGuard and R8

ProGuard is the go-to tool for reducing the install size. Our ProGuard configuration had been tweaked multiple times by different people over time and we found that we didn’t really know what it was doing. We suggest you review your configuration, as there may be quick wins there: for example, enabling optimisations, which proguard-android.txt turns off but proguard-android-optimize.txt leaves on.

When we started, our ProGuard file looked like this:

Screenshot of our ProGuard configuration file, passing negated and non-negated optimisations names to the optimisations flag

We thought we were tuning our ProGuard configuration by disabling some optimisations and enabling others. However, the -optimizations flag works as a filter, and only optimisations that match the filter are applied. When we disable something (with e.g. -optimizations !<A>), everything that is not A matches the filter and is applied. The problem is that when we think we’re enabling something (with e.g. -optimizations <B>), only B matches the filter and is applied, so we’re actually disabling all the other optimisations.

We had inadvertently disabled all optimisations but a few.
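The filter behaviour can be modelled in a few lines of shell. This is a sketch of the matching rules as the ProGuard manual describes them for filters in general (the first matching item decides; if nothing matches, the name is accepted only when the last item is negated), not ProGuard's actual implementation. The function names filter_accepts and check are made up, and the optimisation names are used purely as sample inputs.

```shell
# filter_accepts FILTER NAME
# Exit status 0 if NAME passes FILTER, 1 otherwise.
# The body runs in a subshell, so IFS and "set -f" do not leak out.
filter_accepts() (
    filter=$1
    name=$2
    negated=0
    IFS=,
    set -f    # keep "*" literal while splitting the filter on commas
    for item in $filter; do
        case $item in
            '!'*) pattern=${item#?}; negated=1 ;;
            *)    pattern=$item;     negated=0 ;;
        esac
        # The first matching item decides: negated items reject, others accept.
        case $name in
            $pattern) exit "$negated" ;;
        esac
    done
    # No item matched: accepted only if the last item was negated.
    [ "$negated" -eq 1 ]
)

check() {
    if filter_accepts "$1" "$2"; then
        echo "$2: applied"
    else
        echo "$2: skipped"
    fi
}

# Only a negated item: everything else is still applied.
check '!code/simplification/arithmetic' 'method/inlining/short'
# Only a non-negated item: everything else is skipped.
check 'method/inlining/short' 'class/merging/vertical'
# Mixed list ending in a non-negated item: unlisted names are skipped too.
check '!code/simplification/arithmetic,method/inlining/short' 'class/merging/vertical'
```

Under this model, adding a single non-negated name at the end of the list is enough to switch off every optimisation that is not listed, which matches what happened to our configuration.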

We reviewed the available options in the documentation and found a few flags that aren’t set by default and that helped: -repackageclasses, -renamesourcefileattribute and -allowaccessmodification. These helped a lot, but can make obfuscated stack traces even harder to interpret, so make sure to keep a copy of the mapping file.

Some optimisations are disabled by default in proguard-android-optimize.txt (the config file we based our own on):

-optimizations !code/simplification/arithmetic,!code/simplification/cast,!field/*,!class/merging/*

The arithmetic optimisation can be used if you are only targeting Android 2.0 or later, so you can probably stop disabling it to get a microoptimisation.

With all these changes we reduced our DEX method count by 11% and our DEX size by 9%. Our overall install size decreased by 2% (roughly 1 MB), a lot more than we initially expected, but ultimately not surprising given the “triple tax”.

R8

At first when we switched to using R8, our install size decreased further, but our performance testing tool flagged that our app open speed also got slower. We eventually worked out this performance regression occurred when we used newer versions of the Gradle build tools, and was unrelated to R8.

We run R8 with the same configuration we outlined above and enabling it reduced our install size by a further 1.3 MB.

In the talks we showed some screenshots of the bots we use to measure app size and performance in PRs and saw that people found them useful. The tool we use is now open sourced as a GitHub action.

To compress or not to compress native libraries

Android’s default behaviour is to include the compressed native libraries in the APK and then extract them after installation. It is possible to change this to have the uncompressed native libraries in the APK and not to extract them, using them from the APK directly instead. Set these flags to do this:

In build.gradle:

android {
    aaptOptions {
        noCompress 'so'
        ...
    }
    ...
}

In AndroidManifest.xml:

<application android:extractNativeLibs="false" ...>

Keep in mind that third party libraries may be using native libraries even if you’re not using them directly.
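One way to check the result is to list any native libraries that are still compressed inside the built APK. This is a minimal sketch, assuming Info-ZIP's unzip, whose verbose listing prints the storage method ("Stored" or "Defl:...") in the second column; the function name list_compressed_libs is made up for illustration.

```shell
# list_compressed_libs APK
# Prints the native libraries that are still deflate-compressed inside APK.
# Prints nothing if they are all stored uncompressed.
list_compressed_libs() {
    apk=$1
    if [ ! -f "$apk" ]; then
        echo "list_compressed_libs: no such file: $apk" >&2
        return 1
    fi
    # "unzip -v" prints one row per entry: column 2 is the storage method
    # and the last column is the entry name.
    unzip -v "$apk" 'lib/*.so' 2>/dev/null |
        awk '$2 ~ /^Defl/ { print $NF }'
}

# Example (app-release.apk is a placeholder file name):
# list_compressed_libs app-release.apk
```

The same listing also shows the libraries pulled in by third-party dependencies, since they end up under lib/ like our own.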

Some libraries such as SoLoader will extract the libraries automatically, regardless of these settings. Libraries like Fresco and React Native use SoLoader under the hood, so check carefully (running ./gradlew dependencies should highlight it if it’s in your build).

Here’s what this looks like for us in terms of install size:

  • Not using SoLoader* and compressing the libraries: roughly 9.5 MB of native libraries
  • Not using SoLoader* and not compressing the libraries: roughly 6.7 MB of native libraries
  • Using SoLoader and compressing the libraries: roughly 9.5 MB of native libraries
  • Using SoLoader and not compressing the libraries: roughly 8.4 MB of native libraries

* We use SoLoader for Fresco, which at the time we investigated used roughly 1.7 MB of uncompressed native libraries (although the latest versions aren’t as big). It’s not possible to use Fresco without SoLoader, but we included the hypothetical numbers for comparison.

We’re in a situation where we must use SoLoader, so we’re only able to reduce our native library size by 1.1 MB.

By not compressing the libraries, in theory the over-the-air updates should be smaller, but given the progress made by Play Store in using bsdiff, file-by-file deltas, and packaging files in a deterministic order, we found the update size isn’t really affected by whether the libraries are compressed or not these days.

Resources

Finally, our last target for optimising install size was to investigate reducing the footprint of our Android resources.

The most obvious way to reduce how much space they take up is to remove unused resources. The Android Gradle plugin can do that for you at build time through resource shrinking (shrinkResources), which works together with code shrinking by ProGuard or R8. You can also run a Lint inspection to find resources you can delete, and Android Studio has an option to delete them all in one go (Refactor > Remove Unused Resources…). We enabled this Lint check in our gated check-in (we’re big fans of continuous integration), so now we always catch unused resources when performing refactors.

There were some resources we did need, though, so we tried to make ours take up less space.

Images

Images can easily be a heavy part of the resources. We were using PNGs and so we converted them to WebPs, which are around 25% smaller with lossless compression and can be even smaller with some quality degradation. We could have saved more space by transforming them to SVGs (one asset for all densities) but vector drawables take longer to render and the conversion requires a designer. You can easily convert to WebP in Android Studio (right click on the “drawable” folder > Convert to WebP…).

SwiftKey isn’t a super graphical app, but the conversion to WebP still saved us around 1 MB.

Roboto fonts

We also realised that we were bundling Roboto fonts (mainly for Roboto Light), but these ship with Android 4.1 and upwards. Because we no longer support Android versions older than 4.1, we didn’t need to bundle these fonts in the app and could use the system fonts instead. This shaved 0.8 MB off our install size.

Translations

The translations are another source of wasted space in the app. To clarify, these are the languages the app itself is translated into (currently 72 in our case), not the languages the SwiftKey Keyboard lets you type in (which are now over 650 and keep increasing).

Strings are compiled into resources.arsc as a matrix with a row per string ID and a column per language, where each element is the translation of its row ID into its column language. You can see the content with aapt dump --values resources app-release.apk.

We translate our app into 72 languages, so we knew that each of our strings would be repeated 72 times (one per language).

Here’s an example with just 2 languages:

Example with an app string in two languages (1 row, 2 columns)
Example with an app string in two languages

However, we realised that libraries can also have strings with translations.

A new row will be added per string ID they add, with at least as many columns as languages we support even if the library doesn’t provide all those translations. This left us with cells in those rows taking up space but not having any useful content.

Example with an app string in two languages and a library string in one language (2 rows, 2 columns)
Example with an app string in two languages and a library string in one language

Similarly, a new column was added per language they support that we don’t support, with as many rows as strings we have.

Example with an app string in 2 languages, a library string in 1 language and another one in 3 languages (3 rows, 3 columns)
Example with an app string in two languages, a library string in one language and another library string in three languages

We prevented the addition of new columns by specifying explicitly which languages we want to translate so that the rest can be removed. We did this with the Android Gradle plugin’s resConfigs option:

android {
    defaultConfig {
        resConfigs "en", "es"
    }
}

Previous example removing the third language (3 rows, 2 columns)
Example with an app string in two languages, a library string in one language and another library string in three languages, where we’ve removed the third language

We reduced our install size by 2 MB when we limited our translations to our 72 languages.
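The matrix picture can be reproduced with a small shell model. This is a sketch of the simplified rows-and-columns view described above, not of the real resources.arsc layout; the string IDs, the third language "fr" and the function names table_cells and keep_languages are made up for illustration.

```shell
# table_cells: reads "string_id language" pairs on stdin and prints the
# matrix dimensions plus how many cells actually hold a translation.
table_cells() {
    awk '
        !(($1, $2) in seen) { seen[$1, $2] = 1; filled++ }
        { ids[$1] = 1; langs[$2] = 1 }
        END {
            for (i in ids) rows++
            for (l in langs) cols++
            printf "rows=%d cols=%d cells=%d filled=%d\n", rows, cols, rows * cols, filled
        }'
}

# keep_languages LIST: keeps only the pairs whose language is in the
# comma-separated LIST, mimicking the effect of resConfigs on this model.
keep_languages() {
    awk -v keep="$1" '
        BEGIN { n = split(keep, k, ","); for (i = 1; i <= n; i++) wanted[k[i]] = 1 }
        $2 in wanted'
}

# One app string in two languages, one library string in one language,
# and another library string in three languages.
strings='app_title en
app_title es
lib_a_label en
lib_b_label en
lib_b_label es
lib_b_label fr'

printf '%s\n' "$strings" | table_cells
printf '%s\n' "$strings" | keep_languages en,es | table_cells
```

In this model the table shrinks from 9 cells (6 filled) to 6 cells (5 filled) once the third language is dropped; the empty cells that remain come from library strings that lack a translation we do support.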

SwiftKey specifics

So far, these have been some things that we found useful and that hopefully you can use too as they’re very generic.

We also made some specific adjustments to our app:

We used to include the English US language model (for autocorrect and next word predictions) with our app so that people had a default language if we couldn’t find one for them. However, now most of our users don’t speak English (there is no single language spoken by the majority of the world’s population) and having the wrong default was worse than not having any language, so we stopped including English US. This reduced our install size by around 3.1 MB.

We have many different keyboard layouts (the QWERTY keys you see on screen), and they were defined in XML files. Whilst it’s not well advertised, we believe that XML files get turned into protobuf files within an APK. However, so that it can encode any XML file, the protobuf definition is written in a general way and therefore isn’t optimal for our particular use case. We wrote a more specific schema, turned our layouts into protobuf-like data files and then gzipped them.

We have around 3,000 of these layouts, so shrinking these files reduced our install size by around 6 MB!

Reducing download size

Reducing download size is harder because we have less control over it. The install size reductions will have a small impact on download size too, but we depend on Google Play’s distribution improvements to make significant changes.

The biggest recent change in this area was the release of dynamic delivery, which reduces the usefulness of some of the changes we made to reduce install size (e.g. dynamic delivery automatically removes unnecessary translations and doesn’t compress native libraries). We’re big fans because it allows us to deliver apps to users much more efficiently.

There have been different app serving models over the years and they keep improving. This latest model ensures a device gets only what that specific device needs, to minimise both download and install size.

App serving models: single APK -> multiple APKs -> dynamic delivery

Before dynamic delivery, the simplest approach was to build and upload a single APK that contains everything any user may need to Google Play. All users would download and install the same APK although it contains many things they don’t need.

Diagram showing single APK distribution

Multiple APK support appeared in 2011 and allowed us to upload different APKs for different configurations. Each device would get the APK that was most suitable for it.

This allowed us to have smaller APKs because each APK only has the code and resources for a specific configuration, but it also made publishing more complex because we needed to handle multiple APKs. The number of APKs needed to target very specific configurations grows very quickly: for example, we’d need 3 ABIs x 4 screen densities = 12 APKs to target devices per ABI and screen density.
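To make the multiplication concrete, the following shell sketch enumerates one APK per combination. The ABI and density names and the file-name pattern are illustrative assumptions, not the configurations SwiftKey ships.

```shell
# Hypothetical split dimensions; substitute the ones your app supports.
abis="armeabi-v7a arm64-v8a x86"
densities="mdpi hdpi xhdpi xxhdpi"

# list_apk_variants ABIS DENSITIES
# Prints one hypothetical APK file name per ABI/density combination.
list_apk_variants() {
    for abi in $1; do
        for density in $2; do
            echo "app-$abi-$density.apk"
        done
    done
}

# 3 ABIs x 4 densities = 12 lines.
list_apk_variants "$abis" "$densities" | wc -l
```

Adding a third dimension, such as language, multiplies the count again, which is why splitting along every dimension by hand quickly becomes unmanageable.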

We evaluated the download size reduction we got from splitting the APK by different dimensions and how many more APKs we’d have to maintain if we did that. We decided to maintain three different APKs for the three ABIs we support, which reduced the size of each APK as they only contained the native libraries for one ABI instead of for the three of them, but chose not to split on other configurations because this would make our app harder to release through Google Play.

Diagram showing multi-APK distribution

This was a common problem and Google addressed it with dynamic delivery in 2018, which uses the AAB (Android App Bundle) format for uploading apps to Google Play.

Developers create one single AAB, containing everything any device may need, and upload it to Google Play. Google Play then creates several APKs from that AAB: a base APK with the core of the app, and multiple configuration APKs for different ABIs, screen densities and languages. All splits can be individually turned on and off. Then every device only gets the APKs it needs, and it can automatically get new configuration APKs when its configuration changes. To be able to sign the APKs, Google Play needs developers to upload the key they sign APKs with.

Diagram showing AAB distribution

We were already using multi APKs dividing the app per ABI so this didn’t bring massive download size gains, but we saw a bigger improvement in install size. Using AABs reduced our download size by ~3 MB and our install size by ~18 MB.

Our transition from APKs to AABs was very smooth. We tested all sorts of updates (from APK to AAB, from AAB to APK, from pre-installed APK to AAB…) and locale changes and they all worked seamlessly.

We are seeing some errors that we believe come from side-loaded apps, though. We don’t support side-loaded apps and we recommend that users get their apps from a secure source like Google Play. We’d recommend using the Play Core library to disable your app when users incorrectly side-load, although when we tried this it introduced a small (roughly 20 ms) app open time regression.

Testing AABs also poses some challenges as users will get different combinations of APKs based on device configuration and our team usually tests a single APK. We rely on a beta app and staged rollouts of releases to detect any possible problems with the AAB to APKs conversion (and we’ve found none so far).

Our next step would be to use dynamic features, which allow users to download features only when they need them, but we haven’t found a good use case yet.

This GIF shows the benefits of dynamic delivery in a very nice way.

GIF showing dynamic delivery, from the Google developers blog

Evolution

These changes didn’t all happen at once, so we could see our install size changing over time:

GIF showing SwiftKey’s install size evolution over time

We start in 7.0.4, before we started reducing the app size.

Then, in 7.1.7, we did all the optimisations apart from moving to AABs. We see that:

  • Keyboard layouts have gone down because of our protobuf-like optimisation
  • Graphics have gone down by using WebP and we removed our bundled Roboto fonts
  • Native library disk utilisation has gone down
  • We removed the included English US language model
  • Translations have remained roughly the same — we’re always adding new features and therefore new strings, so this wasn’t as noticeable in the end
  • Our DEX code was roughly the same too — code additions over the 3 months this project took also counteracted some of the improvements from optimisations

When we started using AABs we saw further improvements. In the last version we can see how the graphics and the translations are smaller.

At the beginning, we mentioned that reducing the download size should increase the install conversion rate. Specifically, we expected the biggest increases in Indonesia, Brazil and India. We did see an increase in Indonesia (even against an expected slow-down in conversions in the winter), but sadly not so much in Brazil or India.

It’s interesting to see that the relationship between these two measurements is not linear and the conversion rate eventually flattens, so further reducing the download size wouldn’t mean a higher conversion rate. It’s good to confirm that if we deleted everything and made the download size 0 MB, we wouldn’t get a 100% conversion rate.

Graphs showing app download size (going down) and conversion rate (going up and flattening) over time

Final results

Each improvement was small, but they all added up and eventually we saw these improvements in the Google Play console*:

  • Download size reduced by ~50% (from 27.6 MB to 14.3 MB on Google Play’s reference device)
Screenshot from Google Play showing SwiftKey’s download size over time
  • Install size reduced by ~40% (from 81.5 MB to 48 MB on Google Play’s reference device)
Screenshot from Google Play showing SwiftKey’s install size over time

* There is some variation in these numbers from our own numbers because the Google Play console accounts for optimised DEX code. This varies from device to device and so we’ve generally excluded this from our analysis. Here the numbers are for an ARM 64bit device. The Google Play console also doesn’t account for SoLoader still extracting some native libraries.
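The rounded percentages above can be checked from the before and after sizes with a one-line calculation. This is a minimal sketch; the function name percent_reduction is made up for illustration.

```shell
# percent_reduction BEFORE AFTER
# Prints the reduction from BEFORE to AFTER as a percentage of BEFORE.
percent_reduction() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        printf "%.1f\n", 100 * (before - after) / before
    }'
}

percent_reduction 27.6 14.3   # download size in MB
percent_reduction 81.5 48     # install size in MB
```

This prints 48.2 and 41.1, consistent with the rounded ~50% and ~40% figures.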


