Android API calls typically involve significant latency and computation per invocation. Client-side caching is therefore an important consideration in designing APIs that are helpful, correct, and performant.
Motivation
APIs exposed to app developers in the Android SDK are often implemented as client code in the Android Framework that makes a Binder IPC call to a system service in a platform process, whose job it is to perform some computation and return a result to the client. The latency of this operation is typically dominated by three factors:
- IPC overhead: a basic IPC call is typically 10,000x the latency of a basic in-process method call.
- Server-side contention: the work done in the system service in response to the client's request may not start immediately, for example if a server thread is busy handling other requests that arrived earlier.
- Server-side computation: handling the request in the server might itself require significant work.
You can eliminate all three of these latency factors by implementing a cache on the client side, provided that the cache is:
- Correct: the client-side cache never returns results that would be different than what the server would have returned.
- Effective: client requests are often served from the cache, that is, the cache has a high hit rate.
- Efficient: the client-side cache makes efficient use of client-side resources, such as by representing cached data in a compact way and by not storing too many cached results or stale data in the client's memory.
Consider caching server results in the client
If clients often make the exact same request multiple times, and the value returned doesn't change over time, then you should implement a cache in the client library keyed by the request parameters.
Consider using IpcDataCache in your implementation:
public class BirthdayManager {
    // Runs on a cache miss and fetches the value from the system service.
    private final IpcDataCache.QueryHandler<User, Birthday> mBirthdayQuery =
            new IpcDataCache.QueryHandler<User, Birthday>() {
                @Override
                public Birthday apply(User user) {
                    return mService.getBirthday(user);
                }
            };
    private static final int BDAY_CACHE_MAX = 8; // Maximum birthdays to cache
    private static final String BDAY_API = "getUserBirthday";
    private final IpcDataCache<User, Birthday> mCache =
            new IpcDataCache<User, Birthday>(
                    BDAY_CACHE_MAX, MODULE_SYSTEM, BDAY_API, BDAY_API, mBirthdayQuery);

    /** @hide **/
    @VisibleForTesting
    public static void clearCache() {
        IpcDataCache.invalidateCache(MODULE_SYSTEM, BDAY_API);
    }

    public Birthday getBirthday(User user) {
        return mCache.query(user);
    }
}
For a complete example, see android.app.admin.DevicePolicyManager.
IpcDataCache is available to all system code, including mainline modules. There is also PropertyInvalidatedCache, which is nearly identical but is only visible to the framework. Prefer IpcDataCache when possible.
Invalidate caches on server-side changes
If the value returned from the server can change over time, implement a callback for observing changes, and register it so that you can invalidate the client-side cache accordingly.
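The following is a minimal sketch of that pattern in plain Java, not the IpcDataCache mechanism itself. FakeBirthdayServer, its ChangeListener interface, and CachingBirthdayClient are hypothetical names invented for illustration; the fake server counts queries so that each count stands for one IPC call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the system service; not a real Android API. */
class FakeBirthdayServer {
    interface ChangeListener {
        void onBirthdayChanged(String user);
    }

    private final Map<String, String> mBirthdays = new HashMap<>();
    private final List<ChangeListener> mListeners = new ArrayList<>();
    int queryCount = 0; // Each query stands for one IPC round trip.

    void registerListener(ChangeListener listener) {
        mListeners.add(listener);
    }

    void setBirthday(String user, String birthday) {
        mBirthdays.put(user, birthday);
        // Tell every registered client that its cached value is now stale.
        for (ChangeListener listener : mListeners) {
            listener.onBirthdayChanged(user);
        }
    }

    String getBirthday(String user) {
        queryCount++;
        return mBirthdays.get(user);
    }
}

/** Hypothetical client that caches results and invalidates them on change. */
class CachingBirthdayClient {
    private final FakeBirthdayServer mServer;
    private final Map<String, String> mCache = new HashMap<>();

    CachingBirthdayClient(FakeBirthdayServer server) {
        mServer = server;
        // Drop the stale entry when the server reports a change.
        server.registerListener(user -> {
            synchronized (mCache) {
                mCache.remove(user);
            }
        });
    }

    String getBirthday(String user) {
        synchronized (mCache) {
            String cached = mCache.get(user);
            if (cached != null) {
                return cached; // Cache hit: no server query.
            }
            String fresh = mServer.getBirthday(user);
            if (fresh != null) {
                mCache.put(user, fresh);
            }
            return fresh;
        }
    }
}
```

Holding the lock across the server query keeps the sketch simple: an invalidation that arrives during a query waits and then removes the entry, so a stale value is never left behind. A production cache would likely avoid blocking on IPC while holding a lock.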
Invalidate caches between unit test cases
In a unit test suite, you might test the client code against a test double rather than the real server. If so, then be sure to clear any client-side caches between test cases. This is to keep test cases mutually hermetic, and prevent one test case from interfering with another.
@RunWith(AndroidJUnit4.class)
public class BirthdayManagerTest {
    @Before
    public void setUp() {
        BirthdayManager.clearCache();
    }

    @After
    public void tearDown() {
        BirthdayManager.clearCache();
    }
    ...
}
When writing CTS tests that exercise an API client that uses caching internally, remember that the cache is an implementation detail that isn't exposed through the API. CTS tests therefore shouldn't require any special knowledge of the caching used in client code.
Study cache hits and misses
IpcDataCache and PropertyInvalidatedCache can print live statistics:
adb shell dumpsys cacheinfo
...
Cache Name: cache_key.is_compat_change_enabled
Property: cache_key.is_compat_change_enabled
Hits: 1301458, Misses: 21387, Skips: 0, Clears: 39
Skip-corked: 0, Skip-unset: 0, Skip-bypass: 0, Skip-other: 0
Nonce: 0x856e911694198091, Invalidates: 72, CorkedInvalidates: 0
Current Size: 1254, Max Size: 2048, HW Mark: 2049, Overflows: 310
Enabled: true
...
Fields
Hits:
- Definition: The number of times a requested piece of data was successfully found within the cache.
- Significance: Indicates an efficient and fast retrieval of data, reducing unnecessary data retrieval.
- Higher counts are generally better.
Clears:
- Definition: The number of times the cache was cleared because of invalidation.
- Reasons for Clearing:
- Invalidation: Outdated data from the server.
- Space Management: Making room for new data when the cache is full.
- High counts could indicate frequently changing data and potential inefficiency.
Misses:
- Definition: The number of times the cache failed to provide the requested data.
- Causes:
- Inefficient caching: Cache too small or not storing the right data.
- Frequently changing data.
- First-time requests.
- High counts suggest potential caching issues.
Skips:
- Definition: Instances where the cache was not used at all, even though it could have been.
- Reasons for skipping:
- Corking: Specific to Android Package Manager updates, deliberately turning off caching because of a high volume of calls during boot.
- Unset: The cache exists but isn't initialized. The nonce is unset, which means the cache has never been invalidated.
- Bypass: Intentional decision to skip the cache.
- High counts indicate potential inefficiencies in cache usage.
Invalidates:
- Definition: The number of times cached data was marked as outdated or stale.
- Significance: Invalidation signals that the system works with the most up-to-date data, preventing errors and inconsistencies.
- Typically triggered by the server that owns the data.
Current Size:
- Definition: The current number of elements in the cache.
- Significance: Indicates the cache's resource utilization and potential impact on system performance.
- Higher values generally mean more memory is used by the cache.
Max Size:
- Definition: The maximum number of elements the cache is allowed to hold.
- Significance: Determines the cache's capacity and its ability to store data.
- Setting an appropriate max size helps balance cache efficacy with memory usage. Once the maximum size is reached, a new element is added by evicting the least-recently used element, which can indicate inefficiency.
High Water Mark (shown as HW Mark):
- Definition: The maximum size reached by the cache since its creation.
- Significance: Provides insights into peak cache usage and potential memory pressure.
- Monitoring the high water mark can help identify potential bottlenecks or areas for optimization.
Overflows:
- Definition: The number of times the cache exceeded its max size and had to evict data to make room for new entries.
- Significance: Indicates cache pressure and potential performance degradation due to data eviction.
- High overflow counts suggest the cache size may need to be adjusted or the caching strategy reevaluated.
The same stats can also be found in a bug report.
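As a rough way to read these counters, the hit rate can be computed as hits divided by hits plus misses. The sketch below parses the Hits and Misses fields from a line in the format shown above; CacheStats and hitRate are hypothetical helper names, and leaving skips out of the denominator is an assumption rather than an official definition.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for reading the counters printed by dumpsys cacheinfo. */
class CacheStats {
    private static final Pattern HITS_MISSES =
            Pattern.compile("Hits: (\\d+), Misses: (\\d+)");

    /** Returns hits / (hits + misses) for a stats line, or 0.0 if both are zero. */
    static double hitRate(String statsLine) {
        Matcher m = HITS_MISSES.matcher(statsLine);
        if (!m.find()) {
            throw new IllegalArgumentException("No hit/miss counters in: " + statsLine);
        }
        long hits = Long.parseLong(m.group(1));
        long misses = Long.parseLong(m.group(2));
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}
```

For the sample output above (1301458 hits and 21387 misses), this gives a hit rate of roughly 98%.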
Tune the size of the cache
Caches have a maximum size. When the maximum cache size is exceeded, entries are evicted in LRU order.
- Caching too few entries could negatively affect the cache hit rate.
- Caching too many entries increases the cache's memory usage.
Find the right balance for your use case.
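To make the trade-off concrete, here is a small LRU cache sketch that counts hits, misses, and overflows, so that different maximum sizes can be compared against a representative workload. CountingLruCache is a hypothetical class built on java.util.LinkedHashMap; it isn't how IpcDataCache is implemented, and it assumes the loader never returns null.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical LRU cache that records the counters discussed above. */
class CountingLruCache<K, V> {
    private final int mMaxSize;
    private final Function<K, V> mLoader; // Stands in for the server query.
    private final LinkedHashMap<K, V> mEntries;
    int hits;
    int misses;
    int overflows;

    CountingLruCache(int maxSize, Function<K, V> loader) {
        mMaxSize = maxSize;
        mLoader = loader;
        // accessOrder=true keeps entries ordered from least to most recently used.
        mEntries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                boolean evict = size() > mMaxSize;
                if (evict) {
                    overflows++;
                }
                return evict;
            }
        };
    }

    synchronized V get(K key) {
        V value = mEntries.get(key);
        if (value != null) {
            hits++;
            return value;
        }
        misses++;
        value = mLoader.apply(key);
        mEntries.put(key, value); // May evict the least-recently used entry.
        return value;
    }
}
```

Replaying the same sequence of requests against a few candidate sizes and comparing the counters is one way to pick a size: a high overflow count with a low hit rate suggests the cache is too small for the workload.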
Eliminate redundant client calls
Clients may make the same query to the server multiple times in a short span:
public void executeAll(List<Operation> operations) throws SecurityException {
    for (Operation op : operations) {
        for (Permission permission : op.requiredPermissions()) {
            // Queries the server for every permission of every operation.
            if (!permissionChecker.checkPermission(permission, ...)) {
                throw new SecurityException("Missing permission " + permission);
            }
        }
        op.execute();
    }
}
Consider reusing the results from previous calls:
public void executeAll(List<Operation> operations) throws SecurityException {
    Set<Permission> permissionsChecked = new HashSet<>();
    for (Operation op : operations) {
        for (Permission permission : op.requiredPermissions()) {
            // add() returns true only the first time a permission is seen,
            // so each distinct permission is checked at most once.
            if (permissionsChecked.add(permission)) {
                if (!permissionChecker.checkPermission(permission, ...)) {
                    throw new SecurityException(
                            "Missing permission " + permission);
                }
            }
        }
        op.execute();
    }
}
Consider client-side memoization of recent server responses
Client apps may query the API at a faster rate than the API's server can produce meaningfully new responses. In this case, an effective approach is to memoize the last seen server response at the client side along with a timestamp, and to return the memoized result without querying the server if the memoized result is recent enough. The API client author can determine the memoization duration.
For example, an app may display network traffic statistics to the user by querying for the stats in every frame drawn:
@UiThread
private void setStats() {
    mobileRxBytesTextView.setText(
            Long.toString(TrafficStats.getMobileRxBytes()));
    mobileRxPacketsTextView.setText(
            Long.toString(TrafficStats.getMobileRxPackets()));
    mobileTxBytesTextView.setText(
            Long.toString(TrafficStats.getMobileTxBytes()));
    mobileTxPacketsTextView.setText(
            Long.toString(TrafficStats.getMobileTxPackets()));
}
The app may draw frames at 60 Hz. But hypothetically, the client code in TrafficStats may choose to query the server for stats at most once per second and, if queried within a second of a previous query, return the last seen value. This is allowed since the API documentation doesn't provide any contract regarding the freshness of the results returned.
participant App code as app
participant Client library as clib
participant Server as server
app->clib: request @ T=100ms
clib->server: request
server->clib: response 1
clib->app: response 1
app->clib: request @ T=200ms
clib->app: response 1
app->clib: request @ T=300ms
clib->app: response 1
app->clib: request @ T=2000ms
clib->server: request
server->clib: response 2
clib->app: response 2
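The sequence above can be sketched as a small time-based memoizer. TimedMemoizer is a hypothetical class, not part of TrafficStats or any Android API; the clock is injected so the behavior can be tested, and on a device it might be backed by something like SystemClock.elapsedRealtime().

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical memoizer that reuses the last response while it is recent enough. */
class TimedMemoizer<T> {
    private final Supplier<T> mQuery;     // Stands in for the server query.
    private final long mMaxAgeMillis;     // Memoization duration.
    private final LongSupplier mClock;    // Monotonic time source in milliseconds.
    private T mLastValue;
    private long mLastQueryTime;
    private boolean mHasValue;

    TimedMemoizer(Supplier<T> query, long maxAgeMillis, LongSupplier clock) {
        mQuery = query;
        mMaxAgeMillis = maxAgeMillis;
        mClock = clock;
    }

    synchronized T get() {
        long now = mClock.getAsLong();
        if (mHasValue && now - mLastQueryTime < mMaxAgeMillis) {
            return mLastValue; // Recent enough: skip the server query.
        }
        mLastValue = mQuery.get();
        mLastQueryTime = now;
        mHasValue = true;
        return mLastValue;
    }
}
```

With a duration of 1000 ms, requests at T=100ms, 200ms, and 300ms trigger a single server query, and the request at T=2000ms triggers a second one, matching the diagram.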
Consider client-side codegen instead of server queries
If the query results are knowable to the server at build time, then consider if they are knowable to the client at build time as well, and consider whether the API could be implemented entirely in the client side.
Consider the following app code that checks if the device is a watch (that is, the device is running Wear OS):
public boolean isWatch(Context ctx) {
PackageManager pm = ctx.getPackageManager();
return pm.hasSystemFeature(PackageManager.FEATURE_WATCH);
}
This property of the device is known at build time, specifically at the time that the Framework was built for this device's boot image. The client-side code for hasSystemFeature could return a known result immediately, rather than querying the remote PackageManager system service.
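One way this could look, as a sketch only: a build step emits a table of features whose values are fixed for the device image, and the client consults that table before falling back to the server. GeneratedFeatures and FeatureClient are hypothetical names, the build step is assumed, and the Predicate stands in for the IPC call to the system service.

```java
import java.util.Map;
import java.util.function.Predicate;

/**
 * Hypothetical output of a build step that records features fixed for this
 * device image. Neither the class nor the build step is a real Android API.
 */
final class GeneratedFeatures {
    static final String FEATURE_WATCH = "android.hardware.type.watch";

    // Assumed build-time knowledge: this image isn't built for a watch.
    private static final Map<String, Boolean> KNOWN =
            Map.of(FEATURE_WATCH, Boolean.FALSE);

    /** Returns the build-time answer, or null if it's only known at run time. */
    static Boolean lookup(String feature) {
        return KNOWN.get(feature);
    }
}

/** Hypothetical client that answers from generated data when it can. */
class FeatureClient {
    private final Predicate<String> mServerQuery; // Stands in for the IPC call.

    FeatureClient(Predicate<String> serverQuery) {
        mServerQuery = serverQuery;
    }

    boolean hasSystemFeature(String feature) {
        Boolean known = GeneratedFeatures.lookup(feature);
        if (known != null) {
            return known; // Answered without leaving the client process.
        }
        return mServerQuery.test(feature);
    }
}
```

Features that can change at run time stay out of the generated table, so the fallback path preserves correctness for them.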
Deduplicate server callbacks in the client
Lastly, the API client may register callbacks with the API server to be notified of events.
It's typical for apps to register multiple callbacks for the same underlying information. Rather than have the server notify the client once per registered callback using IPC, the client library should have one registered callback using IPC with the server, and then notify each registered callback in the app.
digraph d_front_back {
rankdir=RL;
node [style=filled, shape="rectangle", fontcolor="white" fontname="Roboto"]
server->clib
clib->c1;
clib->c2;
clib->c3;
subgraph cluster_client {
graph [style="dashed", label="Client app process"];
c1 [label="my.app.FirstCallback" color="#4285F4"];
c2 [label="my.app.SecondCallback" color="#4285F4"];
c3 [label="my.app.ThirdCallback" color="#4285F4"];
clib [label="android.app.FooManager" color="#F4B400"];
}
subgraph cluster_server {
graph [style="dashed", label="Server process"];
server [label="com.android.server.FooManagerService" color="#0F9D58"];
}
}
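The fan-out shown in the diagram can be sketched as follows. FooManagerSketch and FakeFooServer are hypothetical stand-ins for android.app.FooManager and its service; each call on the fake server represents one IPC, and events are modeled as plain strings.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical server; each call on it stands for one IPC. */
class FakeFooServer {
    final List<Consumer<String>> registered = new CopyOnWriteArrayList<>();

    void registerCallback(Consumer<String> callback) {
        registered.add(callback);
    }

    void unregisterCallback(Consumer<String> callback) {
        registered.remove(callback);
    }

    void sendEvent(String event) {
        for (Consumer<String> callback : registered) {
            callback.accept(event);
        }
    }
}

/** Hypothetical client library that multiplexes one server callback. */
class FooManagerSketch {
    private final FakeFooServer mServer;
    private final List<Consumer<String>> mAppCallbacks = new CopyOnWriteArrayList<>();
    // The single callback registered with the server for all app callbacks.
    private final Consumer<String> mServerCallback = this::dispatch;

    FooManagerSketch(FakeFooServer server) {
        mServer = server;
    }

    synchronized void registerCallback(Consumer<String> callback) {
        boolean first = mAppCallbacks.isEmpty();
        mAppCallbacks.add(callback);
        if (first) {
            mServer.registerCallback(mServerCallback); // Only one IPC registration.
        }
    }

    synchronized void unregisterCallback(Consumer<String> callback) {
        if (mAppCallbacks.remove(callback) && mAppCallbacks.isEmpty()) {
            mServer.unregisterCallback(mServerCallback);
        }
    }

    // Fans a single server notification out to every app callback in-process.
    private void dispatch(String event) {
        for (Consumer<String> callback : mAppCallbacks) {
            callback.accept(event);
        }
    }
}
```

However many callbacks the app registers, the server sees at most one registration per client process and sends one notification per event.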