The versatility of logs allows them to be used across the development life cycle, and to solve various challenges within an organization. Let’s look at three logging use cases from leading organizations at various stages of the product life cycle, and see what we can learn from them.
Transferwise is an online payments platform for transferring money across countries easily, and their app runs across multiple platforms. One of the challenges they face with their mobile app is analyzing crashes. It’s particularly difficult to reproduce crashes with mobile as there are many more possible issues - device-specific features, carrier network, memory issues, battery drain, interference from other apps, and many more. A stack trace doesn’t have enough information to troubleshoot the issue. To deal with this, Transferwise uses logs to better understand crashes. They attach a few lines of logs to a crash report which gives them vital information on the crash. To implement this, they use the open source tool CocoaLumberjack. It transmits crash logs to external loggers where they can be analyzed further. It enables Transferwise to print a log message to the console. You can save the log messages to the cloud or include them in a user-generated bug report. As soon as the report is sent, the user is notified that Transferwise is already working on fixing the issue. This is much better than being unaware of the crash, or ignoring it because they can’t find the root cause. You should ensure to exclude sensitive data in the log messages. To have more control over how log messages are reported and classified, Transferwise uses a logging policy. They classify logs into 5 categories - error, warning, info, debug, and verbose - each has a different priority level, and are reported differently. While CocoaLumberjack works only on Mac and iOS, you can find a similar tool like Timber or Hugo for Android. But the key point of this case study is that logging can give you additional insight into crashes especially in challenging environments like mobile platforms. It takes a few unique tools and some processes and policies in place to ensure the solution is safe enough to handle sensitive data, but the value is in increased visibility into application performance, and how you can use it to improve user experience. [Read more here.]
Wealthfront is a wealth management solution that uses data analytics to help its users invest wisely and earn more over the long term. Though the Wealthfront web app is the primary interface for a user to make transactions, their mobile app is more actively engaged with and is an important part of the solution. Wealthfront is a big believer in A/B testing to improve the UI of their applications. While they have a mature A/B testing process setup for the web app, they didn’t have an equivalent for their mobile apps. As a result they just applied the same learnings across both web and mobile. This is not the best strategy, as mobile users are different from web users, and the same results won’t work across both platforms. They needed to setup an A/B testing process for their mobile apps too. For inspiration, they looked to Facebook who had setup something similar for their mobile apps with Airlock - a framework for A/B testing on mobile. Wealthfront focussed their efforts on four fronts - backend infrastructure, API design, the mobile client, and experiment analysis. They found logs essential for the fourth part - experiment analysis. This is because logs are a much more accurate representation of the performance and results of an experiment than relying on a backend database. With mobile, the backend infrastructure is very loosely coupled with the frontend client and reporting can be inaccurate if you rely on backend numbers. With logs, however, you can gain visibility into user actions, and each step of a process as it executes. One reason why logging is more accurate is that the logging is coded along with the experiment. Thus, logging brings you deeper visibility into A/B testing and enables you to provide a better user experience. This is what companies like Facebook and Wealthfront have realized, and it can work for you too.[Read more here.]
At Twitter where they run distributed systems to manage data at very large scale, they use high-performance replicated logs to solve various challenges brought on by distributed architectures. Leigh Stewart of Twitter comments that “Logs are a building block of distributed systems and once you understand the basic pattern you start to see applications for them everywhere.”To implement this replicated log service they use two tools. The first is the open source Apache BookKeeper which is a low-level log storage tool. They chose BookKeeper for its low latency and high durability even under peak traffic. Second, they built a tool called DistributedLog to provide higher level features on top of BookKeeper. These features include naming and metadata for log streams, data segmentation policies like log retention and segmentation. Using this combination, they were able to achieve write latencies of 10ms, and not exceeding 20ms even at the slowest write speed. This is very efficient, and is possible because of using the right open source, and proprietary tools in combination with each other. [Read more here.]As the above examples show, logs play a vital role in various situations across multiple teams and processes. They can be used to make apps more reliable by reducing crashes, improve the user interface using A/B tests, and enforce better safety policies on end users. As you look to improve your applications in these areas, the way these organizations have made use of logs is worth taking note of and implementing in a way that’s specific to your organization. You also need a capable log analysis platform like LogDNA to collect, process and present your log data in a way that’s usable and actionable. Working with log data is challenging, but with the right goals, the right approach, and the right tools, you can gain a lot of value from log data to improve various aspects of your application’s performance.