As distributed environments become more complex, teams often turn to distributed tracing tools to make the issues that surface in their traces easier to see.
Throughout this post, we will examine some of the best open-source and other generally popular distributed tracing tools available today.
1. Jaeger
Uber Technologies created Jaeger's distributed tracing system in 2015 and later released it as a fully open-source project as part of its open-source initiative. Using Jaeger, you can monitor and troubleshoot distributed systems. Like Kubernetes and Prometheus, Jaeger is a graduated Cloud Native Computing Foundation (CNCF) project.
2. Zipkin
Zipkin is an open-source distributed tracing system that helps troubleshoot latency problems. It both collects trace data and lets you look that data up. Based on Google's Dapper paper, Zipkin was originally developed at Twitter in 2010 and is primarily written in Java.
Zipkin is leveraged by many leading companies across a variety of industries including Postmates, Uber and TransferWise, among many others.
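To make the idea of collecting trace data and looking it up more concrete, here is a minimal sketch in Python. It is not Zipkin's API or storage model: `Span`, `InMemoryTraceStore`, and all field names are hypothetical, and the self-time calculation assumes child spans run one after another rather than in parallel.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Span:
    """One timed operation inside a request (hypothetical model)."""
    trace_id: str             # shared by every span in the same request
    span_id: str              # unique within the trace
    parent_id: Optional[str]  # None for the root span
    service: str
    name: str
    duration_ms: float


class InMemoryTraceStore:
    """Illustrative stand-in for a tracing backend's storage layer."""

    def __init__(self) -> None:
        self._by_trace: Dict[str, List[Span]] = defaultdict(list)

    def collect(self, span: Span) -> None:
        # Group spans by trace ID as they arrive from services.
        self._by_trace[span.trace_id].append(span)

    def lookup(self, trace_id: str) -> List[Span]:
        # Return every span recorded for one request.
        return list(self._by_trace.get(trace_id, []))

    def self_times(self, trace_id: str) -> Dict[str, float]:
        # Self time = span duration minus the time spent in direct children.
        # Assumes children run sequentially (an assumption of this sketch).
        spans = self.lookup(trace_id)
        child_total: Dict[str, float] = defaultdict(float)
        for span in spans:
            if span.parent_id is not None:
                child_total[span.parent_id] += span.duration_ms
        return {s.span_id: s.duration_ms - child_total[s.span_id] for s in spans}


store = InMemoryTraceStore()
store.collect(Span("t1", "a", None, "gateway", "GET /checkout", 120.0))
store.collect(Span("t1", "b", "a", "orders", "create_order", 90.0))
store.collect(Span("t1", "c", "b", "db", "INSERT orders", 70.0))

times = store.self_times("t1")
slowest = max(times, key=times.get)  # span ID with the most self time
```

In this made-up trace the database span accounts for most of the request's latency, which is the kind of question a trace lookup is meant to answer.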
3. Logit.io
Using Logit.io's distributed tracing solution, you will be able to trace key events and see how efficiently resources are being utilised across any application, no matter the complexity. Logit.io also allows you to visualise metrics, logs, events, and traces from your applications.
Any metrics that you collect can be used to build dashboards, reports, and alerts, all within a single platform, once you have used one of our simple data forwarders to start sending your data. Aside from trace observability, the platform is also used for use cases such as log management, infrastructure monitoring & deep metrics analysis.
4. New Relic
With New Relic, you can send logs from AWS, Microsoft Azure, and other leading cloud providers. New Relic was founded in 2008, so it has extensive experience in the log management market. It also provides distributed tracing, instant observability, and synthetics monitoring.
5. Datadog
For effective parsing, archiving and monitoring, Datadog's log management solution separates log ingestion from indexing. Besides metrics management and application analysis, the solution also offers synthetics monitoring and device monitoring.
When discussing the platform's APM features, users often praise Datadog's ability to ingest data from a multitude of sources and to turn the resulting large volume of data points into intuitive dashboards. Datadog's distributed tracing features help users debug application performance issues in real time and better understand the impact services are having on users via error, latency, and high-value traces.
6. Sentry
The Sentry open-source application monitoring tool helps you identify errors and performance bottlenecks within your code. As well as monitoring separate services or applications, Sentry's distributed tracing service also enables the platform to stitch together related user instances from different sources. This provides a very convenient overview of the application state at each checkpoint a user passes through.
Using Sentry, you can also track performance issues, identify poor API calls, and pinpoint slow database queries.
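Stitching together spans from different services generally relies on passing a shared trace identifier along with each request. The sketch below illustrates that idea using the W3C Trace Context `traceparent` header format (version `00`); it is not a description of Sentry's internals, and the function names are hypothetical.

```python
import re
import secrets
from typing import Optional, Tuple

# traceparent: version "00", 32-hex trace ID, 16-hex parent span ID, 2-hex flags.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled: bool = True) -> str:
    """Start a new trace at the edge of the system."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def parse_traceparent(header: str) -> Optional[Tuple[str, str, bool]]:
    """Return (trace_id, parent_span_id, sampled), or None if invalid."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero IDs are defined as invalid by the specification.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return trace_id, parent_id, bool(int(flags, 16) & 1)


def child_traceparent(incoming: str) -> str:
    """Build the header a service sends downstream.

    The trace ID is kept so that backends can join the spans; only the
    span ID changes. An invalid incoming header starts a fresh trace.
    """
    parsed = parse_traceparent(incoming)
    if parsed is None:
        return new_traceparent()
    trace_id, _parent_id, sampled = parsed
    return f"00-{trace_id}-{secrets.token_hex(8)}-{'01' if sampled else '00'}"
```

Because every hop keeps the same trace ID, a backend can later group the spans emitted by separate services into a single view of the request.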
7. Dynatrace
Dynatrace simplifies cloud complexity so that organisations can move toward digital transformation faster. However, getting on board with Dynatrace and its distributed tracing offering can be challenging, because the sheer amount of documentation you need to work through can make it difficult to get the most out of the platform.
It may be worth comparing Dynatrace's onboarding process with other services to see how fast you can begin monitoring your applications after registration since many of its competitors offer simpler onboarding experiences.
8. Splunk
APM and SIEM are two of the main services offered by Splunk. Its platform is well known among engineers for its capabilities to handle large-scale projects (for instance, managing more than 200,000 devices). Their application performance monitoring solution offers distributed tracing capabilities as standard.
A high-performance observability platform aimed at enterprise users, Splunk is the original proprietary "data to everything" platform. As well as monitoring microservices in production, Splunk can be used to monitor test and development environments.
9. AppDynamics
Full-stack observability offered by AppDynamics enhances operational visibility, making it a suitable tool for eliminating the visibility silos that arise from microservice architectures.
In addition to the high cost associated with deployment and configuration, AppDynamics has previously received criticism for its lack of platform support options.
10. Honeycomb
Monitoring production servers and troubleshooting user experiences are just two of the tasks Honeycomb can handle. It is worth knowing that if you do go over your plan limits with Honeycomb, your data will not be automatically lost, since Honeycomb allows some overusage and gives users the opportunity to upgrade their plans (similar to Logit.io).
Honeycomb has a number of built-in distributed tracing capabilities that make it a very useful platform for developers. Honeycomb's distributed tracing features are designed specifically for the monitoring of microservices as well as making their performance bottlenecks much more transparent to anyone analysing their system.
11. Wavefront
The Wavefront monitoring and analytics platform offers 3D observability with metrics, histograms, and OpenTracing-compatible distributed tracing capabilities on a single platform. In 2017, Wavefront was acquired by VMware and now provides a high-performance observability platform that enables users to monitor, visualise, and analyse their distributed application environment.
One of the main benefits of Wavefront is its ability to handle data from any cloud infrastructure. Due to this, it is used to alert on, troubleshoot, and optimise the performance of both multi-cloud and hybrid-cloud environments.
12. Grafana Tempo
Grafana Tempo is an open-source, high-volume, minimal-dependency distributed tracing backend that is built on top of the Grafana Framework. Tempo is convenient to run, as it requires only object storage and is fully integrated with Prometheus and Loki. It can ingest data from a number of popular open-source tracing protocols, such as Jaeger, Zipkin, and OpenTelemetry.
Another benefit of Tempo is that it can be deployed at a massive scale without requiring any Elasticsearch or Cassandra clusters (which can easily become quite tedious to maintain). Furthermore, Tempo can be used with Kafka and legacy tooling such as OpenCensus.
13. Uptrace
With Uptrace, teams can deploy code with confidence, understand complex systems, and save time and money with their maintenance-free, distributed tracing solution. As part of the Uptrace system, OpenTelemetry is used to collect the data and a ClickHouse database is used to store that data.
With Uptrace, users can pinpoint problems in complex distributed systems and find performance bottlenecks, as well as work with many petabytes worth of data. There are plans to make Uptrace an open-source tool in its own right, as mentioned on their official GitHub repository and according to the terms of their current licensing.
14. Kamon
Kamon consists of a set of tools used for instrumenting applications running on Java Virtual Machines (JVMs). The Kamon platform allows you to use metrics, tracing, and context propagation APIs all whilst being completely vendor agnostic.
The Kamon APIs are completely decoupled from the services that receive the data, so you can instrument your application once and keep reporting that data no matter which service is connected. Kamon integrates with StatsD, Prometheus, Kamino, Datadog, Zipkin, and Jaeger.
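The "instrument once, report anywhere" design can be sketched as an instrumentation layer that knows nothing about the backends it feeds. The example below is written in Python purely for illustration; Kamon itself targets the JVM, and none of these class or method names come from Kamon's API. The two reporters only mimic the StatsD gauge line format and the Prometheus text format.

```python
from typing import Dict, List, Protocol


class Reporter(Protocol):
    """Anything that can receive a recorded value (hypothetical interface)."""

    def report(self, metric: str, value: float) -> None: ...


class Instrumentation:
    """Application-facing API; it never references a specific backend."""

    def __init__(self) -> None:
        self._reporters: List[Reporter] = []

    def add_reporter(self, reporter: Reporter) -> None:
        self._reporters.append(reporter)

    def record(self, metric: str, value: float) -> None:
        # Fan the same measurement out to every attached reporter.
        for reporter in self._reporters:
            reporter.report(metric, value)


class StatsdLineReporter:
    """Formats values as StatsD-style gauge lines (name:value|g)."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def report(self, metric: str, value: float) -> None:
        self.lines.append(f"{metric}:{value}|g")


class PrometheusTextReporter:
    """Keeps the latest value per metric and renders 'name value' lines."""

    def __init__(self) -> None:
        self.samples: Dict[str, float] = {}

    def report(self, metric: str, value: float) -> None:
        # Prometheus metric names cannot contain dots, so swap them out.
        self.samples[metric.replace(".", "_")] = value

    def render(self) -> str:
        return "\n".join(f"{k} {v}" for k, v in sorted(self.samples.items()))


instrumentation = Instrumentation()
statsd = StatsdLineReporter()
prometheus = PrometheusTextReporter()
instrumentation.add_reporter(statsd)
instrumentation.add_reporter(prometheus)

# The application code below would stay the same if a reporter were swapped.
instrumentation.record("http.latency_ms", 12.5)
```

Swapping or adding a backend only changes which reporters are attached, not the calls made from application code.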
15. Instana
With Instana, you get a fully automated solution for application performance management for cloud-native apps. It is important to note that Instana's AutoTrace solution is a distributed tracing and service discovery technology that supports multiple technologies simultaneously, including .NET, Clojure, Kotlin, PHP, Python, and Scala.
16. Summary
Tool | Key Features | Strengths | Limitations | Ideal For | Pricing / Free Trial |
---|---|---|---|---|---|
Jaeger | Monitor and troubleshoot distributed systems, CNCF graduate project | Open-source, cloud-native | Can be complex to configure | Large enterprises, DevOps | Free |
Zipkin | Troubleshoot latency issues, trace lookup, based on Google Dapper | Lightweight, widely adopted | Limited advanced features | Developers, IT professionals | Free |
Logit.io | Trace events, visualize metrics, logs, events, and traces | Easy setup, comprehensive platform | Subscription cost | DevOps, IT professionals | 14-day free trial |
New Relic | Log management, distributed tracing, instant observability | Extensive cloud provider integrations | High cost | Enterprises, large organizations | Paid, free trial available |
Datadog | Log ingestion, indexing, synthetics monitoring, device monitoring | Real-time debugging, intuitive dashboards | Can be expensive | Developers, IT professionals | Paid, free trial available |
Sentry | Error tracking, performance bottleneck identification | Open-source, convenient overview | Limited to monitoring applications | Developers, QA teams | Free, paid plans available |
Dynatrace | Cloud complexity simplification, digital transformation | High-performance, comprehensive observability | Steep learning curve | Large enterprises | Paid, free trial available |
Splunk | APM, SIEM, distributed tracing, enterprise-grade observability | Handles large-scale projects, high performance | Expensive, complex setup | Enterprises, large organizations | Paid, free trial available |
AppDynamics | Full-stack observability, operational visibility | Eliminates visibility silos | High cost, limited platform support | Large enterprises | Paid, free trial available |
Honeycomb | Production server monitoring, user experience troubleshooting | Built-in distributed tracing, handles overusage | Plan limits | Developers, support teams | Paid, free trial available |
Wavefront | Metrics, histograms, distributed tracing, cloud compatibility | High-performance, 3D observability | High cost | Enterprises, multi-cloud environments | Paid, free trial available |
Grafana Tempo | High-volume, minimal dependency tracing backend | Open-source, integrated with Grafana | Requires object storage | Developers, IT professionals | Free |
Uptrace | Maintenance-free tracing, OpenTelemetry, ClickHouse storage | Handles large data volumes, pinpoint problems | Not fully open-source yet | DevOps, IT professionals | Free |
Kamon | JVM instrumentation, vendor-agnostic APIs | Flexible, integrates with multiple services | Limited to JVM applications | Java developers | Free |
Instana | Automated APM, supports multiple technologies | Fully automated, comprehensive support | Can be expensive | Enterprises, cloud-native apps | Paid, free trial available |