Amazon CloudWatch Logs gives teams that use AWS an easy way to collect and analyze their log data. We talk to many customers who have been successful with the tool because it is simple to set up and well integrated with the AWS ecosystem. However, many of these same customers run into challenges as their data volumes grow and their observability needs start to mature.
The Challenges of Amazon CloudWatch Logs
The first challenge is cost. Your Amazon CloudWatch bill depends on a few different levers, including ingest volumes, storage, queries run, and if applicable, egress. Anticipating your monthly bill becomes more complicated when you factor in metrics, alarms, dashboards, and other components.
The net issue here is that CloudWatch can quickly become expensive – it's easy to set up, but then your bill at the end of the month comes as a surprise. We've encountered customers who spend more on the service than they would on a more complete third-party observability platform.
In one example, a customer was paying over $0.90 per GB ingested into CloudWatch once all fees were factored in – 40% more than the data ingestion list price. (Eventually, this customer was able to reduce TCO by two-thirds using an Edge Delta approach.)
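To see how an effective per-GB cost climbs above the ingestion list price, here is a back-of-the-envelope sketch. The prices below are assumptions for illustration only (check current AWS pricing for your region and tier), and the function and volumes are hypothetical:

```python
# Assumed prices for illustration only -- not AWS quotes.
INGEST_PER_GB = 0.50          # assumed ingestion list price, USD per GB
STORAGE_PER_GB_MONTH = 0.03   # assumed log storage price, USD per GB-month
QUERY_PER_GB_SCANNED = 0.005  # assumed Logs Insights scan price, USD per GB

def effective_cost_per_gb(gb_ingested, gb_retained, gb_scanned):
    """Total monthly cost divided by GB ingested that month."""
    total = (gb_ingested * INGEST_PER_GB
             + gb_retained * STORAGE_PER_GB_MONTH
             + gb_scanned * QUERY_PER_GB_SCANNED)
    return total / gb_ingested

# Hypothetical month: 10 TB ingested, 30 TB retained (3-month retention),
# 50 TB scanned by queries.
print(round(effective_cost_per_gb(10_000, 30_000, 50_000), 3))
```

Even with modest retention and query volumes, the effective cost per ingested GB lands well above the headline ingestion price, which is why bills are hard to anticipate.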
The second challenge is that Amazon CloudWatch can lead to tool sprawl because it isn’t suitable for every use case. Specifically, if you use multiple clouds, you’ll need another tool to monitor non-AWS resources. Additionally, if you have teams with more advanced needs, they are likely better off using a standalone analytics tool. In these situations, one team (like your infrastructure team) relies on CloudWatch, while other teams (like developers and security) use their own tools.
If you experience this second challenge, you face a complex management experience and, often, poor usability. The tools integrate poorly with one another, and each may have its own query language that users must learn (for example, CloudWatch Logs Insights query syntax). Additionally, you might have several log collection agents deployed to meet the needs of different teams and their platforms. In other words, you collect, prepare, and deliver logs in several different ways.
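As an illustration of the learning curve, here is a typical CloudWatch Logs Insights query and a sketch of running it via boto3. The log group name and time range are hypothetical placeholders:

```python
# A minimal sketch of running a CloudWatch Logs Insights query with boto3.
# The log group name ("/example/app") is a hypothetical placeholder.
import time

# Logs Insights uses its own pipe-based query syntax that users must learn:
QUERY = """
fields @timestamp, @message
| filter @message like /ERROR/
| sort @timestamp desc
| limit 20
""".strip()

def start_error_query(logs_client, log_group="/example/app"):
    # logs_client would come from boto3.client("logs")
    return logs_client.start_query(
        logGroupName=log_group,
        startTime=int(time.time()) - 3600,  # last hour
        endTime=int(time.time()),
        queryString=QUERY,
    )
```

A team standardized on a different platform would write the equivalent query in an entirely different syntax, which is the usability tax described above.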
Edge Delta provides a couple of options to help you solve both of these challenges. Before explaining how, let’s discuss the underlying concept of the Edge Delta solution – pre-processing log data at the source.
Pre-Processing Log Data at the Source
It’s unlikely that you need all of the raw data that you collect and ingest into Amazon CloudWatch (or other observability platforms, for that matter). In fact, we find that customers typically query only 5-10% of the log data they ingest. So, why pay to store the other 90-95%?
At Edge Delta, we think that it’s no longer tenable to ingest your complete raw datasets into expensive log stores – especially as log data continues to scale. This idea is the premise behind our log data optimization use case. So, we’ve built our platform to help companies (a) better control what they store in observability tools and (b) maintain visibility into the data they don’t ingest.
Here’s how it works. The Edge Delta agent sits at the data source (or as close to the source as possible). At the agent level, Edge Delta analyzes every log event as it’s created. In doing so, we can group together repetitive log events and help you control how frequently you pass those along to CloudWatch or another observability platform. When you take this approach, we also provide statistics to help you understand how frequently the log event is occurring, whether it’s problematic, and more. As a result, you don’t end up ingesting (and paying for) hundreds of thousands of the same log event. And, your team has the insights it needs to fully understand the behavior being communicated.
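To make the grouping idea concrete, here is a minimal sketch of one common technique for clustering repetitive log events: masking variable values (numbers, hex IDs) so that events differing only in those values collapse into a single pattern. This is an illustrative simplification, not Edge Delta's actual algorithm:

```python
# Illustrative sketch (not Edge Delta's implementation): group repetitive
# log events by masking variable values into a shared template.
import re
from collections import Counter

def template(line: str) -> str:
    line = re.sub(r"\b0x[0-9a-fA-F]+\b", "<HEX>", line)  # mask hex IDs
    return re.sub(r"\d+", "<NUM>", line)                 # mask numbers

def summarize(lines):
    counts = Counter(template(line) for line in lines)
    # Forward one representative per pattern plus its frequency,
    # instead of every raw occurrence.
    return [{"pattern": p, "count": c} for p, c in counts.most_common()]

logs = [
    "GET /users/42 took 120ms",
    "GET /users/7 took 98ms",
    "GET /users/9001 took 134ms",
    "cache miss for key 0xdeadbeef",
]
print(summarize(logs))
```

The three `GET /users/...` lines collapse into one pattern with a count of 3, so a downstream platform receives one event plus statistics rather than every repetition.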
Additionally, our agent extracts KPIs from your log contents and presents them as time-series metrics in the dashboard of your choice. So, for example, you could extract metrics for response times or frequency of status codes, depending on what your team cares about. These KPIs provide visibility into your log data, whether the raw datasets are ingested or not.
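The kind of extraction described above can be sketched as follows. The access-log format, field positions, and regex are assumptions for illustration:

```python
# Illustrative sketch: extract status-code frequencies and average
# response time from log lines. The log format here is an assumption.
import re
from collections import Counter

LINE_RE = re.compile(r'"\w+ \S+ HTTP/[\d.]+" (\d{3}) .* (\d+)ms$')

def extract_metrics(lines):
    statuses, latencies = Counter(), []
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            statuses[m.group(1)] += 1
            latencies.append(int(m.group(2)))
    avg = sum(latencies) / len(latencies) if latencies else 0.0
    return {"status_counts": dict(statuses), "avg_latency_ms": avg}

sample = [
    '10.0.0.1 - - "GET /api/items HTTP/1.1" 200 512 34ms',
    '10.0.0.2 - - "GET /api/items HTTP/1.1" 500 88 210ms',
    '10.0.0.3 - - "POST /api/items HTTP/1.1" 200 64 52ms',
]
print(extract_metrics(sample))
```

Emitting these aggregates as time-series metrics preserves visibility into the logs' behavior even when the raw lines themselves are never ingested downstream.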
Lastly, our agent captures the full-fidelity data your team cares about. So, when an anomaly occurs – like if there’s an abnormal spike in exceptions or your tracked metrics fall outside normal behavior – it’ll automatically capture all raw logs tied to the event. You also have the ability to capture raw logs during specific occurrences, such as deployments.
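One simple way to implement anomaly-triggered capture is a rolling buffer that flushes when error counts spike past a threshold. The logic below is an assumed sketch for illustration, not Edge Delta's implementation:

```python
# Illustrative sketch (assumed logic, not Edge Delta's implementation):
# keep a rolling buffer of recent raw logs and flush the whole buffer
# when errors in the window cross a threshold.
from collections import deque

class AnomalyCapture:
    def __init__(self, window=100, error_threshold=5):
        self.buffer = deque(maxlen=window)  # recent raw log lines
        self.error_threshold = error_threshold

    def ingest(self, line):
        self.buffer.append(line)
        errors = sum(1 for l in self.buffer if "ERROR" in l)
        if errors >= self.error_threshold:
            captured = list(self.buffer)  # ship full-fidelity context
            self.buffer.clear()
            return captured
        return None

capture = AnomalyCapture(window=50, error_threshold=3)
lines = ["ok", "ERROR timeout id=1", "ERROR timeout id=2", "ERROR timeout id=3"]
for line in lines:
    flushed = capture.ingest(line)
    if flushed:
        print(f"captured {len(flushed)} raw logs around the spike")
```

Because the buffer holds events from before the spike as well, the flushed batch includes the surrounding context, not just the errors themselves.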
As data passes through the Edge Delta agent, we route all raw logs to Amazon S3. These can be accessed at any time.
All of these features add up to help you ingest and pay for your most valuable log data, while also preserving visibility and access to the rest. As a result, you can right-size resource allocation and reduce observability TCO.
Now let’s discuss your options for improving your CloudWatch architecture.
Option #1: Consolidating Log Collection Agents
Edge Delta’s agent is vendor-agnostic and can route data to multiple destinations. So, teams that adopt our platform typically consolidate their log collection agents to eliminate complexity and overhead.
Before, you might have had one agent to support CloudWatch, another for your SIEM, and a third for your developers’ preferred observability tool. When you use Edge Delta, you can standardize on one data collection, preparation, and routing mechanism to simplify your logging architecture.
Option #2: Consolidating Analytics and Monitoring Tools
Customers that have consolidated their log collection agents sometimes go one step further and reduce the number of analytics and monitoring tools their teams use.
In one example, an infrastructure team was using Amazon CloudWatch to monitor their Amazon EKS node and container logs. These resources alone generated over 70 TB of log data per month. The organization then used another observability platform to support their application developers and a SIEM tool for their security logs. This setup made sense at the time because CloudWatch was well integrated with Amazon EKS and their other observability tool wasn’t optimized for Kubernetes data.
However, Edge Delta removed this point of friction. When the team added Edge Delta to their observability stack, they were able to analyze their Amazon EKS logs at the data source, determine what they needed, and keep all raw data – all before anything hit CloudWatch.
With this functionality, they determined they no longer needed CloudWatch and could rely on their existing observability platform to support their infrastructure needs.
The total cost of using Edge Delta with Amazon S3 came in at about 65% less than their monthly CloudWatch bill. Additionally, they were able to reduce the complexity of their observability stack.
Amazon CloudWatch Logs provides an easy way to monitor your AWS logs. However, the platform can quickly become expensive as your data volumes grow. And, it’s unlikely to meet all of your teams’ needs – especially when it comes to monitoring resources that aren’t AWS services. So, you might end up with tool sprawl and management complexity.
By adopting Edge Delta, you can analyze your log data as it’s collected. By doing so, you’ll better understand your logs and gain control over what you index. This capability can help you optimize CloudWatch costs, consolidate your log collection agents, and/or reduce the number of monitoring tools your team uses.