
What Is a Hypervisor (VMM)?

30 November 2021 at 09:43

This blog was originally published on humio.com. Humio is a CrowdStrike Company.

What is a hypervisor?

A hypervisor, or virtual machine monitor (VMM), is virtualization software that creates and manages multiple virtual machines (VMs) from a single physical host machine.

Acting as a VMM, the hypervisor monitors, pools and allocates resources — like CPU, memory and storage — across all guest VMs. By centralizing these assets, it’s possible to significantly reduce each VM’s energy consumption, space allocation and maintenance requirements while optimizing overall system performance.

Why should you use a hypervisor?

In addition to helping the IT team better monitor and utilize all available resources, a hypervisor unlocks a wide range of benefits. These include:

  • Speed and scalability: Hypervisors can create new VMs instantly, which allows organizations to quickly scale to meet changing business needs. In the event an application needs more processing power, the hypervisor can also access additional machines on a different server to address this demand.
  • Cost and energy efficiency: Using a hypervisor to create and run several VMs from a common host is far more cost- and energy-efficient than running several physical machines to complete the same tasks.
  • Flexibility: A hypervisor separates the OS from underlying physical hardware. As a result, the guest VM can run a variety of software and applications since the system does not rely on specific hardware.
  • Mobility and resiliency: Hypervisors logically isolate VMs from the host hardware. VMs can therefore be moved freely from one server to another without risk of disruption. Hypervisors can also isolate one guest virtual machine from another; this eliminates the risk of a “domino effect” if one virtual machine crashes.
  • Replication: Replicating a VM manually is a time-intensive and potentially complex process. Hypervisors automate the replication process for VMs, allowing staff to focus on more high-value tasks.
  • Restoration: A hypervisor has built-in stability and security features, including the ability to take a snapshot of a VM’s current state. Once this snapshot is taken, the VM can revert to this state if needed. This is particularly useful when carrying out system upgrades or maintenance as the VM can be restored to its previous functioning state if the IT team encounters an error.

Types of hypervisors

There are two main types of hypervisors:

  1. Type 1 hypervisor: Native or bare metal hypervisor
  2. Type 2 hypervisor: Hosted or embedded hypervisor

Type 1 hypervisor: native or bare metal hypervisor

A type 1 hypervisor is installed directly on the hardware, hence the name bare metal hypervisor.

In this model, the hypervisor takes the place of the OS. As a result, these hypervisors are typically faster since all computing power can be dedicated to guest virtual machines, as well as more secure since adversaries cannot target vulnerabilities within the OS.

That said, a native hypervisor tends to be more complex to set up and operate. Further, a type 1 hypervisor has somewhat limited functionality since the hypervisor itself basically serves as an OS.

Type 2 hypervisor: hosted or embedded hypervisor

Unlike bare-metal hypervisors, a hosted hypervisor is deployed as an added software layer on top of the host operating system. Multiple operating systems can then be installed as a new layer on top of the host OS.

In this model, the OS acts as a way station between the hardware and the hypervisor. As a result, a type 2 hypervisor tends to have higher latency and slower performance. The presence of the OS also makes this type more vulnerable to cyberattacks.

Embedded hypervisors are generally more convenient to build and launch than a Type 1 hypervisor since they do not require a management console or dedicated machine to set up and oversee the VMs. A hosted hypervisor may also be a good choice for use cases where latency is not a concern, such as software testing.

Cloud hypervisors

The shift to the cloud and cloud computing is prompting the need for cloud hypervisors. The cloud hypervisor focuses exclusively on running VMs in a cloud environment (rather than on physical devices).

Due to the cloud’s flexibility, speed and cost savings, businesses are increasingly migrating their VMs to the cloud. A cloud hypervisor can provide the tools to migrate them more efficiently, allowing companies to make a faster return on investment on their transformation efforts.

Differences between containers and hypervisors

Containers and hypervisors both ensure applications run more efficiently by logically isolating them within the system. However, there are significant differences between how the two are structured, how they scale and their respective use cases.

A container is a package of only software and its dependencies, such as code, system tools, settings and libraries. It can run reliably on any operating system and infrastructure. A container consists of an entire runtime environment, enabling applications to move between a variety of computing environments, such as from a physical machine to the cloud, or from a developer’s test environment to staging and then production.

Hypervisors vs containers

Hypervisors host one or more VMs that mimic a collection of physical machines. Each VM has its own independent OS and is effectively isolated from others.

While VMs are larger and generally slower compared to containers, they can run several applications and different operating systems simultaneously. This makes them a good solution for organizations that need to run multiple applications or legacy software that requires an outdated OS.

Containers, on the other hand, often share an OS kernel or base image. While each container can run individual applications or microservices, it is still linked to the underlying kernel or base image.

Containers are typically used to host a single app or microservice without any other overhead. This makes them more lightweight and flexible than VMs. As such, they are often used for tasks that require a high level of scalability, portability and speed, such as application development.

Understanding hypervisor security

On one hand, by isolating VMs from one another, a hypervisor effectively contains attacks on an individual VM. Also, in the case of type 1 or bare metal hypervisors, the absence of an operating system significantly reduces the risk of an attack since adversaries cannot exploit vulnerabilities within the OS.

At the same time, the hypervisor host itself can be subject to an attack. In that case, each guest machine and their associated data could be vulnerable to a breach.

Best practices for improving hypervisor security

Here are some best practices to consider when integrating a hypervisor within the organization’s IT architecture:

  • Minimize the attack surface by limiting a host’s role to only operating VMs
  • Conduct regular and timely patching for all software applications and the OS
  • Leverage other security measures, such as encryption, zero trust and multi-factor authentication (MFA) to ensure user credentials remain secure
  • Limit administrative privileges and the number of users in the system
  • Incorporate the hypervisor within the organization’s cybersecurity architecture for maximum protection

Hypervisors and modern log management

With the growth of microservices and the migration to disparate cloud environments, maintaining observability has become increasingly difficult. Additionally, challenges such as application availability, bugs and vulnerabilities, resource use, and changes to the performance of virtual machines and containers continue to affect the end-user experience. Organizations operating with a continuous delivery model also struggle to capture and understand the dependencies within the application environment.

Humio’s streaming log management solution can access and ingest real-time data streaming from diverse platforms and accurately log network issues, database connections and availability, and information about what’s happening in a container that the application relies on. In addition to providing visibility across the entire infrastructure, developers can benefit from comprehensive root cause investigation and analysis. Humio enables search across all relevant data with longer data-retention and long-term storage.

Humio Community Edition

Try Humio’s log management solution at no cost with ongoing access here!


Mean Time to Repair (MTTR) Explained

23 November 2021 at 08:30

This blog was originally published Oct. 28, 2021 on humio.com. Humio is a CrowdStrike Company.

Definition of MTTR

Mean time to repair (MTTR) is a key performance indicator (KPI) that represents the average time required to restore a system to functionality after an incident. MTTR is used along with other incident metrics to assess the performance of DevOps and ITOps, gauge the effectiveness of security processes, evaluate the effectiveness of security solutions, and measure the maintainability of systems.

Service level agreements with third-party providers typically set expectations for MTTR, although repair times are not guaranteed because some incidents are more complex than others. Along the same lines, comparing the MTTR of different organizations is not fruitful because MTTR is highly dependent on unique factors relating to the size and type of the infrastructure and the size and skills of the ITOps and DevOps team. Every business has to determine which metrics will best serve its purposes and how to put them into action in its own environment.

Difference Between Common Failure Metrics

Modern enterprise systems are complicated and they can fail in numerous ways. For these reasons, there is no one set of incident metrics every business should use — but there are many to choose from, and the differences can be nuanced.

Mean Time to Detect (MTTD)

Also called mean time to discover, MTTD is the average time between the beginning of a system failure and its detection. As a KPI, MTTD is used to measure the effectiveness of the tools and processes used by DevOps teams.

To calculate MTTD, select a period of time, such as a month, track the time between the beginning of each system outage and its discovery, then add up the total time and divide it by the number of incidents to find the average. MTTD should be low. If it continues to take longer to detect or discover system failures (an upward trend), an immediate review should be conducted of the existing incident response management tools and processes.

Mean Time to Identify (MTTI)

This measurement tracks the number of business hours between the moment an alert is triggered and the moment the cybersecurity team begins to investigate that alert. MTTI is helpful in understanding if alert systems are effective and if cybersecurity teams are staffed to the necessary capacity. A high MTTI or an MTTI that is trending in the wrong direction can be an indicator that the cybersecurity team is suffering from alert fatigue.

Mean Time to Recovery (MTTR)

Mean time to recovery is the average time it takes in business hours between the start of an incident and the complete recovery back to normal operations. This incident metric is used to understand the effectiveness of the DevOps and ITOps teams and identify opportunities to improve their processes and capabilities.

Mean Time to Resolve (MTTR)

Mean time to resolve is the average time between the first alert through the post-incident analysis, including the time spent ensuring the failure will not re-occur. It is measured in business hours.

Mean Time Between Failures (MTBF)

Mean time between failures is a key performance metric that measures system reliability and availability. ITOps teams use MTBF to understand which systems or components are performing well and which need to be evaluated for repair or replacement. Knowing MTBF enables preventative maintenance, minimizes reactive maintenance, reduces total downtime and enables teams to prioritize their workload effectively. Historical MTBF data can be used to make better decisions about scheduling maintenance downtime and resource allocation.

MTBF is calculated by tracking the number of hours that elapse between system failures in the ordinary course of operations over a period of time and then finding the average.
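
As a rough illustration of that calculation, the sketch below averages the gaps between consecutive failure timestamps; the helper name and sample dates are invented for the example.

    from datetime import datetime, timedelta

    def mean_time_between_failures(failure_times):
        """Average elapsed time between consecutive failures, in chronological order."""
        ordered = sorted(failure_times)
        if len(ordered) < 2:
            return None  # at least two failures are needed to measure a gap
        gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
        return sum(gaps, timedelta(0)) / len(gaps)

    failures = [
        datetime(2021, 11, 1, 3, 0),
        datetime(2021, 11, 8, 15, 30),
        datetime(2021, 11, 20, 7, 45),
    ]
    print(mean_time_between_failures(failures))  # average gap across the two intervals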

Mean Time to Failure (MTTF)

Mean time to failure is a way of looking at uptime vs. downtime. Unlike MTBF, an incident metric that focuses on repairability, MTTF focuses on failures that cannot be repaired. It is used to predict the lifespan of systems. MTTF is not a good fit for every system. For example, systems with long lifespans, such as core banking systems or many industrial control systems, are not good subjects for MTTF metrics because they have such a long lifespan that when they are finally replaced, the replacement will be an entirely different type of system due to technological advances. In cases like that, MTTF is moot.

Conversely, tracking the MTTF of systems with more typical lifespans is a good way to gain insight into which brands perform best or which environmental factors most strongly influence a product’s durability.

MTTR is intended to reduce unplanned downtime and shorten breakout time. But its use also supports a better culture within ITOps teams. When incidents are repaired before users are impacted, DevOps and ITOps are seen as efficient and effective. Resilient system design is encouraged because when DevOps knows its performance will be measured by MTTR, the team will build apps that can be repaired faster, such as by composing apps from discrete web services so one service failure will not crash the entire app. MTTR, when done properly, includes post-incident analysis, which should be used to inform a feedback loop that leads to better software builds in the future and encourages the fixing of bugs early in the SDLC.

How to Calculate Mean Time to Repair

The MTTR formula is straightforward: Simply add up the total unplanned repair time spent on a system within a certain time frame and divide the results by the total number of relevant incidents.

For example, if you have a system that fails four times in one workday and you spend a total of one hour repairing those failures, your MTTR would be 15 minutes (60 minutes / 4 = 15 minutes).
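
To make the formula concrete, here is a minimal sketch of the calculation in Python, assuming incident records that carry a start and an end time; the function name and sample values are illustrative only and not part of any particular tool.

    from datetime import datetime, timedelta

    def mean_time_to_repair(incidents):
        """Average unplanned repair time across (start, end) pairs for one period."""
        if not incidents:
            return timedelta(0)
        total = sum((end - start for start, end in incidents), timedelta(0))
        return total / len(incidents)

    # The example above: four failures and one hour of total repair time,
    # which averages out to 15 minutes per incident.
    incidents = [
        (datetime(2021, 11, 1, 9, 0), datetime(2021, 11, 1, 9, 20)),
        (datetime(2021, 11, 1, 11, 0), datetime(2021, 11, 1, 11, 10)),
        (datetime(2021, 11, 1, 14, 0), datetime(2021, 11, 1, 14, 15)),
        (datetime(2021, 11, 1, 16, 0), datetime(2021, 11, 1, 16, 15)),
    ]
    print(mean_time_to_repair(incidents))  # 0:15:00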

However, not all outages are equal. The time spent repairing a failed component or a customer-facing system that goes down during peak hours is more expensive, in terms of lost sales, productivity or brand damage, than time spent repairing a non-critical outage in the middle of the night. Organizations can establish an “error budget” that weights each minute spent repairing the most impactful systems as heavily as an hour spent repairing less impactful ones. This level of granularity helps expose the true costs of downtime and provides a better understanding of what MTTR means to the particular organization.

How to Reduce MTTR

There are three elements to reducing MTTR:

  1. Manage resolution process. The first is a defined strategy for managing the resolution process, which should include a post-incident analysis to capture lessons learned.
  2. Build defenses. Technology plays a crucial role, of course, and the best solution will provide visibility, monitoring and corrective maintenance to help root out problems and build defenses against future attacks.
  3. Mitigate the incident. Lastly, the skills necessary to mitigate the incident have to be available.

MTTR can be reduced by increasing budget or headcount, but that isn’t always realistic. Instead, deploy artificial intelligence (AI) and machine learning (ML) to automate as much of the repair process as possible. Those steps include rapid detection, minimization of false positives, smart escalation, and automated remediation that includes workflows that reduce MTTR.

MTTR can be a helpful metric to reduce downtime and streamline your DevOps and ITOps teams, but improving it shouldn’t be the end goal. After all, the point of using metrics is not simply improving numbers but, in this instance, the practical matter of keeping systems running and protecting the business and its customers. Use MTTR in a way that helps your teams protect customers and optimize system uptime.

Improve MTTR With a Modern Log Management Solution

Logs are invaluable for any kind of incident response. Humio’s platform enables complete observability for all streaming logs and event data to help IT organizations better prepare for the unknown and quickly find the root cause of any incident.

Humio leverages modern technologies, including data streaming, index-free architecture and hybrid deployments, to optimize compute resources and minimize storage costs. Because of this, Humio can collect structured and unstructured data in memory to make exploring and investigating data of any size blazing fast.

Humio Community Edition

With a modern log management platform, you can monitor and improve your MTTR. Try it out at no cost!

Introduction to the Humio Marketplace

18 November 2021 at 08:56

This blog was originally published Oct. 11, 2021 on humio.com. Humio is a CrowdStrike Company.

Humio is a powerful and super flexible platform that allows customers to log everything and answer anything. Users can choose how to ingest their data and choose how to create and manage their data with Humio. The goal of Humio’s marketplace is to provide a variety of packages that power our customers with faster and more convenient ways to get more from their data across a variety of use cases.

What is the Humio Marketplace?

The Humio Marketplace is a collection of prebuilt packages created by Humio, partners and customers that Humio customers can access within the Humio product interface.

These packages are relevant to popular log sources and typically contain a parser and some dashboards and/or saved queries. The package documentation includes advice and guidance on how to best ingest the data into Humio to start getting immediate value from logs.

What is a package?

The Marketplace contains prebuilt packages that are essentially YAML files that describe the Humio assets included in the package. A package can include any or all of: a parser, saved searches, alerts, dashboards, lookup files and labels. The package also includes YAML files for the metadata of the package (such as descriptions and tags, support status and author), and a README file which contains a full description and explanation of any prerequisites, etc.

Packages can be configured as either a Library type package — which means, once installed, the assets are available as templates to build from — or an Application package, which means, once installed, the assets are instantiated and are live immediately.

By creating prebuilt content that is quick and simple to install, we want to make it easier for customers to onboard new log sources to Humio to quickly get value from that data. With this prebuilt content, customers won’t have to work out the best way of ingesting the logs and won’t have to create parsers and dashboards from scratch.

How do I make a package?

Packages are a great way to reduce manual work, whether that means taking advantage of prebuilt packages or creating your own so you don’t have to rebuild the same content from scratch each time.

Anyone can create a Humio package straight from Humio’s interface. We actively encourage customers and partners to create packages and submit those packages for inclusion in the Marketplace if they think they could benefit other customers. Humio will work with package creators to make sure the package meets our standards for inclusion in the Marketplace. By sharing your package with all Humio customers through the Marketplace, you are strengthening the community and allowing others to benefit from your expertise while you, likewise, benefit from others’ expertise.

For some customers, the package will be exactly what they want, but for others, it will be a useful starting point for further customization. All Humio packages are provided under an Apache 2.0 license, so customers are free to adapt and reuse the package as needed.

If I install a package, will it get updated?

Package creators can develop updates in response to changes in log formats or to introduce new functionality and improvements. Updates will be advertised as available in the Marketplace and users can choose to accept the update. The update process will check to see if any local changes have been made to assets installed from the package and, if so, will prompt the user to either overwrite the changes with the standard version from the updated package or to keep the local changes.

Are packages free?

Yes, all Humio packages in the Marketplace are free to use!

Can I use packages to manage my own private Humio content?

Absolutely! Packages are a convenient way for customers to manage their own private Humio content. Packages can be created in the Humio product interface and can be downloaded as a ZIP file and uploaded into a different Humio repository or a different instance of Humio (cloud or hybrid). Customers can also store their Humio packages in a code repository and use their CI/CD tools and the Humio API to deploy and manage Humio assets as they would their own code. This streamlines Humio support and operations and delivers a truly agile approach to log management.
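
As a sketch of what such a CI/CD step could look like, the snippet below bundles a package directory into a ZIP archive and uploads it over HTTP. The upload path, environment variable names and authorization scheme are placeholders invented for illustration, not Humio’s documented API; consult the Humio API documentation for the real endpoints.

    import io
    import os
    import urllib.request
    import zipfile
    from pathlib import Path

    def zip_package(package_dir):
        """Bundle a package directory (YAML assets, README, etc.) into an in-memory ZIP."""
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as archive:
            for path in Path(package_dir).rglob("*"):
                if path.is_file():
                    archive.write(path, path.relative_to(package_dir))
        return buf.getvalue()

    def upload_package(zip_bytes, base_url, repo, token):
        """POST the ZIP to a package-upload URL.

        NOTE: the path below is a placeholder; see the Humio API docs
        for the actual package endpoints.
        """
        url = f"{base_url}/api/placeholder/{repo}/packages"
        request = urllib.request.Request(
            url,
            data=zip_bytes,
            headers={"Authorization": f"Bearer {token}",
                     "Content-Type": "application/zip"},
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            return response.status

    if __name__ == "__main__":
        archive = zip_package("my-package")
        status = upload_package(archive,
                                base_url=os.environ["LOG_PLATFORM_URL"],
                                repo=os.environ["LOG_REPO"],
                                token=os.environ["API_TOKEN"])
        print("upload status:", status)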

Get started today

Getting started with packages is simple. All you need is access to a Humio Cloud service or, if you run Humio self-hosted, version 1.21 or later. To create and install packages, you need the “Change Packages” permission assigned to your Humio user role.

Access the Marketplace from within the Humio product UI (Go to Settings, Packages, then Marketplace to browse the available packages or to create your own package). Try creating a package and uploading it to a different repository. If you create a nice complex dashboard and want to recreate it in a different repository, you know what to do: Create a package; export/import it, and then you don’t need to spend time recreating it!

Let us know what else you want to see in the Marketplace by connecting with us at The Nest or emailing [email protected].


Everything You Need To Know About Log Analysis

16 November 2021 at 09:51

This blog was originally published Sept. 30, 2021 on humio.com. Humio is a CrowdStrike Company.

What Is Log Analysis?

Log analysis is the process of reviewing computer-generated event logs to proactively identify bugs, security threats, factors affecting system or application performance, or other risks. Log analysis can also be used more broadly to ensure compliance with regulations or review user behavior.

A log is a comprehensive file that captures activity within the operating system, software applications or devices. Logs automatically document any information designated by the system administrators, including: messages, error reports, file requests, file transfers and sign-in/out requests. The activity is also time-stamped, which helps IT professionals and developers establish an audit trail in the event of a system failure, breach or other outlying event.

Why Is Log Analysis Important?

In some cases, log analysis is critical for compliance since organizations must adhere to specific regulations that dictate how data is archived and analyzed. It can also help predict the useful lifespan of hardware and software. In addition, log analysis can help IT teams amplify four key factors that help deliver greater business value and customer-centric solutions: agility, efficiency, resilience and customer value.

Log analysis can unlock many additional benefits for the business. These include:

  • Improved troubleshooting. Organizations that regularly review and analyze logs are typically able to identify errors more quickly. With an advanced log analysis tool, the business may even be able to pinpoint problems before they occur, which greatly reduces the time and cost of remediation. Logs also help the log analyzer review the events leading up to the error, which may make the issue easier to troubleshoot and prevent in the future.
  • Enhanced cybersecurity. Effective log analysis dramatically strengthens the organization’s cybersecurity capabilities. Regular review and analysis of logs helps organizations more quickly detect anomalies, contain threats and prioritize responses.
  • Improved customer experience. Log analysis helps businesses ensure that all customer-facing applications and tools are fully operational and secure. The consistent and proactive review of log events helps the organization quickly identify disruptions or even prevent such issues — improving satisfaction and reducing turnover.
  • Agility. Log analysis helps organizations predict the useful lifespan of hardware and software and prepare for scale, providing a competitive edge in the marketplace.

How Is Log Analysis Performed?

Log analysis is typically done within a log management system, a software solution that gathers, sorts and stores log data and event logs from a variety of sources.

Log management platforms allow the IT team and security professionals to establish a single point from which to access all relevant endpoint, network and application data. Typically, logs are searchable, which means the log analyzer can easily access the data they need to make decisions about network health, resource allocation or security. Traditional log management uses indexing, which can slow down search and analysis. Modern log management uses index-free search; it’s less expensive, faster and can reduce required disk space by 50-100x.

Log analysis typically includes:

Ingestion: Installing a log collector to gather data from a variety of sources, including the OS, applications, servers, hosts and each endpoint, across the network infrastructure.

Centralization: Aggregating all log data in a single location as well as a standardized format regardless of the log source. This helps simplify the analysis process and increase the speed at which data can be applied throughout the business.

Search and analysis: Leveraging a combination of AI/ML-enabled log analytics and human resources to review and analyze known errors, suspicious activity or other anomalies within the system. Given the vast amount of data available within the log, it is important to automate as much of the log analysis process as possible. It is also recommended to create a graphical representation of data, through knowledge graphing or other techniques, to help the IT team visualize each log entry, its timing and interrelations.

Monitoring and alerts: The log management system should leverage advanced log analytics to continuously monitor the log for any log event that requires attention or human intervention. The system can be programmed to automatically issue alerts when certain events take place or certain conditions are or are not met.

Reporting: Finally, the LMS should provide a streamlined report of all events as well as an intuitive interface that the log analyzer can leverage to get additional information from the log.
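
To make these steps concrete, here is a small, self-contained sketch of the ingestion-to-alerting flow in Python. The log line format, the error threshold and the alert action are assumptions chosen for the example; a real log management system does the same work at far larger scale.

    import re
    from collections import Counter
    from datetime import datetime

    # Assumed line format: "<ISO timestamp> <LEVEL> <source> <message>"
    LINE = re.compile(r"^(\S+) (\w+) (\S+) (.*)$")
    ERROR_THRESHOLD = 5  # alert once a single source has logged this many errors

    def ingest(lines):
        """Ingestion and centralization: parse raw lines into one structured format."""
        for line in lines:
            match = LINE.match(line)
            if not match:
                continue  # skip malformed entries
            timestamp, level, source, message = match.groups()
            yield {"time": datetime.fromisoformat(timestamp),
                   "level": level.upper(),
                   "source": source,
                   "message": message}

    def monitor(events):
        """Search/analysis and alerting: count errors per source and flag noisy sources."""
        errors = Counter()
        for event in events:
            if event["level"] == "ERROR":
                errors[event["source"]] += 1
                if errors[event["source"]] == ERROR_THRESHOLD:
                    print(f"ALERT: {event['source']} reached {ERROR_THRESHOLD} errors")
        return errors  # per-source summary for reporting

    raw_lines = [
        "2021-11-16T09:51:02 ERROR api-gateway upstream timeout",
        "2021-11-16T09:51:03 INFO auth-service login ok",
    ]
    print(dict(monitor(ingest(raw_lines))))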

The Limitations of Indexing

Many log management software solutions rely on indexing to organize the log. While this was considered an effective solution in the past, indexing can be a very computationally-expensive activity, causing latency between data entering a system and then being included in search results and visualizations. As the speed at which data is produced and consumed increases, this is a limitation that could have devastating consequences for organizations that need real-time insight into system performance and events.

Further, with index-based solutions, search patterns are also defined based on what was indexed. This is another critical limitation, particularly when an investigation is needed and the available data can’t be searched because it wasn’t properly indexed.

Leading solutions offer free-text search, which allows the IT team to search any field in any log. This capability helps improve the speed at which the team can work without compromising performance.

Log Analysis Methods

Given the massive amount of data being created in today’s digital world, it has become impossible for IT professionals to manually manage and analyze logs across a sprawling tech environment. As such, they require an advanced log management system and techniques that automate key aspects of the data collection, formatting and analysis processes.

These techniques include:

  • Normalization. Normalization is a data management technique that ensures all data and attributes, such as IP addresses and timestamps, within the transaction log are formatted in a consistent way.
  • Pattern recognition. Pattern recognition refers to filtering events based on a pattern book in order to separate routine events from anomalies.
  • Classification and tagging. Classification and tagging is the process of tagging events with key words and classifying them by group so that similar or related events can be reviewed together.
  • Correlation analysis. Correlation analysis is a technique that gathers log data from several different sources and reviews the information as a whole using log analytics.
  • Artificial ignorance. Artificial ignorance refers to the active disregard for entries that are not material to system health or performance. (A short sketch after this list illustrates normalization and artificial ignorance.)
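
The sketch below applies two of these techniques, normalization and artificial ignorance, to a couple of invented events; the field names and ignore patterns are assumptions made for the example.

    import re
    from datetime import datetime, timezone

    # Artificial ignorance: patterns for routine entries that carry no signal.
    IGNORE_PATTERNS = [
        re.compile(r"health[- ]check", re.IGNORECASE),
        re.compile(r"connection closed normally", re.IGNORECASE),
    ]

    def normalize(event):
        """Normalization: lowercase field names and convert epoch timestamps to UTC ISO-8601."""
        out = {key.lower(): value for key, value in event.items()}
        timestamp = out.get("timestamp")
        if isinstance(timestamp, (int, float)):
            out["timestamp"] = datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()
        return out

    def interesting(event):
        """Artificial ignorance: actively disregard entries matching known-benign patterns."""
        message = str(event.get("message", ""))
        return not any(pattern.search(message) for pattern in IGNORE_PATTERNS)

    events = [
        {"Timestamp": 1637052662, "Message": "health-check ok", "Host": "web-1"},
        {"Timestamp": 1637052663, "Message": "disk usage at 91%", "Host": "db-2"},
    ]
    kept = [event for event in map(normalize, events) if interesting(event)]
    print(kept)  # only the disk-usage event survives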

Log Analysis Use Case Examples

Effective log analysis has use cases across the enterprise. Some of the most useful applications include:

  • Development and DevOps. Log analysis tools and log analysis software are invaluable to DevOps teams, as they require comprehensive observability to see and address problems across the infrastructure. Further, because developers are creating code for increasingly-complex environments, they need to understand how code impacts the production environment after deployment. An advanced log analysis tool will help developers and DevOps organizations easily aggregate data from any source to gain instant visibility into their entire system. This allows the team to identify and address concerns, as well as seek deeper information.
  • Security, SecOps and Compliance. Log analysis increases visibility, which grants cybersecurity, SecOps and compliance teams continuous insights needed for immediate actions and data-driven responses. This in turn helps strengthen the performance across systems, prevent infrastructure breakdowns, protect against attacks and ensure compliance with complex regulations. Advanced technology also allows the cybersecurity team to automate much of the log file analysis process and set up detailed alerts based on suspicious activity, thresholds or logging rules. This allows the organization to allocate limited resources more effectively and enable human threat hunters to remain hyper-focused on critical activity.
  • Information Technology and ITOps. Visibility is also important to IT and ITOps teams as they require a comprehensive view across the enterprise in order to identify and address concerns or vulnerabilities. For example, one of the most common use cases for log analysis is in troubleshooting application errors or system failures. An effective log analysis tool allows the IT team to access large amounts of data to proactively identify performance issues and prevent interruptions.

Log Analysis Solutions From Humio

Humio is purpose-built to help any organization achieve the benefits of large-scale logging and analysis. The Humio difference:

  • Virtually no latency regardless of ingestion, even in the case of data bursts
  • Index-free logging that enables full search of any log, including metrics, traces and any other kind of data
  • Real-time data streaming and streaming analytics with an in-memory state machine
  • Ability to join datasets and create a joint query that searches multiple data sets for enriched insights
  • Easily configured, sharable dashboards and alerts power live system visibility across the organization
  • High data compression to reduce hardware costs and create more storage capacity, enabling both more detailed analysis and traceability over longer time periods


How Humio Outpaces Traditional Logging Solutions and Leaves Competitors in the Dust

10 November 2021 at 17:04

This blog was originally published Sept. 24, 2021 on humio.com. Humio is a CrowdStrike Company.

From time to time, people ask us exactly what we mean when we say things like Humio lets you “stream live data” or Humio provides “real-time observability.” In this blog, we provide a high-level overview of traditional log management and explain some of the terms we use when explaining what makes Humio so powerful and unique compared to other solutions.

Legacy log management

Most businesses today rely on a diverse collection of compute, networking, security and software solutions supplied by different vendors and service providers. Security, IT and development professionals all rely on log data to ensure the performance, availability and security of this infrastructure. But examining discrete event logs individually is a manually-intensive, time-consuming and error-prone process. It’s nearly impossible to detect and resolve sophisticated security incidents or complex architectural issues with a siloed approach.

Most organizations simply can’t afford to gather and retain log data from all their networking gear, security products and other IT systems using SIEM solutions or conventional log management products. As a result, organizations have to limit the types of log records they collect or periodically age out log data, leaving security, IT and development staff in the dark.

Blind spots start to multiply, making it easy for malicious attackers to penetrate IT systems, traverse networks and avoid detection. Likewise, data gaps make it incredibly difficult for IT operations teams and developers to troubleshoot system performance problems and pinpoint application design issues. Because organizations can’t log everything, launching investigations of any kind becomes like looking for a needle in a haystack without knowing if the needle even exists.

Humio lets you log everything and answer anything in real time

Unlike conventional log management systems, Humio cost-effectively collects and analyzes unlimited data at any throughput, providing the full visibility needed to identify, isolate and resolve the most complex security, performance and reliability issues.

Most traditional log management vendors treat logging much like a general-purpose database, organizing and searching datasets using inefficient indexing techniques. Indexing introduces ingest and search latency, which impairs discoveries, observability and investigations. It also consumes excessive CPU and memory resources, adding hardware expense. Humio is based on an innovative index-free design that delivers extremely fast performance.

With Humio, businesses are no longer forced to make difficult decisions about which data to log and how long to retain it. By logging everything, Humio customers gain the complete visibility needed to detect and respond to any incident in real time.

Streaming observability explained

With that in mind, below are some of the phrases we use to describe Humio’s ability to log everything and answer anything.

When we say Humio lets you stream live data, we mean Humio ingests log data as quickly as it arrives, regardless of volume or throughput. We never drop or discard log data.

When we say Humio provides streaming observability, live observability or real-time observability with sub-second latency, we mean Humio lets you aggregate and visualize streaming log data in real time, so no matter what volume of data you send to Humio or how fast you send it, Humio processes it almost instantaneously. Humio updates alerts, scripts and dashboards in real time, giving you live visibility into the health and operations of your IT infrastructure.

Finally, when we say Humio provides blazing-fast free-text search, we mean Humio’s index-free design lets you search anything, in any field, with near-instantaneous results. Again, this is because of Humio’s index-free architecture, in which data is compressed, cutting required disk space by 50-100x. With Humio, you can search 1PB of data in less than a second. This opens the door to incredible efficiency gains, including highly effective incident response and prevention.

Wondering if Humio is right for you?

To learn more about how Humio’s live observability capabilities can reduce costs and improve performance, download our Total Cost of Ownership report. See how Humio can ingest over 1PB of streaming log data per day and help you massively reduce your operational expenses while providing real-time observability for security, development and IT operations professionals.


Top 6 financial services log management use cases

4 November 2021 at 10:12

This blog was originally published July 8, 2020 on humio.com. Humio is a CrowdStrike Company.

Organizations that provide financial services and fintech companies experience constant pressure from customers, regulators, and competitors to increase the speed and quality of their services. For those organizations making the move to the cloud, there are additional layers of complexity arising from microservices in containers. Financial services organizations find themselves in the place where they need technology that will address the competing concerns of performance, speed, security, and cost.

Log management provides a tool that addresses the rigors faced by modern financial services infrastructure. By providing network-wide visibility into all processes in real time, log management makes it easier and faster for financial organizations to respond to these pressures. The top six use cases below show how the flexibility of log management makes it more than a single-purpose tool: it is a competitive advantage that financial institutions can structure their entire strategy around.

The top six use cases of log management

1. Accelerate development of applications

Worldwide, mobile banking is expected to grow to a $26.341 trillion industry by 2026.

In this climate of explosive growth, pressure is on developers to put out multiple releases a day for their mobile banking apps. Any tool that speeds up the feedback loop offers a vital way of getting ahead of the competition. Log management does this by detecting errors and performance issues in application development environments, alerting developers to problems in real time so they can correct them before the code goes live. Log management further boosts application performance monitoring (APM) tools by including all log data, providing a detailed view that would otherwise be missed.

2. Monitor for security threats and get alerts

With robust options for alerting based on real-time streaming data, log management provides security personnel with the earliest possible indication of danger and can be configured via webhooks to automate a response such as closing certain ports or running a script on another host.

Logging everything enables security personnel to go back after an incident has occurred and investigate incidents or security threats. Log management tools can be used to search for indicators of compromise (IOC), while also supplying a powerful interface to drill down to the root causes and stop attacks at the source.
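
As an illustration of the webhook-driven automation described above, here is a small sketch of a receiver that runs a response script when an alert arrives. The listening port, payload shape and script path are assumptions made for the example; in practice the action would be whatever response the security team has approved.

    import json
    import subprocess
    from http.server import BaseHTTPRequestHandler, HTTPServer

    RESPONSE_SCRIPT = "/opt/security/block_ip.sh"  # placeholder response script

    class AlertWebhook(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            alert = json.loads(self.rfile.read(length) or b"{}")
            # Assumed payload shape: {"name": "...", "fields": {"source_ip": "..."}}
            source_ip = alert.get("fields", {}).get("source_ip")
            if source_ip:
                # Automated response: run an approved script against the offending address.
                subprocess.run([RESPONSE_SCRIPT, source_ip], check=False)
            self.send_response(200)
            self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8080), AlertWebhook).serve_forever()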

3. Assist support teams with a real-time view of transactions

It’s not just about providing raw data to developers and security teams. Log management can be used to build easy-to-read, graphic- and chart-rich dashboards that serve the needs of multiple teams.

For support teams, it can provide a view of possible errors on customer accounts, enabling them to see what exactly went wrong so they can correct it as quickly as possible, improving the customer experience.

4. Monitor performance of system

Log management provides system-wide indicators of performance, like latency and garbage collection, that give early signals of problems before they lead to a wider crash. Logs can indicate which processes are consuming the most resources and can draw attention to infrastructure that is overactive or wasting resources. If the system passes customizable performance thresholds, log management can trigger an automated response that provides more storage or processing capacity to prevent a crash.

5. Support compliance

Log management collects and centralizes data, preparing organizations for audits with easily accessible records of all logs. Modern log management compresses data further, allowing financial institutions to meet the retention requirements of regulations while using less hardware. Advanced solutions maintain compression while searching through the data, keeping data in active memory longer, further reducing costs by decreasing transaction time and CPU usage.

6. Monitor user behavior

Since it can record all transactions and activity in a system, logging can be used as a means of viewing user behavior to discover whether an account may have been compromised or involved in breaches of sensitive data. Log management can alert security teams if a user’s privileges are elevated – a classic sign of compromise. Once a threat is detected, security teams can use log management to track the behavior of any individual IP address across the system and determine the extent of possible damage.

Humio provides powerful log management for financial services

Humio was designed from the beginning to be a highly performant, adaptable log management solution that is ideal to address the needs of financial service providers.

By removing indexes from the data collection process, Humio gives users instant access to their live streaming data. By compressing historical data by 10-20x, it gives them sub-second access to historical searches of their data. This speed translates into a boost in response time for all teams — Development, Support, Security, and Operations.

Advanced compression also enables longer storage periods, making it less expensive to meet compliance standards. Humio is available as SaaS, and as a self-hosted solution for environments where financial regulations dictate where data must be stored.

Humio accepts all forms of structured and unstructured data, fitting any number of use cases. Its flexible interface lets users build custom alerts, searches, and dashboards that represent and track the most essential data points for each user. Sharable dashboards democratize the information and can make a curated selection of secure data available to any organization.

In addition to being a central repository for audit trail information, Humio provides audit logs of all user behavior within Humio, providing an extra layer of accountability and security.

Originally inspired by the efficiency of high-volume stock trading, Humio brings high-speed data access to log management, creating stronger security, faster insights, and a better experience for customers.

Hear how financial institutions like Deutsche Bank and Stash use Humio at our Financial Services Roundtable co-hosted with IBM.

Read more about financial services providers that use Humio: M1 Finance, SpareBank 1, and Lunar.

Learn how switching to unlimited log management can save institutions hundreds of thousands of dollars a year by reading Redefining Log Management TCO.

What is Application Monitoring?

2 November 2021 at 12:57

This blog was originally published Sept. 30, 2021 on humio.com. Humio is a CrowdStrike Company.

Introduction to application monitoring

Application monitoring is the process of collecting log data in order to help developers track availability, bugs, resource use, and changes to performance in applications that affect the end-user experience (UX). Application monitoring tools provide alerts on live anomaly events and, through distributed tracing, a means of seeing which events across multiple services form the causal chain that led to them.

Also known as application performance management (APM), application monitoring tools provide a visual means of seeing how events are connected through dependency and flow mapping. Application monitoring can be accomplished by dedicated tools to monitor apps, or by collecting and analyzing logs using log management tools. With application monitoring, the end goal is to maximize availability and give customers the best experience.

The main functions of application monitoring tools are:

  • To observe app components – Components may include servers, databases, and message queues or caches.
  • To provide app dashboards and alerts – Dashboards give an overview, alerts drive attention to specific problems.
  • Anomaly detection – Can vary from simple threshold detection to advanced machine learning pattern recognition.
  • Distributed tracing – Tracking how one event connects across multiple nodes to detect the origins of errors (a minimal sketch follows this list).
  • Dependency & flow mapping – A visual representation of how requests travel between services.

Challenges

As applications multiply with the growth of microservices and the migration to disparate cloud environments, maintaining observability has become more difficult over time. Without centralized monitoring, organizations may rely on separate tools such as network performance monitoring, server monitoring and user monitoring, each collecting a limited set of metrics rather than the view a dedicated application monitoring (APM) tool provides, which results in an incomplete picture. Organizations operating with a continuous delivery model have an even harder time capturing and understanding the dependencies within an application environment. And where APM tools have adapted to meet the needs of a dynamic environment, they may sacrifice the ability to respond to incidents in real time.

The persistent sources of difficulty for APM tools:

  • Continuous change – Continuous delivery model delivers higher performance overall, but for monitoring, it makes determining context difficult.
  • Complexity – Millions of data points are spread over an increasingly complex network of operations, relationships, and dependencies.
  • Limited data – APM-only tools may miss configuration and operational data found in non-application logs.
  • Unsynced timestamps – Not including the right configuration or platform dependencies within timeframe analysis leads to incomplete understanding.
  • Siloed monitoring solutions – Data separated across multiple solutions slows the detection of root causes.

Answering APM challenges with log management

Log management expands on the roles of APM tools by providing observability across the entire infrastructure. Whereas APM typically captures a subset of all log data, log management includes all data, allowing detailed root cause investigation and analysis. Log management solutions can also access more data from specific platforms than APM monitoring agents can, including network issues, database connections and availability, and information about what’s happening in a container that the app relies on.

Built to compress and store data, log management also facilitates historical analysis of data, enabling users to identify sources of performance problems on a much larger scale. Because log management is optimized for response time, it provides additional benefits:

  • Observability of the entire infrastructure
  • Comprehensive root cause investigation and analysis
  • Search across all relevant data, not just application data
  • Longer data retention and long-term storage


Choosing modern log management

Not all log management tools meet the needs of complex, microservices-heavy APM. Look for log management with these features that address the core needs of APM in a modern distributed environment:

  • Unlimited data ingestion
  • Non-indexed queries
  • Real-time data and streaming

Unlimited data ingestion

With microservices, there is exponentially more data than with monolithic or service-oriented architecture (SOA) applications. On top of the individual stack data, there is also application data, and each request can have a unique path through the infrastructure. Trying to guess which pieces of data to include for analysis is practically impossible. Using a log management tool that supports unlimited data ingestion lets you include all the data and answer unexpected questions that come up later.

Non-indexed queries

Indexing data as it’s collected, and then searching those indexes for analysis, slows everything down and gets in the way of advanced data analysis. Just one troubleshooting session could incorporate dozens of queries. If streaming data can be collected without being restricted to defining the schema upfront, there is much more freedom to explore relationships later. Non-indexed queries enable instant search results, encouraging users to ask more questions and explore further.

Real-time data and streaming

As organizations move from a few software releases a year to dozens a day, the need for immediate feedback is greater than ever. The only way to effectively assist the ops team to keep their service levels up and decrease their mean time to resolution (MTTR) is to provide data in near real time. The best way to do that is to stream data from the source and make it available without delays for indexing.

Humio is modern log management

Built index-free to collect live streaming data, and with a cost-efficient unlimited licensing plan, Humio is a modern log management solution that addresses the current needs of application monitoring and performance management. It fills the gaps in observability while providing real-time alerts that can boost a team’s performance and meet the demands of modern customers.

Gem State University Saves a Small Fortune on TCO With Humio

16 September 2021 at 12:49

This blog was originally published on humio.com. Humio is a CrowdStrike Company.

Overview

The University of Idaho uses Humio to ingest and analyze network security log data at scale. Humio provides incredible cost-savings compared to their previous logging solution, helping the university increase security insights, streamline incident detection and response efforts, and reduce TCO.

“With Humio, it’s easier and faster to search than it was with previous solutions. We can get to the root of malicious activity like phishing attacks more quickly and efficiently.” — Mitch Parks, Chief Information Security Officer, University of Idaho

Challenge: Reducing Log Management Cost and Complexity

Like many budget-conscious organizations, the IT services department at the University of Idaho is always looking for creative ways to do more with less. The university was using their previous solution to capture and analyze network security log data, but the solution was costly and complicated to scale.

“Because of budget constraints, we could only afford to license 100 gigabytes of data per day. A security incident like a denial-of-service attack can easily drive up our log volumes, trigger licensing caps, and impair forensics.” — Mitch Parks, Chief Information Security Officer, University of Idaho

Solution: Humio Logs Everything at Scale in Real Time

After investigating a number of log management alternatives, including open-source solutions, the university selected Humio as its next-generation security log management platform.

“The open-source approach would have required as many as 12 servers, and we would have needed a dedicated IT person to deploy and maintain it,” recalls Parks. “That just didn’t make sense from an investment perspective. I had read about how other universities had successfully switched to Humio and decided to take a look at it.”

“We evaluated Humio for about 30 days and were quite impressed,” explains Carl Pearson, IT security analyst for the university. “The product is easy to set up and use, and doesn’t require a dedicated IT admin or a SIEM expert, or take a lot of my time to manage.”

Results: Faster and Deeper Insights, Lower TCO

Humio’s state-of-the-art log management platform helped the university improve visibility, slash operations expenses and complexity, and reduce risk and exposure.

“With Humio we save at least $10K a year in licensing fees alone,” says Parks. The university can now retain at least a year’s worth of full log data, which is paramount when sophisticated threat actors can penetrate networks and evade detection for weeks or even months on end.

“With other solutions, we spent a lot of time and effort cleaning up our logs to save space. In the process, we removed Active Directory events and other information that we actually needed later for forensics. We don’t have to worry about any of that anymore with Humio.” — Mitch Parks, Chief Information Security Officer, University of Idaho

Once they started using Humio, Parks and Pearson quickly found additional use cases for the platform beyond security. The IT Services team now uses Humio to identify potential system performance and availability issues, flag possible software licensing violations, and gather other IT operations and application insights.


How Fast Can You Grep?

14 September 2021 at 12:54

This blog was originally published Sept. 28, 2017 on humio.com. Humio is a CrowdStrike Company.

Assume that you have a 1GB text you want to search.

A typical SSD lets you read on the order of 1GB/s, which means that you can copy the file contents from disk into memory at that speed.

Next, you will then need to scan through that 1GB of memory using some string search algorithm.

If you try to run a plain string search (memmem) on 1GB, you realize that it also comes at a cost. A decent implementation of memmem will do ~10GB/s, so it adds another 1/10th of a second to your result to search through 1GB of data. Total time: 1.1 second (or 0.9GB/s).

Now, what if we compress the input first?

Imagine for simplicity that the input compresses 10x using lz4 to 0.1GB (on most workloads we see 5–10x compression). It takes just 0.1 second to read in 0.1GB at 1GB/s from disk into main memory. lz4 decompresses at ~2GB/s on a stock Intel i7, or 0.5 second for 1GB. Add search time of 0.1 second to a total of 0.6s for reading from disk and decompressing, and we can now search through 1GB in just 0.7s (or 1.4GB/s). And all of the above is on a single machine. Who needs clusters?

Compressing the input has the obvious additional advantage that the data takes up less disk space, so you can keep more data around and/or keep it for a longer period of time. If, on the other hand, you use a search system that builds an index, then you’re likely to bloat your storage requirements by 5–10x. This is why Humio lets you store 25–100x the data of systems that use indexing.

Assuming we’re on a 4-core i7 machine, we can split the compressed data into four units of work that are individually decompressed and searched on each core for an easy 4x speedup; 1/4th of the 0.5-second decompression time on each core is 0.125s. This gives us a total search time of roughly 0.225 seconds, or 4.4GB/s on a single 4-core machine.

But we can do better.

All of the above assumes that we work in main memory, which is limited by a theoretical ~50GB/s bandwidth on a modern CPU; in practice we see ~25GB/s.

Once data is in the CPU’s caches it can be accessed even faster. The downside is that the caches are rather small; the level-2 cache, for instance, is 256KB. In the previous example, by the time the decompression of 1/4 of 1GB is done, the beginning of those 256MB has long been evicted from the cache.

So what if we move the data into the level-2 cache in little compressed chunks, so that their decompressed output also fits in the same cache, and then search in an incremental way? Accesses to the level-2 cache are ~10x faster than accesses to main memory, so this would let us speed up the decompress-and-search phase by an order of magnitude.

To achieve this, we preprocess the input by splitting the 1GB into up to 128k chunks that are individually compressed.
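
A toy version of that chunk-wise approach is easy to sketch in Python. It uses zlib from the standard library rather than lz4, so the ratios and speeds will differ from the figures above; it is only meant to show the shape of the technique.

    import zlib

    CHUNK_SIZE = 128 * 1024  # split the input into small, independently compressed chunks

    def compress_chunks(data):
        """Preprocess: compress each fixed-size chunk of the input on its own."""
        return [zlib.compress(data[i:i + CHUNK_SIZE])
                for i in range(0, len(data), CHUNK_SIZE)]

    def search_chunks(chunks, needle):
        """Decompress and search one small chunk at a time, keeping the working set cache-sized."""
        hits = 0
        for chunk in chunks:
            # Matches straddling a chunk boundary are ignored in this toy version.
            hits += zlib.decompress(chunk).count(needle)
        return hits

    log_data = b"GET /index 200\nGET /login 500\n" * 100_000
    chunks = compress_chunks(log_data)
    print("compressed size:", sum(len(c) for c in chunks), "bytes")
    print("matches for ' 500':", search_chunks(chunks, b" 500"))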

Adding all this up for a search of 1GB: 0.1s to read from disk, 0.004s to move the 0.1GB of compressed data from main memory to the cores at 25GB/s, and a blazing, 10x-faster 0.0125s to decompress and search, for a total of 0.1265 seconds, reaching 7.9GB/s.

But what if the 1GB of file contents is already in the operating system’s file system cache, because it was recently written or because this is the second time around doing a similar search? Then loading the file contents would be nearly instantaneous, and the entire processing would take just 0.0265 seconds, or roughly 37GB/s.

Loading data from disk can be done concurrently with processing it, so the loading and the processing overlap in time. Notice that we’re now again dominated by I/O: reading the compressed data from disk takes longer than all the other steps combined. This is why Humio searches faster the better the input compresses. If you search more than a few GBs, processing is essentially limited by the speed at which we can load the compressed data from disk.
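
A sketch of that overlap: a background thread keeps reading compressed chunks from disk while the main thread decompresses and searches the previous ones. The length-prefixed chunk framing and the file layout are assumptions for the example, not an actual Humio format.

```python
import queue
import threading
import lz4.frame

def read_chunks(path, out):
    """Producer: stream length-prefixed compressed chunks from disk into a queue."""
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break
            out.put(f.read(int.from_bytes(header, "big")))
    out.put(None)  # sentinel: no more chunks

def overlapped_search(path, needle):
    chunks = queue.Queue(maxsize=8)  # bounded, so the reader stays just ahead of the scanner
    threading.Thread(target=read_chunks, args=(path, chunks), daemon=True).start()
    hits = 0
    while (chunk := chunks.get()) is not None:
        hits += lz4.frame.decompress(chunk).count(needle)  # CPU work overlaps the next read
    return hits
```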

To enable even faster searches, you simply employ multiple machines. The problem is trivially parallelizable, so searching at 100GB/s would need just three machines of the desktop i7 class.

The beauty is that this generalizes not just to search, but to many other data processing problems that can be expressed in Humio’s query language. Whatever the processing is, it is presented with the entire input, which makes it easy to extract data and do aggregations such as averages, percentiles, count distinct, etc.

But in the Real World…

Many interesting aggregate computations require non-trivial state (probabilistic percentiles need a sample pool, the HyperLogLog we use for count distinct needs some fancy bitmaps), and this state ruins the on-CPU caching somewhat, reducing performance. Even something as simple as keeping the most recent 200 entries around slows things down.

In all honesty, most of the above is more or less wishful thinking: these are the theoretical limits of an optimal program. For several reasons, we really only get around 6GB/s per node, roughly 1/6th of the ~37GB/s I tallied up above. The trouble is that our system does many other things that influence the outcome, and it is really hard to measure exactly where the bottleneck is at the appropriate level of detail without perturbing the measurement. But performance is still decent, and (unfortunately) our customers are asking for more features, not more performance, at present.

The system really lends itself to a data processing problem where lots of data is ingested but queries are relatively rare. So it’s a good match for a logging tool: logs arrive continually, they are relatively fast to compress, and only a few people, such as sysops and developers, initiate queries. Humio easily sustains a large volume of ingest; we have seen successful single-node deployments taking in more than 1TB/day. When someone comes around to ask a question, it will use all available processing power (for a short while) for just that single query.

In a later post, I’ll get back to how we improve these tradeoffs using stream processing to maintain ‘views’ that are readily available for retrieval.


Everything You Think You Know About (Storing and Searching) Logs Is Wrong

9 September 2021 at 13:20

This blog was originally published Aug. 25, 2020 on humio.com. Humio is a CrowdStrike Company.

Humio’s technology was built out of a need to rethink how log data was collected, stored, and searched. As the requirements for data ingest and management are increasing, traditional logging technologies and the assumptions on which they were built no longer match the reality of what organisations have to manage today.

This article explores some of those assumptions, the changes in technology that impact them, and why Humio’s purpose-built approach is a better option for customers to get value with real-time search and lower costs.

3 assumptions about log data

There are three main assumptions that just don’t hold true today (and we like things that come in threes because it makes for neat sections in a blog).

1. Indexes are for search, therefore searches need indexes – False

Traditional thinking about how to do search at scale comes down to one concept: indexing the data. Indexing traditionally involves scanning the documents in question, extracting and ranking the terms, etc., etc. For many years, the ubiquitous technology for this has been Apache Lucene. This is the underlying technology in the search engines of many tools, and in more recent years has been “industrialized” into a really flexible technology thanks to the work of Elastic with the Elasticsearch tools.

But it’s not the best choice for logs (or more specifically streaming human-readable machine data). The assumption that indexes are best for all search scenarios is wrong.

This is no reflection on the technology itself; it’s designed for randomised search and it does that very well. Elastic gets a pass; they didn’t set out to build a log aggregation and search tool.

The other vendors that did set out to build such a tool and took an index-based approach may also get a pass, because indexing was the prevailing technology at the time.

2. Compression, and decompression, are slow – Not anymore

Data can be compressed to make storage more efficient, but the perception remains that compressing and decompressing data will slow things down significantly. In fact, compressing data can actually make search faster. There are two pieces to that discussion.

Firstly, if you design and optimise your system around compression, it makes reading, writing, storing, and moving data faster. Humio does exactly that, and you can read about some of this thinking in a Humio blog post: How fast can you grep? Compression is assumed to be slow because so many users have experienced it in systems where it was introduced as an afterthought, a kludge to help solve the storage requirements of indexed data.

Secondly, compression algorithms are still making progress and being optimised. There are arguments that the latest techniques are reaching theoretical limits of performance, but let’s not declare that everything that will be invented has been.

Humio makes use of the Zstandard family of compression algorithms, and they are FAST. More about that in a bit.

3. Datasets become less manageable with size/age, or are put in the freezer – Datasets are not vegetables!

We often talk to prospective customers that have a requirement for Hot/Warm/Cold storage; and in the context of uncompressed, indexed data, this can make sense. People are used to the concept that storage is expensive, and that the storage “tier” is something the application needs to be aware of (e.g., hot data on local disk, warm data on SAN, etc.).

Two things have changed significantly here: storage is no longer as expensive as people are used to it being, and a whole new class of storage, Object Storage, has become available to application developers and users alike.

The merits of Object Storage are covered in a bit more detail in a recent post, The Indestructible Blob, and described in the Humio How-To Guide: Optimize the stack with cloud storage.

How does Humio break these conventions?

We’re not going to give you all the details for what Humio does in these areas, but we can certainly discuss the general ways in which Humio reexamined these assumptions, and some of the results of doing so.

Indexes are not the solution

Indexing streaming data for the purposes of search is expensive, slow, and doesn’t result in a faster system for the kinds of use cases customers have for Humio. The interesting thing is that even the leading vendors of other data analytics platforms know this. They have had to work around this very problem to achieve acceptable solutions with things like “live tail” and “live searches”, etc. These index-based tools have to engineer around their own indexing latency to get the performance needed to claim “live” data … that should have been a big hint that maybe indexing wasn’t needed at all!

By moving away from the use of indexes (Ed: Humio still does actually index event timestamps, but we get the point), Humio does not have to do any of the processing and index maintenance that goes along with it. This means that:

  • When data arrives at Humio it is ready for search almost immediately. We’re talking 100-300 ms between event arrival and that same event being returned in a search result (a manual search, a live search that is already running, an alert, or a dashboard update).
  • Humio does not have to maintain indexes, merge them with new indexes, track which indexes exist, fix corruption in indexes, none of that. For those technologies that do rely on indexes, the indexes themselves become very large. Assuming the index is used to make the entire event searchable, indexing can make the data up to 300% larger than it was in its raw form.
  • With Humio, all queries are against the same datastore; there’s no split processing between historical and live data. Now consider where indexing is used for “search” and some sort of live streaming query is used to power “live” views of the data: tools that take this approach will often show users a spike in a live dashboard, but the user cannot search those events in detail or even view them in the live view.

Find out more about Humio’s index-free architecture in this blog post: How Humio’s index-free log management searches 1 PB in under a second.

Compression everywhere

Humio uses highly efficient compression algorithms to ensure minimal storage space is required (did I mention we don’t build indexes?), often achieving 15:1 compression against the original raw data, and in some cases exceeding 30:1.

These compression algorithms allow for extremely fast decompression of the data. Humio analyses and organises incoming data so it can apply techniques like compression dictionaries to optimally sized segment files in storage (i.e., we don’t have to build and access monolithic blocks of data to achieve high compression ratios).
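
To make the dictionary idea concrete, here is a minimal sketch using the third-party zstandard Python package (pip install zstandard): train a shared dictionary on a pile of small, similar log events, then compress individual blocks with it. The sample events, sizes and levels are all made up for illustration and say nothing about Humio’s actual segment format.

```python
import zstandard  # third-party package, assumed installed: pip install zstandard

# Lots of small, repetitive log events: exactly the case where a shared dictionary pays off.
samples = [
    f'ts=2020-08-25T10:{i // 60 % 60:02d}:{i % 60:02d} level=INFO service=web-{i % 7} '
    f'msg="request ok" dur={i % 250}ms'.encode()
    for i in range(20_000)
]

# Train a small, illustrative dictionary from the samples.
dictionary = zstandard.train_dictionary(16 * 1024, samples)

compressor = zstandard.ZstdCompressor(level=3, dict_data=dictionary)
decompressor = zstandard.ZstdDecompressor(dict_data=dictionary)

# Compress one small block of events with the shared dictionary...
block = b"\n".join(samples[:100])
compressed = compressor.compress(block)
print(f"{len(block)} bytes -> {len(compressed)} bytes using the shared dictionary")

# ...and get the original bytes back on decompression.
assert decompressor.decompress(compressed) == block
```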

For more background on the kinds of techniques Humio uses, this original article from Facebook Engineering is a good read: Smaller and faster data compression with Zstandard.

Find out more about Humio compression on the Humio product page: Keep 5-15x more data, for longer.

Accessing data

The final piece of the puzzle here is getting access to the right data when a user issues a query. Humio can’t go scanning all the raw event content, no matter how fast it might be. This is where the storage pattern that Humio utilises comes into the picture, along with the heuristics a node in the cluster uses to get access to the data and scan it.

Firstly, segment files are built around optimally-sized groups of data (some secret sauce is added here to make that happen effectively and transparently to the user). These segment files also have accompanying bloom filters built, which means Humio can quickly and effectively identify only the relevant segments for any given query.
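
As an illustration of how a per-segment bloom filter lets a query skip segments, here is a minimal sketch; the hashing scheme, filter size and segment naming are assumptions for the example, not Humio’s on-disk format.

```python
import hashlib

class BloomFilter:
    """A tiny Bloom filter: it may answer 'maybe present', but never wrongly answers 'absent'."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, term: bytes):
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(term, salt=i.to_bytes(8, "big")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, term: bytes) -> None:
        for pos in self._positions(term):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, term: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(term))

# One filter per segment file, built from the terms seen in that segment.
segment_filters = {"segment-0001": BloomFilter(), "segment-0002": BloomFilter()}
for term in (b"checkout", b"user=42"):
    segment_filters["segment-0001"].add(term)
segment_filters["segment-0002"].add(b"login")

# At query time, only segments whose filter says "maybe" need to be fetched and scanned.
relevant = [seg for seg, bf in segment_filters.items() if bf.might_contain(b"checkout")]
print(relevant)  # ['segment-0001'] (barring the occasional false positive)
```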

The segments work really well on local or network-attached storage, and their size and nature make them an excellent fit for Object Storage.

What does a query pipeline typically look like?

  1. A query is issued against a Humio cluster. Humio identifies which segment files are relevant, based on the time range and scope of the query.
  2. The nodes that handle the query then fetch the relevant segment files for their part of the query job:
    1. First, check on the local storage/cache for the segment.
    2. Secondly, check the other nodes in the cluster for the segment.
    3. Finally, fetch the segment from the object storage.
  3. Complete the scan and return the results to the query coordinator.

Fun fact: Because the object storage can be so efficient, you can tell Humio to always fetch missing segments from the object storage rather than from the other nodes in the cluster, as that’s sometimes the fastest way to do things.
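
A minimal sketch of that fetch order, trying the three lookups from the list above in sequence, with a flag for the always-go-to-object-storage variant from the fun fact. The function names and storage interfaces are illustrative, not Humio’s actual API.

```python
from typing import Callable, Optional

def fetch_segment(
    segment_id: str,
    local_cache: dict,
    peer_lookup: Callable[[str], Optional[bytes]],
    object_store_get: Callable[[str], bytes],
    prefer_object_store: bool = False,
) -> bytes:
    """Resolve a segment for scanning: local cache first, then peers, then object storage."""
    if segment_id in local_cache:                   # 1. already on this node
        return local_cache[segment_id]

    if not prefer_object_store:
        data = peer_lookup(segment_id)              # 2. another node in the cluster
        if data is not None:
            local_cache[segment_id] = data
            return data

    data = object_store_get(segment_id)             # 3. fall back to (or prefer) object storage
    local_cache[segment_id] = data
    return data
```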

For more information on the Humio architecture, see this blog post that summarizes a presentation given by Humio CTO Kresten Krab Thorup: How Humio leverages Kafka and brute-force search to get blazing-fast search results.

Conclusion

Humio has reconsidered the problem of ingesting and searching log data. Through a new approach and new technologies that are available, it has built a solution that scales efficiently and performs better than the systems that have come before it, often by more than an order of magnitude in terms of speed, storage, and total cost of ownership.

Want to find out more? Set up some time with us for a live demo, or see how it performs for yourself with a 30-day trial.

