New Topics

  • The QuasarDB daemon reports "out of free sessions"

    Leon Mergen · 0 · Posted

    Summary

    The QuasarDB daemon reports the following error:

    Clients might experience reliability issues, and sessions time out.

    Cause

    All operating systems put limits on the maximum number of file descriptors or TCP connections a process can use. In QuasarDB, each client uses a dedicated connection to each partition it wants to write to. When this error occurs, it means QuasarDB is unable to reserve new connections for clients.
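
    As a rough illustration of the limits mentioned above, the following sketch reads the open-file limits of the current process on a Unix-like system using Python's standard resource module. It inspects the Python process that runs it, not the QuasarDB daemon; the limit that applies to qdbd is whatever its own launch environment sets. The helper name fd_limits is made up for this example.

```python
import resource

def fd_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

soft, hard = fd_limits()
# RLIM_INFINITY means "no limit" for that value.
print(f"soft limit: {soft}, hard limit: {hard}")
```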

    Resolution

    Option 1: Ensure sessions are properly cleaned up

    The most effective solution to this is to ensure you are using proper resource management client-side. Ensure that you are tearing down your QuasarDB

  • What are the hardware recommendations for my cluster?

    Leon Mergen · 0 · Posted

    Summary

    You want to allocate resources for a QuasarDB cluster and need an estimate of the hardware requirements in terms of storage, cache and speed.

    Getting started

    When determining the minimal cluster size, the dataset and how you interact with it are the driving forces behind the decision. We will be looking at the following variables:

    • Ingestion speed
    • Data retention period
    • Querying patterns

    Throughout this article we use a trading firm as an example, but the same strategy can be applied to any use case.
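
    To give a feel for how ingestion speed and retention period interact, here is a back-of-the-envelope sketch. It is not the article's own sizing method: it simply multiplies rows per second by row size and retention time, and ignores compression, replication and index overhead. The function name and the workload numbers are hypothetical.

```python
def estimate_raw_storage_bytes(rows_per_second, row_size_bytes, retention_days):
    """Raw (uncompressed, unreplicated) bytes accumulated over the retention period."""
    seconds = retention_days * 24 * 60 * 60
    return rows_per_second * row_size_bytes * seconds

# Hypothetical trading workload: 10,000 rows/s, 64 bytes per row, 30 days of retention.
total = estimate_raw_storage_bytes(10_000, 64, 30)
print(f"{total / 1e12:.2f} TB before compression or replication")
```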

    Row size

    The average size (in bytes) of a row is the

  • The QuasarDB API reports "An entry matching the provided alias cannot be found."

    Leon Mergen · 0 · Posted

    Summary

    You are performing operations on a multi-node QuasarDB cluster that are performed in quick succession and depend upon each other. Your code might look like this:

     series.attach_tag('a')
     tag = cluster.tag('b')
     tag.attach_tag('b')

    While performing the last operation, the QuasarDB client API reports the following error:

    Cause

    The different entries are stored on different nodes within your cluster and you have a clock skew between these nodes. The clock skew causes a lag before the first entry is visible to all nodes, causing a failure when these operations are performed in quick succession. 

    Resolution

    Deploy time synchronization from the same time

  • The QuasarDB daemon reports "chord algorithm stopped as it encountered a logic error"

    Leon Mergen · 0 · Posted

    Summary

    You are setting up a multi-node cluster, and an error with the following message appears in your qdbd log:

    At least one node in the cluster fails to come online.

    Cause

    This error is thrown when the chord algorithm fails to stabilize because it requires the nodes to be time-synchronized within reasonable limits.

    Resolution

    Deploy time synchronization from the same time source across all nodes in the cluster, and make sure this is synchronized on a frequent basis (we recommend once an hour). We recommend using a time source that is as predictable and available as

  • Setting up a Docker evaluation

    Leon Mergen · 0 · Posted

    Summary

    This article describes how to set up a multi-node cluster in a Docker environment for simple evaluation.

    Prerequisites

    • Docker engine

    Getting started

    The Docker strategy is as follows:

    • We will be using the bureau14/qdb docker container
    • This container uses qdbd as entrypoint, so we can easily add additional command-line arguments to it
    • We will mount volumes for persistent storage, license files, and others.

    Generally we recommend running RHEL (or derivatives) on the Docker host because of better support for dtrace / systemtap, but any Linux distribution is supported.

    To get started, first do a sanity check whether your docker environment works

  • Measuring performance

    Leon Mergen · 0 · Posted

    Summary

    Measuring performance can be useful for a number of reasons:

    • You want accurate numbers on how long it takes for the QuasarDB daemon to insert a certain amount of data;
    • You want to run a live analysis on where exactly the QuasarDB daemon is spending its time;
    • You want on-line programmable response to certain actions inside QuasarDB.

    For this purpose, the Linux build of the QuasarDB daemon is instrumented with SystemTap probes. This document provides an example SystemTap script you can use to monitor various latencies.

  • How many CPU sockets should I use in a server?

    Leon Mergen · 0 · Posted

    Summary

    You need to decide whether to invest in a server with multiple CPU sockets or more powerful CPUs inside a single socket.

    Because of Non-Uniform Memory Access (NUMA), we recommend choosing CPUs with more cores before you expand into more CPU sockets. NUMA causes memory access latency to depend upon the physical location of the memory, which makes performance unpredictable. All other things being equal, you will experience better performance with a single 16-core CPU than with two 8-core CPUs or four 4-core CPUs.
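
    If you want to check how many NUMA nodes an existing Linux server exposes, a minimal sketch is to count the nodeN directories under /sys/devices/system/node. This assumes a Linux host with sysfs mounted in the usual place; the function name count_numa_nodes is made up for this example, and it returns 0 when the directory is missing.

```python
import os
import re

def count_numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Count NUMA nodes by looking for nodeN entries; 0 if the path is missing."""
    if not os.path.isdir(sysfs_root):
        return 0
    return sum(1 for name in os.listdir(sysfs_root)
               if re.fullmatch(r"node\d+", name))

print(count_numa_nodes())
```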

  • Group by arbitrary column

    Leon Mergen · 0 · Posted

    Currently, QuasarDB only supports grouping by timespans (e.g. group by 5min or 1h). This should be extended into grouping by arbitrary columns, so that queries like this are possible:

    The current workaround would be to create different timeseries per exchange, which is less flexible when you want to query the same dataset in multiple ways.
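
    To illustrate what grouping by an arbitrary column means, the sketch below sums a value per group on rows that have already been fetched, in plain Python on the client. It is neither a QuasarDB feature nor the workaround described above; the row layout, the "volume" column and the function name are hypothetical.

```python
from collections import defaultdict

def sum_by_column(rows, group_key, value_key):
    """Sum value_key for each distinct value of group_key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)

# Hypothetical rows from a single timeseries that mixes several exchanges.
rows = [
    {"exchange": "NYSE", "volume": 100.0},
    {"exchange": "LSE", "volume": 40.0},
    {"exchange": "NYSE", "volume": 25.0},
]
print(sum_by_column(rows, "exchange", "volume"))
```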

  • Look up all available timeseries

    Leon Mergen · 0 · Posted

    Summary

    You want to look up all available timeseries in your QuasarDB cluster. QuasarDB allows for this by attaching tags to your timeseries upon creation and using the same tag to look them up at a later time.

    Solution

    To look up all available timeseries in your QuasarDB cluster, we recommend assigning all of them the same tag. For example, when you attach the tag "store" to all your timeseries, you effectively create an index of all your timeseries. As an added benefit, it allows you to run queries and aggregates across all your timeseries, as described in

  • Aggregation of multiple timeseries

    Leon Mergen · 0 · Posted

    Summary

    QuasarDB allows for aggregations of data stored in multiple timeseries in a single query by making use of Tags. This allows for concise and high-performance execution of aggregations. This article explains how to set up your data model in a way that supports this, and how to write queries that make use of it.

    Tags and Timeseries

    QuasarDB uses a concept called "Tags" for efficient indexing and querying of stored entities. Because timeseries in QuasarDB are just another entry type, we can leverage this for timeseries by providing tags upon construction.

    Note - tags apply to entire timeseries,

  • The performance of my inserts degrades over time

    Leon Mergen · 0 · Posted

    Symptoms

    While loading data into QuasarDB, the performance of similar insert operations slowly degrades over time. This correlates with the QuasarDB daemon showing a relatively high CPU usage.

    Cause

    QuasarDB organizes timeseries data in shards, which are dynamically created or updated when appropriate. When adding data to a shard, QuasarDB has to reindex all data in that shard, which is a complex operation that requires a lot of CPU resources.

    Resolution

    Option 1: tune your shard sizes

    The most frequent scenario is that the shard size of a timeseries has been misconfigured. For example, if you have a shard size

  • My timeseries is slow when I use many columns

    Leon Mergen · 0 · Posted

    Symptoms

    You are using a timeseries that has many different columns (more than 1,000). The insert performance of this timeseries is considerably worse than a timeseries with few columns.

    Cause

    QuasarDB implements MVCC transactions to ensure data consistency. References to these transactions are maintained in a map with O(log n) complexity. When you use a large number of columns, the upkeep of the data structure that maintains the transaction references becomes a bottleneck.

    Resolution

    Unlike other databases, QuasarDB isn't limited to a certain number of timeseries and we encourage the use of many different timeseries; millions of timeseries is appropriate for

  • Copying data between two QuasarDB clusters

    Leon Mergen · 0 · Posted

    Summary

    You want to copy data between QuasarDB clusters. This is a common use case that can be required when:

    • Upgrading a QuasarDB cluster version;
    • Safeguarding against data loss or data corruption by creating backups on a secondary cluster;
    • Creating a snapshot copy for use in a staging or development environment.

    While QuasarDB does not provide native cluster-to-cluster data migration, we do provide you with the tools to do this yourself.

    Choose a strategy

    There are two different strategies you can take when copying data:

    Clone snapshot

    This is the simplest approach, where the entire contents of the primary cluster

  • QuasarDB reports that no entry is in memory, but memory usage is high

    Leon Mergen · 0 · Posted

    Symptoms

    The QuasarDB daemon log file shows the following message:

    Additionally, QuasarDB shows very high CPU usage.

    Cause

    QuasarDB reports its actual memory usage, which includes internal data structures as well as entries in cache. The daemon is configured in a way that causes its internal data structures to grow beyond the configured limit, leaving no available memory for actual data.

    This causes the QuasarDB daemon to frequently start a cache eviction process, possibly many times per second. This is an expensive operation that causes a very high CPU load, which degrades system performance.

    Resolution

    Option 1: Use the

  • Copying a timeseries

    Leon Mergen · 0 · Posted

    Summary

    You want to copy all data from one QuasarDB timeseries to another. While QuasarDB does not provide this functionality natively, we do provide you with the tools to do this yourself.

    Solution

    To copy a timeseries, the best solution is to create a custom application that reads all data from one timeseries and inserts it into another. This is similar to the approach taken for Copying data between two QuasarDB clusters.
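
    The read-then-insert loop of such an application could be sketched as follows. This is a generic illustration and does not use the QuasarDB Python API: read_rows stands in for whatever iterates over the source timeseries, write_batch for whatever inserts into the destination, and both names, the batch size and the in-memory lists are hypothetical.

```python
from itertools import islice

def copy_in_batches(read_rows, write_batch, batch_size=1000):
    """Stream rows from a source iterable to a sink in fixed-size batches.

    Returns the number of rows copied.
    """
    copied = 0
    it = iter(read_rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return copied
        write_batch(batch)
        copied += len(batch)

# In-memory stand-ins for the source and destination timeseries.
source = [(t, t * 1.5) for t in range(2500)]
dest = []
print(copy_in_batches(source, dest.extend))
```

    Copying in batches keeps memory usage bounded regardless of how large the source timeseries is.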

    A clone snapshot is created using the following strategy:

    An example of what the code could look like is provided with our Python API. You can find it at