Monitoring

Provides an overview of the default telemetry for Substrate nodes and the Polkadot ecosystem.


Last updated 1 year ago

Monitoring network operations and performance is an important part of maintaining a healthy, stable, and secure ecosystem. By default, Substrate nodes provide low-level telemetry components to automatically collect and transmit detailed information about the operation of each node in the network. The backend telemetry information is streamed in real time—as it occurs—to an endpoint so that you can observe, monitor, and alert on system behavior.

As a developer or node operator, there's rarely any need to know the implementation details of how these low-level telemetry components provide information about computer and network operations. At a high level, the information collected from each public node is sent to a default telemetry server and aggregated for display in a front-end dashboard.

The following diagram provides a simplified overview of the workflow.

View telemetry data

To see the list of columns and modify which columns are displayed, click Settings.

The columns you can toggle on and off include:

  • Validator to display an indicator to differentiate validator nodes from other nodes.

  • Location to display the location of a node.

  • Implementation to display the version of the software running on the node.

  • Network ID to display the public network identifier for the node.

  • Peer Count to display the number of peers connected to the node.

  • Transactions in Queue to display the number of transactions in the Ready queue waiting for a block author.

  • Upload Bandwidth to display a graph of recent upload activity in MB per second.

  • Download Bandwidth to display a graph of recent download activity in MB per second.

  • State Cache Size to display a graph of the state cache size in MB.

  • Block to display the current best block number for the node to ensure it's staying synchronized with its peers.

  • Block Hash to display the block hash for the current best block number for the node.

  • Finalized Block to display the most recently finalized block number for the node to ensure it's staying synchronized with its peers.

  • Finalized Block Hash to display the block hash for the most recently finalized block number for the node.

  • Block Time to display the block execution time between blocks.

  • Block Propagation Time to display the time it took for the node to import the most recent block.

  • Last Block Time to display the time it took for the node to author the most recent block.

  • Node Uptime to display the number of days the node has been online without being restarted.

Pin node information

You can keep the information for one or more nodes in view in the dashboard by clicking the line displayed for the node in the dashboard list. For example, if you want to keep the information for two nodes visible while the information for other nodes scrolls by, you can select the nodes in the list and pin the information in place.

Display nodes on a map

As an alternative to the list of nodes, you can display nodes on a global map. The map provides a visual representation of where nodes are located, where blocks are being produced and imported in real time, and where there are higher concentrations of nodes geographically. Note that only nodes that have a known location are displayed in the map.

To see nodes represented on a map, click Map.

In this view, you can hover over any node location on the map to display a subset of the information available for the node, including the node name, location, best block number, best block hash, and block time statistics.

Display node statistics

In addition to the information available for individual nodes, you can view statistics that describe the number and percentage of nodes that share specific attributes. For example, you can see the percentage of nodes that are using each release of Polkadot, the percentage of nodes using the Linux operating system and the most common distributions currently deployed. You can also view statistics about the CPU and memory configuration across nodes in the network.

To see the full set of statistics available, click Statistics.

The statistics available include details about the hardware and software running on the nodes in the network, including:

  • Software version.

  • Operating system.

  • CPU architecture and model.

  • Number of physical CPU cores.

  • Total memory.

  • Whether the node is a virtual machine.

  • Linux distribution and kernel version.

  • CPU and memory speed.

  • Disk speed.

Filter by chain

By default, the telemetry dashboard displays information about Polkadot and provides quick links to a subset of other chains. To display information for other chains, click More and select a different chain.

Customize the monitoring stack

The default telemetry dashboard provides visibility into node and chain operations without requiring you to host or configure any backend monitoring or front-end services. However, many projects choose to augment or replace the default telemetry server with their own backend server and front-end dashboards.

In general, setting up your own telemetry server involves establishing monitoring and alerting policies for both on-chain events and individual node operations.

On-chain activity

Node operations

You should monitor each node that you run on the network for basic information about its operation such as the current block height, the number of peer-to-peer connections, CPU usage, and the available free memory. By default, Substrate exposes many useful metrics on the host:9615/metrics endpoint. For example, if Substrate is running locally, you can see the metrics on the http://localhost:9615/metrics endpoint. This endpoint outputs metrics using a simple key-value format.

For example:

polkadot_database_cache_bytes 0

However, keys can also include descriptive tags. For example:

substrate_block_height{status="best"} 136
substrate_block_height{status="finalized"} 133
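To make the key-value format concrete, the following is a minimal sketch of how a script might read this output. It assumes the text has already been fetched from the metrics endpoint. The helper name parse_metrics and the sample text, including the `# TYPE` comment line, are illustrative rather than taken from a real node, and the parsing is deliberately simplified (for example, it does not handle escaped quotes inside label values).

```python
import re

# Matches lines such as: name{key="value",other="x"} 123
_METRIC_LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'   # metric name
    r'(?:\{(?P<labels>[^}]*)\})?'            # optional {label="value",...} tags
    r'\s+(?P<value>\S+)'                     # sample value
)
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from metrics text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines such as "# HELP" and "# TYPE".
        if not line or line.startswith("#"):
            continue
        match = _METRIC_LINE.match(line)
        if match is None:
            continue
        labels = dict(_LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Illustrative sample text; a real node reports many more metrics.
SAMPLE = """\
# TYPE polkadot_database_cache_bytes gauge
polkadot_database_cache_bytes 0
substrate_block_height{status="best"} 136
substrate_block_height{status="finalized"} 133
"""

samples = parse_metrics(SAMPLE)
heights = {
    labels["status"]: value
    for name, labels, value in samples
    if name == "substrate_block_height"
}
# Number of blocks by which finality trails the best block.
finality_lag = heights["best"] - heights["finalized"]
print(finality_lag)
```

In practice, Prometheus scrapes and parses this endpoint for you. A script like this is mainly useful for quick manual checks.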

The metrics endpoint is exposed only on the local network interface by default, but you can expose it on all interfaces by using the --prometheus-external command-line option when you start a node.

Configure monitoring tools

As this diagram illustrates, there are different tools available for each layer of the stack. In this example, the following tools are configured for monitoring on-chain activity and node operations:

  • Prometheus is a monitoring engine that collects metrics from specified targets at specified intervals and evaluates the data collected using rules you define. Its time series database can hold large amounts of data that can be accessed very quickly.

  • Grafana is an observability platform that allows you to query, visualize, and analyze the data you collect through dashboards and graphs.

  • Node exporter is a process that listens on a port and reports host-level metrics, such as CPU, memory, and bandwidth usage, to Prometheus.

  • Alertmanager is a tool that enables you to create and route alerts based on the rules you specify. Alertmanager allows you to configure how and where to send alerts if something goes wrong. For example, you can send instant messages for warning alerts, but page an on-call technician for critical alerts.

  • Loki is a scalable log aggregation system that allows you to view and search logs from all components in your infrastructure in one place.
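The rule-and-route idea behind Prometheus and Alertmanager can be sketched in a few lines. This is not Prometheus rule syntax or Alertmanager configuration. It only illustrates the logic of evaluating collected values against rules and sending the resulting alerts to different receivers by severity. The thresholds, alert names, and receiver names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    name: str
    severity: str  # "warning" or "critical"
    message: str


# Hypothetical thresholds; tune them for your own chain and block time.
FINALITY_LAG_WARNING = 10
FINALITY_LAG_CRITICAL = 50
MIN_PEERS = 3

# Hypothetical receivers: instant message for warnings, pager for critical alerts.
ROUTES = {"warning": "chat", "critical": "pager"}


def evaluate(best, finalized, peers):
    """Apply simple rules to node values and return the alerts that fire."""
    alerts = []
    lag = best - finalized
    if lag >= FINALITY_LAG_CRITICAL:
        alerts.append(Alert("FinalityLag", "critical", f"finality is {lag} blocks behind"))
    elif lag >= FINALITY_LAG_WARNING:
        alerts.append(Alert("FinalityLag", "warning", f"finality is {lag} blocks behind"))
    if peers < MIN_PEERS:
        alerts.append(Alert("LowPeerCount", "warning", f"only {peers} peers connected"))
    return alerts


def route(alerts):
    """Pair each alert with the receiver configured for its severity."""
    return [(ROUTES[alert.severity], alert) for alert in alerts]


for receiver, alert in route(evaluate(best=200, finalized=100, peers=1)):
    print(receiver, alert.name, alert.message)
```

In a real deployment, the rules live in Prometheus rule files and the routing in the Alertmanager configuration, so thresholds can change without touching code.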

Change the telemetry server

After you have configured the backend monitoring rules for your nodes, you can use the --telemetry-url command-line option when you start a node to specify the telemetry server that you want to send telemetry data to. You can pass this option multiple times to specify multiple telemetry endpoints. If you specify the --telemetry-url command-line option, you must also specify how verbose metrics should be, with level 0 denoting the least verbose through level 9 denoting the most verbose.

For example, to specify your own telemetry server URL with a verbosity level of 5, you would run a command similar to the following:

./target/release/node-template --dev \
  --telemetry-url "wss://192.168.48.1:9616 5" \
  --prometheus-port 9616 \
  --prometheus-external
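If you start nodes from a script, a small helper can assemble the repeated --telemetry-url options and reject verbosity levels outside the 0 to 9 range. The following is a minimal sketch. The function name telemetry_args is hypothetical, and the second endpoint URL is a placeholder rather than a real server.

```python
import shlex


def telemetry_args(endpoints):
    """Build --telemetry-url arguments from (url, verbosity) pairs."""
    args = []
    for url, level in endpoints:
        # Verbosity runs from 0 (least verbose) to 9 (most verbose).
        if not isinstance(level, int) or not 0 <= level <= 9:
            raise ValueError(f"verbosity must be an integer from 0 to 9, got {level!r}")
        # The URL and the level are passed together as one argument.
        args += ["--telemetry-url", f"{url} {level}"]
    return args


command = ["./target/release/node-template", "--dev"] + telemetry_args(
    [
        ("wss://192.168.48.1:9616", 5),
        ("wss://telemetry.example.org/submit/", 0),  # placeholder endpoint
    ]
)

# Show the command as it would be typed in a shell.
print(shlex.join(command))
```

Because the list form keeps each URL and level together as a single argument, it can be handed to subprocess.run without extra quoting.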

Disable telemetry

Telemetry is enabled for all global chain nodes by default. You can use the --no-telemetry command-line option to prevent a node from connecting to the Substrate telemetry server. For example, to prevent telemetry data from being sent to the default telemetry server, you would run a command similar to the following:

./target/release/node-template --chain myCustomChain \
  --no-telemetry

The dashboard provides a real-time view of how currently online nodes are performing. You can choose the information you want to display by turning visible columns on and off from the list of columns available.

You can monitor on-chain activity for specific events, such as transactions submitted from a certain address or a change to the current validator set. On-chain monitoring typically involves connecting to RPC nodes to check for specific values, identify processing delays, or track the timing of events. In most cases, you only need two RPC instances to handle requests for all of your hosts. However, it's recommended that you run your own RPC servers to service these requests in case there are issues with the public RPC nodes. Some examples of applications that query on-chain information are polkabot and polkadot-basic-notification.
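As a rough illustration of checking specific values over RPC, the following sketch queries a node's JSON-RPC interface for the best and finalized block numbers using the chain_getHeader and chain_getFinalizedHead methods. The helper names are hypothetical. The sketch also assumes that your node serves JSON-RPC over HTTP at a URL you supply, which depends on how the node was started.

```python
import json
import urllib.request


def rpc_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params or []}


def call_rpc(url, method, params=None, timeout=10):
    """Send one JSON-RPC request over HTTP and return its result."""
    payload = json.dumps(rpc_request(method, params)).encode()
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    if "error" in body:
        raise RuntimeError(body["error"])
    return body["result"]


def block_number(header):
    """Block headers report the block number as a hex string."""
    return int(header["number"], 16)


def best_and_finalized(url):
    """Return (best, finalized) block numbers for the node at url."""
    best = block_number(call_rpc(url, "chain_getHeader"))
    finalized_hash = call_rpc(url, "chain_getFinalizedHead")
    finalized = block_number(call_rpc(url, "chain_getHeader", [finalized_hash]))
    return best, finalized
```

Calling best_and_finalized with the RPC URL of a node you operate returns both heights, and a growing gap between them is one way to spot a processing delay.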

By default, the metrics endpoint is exposed only on the local network interface.

To set up monitoring and alerting policies, you typically configure a set of tools to create your own monitoring stack. For example, the default metrics endpoint doesn't include host metrics (such as CPU, memory, and bandwidth usage), so you can complement it by installing the Prometheus node_exporter on each host. The following diagram illustrates an open source set of tools that are often used as a monitoring stack.

For a simple example of setting up node monitoring using Prometheus, Grafana, and node exporter, see Monitor node metrics. For a simplified example of using Loki, see Remote logging.

For more information about the backend components for telemetry or configuring your own server, see substrate-telemetry or the telemetry helm chart for Kubernetes deployments.
