In previous articles, we outlined how to build a report that displays real-time IoT sensor data and visualizes the corresponding historical sensor data.
In this study, our primary focus was on finding a scalable approach for archiving time-series data efficiently. Our goal was to enable smooth aggregation across various temporal scales, allowing us to present these aggregated insights in our reports. We identified several tools that seemed well-suited to meet our requirements, and the outcome of our efforts is indeed noteworthy.
Furthermore, aside from visualizing historical sensor data, we have kept the ability to display real-time data using the architecture described in a previous article. The architecture we chose uses a Microsoft Azure IoT Hub as the endpoint for the IoT devices and stores the data in an InfluxDB database.
The communication between IoT Hub and InfluxDB occurs through the Telegraf component of InfluxData.
Real-time visualization in Power BI is achieved by leveraging Microsoft Fabric’s data streaming infrastructure.
Let’s now look at this architecture in more detail, but first let’s introduce two of its main components, InfluxDB and Telegraf.
InfluxDB is an open-source tool designed for the collection, monitoring, and querying of data related to measurements with timestamps, such as those received from various sensing devices like thermometers, hygrometers, and more.
The metrics received are collected into a Bucket, which corresponds to a database, and are stored as time series. Each Bucket contains a table with fields that indicate the timestamp, field name, measured value, and sensor name.
InfluxDB distinguishes between two fundamental data components: Tags and Fields. Tags serve as metadata used for grouping and filtering data, while Fields hold the numeric values from the sensors. One of the powerful features of InfluxDB is its query language called Flux. Flux enables users to perform various operations on the data, such as grouping, filtering, and aggregating, providing a flexible and efficient way to extract meaningful insights from the collected measurements.
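To make the Tag/Field distinction concrete, here is a minimal Python sketch that formats one reading in InfluxDB line protocol, the text format in which points are written. The measurement name, tag names, and values are hypothetical and do not come from our setup; the function covers only numeric and boolean fields, which is all a typical sensor reading needs.

```python
def _escape(text: str, chars: str) -> str:
    """Backslash-escape each character of `chars` found in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value) -> str:
    """Render a field value: booleans as true/false, integers with an 'i' suffix."""
    if isinstance(value, bool):  # must be tested before int, since bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    raise TypeError(f"unsupported field type: {type(value).__name__}")


def to_line_protocol(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
    """Build one line: measurement,tag=value,... field=value,... timestamp."""
    if not fields:
        raise ValueError("a point needs at least one field")
    # Tags are metadata used for grouping and filtering; sorted here for a stable output.
    tag_part = "".join(
        f",{_escape(key, ', =')}={_escape(str(val), ', =')}"
        for key, val in sorted(tags.items())
    )
    # Fields carry the measured values.
    field_part = ",".join(
        f"{_escape(key, ', =')}={_format_field(val)}" for key, val in fields.items()
    )
    return f"{_escape(measurement, ', ')}{tag_part} {field_part} {timestamp_ns}"


# Hypothetical reading from a thermometer in "lab 2".
line = to_line_protocol(
    "environment",
    {"sensor": "thermo-01", "room": "lab 2"},
    {"temperature": 21.5},
    1700000000000000000,
)
print(line)
```

The sensor and room end up in the tag set, where they can be used to group and filter, while the temperature reading is the field.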
In summary, InfluxDB is a versatile open-source tool that excels in handling time-series data, making it a valuable asset for applications requiring real-time monitoring, analytics, and data-driven decision-making based on sensor data.
Telegraf is a server agent, run from the command line, that receives, transforms, and sends IoT metrics. It uses a configuration file to specify the Input, Transformation, and Output parameters.
Input: In our case, the input plugin is inputs.eventhub_consumer, used together with the json_v2 parser (configured under inputs.eventhub_consumer.json_v2), which receives metrics from an Azure IoT Hub. It is configured by specifying the connection string and the specific metrics to retrieve.
Transformation Processor: The plugin used here is processors.starlark, which allows for the modification of the structure of collected metrics (adding or renaming fields) and altering their values (e.g., unit conversions, etc.). In our setup, it’s used to transform the metrics collected by the input plugin into a format that InfluxDB can readily interpret.
Output: The output plugin specified in the configuration file is outputs.influxdb_v, enabling data transmission to an InfluxDB server. It is configured with the server’s address, an access authorization token, and the name of the Bucket, which corresponds to the database that will receive the data.
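The transformation step described above can be illustrated with a short sketch. Starlark is a Python dialect, and a processors.starlark script defines an apply(metric) function that returns the modified metric. The code below is plain Python that mimics that shape with a small stand-in Metric class; it is not our actual script, and the field names temp_f and deviceId are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    """Stand-in for a Telegraf metric: a name plus tag and field dictionaries."""
    name: str
    tags: dict = field(default_factory=dict)
    fields: dict = field(default_factory=dict)


def apply(metric: Metric) -> Metric:
    """Reshape an incoming metric before it is written to InfluxDB."""
    # Rename a field and convert its unit (Fahrenheit to Celsius).
    if "temp_f" in metric.fields:
        fahrenheit = metric.fields.pop("temp_f")
        metric.fields["temperature_c"] = round((fahrenheit - 32) * 5 / 9, 2)
    # Promote the device identifier from a field to a tag so it can be used for filtering.
    if "deviceId" in metric.fields:
        metric.tags["sensor"] = str(metric.fields.pop("deviceId"))
    return metric


# Hypothetical incoming metric.
incoming = Metric("environment", fields={"temp_f": 212.0, "deviceId": "thermo-01"})
result = apply(incoming)
print(result)
```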
In the configuration file it is also possible to manage the routing between inputs and outputs so that, for example, the data received from input1 is sent to output1 and the data received from input2 is sent to output2. In this way, two Azure IoT connections can be managed, each sending data to its own Bucket.
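The routing logic can be pictured as follows. In Telegraf this kind of routing is typically expressed by tagging the metrics of each input and filtering on that tag in each output (for example with tagpass); the Python below only mimics the idea and is not Telegraf configuration. The tag name source and the hub and Bucket names are hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from the tag value set on each input to its destination Bucket.
ROUTES = {"hub1": "bucket1", "hub2": "bucket2"}


def route(metrics, routes=ROUTES, tag="source"):
    """Group metrics by destination Bucket according to the value of a routing tag.

    Metrics whose tag is missing or unknown are dropped, as an output filter would do.
    """
    batches = defaultdict(list)
    for metric in metrics:
        bucket = routes.get(metric["tags"].get(tag))
        if bucket is not None:
            batches[bucket].append(metric)
    return dict(batches)


# Hypothetical metrics coming from two IoT Hub connections.
metrics = [
    {"tags": {"source": "hub1"}, "fields": {"temperature": 21.5}},
    {"tags": {"source": "hub2"}, "fields": {"humidity": 40.0}},
    {"tags": {"source": "hub1"}, "fields": {"temperature": 22.0}},
    {"tags": {}, "fields": {"temperature": 19.0}},
]
batches = route(metrics)
print({bucket: len(items) for bucket, items in batches.items()})
```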
In this architecture, the IoT devices are connected to a Microsoft Azure IoT Hub, which serves as the central hub for managing device connections and data ingestion. Through a Microsoft Fabric EventStream job, the data is ingested into a Kusto database and can be visualized in real time in a Power BI report through the VCAD custom visual.
Once the data is in the Azure environment, it can be ingested into InfluxDB for storage using Telegraf. As a highly efficient time-series database, InfluxDB is well suited to handling timestamped sensor data. Telegraf therefore lets us collect the data sent to the IoT Hub, using an “Event Hub-compatible” connection string as the endpoint, and archive it in our InfluxDB database.
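For readers unfamiliar with it, an Event Hub-compatible connection string is a list of key=value pairs separated by semicolons (Endpoint, SharedAccessKeyName, SharedAccessKey, EntityPath). The small Python sketch below splits such a string into its parts, which can help when checking that the right value was copied from the Azure portal. The sample string is entirely made up and contains no real endpoint or key.

```python
def parse_connection_string(conn_str: str) -> dict:
    """Split a 'Key=Value;Key=Value' connection string into a dictionary."""
    parts = {}
    for segment in conn_str.split(";"):
        if not segment:
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only: base64 keys may themselves end with '='.
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"malformed segment: {segment!r}")
        parts[key] = value
    return parts


# Made-up example; every value here is a placeholder.
sample = (
    "Endpoint=sb://example.servicebus.windows.net/;"
    "SharedAccessKeyName=service;"
    "SharedAccessKey=abc123==;"
    "EntityPath=my-hub"
)
parsed = parse_connection_string(sample)
print(sorted(parsed))
```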
Once bulk-loaded into our database, the data can be sampled at different time scales, and each sampled series can be stored in its own table. This approach lets us keep the retention period short, so that we do not store excessively large volumes of raw data, especially when a large number of sensors generate frequent readings. The historical data is then visualized by connecting Power BI to InfluxDB.
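The sampling step amounts to grouping raw readings into fixed time windows and keeping one aggregate per window. Inside InfluxDB this would normally be done with a Flux query (for instance using aggregateWindow), so the Python below is only an illustration of the idea, assuming timestamps in Unix seconds and the mean as the aggregate. The readings are invented.

```python
from collections import defaultdict
from statistics import fmean


def downsample(readings, window_seconds: int):
    """Average (timestamp, value) readings over fixed windows.

    Returns a list of (window_start, mean_value) tuples sorted by window start.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    windows = defaultdict(list)
    for timestamp, value in readings:
        # Align each reading to the start of the window it falls in.
        window_start = timestamp - timestamp % window_seconds
        windows[window_start].append(value)
    return [(start, fmean(values)) for start, values in sorted(windows.items())]


# Invented readings taken every 30 seconds, reduced to one value per minute.
raw = [(0, 20.0), (30, 22.0), (60, 24.0), (90, 26.0), (120, 30.0)]
per_minute = downsample(raw, 60)
print(per_minute)
```

Running the same function with a larger window (an hour, a day) yields the coarser series, which is what allows the raw readings to be kept only for a short retention period.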
In conclusion, this architecture addresses many of the issues encountered when reading, storing, and visualizing IoT data. By leveraging InfluxDB and Microsoft Fabric, we obtain historical and real-time insights, respectively, from our IoT data.