The world of Industrial IoT is brimming with promise. Everywhere you turn, industries are showcasing the latest and greatest technologies to eke out another fractional percentage point of efficiency. And these gains all rest on the same foundation: collecting vast amounts of data to produce novel insights.
When a machine builder decides to embark on their own digitalization mission, their mantras include “data is the new gold” and “collect all the data, we’ll figure out what to do with it later.” A key component of this mission is collecting OT (Operational Technology) data, which usually refers to data from the automation side of the plant floor.
Unfortunately, without a clear vision for what to do with this OT data, these digitalization initiatives can soon fall apart. According to a 2019 Microsoft IoT Signals report¹, nearly one third of all IoT proofs of concept fail. One key reason is the high cost of scaling: beyond the raw cost, it becomes difficult to justify a business case without a real short-term impact, and the pilots demonstrate unclear business value or return on investment. When the high cost of data collection and scaling is weighed against so little final value, these projects stall or get cancelled.
But before we dive deeper, we need to take a step back.
What is machine / OT data used for?
Data gathered from the automation layers of a plant hierarchy is rich with detail. From the machine’s PLC to the I/O to the various sensors spread about, a lot can be learned from this data. From our own research, there are two main applications for machine / OT data:
Dashboards
Dashboard applications are simple views consisting of pie charts, bar graphs, and alarms that monitor, in near real time, the status and performance of the automation equipment.
Because a dashboard depends on near real-time data, neither determinism nor historical accuracy is necessary. If the data from one or two seconds ago is missing, it’s not a loss, because the user only really cares about the status right now. Therefore, a reasonable amount of data loss is acceptable. Lastly, because data is needed for a dashboard immediately, any processing takes place within the dashboard application itself; no pre-processing is performed (or needed) before the data is sent to the dashboard application.
Analytics
Analytics applications are more complex views consisting of trend graphs and historical analyses. For these analyses to provide meaningful insights, only relevant data should be provided to the application. This means that a good amount of data processing needs to take place before the data is sent to the final analytics application. Filtering the data and providing quality datasets are crucial for the application to identify cause-effect relationships and ultimately improve the performance of the machine. Moreover, these datasets need to be complete, with no losses or incoherencies. Duplicates and improperly ordered values reduce the quality of a dataset, and both need to be caught before delivery.
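To make this concrete, here is a minimal sketch, in Python with pandas, of the kind of clean-up an analytics pipeline must perform when the collection layer does not guarantee ordering and uniqueness. The tag name and values are made up for illustration:

```python
import pandas as pd

# Hypothetical raw samples as they might arrive from a collector:
# out of chronological order, with a duplicated reading at 10:00:01.
raw = pd.DataFrame({
    "timestamp": [
        "2024-01-01 10:00:02",
        "2024-01-01 10:00:00",
        "2024-01-01 10:00:01",
        "2024-01-01 10:00:01",
    ],
    "tag": ["motor_temp_C"] * 4,
    "value": [41.8, 41.5, 41.7, 41.7],
})
raw["timestamp"] = pd.to_datetime(raw["timestamp"])

# Analytics-ready: one value per tag per timestamp, in time order.
clean = (
    raw.drop_duplicates(subset=["tag", "timestamp"])
       .sort_values(["tag", "timestamp"])
       .reset_index(drop=True)
)
print(clean)
```

Trivial for one machine, but when every site in a fleet needs this treatment before every analysis, the clean-up cost multiplies, which is exactly why it pays to push this work into the collection layer.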
In the first steps of a digitalization strategy, a clear distinction needs to be made between these two types of application. As can be seen above, they have very different requirements, and we posit that failing to distinguish between them is a key reason why digitalization initiatives ultimately fail. In fact, collecting all your data as fast as possible will lead to grave inefficiencies. One study found that 30% of Business Intelligence professionals spend between 50% and 90% of their time making their data “analytics-ready”.
While most machine builders start their projects with a simple dashboard in mind, their real goal is to gain insights into their machine fleet as quickly as possible. As a result, they should be aiming for an analytics application, and a good first milestone is usually a preventive maintenance application. We will continue this blog post with the assumption that a preventive or predictive maintenance application will be the first big IIoT application deployment.
The most common reason companies fail – the high cost of scaling
If a machine builder tries to design their strategy to address both dashboard and analytics applications at once, there might be too much data (or not enough structured data) to provide a real return on investment. It’s no wonder that less than 1% of unstructured data is analyzed or used at all³. Trying to collect as much data as possible, as fast as you can, is a surefire way to reach unsustainable scaling costs. And as mentioned before, this is the #1 reason that a third of all digitalization projects fail.
Scaling costs increase quickly for two main reasons:
Expanding a Proof of Concept is complicated
If you have ever set up a simple point-to-point data transfer, you might be deceived into thinking that scaling your project will be just as easy. But the problem with the pervasiveness of the IoT proof of concept (PoC) is that a PoC is easy to get off the ground: you’re only gathering data from a handful of sites. When it comes time to expand the data collection to factories around the world, or to multiple machines in the same factory, the costs multiply quickly.
A trusted software vendor shared with us that each site takes approximately 2 hours of engineering time to get PLC data collection working properly with their remote server. For small projects with only 5 or 10 machines, this is not a big deal. But expand this to the reach of a global machine builder, say a fleet of 500 machines, and that is 1,000 engineering hours before a single insight is produced; all potential cost benefits have already been eaten up.
In addition to this, project scopes change. The tags collected, the way they are processed, and sometimes even the end application shift. These changes might seem trivial, but they result in real engineering hours. Again, multiply this by the number of sites, and it becomes a burden to manage.
Finally, proofs of concept are sometimes at the mercy of the factory owner. IT maintenance and various other routine procedures could prevent key data from being sent to your IoT application. If you don’t have a way to buffer and historize the data, you might miss critical events for analysis.
Cloud data costs add up
Both Azure and AWS offer reasonable rates and services to get you started on their platforms. They might even offer rebates, deals, and specials to encourage you to consume as much as you want. But while data storage costs are very reasonable, data ingestion costs, the costs of moving data into the cloud platform, are a different story.
When a machine builder pursues an analytics application, filtering the data before ingestion is the most effective way to manage costs. Otherwise, a fair amount of post-processing is required once the data is ingested, and the computing costs drastically increase the total cost of ownership.
But what if these scaling costs could be mitigated?
What if expanding your PoC to a full-blown pilot took a few minutes?
What if you could optimize your data in order to feed your analytics application properly?
Introducing DataMailbox, Talk2M’s first data collection service
Ewon has been a part of the data collection conversation for 20+ years. As we watched customer needs and expectations evolve, we realized that a new method for collecting and managing OT data was necessary.
In 2015, we launched the first Talk2M service for data collection, called DataMailbox. The basis of this service was that properly ordered, sorted, and “clean” historical datasets would be critical for the first wave of IoT applications such as analytics and preventive maintenance. Furthermore, the developers of these applications would not be experts in the OT world. Therefore, gathering data from multiple sites should be easy, intuitive, and secure.
From this hypothesis followed the key features that became the hallmarks of the DataMailbox service:
Data buffered on an intermediary server (ISO27001-certified Talk2M)
No data loss, no duplicates, and a properly ordered dataset
Historical datasets retrievable every 10 minutes
Simple and easy configuration
More recently, DataMailbox has also been helping our Solution Partners and IoT developers manage their scaling costs. Because DataMailbox has built-in services to filter for just the relevant data, it avoids duplicates and unnecessary data ingestion costs. On the Ewon device, you can avoid general polling and instead record a new tag value only when the value changes. Furthermore, you can set a predetermined margin for this change, called a dead-band, as sketched below.
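Ewon devices handle this in their tag configuration, but to illustrate the logic itself, here is a minimal sketch of change-based reporting with a dead-band. This is illustrative Python, not Ewon firmware code, and the readings are hypothetical:

```python
def deadband_filter(samples, deadband):
    """Yield only the samples that differ from the last reported value
    by more than the dead-band margin (report by exception)."""
    last_reported = None
    for timestamp, value in samples:
        if last_reported is None or abs(value - last_reported) > deadband:
            last_reported = value
            yield timestamp, value

# Hypothetical 1 Hz temperature readings: only 3 of the 6 samples
# move by more than 0.25 °C, so only those 3 are sent upstream.
readings = [(0, 41.50), (1, 41.52), (2, 41.49),
            (3, 41.80), (4, 41.81), (5, 42.30)]
for ts, val in deadband_filter(readings, deadband=0.25):
    print(ts, val)
```

The wider the dead-band, the fewer values are buffered and ingested, so the margin becomes a direct knob for trading data resolution against ingestion cost.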
DataMailbox provides a public REST API so that modern developers can implement and integrate an easy-to-use connector into their application. This open API greatly eases the burden on developers by avoiding lock-in. In a matter of hours, developers can simplify their data collection at scale. Need to collect data from 5 sites? Only one API call is needed. Has your project scope increased so that you now need to collect data from 50 sites? Still only one API call. While the benefits are incredible for the IoT application provider, customers ultimately win.
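As an illustration, a basic connector can be as small as the sketch below. The endpoint and parameter names (syncdata, t2mtoken, t2mdevid, lastTransactionId) reflect our reading of the DMWeb API and should be verified against the current Ewon developer documentation; the token and developer ID shown are placeholders:

```python
import requests

# Assumed DMWeb endpoint; check Ewon's API reference for the current URL.
DM_URL = "https://data.talk2m.com/syncdata"

def fetch_new_data(token, dev_id, last_transaction_id=None):
    """One call returns the buffered history for every Ewon gateway in
    the account, whether that means 5 sites or 50."""
    params = {
        "t2mtoken": token,           # Talk2M account API token (placeholder)
        "t2mdevid": dev_id,          # Talk2M developer ID (placeholder)
        "createTransaction": "true", # ask the server to track what was read
    }
    if last_transaction_id is not None:
        # Only samples newer than the last transaction are returned.
        params["lastTransactionId"] = last_transaction_id
    response = requests.get(DM_URL, params=params, timeout=30)
    response.raise_for_status()
    return response.json()  # gateways, tags, and history in one payload

# Usage sketch: poll periodically, persisting the transaction id between
# calls so each poll retrieves only the new, de-duplicated history.
# data = fetch_new_data("MY_TOKEN", "MY_DEV_ID")
# next_id = data["transactionId"]
```

Note that the connector pulls data from Talk2M rather than opening connections into the plant network, which is what makes the API pull-only and leaves the factory side untouched.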
By implementing a connector, DataMailbox brings a customer-centric approach to the modern Industrial IoT application. With just their Talk2M credentials, a customer can quickly evaluate and visualize their data on any DataMailbox-supported IoT application.
But most importantly, they can stop worrying about how to collect data and instead focus on their IoT application.
DataMailbox is the premier data collection method for analytics applications because it solves multiple issues that prevent proof of concepts from moving forward:
Scaling costs are managed thanks to filtering, sorting, and de-duplication services
Customer experience is simplified, and it takes a matter of minutes to get to a “wow” experience
More time can be spent on the actual application and proving the ROI
Total cost of ownership is reduced because additional data processing is already handled (instead of requiring additional complex cloud services)
Combined with the Ewon Flexy IoT Gateway, DataMailbox will give you a jumpstart on your path to digitalization. You can leverage our trusted pull-only REST API and guarantee a secure and professional beginning on your journey.