Your processes create a constant stream of data: for example, a single petrochemical site can generate literally millions of data points that all need to get processed and analyzed. Sometimes this can feel like an avalanche of data that is resource-intensive and impossible to keep up with.
Most Oil & Gas businesses are using some sort of software program to manage all that data – that could be a series of spreadsheets, a more intelligent database, or a full-fledged Environmental Management System (EMS).
The driving force behind the trend toward EMS software is a desire to be in control of data. It’s one of your most precious resources, and environmental professionals need it to be successful.
So how does your current data management system prevent or catch bad data before it taints your reporting, KPIs, and processes? Do you ever worry that bad data is corrupting your efforts? Have you ever lost time correcting mistakes caused by incorrect values that were not filtered out?
Any business that puts its trust in any type of software system should take the time to check how “bad data” is detected and prevented by that system. In fact, a recent Aberdeen Group research paper named trust as one of the three main pillars of big data performance: “data-driven managers these days want to make decisions rooted in confidence. If they don’t trust the data belying their most critical decisions, confidence goes down and most people revert back to their gut instinct […] this scenario defeats the purpose of data-driven decisions […] building trust entails ensuring that the data is of high quality, that it is up to date, and that it is highly relevant”.
Two Types of Bad Data to Watch Out for
There are two types of bad data to watch out for, as both can cause major problems for reporting and tracking:
The first is data incorrectly entered by human hands (AKA manual data entry). Any time a human is involved in entering data into any type of system, there is the potential for incorrect entries, misplaced decimal points, or wrong boxes checked off.
The difficulty is that badly-entered data can come from humans using outside systems, so even if your own system has no manual data entry, data incoming from vendors or suppliers can still be incorrect. Since you can’t always control the systems used by your supply chain, you need to have a system in place that protects you from those errors as well. (Ideally, you use a system that makes it easy for all your suppliers to use a standardized web-based template or other digital import portal.)
Using templates or a web-based solution for data import allows greater control of data validation. It prevents suppliers or your own employees from accidentally entering unreadable or incompatible data.
The key is to have a validation protocol for all incoming data. At the most basic level, this should include a check that no fields are left blank and that there are no formatting errors (e.g. two decimal points in a number, or letters in numeric fields). Any good system that accepts imports from Excel sheets should have basic formatting checks to make sure data is entered in a readable format.
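As a rough illustration of those basic checks, here is a minimal Python sketch. The field names (`site`, `chemical`, `quantity_kg`) and the `validate_row` helper are assumptions made up for this example, not part of any particular EMS product; a real import template would define its own fields and rules.

```python
import re

# Hypothetical fields for one imported row; real templates will differ.
REQUIRED_FIELDS = ("site", "chemical", "quantity_kg")
NUMERIC_FIELDS = ("quantity_kg",)

# Accepts an optional sign, digits, and at most one decimal point.
NUMBER_PATTERN = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def validate_row(row):
    """Return a list of human-readable problems found in one imported row."""
    problems = []
    # Blank check: every expected field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        value = row.get(field)
        if value is None or str(value).strip() == "":
            problems.append(f"{field}: blank")
    # Format check: numeric fields must look like a single well-formed number.
    for field in NUMERIC_FIELDS:
        value = str(row.get(field) or "").strip()
        if value and not NUMBER_PATTERN.fullmatch(value):
            problems.append(f"{field}: not a valid number ({value!r})")
    return problems


rows = [
    {"site": "Plant A", "chemical": "Benzene", "quantity_kg": "12.5"},
    {"site": "Plant A", "chemical": "", "quantity_kg": "1.2.5"},   # blank + two decimal points
    {"site": "Plant B", "chemical": "Toluene", "quantity_kg": "12O"},  # letter O instead of zero
]
for number, row in enumerate(rows, start=1):
    print(number, validate_row(row) or "ok")
```

Returning a list of problems, rather than stopping at the first one, lets whoever submitted the file fix every issue in a single pass.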
You should also have the ability to verify and approve incoming data before it gets logged in your primary database; this feature often means keeping your data in a temporary “holding cell” until you approve it. Another important feature is the ability to create smart ban lists that automatically flag chemicals or materials you don’t want in your facility – even if the data about these unwanted materials is entered perfectly, you still want to catch and filter them out.
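To make the “holding cell” and ban list ideas concrete, here is a small Python sketch. Everything in it is hypothetical: the `StagingArea` class, the `material` field, and the two banned names are placeholders chosen for illustration, and the `approved` list simply stands in for a primary database.

```python
from dataclasses import dataclass, field

# Hypothetical ban list; names are compared case-insensitively.
BANNED_MATERIALS = {"asbestos", "carbon tetrachloride"}


@dataclass
class StagingArea:
    """Holds imported records until a reviewer approves or rejects them."""
    pending: list = field(default_factory=list)
    approved: list = field(default_factory=list)  # stands in for the primary database
    rejected: list = field(default_factory=list)

    def submit(self, record):
        # Flag banned materials, but still queue them so a person makes the call.
        name = record["material"].strip().lower()
        flagged = dict(record, banned=name in BANNED_MATERIALS)
        self.pending.append(flagged)
        return flagged

    def review(self, approve):
        """Move every pending record to approved or rejected using approve(record)."""
        for record in self.pending:
            (self.approved if approve(record) else self.rejected).append(record)
        self.pending.clear()


staging = StagingArea()
staging.submit({"material": "Toluene", "quantity_kg": 40})
staging.submit({"material": "Asbestos", "quantity_kg": 5})  # well-formed, but unwanted
staging.review(approve=lambda record: not record["banned"])
print(len(staging.approved), len(staging.rejected))
```

In this sketch the reviewer is just a function that rejects anything flagged; in practice the approval step would more likely be a person working through the pending queue.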
The second type of bad data to watch out for is generated by integrated systems or devices. A primary example is the Continuous Monitoring System (CMS), on which so many refineries rely for compliance monitoring. A CMS generates data at a breakneck pace (as often as a new reading every second), and this can be a lot for a simple spreadsheet to handle.
Good EMS software will have CMS integration built in, but this doesn’t guarantee that it’s bad-data proof. Physical problems with the device, connection interruptions, and other technical issues with integrated systems can also generate bad data. Your EMS should be able to detect these issues and alert you automatically when they happen.
As with manual data entry errors, CMS and other system errors can be filtered out before they become a problem. Intelligent EMS software can create automated procedures for vetting incoming system data and give your environmental engineers an opportunity to correct it.
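One way such an automated vetting procedure might look is sketched below in Python. The three checks (missing values, implausible values, and a value that repeats suspiciously often) and the `vet_readings` helper are assumptions picked for illustration; the right checks and limits depend on the instrument and the parameter being monitored.

```python
def vet_readings(readings, low, high, max_repeats=5):
    """Split (timestamp, value) readings into clean ones and (reading, reason) suspects."""
    clean, suspect = [], []
    last_value, repeats = None, 0
    for timestamp, value in readings:
        if value is None:
            # The device or connection delivered nothing for this timestamp.
            suspect.append(((timestamp, value), "missing value"))
            continue
        # Count how many times in a row the same value has been reported.
        repeats = repeats + 1 if value == last_value else 1
        last_value = value
        if not low <= value <= high:
            suspect.append(((timestamp, value), "outside plausible range"))
        elif repeats > max_repeats:
            suspect.append(((timestamp, value), "possible stuck sensor"))
        else:
            clean.append((timestamp, value))
    return clean, suspect


readings = [(0, 4.1), (1, 4.3), (2, None), (3, -999.0),
            (4, 4.2), (5, 4.2), (6, 4.2), (7, 4.2)]
clean, suspect = vet_readings(readings, low=0.0, high=50.0, max_repeats=3)
for reading, reason in suspect:
    print(reading, reason)
```

Suspect readings are set aside with a reason rather than discarded, which leaves an engineer free to review and correct them.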
What to Do with Bad Data
It’s not enough to just prevent bad data from entering your data banks. It’s essential that you use a tool that is capable of identifying bad data sources, logging when bad data was detected, and automatically alerting the right people.
If you use CMS, your system should be able to provide real-time alerts when there’s a connection issue or even when a reading approaches a customized threshold. You should be able to know within a few seconds if there’s been a problem somewhere in your processes.
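The sketch below shows, in Python, how a per-reading check for both situations could work. The `check_reading` function, the 90% warning level, and the five-second gap limit are illustrative assumptions only; real thresholds would come from your permits and your equipment.

```python
def check_reading(timestamp, value, previous_timestamp, limit,
                  warn_fraction=0.9, max_gap_seconds=5):
    """Return alert messages for one reading; timestamps are in seconds."""
    alerts = []
    # A long silence between readings suggests a connection or device problem.
    if previous_timestamp is not None and timestamp - previous_timestamp > max_gap_seconds:
        alerts.append(f"possible connection issue: {timestamp - previous_timestamp}s gap")
    # Warn before the limit is reached, not only after it has been exceeded.
    if value >= limit:
        alerts.append(f"limit exceeded: {value} >= {limit}")
    elif value >= warn_fraction * limit:
        alerts.append(f"approaching limit: {value} of {limit}")
    return alerts


stream = [(0, 60.0), (1, 91.5), (12, 104.0)]
previous = None
for timestamp, value in stream:
    for alert in check_reading(timestamp, value, previous, limit=100.0):
        print(f"t={timestamp}s {alert}")
    previous = timestamp
```

Here the alerts are only printed; a production system would route them to email, SMS, or a dashboard.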
[Image: a sample of how a CMS error, or an error from another integrated system, should be logged. This information can be automatically forwarded to a manager, EH&S professional, or executive.]
These functions are important because they allow your business to troubleshoot errors: once you identify the source of errors, you can address the core issues. There will always be the risk of bad data, but you can protect yourself against suppliers that continuously provide unreliable values or find out which employees need retraining on how to enter data properly.
In other words, good EMS software puts you in a proactive position when it comes to handling bad data rather than a reactive one. That’s the key to a continuously improving business.
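A simple Python sketch of this kind of logging and source tracking follows. The `log_bad_data` and `worst_sources` helpers and the source names are invented for the example, and an in-memory list stands in for whatever audit log a real system would keep.

```python
from collections import Counter
from datetime import datetime, timezone

error_log = []  # stands in for a persistent audit log


def log_bad_data(source, reason, detected_at=None):
    """Append one bad-data event, recording where it came from and when."""
    entry = {
        "source": source,
        "reason": reason,
        "detected_at": detected_at or datetime.now(timezone.utc),
    }
    error_log.append(entry)
    return entry


def worst_sources(log, top=3):
    """Rank sources by how many bad-data events they produced."""
    return Counter(entry["source"] for entry in log).most_common(top)


log_bad_data("Supplier X import", "blank field")
log_bad_data("Stack analyzer 2", "connection lost")
log_bad_data("Supplier X import", "malformed number")
print(worst_sources(error_log))
```

Even a simple count per source is enough to show which supplier, device, or data entry process deserves attention first.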
How does your current system stack up?
The big question to ask yourself is: does your current system leave you feeling worried or fearful about the impact of bad data? If so, that’s a sign that your tools are lacking in essential protective and troubleshooting capabilities.
The typical spreadsheet software has rudimentary error-proofing at best (in fact, it is often the source of bad data that gets imported into other systems), and many basic EMS software solutions don’t have automated filtering or customized vetting lists. You may even be expected to write your own validation scripts.
If you’re considering an upgrade in your EH&S system, be sure to investigate how it handles both your perfect data and data that isn’t always 100% reliable. That’s the true test of any EMS software.
Why is ERA passionate about this issue? It’s because we work with companies that have to deal with millions of data points each day, and we know how much wasted time and effort can be caused by a single, small error. That’s why our EMS Software provides all of the error-proofing and verification functions mentioned in this article, and others too numerous to detail. We know that bad data is a part of working in fast-paced, booming industries, and we’ve built software that won’t let that reality slow you down.
April 23, 2015