Of all the silver bullets promised for IT management, the discovery tool is often one of the most over-sold. “Just point our tool at your IT infrastructure,” the sales rep promises, “and we’ll discover hardware, software, servers, databases, applications, dependencies and everything else.” The discovery tool will “solve your CMDB problem” and “overcome any troubles with software licensing.” “It will make your systems more stable, eliminate manual record keeping, fix compliance issues, provide visibility and more.” It wouldn’t be surprising to hear that it also helps you lose weight with no effort.
The premise of discovery has a long history, dating back to at least the early 2000s. Discovery tools have some basic problems, however, that have never been completely resolved, including:
• Too much data that is too complex
• Data is very low level
• False positives
• False negatives
• Impact on environment
• Not cost-effective
• Unclear value
• Insufficient reach and effect
Too much data that is too complex
Have you ever actually seen the data dump from a Unix or Linux package manager? It’s not pretty. You don’t obtain nice high-level data, such as “Java version X is on this server.” Instead, you see thousands of lines of content that might or might not indicate which Java is running where. Windows isn’t much better. The discovery tool must massage all this data using complex business rules to give you the information you need. This is sometimes called fingerprinting, and it only helps you discover OS-level software. What about the services (such as applications) you are running? What about the systems, processes and technologies you built, the things you actually use day to day?
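To make that concrete, here is a rough sketch of what a fingerprint amounts to: pattern matches applied to raw package-manager output, turning thousands of lines into one usable fact. The package names, patterns and helper below are purely illustrative and not taken from any real discovery product.

```python
# Minimal sketch of a software "fingerprint": match raw package-manager
# output against patterns to infer one high-level fact. Illustrative only.
import re
import subprocess

# If any installed package matches one of these patterns, we infer that a
# Java runtime is present on the host.
JAVA_FINGERPRINT = [
    re.compile(r"^openjdk-(\d+)-(jre|jdk)"),   # Debian/Ubuntu naming
    re.compile(r"^java-(\d+)-openjdk"),        # RHEL/CentOS naming
]

def installed_packages():
    """List installed package names via dpkg (Debian-style hosts)."""
    out = subprocess.run(
        ["dpkg-query", "-W", "-f", "${Package}\n"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.splitlines()

def detect_java(packages):
    """Apply the fingerprint: thousands of raw lines in, one fact out."""
    for pkg in packages:
        for pattern in JAVA_FINGERPRINT:
            match = pattern.match(pkg)
            if match:
                return f"Java {match.group(1)} detected via package '{pkg}'"
    return "No Java runtime matched the fingerprint"

if __name__ == "__main__":
    print(detect_java(installed_packages()))
```

Even this trivial rule only answers “is a Java runtime installed here?” It says nothing about which application depends on it, or who owns it.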
One of the ways discovery tools are oversold is the promise of discovering dependencies. Modern systems communicate with each other in so many ways that it becomes very complex to discover and represent items and their relationships effectively. It takes human understanding to determine which dependencies are of concern and which are not. An extreme example: if you look just at network traffic, you’ll see everything communicating with your DNS server. Yes, it’s a big dependency, but you already knew that.
One of discovery tool vendors’ most common demos is pointing their tool at a small, simple system, and then showing how it can visualize all the dependencies – software installations, network traffic and so on. These capabilities always demo very well, but, when you point the tool at an actual production environment, the dependencies and complexity are overwhelming.
Graphical visualizations are useless spaghetti, with hundreds or thousands of lines, which are impossible to trace. Engineers quickly abandon the graphical displays for less impressive, but more practical, text-based analysis of dependency data.
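The text-based analysis engineers fall back on usually starts by throwing away the edges everyone already knows about. Here is a minimal sketch of that kind of pruning, assuming the dependency data has been flattened into a list of observed connections; the hosts, ports and exclusion lists are illustrative only.

```python
# Minimal sketch: prune dependency edges that point at shared infrastructure
# (DNS, NTP, SNMP) so the remaining edges are worth reading. Illustrative only.
from collections import Counter

# (source_host, destination_host, destination_port) edges, e.g. from flow data
observed_edges = [
    ("app01", "dns01", 53),
    ("app01", "db01", 1521),
    ("app02", "dns01", 53),
    ("app02", "ntp01", 123),
    ("app02", "mq01", 5672),
]

SHARED_SERVICE_PORTS = {53, 123, 161}        # DNS, NTP, SNMP
SHARED_SERVICE_HOSTS = {"dns01", "ntp01"}    # known shared infrastructure

def interesting(edge):
    _, dst, port = edge
    return port not in SHARED_SERVICE_PORTS and dst not in SHARED_SERVICE_HOSTS

pruned = [edge for edge in observed_edges if interesting(edge)]
for src, dst, port in pruned:
    print(f"{src} -> {dst}:{port}")

# The "everything talks to DNS" effect: the busiest destination before pruning
# is almost always a shared service you already knew about.
print(Counter(dst for _, dst, _ in observed_edges).most_common(1))
```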
Data is very low level
The next problem is that the discovery data is too low level. Knowing that a server has Oracle on it is all very well, but who owns that Oracle license, and why is it deployed there? Is it the accounts payable system, or the employee satisfaction survey? Who’s accountable? It is difficult, if not impossible, to obtain that context from automated discovery, which is why some data always has to be maintained manually.
“You can create your own fingerprints,” the vendor says. “It’s easy, and then you don’t have to maintain the data manually!” OK, let’s look at that claim.
First, defining business rules for hundreds or thousands of software packages and internal services is a big investment of time, resources and money. The bigger problem is that even when that investment is made, the discovery tool risks both false positives and false negatives.
False positives
A guy named Ralph writes some software to be used by a customer-facing application; call it “App X.” Mary, the SVP, owns App X. Ralph works with the discovery tool vendor to build a fingerprint based on that software: if it is discovered on a server, the server is assumed to be running App X, and therefore Mary should pay for it.
Bill finds that Ralph’s software is useful and borrows some or all of it to use in his own application, “App Y,” which Sarah, another SVP, owns. Ralph didn’t think about the discovery tool implications when he let Bill have the software. Now Bill is installing the software on some of his servers, and the discovery tool starts billing Mary for them – not Sarah! This is the problem (particularly for Mary) with assuming that a given fingerprint implies business ownership. Errors like this happen almost continuously, and when the mistake surfaces, your discovery tool will lose the confidence of key stakeholders. There is a cost to using quality assurance to make sure the fingerprints are accurate and specific, but the cost of not verifying accuracy is significantly higher.
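A rough sketch of the attribution logic that goes wrong here, using the hypothetical names from the story above (and not the billing rules of any real product), might look like this:

```python
# Minimal sketch of fingerprint-to-owner attribution and how the false
# positive arises. Names mirror the article's hypothetical example only.

# The fingerprint Ralph registered: "this package means App X, billed to Mary."
FINGERPRINT_OWNERS = {
    "ralphs-module": ("App X", "Mary"),
}

# What discovery actually finds on each server.
discovered = {
    "server-appx-01": ["ralphs-module"],   # genuinely part of App X
    "server-appy-07": ["ralphs-module"],   # Bill reused the code for App Y
}

for server, packages in discovered.items():
    for pkg in packages:
        if pkg in FINGERPRINT_OWNERS:
            app, owner = FINGERPRINT_OWNERS[pkg]
            # server-appy-07 ends up attributed (and billed) to Mary, not
            # Sarah: the fingerprint conflates "this code is present" with
            # "this server belongs to App X."
            print(f"{server}: attributed to {app}, billed to {owner}")
```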
False negatives
Meanwhile, Ralph writes another module for App X and installs it on some new servers. Mary still owns it, but Ralph does not update the fingerprint, so these servers now have no identified owner. Other servers, and even entire network zones, are never opened to the discovery tool at all because of security concerns, so some manual effort is always required. This is another problem.
Next week, in the second half of this two-part article, we discuss latency, impact on the environment, unclear value and more!
At Blazent, our core focus is ensuring the integrity and quality of IT and Operational data. Our Data Quality Management platform, driven by a powerful big data engine, is the only solution built from the ground up to provide complete, accurate, and auditable IT and operational data. We transform, align, and unify your data across the broadest range of sources in the industry, enabling you to move IT and OT strategies and recommendations from the back room to the board room, while driving downstream value across your entire organization.
Contact us today for a Free Demo!
