Case Study:

Root Cause Rescue

CHALLENGE

During what should have been a routine server upgrade, a local municipal district found itself in a perplexing and urgent situation that no one, not even the software provider, was able to resolve. The server in question hosted Web Map Services (WMS) critical to the district’s enterprise GIS and utility infrastructure. A replacement WMS server had been installed, but the enterprise GIS continued to display maps from the old WMS server (which was still active for testing purposes) rather than from the new one. The district urgently needed to identify the root cause and resolve the issue because the legacy server was at end of life and in the process of being decommissioned, and critical infrastructure processes and daily workflows depended on the hosted data. Attempts to identify the root cause were further hampered by complications and misdirection.

SOLUTION

The district engaged our staff to investigate the cause and propose a solution. After confirming that the configuration was correct, the team applied process-of-elimination and root cause analysis techniques to troubleshoot the issue. The team examined logs and designed test cases to verify that the GIS application was actually connecting to and retrieving data from the old WMS server rather than reading from an unknown client-side cache. Multiple configuration files were reviewed and ruled out as possible culprits. Through this tedious but systematic process, the team determined that the GIS software provider’s proprietary template files were inadvertently storing the WMS URLs.

Unfortunately, the GIS software provider rejected the proposed cause and was unwilling to resolve, or even attempt to resolve, the issue. Given the time sensitivity and potential impact of the situation, something had to be done. A team member proposed using a hex editor to inspect the template files, both to prove the diagnosis and to gain insight into a possible fix. The analysis confirmed that the URLs were retained in two different locations within the provider’s template files. Presented with this evidence, the provider conceded that the templates were flawed but stated that the expertise and tools needed to fix them were no longer available. The provider’s only guidance was to rebuild every template file from scratch, which left the district with no practical option. Following that recommendation would have required costly resourcing and substantial time for unnecessary rework.
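The hex editor inspection described above can be approximated in a short script. The following is a minimal sketch rather than the team’s actual tooling. It assumes the URL is stored in little-endian UTF-16 (the byte order is an assumption), and the host name and sample bytes are hypothetical stand-ins for a real template file.

```python
def find_utf16_occurrences(data: bytes, text: str) -> list[int]:
    """Return every byte offset at which `text` appears as UTF-16LE in `data`."""
    # ASSUMPTION: little-endian UTF-16 without a byte order mark.
    needle = text.encode("utf-16-le")
    offsets = []
    start = 0
    while True:
        idx = data.find(needle, start)
        if idx == -1:
            break
        offsets.append(idx)
        start = idx + 1
    return offsets


# Hypothetical stand-in for a template file: binary filler around two
# copies of a made-up old WMS URL.
old_url = "http://old-wms.example/wms"
encoded = old_url.encode("utf-16-le")
sample = b"\x01\x02\x03" + encoded + b"\xff" * 5 + encoded + b"\x00"

for offset in find_utf16_occurrences(sample, old_url):
    print(f"URL found at byte offset 0x{offset:X}")
```

Run against the real files, a search like this would list each location where the old URL is embedded, which is the evidence a hex editor view provides by eye.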

As experienced developers, the team understood that applications using mixed-encoding files, such as these templates, need delimiters or indicators of some kind to determine where one encoding ends and another begins. For a URL in UTF-16 (2 bytes per character) surrounded by binary data, the file would likely contain bytes that indicate the length of the URL. Pursuing this theory, the team found that the bytes preceding the URL stored its size in bytes, including an additional 6-byte null buffer at the tail end. The team proposed using the hex editor to write the new URL in the same UTF-16 encoding and to update the bytes that represented the size of the URL plus the buffer. With both changes made, the GIS application read the new URL without error, resolving the problem for the municipal district.
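The same edit can be sketched in code. As a worked example of the size arithmetic, a 40-character URL occupies 80 bytes in UTF-16, so the stored value would be 80 + 6 = 86. The sketch below is illustrative only: the case study does not state the width or byte order of the size field, so a 4-byte little-endian integer immediately before the URL is assumed, as is little-endian UTF-16. The function name, URLs, and sample record are hypothetical, and the sketch further assumes that nothing else in the file depends on the URL’s position.

```python
import struct

NULL_PAD = 6      # 6-byte null buffer after the URL, per the case study
LEN_FMT = "<I"    # ASSUMPTION: 4-byte little-endian size field before the URL


def patch_url(data: bytes, old_url: str, new_url: str) -> bytes:
    """Replace each size-prefixed UTF-16LE copy of old_url with new_url."""
    old_raw = old_url.encode("utf-16-le")
    new_raw = new_url.encode("utf-16-le")
    len_size = struct.calcsize(LEN_FMT)
    out = bytearray()
    pos = 0
    while True:
        idx = data.find(old_raw, pos)
        if idx == -1:
            break
        len_at = idx - len_size
        if len_at < pos:
            raise ValueError("no room for a size field before the URL")
        (stored,) = struct.unpack_from(LEN_FMT, data, len_at)
        # Refuse to patch if the size field does not match the theory.
        if stored != len(old_raw) + NULL_PAD:
            raise ValueError(f"unexpected size field at offset {len_at}")
        out += data[pos:len_at]
        out += struct.pack(LEN_FMT, len(new_raw) + NULL_PAD)
        out += new_raw
        pos = idx + len(old_raw)  # the null buffer that follows is kept as-is
    out += data[pos:]
    return bytes(out)


# Hypothetical record: binary prefix, size field, URL, null buffer, suffix.
old = "http://old-wms.example/wms"
new = "http://new-wms.example/services/wms"
old_bytes = old.encode("utf-16-le")
record = (b"\xAA\xBB" + struct.pack(LEN_FMT, len(old_bytes) + NULL_PAD)
          + old_bytes + b"\x00" * NULL_PAD + b"\xCC\xDD")
patched = patch_url(record, old, new)
```

Checking the stored size before writing is a deliberate safeguard: if the field does not equal the URL’s byte count plus the buffer, the layout assumption is wrong for that file and the script stops instead of corrupting it. Working on a copy of each template remains advisable.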

RESULT

Our team’s expertise, attention to detail, and willingness to go further than others resulted in accurately identifying the root cause when no other resource could. Furthermore, the staff developed an innovative solution that no other subject matter expert, including the GIS software provider, was able to identify, devise, or produce. The provider’s recommendation would have compounded the problem with a costly and time-intensive rebuild that was ultimately not needed. By contrast, once the root cause was identified, the team’s solution required only a few developer hours and a simple hex editor. The deliverable is a repeatable solution that will remain viable for future upgrades until the GIS software provider permanently addresses the underlying problem in its code and proprietary format. In the end, our ability to correct the template files saved the district substantial time and money, kept infrastructure data fully functional and accessible, and kept IT support maintenance cycles on track and in compliance.