The Pattern in the Noise: What 1,602 Exposed Modbus Systems Reveal About Industrial Security's Systemic Failures
Analyzing 1,602 internet-visible Modbus systems revealed not scattered misconfigurations but systematic patterns—95% shared TLS fingerprints, identical certificates, same CVEs across clusters. This isn't about individual negligence; it's how the entire ICS ecosystem deploys critical infrastructure in predictable, exploitable ways.
By Chawkr Reports
06/01/2026
Estimated read time: 25-30 minutes
Somewhere right now, a water treatment facility's pump controller is answering requests from anyone on the internet. A manufacturing plant's PLC is accepting unauthenticated commands from any IP address willing to connect. A power substation's monitoring system is broadcasting its status to the world—and listening for instructions in return.
These aren't hypothetical scenarios. They're the reality revealed by analyzing 1,602 internet-visible Modbus systems discovered through Shodan and clustered using ClusterHawk. What we found wasn't a story of sophisticated attackers or targeted intrusions—it was something more troubling: systemic architectural failures that have left industrial control systems exposed for reasons that have nothing to do with malicious actors and everything to do with how we've connected critical infrastructure to modern networks.
We expected scattered exposure: random misconfigured PLCs, forgotten HMIs, the usual noise of industrial systems accidentally bridged to the internet. Instead, we found patterns.
Systems sharing the same TLS certificates. The same SSH configurations. The same unpatched vulnerabilities. The same service combinations deployed identically across different organizations, countries, and providers.
This isn't about a single negligent operator. It's about how the entire industrial control ecosystem deploys and fails to secure critical infrastructure in predictable, exploitable ways.
Why Two Analyses? The original analysis (Part 1) examined 1,602 Modbus-tagged IPs after filtering Shodan-detected honeypots. Readers noted the extreme homogeneity—95% shared TLS fingerprints, identical service stacks—looked suspicious. They were probably right. Part 2 applies additional honeypot filtering (~351 IPs remaining). We present both because definitive honeypot classification from metadata alone isn't possible, and ClusterHawk's clustering methodology works consistently on either dataset—both perspectives reveal meaningful patterns about what Modbus exposure looks like in scanner data.
Why Modbus Exposure Matters
Modbus is a protocol from 1979. It predates the public internet by over a decade. It was designed for a world where industrial systems lived on isolated networks, where the concept of an "external attacker" meant someone physically breaking into a facility. The protocol reflects these assumptions: no authentication, no encryption, no access control whatsoever.
When you connect to a Modbus service, you don't log in. You just start sending commands. Read this register. Write that value. The device complies because that's what it was designed to do.
This made perfect sense in 1979. It makes no sense in 2026.
| What Modbus Controls | Real-World Impact |
|---|---|
| Coils (binary outputs) | Valve positions, motor starts, relay states |
| Holding Registers | Temperature setpoints, pressure thresholds, operational parameters |
| Input Registers | Sensor readings, flow rates, voltage measurements |
| Discrete Inputs | Switch states, alarm conditions, safety interlocks |
The protocol offers no authentication barrier, so the question isn't whether these systems could be targeted. It's what the patterns of exposure reveal about who's at risk and why.
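To make the missing authentication concrete, here is a minimal sketch of how a Modbus TCP "Read Holding Registers" request is laid out, based on the public Modbus specification. The function name and the example values are illustrative assumptions, and the code only builds bytes locally; it opens no network connection. Note that no field in the frame carries a username, password, token, or session.

```python
import struct

READ_HOLDING_REGISTERS = 0x03  # function code defined by the Modbus specification


def build_read_request(transaction_id: int, unit_id: int, start: int, count: int) -> bytes:
    """Build a Modbus TCP request frame: MBAP header followed by the PDU."""
    # PDU: function code (1 byte), starting address (2 bytes), register count (2 bytes)
    pdu = struct.pack(">BHH", READ_HOLDING_REGISTERS, start, count)
    # MBAP header: transaction id, protocol id (always 0), length, unit id.
    # The length field counts the unit id byte plus the PDU.
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


# Illustrative values only: read two registers starting at address 0 from unit 1.
frame = build_read_request(transaction_id=1, unit_id=1, start=0, count=2)
print(frame.hex())  # 000100000006010300000002
```

The whole request is twelve bytes. Everything a device needs to act on it is in those bytes, which is why network placement, not the protocol, has to provide the access control.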
Part 1: Full Dataset Analysis [1,602 IPs]
This section analyzes the complete Shodan dataset after filtering only Shodan-tagged honeypots (73 systems, 4.6%).
The Uncomfortable Discovery
Clustering 1,602 exposed ICS systems, you'd expect them to scatter; each organization's infrastructure should reflect its unique environment. That's not what happened.
80 distinct clusters emerged with 93% confidence, and the similarities within clusters were striking:
| What We Expected | What We Found |
|---|---|
| Random exposure patterns | 95% of hosts share identical TLS fingerprints |
| Diverse configurations | Same CVEs preserved across 14 different clusters |
| Individual failures | Certificates issued at the exact same timestamp |
| Isolated incidents | Service stacks replicated verbatim across providers |
The infrastructure isn't just exposed; it's templated. Someone deployed these systems from shared playbooks, default configurations, and vendor images that were never secured.
Building the Evidence: How Weak Signals Become Strong Patterns
No single data point tells the story. But when multiple weak signals align, they reveal something important.
Signal 1: The Certificate Timestamp
Nineteen IPs across different networks share certificates issued at the exact same moment:
Issued: 2025-12-03 19:55:45 UTC
Serial: 3962222222037844928038379877949654151
Nineteen systems, same second, different providers. That's a deployment event. Someone provisioned infrastructure in batch and never individualized the certificates.
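A rough sketch of how this kind of batch issuance could be surfaced from scan records: group hosts by certificate issue time and serial, then keep the groups that span more than one host. The record layout, field names, and sample values are assumptions for illustration (the IPs come from documentation ranges), not the pipeline used for this analysis.

```python
from collections import defaultdict

# Hypothetical scan records; field names and values are illustrative only.
records = [
    {"ip": "192.0.2.10", "issued": "2025-12-03T19:55:45Z", "serial": "A1"},
    {"ip": "198.51.100.7", "issued": "2025-12-03T19:55:45Z", "serial": "A1"},
    {"ip": "203.0.113.5", "issued": "2025-11-20T08:12:01Z", "serial": "B2"},
]


def batch_issued(records, min_hosts=2):
    """Group hosts by (issue timestamp, serial) and keep groups seen on several hosts."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["issued"], rec["serial"])].append(rec["ip"])
    return {key: ips for key, ips in groups.items() if len(ips) >= min_hosts}


shared = batch_issued(records)
for (issued, serial), ips in shared.items():
    print(issued, serial, sorted(ips))
```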
Signal 2: The TLS Fingerprint
95.99% of the analyzed infrastructure (958 hosts) shares an identical JA3S hash:
JA3S: 6c2811f7ba8e88604ea41a2bf9fa5ad7
JA3S fingerprints capture how a server negotiates TLS. When 958 hosts negotiate TLS identically, they're running the same software with the same configuration. Across 14 different hosting providers. Across multiple countries.
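For readers unfamiliar with the fingerprint, the sketch below follows the published JA3S convention: the TLS version, chosen cipher, and extension list from the Server Hello are written as decimal values, joined into one string, and hashed with MD5. The input values and the helper names are hypothetical; the second function simply measures how much of a host list shares the most common hash.

```python
import hashlib
from collections import Counter


def ja3s_hash(tls_version: int, cipher: int, extensions: list[int]) -> str:
    """MD5 of 'version,cipher,ext1-ext2-...' using decimal values, per the JA3S convention."""
    ext_part = "-".join(str(e) for e in extensions)
    ja3s_string = f"{tls_version},{cipher},{ext_part}"
    return hashlib.md5(ja3s_string.encode("ascii")).hexdigest()


def dominant_share(hashes: list[str]) -> tuple[str, float]:
    """Return the most common hash and the fraction of hosts presenting it."""
    value, count = Counter(hashes).most_common(1)[0]
    return value, count / len(hashes)


# Hypothetical Server Hello parameters for four hosts; three are configured identically.
same = ja3s_hash(771, 49199, [65281, 11, 35])
other = ja3s_hash(771, 49200, [65281, 11])
top, share = dominant_share([same, same, same, other])
print(top == same, share)  # True 0.75
```

Because the hash covers only negotiation parameters, two unrelated organizations running the same server build with the same settings will collide. That is what makes a single dominant value a signal of shared software rather than shared ownership.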
Signal 3: The Vulnerability Profile
The same CVEs appear across clusters with statistical significance:
| CVE | EPSS Score | Appears In | Interpretation |
|---|---|---|---|
| CVE-2017-3599 | 0.9942 | Clusters 3, 10, 20, 29, 35 | MySQL DoS via integer overflow—high exploitation probability |
| CVE-2025-26466 | 0.4666 | Clusters 6, 10, 21, 26, 37, 51 | SSH pre-auth DoS—moderate exploitation probability |
| CVE-2008-3844 | 0.0209 | 110+ hosts per cluster | Red Hat OpenSSH 2008 supply chain compromise—probable CPE artifact |
High EPSS scores suggest these vulnerabilities are commonly exploited at internet scale, but EPSS doesn't confirm exploitation of any specific host. EPSS scores reflect values at time of analysis (January 2026) and update daily.
The CVE-2008-3844 entry is almost certainly a Shodan CPE-matching artifact: the signature pattern-matches OpenSSH version strings from the Red Hat 2008 trojaned packages, but real exposure on modern systems is unlikely. Its very low EPSS (0.0209) supports this reading. The finding warrants local provenance verification only if a host is genuinely running one of the affected Red Hat OpenSSH builds from that incident.
Signal 4: The Service Stack
The same service combination appears repeatedly:
Modbus/502 + MySQL/3306 + Redis/6379 + SSH/22 + SSH/2222
This exact stack—including the unusual dual SSH ports—appears in clusters 6, 10, 21, 26, 37, and 51. Different organizations, same architecture. Same vulnerabilities. Same attack surface.
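Checking for a repeated stack like this one reduces to a subset test on each host's open ports. The sketch below assumes a simple mapping of IP to port set; the host data is hypothetical, and extra open ports are deliberately allowed so that a host with the template plus one more service still matches.

```python
# Port set taken from the stack described above.
TEMPLATE_STACK = frozenset({502, 3306, 6379, 22, 2222})

# Hypothetical hosts mapped to their open ports (documentation-range IPs).
hosts = {
    "192.0.2.10": {22, 502, 2222, 3306, 6379},
    "198.51.100.7": {21, 22, 502, 2222, 3306, 6379},
    "203.0.113.5": {502, 1883},
}


def matches_stack(open_ports, stack=TEMPLATE_STACK):
    """True when every port in the stack is open on the host (extra ports are allowed)."""
    return stack <= set(open_ports)


matching = sorted(ip for ip, ports in hosts.items() if matches_stack(ports))
print(matching)  # ['192.0.2.10', '198.51.100.7']
```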
What This Reveals
None of these signals alone proves much:
- Shared certificates? Maybe the same CA.
- Same TLS fingerprint? Same product.
- Same CVEs? Common vulnerabilities.
- Same services? Standard stack.
But all four together reveal how ICS infrastructure actually gets deployed: vendors ship default configurations that customers never change, integrators use templates replicated across clients, MSPs manage multiple networks with identical practices, and nobody patches because "it's working."
The clustering doesn't reveal a single operator. It reveals an industry-wide pattern of copy-paste deployment with copy-paste vulnerabilities.
Why This Pattern Is Dangerous: The Multiplication Effect
Traditional vulnerability management assumes each system is unique. You patch server A, then server B, then server C. Each has its own attack surface. Clustered infrastructure breaks this assumption.
When 110 hosts share the same configuration, vulnerabilities, and access paths, an attacker doesn't need 110 exploits. They need one playbook that works 110 times.
A Common Attack Pathway Enabled by This Stack
This attack chain exploits deployment patterns rather than sophisticated vulnerabilities. Default Redis configurations enable initial access; templated host keys suggest credential reuse; shared configurations work across entire clusters; and the absence of patching means the same techniques work across hundreds of hosts.
Based on the actual service data from clusters 6, 10, 21, 26, 37, and 51—which share 95.99% overlap on TLS fingerprints and SSH host keys—here's an attack path consistent with the observed exposure:
PHASE 1: Initial Access via Unauthenticated Redis
┌─────────────────────────────────────────────────────────────┐
│ Target: Redis 7.2.10 on port 6379 │
│ Finding: 932 Redis instances across infrastructure │
│ Method: Direct connection—Redis is frequently deployed │
│ without authentication unless explicitly configured. │
│ Cluster 6 alone has 220 instances (dual Redis per host │
│ for redundancy). │
│ │
│ Risk: Misconfigured Redis may expose administrative commands│
│ that enable persistence or pivoting, depending on │
│ server configuration and filesystem permissions │
└─────────────────────────────────────────────────────────────┘
│
▼
PHASE 2: Credential Harvesting from Database Layer
┌─────────────────────────────────────────────────────────────┐
│ Target: MySQL 5.7.16 on port 3306 │
│ Finding: 997 MySQL instances exposed │
│ Method: With shell access, dump MySQL credentials │
│ and connection strings from application configs │
│ │
│ Key data: 333 CVEs affect this MySQL version (EPSS 0.586) │
│ Same version across clusters suggests templated │
│ builds where credential reuse is common │
│ │
│ Alternative entry: Telnet/23 (494 hosts, 32.3%) │
│ Default credentials: admin/admin, │
│ root/root, or blank password │
└─────────────────────────────────────────────────────────────┘
│
▼
PHASE 3: Lateral Movement via Shared Host-Key Fingerprints
┌─────────────────────────────────────────────────────────────┐
│ Target: SSH/22 + SSH/2222 (dual ports for redundancy) │
│ Finding: 958 hosts share identical SSH host-key fingerprint │
│ 35:f1:36:a6:8b:ee:13:13:78:bc:56:03:ea:9d:ee:4e │
│ │
│ Implication: Identical host keys strongly suggest templated │
│ images deployed without regeneration. This │
│ enables host spoofing and fleet correlation. │
│ Templated deployments often reuse credentials. │
│ │
│ Clusters 2, 6, 10, 21, 26, 37, 51, 61 all expose SSH │
└─────────────────────────────────────────────────────────────┘
│
▼
PHASE 4: Industrial Control System Access
┌─────────────────────────────────────────────────────────────┐
│ Target: Modbus/502 │
│ Finding: All 1,602 hosts expose Modbus │
│ Method: Direct protocol commands—no authentication │
│ │
│ Capabilities: Read sensor values (Input Registers) │
│ Modify setpoints (Holding Registers) │
│ Toggle outputs (Coils) │
│ Query device status (Discrete Inputs) │
│ │
│ Impact: Physical process manipulation │
└─────────────────────────────────────────────────────────────┘
│
▼
PHASE 5: Data Exfiltration via FTP Staging
┌─────────────────────────────────────────────────────────────┐
│ Target: FTP/21 │
│ Finding: 1,134 hosts expose FTP; Clusters 13, 34 are │
│ FTP-only infrastructure (116 hosts) │
│ │
│ Purpose: Data staging before exfiltration │
│ Unencrypted protocol = credential interception │
│ │
│ MITRE: T1048.003 (Exfiltration Over Alternative Protocol) │
└─────────────────────────────────────────────────────────────┘
The CVEs identified (CVE-2017-3599, CVE-2025-26466, CVE-2008-3844) aren't the attack vectors; they're indicators of systemic neglect. When 333 high-risk CVEs persist across infrastructure running identical software versions, it signals that patching doesn't happen here. The real attack surface is simpler: unauthenticated Redis, default Telnet credentials, and shared host-key fingerprints across 958 hosts.
FrostyGoop: A Real-World Example
On January 22-23, 2024, FrostyGoop malware attacked a municipal energy company in Lviv, Ukraine. Dragos researchers describe it as the first publicly documented ICS malware designed to interact directly with industrial equipment via Modbus TCP. The attack disrupted heating to over 600 apartment buildings, affecting approximately 100,000 residents during sub-zero temperatures.
The incident illustrates the same class of failure these exposures enable. Attackers gained initial access via a MikroTik router vulnerability in April 2023, deployed a web shell to harvest credentials, moved laterally through poor network segmentation to reach heating controllers, and sent direct Modbus commands to ENCO controllers on port 502. Heating stayed off for nearly two days.
The attackers downgraded controller firmware to disable monitoring, then caused controllers to report false measurements—showing hot water when it was cold. Dragos assesses that FrostyGoop's generic Modbus implementation makes it adaptable to any ICS environment using the protocol.
The configurations we found match this attack profile. The vulnerabilities have ready-made exploit modules. The Modbus endpoints accept unauthenticated commands.
INCONTROLLER/PIPEDREAM, discovered in 2022, was built to interact with the same class of industrial protocols and platforms, including Modbus, OPC UA, and CODESYS devices.
Evidence of Active Targeting
The analysis detected one host running Metasploit (Cluster 11, IP 120.157.27.81), a penetration testing framework used for both legitimate assessments and malicious exploitation. Whether this represents authorized testing, red team operations, or adversarial reconnaissance, it shows that professional exploitation tooling sits within the same exposed population, although it does not by itself demonstrate an attack on any of these hosts.
The Telnet Finding: Default Credentials at Scale
494 hosts (32.3%) expose Telnet on port 23 to the internet.
Telnet itself isn't the problem; many ICS engineers use it daily for local troubleshooting. The problem is what internet-exposed Telnet reveals about these deployments. If vendors ship systems with default TLS certificates and nobody changes them, if they ship with default SSH host keys and nobody regenerates them, why would anyone change the default Telnet credentials?
They don't:
Phase 1: Connect to Telnet/23
Phase 2: Try admin/admin, root/root, or blank password
Phase 3: You're in
No CVE required. No exploit development. Just default credentials
on a protocol that was never meant to face the internet.
The clustering confirms this pattern. Telnet exposure appears in specific clusters (1, 3, 8, 14-16, 24, 27, 30) alongside other services with the same unchanged-defaults signature. These aren't systems where someone accidentally enabled Telnet; they're systems where nothing was changed from vendor defaults, and Telnet happened to be one of those defaults.
That's 494 systems where the path from internet access to authenticated shell requires guessing passwords printed in the manual.
The Anomalies: Infrastructure That Defies Classification
Not everything clusters neatly. 77 IPs (5.04%) exhibited unstable cluster membership, assigned to different clusters across analysis runs.
Four IPs stood out with extreme instability:
| IP | Behavior | Possible Interpretation |
|---|---|---|
| 149.12.67.161 | Assigned to 16 different clusters | Multi-purpose infrastructure |
| 149.12.67.163 | 70% label change rate | Configuration changes between scans |
| 149.12.67.252 | 81% label change rate | Deliberately ambiguous |
| 149.12.67.118 | Same /24 as above | Coordinated anomalous behavior |
All four share the same /24 subnet, exhibit the same instability pattern, and share certificates and SSH configurations with the broader infrastructure. Possible explanations include research infrastructure, honeypots, compromised systems behaving unusually post-breach, or operational testing environments. The clustering can't tell us which, but it can tell us these systems warrant investigation.
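One way to quantify this kind of instability is sketched below. It assumes "label change rate" means the fraction of consecutive runs in which a host's cluster label differs, and that labels have already been aligned across runs so the same number means the same cluster; the article does not spell out either detail, so treat both as assumptions. The sample labels are hypothetical.

```python
def label_change_rate(labels):
    """Fraction of consecutive runs in which the assigned cluster label changed."""
    if len(labels) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)
    return changes / (len(labels) - 1)


# Hypothetical labels for two hosts over six runs (labels assumed aligned across runs).
runs = {
    "192.0.2.10": [6, 6, 6, 6, 6, 6],
    "192.0.2.99": [3, 11, 3, 19, 19, 7],
}

for ip, labels in runs.items():
    distinct = len(set(labels))
    print(ip, distinct, label_change_rate(labels))
```

The first host never moves, while the second changes label in four of five transitions and touches four distinct clusters, which is the kind of profile the table above describes.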
The Systemic Risk Picture
Provider Concentration: Shared Blast Radius
The infrastructure clusters across a small number of hosting providers:
| Provider | Pattern |
|---|---|
| Cogent Communications | Appears across multiple ICS clusters |
| PSINet, Inc. | 42% of high-vulnerability cluster hosts |
| Hurricane Electric LLC | Shared presence in critical clusters |
| Aliyun Computing | Cloud-hosted ICS (unusual and concerning) |
This creates correlated risk. A provider-level incident affects multiple organizations simultaneously, and most ISPs don't filter Modbus traffic because they don't expect industrial protocols on the public internet.
The Shared Blast Radius
When a vulnerability is discovered in this infrastructure, it doesn't affect one system; it affects every system deployed from the same template. Clusters 6, 10, 21, 26, 37, and 51 share:
- The same MySQL version (5.7.16)
- The same Redis version (7.2.10)
- The same OpenSSH version (9.9)
- The same CVEs
- The same attack paths
Patch one, and you've patched one. An attacker who compromises one has a working exploit for all of them.
Cluster Reference Guide [Full Dataset: 1,602 IPs]
Note: Every IP in this dataset has Modbus exposure—the dataset was filtered from Shodan's Modbus-tagged systems. The cluster characteristics below describe additional services that differentiate clusters, but Modbus capability is the common thread across all 1,602 hosts.
Throughout this analysis, cluster numbers refer to groups identified by the clustering algorithm. This table summarizes the key characteristics of clusters mentioned in the post:
| Cluster(s) | Host Count | Distinguishing Services | Key Characteristics |
|---|---|---|---|
| 6, 10, 21, 26, 37, 51 | ~660 | MySQL, Redis, SSH | Core infrastructure group—95.99% shared TLS fingerprint, identical service stack |
| 3, 10, 20, 29, 35 | ~550 | MySQL, FTP | CVE-2017-3599 vulnerable (EPSS 0.9942) |
| 1, 3, 8, 14-16, 24, 27, 30 | 494 | Telnet | Default credential exposure risk |
| 13, 34 | 116 | FTP | Data staging infrastructure |
| 2, 6, 10, 21, 26, 37, 51, 61 | 958 | SSH | Shared host-key fingerprint across all hosts |
| 11 | 1 | Metasploit | Active exploitation tools detected |
| 19, 59 | ~20 | Mixed | Anomalous/research infrastructure |
Cluster numbers are arbitrary identifiers from the analysis pipeline. What matters is the shared characteristics within each group—these patterns reveal the systemic nature of the exposure.
Part 2: Filtered Dataset Analysis [~351 IPs]
This section analyzes the dataset after applying additional honeypot detection heuristics beyond Shodan's tagging. Approximately 78% of the original IPs were filtered as probable honeypots, leaving ~351 IPs.
Methodology Note
We filtered systems showing patterns inconsistent with real ICS deployments—service counts too high, fingerprints too uniform. It's imperfect:
- Some honeypots likely survived
- Some legitimate systems probably got filtered
- Without direct access, we can't know for certain
This filtered view complements Part 1; it doesn't replace it.
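The two heuristics named above can be sketched as a simple filter. The thresholds, field names, and sample hosts here are hypothetical, since the actual cut-offs are not stated; the point is only to show the shape of a rule that flags hosts with unusually many services or an unusually common fingerprint.

```python
from collections import Counter

# Hypothetical thresholds; the cut-offs used in the analysis are not stated.
MAX_SERVICES = 8
MAX_FINGERPRINT_SHARE = 0.5


def flag_probable_honeypots(hosts):
    """Return IPs that expose too many services or present an overly common fingerprint."""
    counts = Counter(h["fingerprint"] for h in hosts if h["fingerprint"])
    total = len(hosts)
    flagged = set()
    for h in hosts:
        too_many = len(h["ports"]) > MAX_SERVICES
        share = counts[h["fingerprint"]] / total if h["fingerprint"] else 0.0
        if too_many or share > MAX_FINGERPRINT_SHARE:
            flagged.add(h["ip"])
    return flagged


# Hypothetical hosts: three share one fingerprint, one presents no TLS at all.
sample = [
    {"ip": "192.0.2.1", "ports": [22, 502, 3306], "fingerprint": "aaa"},
    {"ip": "192.0.2.2", "ports": [22, 502, 3306], "fingerprint": "aaa"},
    {"ip": "192.0.2.3", "ports": [22, 502, 3306], "fingerprint": "aaa"},
    {"ip": "192.0.2.4", "ports": [502], "fingerprint": None},
]
flagged = flag_probable_honeypots(sample)
print(sorted(flagged))  # ['192.0.2.1', '192.0.2.2', '192.0.2.3']
```

A rule like this cannot tell a honeypot farm from a legitimately templated fleet, which is exactly the false-positive risk listed above.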
Cluster Portraits
351 IPs passed filtering, forming 23 clusters with strong ensemble stability.
Important: Every IP in this dataset has Modbus exposure—that's the common thread. The dataset was filtered from Shodan's Modbus-tagged systems. The cluster portraits below describe additional distinguishing services that differentiate clusters from each other, but Modbus capability is present across all 351 hosts.
Cluster Portrait: FTP Infrastructure (Cluster 5 — 66 assets, 18.8%)
| Attribute | Finding |
|---|---|
| Core Service | FTP on port 21 (100% prevalence) |
| Domain | e-wro.net.pl (100% coverage - all 66 hosts) |
| Organization | Metro Ethernet Access Services |
| TLS/Certificates | None (plaintext FTP only) |
| Shodan Query | hash:-499029242 org:"Metro Ethernet Access Services" port:21 |
| Shodan Matches | 72 hosts worldwide |
Configuration breakdown:
- All 66 hosts resolve to e-wro.net.pl
- Same FTP server fingerprint across entire cluster
- All traffic routes through Metro Ethernet Access Services
- No TLS, no SSH—plaintext FTP only
Highly homogeneous, suggesting centralized management. Within the 82.143.x.x subnet, 66 IPs form this FTP-only cluster (C5), while 34 IPs with additional services landed in Cluster 9—clustering operates on service fingerprints, not network topology. Likely legitimate Polish hosting infrastructure.
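To illustrate why hosts in one subnet can land in different clusters, the sketch below compares hosts by the services they expose and ignores their addresses entirely. Jaccard similarity is an assumed stand-in here; the features and distance measure ClusterHawk actually uses are not described in this post, and the hosts are hypothetical.

```python
import ipaddress


def jaccard(a, b):
    """Jaccard similarity between two sets of service labels."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical hosts in one documentation subnet.
ftp_only = {"ip": "192.0.2.10", "services": {"ftp/21", "modbus/502"}}
ftp_only_2 = {"ip": "192.0.2.11", "services": {"ftp/21", "modbus/502"}}
richer = {"ip": "192.0.2.12", "services": {"ftp/21", "modbus/502", "http/80", "ssh/22"}}

net = ipaddress.ip_network("192.0.2.0/24")
same_subnet = all(ipaddress.ip_address(h["ip"]) in net for h in (ftp_only, ftp_only_2, richer))

print(same_subnet)                                              # True
print(jaccard(ftp_only["services"], ftp_only_2["services"]))    # 1.0
print(jaccard(ftp_only["services"], richer["services"]))        # 0.5
```

All three hosts are network neighbors, yet only the first two look alike by service profile, so a fingerprint-based method would separate the third.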
Cluster Portrait: ICS/Modbus Devices (Clusters 1, 6, 10, 12, 18, 20 — 62 assets, 17.7%)
| Attribute | Finding |
|---|---|
| Core Service | Modbus TCP on port 502 (100% prevalence) |
| Vendors Identified | RESOL, Teltonika, Sierra Wireless (ACEmanager), CoDeSys |
| Modbus Unit IDs | Unit 0, Unit 1, Unit 2 (varying by sub-cluster) |
| Shodan Query | hash:334888494 port:502 |
| Shodan Matches | 22 hosts worldwide |
Sub-Cluster Breakdown:
| Sub-Cluster | Hosts | Device Profile | Key Signatures |
|---|---|---|---|
| C1 | 27 | Pure Modbus devices | Device Identification response, Unit 1+2, no web UI |
| C6 | 7 | ICS with web interfaces | CoDeSys WebVisualization, WHEDCo BMS addresses, Boa HTTPd |
| C10 | 6 | Minimal Modbus | Port 502 only, Device Identification, isolated devices |
| C12 | 9 | Tagged ICS | Explicit "ics" tag, pure Modbus exposure |
| C18 | 5 | Cellular gateways | ACEmanager (Sierra Wireless), lighttpd, MQTT |
| C20 | 8 | Gateway/broadcast ICS | "ics" tag, Unit 0 responses, 50% have FTP |
What we found:
- CoDeSys WebVisualization - PLC programming environment interface
- WHEDCo building addresses (e.g., "50 E. 168th St, Bronx, NY") - Building Management Systems
- ACEmanager - Sierra Wireless industrial cellular gateway management
- Modbus Unit 0 in C20 - broadcast-capable or gateway devices
These signatures (CoDeSys, ACEmanager, WHEDCo BMS) are hard to fake convincingly. Low Shodan match count (22 hosts) points to specific device models. If legitimate, direct internet exposure enables FrostyGoop-style attacks.
Cluster Portrait: MQTT/IoT Brokers (Clusters 2, 4, 14, 15 — 43 assets, 12.3%)
| Attribute | Finding |
|---|---|
| Core Service | MQTT on port 1883 (plaintext) |
| Brokers Detected | Mosquitto 2.0.18, EMQX (Kubernetes-deployed) |
| TLS Configuration | TLSv1.2, ECDHE-RSA-AES128-GCM-SHA256 |
| Shodan Query | hash:-1351362334 port:1883 product:MQTT |
| Shodan Matches | 288,411 hosts worldwide |
MQTT Topic Intelligence:
| Topic Pattern | Cluster | Interpretation |
|---|---|---|
| rubix/points/value/cov/all/module-core-modbus/.../Stage 1 Compressor | C4 | Real ICS telemetry - Modbus-over-LoRa compressor control |
| $SYS/broker/messages/received | C2 | Broker metrics |
| $SYS/broker/load/messages/received/5min | C4, C15 | Load monitoring |
| $SYS/brokers/[email protected]/version | C14 | Kubernetes EMQX cluster |
| 00775627CDAC4C9283AD471AED4C9B21/.../DEVICE2688/CB_TRIP/INFO | C14 | Circuit breaker trip info |
Sub-Cluster Breakdown:
| Sub-Cluster | Hosts | Broker | Notable Features |
|---|---|---|---|
| C2 | 13 | Mosquitto | RESOL-DL2Plus, Teltonika signatures |
| C4 | 14 | Mosquitto | Modbus-over-MQTT with real sensor data (2234.48, 179.31...) |
| C14 | 6 | EMQX | Kubernetes-deployed, circuit breaker telemetry |
| C15 | 10 | Mosquitto + Cloud | Aliyun, Chinese/Taiwanese IoT vendors (smartpole.com.tw) |
The MQTT topic paths contain real device identifiers and sensor readings (Stage 1 Compressor, circuit breaker trip data), and this specificity is hard to fake. But Mosquitto is also used in Conpot and other honeypots, so we can't be certain. The Kubernetes-deployed EMQX (C14) suggests enterprise IoT infrastructure.
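When triaging topic lists like the one above, a useful first pass is to separate broker-internal statistics from application telemetry. The sketch below relies on the common convention that brokers publish their own metrics under the `$SYS` prefix; the function names are hypothetical and the third topic is an invented example, not one observed in the dataset.

```python
def classify_topic(topic: str) -> str:
    """Label a topic as broker-internal metrics or application telemetry."""
    # Topics under $SYS are conventionally reserved by brokers for their own statistics.
    if topic.startswith("$SYS/"):
        return "broker-metrics"
    return "application"


def top_level(topic: str) -> str:
    """Return the first path segment, often a tenant, gateway, or device namespace."""
    return topic.split("/", 1)[0]


topics = [
    "$SYS/broker/messages/received",
    "$SYS/broker/load/messages/received/5min",
    "plant/line1/compressor/stage1",  # hypothetical telemetry topic
]

for t in topics:
    print(classify_topic(t), top_level(t))
```

Broker metrics say little about what sits behind the broker; the application topics are where device names and process values appear.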
Cluster Portrait: Anomalous Outliers (Cluster -1 — 2 assets, 0.6%)
| Attribute | Finding |
|---|---|
| Hosting | Google Cloud Platform |
| TLS Version | TLS 1.3 with 256-bit encryption |
| HTTP Server | "Webs" (lightweight/embedded) |
| Shodan Query | http.status:200 http.dom_hash:1684361402 http.headers_hash:-1045840734 org:"Google LLC" port:443 |
| Shodan Matches | 2 hosts worldwide (unique pattern) |
Why these don't fit:
- TLS 1.3 + ECDSA certificates—real ICS/SCADA devices typically run older TLS
- No FTP, no MQTT, no Modbus on port 502—just HTTPS
- "Webs" is a lightweight embedded server, not industrial interfaces like Robustel or Moxa
- Google Cloud is unusual for industrial equipment
- Only 2 hosts worldwide match this pattern
Cloud hosting + modern TLS + lightweight HTTP suggests non-industrial web services, cloud proxies, or honeypots that happened to respond on Modbus ports.
Cluster Reference Guide [Filtered Dataset: ~351 IPs]
| Cluster | Host Count | Core Service | Quality Score | Key Characteristics |
|---|---|---|---|---|
| 5 | 66 (18.8%) | FTP/21 | Stable | e-wro.net.pl Polish hosting, 100% domain coverage |
| 3 | 37 (10.5%) | Mixed | Good | Web services, diverse configurations |
| 9 | 34 (9.7%) | FTP/21 | Good | Secondary FTP infrastructure cluster |
| 1 | 27 (7.7%) | Modbus/502 | Very Good | Real ICS devices, 22 global fingerprint matches |
| 19 | 15 (4.3%) | HTTPS/443 | Good | Contains Metasploit host (120.157.27.81) |
| 14 | 6 (1.7%) | MQTT/1883 | Best (0.896) | ICS-labeled IoT infrastructure |
| 12 | 9 (2.6%) | Modbus/502 | Very Good | ICS device cluster, excellent stability |
| -1 | 2 (0.6%) | HTTPS/443 | Worst (-0.950) | Google Cloud anomalies, requires investigation |
Part 3: Comparing Both Analyses
Same methodology, different datasets. The differences below come from what we filtered out, not how we clustered.
Side-by-Side Comparison
| Aspect | Full Dataset (1,602 IPs) | Filtered Dataset (~351 IPs) |
|---|---|---|
| Clusters | 80 at 93% confidence | 23 at 89% confidence |
| JA3S fingerprint sharing | 95.99% identical | <2% share any fingerprint |
| SSH host keys | 958 identical | Diverse configurations |
| Certificate timestamps | 19 at same second | No shared timestamps |
| MySQL instances | 997 | 0 |
| Redis instances | 932 | 0 |
| Telnet hosts | 494 (32.3%) | 0 |
| Modbus exposure | 1,602 (100%) | 351 (100%) |
| FTP hosts | 1,134 | 100 |
| MQTT brokers | Not prominent | 43 hosts |
| Provider distribution | Cogent-heavy concentration | Diverse |
| Metasploit detection | 120.157.27.81 | 120.157.27.81 |
What Holds Up In Both
These findings appear regardless of filtering:
- Real ICS exposure exists: 62 devices show characteristics of genuine industrial systems (RESOL, Teltonika, Sierra Wireless, CoDeSys)
- Metasploit presence: IP 120.157.27.81 running exploitation framework—confirmed in both
- Clustering works: Both datasets form meaningful clusters regardless of honeypot contamination
- The core risk is real: Unauthenticated Modbus remains a genuine attack vector (see FrostyGoop)
What Diverges
These patterns appear in the full dataset but vanish after filtering:
| Pattern | Full Dataset | Filtered Dataset | Interpretation |
|---|---|---|---|
| Extreme homogeneity | 95.99% TLS fingerprint sharing | <2% sharing | Likely honeypot artifact |
| Rich service stacks | MySQL + Redis + SSH + Telnet | Simpler profiles | Likely honeypot artifact |
| Shared SSH host keys | 958 identical SSH host keys | Diverse keys | Likely honeypot artifact |
| Provider concentration | Cogent-heavy pattern | Diverse distribution | Likely honeypot artifact |
The divergence suggests significant honeypot presence in the full dataset. We can't rule out that some legitimate templated deployments got filtered too.
Conclusion
Shodan's Modbus data is a mix of real industrial devices, honeypots, and research infrastructure. Clustering shows what that mix looks like.
Full dataset (1,602 IPs): Extreme homogeneity—95.99% shared TLS fingerprints, identical service stacks, batch-deployed certificates. Consistent with large-scale honeypot deployments.
Filtered dataset (~351 IPs): Heterogeneous configurations, identifiable vendors (RESOL, Teltonika, Sierra Wireless, CoDeSys), simpler service profiles matching actual industrial equipment.
Both confirm:
- Real industrial systems are exposed on the internet
- Modbus remains unauthenticated and vulnerable to FrostyGoop-style attacks
- Clustering works regardless of honeypot contamination
Practical takeaway: When analyzing internet-exposed ICS data, expect significant honeypot presence. Service count and fingerprint diversity tell you more than platform tagging.
Disclosure: We used Shodan's indexed scan results—no active scanning, exploitation, or direct interaction with target systems. The goal is understanding aggregate exposure patterns, not identifying specific organizations.
Full dataset analysis performed on 1,602 Modbus-identified hosts from Shodan (January 2026). Filtered dataset analysis performed on ~351 IPs after additional honeypot detection heuristics. Both analyses clustered via ClusterHawk.
References
ICS Malware & Incidents
- FrostyGoop ICS Malware Impact on Connected OT Systems — Dragos Intelligence Brief (July 2024)
- FrostyGoop Malware Analysis: Artifacts, Behaviors and Network Communications — Palo Alto Unit 42
- Russia-Linked Hackers Used Frostygoop Malware to Shut Off Heat to 600 Ukrainian Buildings — WIRED
- New critical infrastructure malware used in attack on Ukrainian city — Axios
- FrostyGoop malware left 600 Ukrainian households without heat — The Record
- CISA Advisory AA22-103A: APT Cyber Tools Targeting ICS/SCADA Devices — Joint advisory on INCONTROLLER/PIPEDREAM (April 2022)
- INCONTROLLER: State-Sponsored Cyber Attack Tools Target Multiple ICS — Mandiant/Google Cloud
- CHERNOVITE Threat Profile — Dragos threat actor analysis
Vulnerability References
- CVE-2017-3599 — MySQL Server integer overflow (DoS)
- CVE-2025-26466 — OpenSSH pre-authentication DoS
- CVE-2008-3844 — Red Hat OpenSSH supply chain compromise (trojaned packages)
- CVE-2008-3844 Details — CVE Details analysis
- Red Hat Security Advisory CVE-2008-3844 — Red Hat Customer Portal
Data Sources
- Shodan — Internet-connected device search engine
- ClusterHawk — Infrastructure clustering and analysis platform
