Now we get into the part of the project that I have little to no experience in. While I have used the tools, I don’t have years of working knowledge and I have certainly never combined them in this way in an attempt to build a narrative around an attack. Thus enter my good buddy Claude to fill in the gaps.
I knew what I wanted, but not how to get there. I have been in school with SANS for some time now and while I have learned an absolute ton, I don’t claim myself to be an expert in anything. Since this is my first experience setting up a honeypot like this, I started reading about different ways to parse the data… and man there are a lot of them. I decided that while I could follow a write up of someone that had come before me to the letter and have a working solution in no time, I would probably learn more by explaining what I wanted to AI and having it coach me through the process.
I have a Security Onion deployment here at work, and to be clear up front, I know stream correlation isn’t anything new. Security Onion already does it, plenty of SIEMs do it, and there are write-ups all over the place on wiring Cowrie, Suricata and Zeek together. That isn’t the part I’m trying to prove. The problem I actually have is the one sitting in front of me every day: a perfectly good Security Onion deployment generating more alerts than I will ever have time to triage by hand. So the thing I’m really testing here isn’t whether I can correlate the streams, it’s whether I can hand the correlated data to AI and get back a daily summary that does my tier 1 parsing for me and only shows me what I need to take action on. The idea came out of an industry conference; to use an offline AI model to collect data from “all the things,” sanitize it (since its internal), and then ship it off to the Claude API. Then I can have Claude deliver a daily report on the most important alerts. This gives me the actionable summary I want and saves me hours. The ICS internship with SANS is providing me the opportunity to refine the bones of that idea before putting it into practice in my own network. So the plan…
First off, the HoneyPi is essentially the DShield honeypot package that the ISC deploys as part of the ongoing sensor network and data-collection project. The DShield has an iptables/firewall log reporter that submits packet level reject/drop data, alongside an ISC-specific output plugin that ships Cowrie events to the collector for the aggregate threat feeds. Cowrie is the upstream honeypot engine that DShield uses as its SSH/Telnet emulation layer. So, you get all that just by deploying the DShield; I wanted a bit more and to try and set up a sort of Security Onion “lite” deployment.
I decided on a combination of tools, with the goal of building three complementary streams I could correlate against a single attacker IP. While we aren’t fighting the Stay Puft marshmallow man, this is yet another time where we want to cross the streams… With Cowrie telling me what the attacker did, Suricata giving me the alert context, and Zeek providing the protocol narrative, the final tooling is laid out below.
- Suricata installed locally on HoneyPi, running the ET Open ruleset to “grade the data” and provide signature-based alerting that lets me sort by severity using rule classtype and priority.
- Tcpdump also on the HoneyPi, on an hourly rotation to grab a full packet capture to ship off to Zeek.
- Grafana Alloy, as the transport layer, shipping the logs (Cowrie and Suricata’s
eve.json) to my Mac here in the office. - Zeek on the Mac, batch processing pcaps that are rsynced over the 12222 port we have been using for SSH. Also a second Alloy instance on the Mac is where we essentially “cross the streams”. Alloy normalizes the Zeek data, provides labeling, and makes sure all three streams share the same key so they line up against one attacker.
- Loki + Grafana for storage and presentation. Loki holds the logs, and Grafana on top gives me the ability to build dashboards, research, and analyze it all in an intuitive way.
The last piece of the puzzle is the AI part, and what I really want refined for the enterprise network I am in charge of securing. Since I am nowhere near a python wizard and it would take me a year to create a script that would do what I want, who better to help me interact with AI than AI itself. Claude generated a great little guide and script for me to bundle up all the information I am getting from this whole setup and ship it off to Claude’s API. The script does this daily and provides a report on the last 24hrs. That report is the main goal of all this. I want that report to cut through all the data that’s just noise and tell me what’s important. Now, this script is designed for honeypot analysis, so it will be very different than the one I use internally to parse potential IOCs on my network. However, the concept is sound and it’s already generating some good analysis on the data it’s reviewing.
In the next few sections I will lay out the guide for my setup, from installing/configuring Alloy, Suricata, and tcpdump on HoneyPi to setting up the final AI reporting mechanism. I will also be adding a short bio post in the coming days about me, to lay a bit of a background for what I hope is a new hobby of mine. I have been meaning to document more of my work in both IT and Cyber and this internship has provided the perfect motivation.