Fifteen Months in the Life of a Honeyfarm: Exploring the First Research Project of the AIDE Cohorts
Approximately one year ago, the Global Cyber Alliance’s Internet Integrity Program made the important decision to open the AIDE repository to research projects that would help advance its objectives, either in IoT security or in the network operations space.
AIDE is a repository of billions of IoT-related incidents collected through a network of more than 200 geographically distributed sensors. After five years of operation, and with around 800,000 new incidents currently recorded every day, the repository had become large and valuable enough to support the creation of a community around it, one of the goals of the Program.
Research was to become the glue binding that community together. We needed better clarity into our data and a wider sense of its applicability to our objectives, but also new perspectives that could eventually interact with each other in the exploration of synergies.
This is how the AIDE Cohort Model was created.
In this model, research institutions with the actual capability to explore the project’s massive volumes of data are offered full access to our platform under two very simple conditions: the submission and approval of a time-bound research proposal aligned with the strategic goals of AIDE, and the preparation of a joint final communication action that could help raise the project’s visibility and trigger a constructive debate within its community. In its current setting, the model has space for up to six proposals every year, the so-called ‘AIDE Cohorts.’
Once the model was designed, we needed to test it with a trustworthy organization. That organization was Germany’s Max Planck Institute for Informatics, one of the largest and most prestigious research centers for computer science in the world.
We could not have been any luckier.
The Max Planck Institute’s proposal was an academic paper that was eventually, and very successfully, presented on October 25 at the ACM Internet Measurement Conference 2023. Its title was as exciting as its findings: ‘Fifteen Months in the Life of a Honeyfarm.’
Today, I am meeting Cristian Munteanu, main author of the paper, to chat about his research and the experience of being the absolute pioneer of our AIDE Cohort Model.
(Question): Cristian, first of all, congratulations on your paper and on your presentation at the ACM Internet Measurement Conference 2023. We are sure this was a great opportunity for you as a researcher. How was the paper received?
Answer: Hello, and thank you very much. It was a great opportunity indeed. I think the presentation went pretty well. We got a few questions afterwards: people wondered about IPv6 and whether we plan to investigate it. Some results also opened discussions; for example, people were intrigued by the fact that each honeypot provided a significant amount of unique information.
Your paper works as a map to navigate the immense contents of our repository. Did you find any surprises there? How useful do you think it will be for future researchers?
Given the amount of data gathered, it is expected to see outliers or some unexpected behavior. As a researcher, you should always be prepared for this.
However, the GCA dataset was really helpful. Neat, easy to understand, clean and consistent. Some explanation was required at the start of the work, but afterwards everything went smoothly.
I find the dataset extremely interesting. So far, our research has only scratched the surface. In my opinion, you can find many treasures there, if you look deep enough.
Your work has been a real stress test for our platform, and an amazing learning opportunity for us at GCA. How was your personal experience as a researcher… and a pioneer?
I found the platform very useful and extremely intuitive. It did not take me very long to understand all the functionalities that the platform provided.
Of course, running complicated queries on 400 million entries is not trivial, and to be honest, I was surprised that the interface handled such jobs and, more importantly, answered quickly.
For my research, which was mostly exploratory, the platform was an extremely useful tool.
As you know, the end goal of the AIDE Cohort Model is to offer solid data on unwanted traffic that will help us mobilize network operators to address this massive issue, which affects the integrity of the Internet. How do you think you have contributed to that goal?
First of all, I would like to mention the importance of the AIDE Cohort Model’s end goal. With the exponential growth of the Internet, the amount of unwanted traffic and the number of security breaches have increased as well. Notifying network operators and providing solid data is a great way to improve security for all Internet users.
I do hope that our work sheds light on the intrusive behavior currently observed on the Internet. I think of this work as a starting point, “a stepping-stone into the muddy waters” of unwanted traffic. Our findings carry valuable insights, and I think that some of them can immediately contribute to the end goal of the AIDE project.
Looking into the future, what new lines of research are you exploring? Are you counting on using the AIDE platform again?
At this moment we have a good understanding of the intrusive behavior on the Internet. As in any good research, after answering one question, two new questions appear.
Since we have a broad view of the situation, we are now focusing on specifics, for example, determining the mechanisms used by intruders to perform their attacks. This may sound simple, but it is most certainly not a trivial task.
Yet we find this investigation extremely important, as our results may further improve the understanding of unwanted traffic.
And, as you may have guessed, once we understand it, we can start the work on preventing it.
It is a unique collaboration between researchers and security professionals.
It would be my privilege to use the AIDE platform for future work.
Thanks a lot for your amazing work, Cristian. I am looking forward to reactivating our working sessions!