Australia’s national science agency, CSIRO, has analysed a 10‑month workplace trial showing that large language models can help frontline cybersecurity teams manage threats while easing the strain of round‑the‑clock monitoring.
The study examined anonymised data from global cybersecurity firm eSentire’s Security Operations Centres in Ireland and Canada, where analysts sift through alerts, investigate incidents and respond to attacks. Over the trial, 45 analysts posed more than 3,000 prompts to ChatGPT‑4, chiefly for routine, low‑risk tasks such as interpreting technical telemetry, editing reports and scrutinising snippets of malware code.
Dr Mohan Baruwal Chhetri, Principal Research Scientist at CSIRO’s Data61, said the findings point to AI being embedded in real workflows to augment, not supplant, human judgment. “ChatGPT-4 supported analysts with tasks like interpreting alerts, polishing reports, or analysing code, while leaving judgement calls to the human expert,” Dr Baruwal Chhetri said. “This collaborative approach adapts to the user’s needs, builds trust, and frees up time for higher-value tasks.”
The work sits within CSIRO’s Collaborative Intelligence (CINTEL) programme, which explores human‑AI teaming in high‑stakes settings, including cybersecurity where analyst fatigue is a persistent risk. SOC teams face rising volumes of alerts, many of them false positives, which can sap productivity, obscure real threats and contribute to burnout. Dr Baruwal Chhetri said similar human‑AI approaches could translate to other pressure‑cooker environments such as emergency response and healthcare.
Dr Martin Lochner, a data scientist and research coordinator on the project, said the trial is the first long‑running industrial study to test how LLMs operate inside live cybersecurity operations and to inform tools designed specifically for SOC workflows. “This collaboration uniquely combined academic rigor with industry reality, producing insights that neither pure laboratory studies nor industry-only analysis could achieve,” Dr Lochner said.
Beyond headline productivity gains, the data also shed light on how analysts want to use generative AI. The team found only four per cent of prompts sought a direct answer, such as ‘is this malicious?’. Most analysts asked for evidence and context to underpin their own decisions rather than definitive calls from the model, a pattern the researchers said highlights the value of LLMs as decision-support tools that enhance analyst autonomy rather than replace it.
Building on the initial 10‑month dataset, CSIRO and eSentire will now undertake a longer study spanning two years to track how analyst behaviour evolves as familiarity with the tools grows. That phase will add qualitative interviews to compare user experiences with system logs, with the aim of refining AI assistance for broader adoption across SOC environments and better quantifying impacts on performance and wellbeing.
The full analysis is published as “LLMs in the SOC: An Empirical Study of Human‑AI Collaboration in Security Operations Centres”.