AquaChat: Revolutionizing marine cage inspection with conversational AI

AquaChat project overview: using an LLM to interpret user commands and guide adaptive ROV navigation and inspection of aquaculture net pens. Source: Akram et al. (2025); Aquacultural Engineering, 111, 102607.

Regularly inspecting the nets of aquaculture cages is a crucial yet demanding task. Maintaining structural integrity to prevent escapes, managing biofouling, and safeguarding fish health all require constant monitoring, a process that has traditionally relied on divers or on manually piloted Remotely Operated Vehicles (ROVs). These methods are expensive, labor-intensive, and prone to error, and they adapt poorly to dynamic underwater conditions.

To address these challenges, researchers from Khalifa University have developed AquaChat, a system that integrates Large Language Models (LLMs), the same technology behind tools like ChatGPT, to make marine cage inspection more intelligent, flexible, and efficient. Operators can “converse” with the ROV, prompting it to perform complex inspection tasks autonomously and adaptively.

Key takeaways

AquaChat allows operators to instruct ROVs using natural language (e.g., “inspect the cage for holes”), eliminating the need for complex programming.
The system uses a Large Language Model (LLM), such as GPT-4, to dynamically create and adjust inspection plans in real time, responding to unforeseen events like obstacles or low visibility.
Tests show that the AquaChat planner is far more flexible and understands a wider range of commands (both structured and unstructured) than traditional fixed-rule-based systems.
The system demonstrated high accuracy in tracking inspection trajectories in both simulations and physical tests, ensuring complete and efficient coverage of the net structure.

How does AquaChat work? A three-layer architecture

AquaChat’s effectiveness lies in its multi-layered architecture, designed to translate simple natural language commands into precise ROV actions in the water.

The high-level planner: The conversational brain

At the top layer is the LLM-based planner, which uses OpenAI’s GPT-4 model. When an operator inputs a command like, “Inspect the entire cage using a spiral method at a 3-meter distance,” the LLM not only understands the request but also breaks it down into a symbolic action plan. The plan accounts for the environmental context, such as the cage’s dimensions and the ROV’s capabilities, and is expressed as a logical sequence of tasks such as move_to, inspect, and capture_image.
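
To make the idea of a symbolic action plan concrete, here is a minimal Python sketch of how plan text might be turned into structured actions. The plan format, the argument names (cage_top, net_section_1), and the parse_plan helper are assumptions made for illustration; the article does not describe the exact output format the authors use, and no LLM call is made here.

```python
import re
from dataclasses import dataclass

# Hypothetical plan text, standing in for what an LLM planner might return.
# Only the action names (move_to, inspect, capture_image) come from the article.
PLAN_TEXT = """
move_to(cage_top)
inspect(net_section_1)
capture_image(net_section_1)
"""

ALLOWED_ACTIONS = {"move_to", "inspect", "capture_image"}
ACTION_PATTERN = re.compile(r"^(\w+)\(([^()]*)\)$")


@dataclass(frozen=True)
class Action:
    name: str
    args: tuple


def parse_plan(text: str) -> list:
    """Parse one action per line; reject anything outside the allowed set."""
    actions = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        match = ACTION_PATTERN.match(line)
        if match is None or match.group(1) not in ALLOWED_ACTIONS:
            raise ValueError(f"unrecognized plan step: {line!r}")
        args = tuple(a.strip() for a in match.group(2).split(",") if a.strip())
        actions.append(Action(match.group(1), args))
    return actions


plan = parse_plan(PLAN_TEXT)
for action in plan:
    print(action.name, action.args)
```

Restricting the parser to a known set of action names is one simple way to keep free-form model output from reaching the vehicle unchecked.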

The mid-level task manager: The logical translator

The symbolic plan generated by the LLM is then processed by a mid-level task manager. This layer acts as a translator and supervisor, converting symbolic actions into a sequence of verifiable tasks. For example, before executing the inspect(area) action, this module validates that logical preconditions are met, such as the ROV having reached the correct position (navigated(rov)). This step is crucial for preventing execution errors and ensuring the mission proceeds in an orderly and robust manner.
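
The precondition check described above can be sketched in a few lines of Python. The tables below are hypothetical: only inspect(area) and navigated(rov) appear in the article, and the remaining facts and the validate_plan helper are assumptions added for illustration rather than the authors' implementation.

```python
# Hypothetical precondition and effect tables, keyed by action name.
PRECONDITIONS = {
    "move_to": set(),
    "inspect": {"navigated(rov)"},
    "capture_image": {"navigated(rov)"},
}
EFFECTS = {
    "move_to": {"navigated(rov)"},
    "inspect": {"inspected(area)"},
    "capture_image": {"image_captured(area)"},
}


def validate_plan(action_names):
    """Walk the plan in order and check each step's preconditions.

    Returns (True, "ok") if every step is executable, otherwise
    (False, message) naming the first step that cannot run.
    """
    facts = set()
    for step, name in enumerate(action_names):
        missing = PRECONDITIONS[name] - facts
        if missing:
            return False, f"step {step} ({name}) is missing {sorted(missing)}"
        facts |= EFFECTS[name]
    return True, "ok"


print(validate_plan(["move_to", "inspect", "capture_image"]))
print(validate_plan(["inspect", "move_to"]))
```

The second call fails at step 0 because the ROV has not yet navigated anywhere, which is the kind of ordering error this layer is meant to catch before execution.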

The low-level motion control: Precise execution

Finally, the low-level motion control layer handles the physical execution of the tasks. Using data from the ROV’s sensors (camera, IMU, etc.), this module calculates the exact trajectories and adjusts the thrusters to move the vehicle precisely, follow the planned inspection route (e.g., a helical path), and maintain stability, even in dynamic underwater environments. A real-time feedback loop allows for error correction and on-the-fly mission adjustments if conditions change.
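
As a rough illustration of what following a helical path involves, the Python sketch below generates waypoints on a helix and applies one step of a simple proportional correction. All function names, parameters, and the gain value are assumptions; the article does not specify the controller AquaChat uses, and a real ROV controller would also handle heading, thruster allocation, and sensor noise.

```python
import math


def helical_waypoints(path_radius, depth_start, depth_end, turns, points_per_turn):
    """Return (x, y, z) waypoints on a helix of constant radius.

    Depth changes linearly from depth_start to depth_end over the given
    number of turns. Units are whatever the caller uses (e.g. meters).
    """
    total = turns * points_per_turn
    waypoints = []
    for i in range(total + 1):
        frac = i / total
        angle = 2 * math.pi * turns * frac
        x = path_radius * math.cos(angle)
        y = path_radius * math.sin(angle)
        z = depth_start + frac * (depth_end - depth_start)
        waypoints.append((x, y, z))
    return waypoints


def p_control_step(position, target, gain=0.5):
    """One proportional correction: close a fraction `gain` of the error."""
    return tuple(p + gain * (t - p) for p, t in zip(position, target))


waypoints = helical_waypoints(path_radius=5.0, depth_start=0.0, depth_end=10.0,
                              turns=2, points_per_turn=8)
position = (0.0, 0.0, 0.0)
position = p_control_step(position, waypoints[0])
print(len(waypoints), position)
```

Repeating the correction step against fresh sensor readings is the essence of a feedback loop: each iteration shrinks the remaining error instead of relying on a single open-loop command.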

Validation in simulated and real environments: Is it truly superior?

The researchers validated AquaChat’s efficacy through extensive testing, comparing it against a traditional fixed-rule-based planner.

Flexibility and language comprehension

The rule-based planner could only process structured, predefined commands. It failed on any variation in wording and on more loosely phrased commands such as “Can you look for holes in the net?” In contrast, AquaChat’s LLM planner correctly interpreted both structured and unstructured commands and generated complete, coherent mission plans. Although the LLM took longer to generate a plan (between 2.7 and 47.7 seconds), its ability to handle the diversity of human language represents a significant operational advantage.
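
The brittleness of fixed rules is easy to reproduce. The Python sketch below is a hypothetical rule-based planner with a single rigid command template; it is not the baseline used in the study, and the template and plan strings are assumptions chosen only to show why a rephrased request falls through.

```python
import re

# Hypothetical fixed template: "inspect cage <pattern> at <distance> m".
RULE = re.compile(r"^inspect cage (spiral|zigzag) at (\d+(?:\.\d+)?) ?m$")


def rule_based_plan(command):
    """Return a list of plan steps, or None if the command does not match."""
    match = RULE.match(command.strip().lower())
    if match is None:
        return None
    pattern, distance = match.group(1), float(match.group(2))
    return [
        "move_to(start)",
        f"inspect(cage, pattern={pattern}, distance={distance})",
        "capture_image(cage)",
    ]


print(rule_based_plan("Inspect cage spiral at 3 m"))
print(rule_based_plan("Can you look for holes in the net?"))
```

The first command matches the template and yields a plan; the second expresses a perfectly reasonable request but returns None because no rule anticipates its wording.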

Navigation and control accuracy

In simulation tests, the AquaChat-guided ROV followed inspection trajectories (both direct-motion and complex spirals) with minimal error, demonstrating the low-level control system’s effectiveness in executing the AI-generated plans. Furthermore, experiments were conducted in a controlled pool environment with a commercial ROV (Blueye Pro ROV X). When assigned a zigzag inspection path, the ROV successfully tracked the depth waypoints with high precision, validating the system’s applicability on a real physical platform.
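
One common way to quantify how well a vehicle tracks depth waypoints is the root-mean-square error (RMSE) between target and measured depths. The Python sketch below shows that calculation on a zigzag depth profile. The depth values are invented for illustration and are not results from the study, the article does not state which error metric the authors report, and treating the zigzag as an alternation between two depths is an assumption.

```python
import math


def zigzag_depths(shallow, deep, legs):
    """Alternate between two target depths, one waypoint per leg endpoint."""
    return [shallow if i % 2 == 0 else deep for i in range(legs + 1)]


def rmse(targets, measured):
    """Root-mean-square error between two equal-length sequences."""
    if len(targets) != len(measured):
        raise ValueError("sequences must have the same length")
    squared = sum((t - m) ** 2 for t, m in zip(targets, measured))
    return math.sqrt(squared / len(targets))


targets = zigzag_depths(shallow=0.5, deep=1.5, legs=4)
# Made-up depth readings, for illustration only.
measured = [0.52, 1.47, 0.55, 1.49, 0.50]
print(targets)
print(round(rmse(targets, measured), 3))
```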

Implications for the future of aquaculture

AquaChat is not just a research project; it’s a glimpse into the future of aquaculture management. By making interaction with marine robotics as simple as giving a verbal command, this technology has the potential to:

Increase operational efficiency: It reduces the time and complexity involved in planning inspection missions.
Improve data accuracy and quality: It ensures systematic and complete coverage of the nets, leading to better problem detection.
Democratize the use of technology: It lowers the learning curve for ROV operators, enabling more personnel to perform high-level inspections without programming expertise.

The development of systems like AquaChat marks a critical step toward more autonomous, sustainable, and intelligent aquaculture operations, laying the groundwork for the next generation of management tools in the industry.

Contact
Irfan Hussain
Khalifa University Center for Autonomous Robotic Systems (KUCARS), Khalifa University
United Arab Emirates
Email: irfan.hussain@ku.ac.ae

Reference
Akram, W., Din, M. U., Saad, A., & Hussain, I. (2025). AquaChat: An LLM-guided ROV framework for adaptive inspection of aquaculture net pens. Aquacultural Engineering, 111, 102607. https://doi.org/10.1016/j.aquaeng.2025.102607

Milthon Lujan

Editor at the digital magazine AquaHoy. He holds a degree in Aquaculture Biology from the National University of Santa (UNS) and a Master’s degree in Science and Innovation Management from the Polytechnic University of Valencia, with postgraduate diplomas in Business Innovation and Innovation Management. He possesses extensive experience in the aquaculture and fisheries sector, having led the Fisheries Innovation Unit of the National Program for Innovation in Fisheries and Aquaculture (PNIPA). He has served as a senior consultant in technology watch, an innovation project formulator and advisor, and a lecturer at UNS. He is a member of the Peruvian College of Biologists and was recognized by the World Aquaculture Society (WAS) in 2016 for his contribution to aquaculture.
