Scene Understanding Challenges

The Robotic Vision Challenges

link to the competition.
In this challenge, participants are tasked with object detection on a video stream, where each detection must provide accurate estimates of spatial and semantic uncertainty using probabilistic bounding boxes (PBoxes). Participants are evaluated using a new probability-based detection quality (PDQ) measure which will reward accurate uncertainty estimates.
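To illustrate the idea, a PDQ-style per-detection score can be sketched as the geometric mean of a spatial-quality term (how well per-pixel detection probabilities match the ground-truth mask) and a label-quality term. This is a simplification of the published measure; the function name and exact weighting below are illustrative only:

```python
import numpy as np

def pairwise_pdq(prob_map, gt_mask, label_score):
    """Sketch of a PDQ-style pairwise quality score.

    prob_map    : per-pixel probability that the pixel belongs to the object
    gt_mask     : boolean ground-truth segmentation mask
    label_score : probability the detector assigned to the correct class
    """
    eps = 1e-12
    # Spatial quality: average log-probability over foreground and background
    # pixels, exponentiated back into [0, 1] (a simplification of the measure).
    fg = np.mean(np.log(prob_map[gt_mask] + eps))
    bg = np.mean(np.log(1.0 - prob_map[~gt_mask] + eps))
    spatial = np.exp(0.5 * (fg + bg))
    # Overall pairwise quality: geometric mean of spatial and label quality.
    return float(np.sqrt(spatial * label_score))
```

A perfectly confident, perfectly placed detection scores 1.0; hedged probabilities or a weak label score pull the value smoothly towards 0, which is what rewards honest uncertainty estimates.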

Semi-automated power pole inspection

link to project page
Power line failure is a critical issue: when a line fails, electricity is drawn from the remaining lines, and in the worst case this can cause large blackouts. Maintenance is therefore imperative, but because electricity networks are widely scattered, visiting each power pole in person is inefficient. Companies have turned to the sky, using drones for inspection. However, this introduces a new risk: the drone crashing into the line. To address this, we are developing an autonomy package that ensures human-piloted inspection drones do not collide with poles, cross arms and wires.

The aim of this project is to develop an unmanned aerial vehicle (UAV) asset inspection technology that:

  • increases the safety of the pilot.
  • reduces the required pilot skill level for close proximity data capture.
  • reduces the cognitive load on the pilot.
  • improves the quality of collected data for the purpose of visual inspection.


RangerBot

link to project page
Following the huge outbreak of crown-of-thorns starfish on the Great Barrier Reef, there has been a push to develop a simple monitoring and treatment robot capable of administering a fatal dose of bile salts directly into the starfish. In 2016, QUT’s robotics team responsible for COTSbot, in partnership with the Great Barrier Reef Foundation, successfully pursued Google funding to develop RangerBot. This smaller version of COTSbot will be less expensive and more nimble in the water. The Great Barrier Reef Foundation’s plans for a low-cost ‘robo reef protector’ received the thumbs up from the public and from Google: the Foundation’s RangerBot project with QUT roboticists Drs Matthew Dunbabin and Feras Dayoub won a $750,000 people’s choice vote in the Google Impact Challenge Australia, a competition which helps not-for-profit organisations develop technologies to tackle the world’s biggest social challenges.

Robotic Detection and Tracking of Crown-of-Thorns Starfish

link to project page

This work presents a novel vision-based underwater robotic system for the identification and control of Crown-Of-Thorns starfish (COTS) in coral reef environments. COTS have been identified as one of the most significant threats to Australia’s Great Barrier Reef. These starfish literally eat coral, impacting large areas of reef and the marine ecosystem that depends on it. Evidence has suggested that land-based nutrient runoff has accelerated recent outbreaks of COTS requiring extensive use of divers to manually inject biological agents into the starfish in an attempt to control population numbers. Facilitating this control program using robotics is the goal of our research.

This paper explains the proposed framework for detection and tracking. Since this work was completed, we have substantially improved the detection precision.

[pdf] Robotic detection and tracking of Crown-of-thorns starfish

Agricultural robotics

Towards Unsupervised Weed Scouting for Agricultural Robotics

Weed scouting is an important part of modern integrated weed management but can be time consuming and sparse when performed manually. Automated weed scouting and weed destruction have typically been performed using classification systems able to classify a set group of species known a priori. This greatly limits deployability, as classification systems must be retrained for any field with a different set of weed species present. To overcome this limitation, this project works towards a clustering approach to weed scouting which can be used in any field without prior species knowledge. We demonstrate our system using challenging data collected in the field from an agricultural robotics platform. We show that considerable improvements can be made by (i) learning low-dimensional (bottleneck) features using a deep convolutional neural network to represent plants in general and (ii) tying views of the same area (plant) together. Deploying this algorithm on in-field data collected by AgBotII, we are able to successfully cluster cotton plants from grasses without prior knowledge or training for the specific plants in the field.
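The two ingredients above — bottleneck features and tied views — can be sketched as follows. This is an illustrative k-means pipeline, not the AgBotII codebase; the features would come from a pre-trained network in practice:

```python
import numpy as np

def cluster_plants(features, track_ids, k=2, iters=20, seed=0):
    """Sketch of unsupervised weed scouting: cluster CNN bottleneck
    features with k-means, after tying together multiple views of the
    same plant (same track id) by averaging their feature vectors.
    All names here are illustrative, not from the deployed system.
    """
    rng = np.random.default_rng(seed)
    # Tie views: one averaged feature vector per tracked plant.
    ids = np.unique(track_ids)
    tied = np.stack([features[track_ids == i].mean(axis=0) for i in ids])
    # Plain k-means on the tied features.
    centres = tied[rng.choice(len(tied), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(tied[:, None] - centres[None], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centres[j] = tied[assign == j].mean(axis=0)
    return ids, assign
```

Averaging the views before clustering is what lets weak single-view features reinforce each other, so no species labels are ever needed.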

Peduncle Detection of Sweet Pepper for Autonomous Crop Harvesting – Combined Colour and 3D Information

This work presents a 3D visual detection method for the challenging task of detecting peduncles of sweet peppers (Capsicum annuum) in the field. The peduncle is the part of the crop that attaches it to the main stem of the plant, and cutting it cleanly is one of the most difficult stages of the harvesting process. Accurate peduncle detection in 3D space is therefore a vital step in reliable autonomous harvesting of sweet peppers, as it can enable precise cutting while avoiding damage to the surrounding plant. Our method makes use of both colour and geometry information acquired from an RGB-D sensor and utilises a supervised-learning approach for the peduncle detection task. The performance of the proposed method is demonstrated and evaluated using qualitative and quantitative analyses. The dataset used in this paper for training and testing is available from
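The colour-plus-geometry idea can be sketched as feature fusion followed by a supervised classifier. The features and learner below (toy per-point descriptors, logistic regression by gradient descent) are illustrative stand-ins; the paper's actual features and classifier may differ:

```python
import numpy as np

def train_peduncle_classifier(colour_feats, geom_feats, labels,
                              lr=0.5, epochs=2000):
    """Fuse per-point colour features (e.g. HSV values) with geometry
    features (e.g. surface-normal components) and fit a simple
    logistic-regression classifier by gradient descent."""
    X = np.hstack([colour_feats, geom_feats])
    X = np.hstack([X, np.ones((len(X), 1))])       # bias column
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))           # sigmoid
        w -= lr * X.T @ (p - labels) / len(X)      # gradient step
    return w

def predict_peduncle(w, colour_feats, geom_feats):
    """Return a boolean peduncle/not-peduncle decision per point."""
    X = np.hstack([colour_feats, geom_feats,
                   np.ones((len(colour_feats), 1))])
    return (1.0 / (1.0 + np.exp(-X @ w))) > 0.5
```

The point of fusing both modalities is that peduncles that are green-on-green in colour can still be separated by their 3D geometry, and vice versa.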

Visual Detection of Occluded Crop for Automated Harvesting

This work presents a novel crop detection system applied to the challenging task of field sweet pepper (capsicum) detection. The field-grown sweet pepper crop presents several challenges for robotic systems, such as the high degree of occlusion and the fact that the crop can have a similar colour to the background (green on green). To overcome these issues, we propose a two-stage system that performs per-pixel segmentation followed by region detection. The output of the segmentation is used to search for highly probable regions, which are then declared to be sweet pepper. We propose the novel use of the local binary pattern (LBP) to perform crop segmentation. This feature improves the accuracy of crop segmentation from an AUC of 0.10, for previously proposed features, to 0.56. Using the LBP feature as the basis for our two-stage algorithm, we are able to detect 69.2% of field-grown sweet peppers across three sites. This is an impressive result given that the average detection accuracy of people viewing the same colour imagery is 66.8%. [PDF]
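For readers unfamiliar with the texture feature, a basic 8-neighbour LBP can be computed as below. This is a minimal sketch of the feature the segmentation stage is built on; the paper may use a different LBP variant (radius, uniformity, histogramming):

```python
import numpy as np

def lbp_image(gray):
    """Basic 8-neighbour local binary pattern codes for a grayscale
    image. Each interior pixel gets an 8-bit code: bit i is set when
    neighbour i is at least as bright as the centre pixel."""
    g = gray.astype(float)
    c = g[1:-1, 1:-1]                      # centre pixels (interior only)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image giving neighbour (dy, dx) per centre.
        nb = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code
```

In a two-stage system of the kind described, histograms of these codes over small windows would feed the per-pixel segmenter, whose output regions are then scored as crop or background.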

Evaluation of Features for Leaf Classification in Challenging Conditions

Fine-grained leaf classification has concentrated on the use of traditional shape and statistical features to classify ideal images. In this paper we evaluate the effectiveness of traditional hand-crafted features and propose the use of deep convolutional neural network (ConvNet) features. We introduce a range of condition variations to explore the robustness of these features, including: translation, scaling, rotation, shading and occlusion. Evaluations on the Flavia dataset demonstrate that in ideal imaging conditions, combining traditional and ConvNet features yields state-of-the-art performance with an average accuracy of 97.3% ± 0.6% compared to traditional features which obtain an average accuracy of 91.2% ± 1.6%. Further experiments show that this combined classification approach consistently outperforms the best set of traditional features by an average of 5.7% for all of the evaluated condition variations.


Semantic Mapping on a Mobile Robot using Convolutional Networks

In this work, we focus on the challenging problem of place categorization and semantic mapping on a robot without environment-specific training. Motivated by their ongoing success in various visual recognition tasks, we build our system upon a state-of-the-art convolutional network. We overcome its closed-set limitations by complementing the network with a series of one-vs-all classifiers that can learn to recognize new semantic classes online. Prior domain knowledge is incorporated by embedding the classification system into a Bayesian filter framework that also ensures temporal coherence. We evaluate the classification accuracy of the system on a robot that maps a variety of places on our campus in real-time. We show how semantic information can boost robotic object detection performance and how the semantic map can be used to modulate the robot’s behaviour during navigation tasks. The system is made available to the community as a ROS module.
link to code
link to project page
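The temporal-coherence step can be sketched as a discrete Bayes filter over place categories. The transition model below (a fixed probability of staying in the same place class between frames) is an illustrative simplification of the framework described above:

```python
import numpy as np

def bayes_filter_update(belief, likelihood, stay_prob=0.9):
    """One discrete Bayes filter step over place categories.

    belief     : prior probability per class (sums to 1)
    likelihood : per-frame classifier output for each class
    stay_prob  : prior that the place label does not change
                 between consecutive frames (illustrative value)
    """
    n = len(belief)
    # Transition model: stay in the same class with stay_prob,
    # otherwise move uniformly to any other class.
    T = np.full((n, n), (1.0 - stay_prob) / (n - 1))
    np.fill_diagonal(T, stay_prob)
    predicted = T @ belief               # predict step
    posterior = predicted * likelihood   # measurement update
    return posterior / posterior.sum()   # normalize
```

Repeated noisy frame classifications of the same room are smoothed into a confident, temporally stable label, while a genuine room change still wins out after a few consistent observations.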

Vision-Only Autonomous Navigation Using Topometric Maps

The aim is to enable a robot to navigate autonomously using purely visual sensing for extended periods of time.
Active research includes:
  • Robustness – investigating multi-hypothesis graph-based approaches to dealing with change and uncertainty in both geometry and appearance, at global and local levels.
  • Compact representations to aid localization.
  • Memory-based approaches to persistent navigation.
  • Robust recovery behaviour to handle periods of vision blackout during navigation.

Robot Navigation Using Human Cues

This work shows that by using only symbolic language phrases, a mobile robot can purposefully navigate to specified rooms in previously unexplored environments. The robot intelligently organises a symbolic language description of the unseen environment and “imagines” a representative map, called the abstract map. The abstract map is an internal representation of the topological structure and spatial layout of symbolically defined locations. To perform goal-directed exploration, the abstract map creates a high-level semantic plan to reason about spaces beyond the robot’s known world. While completing the plan, the robot uses the metric guidance provided by a spatial layout, and grounded observations of door labels, to efficiently guide its navigation. The system is shown to complete exploration in unexplored spaces by travelling only 13.3% further than the optimal path.
link to project page

Persistent navigation in densely crowded environments

Current state-of-the-art methods to tackle the problems of mapping, localizing, navigating and planning by mobile robots have been shown to produce desirable and promising results when dealing with each of these problems individually. However, there are few mobile robots which are able to demonstrate long-term autonomy using all these methods together while operating unsupervised in a real-life environment. One example of such an environment is a public event. The main characteristic of these events is that they are densely crowded most of the time which introduces a new set of challenges for mobile robots.

Long-term operation in everyday environment (Bookshop as an example)

The main challenge in mapping non-stationary environments for mobile robots comes from the fact that the configuration of the environment can change in unpredictable ways. Therefore, the internal representation which the robot holds about the state of the surrounding environment can easily become invalid and out-of-date. The consequences of this fact can have catastrophic effects on the performance and the efficiency of the planning and navigation of the robot. This work presents a method to enable a mobile robot working in non-stationary environments to plan its path and localize within multiple map hypotheses simultaneously. The maps are generated using a long-term and short-term memory mechanism that ensures only persistent configurations in the environment are selected to create the maps. In order to evaluate the proposed method, experimentation is conducted in an office and a bookshop environment. Compared to navigation systems that use only one map, our system produces superior path planning and navigation in a non-stationary environment where paths can be blocked periodically, a common scenario which poses significant challenges for typical planners.
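The long-term/short-term memory mechanism can be sketched at the level of a single map cell: short-term memory counts recent occupied observations, and only configurations that persist are promoted into long-term memory and hence into a map. The thresholds and decay rate below are made up for the example and are not the published parameters:

```python
def update_cell(cell, occupied, promote_after=3, decay=0.2):
    """One observation update for a map cell.

    cell     : dict with 'stm' (recent occupied-observation count)
               and 'ltm' (long-term persistence score in [0, 1])
    occupied : whether the cell was observed occupied this scan
    Returns True if the cell should currently be part of the map.
    """
    if occupied:
        cell['stm'] += 1
        if cell['stm'] >= promote_after:
            cell['ltm'] = 1.0                 # promote: persistent obstacle
    else:
        cell['stm'] = 0                       # transient object: forget fast
        cell['ltm'] = max(0.0, cell['ltm'] - decay)  # long-term decays slowly
    return cell['ltm'] > 0.5
```

A shopper standing in an aisle never reaches the promotion threshold, while a relocated bookshelf does, and later fades out of the map only after it is repeatedly observed gone; map hypotheses built from such persistent cells are what the planner localizes against.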
