To train our CNN we need labelled data of forest paths. Most publicly available data, however, is intended for autonomous driving, usually on solid (asphalt or compacted gravel) roads and paths.


One (near) exception is the Freiburg DeepScene forest dataset. Although it still consists mostly of compacted gravel park roads, it also contains many smaller soft paths.

We therefore adapted it and used it to train our network.

See also Dataprep > Datasets > Freiburg.


Another excellent dataset of forest paths is the one from Giusti et al.’s “A Machine Learning Approach to Visual Perception of Forest Trails for Mobile Robots”.

Unfortunately, this data is only labelled as “left”, “straight” or “right”.

See also Dataprep > Datasets > Giusti.

Own data

We also collected our own data in a forest in Flanders. See Dataprep > Datasets > Steentjes for more info.

Crowd Sourced Labelling

To label the Giusti data and our own data, we enlisted our fellow students in a crowd-sourced labelling effort.

We set up a custom version of Django Labeller on a Google Cloud VPS and used a collaborative spreadsheet and a simple instructional document to coordinate the labelling.

Our Django Labeller instance

Instructional document to coordinate labelling
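
A coordination spreadsheet like the one mentioned above could be seeded from a list of image files. The following is only a minimal sketch, not the procedure we actually followed: the file names, the batch size and the column names (`batch`, `labeller`, `done`) are hypothetical, and the helpers `make_batches` and `batches_to_csv` are introduced here purely for illustration. It splits the images into fixed-size batches and writes one CSV row per batch, leaving the `labeller` and `done` columns empty for volunteers to fill in.

```python
import csv
import io


def make_batches(image_names, batch_size):
    """Split image names into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    ordered = sorted(image_names)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


def batches_to_csv(batches):
    """Render one CSV row per batch; 'labeller' and 'done' are left empty."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Hypothetical column layout for the shared spreadsheet.
    writer.writerow(["batch", "first_image", "last_image", "count", "labeller", "done"])
    for index, batch in enumerate(batches, start=1):
        writer.writerow([index, batch[0], batch[-1], len(batch), "", ""])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical file names standing in for the unlabelled images.
    names = [f"frame_{i:04d}.jpg" for i in range(25)]
    print(batches_to_csv(make_batches(names, batch_size=10)))
```

The resulting CSV text can be imported into any collaborative spreadsheet, where each volunteer claims a batch by entering their name in the `labeller` column.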