Results of our pilot to predict public procurement risks based on contract notices

The main aim of the pilot was to predict the riskiness of a procurement call from the text of its notice. We also aimed to give hints to those interested in replicating this attempt on similar datasets from other countries and in other languages.

Our investigation followed two pathways. The first aimed to detect linguistic clues associated with doctored calls (i.e. calls for tenders that attracted only one bidder). The second used modern machine learning methods to predict various indicators associated with the calls. Based on our findings, we built a small Proof of Concept (POC) tool, which will support our further investigations.
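
To make the first pathway more concrete, here is a minimal sketch of one way to look for linguistic clues: compare how often each term occurs in single-bidder notices versus all other notices, using a smoothed log-odds ratio. This is an illustration only, not the method used in the pilot. The function names (`tokenize`, `log_odds`), the whitespace tokenizer, the smoothing value, and the four toy notices are all assumptions made up for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption); real notices would need
    # language-specific tokenization and normalization.
    return text.lower().split()


def log_odds(notices, labels, smoothing=1.0):
    """Return a smoothed log-odds score per term.

    Positive scores mean the term is relatively more frequent in
    single-bidder notices; negative scores mean the opposite.
    """
    single = Counter()
    other = Counter()
    for text, is_single in zip(notices, labels):
        target = single if is_single else other
        target.update(tokenize(text))

    vocab = set(single) | set(other)
    # Additive smoothing keeps terms seen in only one group finite.
    n_single = sum(single.values()) + smoothing * len(vocab)
    n_other = sum(other.values()) + smoothing * len(vocab)

    scores = {}
    for term in vocab:
        p_single = (single[term] + smoothing) / n_single
        p_other = (other[term] + smoothing) / n_other
        scores[term] = math.log(p_single / p_other)
    return scores


# Toy, invented notices; True marks a call that had a single bidder.
notices = [
    "supply of medical devices exclusive certified supplier",
    "road maintenance open tender",
    "exclusive software licence renewal",
    "office furniture open tender",
]
labels = [True, False, True, False]

scores = log_odds(notices, labels)
# Terms ranked from most to least associated with single-bidder calls.
ranked = sorted(scores, key=scores.get, reverse=True)
```

On a real corpus, terms at either end of `ranked` would only be candidates for closer reading. A term's score alone does not show that it signals a problematic call.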

Despite our best efforts, we could not find any linguistic clues that identify problematic procurements based solely on their texts. However, modern machine learning methods seem promising if a sufficient amount of high-quality data is available. Our POC lays the groundwork for further development and testing by providing an interface to check the quality and usability of the manually set indicators and their configurations.