Estimated workload

175 hours (Medium size project)




The goal of this project is to allow pages to be automatically tagged using a machine learning framework of your choice, based on the content of the page and its metadata.

Two similar project proposals in the area of automatic tagging could give you inspiration for how to implement this with your chosen framework: content-based tag suggestions (using TF-IDF to suggest tags) and Organizing Knowledge Using Topic Models (using LDA for a global analysis).
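To make the TF-IDF idea concrete, here is a minimal sketch in Python that uses only the standard library. It is an illustration under stated assumptions, not the approach of either proposal: the function names (`tokenize`, `suggest_tags`), the stop-word list, and the `pages` mapping of page identifiers to plain text are all hypothetical, and a real solution would likely also use page metadata and a proper ML framework.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be larger.
STOPWORDS = {"the", "and", "for", "with", "this", "that", "how"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words longer than two letters."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if len(w) > 2 and w not in STOPWORDS]


def suggest_tags(pages, page_id, top_n=3):
    """Return up to top_n candidate tags for one page, ranked by TF-IDF.

    pages is assumed to map a page identifier to the page's plain text.
    """
    docs = {pid: Counter(tokenize(text)) for pid, text in pages.items()}
    n_docs = len(docs)

    # Document frequency: in how many pages does each term occur?
    df = Counter()
    for counts in docs.values():
        df.update(counts.keys())

    counts = docs[page_id]
    total = sum(counts.values())
    if total == 0:
        return []

    # Smoothed IDF: a term present in every page gets a score of zero.
    scores = {
        term: (freq / total) * math.log((1 + n_docs) / (1 + df[term]))
        for term, freq in counts.items()
    }
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, score in ranked[:top_n] if score > 0]
```

Calling `suggest_tags(pages, "some-page")` on a small dictionary of page texts returns the terms that are frequent on that page but rare across the rest of the wiki. Everything runs locally and in memory, which matches the preference for an offline, lightweight solution stated in the technical requirements.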

Technical requirements

  • An offline solution is preferable. Sending wiki content to an external service should be avoided. Self-hosted third-party services can be considered, but it is better to avoid them too, because they add complexity to deployment and maintenance.
  • It is preferable to propose a solution that does not require excessive resources to deploy (e.g. solutions that require several gigabytes of memory and/or disk space should be avoided where possible).



