La Trobe

How to apply zero-shot learning to text data in substance use research: An overview and tutorial with media data

A vast amount of media-related text data is generated daily in the form of social media posts, news stories or academic articles. These text data provide opportunities for researchers to analyse and understand how substance-related issues are being discussed. The main methods for analysing large volumes of text data (content analyses or specifically trained deep-learning models) require substantial manual annotation and resources. A machine-learning approach called ‘zero-shot learning’ may be quicker, more flexible and require fewer resources. Zero-shot learning uses models trained on large, unlabelled (or weakly labelled) data sets to classify previously unseen data into categories on which the model has not been specifically trained. This means that a pre-existing zero-shot learning model can be used to analyse media-related text data without the need for task-specific annotation or model training. This approach may be particularly important for analysing data that are time-critical. This article describes the relatively new concept of zero-shot learning and how it can be applied to text data in substance use research, including a brief practical tutorial.
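
As a rough illustration of the workflow described above, the following minimal Python sketch assigns media texts to researcher-defined categories using a pre-trained model, with no task-specific annotation or training. It is not the article's own tutorial code. It assumes the Hugging Face `transformers` zero-shot classification pipeline and the publicly available `facebook/bart-large-mnli` model; the category labels, function names and the `min_score` cut-off are hypothetical choices made for illustration.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical candidate categories, chosen only for illustration.
CANDIDATE_LABELS = ["alcohol", "tobacco or vaping", "cannabis", "other"]


def load_default_classifier() -> Callable:
    """Load a pre-trained zero-shot classifier.

    Assumption: requires `pip install transformers torch` and downloads
    model weights on first use. The model named here is one public option,
    not necessarily the one used in the article.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    return pipeline("zero-shot-classification", model="facebook/bart-large-mnli")


def classify_texts(
    texts: List[str],
    labels: List[str],
    classifier: Optional[Callable] = None,
    min_score: float = 0.0,
) -> List[Dict]:
    """Assign each text its highest-scoring candidate label.

    `classifier` is any callable that accepts (text, candidate_labels=...)
    and returns a dict with "labels" and "scores" sorted from best to worst,
    which is the output shape of the transformers zero-shot pipeline.
    Texts whose top score falls below `min_score` get label None so that
    they can be set aside for manual review.
    """
    if classifier is None:
        classifier = load_default_classifier()

    results = []
    for text in texts:
        output = classifier(text, candidate_labels=labels)
        top_label, top_score = output["labels"][0], output["scores"][0]
        results.append(
            {
                "text": text,
                "label": top_label if top_score >= min_score else None,
                "score": top_score,
            }
        )
    return results


# Example use (downloads the model, so it is left commented out):
# headlines = ["Government announces minimum unit pricing for alcohol"]
# for row in classify_texts(headlines, CANDIDATE_LABELS):
#     print(row["label"], round(row["score"], 2), row["text"])
```

Because the categories are passed in at prediction time, changing the research question only requires editing the label list, not re-annotating data or retraining a model. Passing the classifier as an argument also makes it straightforward to swap in a different pre-trained model or to check the surrounding code without loading one.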

History

Publication Date

2024-05-01

Journal

Addiction

Volume

119

Issue

5

Pagination

9p. (p. 951-959)

Publisher

Wiley

ISSN

0965-2140

Rights Statement

© 2024 The Authors. Addiction published by John Wiley & Sons Ltd on behalf of Society for the Study of Addiction. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
