publish date
Oct 12, 2022
duration
33 min
Difficulty
Case details
Data is a bottleneck for most current AI and machine learning projects. Deep learning solutions are data-hungry, often requiring hundreds of thousands of examples to learn the patterns in the data. So how can you scale your data labeling efforts to keep up with this demand? Crowdsourcing might be the solution to this scalability problem. In this session, you will learn how to treat data labeling tasks as engineering tasks and how to manage the crowd as yet another computing cluster. Magda will show you how to achieve this using open-source Python libraries from the comfort of your terminal (see the sketch after the project links below).

[b]Toloka GitHub page:[/b] [url=https://github.com/Toloka]https://github.com/Toloka[/url]

[b]Projects[/b]:
[b]EASY[/b]: [url=https://github.com/Toloka/toloka-kit/blob/main/examples/0.getting_started/0.learn_the_basics/learn_the_basics.ipynb]https://github.com/Toloka/toloka-kit/blob/main/examples/0.getting_started/0.learn_the_basics/learn_the_basics.ipynb[/url]
[b]MEDIUM[/b]: [url=https://github.com/Toloka/toloka-kit/blob/main/examples/5.nlp/text_classification/text_classification.ipynb]https://github.com/Toloka/toloka-kit/blob/main/examples/5.nlp/text_classification/text_classification.ipynb[/url]
[b]COMPLEX[/b]: [url=https://github.com/Toloka/toloka-kit/blob/main/examples/1.computer_vision/object_detection/object_detection.ipynb]https://github.com/Toloka/toloka-kit/blob/main/examples/1.computer_vision/object_detection/object_detection.ipynb[/url]
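The "crowd as a computing cluster" idea boils down to a short loop driven from the toloka-kit client. Below is a minimal sketch of that loop, not the session's exact code: it assumes an already configured project and pool, a placeholder OAuth token and pool id, and image inputs, with API names taken from the Learn the basics notebook linked above.

[code]
# Minimal sketch of the toloka-kit labeling loop (placeholder token/pool id;
# project and pool setup is covered in the linked notebooks).
import toloka.client as toloka

# Connect to the Toloka platform -- the crowd "cluster" we will schedule work on.
toloka_client = toloka.TolokaClient('YOUR_OAUTH_TOKEN', 'PRODUCTION')

# Upload labeling tasks to an existing pool, one task per item to label.
tasks = [
    toloka.Task(pool_id='YOUR_POOL_ID', input_values={'image': url})
    for url in ['https://example.com/img1.jpg', 'https://example.com/img2.jpg']
]
toloka_client.create_tasks(tasks, allow_defaults=True)

# Open the pool so crowd workers can start picking up the tasks.
toloka_client.open_pool('YOUR_POOL_ID')

# Later: collect accepted assignments and read out the raw labels.
for assignment in toloka_client.get_assignments(
        pool_id='YOUR_POOL_ID', status='ACCEPTED'):
    for task, solution in zip(assignment.tasks, assignment.solutions):
        print(task.input_values, solution.output_values)
[/code]

From there, the raw crowd labels are typically aggregated (for example by majority vote) into the final dataset; the EASY, MEDIUM, and COMPLEX notebooks above walk through this end to end for classification and object detection.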