Publish date: Dec 7, 2022
Duration: 57 min
# Abstract

Tons of data are collected every day. As more information becomes available, it gets harder to find the relevant, desired information, so we need tools and techniques to organize, search, and understand massive amounts of text. This is where topic modelling changes the game.

# Description

In machine learning and natural language processing, a topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. Topic modeling is a frequently used text-mining tool for discovering hidden semantic structures in a text body.

Some common use cases for topic modeling are:

• Summarizing large text data by classifying documents into topics (the idea is quite similar to clustering).
• Exploratory data analysis to gain an understanding of data such as customer feedback forms, Amazon reviews, survey results, etc.
• Feature engineering: creating features for supervised machine learning experiments such as classification or regression.

Several algorithms are used for topic modeling, of which Latent Dirichlet Allocation (LDA) is a particularly popular method for fitting a topic model. It treats each document as a mixture of topics, and each topic as a mixture of words. This allows documents to "overlap" each other in terms of content, rather than being separated into discrete groups, in a way that mirrors typical use of natural language.

In this talk we are going to cover the following points:

• Quick intro to NLP
• What are topic models?
• How do we know that we have a good topic model?
• How do topic models work?
• How to go from raw data to topics?
• Latent Dirichlet Allocation in action

By the end of the talk, you will understand the significance of topic modeling and how to build a topic model from scratch.

# Prerequisites

• Familiarity with Python
• Basic understanding of NLP
• Curiosity for learning new things