Publish date: Dec 7, 2022
Duration: 57 min
# Abstract

Tons of data are collected every day. As more information becomes available, it gets harder to find the relevant, desired information, so we need tools and techniques to organize, search, and understand massive amounts of text. This is where topic modelling changes the game.

# Description

In machine learning and natural language processing, a topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. Topic modeling is a frequently used text-mining tool for discovering hidden semantic structures in a text body.

Some common use cases for topic modeling are:

• Summarizing large text data by classifying documents into topics (the idea is quite similar to clustering).
• Exploratory data analysis to gain an understanding of data such as customer feedback forms, Amazon reviews, survey results, etc.
• Feature engineering: creating features for supervised machine learning experiments such as classification or regression.

Several algorithms are used for topic modeling, of which Latent Dirichlet Allocation (LDA) is a particularly popular method for fitting a topic model. It treats each document as a mixture of topics, and each topic as a mixture of words. This allows documents to "overlap" each other in terms of content, rather than being separated into discrete groups, in a way that mirrors typical use of natural language.

In this talk we are going to cover the following points:

• Quick intro to NLP
• What are topic models?
• How do we know that we have a good topic model?
• How do topic models work?
• How to go from raw data to topics?
• Latent Dirichlet Allocation in action

By the end of the talk, you will understand the significance of topic modeling and how to build a topic model from scratch.

# Prerequisites

• Familiarity with Python
• Basic understanding of NLP
• Curiosity for learning new things