WhatsApp Chat Analyzer

I would like to share one of my projects, WhatsApp Chat Analyzer. Thanks also to @nityesh and @kurasaiteja for sharing their article with us.

The story behind this project: I read their article, and the next day one of my friends asked me, “Can you tell which words I use most often when chatting?” I remembered the article and realized that analyzing chat data is easy for us or any tech person, but what about non-tech people?

That gave me the idea to build an application that anyone can use to analyze their chats. I have experience with web applications, which is why I chose to build a WebApp.

About the app: the major tools and technologies used are Streamlit, Plotly, and Heroku (for hosting). You can export your chat as a text file, either a group chat or a personal chat, and upload it to the WebApp. The GIF below shows how it works.

Don’t worry, none of your data is stored.

There is no database behind it. When a text file is uploaded, Python converts it into a dataframe, and then you can see the visualizations. The plots give you insights such as:

  • Which member of the group is most active?
  • Which words does a particular person use most often?
  • When is a particular member active in the chat?
  • Which emoji do you use most often?

And some other insights. I ran two text files, one from a group chat and the other from a personal chat, and both times my top three most-used emoji were :rofl:, :joy: and :laughing:. It’s fun to see your friends’ most-used emoji and words :laughing:.
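As a rough illustration of the step from uploaded text to insights, here is a minimal sketch that parses exported chat lines into rows and answers three of the questions above. It is not the app's actual code: the line format (English export with a 12-hour clock), the regex, and the function names `parse_chat`, `most_active`, `top_words` and `active_hours` are all assumptions made for this example. It uses plain dicts from the standard library; the resulting rows could be passed to `pandas.DataFrame(rows)` to get a dataframe.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed export format: "11/3/17, 11:42 AM - Alice: Hello there"
LINE_RE = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}/\d{2,4}), (?P<time>\d{1,2}:\d{2} [AP]M) - "
    r"(?P<author>[^:]+): (?P<text>.*)$"
)


def parse_chat(raw_text):
    """Turn exported chat text into a list of row dicts, one per message."""
    rows = []
    for line in raw_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            rows.append(match.groupdict())
        elif rows:
            # A line that does not match is treated as a continuation of the
            # previous message (system notices would also land here).
            rows[-1]["text"] += "\n" + line
    return rows


def most_active(rows):
    """Message count per author, most active first."""
    return Counter(row["author"] for row in rows).most_common()


def top_words(rows, author, n=3):
    """The n most frequent words used by one author."""
    words = []
    for row in rows:
        if row["author"] == author:
            words.extend(re.findall(r"\w+", row["text"].lower()))
    return Counter(words).most_common(n)


def active_hours(rows, author):
    """Message count per hour of day (0-23) for one author."""
    hours = Counter()
    for row in rows:
        if row["author"] == author:
            hours[datetime.strptime(row["time"], "%I:%M %p").hour] += 1
    return hours


sample = """11/3/17, 11:42 AM - Alice: hello hello world
11/3/17, 11:43 AM - Bob: hi
11/3/17, 9:05 PM - Alice: hello again
and a second line"""

rows = parse_chat(sample)
print(most_active(rows))
print(top_words(rows, "Alice", n=1))
print(sorted(active_hours(rows, "Alice").items()))
```

Counting with `Counter` keeps the sketch short; in a dataframe the same questions would typically be answered with group-by and value-count operations.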

For more info, check out the project on GitHub here. One possible improvement is a story section that shows the user the key insights directly, without the effort of reading through the plots.

Feedback is always welcome.

If anyone wants to contribute, either a new feature or documentation, feel free to fork the repository and start contributing.

Thanks to everyone who shares their projects in the community.

Keep sharing, keep learning :slight_smile:

[GIF: how the WebApp works]

10 Likes

I think the GIF is not clear; you can check out the working WebApp here.

1 Like

Wow, cool… Great work, @Prem! I recall the article you mentioned. I haven’t used the WebApp just yet, but I’ll be sure to check it out :+1:

2 Likes

Great idea @Prem!

I tried to use it and got an error, however. It says the file is empty. I don’t know what’s going on; maybe it’s because I use WhatsApp in Portuguese. Anyway, I thought it was important to let you know.

This is the error:

ValueError: We need at least 1 word to plot a word cloud, got 0.
Traceback:
File "/app/.heroku/python/lib/python3.6/site-packages/streamlit/ScriptRunner.py", line 322, in _run_script
    exec(code, module.__dict__)
File "/app/app.py", line 102, in <module>
    analysis.word_cloud(data)
File "/app/custom_modules/func_analysis.py", line 69, in word_cloud
    wordcloud = WordCloud(stopwords=STOPWORDS, background_color='white', height=640, width=800).generate(processed_words)
File "/app/.heroku/python/lib/python3.6/site-packages/wordcloud/wordcloud.py", line 631, in generate
    return self.generate_from_text(text)
File "/app/.heroku/python/lib/python3.6/site-packages/wordcloud/wordcloud.py", line 613, in generate_from_text
    self.generate_from_frequencies(words)
File "/app/.heroku/python/lib/python3.6/site-packages/wordcloud/wordcloud.py", line 404, in generate_from_frequencies
    "got %d." % len(frequencies))
2 Likes

Hi @otavios.s

I don’t think the word cloud depends on the language. There must be some other reason for this error.

If you think this is an issue, the code is publicly available on GitHub. You are welcome to contribute :slightly_smiling_face:.

Thanks for sharing your feedback.

1 Like

Thanks @paul.aromolaran.1710 :slightly_smiling_face:.

Hey!

The problem is in the date format. In English WhatsApp, there’s a comma between the date and the time, like this:

[Screenshot: English WhatsApp export, with a comma between date and time]

There’s no such comma in Portuguese WhatsApp:

[Screenshot: Portuguese WhatsApp export, without the comma]

Without the comma, the date does not match the regex pattern.

If you’re interested in fixing it, I’ve created a pull request.
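To make the idea concrete, here is a minimal sketch of a pattern that tolerates both formats. It is not the code from the pull request, and the app's real regex may look different: the two patterns, the sample lines, and the helper `count_messages` are assumptions for illustration only. The only change between the patterns is `,?`, which makes the comma optional.

```python
import re

# Hypothetical patterns; the app's real regex may differ.
# STRICT requires a comma after the date, TOLERANT makes it optional.
STRICT = re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2}")
TOLERANT = re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4},? \d{1,2}:\d{2}")

# Assumed sample lines in the two export styles.
english_line = "11/3/17, 11:42 AM - Alice: Hello"
portuguese_line = "03/11/2017 11:42 - Bob: Olá"


def count_messages(pattern, lines):
    """Count the lines that start with a timestamp the pattern recognises."""
    return sum(1 for line in lines if pattern.match(line))


strict_count = count_messages(STRICT, [portuguese_line])
tolerant_count = count_messages(TOLERANT, [english_line, portuguese_line])

print(strict_count)    # messages found in the Portuguese line by the strict pattern
print(tolerant_count)  # messages found in both lines by the tolerant pattern
```

If a strict pattern recognises no lines at all, the parsed chat is empty and there are no words left to plot, which would be consistent with the "got 0" word cloud error.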

There’s still a problem, though. The Portuguese date is displayed wrong in the app because it is in the “dd/mm/yyyy” format. If you like my contribution, I can help with that too. I think you did amazing work, and it would be my pleasure to help make it accessible to more people.
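To show why the order of day and month matters, here is a small sketch that parses the same string both ways. The helper `parse_timestamp` and its `day_first` flag are hypothetical, and how the app would decide which order applies to a given file is an open question that this sketch does not answer.

```python
from datetime import datetime


def parse_timestamp(date_text, time_text, day_first):
    """Parse a date and a 24-hour time, given the order of day and month."""
    date_fmt = "%d/%m/%Y" if day_first else "%m/%d/%Y"
    return datetime.strptime(f"{date_text} {time_text}", f"{date_fmt} %H:%M")


# The same text read as dd/mm/yyyy and as mm/dd/yyyy.
as_day_first = parse_timestamp("03/11/2017", "11:42", day_first=True)
as_month_first = parse_timestamp("03/11/2017", "11:42", day_first=False)

print(as_day_first.strftime("%d %B %Y"))
print(as_month_first.strftime("%d %B %Y"))
```

Passing an explicit format to `strptime` avoids guessing: the same text is read as 3 November with the day first and as 11 March with the month first.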

3 Likes

Awesome, I’m glad you’re interested in contributing to the project. You can check your contribution status here.

1 Like

With the contribution of @otavios.s, WhatsApp Chat Analyzer can now handle both English and Portuguese WhatsApp chat text files.

Feel free to use and analyze your chat :slightly_smiling_face:.

If you find an error or want a feature, feel free to open an issue here.

Want to contribute? Click here to read the documentation and solve open issues. If this project interests you, give it a star here.

2 Likes

This looks so amazing, Prem!!! :heart_eyes: Awesome work, here! :heavy_heart_exclamation:

1 Like

Thanks to you for sharing your article :smile:.

1 Like

Awesome work. I will go through this and provide some feedback.

1 Like

Awesome work. I’m wondering whether this type of data analysis is computationally intensive. Is there a GPU backend serving the requests, or were GPUs used to train or process the data? If yes, which cloud did you get the GPUs from?

1 Like

No, there’s no machine learning going on.

1 Like

Hello Gaurav,

No, there is no GPU involved in the backend. Very heavy computation might cause a problem; I haven’t tested it on a very large dataset (chat file), and I haven’t come across one. The app structures the uploaded chat text file and creates some insightful plots, which doesn’t involve much computation.

1 Like

Thank you for sharing, @Prem! I enjoyed reading this post, very interesting. Hope you can keep sharing!

1 Like

Got it. Thanks, @Prem, for making it clear.

1 Like

@Prem,

Nice work. I have done the same with a word cloud, which I can see is part of your process.
When I was working on that project, my problem was the language. Word clouds work well for Latin characters, but I faced many obstacles optimizing them for Arabic and Persian (Farsi). Fortunately, there are some libraries that can help with this. I will share with you the word cloud that I generated.

3 Likes

Hey @ai1

That’s really cool, I love it :heart_eyes:. I didn’t know that a WordCloud could be generated in custom shapes.

Thanks for sharing :smile:.