Links
Idea forthcoming
In 2021 we had two elections here in Kosovo. In the heat of it all, I noticed that news sites were flooded with biased content, and like any normal human my first response was: "I want to visualize that".
I had no experience with web scraping, so it was a perfect opportunity to learn. After some research I found that Python is a good option for web scraping. I had no experience with Python either, so I can say for sure that I learned a lot during this project.
Execution
The scraper
My first goal was scraping the data. After having 100 StackOverflow tabs open, I finally managed to make a useful web scraper with Python. After more hours of testing the scraper's accuracy and fine-tuning the data structure, this is what I came up with.
If you want a more detailed look into the backend, here is the documentation I wrote: https://arditxhaferi2.gitbook.io/python-web-scrapper/
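To give a rough idea of what counting mentions on a page can look like, here is a minimal sketch using only the Python standard library. It is not the actual scraper from this project: the names `TextExtractor`, `extract_text`, `count_mentions` and `fetch_html`, and the whole-word, case-insensitive matching rule, are assumptions made for illustration.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects text from a page, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def extract_text(html):
    """Return the text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def count_mentions(text, keywords):
    """Count case-insensitive, whole-word occurrences of each keyword."""
    counts = Counter()
    for keyword in keywords:
        pattern = r"\b" + re.escape(keyword) + r"\b"
        counts[keyword] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


def fetch_html(url):
    """Download a page (hypothetical helper, not called in this sketch)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")
```

With a real site you would pass the result of `fetch_html(url)` to `extract_text` and then call `count_mentions` with your own list of party or politician names. Whole-word matching is only one possible choice; names that appear in inflected forms would need a looser pattern.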
Visualization
Now that I had all the data I needed, I wanted to find the perfect way to visualize it. To save some time on the front end, I found a good HTML and CSS layout online and modified it to my liking.
For the data visualization I decided to use the JS library Chart.js. I added a donut chart for every news site to show how many times it mentioned each political party in Kosovo and which party dominated the most.
I didn't stop there. I also wanted to show which political figures are mentioned the most across all of these news sites.
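As an illustration of how scraped counts could be turned into input for a donut chart, here is a small Python sketch. It assumes the counts for one site are a plain dict of name to number, and it builds a dict in the general shape Chart.js expects (`type`, `data.labels`, `data.datasets`). The function names and the sample values are hypothetical, not taken from the project.

```python
import json


def doughnut_config(site, mentions):
    """Build a Chart.js-style doughnut config dict for one news site."""
    # Largest slice first, so the dominant party leads the legend.
    labels = sorted(mentions, key=mentions.get, reverse=True)
    return {
        "type": "doughnut",
        "data": {
            "labels": labels,
            "datasets": [
                {"label": site, "data": [mentions[name] for name in labels]}
            ],
        },
    }


def dominant(mentions):
    """Return the name with the highest mention count."""
    return max(mentions, key=mentions.get)


# Made-up sample counts for one site.
sample_mentions = {"PartyA": 12, "PartyB": 30, "PartyC": 7}
config_json = json.dumps(doughnut_config("site-one.example", sample_mentions))
```

The resulting JSON string could be served to the front end and passed to `new Chart(...)` there. Sorting the slices by size is a presentation choice, not something Chart.js requires.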
API & Documentation
I wanted to put the data I am scraping to good use, so I made it available to the public. The link to the API is down below. Here is an example of the data structure the API returns:
Structure Example:
{
  "parties": {
    "sites": {
      "keyword": {
        "link": "number of mentions"
      }
    }
  },
  "people": {
    "sites": {
      "keyword": {
        "link": "number of mentions"
      }
    }
  }
}
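Here is a hedged sketch of how a response with this shape could be consumed in Python. It assumes that `sites`, `keyword` and `link` in the example are placeholders for a site name, a party or person name, and an article URL, and that the number of mentions is an integer. The sample data and the function name `totals_per_keyword` are made up for illustration.

```python
from collections import Counter


def totals_per_keyword(category):
    """Sum mentions per keyword across all sites and links in one category."""
    totals = Counter()
    for site, keywords in category.items():
        for keyword, links in keywords.items():
            totals[keyword] += sum(links.values())
    return totals


# Made-up response following the assumed nesting:
# category -> site -> keyword -> link -> number of mentions
sample_response = {
    "parties": {
        "site-one.example": {
            "PartyA": {
                "https://site-one.example/a": 3,
                "https://site-one.example/b": 1,
            },
            "PartyB": {"https://site-one.example/a": 2},
        },
        "site-two.example": {
            "PartyA": {"https://site-two.example/x": 5},
        },
    },
    "people": {},
}

party_totals = totals_per_keyword(sample_response["parties"])
```

The same function works for the `people` category, since both categories share the same nesting in the example above.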
I even wrote documentation on how you can make your own scraper from scratch with zero knowledge of Python: