Last time I wrote about gathering insights into nodes and their languages and hashtags. This can give a better understanding of the communities on different nodes, but how about getting a live insight into what people are talking about?
A couple of years back, I had a project called TWIMPACT that looked at tweets on Twitter and used retweet counts to figure out what people were talking about, and also to compute the impact of a user based on how many retweets they got. I always liked the abundance of interesting data in the stream of posts. Look at this blog post from 2011 for what we did back then (wow, is Twitter that old already?). Twitter always had a difficult relationship with the ecosystem using its API, even well before Elon Musk, so we eventually pivoted to doing real-time analysis of event data.
I think Mastodon and the fediverse are a great opportunity to revisit these ideas and gather what is happening in the world. It is a big community (I'm seeing 80k to 100k posts per day), and in general it is friendlier and there is less abuse, so that's a good starting point.
The data is different, of course. For example, on Twitter, each retweet generates another post, but on Mastodon you only see the original post in the stream. It has a reshare count in its metadata, but to get the true number you would need to poll it again, which does not scale (and would also put undue load on servers).
So instead of looking at the number of boosts, you can count the number of posts or the number of distinct users (which seems to be what Mastodon uses on its own explore page). I'm not sure whether one measure is always better than the other. The number of boosts or likes could be seen as a measure of approval, but maybe it also just reflects the number of followers a user has. Looking at the number of distinct users posting means more people need to find a topic interesting, but it does not take into account the "standing" of those users.
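To make the two measures concrete, here is a minimal sketch in Python that counts both posts and distinct users per hashtag. The post layout (a dict with hypothetical `account` and `tags` fields) is an assumption for illustration, not the actual Mastodon stream format or the way fedistats stores its data.

```python
from collections import defaultdict

def hashtag_counts(posts):
    """Return {hashtag: (post_count, distinct_user_count)}.

    Each post is assumed to be a dict with hypothetical keys
    "account" (who posted) and "tags" (list of hashtag strings).
    """
    post_counts = defaultdict(int)
    users = defaultdict(set)
    for post in posts:
        # Lowercase and deduplicate tags so a post counts once per hashtag.
        for tag in {t.lower() for t in post["tags"]}:
            post_counts[tag] += 1
            users[tag].add(post["account"])
    return {tag: (post_counts[tag], len(users[tag])) for tag in post_counts}

# Made-up example data: two accounts, three posts.
posts = [
    {"account": "alice", "tags": ["press", "news"]},
    {"account": "alice", "tags": ["press"]},
    {"account": "bob", "tags": ["Press"]},
]
counts = hashtag_counts(posts)
print(counts["press"])  # (3, 2): three posts, but only two distinct users
```

In this sketch the account is only used to deduplicate in memory. One prolific account can push up the post count, while the distinct user count only grows when somebody new joins in.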
In any case, I think the question is whether you can see interesting trends or not. So let's explore some of the trends on the fediverse.
Let's start with hashtag trends and links. In terms of daily posts, #press is the biggest trend right now, mostly driven by posts from the LA Times and Reuters, it seems. It's tempting to dive deeper into the data, but for privacy reasons, I'm not extracting data on the account level, so we have to take this at face value.
This trend is periodic: it follows more or less a daily rhythm. You might wonder what these spikes are. I've done a bit of digging, and it seems that sometimes there is a flood of new posts coming in, probably when a new node connects, but I'm not sure to be honest. Still so much to discover about this data!
These are great hashtags to follow if you want new information every day on what is happening.
Sometimes there are big new events that give rise to a hashtag that hasn't been there before. Last week, Donald Trump was indicted, which led to new activity in the #trump hashtag, but also in associated hashtags like #trumpindictment.
I've seen similar events, like #f1 last weekend when there was a Formula 1 race, or #openingday2023 (not much happening there now) when the baseball season kicked off. Other examples were #aurora when there was northern lights activity, #volksentscheid when there was a referendum in Berlin, #ramadan when it started, and so on. Currently, fedistats only looks back one week, so you cannot see a lot on these hashtags anymore.
fedistats actually uses a different sorting to bring up hashtags that are new: it weights a trend more heavily when its number of daily posts is close to its number of weekly posts, since that means most of the week's activity happened within the last day.
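As a rough sketch of that idea, the Python snippet below ranks hashtags by a score that multiplies the daily count by the daily-to-weekly ratio. The formula, the function names and the numbers are my own assumptions for illustration. The actual weighting used in fedistats may well be different.

```python
def novelty_score(daily, weekly):
    """Hypothetical score: daily volume scaled by the share of the
    week's posts that arrived in the last day (1.0 for a brand-new tag)."""
    if weekly <= 0:
        return 0.0
    return daily * (daily / weekly)

def rank_new_trends(stats):
    """Sort hashtags by novelty score, highest first.

    stats maps hashtag -> (daily_posts, weekly_posts).
    """
    return sorted(stats, key=lambda tag: novelty_score(*stats[tag]), reverse=True)

# Made-up counts, not real fedistats data.
stats = {
    "press": (900, 6300),           # steady: roughly the same volume every day
    "trumpindictment": (400, 420),  # new: almost all posts are from today
    "caturday": (50, 350),          # small and steady over the week
}
ranking = rank_new_trends(stats)
print(ranking)  # ['trumpindictment', 'press', 'caturday']
```

With a score like this, a steady high-volume hashtag such as #press no longer dominates the list, because only a small fraction of its weekly posts are from today.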
Which brings us to another fun topic: hashtag games. There are some hashtags that become active every week on a particular day. #caturday, apparently, is people sharing pictures of cats every Saturday.
These used to be much more popular on Twitter, but I like the playfulness of people participating in games like this.
There are hashtag games that create their own genre, like hashtags starting with #songsormoviesabout... As I'm writing this, #songsormoviesaboutnothing is trending, with the top link being "Mas que nada" by Sergio Mendes.
Right now, we have #songsormoviesaboutgettingindicted, of course...
I think there is a lot of interesting information to follow. The data is surprisingly rich, which speaks well of the community on the fediverse. Still, I admit that fedistats is a bit technical and not very easy to use right now. What are your ideas to make it more useful? I'm looking forward to your thoughts! Follow the links below to the posts and leave your comments!