Each message carries the number of times each emoji appeared on Twitter since the previous message. After a few transformations, we got a stream of hexadecimal Unicode values, one event per emoji occurrence. E.g. for {"1F607":1,"2764":2} we produce three events: "1F607", "2764", "2764". This is how we achieved it:
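
A minimal sketch of that expansion step, using plain JDK streams rather than a reactive pipeline. The class and method names (`EmojiEvents`, `expand`) are illustrative assumptions:

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library.
class EmojiEvents {

    // Repeats each hexadecimal key as many times as its count says,
    // e.g. {"1F607": 1, "2764": 2} becomes ["1F607", "2764", "2764"].
    static List<String> expand(Map<String, Integer> counts) {
        return counts.entrySet().stream()
                .flatMap(e -> Collections.nCopies(e.getValue(), e.getKey()).stream())
                .collect(Collectors.toList());
    }
}
```

The order of the resulting events follows the iteration order of the input map, so a `LinkedHashMap` keeps them in the order they appeared in the message.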

Then it exploded for real with: NumberFormatException: For input string: "1F1F5-1F1F1". It turns out some emojis are bigger than others. For example, two individual characters, 🇵 and 🇱, when placed next to each other (🇵🇱) may be rendered as a flag. The Polish flag in this case. A single emoji formed from two emojis. Such a value is not a single hexadecimal number, so parsing it as one fails. We need to enhance our parsing logic by parsing each hexadecimal number separated by a dash (-) individually and concatenating the resulting characters. To be honest I started with something quite complex:

Maybe not as impressive, but I like it more. A few more test cases and we are good to go:

'1F1F5-1F1F1' || '🇵🇱'
'1F1FA-1F1E6' || '🇺🇦'
'1F1FA-1F1F8' || '🇺🇸'
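
The dash-separated parsing described above could look roughly like this. It is only a sketch that assumes every dash-separated part is a valid hexadecimal code point; the names `EmojiParser` and `hexToEmoji` are hypothetical:

```java
// Hypothetical helper, not part of any library.
class EmojiParser {

    // Parses each dash-separated hexadecimal code point on its own
    // and appends the resulting character(s) to a single string.
    static String hexToEmoji(String hex) {
        StringBuilder result = new StringBuilder();
        for (String part : hex.split("-")) {
            // appendCodePoint handles code points above U+FFFF,
            // which need two Java chars (a surrogate pair).
            result.appendCodePoint(Integer.parseInt(part, 16));
        }
        return result.toString();
    }
}
```

With this, `hexToEmoji("1F1F5-1F1F1")` returns the two regional indicator symbols next to each other, which renderers with flag support display as 🇵🇱. A value without any dash, such as "2764", goes through the same loop exactly once.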

OK, we are finally ready to aggregate individual events. We must somehow turn individual emojis into some sort of histogram (occurrence map). Basically, we want a Map<String, Long> of all emojis seen since the very beginning. The worst way to do this is to use global, mutable state:

Within 5 seconds the 😂 emoji was sent to Twitter 75 times! So why is this solution bad? Modifying global mutable state from within your reactive stream inevitably leads to race conditions and problems with synchronization. A much better solution is to aggregate events within the stream itself. It's a bit mind-bending. Basically, we turn a stream of individual events into a stream of gradually built aggregates. Every event is applied to our histogram and the updated histogram is passed further downstream. Look:
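
A minimal sketch of this running aggregation, written with plain JDK collections instead of a reactive operator. The names `Histograms`, `increment` and `scan` are illustrative assumptions:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any library.
class Histograms {

    // Returns a new histogram with the given emoji counted once more.
    // The input map is never modified, so no shared state is mutated.
    static Map<String, Long> increment(Map<String, Long> histogram, String emoji) {
        Map<String, Long> copy = new HashMap<>(histogram);
        copy.merge(emoji, 1L, Long::sum); // add new entry or bump existing one
        return Collections.unmodifiableMap(copy);
    }

    // Emits one histogram snapshot per incoming event, each one built
    // from the previous snapshot plus the current event.
    static List<Map<String, Long>> scan(List<String> events) {
        List<Map<String, Long>> snapshots = new ArrayList<>();
        Map<String, Long> current = Collections.emptyMap();
        for (String emoji : events) {
            current = increment(current, emoji);
            snapshots.add(current);
        }
        return snapshots;
    }
}
```

Copying the map on every event costs time proportional to its size. That is a deliberate trade-off in this sketch: each snapshot is immutable and safe to hand downstream, and the map stays small because the set of emojis is bounded.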

Notice how each individual emoji is either added to the map or increments an existing entry. Theoretically, the occurrence map (histogram) can grow quite large. However, the number of different emojis is fixed and not that large (2666 as of this writing). Now we'd like to find the top 50 emojis: the 50 map entries with the highest occurrence counts. This can easily be done with the JDK 8 Stream API:
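
One way to sketch the top-N selection with the Stream API. The class and method names (`TopEmojis`, `top`) and the choice of a `LinkedHashMap` for the result are assumptions made for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library.
class TopEmojis {

    // Keeps the `limit` entries with the highest counts,
    // ordered from most to least frequent.
    static Map<String, Long> top(Map<String, Long> histogram, int limit) {
        return histogram.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .limit(limit)
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (a, b) -> a,            // keys are unique, never invoked
                        LinkedHashMap::new));   // preserves the sorted order
    }
}
```

Calling `TopEmojis.top(histogram, 50)` on every snapshot sorts the whole map each time, which is acceptable here because the histogram never holds more than a few thousand entries.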

You might think this and the previous article aren't very practical. On the surface, yes. But we learned a few techniques that can be really valuable when dealing with real streams of data. Also, producing and consuming an SSE stream is the easiest way to enable a streaming architecture in your ecosystem.