Chances are that you listened to some kind of internet radio station today. That could have been through your browser at work, or maybe even on your interactive TV at home.
There is one thing all these internet radios have in common, no matter what you listen on: they are probably powered by either Shoutcast or Icecast.
Back in 2016 I operated an online radio station called ParalakeFM, made for an internet gaming community called PERPHeads. The station was most often played through the in-game radio, and I still look back fondly on the times when 70+ players tuned in to enjoy some music with me or my fellow broadcasters.
One major annoyance when running this station was that the Shoutcast software was closed source and lacked some features. For example, there was no good way to get up-to-date track information to your users without building something yourself, completely isolated from the server software, even though the server already had the track information ready to go.
So I set out on a programming endeavour to roll my own server software that I could customize the ever-living crap out of, and open sourced it. This project still sits on my GitHub profile to this day: https://github.com/coderiekelt/opencast
The Shoutcast protocol
One challenge was reverse engineering the Shoutcast protocol and relaying the streamed data to people who tuned into the radio station. So I set out with Wireshark and, if I am to be quite honest, made a bit of a startling discovery.
Shoutcast has a field in which it grows its fucks, and behold, it was barren. The legacy Shoutcast protocol couldn't care less about what you send it. It could be a properly encoded song, or perhaps you wanted to write a bytestream with the entire Bee Movie script in it. Shoutcast does not care: it forwards the data and proceeds to batter the listener's ear if the client does not have some validation built in.
So how does the client know it's listening to a Shoutcast server then? Well, that's easy: through headers! When you tune into a Shoutcast server, the response looks a bit like this:
Icy-MetaData: 0\r\n
icy-br: 0\r\n
icy-genre: Pop\r\n
icy-name: ParalakeFM\r\n
icy-url: http://paralakefm.com\r\n
icy-pub: 0\r\n
icy-notice1: Notice one!\r\n
icy-notice2: Notice two!\r\n
Finally, the server sends the client one more line ending to indicate that it is going to be sending data now. (This is only required for some clients: Chrome will crash if you don't, while VLC handles it just fine.)
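To make the header exchange a bit more concrete, here is a minimal Python sketch of how a server might assemble that block. The function name `build_listener_headers` and the dictionary layout are assumptions made up for illustration; they are not taken from opencast or from Shoutcast itself, and only the header fields shown above are covered.

```python
CRLF = "\r\n"


def build_listener_headers(fields):
    """Serialize header fields as 'name: value' lines, each ending in CRLF.

    One extra bare CRLF is appended to signal that stream data follows.
    """
    lines = [f"{name}: {value}{CRLF}" for name, value in fields.items()]
    return ("".join(lines) + CRLF).encode("latin-1")


# Hypothetical station details, mirroring the example above.
headers = build_listener_headers({
    "Icy-MetaData": 0,
    "icy-br": 0,
    "icy-genre": "Pop",
    "icy-name": "ParalakeFM",
    "icy-url": "http://paralakefm.com",
    "icy-pub": 0,
    "icy-notice1": "Notice one!",
    "icy-notice2": "Notice two!",
})
```

The resulting bytes would be written to the listener's socket before any audio data. Forgetting the trailing empty line is exactly the kind of thing a forgiving client shrugs off and a strict one chokes on.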
After this, the server will send you whatever the DSP (the broadcaster) sends to it. This is usually MPEG data in chunks of 2048 bytes. If the Icy-MetaData header is set to 1, metadata about the current stream is expected, followed by a line ending to indicate that a new set of audio data is inbound. It should be noted that this is fine for native clients, or clients where you have access to the stream to read the metadata. Not so much for web applications.
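The forwarding itself can be sketched in a few lines of Python. This is only an illustration under some assumptions: `iter_chunks` and `relay` are hypothetical names, in-memory buffers stand in for the real network sockets, and metadata handling is left out entirely.

```python
import io

CHUNK_SIZE = 2048  # the chunk size mentioned above


def iter_chunks(source, size=CHUNK_SIZE):
    """Yield blocks of up to `size` bytes until the source is exhausted."""
    while True:
        chunk = source.read(size)
        if not chunk:
            break
        yield chunk


def relay(source, listeners, size=CHUNK_SIZE):
    """Forward every chunk to each listener unmodified; return bytes relayed."""
    total = 0
    for chunk in iter_chunks(source, size):
        for listener in listeners:
            listener.write(chunk)  # no validation, just like legacy Shoutcast
        total += len(chunk)
    return total


# Stand-ins for a broadcaster connection and two listener connections.
broadcaster = io.BytesIO(b"\xff" * 5000)
listener_a, listener_b = io.BytesIO(), io.BytesIO()
sent = relay(broadcaster, [listener_a, listener_b])
```

Note that nothing here inspects the bytes. Whatever comes in goes straight back out, which is the whole point of the rant above.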
Broadcasting to Shoutcast is equally easy: if a password is configured, you simply send it followed by a line ending and wait for Shoutcast to respond with OK2 (anything else is a failure).
A sample response from the server would be:
OK2\r\n
icy-caps:11\r\n
\r\n
This time, however, after receiving the final line ending you are expected to take full control of the stream by sending audio and metadata in properly sized chunks.
That is basically all there is to it.
If you'd like to take a look at my implementation, feel free to look around in the GitHub repository!
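From the broadcaster's side, the handshake described above could look roughly like this in Python. The names `build_login` and `parse_login_response` are hypothetical, and the sketch only handles the password and the OK2 check; it makes no attempt to interpret fields such as icy-caps.

```python
def build_login(password):
    """The broadcaster's opening message: the password plus a line ending."""
    return password.encode("latin-1") + b"\r\n"


def parse_login_response(raw):
    """Return the server's header fields, or raise if the reply is not OK2."""
    # Everything before the blank line is the status line plus header fields.
    head, _, _ = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    if lines[0].strip() != "OK2":
        # Anything other than OK2 counts as a failure.
        raise PermissionError(f"login rejected: {lines[0]!r}")
    fields = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip().lower()] = value.strip()
    return fields


# Using the sample response shown above.
fields = parse_login_response(b"OK2\r\nicy-caps:11\r\n\r\n")
```

Once `parse_login_response` returns without raising, the connection belongs to the broadcaster, and audio can be written to it in chunks.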