Where should I begin?
I am trying to build a real-time stem-split music visualizer for VJing and the like. 
 What sets this apart is that I would like to split the input audio stream into its stems (either algorithmically or using something like Spleeter) and then use each stem data to control different aspects of the visualization.
For example:
- The isolated drums to play a BPM-synced video.
- I'm hoping to achieve this by making a short looping video at a fixed BPM (say, 60) and then by detecting the BPM of the stream, adjust the playback speed of the video so that the video is in sync.
 
- The isolated synth stream could control DMX lights.
- I want to try and encode this data in, say, the last row of pixels in the above video. By reading the colour, intensity, and movement data from the pixels, moves and timings could be read and sent to the lights in real-time. I'm doing this so that the user can encode all the data needed for a scene into one video file.
 
- The isolated vocals could be synced and displayed on screen using MusixMatch.
- The isolated bassline could be parsed into MIDI data and visualized on screen.
- All of the above can be controlled live.
Now the problem is that I am relatively inexperienced with programming. I am not sure where to start. Which language to use, which IDE, how to display visuals, how to interact with audio input streams, how to use DMX and how to visualize MIDI data. I know this is currently quite a bit out of my depth but I'll manage with the right resources. Please give me some advice on where to begin for a project like this.