A music-aware AI large model that can generate lyrics, chords, beats, and melodies

Lamucal is a Transformer-based hybrid multimodal model that automatically generates a song's chords, beats, lyrics, and more. It supports multiple languages and provides score editing and playback. The model represents a significant breakthrough in the music field and illustrates the revolutionary impact AI can have on an industry.

AI can indeed change the world. So what can AI do for a music enthusiast? Quite a lot already, and Lamucal, the large model introduced here, is a good example. Lamucal is a Transformer-based hybrid multimodal model built by professionals. It uses several Transformer models to tackle specialized problems in music information retrieval, with the models exchanging interdependent information among themselves. It is an AI-driven multimodal project focused on music that can automatically generate chords, beats, lyrics, melodies, and labels for any song.

Technical architecture

The underlying technical architecture is as follows:

A U-Net separates the original audio source into stems, which feed a set of task-specific networks: Pitch-Net, Beat-Net, Chord-Net, and Segment-Net, all of which are Transformer-based models. Beyond modeling the correlation between frequency and time, the key design point is the interaction between the different networks.
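The data flow described above can be sketched as follows. This is a minimal, hypothetical illustration: the class names come from the article, but their interfaces are assumptions, and each network is stubbed with a trivial placeholder so that only the pipeline structure, including one network (Chord-Net) consuming another's output (Beat-Net's), is visible. It is not Lamucal's actual API.

```python
import numpy as np

# NOTE: all classes and signatures here are hypothetical stubs, not
# Lamucal's real implementation.

class UNetSeparator:
    """Stub source separator: splits a mixture into vocal and accompaniment stems."""
    def separate(self, audio: np.ndarray) -> dict:
        # A real U-Net would predict spectrogram masks; here we just split energy.
        return {"vocals": audio * 0.5, "accompaniment": audio * 0.5}

class BeatNet:
    """Stub beat tracker: returns a fixed tempo and beat grid."""
    def predict(self, stem: np.ndarray) -> dict:
        return {"bpm": 120.0, "beats": [0.0, 0.5, 1.0, 1.5]}

class ChordNet:
    """Stub chord recognizer; conditioning on beat positions illustrates
    the inter-network interaction the article describes."""
    def predict(self, stem: np.ndarray, beats: list) -> list:
        return [{"time": t, "chord": "C"} for t in beats]

def analyze(audio: np.ndarray) -> dict:
    stems = UNetSeparator().separate(audio)
    beat_info = BeatNet().predict(stems["accompaniment"])
    chords = ChordNet().predict(stems["accompaniment"], beat_info["beats"])
    return {"beats": beat_info, "chords": chords}

result = analyze(np.zeros(44100))  # one second of silence at 44.1 kHz
```

The point of the sketch is the wiring, not the models: each downstream network receives both a separated stem and the predictions of earlier networks.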

Functionality

  • Chords: chord detection covering major, minor, 7, maj7, min7, 6, m6, sus2, sus4, 5, and inverted chords, plus detection of the song's key.

  • Beats: beat detection and tempo (BPM) tracking.

  • Pitch: tracks the pitch of the melody in the track.

  • Musical structure: segment boundaries and labels, including intro, verse, chorus, bridge, etc.

  • Lyrics: lyric recognition and automatic lyrics-to-audio alignment. ASR (Whisper) recognizes the lyrics on the vocal track, and alignment is achieved by fine-tuning wav2vec2 pre-trained models. Currently supports dozens of languages, including English, Chinese, Spanish, Portuguese, Russian, Japanese, Korean, and Arabic.

  • AI Tabs: generates playable sheet music from the chords, beats, musical structure, lyrics, rhythm, and so on, including chord charts and six-line guitar tablature, with support for editing chords, rhythms, and lyrics.

  • Others: stem separation, tempo adjustment, pitch shifting, etc.
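To make the chord-detection task above concrete, here is a classic baseline for it: matching a 12-bin chroma vector against major and minor triad templates. This is an illustrative sketch only; Lamucal's Chord-Net is Transformer-based and certainly more sophisticated, but template matching shows what "chord detection" computes at its simplest.

```python
import numpy as np

# Illustrative baseline, not Lamucal's actual algorithm: classify a chord
# from a 12-bin chroma vector (energy per pitch class C..B) by correlating
# it with major/minor triad templates.

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def triad_template(root: int, minor: bool = False) -> np.ndarray:
    """Binary 12-bin template for a major or minor triad on the given root."""
    tpl = np.zeros(12)
    third = 3 if minor else 4
    for interval in (0, third, 7):  # root, third, fifth
        tpl[(root + interval) % 12] = 1.0
    return tpl

def classify_chord(chroma: np.ndarray) -> str:
    """Return the triad label whose template best correlates with the chroma."""
    best, best_score = "", -np.inf
    for root in range(12):
        for minor in (False, True):
            score = float(chroma @ triad_template(root, minor))
            if score > best_score:
                best = NOTES[root] + ("m" if minor else "")
                best_score = score
    return best

# Chroma with energy on A, C, and E -> A minor
chroma = np.zeros(12)
chroma[[9, 0, 4]] = 1.0
print(classify_chord(chroma))  # -> Am
```

A production system would extend the template set to sevenths, suspended chords, and inversions (the vocabulary listed above) and smooth predictions over beat-aligned frames.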

Trying the model

Simply search for a song by name in the search box — for example, Ed Sheeran's "Perfect".

Selecting the matching song from the results (YouTube sources) automatically launches the AI model to generate the various musical elements, with an option to generate lyrics.

Finally, a comprehensive interface appears. The Melody and Tabs tabs display the relevant content, and a selector in the upper-right corner lets you choose an instrument, such as guitar or piano, to generate sheet music for that instrument. On the far right are the play button, tempo button, and other function buttons (some require downloading the app).

I also tried other songs:

Lyrics and sheet music stay synchronized with playback, so even listeners who can't read sheet music or play an instrument can use it like karaoke to follow the lyrics.

Finally, I tried the song "Kill Bill": play it, and the lyrics keep up with the tune as well!

Summary

AI changes the world mainly through revolutionary innovation in industries and professional fields, not just through chat, image generation, or video generation. The model described here is a breakthrough in the music field, which is genuinely striking. I hope similar models in every professional field can blossom, and that we can welcome the spring of AI together!