Hey Google, talk to Music Scales

2018-01-20

I’ve recently learned that creating your own Actions for Google Home (or Google Assistant in general) is really simple. If you know some JavaScript, you can easily build a powerful app.

So I had an idea to create a simple Action that helps musicians find intervals, scales and chords.
Being a musician myself, I thought it might be a great opportunity for me to:

  • Learn some new technologies (DialogFlow and Cloud Functions on Firebase)
  • Transfer music theory knowledge into code

Designing the app

Having been a front-end developer for years, I’ve always emphasized the importance of great user experience. That required close collaboration with UX designers, and sometimes I wore a designer’s hat myself in order to meet the customer’s needs.

But in this exercise we’re shifting from GUI-driven interaction towards voice interaction. That’s right, we’re replacing “GUI” with “VUI” - Voice User Interface. And for someone who has designed only for screens, it’s quite a challenge.

The best thing you can do is to get a piece of paper and design a simple dialog between the user and your app.
It’s much easier to think about how to pivot the conversation when you have a draft of an example dialog in front of you.

Example conversations

User: Hey Google, ask Music Scales for F mixolydian scale
Action: F mixolydian consists of the following notes: F – G – A – B♭ – C – D – E♭

User: Ask Music Scales what scale is a b c sharp d e f sharp g
Action: I believe it's a C♯ locrian, D major, E dorian, F♯ phrygian, G lydian, A mixolydian or B aeolian scale.

User: Tell Music Scales to find an interval between G and E flat
Action: The interval between G and E♭ is called a minor sixth. The distance between them is 8 semitones.

User: Tell Music Scales to find me the frequency of G sharp in the third octave
Action: G♯ in the 3rd octave has a frequency of 207.65 Hz and its MIDI number is 56

User: Ask Music Scales for a chord G sharp C D sharp F
Action: This is Fm7 or G♯M6
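
Out of curiosity, here’s roughly how the interval and frequency answers above can be computed. This is a minimal sketch in plain JavaScript, not the Action’s actual code, and for brevity it spells every note with sharps (so E♭ is looked up as D♯):

```javascript
// Semitone offsets from C for each pitch class (sharps only, for brevity).
const SEMITONES = { 'C': 0, 'C#': 1, 'D': 2, 'D#': 3, 'E': 4, 'F': 5,
                    'F#': 6, 'G': 7, 'G#': 8, 'A': 9, 'A#': 10, 'B': 11 };

// Interval names indexed by their distance in semitones (0–11).
const INTERVALS = ['perfect unison', 'minor second', 'major second',
                   'minor third', 'major third', 'perfect fourth',
                   'tritone', 'perfect fifth', 'minor sixth',
                   'major sixth', 'minor seventh', 'major seventh'];

// Ascending interval from one note to another.
function interval(from, to) {
  const semitones = (SEMITONES[to] - SEMITONES[from] + 12) % 12;
  return { semitones, name: INTERVALS[semitones] };
}

// MIDI number for a note in a given octave (middle C, C4, is MIDI 60).
function midiNumber(note, octave) {
  return 12 * (octave + 1) + SEMITONES[note];
}

// Equal temperament, tuned to A4 (MIDI 69) = 440 Hz.
function frequency(midi) {
  return 440 * Math.pow(2, (midi - 69) / 12);
}

console.log(interval('G', 'D#'));       // { semitones: 8, name: 'minor sixth' }
console.log(midiNumber('G#', 3));       // 56
console.log(frequency(56).toFixed(2));  // 207.65
```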

Development

I looked up the code from this repository to get started. As you can see, there’s a /functions/ folder that contains all the logic of your Agent; that’s the part that gets deployed to Firebase.
I also recommend reading through this great Codelab.
I had a chance to learn and experience it during Google Developer Days in Krakow in 2017.
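
If you haven’t deployed Cloud Functions before, the good news is that once the project is initialized, shipping your code comes down to a couple of Firebase CLI commands (the Codelab covers the exact steps):

```
npm install -g firebase-tools     # install the Firebase CLI
firebase login                    # authenticate with your Google account
firebase deploy --only functions  # deploy everything under /functions/
```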

The flow briefly explained:

  • User input is provided to Google Assistant (no matter if it’s voice or text)
  • The request is passed to DialogFlow (as text)
  • DialogFlow interprets the input and detects the intent
  • It parses the intent and extracts the entities (the data we’ll need in our back-end code)
  • A JSON request is sent to my webhook (a Cloud Function for Firebase)
  • My function receives the data and prepares the response (see the sketch below this list)
  • The response is sent back to DialogFlow (you can use SSML!)
  • DialogFlow passes the response back to Google Assistant
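
To make the webhook part concrete, here is a minimal sketch using the actions-on-google client library as it looked at the time of writing (the v1-style DialogflowApp). The intent and entity names (find_interval, note_from, note_to) are invented for illustration, not taken from the real Action:

```javascript
const functions = require('firebase-functions');
const { DialogflowApp } = require('actions-on-google');

// Semitone offsets from C, sharps only for brevity.
const SEMITONES = { 'C': 0, 'C#': 1, 'D': 2, 'D#': 3, 'E': 4, 'F': 5,
                    'F#': 6, 'G': 7, 'G#': 8, 'A': 9, 'A#': 10, 'B': 11 };
const INTERVALS = ['perfect unison', 'minor second', 'major second',
                   'minor third', 'major third', 'perfect fourth',
                   'tritone', 'perfect fifth', 'minor sixth',
                   'major sixth', 'minor seventh', 'major seventh'];

exports.musicScales = functions.https.onRequest((request, response) => {
  const app = new DialogflowApp({ request, response });

  // Handler for a hypothetical "find_interval" intent.
  const findInterval = (app) => {
    // The entities were already extracted by DialogFlow.
    const from = app.getArgument('note_from');
    const to = app.getArgument('note_to');
    const semitones = (SEMITONES[to] - SEMITONES[from] + 12) % 12;
    // SSML lets you add pauses, emphasis, etc. to the spoken response.
    app.tell(`<speak>The interval between ${from} and ${to} is a ` +
             `${INTERVALS[semitones]}. <break time="300ms"/>` +
             `That is ${semitones} semitones.</speak>`);
  };

  const actionMap = new Map();
  actionMap.set('find_interval', findInterval);
  app.handleRequest(actionMap);
});
```

Notice that the function never touches raw speech: DialogFlow has already matched the phrase to an intent and handed over clean, structured parameters.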

If you’re a JS dev, I really recommend trying this out! It’s super fun to code, and you can see (and hear) the results immediately.

Music Scales is published

After I submitted my Action for review, I got some feedback from the Actions on Google support team, which let me track down errors and fix bugs before the app was finally published.

Right now I’m looking forward to translating it into more languages, as well as introducing some enhancements (like dynamically drawn music scores).

If you would like to test my app, simply ask one of the example questions above.
The app is constantly learning, so some phrases may not be caught properly :)

Also check out Music Scales on Google Assistant’s Explore page.

Happy hacking!

