A lot of kids who are born with hearing impairments give up their basic human right to speak, even though they have fully developed vocal cords. Can we use design to change this?
I spent the last 1.5 years of my time at IIT Guwahati working for hard of hearing kids (kids who cannot hear properly), and it was the most rewarding time of my life.
In the process, I got the opportunity to assemble and work with a talented team, collaborate with the best speech & hearing institute in the country and even pitch to investors to bring my idea to the masses.
I even ended up publishing two research papers in the field!
The most important takeaway for me in this whole journey, however, was that I felt the happiest when I was able to optimise my life for impact creation and not any other metric.
Here is an account of my journey, hope you get inspired to help the community too 🙂
To explain the problem, I would like you to imagine this scenario:
You have your headphones on, your favourite rock song is playing at maximum volume, your mom is sitting on the couch, and she suddenly calls to you. You read her lips and figure out that she is asking you to turn off the TV, and you immediately reply to her with an affirmative.
She comes to you, pulls your headphones off and asks you to stop screaming.
You are confused, you didn’t scream, what is she talking about?
Well, what happened was that your headphones blocked your voice feedback, so you did not get to adjust your voice to the loudness you normally speak at.
Now imagine facing this issue all the time. Welcome to the world of the hard of hearing.
A voice is a human gift; it should be cherished and used, to utter fully human speech as possible. Powerlessness and silence go together.
Hard of hearing kids have improper speech due to a lack of voice feedback. As a result, a lot of them choose to stay silent even though they have properly developed vocal cords.
This hit me hard: these kids are deprived of their basic human capability to speak.
I immediately started digging to understand exactly how big a problem this is.
To tackle the problem of improper speech, the child is enrolled in speech therapy. Once the hearing implant is fitted into the child’s ear, she is put through multiple visits to a licensed speech therapist, spanning 2 to 6 years and costing about 500 rupees per session on average.
Here is the ground reality of the state of hard of hearing kids and speech therapy in India:
(Infographics: the number of hard of hearing kids in India, the ratio of speech therapists to hard of hearing kids, and the cost of speech therapy over the lifetime of the child.)
The situation of hard of hearing kids in India is in dire need of upliftment, with a major focus on improving their chances of completing school and finding employment.
Seeing this, I couldn’t sit back and take it anymore so I immediately got to work.
I wanted to build a product that would bring speech therapy to the child’s home, free of cost.
Here is the more elaborate problem statement:
Designing a product that enables hard of hearing kids aged 10–15 years with cochlear implants to improve their voice through carefully crafted training exercises and real-time feedback on speech, with minimal remote monitoring by the doctor.
Vocle is a free-to-use app that helps children train themselves remotely from the comfort of their homes.
I got to interact with the leading speech therapists in the country, multiple subject matter experts, and, most importantly, the hard of hearing kids themselves.
The more I spoke to the stakeholders, the more I got to refine the target audience, the problem statement and consequently the approach to my solution.
I personally feel that there is no single design process that fits all. Every problem is unique and hence every approach/process too is unique.
An important takeaway for me, one that a lot of design processes and case studies omit, is that when you are working on projects in real life, the problem statement itself is a constant work in progress.
I decided to document the unique interdisciplinary process that came into being and verbalised it into a simple framework that would help designers working in an interdisciplinary context to structure their approach better.
This framework has been accepted for publication in a Springer journal in January 2021.
To provide clarity to my approach, I wanted to get my fundamental design objectives for the product right.
To enable the child to train their voice without any external guidance.
To engage the child with playful gameplays and incentives.
To incorporate medically-certified training exercises for the child.
To provide real-time feedback on the child’s current intonation.
Inspired by the Lean methodology, I kept the ethos of Build • Validate • Learn as the cornerstone of my approach.
Here are a few snapshots that will help you understand the product.
The basic foundational scripts were coded in Python using a library called Praat-Parselmouth.
The functions from this library were called to extract the frequency and intensity curves of the child’s speech, hereafter referred to as the test curves. Prerecorded reference files of the sentences used for training were also run through the script, and their corresponding frequency and intensity curves were recorded, hereafter called the control curves.
The first step was to extract an array of values corresponding to regular intervals on the frequency and intensity curves. This was done on both the control curves and the test curves. The comparator measures the difference between the two audio files using these values.
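The sampling step above can be sketched as follows. This is a minimal pure-Python illustration, not the actual script: the function name and the choice of evenly spaced indices are my assumptions (in the real pipeline, the curve values would come from the Praat-Parselmouth extraction described above).

```python
def sample_at_intervals(curve, n_points):
    """Pick n_points values at (approximately) evenly spaced positions
    along a curve, so two curves can later be compared point by point."""
    if n_points < 2 or len(curve) < 2:
        return list(curve[:n_points])
    step = (len(curve) - 1) / (n_points - 1)
    return [curve[round(i * step)] for i in range(n_points)]

# Stand-in for a real pitch contour extracted from an audio file:
control_curve = [float(v) for v in range(100)]
control = sample_at_intervals(control_curve, 5)
```

Running the same sampler over both the control and test curves yields two arrays of equal length, which is what the comparator needs.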
The next step was to normalise the time taken to speak the same sentence. In reality, different people speak at different speeds, and if the speech were not normalised to the same speed, comparing the two curves would lead to inaccurate results. Normalisation is done using Dynamic Time Warping (DTW). The algorithm reads through the entire array of values from the two curves, finds similarities, and shifts the values with respect to time so that the two paths coincide. DTW is applied to both the frequency and intensity curves.
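Dynamic Time Warping can be sketched with the textbook dynamic-programming formulation below. This is an illustrative implementation under my own assumptions, not the project's actual code (which may well use a library implementation):

```python
def dtw_distance(a, b):
    """Dynamic Time Warping: minimal cumulative distance between two
    sequences, allowing values to be stretched or compressed in time."""
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# A slower rendition of the same pitch contour aligns with zero penalty:
dtw_distance([110, 120, 130], [110, 120, 120, 130])  # → 0.0
```

The key property is visible in the example: repeating a value in time (speaking more slowly) costs nothing, so only genuine differences in the shape of the curves contribute to the distance.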
The final step in the process of comparison is for the script to run through every value at the predetermined intervals of time and measure the difference in values. Once the differences are computed, a cumulative score for each curve is generated. This score corresponds to the amount of deviation between the two curves and is directly proportional to how dissimilar the two audio files are.
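The scoring step can be sketched as a cumulative deviation over the time-aligned curves. Again, this is a minimal illustration under my own assumptions (the function name and the sum-of-absolute-differences metric are not taken from the post):

```python
def deviation_score(control, test):
    """Cumulative absolute deviation between two time-aligned curves
    sampled at the same regular intervals; 0 means a perfect match."""
    if len(control) != len(test):
        raise ValueError("curves must be time-aligned to the same length")
    return sum(abs(c - t) for c, t in zip(control, test))

# One score per curve type (frequency, intensity); the total grows as
# the child's attempt drifts away from the reference recording:
freq_score = deviation_score([220.0, 230.0, 225.0],
                             [218.0, 235.0, 224.0])  # → 8.0
```

A lower score means the child's speech matched the reference more closely, which is what drives the real-time feedback shown in the app.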
Testing and Validation
I approached the experts at the All India Institute of Speech and Hearing (AIISH), the leading speech and hearing institute in the country, to help test the application with patients in the target age group.
They were kind enough to give me access to 7 therapists and speech experts, who began testing the application and analysing the various flows involved. The therapists then got in touch with their patients to administer the lessons and see how the children perform and whether the feedback is appropriate.
I’ve saved the most important bit for last. None of this would have been possible without my super-talented, eclectic team, who worked tirelessly to help achieve the vision of helping these kids lead better lives.
My professor, Sharmishtha Bannerjee, was constantly there to guide me throughout the process, both as a mentor and as a friend.
The good folks at AIISH and everyone else who has been a part of this but hasn’t been mentioned.
And last but most importantly, my partner Sree Mahit, without whom I would not have even dared to attempt this journey.
Designing to give hard of hearing kids their voice back was originally published in UX Planet on Medium, where people are continuing the conversation by highlighting and responding to this story.