How to build a Google Meet transcription bot with Python, React and Gladia API
Published on Jul 25, 2023
In today's fast-paced world, effective communication and collaboration are essential. Tools like Google Meet have revolutionized how we connect and conduct meetings remotely. However, it can be very challenging to keep track of all action items and key insights shared during long meetings.
Anyone who has used Google Meet's native transcription knows that relying on Google alone is not an option: the quality is poor, and processing takes around 30 minutes on average.
One possible solution is building a custom Google Meet transcription bot that will transcribe and summarize the meetings for you automatically!
In this tutorial, we explain how to build a smart summary bot for Google Meet using Python, React, and the Gladia speech-to-text API. The bot records your virtual meetings and transcribes them with high-quality speech-to-text AI, so the transcript can later be summarized with a tool like ChatGPT.
Here's what you'll do
Build the backend with Python
Create the frontend with React
Integrate Python, React, and Gladia Speech-to-Text API
Prerequisites
Before we dive into the implementation, let's ensure we have the necessary tools and knowledge.
Install Python on your system to create the bot's backend.
Familiarise yourself with React, needed to create the bot's user interface.
Step 1: Build the backend with Python
from selenium import webdriver
import pyaudio
import wave

# Set up Selenium WebDriver and open the meeting (replace with your meeting URL)
driver = webdriver.Chrome()
driver.get("https://meet.google.com/meeting-url")

# Record the meeting audio from the default input device and save it as a WAV file
def record_audio():
    CHUNK = 1024               # Frames read per buffer
    FORMAT = pyaudio.paInt16   # 16-bit samples
    CHANNELS = 1               # Mono
    RATE = 16000               # Sample rate in Hz
    RECORD_SECONDS = 600       # Adjust as per your meeting duration
    WAVE_OUTPUT_FILENAME = "meeting-recording.wav"

    audio = pyaudio.PyAudio()
    stream = audio.open(format=FORMAT, channels=CHANNELS,
                        rate=RATE, input=True,
                        frames_per_buffer=CHUNK)
    frames = []

    print("Recording started...")
    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        frames.append(data)
    print("Recording finished.")

    # Release the audio device
    stream.stop_stream()
    stream.close()
    audio.terminate()

    # Write the captured frames to disk
    with wave.open(WAVE_OUTPUT_FILENAME, 'wb') as wave_file:
        wave_file.setnchannels(CHANNELS)
        wave_file.setsampwidth(audio.get_sample_size(FORMAT))
        wave_file.setframerate(RATE)
        wave_file.writeframes(b''.join(frames))

record_audio()
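Before uploading, it can help to confirm that the recording has the expected format and length. The sketch below is one way to do that with Python's standard wave module; the helper name describe_wav is hypothetical and not part of the original script. With the settings above (16,000 Hz, mono, 16-bit samples), ten minutes of audio is 16,000 × 2 × 600 = 19,200,000 bytes of sample data, roughly 19 MB.

```python
import wave


def describe_wav(path):
    """Return basic properties of a WAV file (hypothetical helper for illustration)."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()
        return {
            "channels": channels,
            "sample_width_bytes": sample_width,
            "rate_hz": rate,
            "duration_seconds": frames / rate,
            # Size of the raw sample data, excluding the WAV header
            "data_bytes": frames * channels * sample_width,
        }


# Usage, assuming record_audio() has already written the file:
# info = describe_wav("meeting-recording.wav")
# print(info)
```

If the reported duration is much shorter than the meeting, the recording loop probably ended early or read from the wrong input device.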
5. Transcribe the audio
Create a free Gladia account (10 hours per month) at app.gladia.io, get your API key, and paste it into the code below.
import os
import requests

# Authenticate with your Gladia API key
headers = {
    'x-gladia-key': 'YOUR_GLADIA_API_KEY',  # Paste your API key here
    'accept': 'application/json',
}

file_path = "meeting-recording.wav"
file_name, file_extension = os.path.splitext(file_path)  # ("meeting-recording", ".wav")

with open(file_path, 'rb') as f:  # Open the recording in binary mode
    files = {
        # Sending a local audio file as (filename: string, file: BufferedReader, fileMimeType: string)
        'audio': (file_name, f, 'audio/' + file_extension[1:]),
        # You can also send a URL for your audio file. Make sure it's a direct link and publicly accessible.
        # 'audio_url': (None, 'http://files.gladia.io/example/audio-transcription/split_infinity.wav'),
        # Then you can pass any other parameters you want. Please see: https://docs.gladia.io/reference/pre-recorded
        'toggle_diarization': (None, True),
    }
    print('- Sending request to Gladia API...')
    response = requests.post('https://api.gladia.io/audio/text/audio-transcription/', headers=headers, files=files)

if response.status_code == 200:
    print('- Request successful')
    result = response.json()
    print(result)
else:
    print('- Request failed')
    print(response.json())
print('- End of work')
# Add your OpenAI (or other LLM) summarization code here, using the transcript in result
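One way to prepare the summary step is to flatten the transcription response into speaker-labelled text and wrap it in a prompt for ChatGPT or another LLM. The sketch below assumes the response JSON contains a "prediction" list whose items carry "speaker" and "transcription" fields; check that against the JSON you actually receive. The helper names and the prompt wording are illustrative, and the LLM call itself is deliberately left out.

```python
def format_transcript(result):
    """Turn a transcription response into 'speaker: text' lines.

    Assumes result["prediction"] is a list of utterances with
    "speaker" and "transcription" keys. This is an assumption:
    verify it against the JSON printed by the transcription request.
    """
    lines = []
    for utterance in result.get("prediction", []):
        text = str(utterance.get("transcription", "")).strip()
        if not text:
            continue  # Skip empty segments
        speaker = utterance.get("speaker", "unknown")
        lines.append(f"{speaker}: {text}")
    return "\n".join(lines)


def build_summary_prompt(transcript, max_chars=12000):
    """Build a summarization prompt, truncating very long transcripts.

    max_chars is an arbitrary illustrative limit, not a model constraint.
    """
    return (
        "Summarize the following meeting transcript. "
        "List the key decisions and action items.\n\n"
        + transcript[:max_chars]
    )
```

The resulting prompt string can then be pasted into ChatGPT or sent through whichever LLM client you prefer.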
Step 2: Create the frontend with React
1. Set up a React project
npx create-react-app bot-ui
cd bot-ui
2. Design the user interface
Create your desired UI components and layout using React.
Step 3: Integrate Python, React, and Gladia Speech-to-Text API
1. Set up communication between the backend and frontend
In your Python Flask backend, create the following API endpoints:
from flask import Flask, jsonify

app = Flask(__name__)

def retrieve_summary_from_database():
    # Placeholder: replace with your own lookup (database, file, etc.)
    return "No summary available yet."

@app.route('/api/meeting-summary', methods=['GET'])
def get_meeting_summary():
    # Retrieve the summary from the database or file
    summary = retrieve_summary_from_database()
    return jsonify(summary=summary)

@app.route('/api/start-recording', methods=['POST'])
def start_recording():
    # Implement the code to start recording the meeting audio
    return jsonify(message='Recording started')

@app.route('/api/stop-recording', methods=['POST'])
def stop_recording():
    # Implement the code to stop recording the meeting audio
    return jsonify(message='Recording stopped')

if __name__ == '__main__':
    app.run()
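The start and stop endpoints need a recording that runs in the background and can be stopped on demand, which the fixed-duration loop in record_audio() does not allow. Below is a minimal sketch of one possible approach, using a background thread controlled by a threading.Event. The RecordingSession class and its read_chunk argument are hypothetical; in the real bot, read_chunk would wrap something like stream.read(CHUNK).

```python
import threading


class RecordingSession:
    """Collects audio chunks on a background thread until stop() is called."""

    def __init__(self, read_chunk):
        self._read_chunk = read_chunk  # Callable returning one chunk of bytes
        self._stop_event = threading.Event()
        self._frames = []
        self._thread = None

    def start(self):
        """Start recording. Returns False if a recording is already running."""
        if self._thread is not None and self._thread.is_alive():
            return False
        self._stop_event.clear()
        self._frames = []
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return True

    def _run(self):
        # Keep reading chunks until stop() sets the event
        while not self._stop_event.is_set():
            self._frames.append(self._read_chunk())

    def stop(self):
        """Stop recording and return everything captured so far as bytes."""
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join()
            self._thread = None
        return b"".join(self._frames)
```

In the Flask handlers, start_recording() could call session.start() on a shared session object, and stop_recording() could write the bytes returned by session.stop() to the WAV file before sending it for transcription.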
2. Trigger the recording functionality
In your React frontend, use a MeetingRecorder component that calls these endpoints to start and stop the meeting recording.
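Whatever the component looks like, it only needs to send POST requests to the two recording endpoints and a GET request for the summary. Before the UI exists, the same calls can be made from Python to check the backend. The sketch below assumes the Flask app is running locally on its default development address, http://127.0.0.1:5000; the helper names are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # Assumption: Flask's default development address


def build_request(path, method="GET"):
    """Create a request object for one of the backend endpoints."""
    data = b"" if method == "POST" else None  # Empty body for POST requests
    return urllib.request.Request(BASE_URL + path, data=data, method=method)


def call_backend(path, method="GET"):
    """Send the request and decode the JSON reply (the server must be running)."""
    with urllib.request.urlopen(build_request(path, method), timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Example session, with the Flask app running in another terminal:
# call_backend("/api/start-recording", method="POST")
# call_backend("/api/stop-recording", method="POST")
# call_backend("/api/meeting-summary")
```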
Building a custom Google Meet bot can significantly streamline the analysis of virtual meetings and improve productivity.
By automating meeting recording and speech-to-text transcription, this bot lets participants focus on the meeting content without worrying about extensive note-taking. Summarizing the transcript with ChatGPT or a similar tool then gives faster access to the most relevant takeaways and action points.
With Gladia's fast and accurate transcription, combined with the flexibility of Python and React, you can create an efficient bot that saves you time on recording, transcribing, and summarizing virtual meetings.