How to Play AMR Files On the Browser

I’m building a web app where:

  • users upload an audio file to blob storage
  • on upload completion, file url is sent to a speech-to-text model
  • the audio transcription (text) is then sent to an LLM for content analysis.
  • the LLM returns the analyzed content as a structured data object.

The structured data is then presented to the user on a page that also includes an audio component where they can listen to the original audio.

The challenge I’m running into is that many voice memos are saved in amr format. (This is the standard format for iMessage and WeChat voice memos.) The speech-to-text model works with amr just fine, but when I get a URL to an amr file in storage, I can’t play that file in the browser, because the standard HTML audio element does not support amr playback on the client side.

In my particular application, I get a signed audio url from my storage provider and then pass this url to the audio element:

const audioUrl = await getSignedAudioUrl(note.user_id, note.audio_storage_url)

...

<>
    <div>Audio</div>
    <audio controls>
        <source src={audioUrl} />
    </audio>
</>

If the audio is saved in storage in mp3 format, or any other format supported by the HTML audio element, the audio element loads with the source pointing to the signed audio url and the audio plays with no problem at all.

But if the audio is saved in amr format, the audio element is unable to play the file and the audio element just displays an unclickable play button. To the user, this just looks like there is no audio file at all.
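
One way to confirm this behavior at runtime is the audio element's `canPlayType` method, which returns an empty string when the browser knows it cannot play a given MIME type. The helper below is a minimal sketch, not part of the original app: the `PlayabilityProbe` interface and the `canProbablyPlay` name are made up for illustration, and `audio/amr` is assumed to be the MIME type the stored files would be served with.

```typescript
// Minimal shape of what we need from an <audio> element, so the helper can
// also be exercised outside a browser. (Hypothetical name, for illustration.)
interface PlayabilityProbe {
  canPlayType(mimeType: string): string; // "", "maybe", or "probably"
}

// Returns true unless the browser reports that it cannot play the type.
function canProbablyPlay(probe: PlayabilityProbe, mimeType: string): boolean {
  return probe.canPlayType(mimeType) !== '';
}

// Browser usage (not executed here):
//   const playable = canProbablyPlay(document.createElement('audio'), 'audio/amr');
```

A check like this only reports what the browser claims, so it is best treated as a hint rather than a guarantee.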

The likelihood of the user actually playing this audio back is relatively low for this particular application. The web app already transcribes the audio and then fills out a form with the data extracted from the transcript. The only scenario where the user would listen to the audio is if there was an issue with the transcription and they wanted to double-check it. Since the transcription works exceptionally well, this is relatively rare. Still, there is a possibility that the user would want to review the audio, and in the current implementation it is not clear to them what the problem is or why they are unable to play it back: they just see a nonfunctional audio element.

There are a few options for addressing this issue:

  1. Convert File During Upload
  • Convert amr files to mp3 format on the server side when the user does the initial file upload.
  • Drawbacks: Adds processing time during upload. If the user never listens to the audio file, the conversion is unnecessary and just adds overhead and potential maintenance issues.
  2. Convert File Using Database Function
  • Keep the current upload process in place so that the amr file url is sent to the speech-to-text transcription service as soon as possible, but set up a database function that eventually converts the amr file to an mp3 file. When the user views the page for this particular item, they are then served the link to the mp3 file.
  • Drawbacks: Database functions should typically be used to maintain data integrity, and file type conversion should typically happen before files are inserted into storage. Doing the conversion in a database function trigger could be challenging to debug. Additionally, it is preferable to keep as much of the file handling logic as possible in the primary application code. In this case, since I am using Supabase for my file storage, the code for this conversion would live separately from the rest of my project.
  3. Convert the File on the Client Side
  • Keep the file in amr format in storage and use a library that converts the amr file into an mp3 file when the user plays the file.
  • Drawbacks: Increases load on the client and introduces potential maintenance concerns. To reduce unnecessary load on the client, the audio element should be wrapped so that the conversion does not run as soon as the page loads. Remember, in this particular app the user is unlikely to listen to the audio each time they view the page. Instead, the user would first click a button indicating that they want to listen to the audio, and that click event would trigger the amr-to-mp3 conversion.
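
The "convert only after a click" idea in option 3 can be sketched as a small wrapper that defers an expensive async step until it is first requested and then reuses the result. This is an illustrative sketch, not code from the app: `createLazyLoader` is a made-up helper, and `decodeAmr` in the usage comment stands in for whichever conversion approach is eventually chosen.

```typescript
// Wraps an expensive async step so it runs at most once, and only on demand.
function createLazyLoader<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | null = null;
  return () => {
    if (pending === null) {
      pending = load().catch((error) => {
        pending = null; // forget the failed attempt so a later click can retry
        throw error;
      });
    }
    return pending;
  };
}

// Hypothetical usage in a click handler (decodeAmr is an assumed function):
//   const getPlayableAudio = createLazyLoader(() => decodeAmr(audioUrl));
//   playButton.addEventListener('click', async () => {
//     const audio = await getPlayableAudio();
//   });
```

Nothing is fetched or converted on page load with this shape, and repeated clicks share the same in-flight or completed result.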

Each of these options has considerable downsides. This is also an instance where we are solving for an edge case, so it may not make much sense to spend an extended amount of time finding and implementing a solution, especially if that solution is brittle or likely to require maintenance.

I decided to first implement a short-term solution before committing to any of the three options above. This quick fix simply checks if the audio file is in amr format. If it is, the user will see a “download file” button instead of an audio player. This doesn’t actually solve our issue in a way that allows the user to listen to the file seamlessly inside of the browser, but this approach does at least restore user access to audio files stored in amr format and removes any ambiguity for the user.

Implementing a Short-term Bridge Solution

I added the following code to check if the audio is in amr format:

const audioUrl = await getSignedAudioUrl(note.user_id, note.audio_storage_url)

let isAMR = false;

if (audioUrl) {
  const url = new URL(audioUrl);
  if (url.pathname.endsWith('.amr')) {
    isAMR = true;
  }
}
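
The same check can be pulled into a small helper that also tolerates uppercase extensions and strings that are not valid absolute URLs (`new URL` throws on those). This is a sketch with a made-up name, `isAmrUrl`, rather than the code used in the app.

```typescript
// Returns true when the URL's path ends in ".amr", ignoring case.
function isAmrUrl(rawUrl: string | null | undefined): boolean {
  if (!rawUrl) return false;
  try {
    // Only the path is inspected, so the signed URL's query string is ignored.
    return new URL(rawUrl).pathname.toLowerCase().endsWith('.amr');
  } catch {
    return false; // not a parseable absolute URL
  }
}
```

Checking `pathname` rather than the whole string matters for signed URLs, where a token or other query parameters follow the file name.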

And then I updated the client-side JSX to conditionally render the audio component:

<>
    <div>Audio</div>
    {isAMR ? (
        <div>
          <LinkButton href={audioUrl}>Download audio</LinkButton>
        </div>
    ) : (
        <audio controls>
          <source src={audioUrl} />
        </audio>
    )}
</>

If the audio is in amr format, the user will see a “Download audio” button instead of an audio player element. The assumption here is that if the user was able to record amr-formatted audio, they also have a player installed on their machine that can read this format.

This is not an excellent long-term solution for the user experience, but it at least stops rendering an audio element if we know the file type is unplayable in an audio element. This is also the kind of solution that we can ship immediately while we explore other options.

Actually Playing AMR Files on the Browser in a Next.js App

Ultimately, I decided against converting the files to mp3 format during the upload process and chose to figure out a way to play the amr files inside of the browser.

To do this, I used this web-amr library on npm: https://www.npmjs.com/package/web-amr

I created an AudioPlayer.tsx component:

'use client';

import React, { useEffect, useState } from 'react';
import { AMRPlayer, Player } from 'web-amr';

interface AudioPlayerProps {
  audioUrl: string;
}

const AudioPlayer: React.FC<AudioPlayerProps> = ({ audioUrl }) => {
  const [isAMR, setIsAMR] = useState<boolean>(false);
  const [player, setPlayer] = useState<Player | null>(null);
  const [currentTime, setCurrentTime] = useState<number>(0);
  const [duration, setDuration] = useState<number>(0);
  const [paused, setPaused] = useState<boolean>(true);

  useEffect(() => {
    let playerInstance: Player | null = null;

    const updateTimeHandler = () => {
      if (playerInstance !== null) {
        setCurrentTime(playerInstance.currentTime);
      }
    };

    const checkAndLoadAudio = async () => {
      const url = new URL(audioUrl);
      if (url.pathname.endsWith('.amr')) {
        setIsAMR(true);
        const res = await fetch(audioUrl);
        if (res.ok) {
          const buffer = await res.arrayBuffer();
          playerInstance = AMRPlayer(buffer);
          setPlayer(playerInstance);
          setDuration(playerInstance.duration);

          playerInstance.addEventListener('timeupdate', updateTimeHandler);
        }
      }
    };

    if (audioUrl) {
      checkAndLoadAudio();
    }

    return () => {
      if (playerInstance) {
        playerInstance.pause(); // Pause the audio playback
        playerInstance.removeEventListener('timeupdate', updateTimeHandler);
      }
    };
  }, [audioUrl]);

  const handlePlayPause = async () => {
    try {
      if (paused && player) {
        await player.play();
        setPaused(false);
      } else if (!paused && player) {
        await player.pause();
        setPaused(true);
      }
    } catch (error) {
      console.error('Playback error:', error);
    }
  };

  const handleSeek = (event: React.ChangeEvent<HTMLInputElement>) => {
    const newTime = parseFloat(event.target.value);
    if (player) {
      player.fastSeek(newTime);
      setCurrentTime(newTime);
    }
  };

  const formatTime = (seconds: number): string => {
    const minutes = Math.floor(seconds / 60);
    const remainingSeconds = Math.floor(seconds % 60);
    return `${minutes}:${remainingSeconds.toString().padStart(2, '0')}`;
  };

  return (
    <>
      {isAMR ? (
        <div className="flex h-16 flex-row items-center rounded-full bg-gray-100 px-6">
          <div
            className="flex w-12 cursor-pointer items-center justify-center align-middle"
            onClick={handlePlayPause}
          >
            {paused ? (
              '▶️'
            ) : (
              <svg
                xmlns="http://www.w3.org/2000/svg"
                height="24px"
                viewBox="0 0 24 24"
                width="24px"
                fill="#000000"
              >
                <path d="M0 0h24v24H0z" fill="none" />
                <path d="M6 19h4V5H6v14zm8-14v14h4V5h-4z" />
              </svg>
            )}
          </div>

          <div className="mx-2 flex flex-row items-center font-mono text-sm">
            <span className="w-[30px]">{formatTime(currentTime)}</span>
            <span className="pl-2 pr-1">/</span>
            <span className="w-[30px]">{formatTime(duration)}</span>
          </div>

          <input
            type="range"
            min="0"
            max={duration.toString()}
            value={currentTime.toString()}
            onChange={handleSeek}
            className="mx-2 w-full cursor-pointer accent-gray-900"
          />
        </div>
      ) : (
        <audio className="w-full" controls preload="metadata" src={audioUrl}>
          Your browser does not support the audio element.
        </audio>
      )}
    </>
  );
};

export default AudioPlayer;
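
One case the effect cleanup does not appear to cover is a fetch that resolves after the component has unmounted or after `audioUrl` has changed: at cleanup time `playerInstance` is still null, so the player created afterwards is never paused or detached. A common way to guard against this is a cancellation flag that the cleanup function flips. The sketch below shows the pattern in isolation; `runCancellable` is a made-up helper, not part of web-amr or React.

```typescript
// Starts an async task and returns a cancel function. If cancel() is called
// before the task settles, neither callback is invoked.
function runCancellable<T>(
  task: () => Promise<T>,
  onResult: (value: T) => void,
  onError: (error: unknown) => void = console.error,
): () => void {
  let cancelled = false;
  task().then(
    (value) => {
      if (!cancelled) onResult(value);
    },
    (error) => {
      if (!cancelled) onError(error);
    },
  );
  return () => {
    cancelled = true;
  };
}

// Hypothetical usage inside useEffect (loadPlayer is an assumed function that
// fetches the file and builds the player):
//   const cancel = runCancellable(() => loadPlayer(audioUrl), setPlayer);
//   return cancel;
```

With this shape, a late result is simply dropped instead of updating state on a component that no longer cares about it.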

Now I can import this AudioPlayer component into any of my pages that need to offer audio playback functionality:

import AudioPlayer from '@/app/components/AudioPlayer';
...


<AudioPlayer audioUrl={audioUrl} />

All of the logic that checks for the file type lives within the component itself. I just need to pass in the audioUrl. If the audio file is in AMR format, the component will automatically render as an AMR player. Otherwise, the component will render a standard HTML audio element.

Thank you to Sean Wu (S1’24) and Seamus Edson (S1’24) for pairing with me at the Recurse Center on this issue!

https://www.npmjs.com/package/web-amr

https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API

https://stackoverflow.com/questions/64926174/module-not-found-cant-resolve-fs-in-next-js-application