AudioMemo

AudioMemo is a native Android application that records audio, transcribes it using the OpenAI Whisper API, and generates AI-powered summaries. Built with modern Android development practices, it pairs a clean, intuitive UI with reliable background processing.

Features

Audio Recording: High-quality voice recording with a real-time audio waveform visualization.
Smart Chunking: Automatically splits long audio recordings into manageable 30-second chunks and saves them to local storage.
AI Transcription: Integrates with OpenAI's whisper-1 model to provide accurate transcriptions of recorded audio.
Intelligent Summarization: Uses OpenAI's gpt-4o-mini model to generate concise summaries from the transcriptions.
Local Storage: Securely stores audio chunks and transcriptions locally using Room Database.
Background Processing: Runs transcription and summarization tasks in the background using WorkManager.
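
As a rough illustration of the Smart Chunking feature, the sketch below computes the time ranges that 30-second chunking would produce for a recording of a given length. `ChunkRange` and `chunkRanges` are hypothetical names chosen for this example and are not taken from the AudioMemo source; the real app presumably cuts chunks while recording rather than from a known total duration.

```kotlin
// Hypothetical helper for illustration only; not part of the AudioMemo code base.
data class ChunkRange(val index: Int, val startMs: Long, val endMs: Long)

// Splits a recording of totalDurationMs into consecutive ranges of at most chunkLengthMs.
fun chunkRanges(totalDurationMs: Long, chunkLengthMs: Long = 30_000L): List<ChunkRange> {
    require(totalDurationMs >= 0) { "duration must be non-negative" }
    require(chunkLengthMs > 0) { "chunk length must be positive" }
    val ranges = mutableListOf<ChunkRange>()
    var start = 0L
    while (start < totalDurationMs) {
        // The last chunk may be shorter than chunkLengthMs.
        val end = minOf(start + chunkLengthMs, totalDurationMs)
        ranges += ChunkRange(ranges.size, start, end)
        start = end
    }
    return ranges
}

fun main() {
    // A 75-second recording yields two full chunks and one 15-second remainder.
    chunkRanges(75_000L).forEach { println(it) }
}
```

Keeping each chunk short means a failed upload only needs to resend a small piece of audio instead of the whole recording.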

Tech Stack

The app is built with the following Android technologies and architecture patterns:

Language: Kotlin
UI Framework: Jetpack Compose (Material 3)
Architecture: MVVM (Model-View-ViewModel)
Concurrency: Coroutines & Flow
Dependency Injection: Dagger Hilt
Networking: Retrofit2 & OkHttp
JSON Serialization: Kotlinx Serialization
Local Database: Room
Background Work: WorkManager

Prerequisites

Android Studio: Latest version recommended (Giraffe or newer).
Minimum SDK: API 24 (Android 7.0)
Target SDK: API 34 (Android 14)
OpenAI API Key: Required for transcription and summarization features.

Getting Started

Clone the repository:

Permissions

Permission	Purpose
`RECORD_AUDIO`	Capture voice recordings (runtime permission, requested before recording starts)
`INTERNET`	Communicate with the OpenAI API
`FOREGROUND_SERVICE`	Keep the recording service alive in the background
`FOREGROUND_SERVICE_MICROPHONE`	Android 14 foreground service type for microphone access
`POST_NOTIFICATIONS`	Show the persistent recording notification (API 33+)
`READ_PHONE_STATE`	Detect incoming / active calls to pause recording
`MODIFY_AUDIO_SETTINGS`	Start Bluetooth SCO for headset microphone support
`BLUETOOTH`	Bluetooth headset detection (API ≤ 30)
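
Of the permissions listed above, `RECORD_AUDIO`, `READ_PHONE_STATE`, and (on API 33+) `POST_NOTIFICATIONS` are ones Android requires the user to grant at runtime; the rest are granted at install time. The sketch below shows one way to work out which runtime permissions to request for a given API level. `runtimePermissionsFor` is a hypothetical helper, and how AudioMemo actually sequences its permission prompts is an assumption here.

```kotlin
// Permission strings match the android.Manifest.permission constants.
const val RECORD_AUDIO = "android.permission.RECORD_AUDIO"
const val READ_PHONE_STATE = "android.permission.READ_PHONE_STATE"
const val POST_NOTIFICATIONS = "android.permission.POST_NOTIFICATIONS"

// Hypothetical helper: returns the runtime permissions to request on a device
// running the given API level (Build.VERSION.SDK_INT in a real app).
fun runtimePermissionsFor(sdkInt: Int): List<String> = buildList {
    add(RECORD_AUDIO)
    add(READ_PHONE_STATE)
    // Notifications only became a runtime permission in API 33 (Android 13).
    if (sdkInt >= 33) add(POST_NOTIFICATIONS)
}

fun main() {
    println(runtimePermissionsFor(24)) // minimum SDK
    println(runtimePermissionsFor(34)) // target SDK
}
```

In an app, the resulting list would typically be passed to an `ActivityResultContracts.RequestMultiplePermissions` launcher before recording starts.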

Interruption Handling

Scenario	Status	Mechanism	Behaviour
Process death	✅ Handled	`ChunkFinalizationWorker` (WorkManager)	A 15-second delayed WorkManager job is enqueued when recording starts. If the process dies before the user stops cleanly, the worker fires, marks all in-flight chunks `FAILED`, and re-queues `TranscriptRetryWorker` to retry uploads.
Network failure during upload	✅ Handled	`WhisperUploadWorker` + `TranscriptRetryWorker`	`WhisperUploadWorker` uses WorkManager's built-in retry with exponential back-off. `TranscriptRetryWorker` can also re-enqueue failed chunks on demand.
Long recordings	✅ Handled	30-sec
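
To give a feel for the exponential back-off mentioned for `WhisperUploadWorker`, the sketch below computes the delay before each retry attempt. It assumes WorkManager's default initial delay of 30 seconds and its 5-hour cap; whether AudioMemo overrides these values is not stated above, and `backoffDelayMs` is a hypothetical function, not a WorkManager API.

```kotlin
// Hypothetical helper for illustration; WorkManager computes this internally.
// Defaults assume WorkManager's 30-second initial back-off and 5-hour maximum.
fun backoffDelayMs(
    attempt: Int,
    initialDelayMs: Long = 30_000L,
    maxDelayMs: Long = 18_000_000L,
): Long {
    require(attempt >= 1) { "attempt numbers start at 1" }
    // The delay doubles with each attempt; the shift is bounded to avoid overflow.
    val shift = minOf(attempt - 1, 30)
    return minOf(initialDelayMs shl shift, maxDelayMs)
}

fun main() {
    // Prints the delay, in seconds, before retries 1 through 5.
    (1..5).forEach { attempt ->
        println("retry $attempt after ${backoffDelayMs(attempt) / 1000} s")
    }
}
```

With these assumed defaults the delays run 30 s, 60 s, 120 s, and so on, which keeps a flaky connection from triggering a burst of repeated uploads.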
