Building an Audiobook Processor: Splitting and Converting M4B Files for Car Audio

Create a Python script to split audiobooks into manageable chunks and convert them to MP3 format for better compatibility with car audio systems

Learn how to build a Python script that processes M4B audiobook files into smaller, more manageable MP3 chunks while preserving audio quality and providing flexible customization options.

The Challenge: Car Audio and Audiobooks

Modern audiobooks often come in M4B format, which, while great for dedicated audiobook players, can present challenges on simpler audio systems like car stereos. I recently faced this exact problem: I had a lengthy audiobook I wanted to listen to during my commute, but my car’s audio system had two major limitations:

  1. No support for M4B format files
  2. No ability to fast-forward or rewind within tracks - only track skipping was supported

The solution? Create a script that could split the audiobook into smaller, more manageable chunks while converting them to a widely-supported format.

Technical Requirements

Before diving into the implementation, let’s outline our key requirements:

  1. Format Conversion: Convert M4B to MP3 format
  2. Chunking: Split the audiobook into fixed-duration segments
  3. Quality Preservation: Maintain audio quality during conversion
  4. Flexibility: Allow customization of chunk size, audio quality, and playback speed
  5. Robustness: Handle errors gracefully and provide clear feedback

The Solution: FFmpeg-Powered Python Script

The solution leverages FFmpeg, a powerful multimedia framework, wrapped in a Python script for ease of use and flexibility. Here’s how we implemented each component:

1. Duration Detection

First, we need to determine the total duration of the audiobook:

import subprocess


def get_duration(input_file):
    """Get the duration of the audio file in seconds using ffprobe"""
    cmd = [
        'ffprobe',
        '-v', 'quiet',
        '-show_entries', 'format=duration',
        '-of', 'default=noprint_wrappers=1:nokey=1',
        input_file
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)

    # ffprobe prints the duration as a bare number; empty or non-numeric
    # output (such as "N/A") means the duration could not be determined
    try:
        return float(result.stdout.strip())
    except ValueError:
        return 0

This function uses ffprobe to extract the duration metadata from the input file, providing the foundation for our chunking calculations.
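As a variation on the function above, ffprobe can also emit JSON, which is a little easier to parse defensively. The sketch below is an assumption rather than part of the original script: parse_duration_json and probe_duration_json are hypothetical names, and it relies on ffprobe's JSON writer placing the duration as a string under format.duration.

```python
import json
import subprocess


def parse_duration_json(ffprobe_output: str) -> float:
    """Extract format.duration from ffprobe JSON output; return 0.0 if unavailable."""
    try:
        data = json.loads(ffprobe_output)
        return float(data["format"]["duration"])
    except (ValueError, KeyError, TypeError):
        # Covers invalid JSON, a missing key, and a non-numeric duration
        return 0.0


def probe_duration_json(input_file: str) -> float:
    """Run ffprobe with JSON output and return the duration in seconds."""
    cmd = [
        "ffprobe", "-v", "quiet",
        "-show_entries", "format=duration",
        "-of", "json",
        input_file,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return parse_duration_json(result.stdout)
```

Keeping the parsing separate from the subprocess call makes the parsing logic easy to check without an audio file at hand.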

2. Core Processing Logic

The main processing function handles the chunking and conversion process:

def process_audiobook(
        input_file: str,
        output_dir: str,
        chunk_duration: int = 10,
        speed_factor: float = 1.05,
        start_chunk: int = 1,
        max_chunks: Optional[int] = None,  # Optional is imported from typing
        audio_quality: str = "192",
        sample_rate: str = "44100",
) -> None:

Key features include:

  • Customizable chunk duration (default: 10 minutes)
  • Playback speed adjustment
  • Selective chunk processing
  • Configurable audio quality and sample rate
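The chunking and selective-processing features come down to a little arithmetic on the total duration. Here is a minimal sketch of that arithmetic, separated from the FFmpeg calls; plan_chunks is a hypothetical helper name used for illustration, not a function from the script.

```python
import math
from typing import List, Optional, Tuple


def plan_chunks(
        total_seconds: float,
        chunk_minutes: int = 10,
        start_chunk: int = 1,
        max_chunks: Optional[int] = None,
) -> List[Tuple[int, float, float]]:
    """Return (chunk_number, start_seconds, length_seconds) for each chunk to process."""
    chunk_seconds = chunk_minutes * 60
    last_chunk = math.ceil(total_seconds / chunk_seconds)
    if max_chunks is not None:
        # Stop after max_chunks chunks, counted from start_chunk
        last_chunk = min(last_chunk, start_chunk + max_chunks - 1)

    plan = []
    for number in range(start_chunk, last_chunk + 1):
        start = (number - 1) * chunk_seconds
        # The final chunk is usually shorter than the rest
        length = min(chunk_seconds, total_seconds - start)
        plan.append((number, start, length))
    return plan


# A 25-minute book in 10-minute chunks: two full chunks and a 5-minute remainder
for number, start, length in plan_chunks(25 * 60):
    print(number, start, length)
```

Combining start_chunk with max_chunks is handy for resuming an interrupted run or for testing settings on a single chunk before converting the whole book.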

3. Speed Adjustment Implementation

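One interesting challenge was implementing variable playback speed. FFmpeg’s atempo filter has a limitation: older builds only accept factors between 0.5x and 2.0x, and newer builds that accept larger values skip samples above 2.0x rather than blending them. To stay inside that range, we chain multiple atempo filters whose factors multiply to the requested speed:

# Break the speed into steps within atempo's 0.5-2.0 range;
# chained filters multiply, so the steps must multiply to speed_factor
factors = []
remaining = speed_factor
while remaining > 2.0:
    factors.append(2.0)
    remaining /= 2.0
while remaining < 0.5:
    factors.append(0.5)
    remaining /= 0.5
factors.append(remaining)
tempo_chain = ','.join(f'atempo={factor}' for factor in factors)

For example, a speed of 3.0 becomes atempo=2.0,atempo=1.5. This covers any positive speed factor while keeping every filter in the chain within the 0.5x to 2.0x range.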
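Because chained atempo filters multiply, a filter string can be sanity-checked before it is handed to FFmpeg. The helpers below are a sketch with hypothetical names (chain_factors, effective_speed, chain_in_range, playback_seconds); they only parse the filter string and do not call FFmpeg.

```python
from typing import List


def chain_factors(tempo_chain: str) -> List[float]:
    """Parse 'atempo=2.0,atempo=1.5' into [2.0, 1.5]."""
    factors = []
    for part in tempo_chain.split(","):
        name, _, value = part.partition("=")
        if name.strip() != "atempo":
            raise ValueError(f"Not an atempo filter: {part!r}")
        factors.append(float(value))
    return factors


def effective_speed(tempo_chain: str) -> float:
    """Overall speed of the chain, i.e. the product of its factors."""
    speed = 1.0
    for factor in chain_factors(tempo_chain):
        speed *= factor
    return speed


def chain_in_range(tempo_chain: str) -> bool:
    """True if every filter in the chain stays within 0.5x to 2.0x."""
    return all(0.5 <= factor <= 2.0 for factor in chain_factors(tempo_chain))


def playback_seconds(input_seconds: float, speed: float) -> float:
    """How long a stretch of audio lasts once it has been sped up or slowed down."""
    return input_seconds / speed


print(effective_speed("atempo=2.0,atempo=1.5"))
print(round(playback_seconds(600, 1.05), 1))
```

The last helper is a reminder that a 10-minute slice of the book plays for less than 10 minutes at speeds above 1.0.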

Using the Script

The script provides a simple command-line interface:

python audiobook-processor.py input.m4b output_directory \
    --chunk-duration 10 \
    --speed 1.05 \
    --quality 192 \
    --sample-rate 44100

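Command Line Arguments

  • input_file: Path to the input M4B file (required)
  • output_dir: Directory for the output MP3 files (required)
  • --chunk-duration: Length of each chunk in minutes (default: 10)
  • --speed: Playback speed factor (default: 1.0)
  • --start-chunk: First chunk to process, counting from 1 (default: 1)
  • --max-chunks: Maximum number of chunks to process (default: all)
  • --quality: Output audio bitrate in kbps (default: 192)
  • --sample-rate: Output sample rate in Hz (default: 44100)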

Technical Considerations

Performance Optimization

The script is designed to be memory-efficient by:

  1. Processing chunks sequentially
  2. Using FFmpeg’s built-in seeking capabilities
  3. Avoiding loading the entire file into memory
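To make the seeking point concrete, here is a sketch of how the argument list for a single chunk might be assembled. build_chunk_command is a hypothetical helper, and placing both -ss and -t before -i is a choice made in this sketch so that they apply to the input: FFmpeg seeks to the start position and reads only the requested span instead of decoding the file from the beginning.

```python
from typing import List


def build_chunk_command(
        input_file: str,
        output_file: str,
        start_seconds: float,
        length_seconds: float,
        bitrate_kbps: str = "192",
        sample_rate: str = "44100",
) -> List[str]:
    """Build an ffmpeg argument list that extracts one chunk as MP3."""
    return [
        "ffmpeg",
        "-y",                        # overwrite an existing output file
        "-ss", str(start_seconds),   # before -i: seek within the input
        "-t", str(length_seconds),   # before -i: limit how much input is read
        "-i", input_file,
        "-map", "0:a",               # keep audio streams only
        "-map_metadata", "-1",       # drop metadata
        "-b:a", f"{bitrate_kbps}k",  # audio bitrate
        "-ar", str(sample_rate),     # sample rate in Hz
        output_file,
    ]


# Second 10-minute chunk of a hypothetical book.m4b
cmd = build_chunk_command("book.m4b", "book_002.mp3", 600, 600)
print(" ".join(cmd))
```

Passing the arguments as a list to subprocess.run avoids shell quoting problems with file names that contain spaces.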

Error Handling

Robust error handling ensures the script:

  • Validates input parameters
  • Catches and reports FFmpeg processing errors
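Parameter validation of this kind can be sketched as a single function that collects every problem before any FFmpeg work starts. validate_params and its messages are assumptions made for illustration, not code from the script.

```python
import os
from typing import List, Optional


def validate_params(
        input_file: str,
        chunk_duration: int,
        speed_factor: float,
        start_chunk: int,
        max_chunks: Optional[int],
) -> List[str]:
    """Return a list of readable problems; an empty list means the inputs look usable."""
    problems = []
    if not os.path.isfile(input_file):
        problems.append(f"Input file not found: {input_file}")
    if chunk_duration <= 0:
        problems.append("Chunk duration must be a positive number of minutes")
    if speed_factor <= 0:
        problems.append("Speed factor must be greater than zero")
    if start_chunk < 1:
        problems.append("Start chunk must be 1 or higher")
    if max_chunks is not None and max_chunks < 1:
        problems.append("Max chunks must be 1 or higher when given")
    return problems


for problem in validate_params("missing.m4b", 0, 1.05, 1, None):
    print(problem)
```

Returning all problems at once, rather than stopping at the first, saves a round trip when several arguments are wrong.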

Future Improvements

Potential enhancements could include:

  1. Parallel processing for faster conversion
  2. Chapter-aware splitting
  3. Progress bar and ETA estimation
  4. Audio normalization options
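As a starting point for chapter-aware splitting, ffprobe can list chapters with its -show_chapters option. The sketch below only parses that listing and is not part of the script: parse_chapters is a hypothetical helper, and the sample string is hand-written to resemble ffprobe's JSON output rather than captured from a real file.

```python
import json
from typing import List, Tuple


def parse_chapters(ffprobe_json: str) -> List[Tuple[str, float, float]]:
    """Return (title, start_seconds, length_seconds) for each chapter in ffprobe JSON output."""
    data = json.loads(ffprobe_json)
    chapters = []
    for index, chapter in enumerate(data.get("chapters", []), start=1):
        start = float(chapter["start_time"])
        end = float(chapter["end_time"])
        # Fall back to a numbered title when the chapter carries no title tag
        title = chapter.get("tags", {}).get("title", f"Chapter {index}")
        chapters.append((title, start, end - start))
    return chapters


# Shaped like the output of: ffprobe -v quiet -show_chapters -of json book.m4b
sample = '''
{"chapters": [
  {"start_time": "0.000000", "end_time": "900.000000", "tags": {"title": "Prologue"}},
  {"start_time": "900.000000", "end_time": "2100.500000"}
]}
'''
for title, start, length in parse_chapters(sample):
    print(f"{title}: starts at {start:.0f}s, runs {length:.1f}s")
```

Each (start, length) pair could then replace the fixed-duration values in the chunk loop, with long chapters still subdivided so that no single track runs too long for a stereo without fast-forward.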

System Requirements

To use this script, you’ll need:

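  • Python 3.7 or higher (the script relies on the capture_output and text arguments of subprocess.run, which were added in 3.7)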
  • FFmpeg installed and available in your system PATH
  • Sufficient disk space for the output files
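For a rough idea of how much disk space is "sufficient", the output size follows from the bitrate and the playback length. This is a back-of-the-envelope sketch that assumes roughly constant-bitrate MP3 output and ignores container overhead; estimate_output_megabytes is a hypothetical helper name.

```python
def estimate_output_megabytes(
        total_seconds: float,
        bitrate_kbps: int = 192,
        speed_factor: float = 1.0,
) -> float:
    """Estimate the total MP3 size in megabytes (10^6 bytes)."""
    # Faster playback shortens the audio, so it also shrinks the files
    playback_seconds = total_seconds / speed_factor
    total_bits = playback_seconds * bitrate_kbps * 1000
    return total_bits / 8 / 1_000_000


# A 10-hour audiobook at 192 kbps and normal speed
print(f"{estimate_output_megabytes(10 * 3600):.0f} MB")
```

By the same arithmetic, a single 10-minute chunk at 192 kbps comes to about 14.4 MB.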

Installation Guide

Before using the script, you’ll need to install Python and FFmpeg on your system. Here are multiple ways to get started:

Installing Python

Option 1: Direct Download

  • Visit python.org
  • Download and run the installer for your operating system
  • Make sure to check “Add Python to PATH” during installation

Option 2: Windows Package Managers

Using winget:

winget install Python.Python.3.11

Using Chocolatey:

choco install python

Installing FFmpeg

Option 1: Direct Download

  • Visit ffmpeg.org
  • Download the appropriate version for your system
  • Extract the archive and add the bin folder to your system’s PATH

Option 2: Windows Package Managers

Using winget:

winget install "FFmpeg (Essentials Build)"

Using Chocolatey:

choco install ffmpeg

Verifying Installation

Open a new terminal/command prompt and verify both installations:

python --version
ffmpeg -version
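The same check can be done from Python, which is useful because the script needs ffprobe as well as ffmpeg. A minimal sketch, using the standard library's shutil.which; missing_tools is a hypothetical helper name.

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str] = ("ffmpeg", "ffprobe")) -> List[str]:
    """Return the tool names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("ffmpeg and ffprobe are both available")
```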

Setting Up the Script

Download Options

  1. Direct Download: audiobook-processor.py
  2. Copy from Below: Copy the complete script from this code block:
import argparse
import math
import os
import subprocess
from pathlib import Path
from typing import Optional


def get_duration(input_file):
    """Get the duration of the audio file in seconds using ffprobe"""
    cmd = [
        'ffprobe',
        '-v', 'quiet',
        '-show_entries', 'format=duration',
        '-of', 'default=noprint_wrappers=1:nokey=1',
        input_file
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)

    output = result.stdout.strip()
    try:
        duration = float(output)
    except ValueError:
        # Empty or non-numeric output (such as "N/A"): report any ffprobe error
        if result.stderr:
            print(f"Error: {result.stderr}")
        return 0
    print(f"Total duration: {output} seconds")
    return duration


def process_audiobook(
        input_file: str,
        output_dir: str,
        chunk_duration: int = 10,
        speed_factor: float = 1.05,
        start_chunk: int = 1,
        max_chunks: Optional[int] = None,
        audio_quality: str = "192",
        sample_rate: str = "44100",
) -> None:
    """
    Process an M4B audiobook file (or any other media supported by FFMpeg)
    
    Args:
        input_file: Path to input M4B file
        output_dir: Directory to save output files
        chunk_duration: Duration of each chunk in minutes
        speed_factor: Speed adjustment factor (1.0 = normal speed)
        start_chunk: First chunk to process (1-based indexing)
        max_chunks: Maximum number of chunks to process (None = process all)
        audio_quality: Output audio bitrate
        sample_rate: Output audio sample rate
    """

    # Create output directory if it doesn't exist
    Path(output_dir).mkdir(parents=True, exist_ok=True)

    # Get total duration
    total_duration = get_duration(input_file)
    if total_duration == 0:
        print(f"Cannot determine total audiobook duration, make sure the file is valid.\nFile: {input_file}")
        return

    chunk_seconds = chunk_duration * 60
    total_chunks = math.ceil(total_duration / chunk_seconds)

    if max_chunks is not None:
        total_chunks = min(total_chunks, start_chunk + max_chunks - 1)

    print(f"Total duration: {total_duration / 60:.1f} minutes")
    print(f"Processing chunks {start_chunk} to {total_chunks}")

    # Process each chunk
    for i in range(start_chunk - 1, total_chunks):
        start_time = i * chunk_seconds
        duration = min(chunk_seconds, total_duration - start_time)

        output_file = os.path.join(
            output_dir,
            f"{Path(input_file).stem}_{i + 1:03}.mp3"
        )

        # Build FFMpeg command
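        cmd = [
            'ffmpeg',
            '-y',  # Overwrite output file if exists
            '-ss', str(start_time),  # Start time
            # -t goes before -i so it limits the input read; after -i it would
            # limit the sped-up output and make consecutive chunks overlap
            '-t', str(duration),  # Duration to extract
            '-i', input_file,  # Input file
        ]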

        # Reject speeds that cannot be expressed as a tempo change
        if speed_factor <= 0.0:
            print("Speed factor must be greater than zero, operation aborted.")
            return

        # Add speed adjustment filter if needed
        if speed_factor != 1.0:
            # atempo filter is limited to 0.5 to 2.0 range
            # for larger changes, we need to chain multiple atempo filters
            # whose factors multiply to speed_factor
            factors = []
            remaining = speed_factor
            while remaining > 2.0:
                factors.append(2.0)
                remaining /= 2.0
            while remaining < 0.5:
                factors.append(0.5)
                remaining /= 0.5
            factors.append(remaining)
            tempo_chain = ','.join(f'atempo={factor}' for factor in factors)

            cmd.extend(['-filter:a', tempo_chain])

        # Add output options
        cmd.extend([
            '-b:a', f'{audio_quality}k',  # Audio bitrate in kbps
            '-map_metadata', '-1',  # Remove metadata
            '-map', '0:a',  # Keep only audio streams (drops cover art and video)
            '-ar', str(sample_rate),  # Sample rate in Hz
            output_file
        ])

        # Execute FFMpeg command
        print(f"Processing chunk {i + 1}/{total_chunks}: {output_file}")
        try:
            print(f"\nCommand:\n{' '.join(cmd)}\n")
            subprocess.run(cmd, check=True)
        except subprocess.CalledProcessError as e:
            print(f"Error processing chunk {i + 1}: {e}")
            continue


def main():
    parser = argparse.ArgumentParser(description="Split and speed up M4B audiobooks using FFMpeg")
    parser.add_argument("input_file", help="Input M4B file path")
    parser.add_argument("output_dir", help="Output directory path")
    parser.add_argument("--chunk-duration", type=int, default=10,
                        help="Duration of each chunk in minutes (default: 10)")
    parser.add_argument("--speed", type=float, default=1.00,
                        help="Playback speed factor (default: 1.0)")
    parser.add_argument("--start-chunk", type=int, default=1,
                        help="First chunk to process (default: 1)")
    parser.add_argument("--max-chunks", type=int,
                        help="Maximum number of chunks to process (default: all)")
    parser.add_argument("--quality", default="192",
                        help="Output audio bitrate in kbps (default: 192)")
    parser.add_argument("--sample-rate", default="44100",
                        help="Output audio sample rate in Hz (default: 44100)")

    args = parser.parse_args()

    process_audiobook(
        args.input_file,
        args.output_dir,
        args.chunk_duration,
        args.speed,
        args.start_chunk,
        args.max_chunks,
        args.quality,
        args.sample_rate
    )


if __name__ == "__main__":
    main()

First-Time Setup

Create a new directory for your audio processing:

mkdir audiobook-processor
cd audiobook-processor

Step-by-Step Usage Guide

  1. Prepare the files
    • Copy your M4B file into the audiobook-processor directory.
    • Save the script as audiobook-processor.py in the same directory.
  2. Basic Usage Example
    Open a command prompt and navigate to the audiobook-processor directory.
# Process with default settings
python audiobook-processor.py audiobook.m4b output_folder
  3. Advanced Usage Examples
# Process at a higher speed (1.05x) with shorter chunks (5 minutes):
python audiobook-processor.py audiobook.m4b output_folder --chunk-duration 5 --speed 1.05
# Process with a custom bitrate (256 kbps) and sample rate (48000 Hz):
python audiobook-processor.py audiobook.m4b output_folder --quality 256 --sample-rate 48000

Common Issues and Solutions

Conclusion

This script solves a specific but common problem: making audiobooks more accessible for systems with limited playback capabilities. By leveraging FFmpeg’s powerful features through a Python wrapper, we’ve created a flexible tool that can handle various audio processing needs while maintaining quality and usability.