The ACX Audio Submission Requirements are a set of technical requirements that define the quality standards for recording and rendering the audio files that make up an audiobook. They include a suitable audio level (between -23 dB and -18 dB RMS), a maximum noise floor (-60 dB), the file type (mono MP3 at 192 kbps), the length of the retail audio sample (1 to 5 minutes), and the maximum length of each audio file (120 minutes), among others.
You don’t need to be a professional audio engineer or a professional narrator to record your works.
Just make sure you know the basics of audiobook recording, editing, and rendering, as well as the ACX audio submission requirements. That is enough to achieve a professional quality standard, not only for Amazon’s ACX but also for Findaway Voices or any other platform that promotes the work of talented authors and narrators.
What Are the ACX Audio Submission Requirements
To make sure your recordings meet a professional standard and are well received by the audience, they must comply with the technical aspects shown in the tables below. Some of them are simple enough that you will see a description rather than a table. In each case, I show how I work to achieve the requirement.
- Overall Sound and Formatting
|Requirement||Description||How to Achieve It|
|Audio level||The volume of your voice must be consistent throughout the recording: between -23 dB and -18 dB RMS.||Apply normalization and compression in your DAW* after the recording stage (if you’re a beginner), or compression and gating during the recording stage (if you feel confident working in a DAW).|
|Tone of voice||The narration must sound balanced and natural.||Practice reading each chapter in advance, so you can identify which sections need a faster or slower pace, or a louder or softer tone. If your time is limited, try reading a page and then recording it. Use the highlighter tool in your document or PDF reader to mark those details. To avoid ruining the track with sudden sounds that are too loud or too quiet, set a compressor plugin in your DAW at the recording stage.|
|Noise level||Keep unwanted noises out of your recording. They can hurt your reputation and make you look careless. Mouth clicks are very common and can easily be removed with a plugin. I love to use Era Mouth De-clicker.||The recording area must be away from any noise (appliances, barking, street noise, etc.). Select the regions of the track that contain unwanted noises and cut them out. Glue the remaining parts together, so that they sound like a single track. Apply a gate plugin to keep unwanted noises from being heard. This works well when there are many unwanted noises at about the same level. If some of the unwanted noises are louder than others, you will have to set a separate gate for each region of the track where they occur.|
|Spacing and pronunciation||Leave a suitable pause for commas, periods, and new paragraphs. Make sure to check the pronunciation of any unfamiliar words in advance.||Remember that narrating an audiobook requires a slower pace than normal speech. Listeners who want faster speech can select a higher playback speed in the app they use. If you are narrating someone else’s book, ask the author to provide an audio track with those words, so you can match the pronunciation.|
*DAW: Digital Audio Workstation. It’s the audio software used on a computer or tablet to record, edit, and render the audio track. It contains the plugins needed to normalize, compress, and equalize the audio, as well as the main tools to cut unwanted parts, glue selected regions of a track, apply effects like reverberation, and more.
Your DAW will allow you to render the final track that complies with all the ACX submission requirements.
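If you want to double-check a rendered track outside your DAW, the level targets in this article can be expressed as a few lines of Python. This is only a rough sketch, not a replacement for your DAW's meters: the function names are made up for illustration, the samples are assumed to be already decoded to floats between -1.0 and 1.0, and the default thresholds (RMS between -23 dB and -18 dB, peak at or below -3 dB, noise floor at or below -60 dB) are assumptions you should adjust to the figures you are working to.

```python
import math


def rms_dbfs(samples):
    """RMS level in dB relative to full scale, for samples in -1.0..1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(math.sqrt(mean_square))


def peak_dbfs(samples):
    """Highest absolute sample value, in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")
    return 20 * math.log10(peak)


def check_levels(narration, room_tone,
                 rms_range=(-23.0, -18.0),  # assumed target range
                 peak_max=-3.0,             # assumed peak ceiling
                 noise_max=-60.0):          # assumed noise floor limit
    """Return a pass/fail flag for each of the three level checks."""
    rms = rms_dbfs(narration)
    return {
        "rms_ok": rms_range[0] <= rms <= rms_range[1],
        "peak_ok": peak_dbfs(narration) <= peak_max,
        "noise_ok": rms_dbfs(room_tone) <= noise_max,
    }
```

Here `narration` would be the samples of a whole chapter and `room_tone` a stretch where you are silent. A flag that comes back `False` tells you which stage to revisit: normalization and compression, the limiter, or the gain and gate settings.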
- Include the opening and closing credits: Since the author includes them in the manuscript you read from, you are unlikely to miss them.
Nevertheless, new authors who narrate their own works sometimes forget to record the credits because they focus on the chapters of the book.
According to ACX, the credits must state at least the title of the book, the subtitle (if applicable), the author (“written by”), and the narrator (“narrated by”).
- Record and render all of the audio files as mono, rather than stereo: audiobook narration requires mono files (MP3 at 192 kbps and 44.1 kHz, or WAV at 44.1 kHz and 16-bit).
Other types of audio products, like audio drama (The Sandman) or podcasts, do benefit from stereo files for specific moments, but that decision is made by the producer.
Even in audio drama, most of the narration and voice acting is recorded on mono tracks. The sound designer and audio engineer create the stereo image by automating the panning of the tracks at the moments the production script calls for it.
- Include a retail audio sample that is between one and five minutes long: At most of the largest audiobook retailers, the audio sample runs 5 minutes or a little less. I don’t recommend recording just a couple of minutes.
What I do is ask the author which part of the book has been chosen as the sample. If the decision is left to you, go for the first chapter.
The audio sample is your chance to engage, with your talent, the thousands of listeners who preview each title before buying it. If they like it, they will give a good Performance rating, and that’s what you’re looking for! Growing your positive reviews will bring more successful audiobooks to your portfolio.
- Requirements for each audio file
|Requirement||Description||How to Achieve It|
|Each file must contain one chapter or section, with the section header read aloud.||Chapters can run anywhere from 1 to 59 minutes. No matter how short the opening and closing credits are, they must be in separate files.||In 95% of cases you won’t have any issues with this, as each chapter in the script provided (PDF or document file) will have a suitable length.|
|Each file must be under 120 minutes.||If a chapter runs longer than 120 minutes, split the file into 2 sections.||Find a suitable moment in the track (the end of a paragraph) and cut the file there. The 2 sections don’t need to be the same length.|
|Room tone at the beginning (0.5 to 1 second) and end (5 seconds), free of unwanted noises.||Room tone is the sound of your recording space when you are silent. It must be free of any other sound. These quiet stretches let the listener notice when the narration begins and ends.||Your room tone must be at -60 dB or below. Set an equalizer to cut 70 Hz and below and 14 kHz and above. Set a gate: pre-open 5 ms; attack 3 ms; hold 50 ms; release 250-300 ms; dry signal -25 dB; wet signal +0.0 dB. Set an adequate gain level: you can see it immediately when you record directly through an audio interface into your DAW (like Reaper, Audacity, Logic, etc.). Look at the main meter and make sure your noise floor is below -60 dB. If it is not, reduce the gain a little and check again.|
|Each file must contain the section header if contained within the manuscript (e.g., “Prologue”, “Chapter 1”, “Chapter 2”).||You must begin the recording by reading the header.||After reading the header, leave 3 – 5 seconds and then continue reading the chapter.|
|The peak value must not exceed -3 dB.||When you record a moment where a character screams or makes a loud sound, be careful and set a limiter.||The limiter plugin ensures the audio never exceeds the mandatory -3 dB ceiling.|
|The only audio formats allowed are MP3 at 192 kbps or WAV at CD quality (44.1 kHz and 16-bit).||These two file types, at this bit rate and bit depth, are optimal for avoiding noise.||MP3 at 192 kbps is excellent and produces small files (60 mins = 86.4 MB). WAV at CD quality is also great but produces large files (60 mins = 317.52 MB for a mono file).|
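The file sizes quoted above follow directly from the bit rate and the sample format. The short sketch below reproduces the arithmetic. It assumes a constant bit rate MP3, an uncompressed mono WAV, 1 MB = 1,000,000 bytes, and it ignores the few extra bytes of file headers and tags. The function names are illustrative only.

```python
def mp3_size_mb(minutes, kbps=192):
    """Approximate size of a constant bit rate MP3, in MB (10**6 bytes)."""
    bytes_per_second = kbps * 1000 / 8  # kilobits per second -> bytes per second
    return bytes_per_second * minutes * 60 / 1_000_000


def wav_size_mb(minutes, sample_rate=44_100, bit_depth=16, channels=1):
    """Approximate size of uncompressed PCM audio, in MB (10**6 bytes)."""
    bytes_per_second = sample_rate * (bit_depth / 8) * channels
    return bytes_per_second * minutes * 60 / 1_000_000


print(mp3_size_mb(60))  # 86.4
print(wav_size_mb(60))  # 317.52
```

Note that the WAV figure is for a mono file. Passing `channels=2` doubles it, which is one more practical reason to render narration in mono.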