What are the industry standards for recording/editing podcasts and/or audio books? Do people use Pro Tools (PT) and edit them the same way they would a song? For a podcast, this kind of makes sense, but for something like an audio book that is hundreds of pages long, it seems like it would be an awful way to work.
Are there any tools that make it less sucky, or are people still using brute force to edit audio books?
I haven’t noticed a great deal of editing in podcasts, they seem to fly by the seat of their pants and leave them as is, though the better sounding (or slicker produced) ones may well have been edited a bit. They probably also have their routines down and do some pre-production for episodes in order to minimize post-production necessities.
Editing would mostly be removing bad spots or gaffes, and merging sections together after editing. Any DAW can do that. They may do some compression and gating (or expander) after the fact to improve the dialog clarity and projection. Those with backing or theme music may either cue it up during performance (if they have a radio DJ type setup) or add it in post-production.
Audio books are a beat-down, there’s no way around it. For each hour of “finished audio” you’ll probably spend 3-5 hours editing. So if you’re getting paid per finished hour (PFH), be sure to account for this. What’s involved can vary with genre (fiction vs. non-fiction, etc.) and quality standards. Also, if you’re working with a big publishing house the standards will usually be much more rigid than with an individual author. If someone else is going to do the “final master”, like a big publishing house, you want your levels really under control: peaks no higher than -3 dB (if that, maybe -6 dB) and RMS somewhere between -23 and -18 dB.
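To make those level targets concrete, here is a rough sketch of how you could check a take against them. It assumes the audio has already been loaded as floating-point samples between -1.0 and 1.0, and it measures RMS so that a full-scale sine wave reads about -3 dB (some meters use a different reference, so numbers may differ by about 3 dB). The function names and default thresholds are just illustrative, taken from the figures above.

```python
import math

def peak_dbfs(samples):
    """Peak level in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

def rms_dbfs(samples):
    """RMS level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")

def check_levels(samples, max_peak_db=-3.0, rms_range_db=(-23.0, -18.0)):
    """Report peak and RMS and whether they fall inside the targets."""
    peak = peak_dbfs(samples)
    rms = rms_dbfs(samples)
    return {
        "peak_db": peak,
        "rms_db": rms,
        "peak_ok": peak <= max_peak_db,
        "rms_ok": rms_range_db[0] <= rms <= rms_range_db[1],
    }

if __name__ == "__main__":
    # One second of a 1 kHz test tone standing in for a narration take.
    tone = [0.15 * math.sin(2 * math.pi * 1000 * n / 44100) for n in range(44100)]
    print(check_levels(tone))
```

In practice you would run something like this per chapter file, since a narrator's level tends to drift over a long session.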
The skills of the narrator are a huge factor (plosives, sibilance, mouth noise, errors). So is managing “retakes” or “pickups” without stopping the recording: have a system of hand claps or clicker noises that cause spikes in the waveform, to clue you in to those places during editing.
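Most people just spot those spikes by eye in the DAW, but the same idea can be sketched in code. This is a minimal example, assuming float samples between -1.0 and 1.0 and assuming the clap or clicker is clearly louder than the narration; the function name, threshold, and gap are made up for illustration and would need tuning for a real recording.

```python
def find_markers(samples, sample_rate, threshold=0.9, min_gap_s=1.0):
    """Return times (in seconds) where the signal reaches the threshold.

    Hits that fall within min_gap_s of the last reported marker are
    skipped, so one clap does not produce a cluster of markers.
    """
    markers = []
    min_gap = int(min_gap_s * sample_rate)
    last = None  # sample index of the last reported marker
    for i, s in enumerate(samples):
        if abs(s) >= threshold and (last is None or i - last >= min_gap):
            markers.append(i / sample_rate)
            last = i
    return markers

if __name__ == "__main__":
    rate = 1000
    take = [0.1] * 5000            # quiet stand-in for narration
    for i in (1200, 1201, 1205):   # one clap spanning a few samples
        take[i] = 0.95
    take[3500] = -0.97             # a second clap
    print(find_markers(take, rate))
```

The resulting list of times is what you would jump to when hunting for pickups.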
You can also do something called “Punch-and-Roll”, which is similar to how studio music recording used to be done. You stop recording briefly to regroup, set the cursor to your “pickup” location, and then record over the error. You can use Pre-Roll, if desired. The upsides are less time spent editing later, less loss of focus, and less proofing (critical listening to the final product) time. However, as with music performance, it can disrupt the creative flow and vibe if not handled correctly. And you’ll still likely have to go back and crossfade those areas and/or clean them up (until you get really good at it).
Pro Tools can do that pretty well, but Reaper can be set up for it too. Adobe Audition is probably used extensively in some arenas.
This is "slightly related…"
While looking for a couple more videos on “leaky gut” I came across several 50-minute ones that were so slow you want to bang your head against the wall…
Then I came across this one, which has excellent pace…
So unless your video book is intentionally putting people to sleep, watch the pacing. And consider editing out those seconds of silence…
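For what it's worth, tightening up dead air is easy to automate in principle. Here is a bare-bones sketch, assuming float samples between -1.0 and 1.0 and a quiet enough room that "silence" sits below a fixed threshold. The function name and the numbers are only illustrative. It makes hard cuts with no crossfade, so on a real recording you would probably still want to smooth the joins or fill them with room tone.

```python
def shorten_silences(samples, sample_rate, threshold=0.01, max_silence_s=0.75):
    """Trim every run of quiet samples down to at most max_silence_s.

    The first max_silence_s of each pause is kept so the pacing still
    breathes; anything beyond that is dropped.
    """
    max_run = int(max_silence_s * sample_rate)
    out = []
    run = 0  # length of the current run of quiet samples
    for s in samples:
        if abs(s) < threshold:
            run += 1
            if run > max_run:
                continue  # pause is already long enough, skip this sample
        else:
            run = 0
        out.append(s)
    return out

if __name__ == "__main__":
    rate = 10
    clip = [0.5] * 3 + [0.0] * 20 + [0.5] * 2 + [0.0] * 3 + [0.4]
    tightened = shorten_silences(clip, rate, max_silence_s=0.5)
    print(len(clip), "->", len(tightened))
```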
Just because it can do it, doesn’t mean it’s a good tool for doing it. I can edit audio by cutting and taping reels of tape. Doesn’t mean I wouldn’t rather work as a janitor cleaning toilets. The only reason we had to do that in the past is because there weren’t better options available.
It seems kind of ridiculous to me that we would be using the music editing style of editing for something like audio books that are 10 hours long. If that is the actual standard, all it says to me is that nobody has put any thought into it.
Well, it’s kind of like recording musicians and artists … get a great performance and the work won’t be as challenging or as much drudgery. Except in long-form spoken word I’d say “consistency” is the key. Consistency and holding the listener’s attention. So there’s not much dazzle or ‘hook’ (except the writer’s intended literary hooks); just keep a flow that’s listenable and interesting. So I’d say it’s mostly “work flow” in terms of the narration and the recording.
So if your performer (the narrator, or yourself as narrator) can work with the material until the performance is near flawless, the editing is reduced and so are the headaches. Eliminate goofs and dead space before they happen, essentially. That involves preparation, experience, practice, and diligence. A narrator should almost always read through the material at least once, perhaps several times, so that notes can be made for researching pronunciations (eliminates retakes later), character voices can be fleshed out, and the storyline is well understood so that you don’t find out in Chapter 15 that “Charlene said in her typical Cockney accent” when the narration to date has been in a Welsh accent. And then, the performance still needs to sound “fresh” after all that research which is where ‘acting’ comes in.
Punch-and-Roll takes some practice, but can save time. Not everyone likes it, and I mentioned the pros and cons. The only shortcut I know of is computer text-to-speech programs (e.g. Siri), and those still sound pretty horrible to anyone with discriminating tastes.
From what I’ve been able to gather, amateur podcasts are just recorded “as is”: people sitting in their bedrooms or living rooms, recording around a single microphone, with minimal editing and EQ’ing done. Obviously you’re listening to the podcast for the content rather than the quality of the production. But the better produced podcasts (Serial) are obviously recorded in state-of-the-art studios and have a ton of editing done on them.
For my own YouTube channel, I record the voice in my studio (obviously) and have a standard EQ I apply to my voice. But I’m still shocked at how different my voice can sound from one video to the next!