Audio

Post has an audio file

Smol bash script for finding oversize media files

Friday, September 2, 2022 

Sometimes you want to know if you have media files that are taking up more than their fair share of space.  You compressed the file some time ago in an old, inefficient format, or you just need to archive the oversize stuff, this can help you find em.  It’s different from file size detection in that it uses mediainfo to determine the media file length and wc -c to get the size, and from that computes the total effective data rate. All math is done with bc, which is usually installed. Files are found recursively from the starting point (passed as first argument) using find.

basic usage would be:

./find-high-rate-media.sh /search/path/tostart/ [min rate] [min size]

The script will then report media with a rate higher than minimum and size larger than minimum as a tab delimited list of filenames, calculated rate, and calculated size. Piping the output to a file, output.csv, makes it easy to sort and otherwise manipulate in LibreOffice Calc.

Save the file as a name you like (such as find-high-rate-media.sh) and # chmod  +x find-high-rate-media.sh and off you go.

The code (also available here):

#!/usr/bin/bash

# check arguments passed and set defaults if needed
# No argument given?
if [ -z "$1" ]; then
  printf "\nUsage:\n\n  pass a starting point and min data rate in kbps and min size like /media/gessel/datas/Downloads/ 100 10 \n\n" 
  exit 1
fi

if [ -z "$2" ]; then
  printf "\nUsage:\n\n  returning files with data rate greater than default max of 100 kbps  \n\n" 
  maxr=100
  else
        maxr=$2
        echo -e "\n\n  returning files with dara rate greater than " $maxr " kbps  \n\n" 
fi

if [ -z "$3" ]; then
  printf "\nUsage:\n\n  returning files with file size greater than default max of 100 MB  \n\n" 
  maxs=10
  else
        maxs=$3
        echo -e "\n\n  returning files with dara rate greater than " $maxs " MB  \n\n" 
fi

# multipliers to get to human readable values
msec="1000"
kilo="1024"

echo -e "file path \t rate kbps \t size MB"

# search for files with the extensions enumerated below
# edit this list to your needs (e.g. -iname \*.mp3 or whatever
# the -o means "or", -iname (vs -name) means case independent so
# it will find .MKV and .mkv.
# then pass each file found to check if the data rate is 
# above the min rate of concern and then if the files size is 
# above the min size of concern, and if so, print the result
 
find "$1" -type f \( -iname \*.avi -o -iname \*.mkv -o -iname \*.mp4 -o -iname \*.wmv \) -print0 | while read -rd $'\0' file
do
    size="$(wc -c  "$file" |  awk '{print $1}')"
    duration="$(mediainfo --Inform="Video;%Duration%" "$file")"
    seconds=$(bc -l <<<"${duration}/${msec}")
    sizek=$(bc -l <<<"scale=1; ${size}/${kilo}")
    sizem=$(bc -l <<<"scale=1; ${sizek}/${kilo}")
    rate=$(bc -l <<<"scale=1; ${sizek}/${seconds}")
    if (( $(bc  <<<"$rate > $maxr") )); then
        if (( $(bc  <<<"$sizem > $maxs") )); then
            echo -e $file "\t" $rate "\t" $sizem
        fi
    fi
done

Results might look like

file path 	 rate kbps 	 size MB
/media/my kitties playing.mkv 	 1166.0 	 5802.6
/media/cats jumping.mkv 	 460.1 	 2858.9
/media/fuzzy kitties.AVI 	 1092.7 	 7422.0

Another common task is renaming video files with some key stats on the contents so they’re easier to find and compare. Linux has limited integration with media information (dolphin is somewhat capable, but thunar not so much). This little script also leans on mediainfo command line to append the following to the file name of media files recursively found below a starting directory path:

  • WidthxHeight in pixels (1920×1080)
  • Runtime in HH-MM-SS.msec (02-38-15.111) (colons aren’t a good thing in filenames, yah, it is confusingly like a date)
  • CODEC name (AVC)
  • Datarate (1323kbps)

For example

kittyplay.mp4 -> kittyplay_1280x682_02-38-15.111_AVC_154.3kbps.mp4

The code is also available here.

#!/usr/bin/bash
PATH="/home/gessel/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

############################# USE #######################################################
# find_media.sh /starting/path/ (quote path names with spaces)
########################################################################################

# No argument given?
if [ -z "$1" ]; then
  printf "\nUsage:\n  pass a starting point like \"/Downloads/Media files/\" \n" 
  exit 1
fi

msec="1000"
kilo="1024"
s="_"
x="x"
kbps="kbps"
dot="."

find "$1" -type f \( -iname \*.avi -o -iname \*.mkv -o -iname \*.mp4 -o -iname \*.wmv \) -print0 | while read -rd $'\0' file
do
  if [[ -f "$file" ]]; then
    size="$(wc -c  "$file" |  awk '{print $1}')"
    duration="$(mediainfo --Inform="Video;%Duration%" "$file")"
    seconds=$(bc -l <<<"${duration}/${msec}")
    sizek=$(bc -l <<<"scale=1; ${size}/${kilo}")
    sizem=$(bc -l <<<"scale=1; ${sizek}/${kilo}")
    rate=$(bc -l <<<"scale=1; ${sizek}/${seconds}")
    codec="$(mediainfo --Inform="Video;%Format%" "$file")"
    framerate="$(mediainfo --Inform="General;%FrameRate%" "$file")"
    rtime="$(mediainfo --Inform="General;%Duration/String3%" "$file")"
    runtime="${rtime//:/-}"
    width="$(mediainfo --Inform="Video;%Width%" "$file")"
    height="$(mediainfo --Inform="Video;%Height%" "$file")"
    fname="${file%.*}"
    ext="${file##*.}"
    $(mv "$file" "$fname$s$width$x$height$s$runtime$s$codec$s$rate$kbps$dot$ext")
  fi
done

If you don’t have mediainfo installed,

sudo apt update
sudo apt install mediainfo
Posted at 10:18:58 GMT-0700

Category: AudioHowToLinuxvideo

Audio Compression for Speech

Tuesday, June 28, 2022 

Speech is generally a special class of audio files where compression quality is rated more on intelligibility than on fidelity, though the two related the former may be optimized at the expense of the latter to achieve very low data rates.  A few codecs have emerged as particularly adept at this specific class: Speex, Opus, and the latest, Google’s Lyra, a deep learning enhanced codec.

Lyra is focused on Android and requires a bunch of Java cruft to build and needs debugging.  It didn’t seem worth the effort, but I appreciate the Deep Learning based compression, it is clearly the most efficient compression possible.

I couldn’t find a quick whatcha-need-to-know is kind of summary of the codecs, so maybe this is useful:

Opus

On Ubuntu (and most Linux distros) you can install the Opus codec and supporting tools with a simple

# sudo apt install opus-tools

If you have ffmpeg installed, it provides a framework for dealing with IO and driving libopus from the command line like:

# ffmpeg -i infile.mp3 -codec:a libopus -b:a 8k -cutoff 8000 outfile.opus

Aside from infile.(format) and outfile.opus, there are two command line options that make sense to mess with to get good results: the bitrate -b:a (bit rate) and the -cutoff (frequency), which must be 4000 (narrowband), 6000 (mediumband), 8000 (wideband), 12000 (super wideband), or 20000 (fullband).  The two parameters work together and for speech limiting bandwidth saves bits for speech.

There are various research papers on the significance of frequency components in speech intelligibility that range from about 4kHz to about 8kHz (and “sometimes higher”).  I’d argue useful cutoffs are 6000 and 8000 for most applications.    The fewer frequency components fed into the encoder, the more bps remain to encode the residual.  There will be an optimum value which will maximize the subjective measure of intelligibility times the objective metric of average bit rate that has to be determined empirically for recording quality, speaker’s voice, and transmission requirements.

In my tests, my sample, the voice I had to work with an 8kHz bandwidth made little perceptible difference to the quality of speech.  6kbps VBR (-b:a 6k) compromised intelligibility, 8k did not, and 24k was not perceptibly compromised from the source.

one last option to consider might be the -application, which yields subtle differences in encoding results.  The choices are voip which optimizes for speech, audio (default) which optimizes for fidelity, and lowdelay which minimizes latency for interactive applications.

# ffmpeg -i infile.mp3 -codec:a libopus -b:a 8k -application voip -cutoff 8000 outfile.opus

VLC player can play .opus files.

Speex

AFAIK, Speex isn’t callable by ffmpeg yet, but the speex installer has a tool speexenc that does the job.

# sudo apt install speex

Speexenc only eats raw and .wav files, the latter somewhat more easily managed.  To convert an arbitrary input to wav, ffmpeg is your friend:

# ffmpeg -i infile.mp3 -f wav -bitexact -acodec pcm_s16le -ar 8000 -ac 1 wavfile.wav

Note the -ar 8000 option.  This sets the sample rate to 8000 – Speexenc will yield unexpected output data rates unless sample rates are 8000, 16000, or 32000, and these should correlate to the speexenc bandwidth options that will be used in the compression step (speexenc doesn’t transcode to match): -n “narroband,” -w “wideband,” and -u “ultrawideband”

# speexenc -n --quality 3 --vbr --comp 10 wavfile.wav outfile.spx

This sets the bandwidth to “narrow” (matching the 8k input sample rate), the quality to 3 (see table for data rates), enables VBR (not enabled by default with speex, but it  is with Opus), and the “complexity” to 10 (speex defaults to 3 for faster encode, Opus defaults to 10), thus giving a pretty head-to-head comparison with the default Opus settings.

VLC can also play speex .spx files. yay VLC.

Results

The result is an 8kbps stream which is to my ear more intelligible than Opus at 8kbps – not 😮 better, but 😐 better.  This is atypical, I expected Opus to be obviously better and it wasn’t for this sample.  I didn’t carefully evaluate the -application voip option, which would likely tip the tables results.  Clearly YMMV so experiment.

Posted at 10:23:52 GMT-0700

Category: AudioHowToLinuxtechnology

Audio Processing Workflow

Monday, April 18, 2022 

I prefer local control of media data, the rent-to-listen approach of various streaming services is certainly convenient, but pay-forever, you get what we think you should models don’t appeal to me. Over the decades, I’ve converted various eras of my physical media to digital formats using different standards that were in vogue at the time and with different emphasis on various metadata tags yielding a rather heterogeneous collection with some annoying incompatibilities that sometimes show up, for example using the Music plugin with NextCloud streaming via Subsonic to Sublime Music or Ultrasonic on Android.  I spent some time poking around to find a set of tools that satisfied my preferences for organization and structure and filled in a missing gap or two; this is what I’m doing these days and what with.

The steps outlined here are tuned to my particular use case:

  • Linux-based process.
  • I prefer mp3 to aac or flac because the format is widely compatible.  mp3 is pretty clearly inferior to aac for coding efficiency (aac produces better sound with less bits) and aac has some cool features that mp3 doesn’t but for my use compatibility wins.
  • My ears ain’t what they used to be.  I’m not sure I could ever reliably have heard the difference between 320 CBR and 190 VBR, but I definitely can’t now and less data is less data.
  • I like metadata and the flexibility in organization it provides, and like it standardized.

So to scratch that itch, I use the following steps:

  • Convert FLAC/high-data rate mp3s to VBR (about 190 kbps) with ffmpeg
  • Fix MP3 meta info wierdsies with MP3 Diags
  • Add Replay Gain tags with loudness-scanner
  • Add BPM tags with bpm-tag from bpm-tools
  • Use Puddletag to:
    • Clean any stray tags
    • Assign Genre, Artist, Year, Album, Disk Number, Track, Title, & Cover
    • Apply a standard replace function to clean text of weird characters
    • Refile and re-name in a most-os-friendly way
  • Clean up any stray data in the file system.

Links to the tools at the bottom.

Convert FLAC to MP3 with ffmpeg

The standard tool for media processing is ffmpeg.  This works for me:

find . -depth -type f -name "*.flac" -exec ffmpeg -i {} -q:a 2  -c:v copy -map_metadata 0 -id3v2_version 3 -write_id3v1 1  {}.mp3 \;

A summary:

find                  unix find command to return each found file one-by-one
.                     search from the current directory down
-depth                start at the bottom and work up
-type f               find only files (not directories)
-name "*.flac"        files that end with .flac
-exec ffmpeg          pass each found file to ffmpeg
-i {}                 ffmpeg takes the found file name as input
-q:a 2                VBR MP3 170-210 kbps
-c:v copy             copy the video stream (usually the cover image)
-map_metadata 0       copy the metadata from input to global metadata of output
-id3v2_version 3      write ID3v2.3 tag format (more compatible than ID3v2.4)
-write_id3v1 1        also write old style ID3v1 tags (maybe useless)
{}.mp3 \;             write output file (which yields "/original/filename.flac.mp3")

For album encodes with a .cue or in other formats where the above would yield one giant file, Flacon is your friend.  I would use two steps: single flac -> exploded flac, then the ffmpeg encoder myself just for comfort with the encoding parameters.

Convert high data rate CBR MP3 to VBR

Converting high data rate CBR files requires a bit more code to detect that a given file is high data rate and CBR, for which I wrote a small bash script that leverages mediainfo to extract tags from the source file and validate.

#!/bin/bash

# first make sure at least some parameter was passed, if not echo some instructions
if [ $# -eq 0 ]; then
    echo "pass a file name or try: # find . -type f -name "*.mp3" -exec recomp.sh {} \;"
    exit 1
fi

# assign input 1 to “file” to make it a little easier to follow
file=$1

# get the media type, the bitrate, and the encoding mode and assign to variables
type=$(mediainfo --Inform='General;%Format/String%' "$file")
brate=$(mediainfo --Inform='General;%OverallBitRate/String%' "$file" |& grep -Eo [0-9]+)
mode=$(mediainfo --Inform='Audio;%BitRate_Mode/String%' "$file")

# first check: is the file an mpeg audio file, if not quit
if [[ "$type" != "MPEG Audio" ]]; then
    echo $file skipped, not valid audio
    exit 0
    fi

# second check: if the file is already VBR, move on.  
if [[ "$mode" = "Variable" ]]; then
    echo $file skipped, already variable
    exit 0
    fi

# third check: the output will be 170-210, no reason to expand low bit rate files
if [[ "$brate" -gt 221 ]]
    then
        ffmpeg -hide_banner -loglevel error -i "$file"  -q:a 2  -c:v copy -map_metadata 0 -id3v2_version 3 -write_id3v1 1  "${file}.mp3"
        rm "${file}"
        mv "${file}.mp3" "${file}"
        echo $file recompressed to variable
    fi
exit

I named this script “~/projects/recomp/recomp.sh” and call it with

find . -depth -type f -name "*.mp3" -exec ~/projects/recomp/recomp.sh {} \;

which will scan down through all sub-directories and find files with .mp3 extensions, and if suitable, re-compress them to VBR as above. Yes, this is double lossy and not very audiophile, definitely prioritizing smaller files over acoustic fidelity which I cannot really hear anyway.

Fix bad data with MP3 Diags

MP3 Diags is a GUI tool for cleaning up bad tags.  It is pretty solid and hasn’t mangled any of my files yet.  It has two basic functions: passively highlight missing useful tags (replaygain, cover image, etc) and actively fix messed up tags which is a file-changing operation so make backups if needed.  I generally just click the tools buttons “1”–”4″ and it seems to do the right thing. Thanks Ciobi!

Install was easy on Ubuntu:

sudo apt install mp3diags

Add ReplayGain Tags

To bulk add (or update) ReplayGain tags, I find loudness-scanner very easy.  I just use the droplet version and drop folders on it. The defaults do the right thing, computing track and album gain by folder. The droplet pops up a confirmation dialog which can be lost on a busy desktop, remember it.  Click to apply the tags then wait for it to finish before closing that tag list window or it will seg fault.  The only indication is in the command prompt window used to launch it, which shows “….” as it progresses and when the dots stop, you can close the tags window.

I built it from source – these steps did the needful for me:

git clone https://github.com/jiixyj/loudness-scanner.git
cd loudness-scanner
git submodule init
git submodule update
mkdir build
cd build
cmake ..
make
sudo make install

Then launch the droplet with

~/projects/loudness-scanner/build/loudness-drop-gtk

Add Beats Per Minute Tags

Beats per minute calcs are mostly useful for DJ types, but I use them to easily sort music for different moods or for exercise.  The calculation seems a bit arbitrary for things like speech or classical, but for those genres where BPM is relevant, bpm-tools seems to yield results that make sense.

Install with

sudo apt-get install libsox-fmt-mp3 bpm-tag

Then write tags with (the -f option overwrites existing tags).

find . -name "*.mp3" -exec bpm-tag -f {} \;

Puddletag

Back in my Windows days, I really liked MP3Tag.  I  was really happy to find puddletag, an mp3tag inspired linux variant.  It’s great, does everything it should.  I wish I had something like this for image metadata editing: the spreadsheet format is very easy to parse.  One problem I had was the deunicode tool wasn’t decoding for me, so I wrote my own wee function to extend the functions.py by calling the unidecode function.  only puddlestuff/functions.py needs to be patched to add this useful decode feature.  UTF8 characters are well supported in tags, but not in all file structures and since the goal is compatibility, mapping them to fairly intelligible  ASCII characters is useful.

This works with the 2.1.1 version.
Below is a patch file to show the very few changes needed.

--- functions.py.bak    2022-04-14 13:58:47.937873000 +0300
+++ functions.py        2022-04-14 16:49:23.705786696 +0300
@@ -43,6 +43,7 @@
 from mutagen.mp3 import HeaderNotFoundError
 from collections import defaultdict
 from functools import partial
+from unidecode import unidecode
 
 import pyparsing
 
@@ -769,6 +770,10 @@
     cleaned_fn = unicodedata.normalize('NFKD', t_fn).encode('ASCII', 'ignore')
     return ''.join(chr(c) for c in cleaned_fn if chr(c) in VALID_FILENAME_CHARS)
 
+# hack by David Gessel
+def deunicode(text):
+    dutext = unidecode(text)
+    return (dutext)
 
 def remove_dupes(m_text, matchcase=False):
     """Remove duplicate values, "Remove Dupes: $0, Match Case $1"
@@ -1126,7 +1131,8 @@
     'update_from_tag': update_from_tag,
     "validate": validate,
     'to_ascii': to_ascii,
-    'to_num': to_num
+    'to_num': to_num,
+    'deunicode': deunicode
 }
 
 no_fields = [filenametotag, load_images, move, remove_except,

I use the “standard” action to clean up file names with a few changes:

  • In “title” and “album” I replace ‘ – ‘ with ‘–‘
  • in all, I RegExp replace ‘(\s)’ with ‘ ‘  – all blank space with a regular space.
  • I replace all %13 characters with a space
  • I RegExp ‘(\s)+’ with ‘ ‘ – all blank runs with a single space
  • Trim all to remove leading and ending spaces.

My tag->filename function looks like this craziness which reduces the risk of filename misbehavior on most platforms:

~/$validate(%genre%,_,/\*?;”|: +<>=[])/$validate($deunicode(%artist%),_,/\*?;”|: +<>=[])/%year%--$left($validate($deunicode(%album%),_,/\*?;”|: +<>=[]),136)$if(%discnumber%, --D$num(%discnumber%,2),"")/$left($num(%track%,2)--$validate($deunicode(%title%),_,/\*?;”|: +<>=[]),132)

Puddletag is probably in your repository. To mod the code, I first installed from source per the puddletag instructions, but had to also add unidecode to my system with

pip install unidecode

Last File System Cleanups

The above steps should yield a clean file structure without leading or trailing spaces, indeed without any spaces at all, but in case it doesn’t the rename function can help.  I installed it with

sudo apt install rename

This is useful to, for example, normalize errant spelling of mp3 – for example Mp3 or MP3 or, I suppose, mP3.

find . -depth -exec rename 's/\.mp3$/.mp3/i' {} +
aside from parameters explained previously
's/A/B/'            substitute B for each instance of A
\.                  escaped "." because "." has special meaning
$                   match end of string - so .mp3files won't match, but files.mp3 does
i                   case insensitive match (.mp3 .MP3 .mP3 .Mp3 all match)

More details on rename.

The following commands clean up errant spaces before after and repeated:

find . -depth -exec rename 's/^ *//' {} +
find . -depth -exec rename 's/ *$//' {} +
find . -depth -exec rename 's/\s+/_/g' {} +

If moving files around results in empty directories (or empty files, which shouldn’t happen) then they can be cleaned with

find . -depth -type d -empty -print -delete
find . -depth -type f -empty -print -delete

Players

If workflow preferences are highly personal, player prefs seem even more so.  Mine are as follows:

For Local Playback on a PC: Quod Libet

I like to sort by genre, artist, year, and album and Quod Libet makes that as easy as in foobar2000 did back in the olde days when Windows was still an acceptable desktop OS.  Those days are long, long over and while I am still fond of the foobar2000 approach, Quod Libet doesn’t need Wine.

Alas, one shortcoming still is that Quod Libet does not support subsonic or ampache.  That’s too bad because I really like the UI/UX.

For Subsonic Streaming on a PC: Sublime Music

Not the text editor, the music app.  It is pretty good, more pretty than Quod Libet and in a way that doesn’t entirely appeal to me, but it seems to work fairly well with NextCloud and is the best solution I’ve found so far.  It tends to flow quite a few errors and I see an odd bug where album tile selection jumps around, but it seems to work and a local program linking back to a server is generally more performant than in browser, but that’s also an option (see below) or run foobar2000 in Wine, perhaps even as an (ugh!) snap.

In Browser: NextCloud Music

Nextcloud’s Music app is one of those that imposes a sorting model that doesn’t work for me – not at all foobar2000ish – and so I don’t really use it much, but there are times, working on site for example, that a browser window is easiest.  I find I often have to rebuild the music database after changes.  Foam or Ample might be more satisfying choices functionally and aesthetically and can connect to the backend provided by Music.

Mobile: Ultrasonic

Ultrasonic works pretty well for me and seems to connect fairly reliably to my NextCloud server even in low bandwidth situations (though, obviously, not fast enough to actually listen to anything, but it doesn’t barf.)  Power Ampache might be another choice still currently developed (but I haven’t tried it myself).  Subsonic also worked with NextCloud, but I like Ultrasonic better and it is still actively developed.

If you’re on iOS instead of Android (congratulations on the envy your overpriced corporate icon inspires in the less fortunate) you almost certainly stick exclusively with your tribal allegiance and have no need for media outside of iTunes/Apple TV approved content.

Tools:

Players:

Posted at 17:59:27 GMT-0700

Category: AudioHowToLinuxtechnology

Tagging MP3 Files with Puddletag on Linux Mint

Tuesday, March 23, 2021 

A “fun” part of organizing an MP3 collection is harmonizing the tags so the datas work consistently with whatever management schema you prefer.  My preference is management by the file system—genre/artist/year/album/tracks works for me—but consistent metainformation is required and often disharmonious.  Finding metaharmony is a chore I find less taxing with a well structured tag editor and to my mind the ur-meta-tag manager is  MP3TAG.

The problem is that only works with that dead-end spyware riddled failing legacyware called “Windows.” Fortunately, in Linux-land we have puddletag, a very solid clone of MP3TAG.  The issues is that the version in repositories is (as of this writing) 1.20 and I couldn’t find a PPA for the latest, 2.0.1.  But compiling from source is super easy and works in both Linux Mint 19 and Ubuntu 20.04 (yay open source!):

  1. Install pre-reqs to build (don’t worry, if they’re installed, they won’t be double installed)
  2. get the tarball of the source code
  3. expand it (into a reasonable directory, like ~/projects)
  4. switch into that directory
  5. run the python executable “puddletag” directly to verify it is working
  6. install it
  7. tell the desktop manager it’s there – and it should be in your window manager along with the rest of your applications.

The latest version as of this post was 2.0.1 from https://github.com/puddletag/puddletag

sudo apt install python3-pyqt5 python3-pyqt5.qtsvg python3-pyparsing python3-mutagen python3-acoustid libchromaprint-dev libchromaprint-tools libchromaprint1 
wget href="https://github.com/puddletag/puddletag/releases/download/2.0.1/puddletag-2.0.1.tar.gz
tar -xvf puddletag-2.0.1.tar.gz cd puddletag-2.0.1/
cd puddletag 
./puddletag 
sudo python3 setup.py install 
sudo desktop-file-install puddletag.desktop

A nice feature is the configuration directory is portable and takes your complete customization with you – it is an extremely customizable program so you can generally configure it as fits your mental model.  Just copy the entire puddletag directory located at ~/.configure/puddletag.

Posted at 15:19:01 GMT-0700

Category: AudioHowToLinuxPositivereviewsuncategorized

The Daily, from the NYT, weirdly slow

Saturday, December 30, 2017 

From my distant location overseas, listening to the news via podcasts is a great way to keep up: something I’m quite grateful to have access to on demand and via the internets.   Until the end of Net Neutrality means that only “Verizon Insights” and “Life at AT&T” are still accessible, I enjoy a range of news sources on a daily basis using a podcatcher called Beyond Pod.  One of the essential features it has is the ability to speed up the tempo of podcasts, some of which are a bit slow as recorded.  But one…. one is like dripping molasses on a winter day: The Daily from the NYT by Michael Barbaro.  I’m pretty sure silences are inserted in editing to draw out the drama to infuriating lengths and the tempo of the audio is selectively slowed to about half normal speed.  Nobody can actually talk that slowly.  I mean listen and try – like actually try to draw out a word that might take 1 second to pronounce to two full seconds.   It is a pretty good news summary and has some useful information, but there’s no way I’d suffer through it without setting the tempo to 2x.

Every time I accidentally stream the podcast, rather than downloading and playing, the tempo control is disabled and while I scramble to skip to the next podcast before my I start questioning reality I often wonder for a moment just how bad the pauses are.  Here’s my analysis:

I consider the BBC Global News to be a very professional, truly “broadcast quality” podcast.  The announcers are clear, comprehensible, and speak at a pace that is appropriate for a news broadcast.  I still speed it up because daily life is like that now, but if I listen at normal pace, it isn’t even slightly annoying.

The Economist Radio is fairly typical of a print publication’s efforts at presenting print journalists as audio stars.  it doesn’t always work out perfectly and the pacing varies a lot by who is speaking and the rather eclectic line-up.  In general it is annoyingly slow, but not interminably so.  It comes across as a bit amateur by broadcast standards, but well done and very informative.

But then there’s The Daily from the NYT.  This podcast was the reason I took the time to figure out how to speed up playback.  There was no other choice: either unsubscribe or speed it up to something not aneurysm inducing.  Looking at the waveforms, I suspect they might actually be inserting silences of around 500msec between words, perhaps for dramatic effect (there’s way too much dramatic effect in a lot of the stories, which speeding up only hastens rather than fully alleviating—never have you heard so many interviewees break into uncomfortable tears as they’re overwhelmed by whatever the day’s tragedy is, an artifice only slightly less annoying than broadcasting the sound of someone eatingOMG, that’s real.  Rule 34.)

Read more…

Posted at 07:42:26 GMT-0700

Category: Audiotechnology

Family Feud Basra Style

Saturday, December 13, 2014 

Gunfire is pretty common here, perhaps even more common than in Oakland though usually for the same reasons: celebrating holidays, sports victories, weddings, that sort of stuff.  It is kind of fun to listen to and watch tracers and stuff, but usually the villa is also celebrating in an obvious way; when you hear gunfire you also hear cheers, at least at night.

This evening the house was quiet, but the gunfire sure wasn’t.  The guys tell me it was a tribal feud in the neighborhood, quite close from the sound of it.  This is a low-fi recording from my phone.

Posted at 17:13:13 GMT-0700

Category: Audioplacestravel

Speaker Build

Friday, November 28, 2014 

In December of 2002 (really, 2002, 12 years ago), I decided that the crappy former Sony self-amplified speakers with blown amplifiers that I had wired into my stereo as surround speakers really didn’t sound very good as they were, by then, 7 years old and the holes in the plastic housing where the adjustment knobs once protruded were covered by aging gaffers tape.

At least it was stylish black tape.

I saw on ebay a set of “Boston Acoustics” woofers and tweeters back in the time when ebay prices could be surprisingly good.  Boston Acoustics was a well-respected company at the time making fairly decent speakers.  36 woofers and 24 tweeters for $131 including shipping.  About 100 lbs of drivers.  And thus began the execution of a fun little project.


 

Design Phase: 2003-2011

I didn’t have enough data to design speaker enclosures around them, but about a year later (in 2003), I found this site, which had a process for calculating standard speaker properties with instruments I have (frequency generator, oscilloscope, etc.)  I used the weighted diaphragm method.

WOOFER: PN 304-1150001-00 22 JUL 2000
80MM CONE DIA = 8CM
FS  = 58HZ
RE  = 3.04 OHMS
QMS = 1.629
QES = 0.26
QTS = 0.224
CMS = 0.001222
VAS = 4.322 (LITERS) 264 CUBIC INCHES
EBP = 177.8

NOMINAL COIL RESISTANCE @ 385HZ (MID LINEAR BAND) 3.19 OHMS
NOMINAL COIL INDUCTANCE (@ 1KHZ) 0.448 MHENRY
TWEETER: PN 304-050001-00 16 OCT 2000
35MM CONE DIA
FS  = 269HZ
RE  = 3.29 OHMS
QMS = 5.66
QES = 1.838
QTS = 1.387
CMS = 0.0006
VAS = 0.0778 (LITERS)
EBP = 86.7

NOMINAL COIL RESISTANCE @ 930HZ (MID LINEAR BAND) 3.471 OHMS
NOMINAL COIL INDUCTANCE (@ 1KHZ) 0.153 MHENRY

Awesome.  I could specify a cross over and begin designing a cabinet.  A few years went by…

In January of 2009 I found a good crossover at AllElectronics.  It was a half decent match and since it was designed for 8 ohm woofers, I could put two of my 4 ohm drivers in series and get to about the right impedance for better power handling (less risk of clipping at higher volumes and lower distortion as the driver travel is cut in half, split between the two).

HTTP://WWW.ALLELECTRONICS.COM/MAKE-A-STORE/ITEM/XVR-21/2-WAY-CROSSOVER-INFINITY/1.HTML
CROSS OVER FREQUENCY 
3800HZ CROSSOVER
LOW-PASS: 18DB, 8 OHM
HIGH-PASS: 18DB, 4 OHM

Eventually I got around to calculating the enclosure parameters.  I’m not sure when I did that, but sometime between 2009 and 2011.  I found a site with a nice script for calculating a vented enclosure with dual woofers, just like I wanted and got the following parameters:

TARGET VOLUME 1.78 LITERS = 108 CUBIC INCHES

DRIVER VOLUME (80MM) = 26.25 CUBIC INCHES = 0.43 LITERS
CROSS OVER VOLUME = 2.93 CUBIC INCHES = 0.05 LITERS
SUM = 0.91 LITERS
1" PVC PORT TUBE: OD = 2.68CM, ID = 2.1CM = 3.46 CM^2
PORT LENGTH = 10.48CM = 4.126"


WIDTH = 12.613 = 4.829"
HEIGHT = 20.408 = 7.82"
DEPTH = 7.795 = 3"

In 2011 I got around to designing the enclosure in CAD:

There was no way to fit the crossover inside the enclosure as the drivers have massive, magnetically shielded drivers, so they got mounted on the outside.  The speakers were designed for inside mounting (as opposed to flange mounting) so I opted to radius the opening to provide some horn-loading.

I also, over the course of the project, bought some necessary tools to be prepared for eventually doing the work: a nice Hitachi plunge router and a set of cheap router bits to form the radii and hole saws of the right size for the drivers and PVC port tubes.

Build Phase (2014)

This fall, Oct 9 2014, everything was ready and the time was right.  The drivers had aged just the appropriate 14 years since manufacture and were in the peak of their flavor.

I started by cutting down some PVC tubes to make the speaker ports and converting some PVC caps into the tweeter enclosure.  My first experiment with recycled shelf wood for the tweeter mounting plate failed: the walls got a bit thin and it was clear that decent plywood would make life easier.  I used the shelf wood for the rest of the speaker: it was salvaged from my building, which was built in the 1930s and is probably almost 100 years old.  The plywood came with the building as well, but was from the woodworker who owned it before me.

I got to use my router after so many years of contemplation to shape the faceplates, fabricated from some fairly nice A-grade plywood I had lying around.

Once I got the boxes glued up, I installed the wiring and soldered the drivers in.  The wood parts were glued together with waterproof glue while the tweeters and plastic parts were installed with two component clear epoxy.  The low frequency drivers had screw mounting holes, so I used those in case I have to replace them, you know, from cranking the tunage.

I lightly sanded the wood to preserve the salvage wood character (actually no power sander and after 12 years, I wasn’t going to sand my way to clean wood by hand) then treated them with some polyurethane I found left behind by the woodworker that owned the building before I did.  So that was at least 18 years old.  At least.

I supported the speakers over the edge of the table to align the drivers in the holes from below.

The finished assembly looked more or less like I predicted:

Testing

The speakers sound objectively quite nice, but I was curious about the frequency response.  To test them I used the pink noise generator in Audacity to generate 5.1 6 channel pink noise files which I copied over to the HTPC to play back through my amp.  This introduces the amp’s frequency response, which is unlikely to be particularly good, and room characteristics, which are certainly not anechoic.

Then I recorded the results per speaker on a 24/96 Tascam DR-2d recorder, which also introduces some frequency response issues, and imported the audio files back into Audacity (and the original pink noise file), plotted the spectrum with 65536 poles, and exported the text files into excel for analysis.

Audacity’s pink noise looks like this:

Pink_Noise_Spectrum

It’s pretty good – a bit off plan below 10 Hz and the random noise gets a bit wider as the frequency increases, but it is pretty much what it should be.

First, I tested one of my vintage ADS L980 studio monitors.  I bought my L980s in high school in about 1984 and have used them ever since.  In college I blew a few drivers (you know, cranking tunage) but they were all replaced with OEM drivers at the Tweeter store (New England memories).  They haven’t been used very hard since, but the testing process uncovered damage to one of my tweeters, which I fixed before proceeding.

ADS L980 Spectrum

The ADS L980 has very solid response in the low frequency end with a nicely manufactured 12″ woofer and good high end with their fancy woven tweeter.  A 3 way speaker, there are inevitably some complexities to the frequency response.

I also tested my Klipsch KSC-C1 Center Channel speaker (purchased in 2002 on ebay for $44.10) to see what that looked like:

It isn’t too bad, but clearly weaker in the low frequency, despite moderate sized dual woofers and with a bit of a spike in the high frequency that maybe is designed in for TV and is perhaps just an artifact of the horn loaded tweeter. It is a two way design and so has a fairly smooth frequency response in the mid-range, which is good for the voice program that a center speaker mostly carries.

And how about those new ones?

New Speaker Spectrum

Well… not great, a little more variability than one would hope, and (of course) weak below about 100Hz.  I’m a little surprised the tweeters aren’t a little stronger over about 15kHz, though while that might have stood out to me in 1984, it doesn’t now.  Overall the response is quite good for relatively inexpensive drivers, the low frequency response, in particular, is far better than I expected given the small drivers.  The high frequency is a bit spiky, but quite acceptable sounding.

And they sound far, far better than the poor hacked apart Sony speakers they replaced.

Raw Data

The drawings I fabricated from and the raw data from my tests are in the files linked below:

Speaker Design Files (pdf)

Pink Noise Tests (xlsx)

Posted at 21:05:03 GMT-0700

Category: AudioFabricationHowTophototechnology