bash

Smol bash script for finding oversize media files

Friday, September 2, 2022 

Sometimes you want to know if you have media files that are taking up more than their fair share of space. Maybe you compressed a file some time ago in an old, inefficient format, or you just need to archive the oversize stuff; either way, this can help you find ’em. It’s different from plain file size detection in that it uses mediainfo to determine the media file length (and a variety of other useful data bits) and wc -c to get the size (so the data rate includes any file overhead), and from those computes the total effective data rate. All math is done with bc, which is usually installed. Files are found recursively (descending into sub-directories) from the starting point (passed as the first argument) using find.
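
The two figures the script reports come down to simple arithmetic on the file size, duration, frame size, and frame rate. A minimal sketch of that arithmetic follows. The helper names (bits_per_pixel, data_rate) are hypothetical, awk stands in for bc purely to keep the example dependency-free, and the sample numbers are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: helper names and sample values are illustrative assumptions.

# bits per pixel = (bytes * 8) / (width * height * seconds * fps)
bits_per_pixel() {
  local bytes=$1 width=$2 height=$3 duration_ms=$4 fps=$5
  awk -v b="$bytes" -v w="$width" -v h="$height" -v d="$duration_ms" -v f="$fps" \
    'BEGIN { printf "%.4f\n", (b * 8) / (w * h * (d / 1000) * f) }'
}

# data rate computed as size in KiB divided by duration in seconds
data_rate() {
  local bytes=$1 duration_ms=$2
  awk -v b="$bytes" -v d="$duration_ms" \
    'BEGIN { printf "%.1f\n", (b / 1024) / (d / 1000) }'
}

# A 10 second 1920x1080 clip at 24 fps weighing 62,208,000 bytes
bits_per_pixel 62208000 1920 1080 10000 24   # prints 1.0000
data_rate 62208000 10000                     # prints 6075.0
```

Because the size comes from the whole file, audio and container overhead are counted in both numbers.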

Basic usage would be:

./find-high-rate-media.sh /search/path/tostart/ [min bpp] [min data rate] [min size] > oversize.csv 2>&1

The script will then report media with a rate higher than the minimum and a size larger than the minimum as a tab-delimited list of filenames, calculated rates, and calculated sizes. Redirecting the output to a file (oversize.csv above) makes it easy to sort and otherwise manipulate in LibreOffice Calc as a tab-delimited file. The values are interpreted as the minimums below which output is suppressed, so any file that exceeds all three minimum triggers will be output to the screen (or to the .csv file if so redirected).
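
If you would rather stay in the shell than open a spreadsheet, the tab-delimited output can be sorted there too. A minimal sketch, assuming the input starts with the header row and that the bits-per-pixel figure is the second column; sort_by_bpp is a hypothetical helper and the rows in the usage line are made up.

```bash
#!/usr/bin/env bash
# Sketch only: sort_by_bpp is a hypothetical helper, not part of the script.
# Reads tab-delimited rows on stdin, passes the first (header) line through
# untouched, and sorts the remaining rows by column 2, highest value first.
sort_by_bpp() {
  local header
  IFS= read -r header
  printf '%s\n' "$header"
  LC_ALL=C sort -t $'\t' -k2,2nr
}

# Made-up sample rows, for illustration only
printf 'file path\trate bpp\n/a.mkv\t0.30\n/b.mp4\t1.20\n/c.avi\t0.75\n' | sort_by_bpp
```

Any status lines the script prints ahead of the header row would need trimming from the file first.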

The script takes four command line arguments:

  • The starting directory [defaults to the directory the script is executed in]
  • The minimum bits per pixel (including audio, sorry) for exclusions (i.e. more bpp and the filename will be output) [defaults to 0.25 bpp]
  • The minimum data rate in kbps [defaults to 1 kbps so files would by default only be excluded by bits per pixel rate]
  • The minimum file size in megabytes [defaults to 1 MB so files would by default only be excluded by bits per pixel rate]
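
The same four defaults can be written compactly with bash parameter expansion. A minimal sketch, not the script's own approach (which uses if/else blocks and prints a message for each argument); set_defaults is a hypothetical wrapper, and the variable names are borrowed from the script.

```bash
#!/usr/bin/env bash
# Sketch only: set_defaults is a hypothetical wrapper around the same four
# positional arguments; variable names (dir, maxr, maxdr, maxs) follow the script.
set_defaults() {
  dir="${1:-$(pwd)}"   # starting directory, current directory if omitted
  maxr="${2:-0.25}"    # minimum bits per pixel
  maxdr="${3:-1}"      # minimum data rate
  maxs="${4:-1}"       # minimum file size in MB
}

set_defaults /Media 0.2
echo "$dir $maxr $maxdr $maxs"   # prints: /Media 0.2 1 1
```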

Save the file under a name you like (such as find-high-rate-media.sh), make it executable with chmod +x find-high-rate-media.sh, and run it to find your oversized media.

#!/usr/bin/bash
############################# USE #######################################################
# This creates a tab-delimited CSV file of recursive directories of media files enumerating
# key compression parameters.  Note bits per pixel includes audio, somewhat necessarily given
# the simplicity of the analysis. This can throw off the calculation.
# find_media.sh /starting/path/ [min bits per pixel] [min data rate] [min file size mb]
# /find-high-rate-media.sh /Media 0.2 400 0 > /recomp.csv 2>&1
# The "find" command will traverse the file system from the starting path down.
# if output isn't directed to a CSV file, it will be written to screen. If directed to CSV
# this will generate a tab delimited csv file with key information about all found media files
# the extensions supported can be extended if it isn't complete, but verify that the 
# format is parsable by the tools called for extracting media information - mostly mediainfo
# Typical bits per pixel range from 0.015 for a HEVC highly compressed file at the edge of obvious
# degradation to quite a bit higher.  Raw would be 24 or even 30 bits per pixel for 10bit raw.
# Uncompressed YUV video is about 12 bpp. 
# this can be useful for finding under and/or overcompressed video files
# the program will suppress output if the file's bits per pixel is below the supplied threshold
# to reverse this invert the rate test to " if (( $(bc  <<<"$bps < $maxr") )); then..."
# if a min data rate is supplied, output will be suppressed for files with a lower data rate
# if a min file size is supplied, output will be suppressed for files smaller than this size
########################################################################################

# No argument given?
if [ -z "$1" ]; then
  printf "\nUsage:\n  starting by default in the current directory and searching recursively \n"
  dir="$(pwd)"
  else
        dir="$1"
        echo -e "starting in $dir"
fi

if [ -z "$2" ]; then
  printf "\nUsage:\n  returning files with bits per pixel greater than default max of .25 bpp \n" 
  maxr=0.25
  else
        maxr=$2
        echo -e "returning files with bits per pixel greater than " $maxr " bpp" 
fi

if [ -z "$3" ]; then
  printf "\nUsage:\n  returning files with data rate greater than default max of 1 kbps \n" 
  maxdr=1
  else
        maxdr=$3
        echo -e "returning files with data rate greater than " $maxdr " kbps" 
fi


if [ -z "$4" ]; then
  printf "\nUsage:\n  no min file size provided returning files larger than 1MB \n" 
  maxs=1
  else
        maxs=$4
        echo -e "returning files with file size greater than " $maxs " MB  \n\n" 
fi


msec="1000"
kilo="1024"
reint='^[0-9]+$'
refp='^[0-9]+([.][0-9]+)?$'

echo -e "file path \t rate bpp \t rate kbps \t V CODEC \t A CODEC \t Frame Size \t FPS \t Runtime \t size MB"

find "$dir" -type f \( -iname \*.avi -o -iname \*.mkv -o -iname \*.mp4 -o -iname \*.wmv -o -iname \*.m4v \) -print0 | while IFS= read -r -d '' file
do
  if [[ -f "$file" ]]; then
    bps="0.1"
    size="$(wc -c  "$file" |  awk '{print $1}')"
    duration="$(mediainfo --Inform="Video;%Duration%" "$file")"
    if ! [[ $duration =~ $refp ]] ; then
       duration=$msec
    fi
    seconds=$(bc -l <<<"${duration}/${msec}")
    sizek=$(bc -l <<<"scale=1; ${size}/${kilo}")
    sizem=$(bc -l <<<"scale=1; ${sizek}/${kilo}")
    # note: KiB divided by seconds gives kibibytes per second (multiply by 8 for kilobits)
    rate=$(bc -l <<<"scale=1; ${sizek}/${seconds}")
    codec="$(mediainfo --Inform="Video;%Format%" "$file")"
    audio="$(mediainfo --Inform="Audio;%Format%" "$file")"
    framerate="$(mediainfo --Inform="General;%FrameRate%" "$file")"
    if ! [[ $framerate =~ $refp ]] ; then
       framerate=100
    fi
    rtime="$(mediainfo --Inform="General;%Duration/String3%" "$file")"
    width="$(mediainfo --Inform="Video;%Width%" "$file")"
    if ! [[ $width =~ $reint ]] ; then
       width=1
    fi
    height="$(mediainfo --Inform="Video;%Height%" "$file")"
    if ! [[ $height =~ $reint ]] ; then
       height=1
    fi
    pixels=$(bc -l <<<"scale=1; ${width}*${height}*${seconds}*${framerate}")
    bps=$(bc -l <<<"scale=4; ${size}*8/${pixels}")
    if (( $(bc -l <<<"$bps > $maxr") )); then
        if (( $(bc -l <<<"$sizem > $maxs") )); then
            if (( $(bc -l <<<"$rate > $maxdr") )); then
                echo -e "$file" "\t" $bps "\t" $rate "\t" $codec "\t" $audio "\t" $width"x"$height "\t" $framerate "\t" $rtime "\t" $sizem
            fi
        fi
    fi
  fi
done
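
Bash's (( )) arithmetic only handles integers, which is why the three threshold tests above hand each comparison to bc. A minimal sketch of the same all-three-must-pass logic, using awk in place of bc only so the example has no extra dependency; exceeds and passes_all are hypothetical names and the values are made up.

```bash
#!/usr/bin/env bash
# Sketch only: hypothetical helpers illustrating the triple threshold test.

# Succeeds (exit status 0) when the first number is greater than the second.
exceeds() {
  awk -v a="$1" -v b="$2" 'BEGIN { exit !(a + 0 > b + 0) }'
}

# Arguments: bpp size_mb rate min_bpp min_size_mb min_rate
passes_all() {
  exceeds "$1" "$4" && exceeds "$2" "$5" && exceeds "$3" "$6"
}

if passes_all 0.30 700.5 900.2 0.25 1 1; then
  echo "reported"      # all three values are above their minimums
fi
if ! passes_all 0.20 700.5 900.2 0.25 1 1; then
  echo "suppressed"    # bpp is below the 0.25 minimum
fi
```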


Another common task is renaming video files with some key stats on the contents so they’re easier to find and compare. Linux has limited integration with media information (Dolphin is somewhat capable, but Thunar not so much). This little script also leans on the mediainfo command line to append the following to the file name of media files recursively found below a starting directory path:

  • WidthxHeight in pixels (e.g. 1920×1080)
  • Runtime in HH-MM-SS.msec (e.g. 02-38-15.111) (colons aren’t a good thing in filenames, yah, it is confusingly like a date)
  • CODEC name (e.g. AVC)
  • Datarate (e.g. 1323kbps)

For example:

kittyplay.mp4 -> kittyplay_1280x682_02-38-15.111_AVC_154.3kbps.mp4
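
The new name is assembled with plain bash parameter expansion. A minimal sketch of just that step, with the media values passed in by hand rather than read from mediainfo; tagged_name is a hypothetical helper that only prints the name and renames nothing.

```bash
#!/usr/bin/env bash
# Sketch only: tagged_name is a hypothetical helper; it builds the new name
# but does not rename anything.
# Arguments: path width height runtime(HH:MM:SS.mmm) codec rate
tagged_name() {
  local file=$1 width=$2 height=$3 rtime=$4 codec=$5 rate=$6
  local base="${file%.*}"        # path without the final extension
  local ext="${file##*.}"        # final extension
  local runtime="${rtime//:/-}"  # colons are awkward in filenames
  printf '%s_%sx%s_%s_%s_%skbps.%s\n' \
    "$base" "$width" "$height" "$runtime" "$codec" "$rate" "$ext"
}

tagged_name kittyplay.mp4 1280 682 02:38:15.111 AVC 154.3
# prints kittyplay_1280x682_02-38-15.111_AVC_154.3kbps.mp4
```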

The code is also available here.

#!/usr/bin/bash
PATH="/home/gessel/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

############################# USE #######################################################
# find_media.sh /starting/path/ (quote path names with spaces)
########################################################################################

# No argument given?
if [ -z "$1" ]; then
  printf "\nUsage:\n  pass a starting point like \"/Downloads/Media files/\" \n" 
  exit 1
fi

msec="1000"
kilo="1024"
s="_"
x="x"
kbps="kbps"
dot="."

find "$1" -type f \( -iname \*.avi -o -iname \*.mkv -o -iname \*.mp4 -o -iname \*.wmv \) -print0 | while IFS= read -r -d '' file
do
  if [[ -f "$file" ]]; then
    size="$(wc -c  "$file" |  awk '{print $1}')"
    duration="$(mediainfo --Inform="Video;%Duration%" "$file")"
    seconds=$(bc -l <<<"${duration}/${msec}")
    sizek=$(bc -l <<<"scale=1; ${size}/${kilo}")
    sizem=$(bc -l <<<"scale=1; ${sizek}/${kilo}")
    rate=$(bc -l <<<"scale=1; ${sizek}/${seconds}")
    codec="$(mediainfo --Inform="Video;%Format%" "$file")"
    framerate="$(mediainfo --Inform="General;%FrameRate%" "$file")"
    rtime="$(mediainfo --Inform="General;%Duration/String3%" "$file")"
    runtime="${rtime//:/-}"
    width="$(mediainfo --Inform="Video;%Width%" "$file")"
    height="$(mediainfo --Inform="Video;%Height%" "$file")"
    fname="${file%.*}"
    ext="${file##*.}"
    mv -- "$file" "$fname$s$width$x$height$s$runtime$s$codec$s$rate$kbps$dot$ext"
  fi
done

If you don’t have mediainfo installed:

sudo apt update
sudo apt install mediainfo
Posted at 10:18:58 GMT-0700

Category: Audio, HowTo, Linux, video

Favicon generation script

Monday, December 21, 2020 

Favicons are a useful (and fun) part of the browsing experience.  They once were simple – just an .ico file of the right size in the root directory.  Then things got weird and computing stopped assuming an approximate standard ppi for displays, starting with mobile and “retina” displays.  The obvious answer would be .svg favicons, but, wouldn’t’ya know, Apple doesn’t support them (neither does Firefox mobile) so for a few more iterations, it still makes sense to generate an array of sizes with code to select the right one.  This little tool pretty much automates that from a starting .svg file.

There are plenty of good favicon scripts and tools on the interwebs. I was playing around with .svg sources for favicons and found it a bit tedious to generate the sizes considered important for current (2020-ish) browsing happiness. I found a good start at stackexchange by @gary, though the sizes weren’t the currently recommended ones (per this github project). Your needs may vary, but it is easy enough to edit.

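The script relies on the following wonderful FOSS tools:

  • Inkscape (renders the .svg to .png at each size)
  • pngquant (compresses the .png files)
  • ImageMagick (convert, which builds the multi-size favicon.ico)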

These are available in most distros (software manager had them in Mint 19).
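
The script below calls inkscape, pngquant, and convert, so it can save a confusing failure to check for them up front. A minimal sketch; missing_tools is a hypothetical helper and is not part of the favicon script.

```bash
#!/usr/bin/env bash
# Sketch only: missing_tools is a hypothetical helper that prints the name of
# every command passed to it that is not found on the PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools inkscape pngquant convert)
if [ -n "$missing" ]; then
  # left unquoted on purpose so the names join onto one line
  echo "Please install:" $missing
fi
```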

Note that my version leaves the format as .png – the optimized png will be many times smaller than the .ico format and png works for everything except IE<11, which nobody should be using anyway.  The favicon.ico generated is 16, 32, and 48 pixels in 3 different layers from the 512×512 pixel version.

The command line options for inkscape changed a bit; the bash script below has been updated to reflect the current ones.

Note: @Chewie9999 commented on https://github.com/mastodon/mastodon/issues/7396 that for Mastodon, the sizes needed would be generated with the following:

size=(16 32 36 48 57 60 72 76 96 114 120 144 152 167 180 192 256 310 384 512 1024)

The code below can be saved as a bash file, with the execution bit set, and called as ./favicon file.svg, and off you go:

#!/bin/bash

# exit on the first error (-e) and echo each command as it runs (-x), which makes the output verbose
set -ex

# collect the file name you entered on the command line (file.svg)
svg=$1

# set the sizes to be generated (plus 310x150 for msft)
size=(16 32 48 70 76 120 128 150 152 167 180 192 310 512) 

# set the write directory as a favicon directory below current
out="$(pwd)"
out+="/favicon"
mkdir -p "$out"

echo Making bitmaps from your svg...

for i in "${size[@]}"; do
  inkscape -o "$out/favicon-$i.png" -w "$i" -h "$i" "$svg"
done

# Microsoft wide icon (annoying, probably going away)
inkscape -o "$out/favicon-310x150.png" -w 310 -h 150 "$svg"

echo Compressing...

for f in "$out"/*.png; do pngquant -f --ext .png "$f" --posterize 4 --speed 1; done

echo Creating favicon

convert "$out/favicon-512.png" -define icon:auto-resize=48,32,16 "$out/favicon.ico"

echo Done

Copy the .png files and .ico file generated above as well as the original .svg file into your root directory (or, if in a sub-directory, add the path below), editing the “color” of the Safari pinned tab mask icon. You might also want to make a monochrome version of the .svg file and reference that as the “mask-icon” instead; it will probably look better, but that’s more work.

The following goes inside the head directives in your index.html to load the correct sizes as needed (delete the lines for Microsoft’s browserconfig.xml file and/or Android’s manifest file if not needed.)

<!-- basic svg -->
<link rel="icon" type="image/svg+xml" href="/favicon.svg">

<!-- generics -->
<link rel="icon" href="favicon-16.png" sizes="16x16">
<link rel="icon" href="favicon-32.png" sizes="32x32">
<link rel="icon" href="favicon-48.png" sizes="48x48">
<link rel="icon" href="favicon-128.png" sizes="128x128">
<link rel="icon" href="favicon-192.png" sizes="192x192">

<!-- .ico files -->
<link rel="icon" href="/favicon.ico" type="image/x-icon" />
<link rel="shortcut icon" href="/favicon.ico" type="image/x-icon" />

<!-- Android -->
<link rel="shortcut icon" href="favicon-192.png" sizes="192x192">
<link rel="manifest" href="manifest.json" />

<!-- iOS -->
<link rel="apple-touch-icon" href="favicon-76.png" sizes="76x76">
<link rel="apple-touch-icon" href="favicon-120.png" sizes="120x120">
<link rel="apple-touch-icon" href="favicon-152.png" sizes="152x152">
<link rel="apple-touch-icon" href="favicon-167.png" sizes="167x167">
<link rel="apple-touch-icon" href="favicon-180.png" sizes="180x180">
<link rel="mask-icon" href="/favicon.svg" color="brown">

<!-- Windows --> 
<meta name="msapplication-config" content="/browserconfig.xml" />
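
If you change the size list, the repetitive link lines can be generated rather than typed. A minimal sketch, assuming the favicon-SIZE.png naming used above; icon_links is a hypothetical helper and its output would still need pasting into the head block by hand.

```bash
#!/usr/bin/env bash
# Sketch only: icon_links is a hypothetical helper.
# Usage: icon_links REL SIZE...
# Prints one <link> line per size, using the favicon-SIZE.png naming.
icon_links() {
  local rel=$1 s
  shift
  for s in "$@"; do
    printf '<link rel="%s" href="favicon-%s.png" sizes="%sx%s">\n' "$rel" "$s" "$s" "$s"
  done
}

icon_links icon 16 32 48 128 192
icon_links apple-touch-icon 76 120 152 167 180
```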

For WordPress integration, you don’t have access to a standard index.html file, and there are crazy redirects happening, so you need to append the code snippet below to your theme’s functions.php file, wrapped around the icon declaration block above (optimally in your child theme unless you’re a theme developer, since it’ll get overwritten on update otherwise):

/* Allows browsers to find favicons */
add_action('wp_head', 'add_favicon');
function add_favicon(){
?>
REPLACE THIS LINE WITH THE BLOCK ABOVE
<?php
};

Then, just for Windows 8 & 10, there’s an xml file to add to your directory (root by default in this example), which has to be named “browserconfig.xml”. Also note you need to select a color for your site:

<?xml version="1.0" encoding="utf-8"?>
<browserconfig>
    <msapplication>
        <tile>
            <square70x70logo src="/favicon-70.png"/>
            <square150x150logo src="/favicon-150.png"/>
            <wide310x150logo src="/favicon-310x150.png"/>
            <square310x310logo src="/favicon-310.png"/>
            <TileColor>#ff8d22</TileColor>
        </tile>
    </msapplication>
</browserconfig>

There’s one more file that’s helpful for mobile compatibility, the Android save-to-desktop file, “manifest.json”.  This requires editing and can’t be pure copy pasta.  Fill in the blanks and select your colors:

{
    "name": "",
    "short_name": "",
    "description": "",
    "start_url": "/?homescreen=1",
    "icons": [
        {
            "src": "/favicon-192.png",
            "sizes": "192x192",
            "type": "image/png"
        },
        {
            "src": "/favicon-512.png",
            "sizes": "512x512",
            "type": "image/png"
        }
    ],
    "theme_color": "#ffffff",
    "background_color": "#ff8d22",
    "display": "standalone"
}
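
Filling in the blanks can also be scripted with a here-document. A minimal sketch; write_manifest is a hypothetical helper, the values in the usage line are placeholders, and the values are inserted as-is, so they must not contain double quotes or backslashes.

```bash
#!/usr/bin/env bash
# Sketch only: write_manifest is a hypothetical helper. Values are inserted
# as-is, so they must not contain double quotes or backslashes.
# Arguments: name short_name description theme_color background_color
write_manifest() {
  cat <<EOF
{
  "name": "$1",
  "short_name": "$2",
  "description": "$3",
  "start_url": "/?homescreen=1",
  "icons": [
    { "src": "/favicon-192.png", "sizes": "192x192", "type": "image/png" },
    { "src": "/favicon-512.png", "sizes": "512x512", "type": "image/png" }
  ],
  "theme_color": "$4",
  "background_color": "$5",
  "display": "standalone"
}
EOF
}

# Placeholder values, for illustration only; redirect to manifest.json to save
write_manifest "Example Site" "Example" "An example site" "#ffffff" "#ff8d22"
```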

Check the icons with this favicon tester (or any other).

Manifest validation: https://manifest-validator.appspot.com/

Posted at 17:26:44 GMT-0700

Category: HowTo, Linux, Self-publishing

Disk Checks for Large Arrays

Friday, August 21, 2015 

If you have a large array of disks attached to your server, which is obviously going to be running FreeBSD or OpenBSD if you care about security, stability, and scalability, there are some tricks for dealing with large numbers of disks (like having 227 4TB disks attached to a single host).

Using Bash (yes there are security issues, but it is powerful)

# for i in `seq 0 227`; do smartctl -t short /dev/da$i; sleep 15; done

executes a short SMART test on all disks [1]. smartctl seems to max out at 32 concurrent tests, so sleep 15 ensures the 3 minute tests are finishing before too many new ones are started. If you’re in a hurry, sleep 5 should do the trick and still ensure all of them execute.
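
The spacing can be sanity-checked with a little arithmetic: a test started every N seconds and running for about 180 seconds means roughly 180/N tests overlap at once. A minimal sketch, taking the 3 minute figure at face value (real short tests may finish sooner); tests_in_flight is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch only: tests_in_flight is a hypothetical helper.
# Given a test duration and the pause between launches (both in seconds),
# print roughly how many tests overlap at once: ceil(duration / pause).
tests_in_flight() {
  local duration=$1 pause=$2
  echo $(( (duration + pause - 1) / pause ))
}

tests_in_flight 180 15   # prints 12
tests_in_flight 180 5    # prints 36
```

By that estimate sleep 15 keeps about 12 tests running, comfortably under 32, while sleep 5 lands near 36 if the tests really take the full 3 minutes, so sleep 6 (about 30 in flight) would be the more conservative floor.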

To get results, try something like:

# for i in `seq 0 227`; do echo "/dev/da$i"; smartctl -a /dev/da$i; sleep .5; done

Bulk Fixes

Problem with the disks – need to clear existing formatting?

Unmount each disk

# for i in `seq 0 227`; do umount -f /dev/da$i; done

Unlock (if needed)

# sysctl kern.geom.debugflags=0x10

Overwrite the start of each disk

# for i in `seq 0 227`; do dd if=/dev/zero of=/dev/da$i bs=1k count=100; done
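
The loops in this section are destructive, so it can be worth printing the expanded commands before running them for real. A minimal sketch; preview_disk_cmds and its DISK placeholder are hypothetical conventions, and the function only prints text.

```bash
#!/usr/bin/env bash
# Sketch only: preview_disk_cmds is a hypothetical helper. It prints one line
# per disk number, replacing the literal word DISK in the template with daN.
# Nothing is executed.
# Arguments: first_number last_number 'command template containing DISK'
preview_disk_cmds() {
  local first=$1 last=$2 template=$3 i
  for ((i = first; i <= last; i++)); do
    printf '%s\n' "${template//DISK/da$i}"
  done
}

preview_disk_cmds 0 2 'dd if=/dev/zero of=/dev/DISK bs=1k count=100'
```

Once the printed list looks right for a small range, the real loop can be run with more confidence.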

Overwrite the end of each disk

# for i in `seq 0 227`; do dd if=/dev/zero of=/dev/da$i bs=1m oseek=`diskinfo da$i | awk '{print int($3 / (1024*1024)) - 4;}'`; done
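
The oseek value is what aims dd at the tail of the disk: with bs=1m, oseek=N skips N one-MiB blocks of output, so the awk expression (whole MiB in the disk, minus 4) starts the write roughly 4 MiB before the end. A minimal sketch of that arithmetic, assuming, as the one-liner implies, that the third field of diskinfo output is the media size in bytes; tail_seek_mib is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch only: tail_seek_mib is a hypothetical helper mirroring the awk
# expression: whole MiB in the disk, minus 4.
tail_seek_mib() {
  local bytes=$1
  echo $(( bytes / (1024 * 1024) - 4 ))
}

tail_seek_mib 10485760        # a 10 MiB "disk": prints 6
tail_seek_mib 4000787030016   # a typical 4 TB disk: prints 3815443
```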

Recreate GPT (for ZFS)

# for i in `seq 0 227`; do gpart create -s gpt /dev/da$i; sleep .5; done

Destroy multipaths

# for i in `seq 1 114`; do gmultipath destroy disk$i; done

Disable multipath completely

# for i in `seq 1 114`; do gmultipath destroy disk$i; done
# gmultipath unload
# mv /boot/kernel-debug/geom_multipath.ko /boot/kernel-debug/geom_multipath.ko.bad
# mv /boot/kernel/geom_multipath.ko /boot/kernel/geom_multipath.ko.bad

Footnotes
1 Thanks Jared
Posted at 12:52:56 GMT-0700

Category: FreeBSD, HowTo, Technology