bash
Smol bash script for finding oversize media files
Sometimes you want to know if you have media files that are taking up more than their fair share of space. Whether you compressed a file some time ago in an old, inefficient format or you just need to archive the oversize stuff, this can help you find 'em. It’s different from plain file size detection in that it uses mediainfo to determine the media file length and a variety of other useful data bits, and wc -c to get the size (so the data rate includes any file overhead), and from those computes the total effective data rate. All math is done with bc, which is usually installed. Files are found recursively (descending into sub-directories) from the starting point (passed as the first argument) using find.
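The arithmetic described above can be sketched on its own. This is a minimal illustration, not the script itself: the function names are hypothetical, awk stands in for bc purely so the snippet has no extra dependency, and kbps is assumed to mean kilobits per second (bytes × 8 / 1000 / seconds). The script as written divides kibibytes by seconds, so its rate column comes out roughly 8 times smaller than this figure.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; awk is used instead of bc here (an assumption for portability).

# Data rate in kilobits per second from size in bytes and duration in milliseconds.
rate_kbps() {
  awk -v size="$1" -v ms="$2" \
    'BEGIN { printf "%.1f\n", (size * 8 / 1000) / (ms / 1000) }'
}

# Bits per pixel: total bits divided by (width * height * seconds * frames per second).
bits_per_pixel() {
  awk -v size="$1" -v w="$2" -v h="$3" -v ms="$4" -v fps="$5" \
    'BEGIN { printf "%.4f\n", (size * 8) / (w * h * (ms / 1000) * fps) }'
}

# Made-up example: a 100,000,000 byte, 1920x1080, 25 fps file that runs 100 seconds.
rate_kbps 100000000 100000                    # prints 8000.0
bits_per_pixel 100000000 1920 1080 100000 25  # prints 0.1543
```

Because the size comes from the whole file, audio and container overhead are counted in both numbers, just as in the script.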
Basic usage would be:
./find-high-rate-media.sh /search/path/tostart/ [min bpp] [min data rate] [min size] > oversize.csv 2>&1
The script will then report media with a rate higher than the minimum and a size larger than the minimum as a tab delimited list of filenames, calculated rates, and calculated size. Redirecting the output to a file, such as oversize.csv above, makes it easy to sort and otherwise manipulate in LibreOffice Calc as a tab delimited file. The values are interpreted as minimums below which output is suppressed, so any file that exceeds all three minimum triggers will be output to the screen (or the .csv file if so redirected).
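If you would rather stay on the command line than open a spreadsheet, the tab delimited output can also be sorted there. A rough sketch, assuming the bits per pixel value is the second column as in the script's header row; sort_by_bpp is a hypothetical name and the sample rows are made up purely to show the call.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list rows from the tab-delimited output, highest bpp first.
sort_by_bpp() {
  local tab
  tab="$(printf '\t')"
  # Keep only data rows (second field numeric), which drops the header and any
  # status lines, then sort numerically, descending, on the second field.
  awk -F '\t' '$2 ~ /^ *[0-9]*[.]?[0-9]+ *$/' "$1" | sort -t "$tab" -k2,2nr
}

# Made-up sample rows, only to demonstrate the call.
sample="$(mktemp)"
printf 'file path\trate bpp\n/a.mkv\t0.30\n/b.mp4\t1.20\n/c.avi\t0.75\n' > "$sample"
sort_by_bpp "$sample"
rm -f "$sample"
```

Changing -k2,2nr to -k3,3nr would sort on the data rate column instead.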
The script takes four command line arguments:
- The starting directory [defaults to the current working directory, i.e. wherever you run the script from]
- The minimum bits per pixel (including audio, sorry) for exclusions (i.e. more bpp and the filename will be output) [defaults to 0.25 bpp]
- The minimum data rate in kbps [defaults to 1 kbps, so by default files are effectively only excluded by bits per pixel]
- The minimum file size in megabytes [defaults to 1 MB, so by default files are effectively only excluded by bits per pixel]
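The same defaults can be written more compactly with bash parameter expansion. This is only a sketch of the idea: parse_args and the variable names are hypothetical, and the script itself uses explicit if/else tests that also print a message for each argument.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: positional arguments with the defaults listed above.
parse_args() {
  local dir="${1:-.}"          # starting directory, default: current directory
  local min_bpp="${2:-0.25}"   # minimum bits per pixel
  local min_kbps="${3:-1}"     # minimum data rate
  local min_mb="${4:-1}"       # minimum file size in MB
  printf '%s\t%s\t%s\t%s\n' "$dir" "$min_bpp" "$min_kbps" "$min_mb"
}

parse_args /Media 0.2   # third and fourth values fall back to 1 and 1
```

The `${n:-default}` form substitutes the default when the argument is missing or empty, which matches the `-z` tests in the script.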
Save the file as a name you like (such as find-high-rate-media.sh), make it executable with chmod +x find-high-rate-media.sh, and run it to find your oversized media.
#!/usr/bin/bash
############################# USE #######################################################
# This creates a tab-delimited CSV file of recursive directories of media files enumerating
# key compression parameters. Note bits per pixel includes audio, somewhat necessarily given
# the simplicity of the analysis. This can throw off the calculation.
# find_media.sh /starting/path/ [min bits per pixel] [min data rate] [min file size mb]
# /find-high-rate-media.sh /Media 0.2 400 0 > /recomp.csv 2>&1
# The "find" command will traverse the file system from the starting path down.
# If output isn't directed to a CSV file, it will be written to screen. If directed to CSV
# this will generate a tab delimited csv file with key information about all found media files.
# The extensions supported can be extended if the list isn't complete, but verify that the
# format is parsable by the tools called for extracting media information - mostly mediainfo
# Typical bits per pixel range from 0.015 for a highly compressed HEVC file at the edge of obvious
# degradation to quite a bit higher. Raw would be 24 or even 30 bits per pixel for 10bit raw.
# Uncompressed YUV video is about 12 bpp.
# This can be useful for finding under and/or overcompressed video files.
# The program will suppress output if the file's bits per pixel is below the supplied threshold;
# to reverse this invert the rate test to " if (( $(bc <<<"$bps < $maxr") )); then..."
# If a min data rate is supplied, output will be suppressed for files with a lower data rate.
# If a min file size is supplied, output will be suppressed for files smaller than this size.
########################################################################################

# No argument given?
if [ -z "$1" ]; then
  printf "\nUsage:\n starting by default in the current directory and searching recursively \n"
  dir="$(pwd)"
else
  dir="$1"
  echo -e "starting in $dir"
fi

if [ -z "$2" ]; then
  printf "\nUsage:\n returning files with bits per pixel greater than default max of .25 bpp \n"
  maxr=0.25
else
  maxr=$2
  echo -e "returning files with bits per pixel greater than $maxr bpp"
fi

if [ -z "$3" ]; then
  printf "\nUsage:\n returning files with data rate greater than default max of 1 kbps \n"
  maxdr=1
else
  maxdr=$3
  echo -e "returning files with data rate greater than $maxdr kbps"
fi

if [ -z "$4" ]; then
  printf "\nUsage:\n no min file size provided returning files larger than 1MB \n"
  maxs=1
else
  maxs=$4
  echo -e "returning files with file size greater than $maxs MB \n\n"
fi

msec="1000"
kilo="1024"
reint='^[0-9]+$'              # integer check
refp='^[0-9]+([.][0-9]+)?$'   # integer or decimal check

echo -e "file path \t rate bpp \t rate kbps \t V CODEC \t A CODEC \t Frame Size \t FPS \t Runtime \t size MB"

# Every -iname after the first needs its own -o; two tests side by side are ANDed,
# which would silently drop both extensions.
find "$dir" -type f \( -iname \*.avi -o -iname \*.mkv -o -iname \*.mp4 -o -iname \*.wmv -o -iname \*.m4v \) -print0 | while IFS= read -rd $'\0' file
do
  if [[ -f "$file" ]]; then
    bps="0.1"
    size="$(wc -c "$file" | awk '{print $1}')"
    duration="$(mediainfo --Inform="Video;%Duration%" "$file")"
    # fall back to 1 second if mediainfo returns nothing usable
    if ! [[ $duration =~ $refp ]]; then
      duration=$msec
    fi
    seconds=$(bc -l <<<"${duration}/${msec}")
    sizek=$(bc -l <<<"scale=1; ${size}/${kilo}")
    sizem=$(bc -l <<<"scale=1; ${sizek}/${kilo}")
    # NOTE: as computed this is kibibytes per second (bytes/1024/seconds),
    # not kilobits; multiply by 8 if you want a bit rate.
    rate=$(bc -l <<<"scale=1; ${sizek}/${seconds}")
    codec="$(mediainfo --Inform="Video;%Format%" "$file")"
    audio="$(mediainfo --Inform="Audio;%Format%" "$file")"
    framerate="$(mediainfo --Inform="General;%FrameRate%" "$file")"
    if ! [[ $framerate =~ $refp ]]; then
      framerate=100
    fi
    rtime="$(mediainfo --Inform="General;%Duration/String3%" "$file")"
    width="$(mediainfo --Inform="Video;%Width%" "$file")"
    if ! [[ $width =~ $reint ]]; then
      width=1
    fi
    height="$(mediainfo --Inform="Video;%Height%" "$file")"
    if ! [[ $height =~ $reint ]]; then
      height=1
    fi
    # total pixels shown over the whole runtime, then bits per pixel
    pixels=$(bc -l <<<"scale=1; ${width}*${height}*${seconds}*${framerate}")
    bps=$(bc -l <<<"scale=4; ${size}*8/${pixels}")
    # report only files that exceed all three thresholds
    if (( $(bc -l <<<"$bps > $maxr") )); then
      if (( $(bc -l <<<"$sizem > $maxs") )); then
        if (( $(bc -l <<<"$rate > $maxdr") )); then
          echo -e "$file" "\t" $bps "\t" $rate "\t" $codec "\t" $audio "\t" $width"x"$height "\t" $framerate "\t" $rtime "\t" $sizem
        fi
      fi
    fi
  fi
done
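Results are a header row followed by one tab delimited row per matching file, with the columns: file path, rate bpp, rate kbps, V CODEC, A CODEC, Frame Size, FPS, Runtime, and size MB.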
Another common task is renaming video files with some key stats on the contents so they’re easier to find and compare. Linux has limited integration with media information (dolphin is somewhat capable, but thunar not so much). This little script also leans on the mediainfo command line to append the following to the file name of media files recursively found below a starting directory path:
- WidthxHeight in pixels (e.g. 1920×1080)
- Runtime in HH-MM-SS.msec (e.g. 02-38-15.111) (colons aren’t a good thing in filenames, yah, it is confusingly like a date)
- CODEC name (e.g. AVC)
- Datarate (e.g. 1323kbps)
For example
kittyplay.mp4 -> kittyplay_1280x682_02-38-15.111_AVC_154.3kbps.mp4
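The name is assembled with ordinary bash parameter expansion. A small sketch of just that step, using a hypothetical build_name function and the values from the example above rather than live mediainfo output:

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the new name from already-extracted values.
build_name() {
  local file="$1" width="$2" height="$3" rtime="$4" codec="$5" rate="$6"
  local base="${file%.*}"         # everything before the last dot
  local ext="${file##*.}"         # extension after the last dot
  local runtime="${rtime//:/-}"   # swap colons for dashes, colons are awkward in filenames
  printf '%s_%sx%s_%s_%s_%skbps.%s\n' \
    "$base" "$width" "$height" "$runtime" "$codec" "$rate" "$ext"
}

build_name kittyplay.mp4 1280 682 02:38:15.111 AVC 154.3
# prints kittyplay_1280x682_02-38-15.111_AVC_154.3kbps.mp4
```

Printing the proposed name first, as here, is also a cheap dry run before letting mv loose on a whole directory tree.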
The code is also available here.
#!/usr/bin/bash
PATH="/home/gessel/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
############################# USE #######################################################
# find_media.sh /starting/path/ (quote path names with spaces)
########################################################################################

# No argument given?
if [ -z "$1" ]; then
  printf "\nUsage:\n pass a starting point like \"/Downloads/Media files/\" \n"
  exit 1
fi

msec="1000"
kilo="1024"
s="_"
x="x"
kbps="kbps"
dot="."

find "$1" -type f \( -iname \*.avi -o -iname \*.mkv -o -iname \*.mp4 -o -iname \*.wmv \) -print0 | while IFS= read -rd $'\0' file
do
  if [[ -f "$file" ]]; then
    size="$(wc -c "$file" | awk '{print $1}')"
    duration="$(mediainfo --Inform="Video;%Duration%" "$file")"
    seconds=$(bc -l <<<"${duration}/${msec}")
    sizek=$(bc -l <<<"scale=1; ${size}/${kilo}")
    sizem=$(bc -l <<<"scale=1; ${sizek}/${kilo}")
    # NOTE: as computed this is kibibytes per second (bytes/1024/seconds),
    # not kilobits; multiply by 8 if you want a bit rate.
    rate=$(bc -l <<<"scale=1; ${sizek}/${seconds}")
    codec="$(mediainfo --Inform="Video;%Format%" "$file")"
    framerate="$(mediainfo --Inform="General;%FrameRate%" "$file")"
    rtime="$(mediainfo --Inform="General;%Duration/String3%" "$file")"
    # colons are trouble in filenames, so 02:38:15.111 becomes 02-38-15.111
    runtime="${rtime//:/-}"
    width="$(mediainfo --Inform="Video;%Width%" "$file")"
    height="$(mediainfo --Inform="Video;%Height%" "$file")"
    fname="${file%.*}"   # path without the extension
    ext="${file##*.}"    # extension only
    # plain mv; wrapping it in $( ) would try to run mv's output as a command
    mv -- "$file" "$fname$s$width$x$height$s$runtime$s$codec$s$rate$kbps$dot$ext"
  fi
done
If you don’t have mediainfo installed:
sudo apt update
sudo apt install mediainfo
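Both scripts quietly assume a handful of command line tools are present. A quick pre-flight check can be sketched like this; missing_tools is a hypothetical helper, and the tool list is simply the commands the scripts above call.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the name of every listed tool that is not on the PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing="$(missing_tools mediainfo bc find awk wc)"
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing >&2
fi
```

`command -v` is a shell builtin, so the check itself needs nothing installed.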
Favicon generation script
Favicons are a useful (and fun) part of the browsing experience. They once were simple – just an .ico file of the right size in the root directory. Then things got weird and computing stopped assuming an approximate standard ppi for displays, starting with mobile and “retina” displays. The obvious answer would be .svg favicons, but, wouldn’t’ya know, Apple doesn’t support them (neither does Firefox mobile) so for a few more iterations, it still makes sense to generate an array of sizes with code to select the right one. This little tool pretty much automates that from a starting .svg file.
There are plenty of good favicon scripts and tools on the interwebs. I was playing around with .svg sources for favicons and found it a bit tedious to generate the sizes considered important for current (2020-ish) browsing happiness. I found a good start at stackexchange by @gary, though the sizes weren’t the currently recommended ones (per this github project). Your needs may vary, but it is easy enough to edit.
The script relies on the following wonderful FOSS tools:
- Inkscape to handle svg to png conversion
- Pngquant for png file optimization
- Imagemagick for conversion to .ico
These are available in most distros (software manager had them in Mint 19).
Note that my version leaves the format as .png: the optimized png will be many times smaller than the .ico format and png works for everything except IE<11, which nobody should be using anyway. The favicon.ico generated is 16, 32, and 48 pixels in 3 different layers from the 512×512 pixel version.
The command line options for inkscape changed a bit; the bash script below has been updated to reflect the current ones.
Note: @Chewie9999 commented on https://github.com/mastodon/mastodon/issues/7396 that for Mastodon, the sizes needed would be generated with the following:
size=(16 32 36 48 57 60 72 76 96 114 120 144 152 167 180 192 256 310 384 512 1024)
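Before swapping a different size list into the script, it can help to preview which files it would produce. A minimal dry-run sketch, assuming the favicon-N.png naming the script uses; planned_icons is a hypothetical name and nothing here calls inkscape.

```bash
#!/usr/bin/env bash
# Hypothetical dry run: print the PNG paths a given size list would generate.
planned_icons() {
  local out="$1"
  shift
  local s
  for s in "$@"; do
    printf '%s/favicon-%s.png\n' "$out" "$s"
  done
}

# The Mastodon size list quoted above
size=(16 32 36 48 57 60 72 76 96 114 120 144 152 167 180 192 256 310 384 512 1024)
planned_icons ./favicon "${size[@]}"
```

Quoting the array as "${size[@]}" passes each size as its own argument, which is the same expansion the generation loop relies on.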
The code below can be saved as a bash file; set the execution bit, call it as ./favicon file.svg, and off you go:
#!/bin/bash
# print each command as it runs (-x) and stop on the first error (-e)
set -ex
# collect the file name you entered on the command line (file.svg)
svg=$1
# set the sizes to be generated (plus 310x150 for msft)
size=(16 32 48 70 76 120 128 150 152 167 180 192 310 512)
# set the write directory as a favicon directory below current
out="$(pwd)"
out+="/favicon"
mkdir -p "$out"
echo Making bitmaps from your svg...
for i in "${size[@]}"; do
  inkscape -o "$out/favicon-$i.png" -w "$i" -h "$i" "$svg"
done
# Microsoft wide icon (annoying, probably going away)
inkscape -o "$out/favicon-310x150.png" -w 310 -h 150 "$svg"
echo Compressing...
for f in "$out"/*.png; do
  pngquant -f --ext .png "$f" --posterize 4 --speed 1
done
echo Creating favicon
convert "$out/favicon-512.png" -define icon:auto-resize=48,32,16 "$out/favicon.ico"
echo Done
Copy the .png files and .ico file generated above as well as the original .svg file into your root directory (or, if in a sub-directory, add the path below), editing the “color” of the Safari pinned tab mask icon. You might also want to make a monochrome version of the .svg file and reference that as the “mask-icon” instead, it will probably look better, but that’s more work.
The following goes inside the head section of your index.html to load the correct sizes as needed (delete the lines for Microsoft’s browserconfig.xml file and/or Android’s manifest file if not needed).
<!-- basic svg -->
<link rel="icon" type="image/svg+xml" href="/favicon.svg">
<!-- generics -->
<link rel="icon" href="favicon-16.png" sizes="16x16">
<link rel="icon" href="favicon-32.png" sizes="32x32">
<link rel="icon" href="favicon-48.png" sizes="48x48">
<link rel="icon" href="favicon-128.png" sizes="128x128">
<link rel="icon" href="favicon-192.png" sizes="192x192">
<!-- .ico files -->
<link rel="icon" href="/favicon.ico" type="image/x-icon" />
<link rel="shortcut icon" href="/favicon.ico" type="image/x-icon" />
<!-- Android -->
<link rel="shortcut icon" href="favicon-192.png" sizes="192x192">
<link rel="manifest" href="manifest.json" />
<!-- iOS -->
<link rel="apple-touch-icon" href="favicon-76.png" sizes="76x76">
<link rel="apple-touch-icon" href="favicon-120.png" sizes="120x120">
<link rel="apple-touch-icon" href="favicon-152.png" sizes="152x152">
<link rel="apple-touch-icon" href="favicon-167.png" sizes="167x167">
<link rel="apple-touch-icon" href="favicon-180.png" sizes="180x180">
<link rel="mask-icon" href="/favicon.svg" color="brown">
<!-- Windows -->
<meta name="msapplication-config" content="/browserconfig.xml" />
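If you change the size list, the repetitive link lines can be generated rather than retyped. A rough sketch, assuming the same favicon-N.png naming and relative hrefs as the block above; icon_links is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: emit one <link> line per size for a given rel value.
icon_links() {
  local rel="$1"
  shift
  local s
  for s in "$@"; do
    printf '<link rel="%s" href="favicon-%s.png" sizes="%sx%s">\n' "$rel" "$s" "$s" "$s"
  done
}

icon_links icon 16 32 48 128 192
icon_links apple-touch-icon 76 120 152 167 180
```

The output can be pasted into the head section in place of the hand-written generic and iOS lines.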
For WordPress integration, you don’t have access to a standard index.html file, and there are crazy redirects happening, so you need to append the code snippet below, wrapped around the above icon declaration block, to your theme’s functions.php file (optimally your child theme’s, unless you’re a theme developer, since it’ll get overwritten on update otherwise):
/* Allows browsers to find favicons */
add_action('wp_head', 'add_favicon');
function add_favicon(){
?>
REPLACE THIS LINE WITH THE BLOCK ABOVE
<?php
};
Then, just for Windows 8 & 10, there’s an xml file to add to your directory (root by default in this example), which has to be named “browserconfig.xml”. Also note you need to select a tile color for your site.
<?xml version="1.0" encoding="utf-8"?>
<browserconfig>
  <msapplication>
    <tile>
      <square70x70logo src="/favicon-70.png"/>
      <square150x150logo src="/favicon-150.png"/>
      <wide310x150logo src="/favicon-310x150.png"/>
      <square310x310logo src="/favicon-310.png"/>
      <TileColor>#ff8d22</TileColor>
    </tile>
  </msapplication>
</browserconfig>
There’s one more file that’s helpful for mobile compatibility, the Android save-to-desktop file, “manifest.json”. This requires editing and can’t be pure copy pasta. Fill in the blanks and select your colors:
{
  "name": "",
  "short_name": "",
  "description": "",
  "start_url": "/?homescreen=1",
  "icons": [
    {
      "src": "/favicon-192.png",
      "sizes": "192x192",
      "type": "image/png"
    },
    {
      "src": "/favicon-512.png",
      "sizes": "512x512",
      "type": "image/png"
    }
  ],
  "theme_color": "#ffffff",
  "background_color": "#ff8d22",
  "display": "standalone"
}
Check the icons with this favicon tester (or any other).
Manifest validation: https://manifest-validator.appspot.com/
Disk Checks for Large Arrays
If you have a large array of disks attached to your server, which is obviously going to be running FreeBSD or OpenBSD if you care about security, stability, and scalability, there are some tricks for dealing with large numbers of disks (like having 227 4TB disks attached to a single host).
Using Bash (yes there are security issues, but it is powerful)
# for i in `seq 0 227`; do smartctl -t short /dev/da$i; sleep 15; done
This executes a short SMART test on all disks [1]. smartctl seems to max out at 32 concurrent tests, so sleep 15 ensures the 3 minute tests are finishing before new ones are executed. If you’re in a hurry, sleep 5 should do the trick and ensure all of them execute.
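The sleep value is really a trade against that concurrency ceiling, which a little arithmetic makes visible. This sketch assumes every test runs for the full three minutes, which is a worst case; tests that finish sooner leave more headroom. max_overlap is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: how many tests overlap if each takes test_secs seconds
# and a new one is started every gap_secs seconds (ceiling division).
max_overlap() {
  local test_secs="$1" gap_secs="$2"
  echo $(( (test_secs + gap_secs - 1) / gap_secs ))
}

max_overlap 180 15   # prints 12
max_overlap 180 5    # prints 36
```

Under that worst-case assumption a 15 second gap stays well under 32 concurrent tests, while a 5 second gap only fits if the tests complete in somewhat less than three minutes.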
To get results, try something like:
# for i in `seq 0 227`; do echo "/dev/da$i"; smartctl -a /dev/da$i; sleep .5; done
Bulk Fixes
Problem with the disks – need to clear existing formatting?
Unmount each disk
# for i in `seq 0 227`; do umount -f /dev/da$i; done
Unlock (if needed)
# sysctl kern.geom.debugflags=0x10
Overwrite the start of each disk
# for i in `seq 0 227`; do dd if=/dev/zero of=/dev/da$i bs=1k count=100; done
Overwrite the end of each disk
# for i in `seq 0 227`; do dd if=/dev/zero of=/dev/da$i bs=1m oseek=`diskinfo da$i | awk '{print int($3 / (1024*1024)) - 4;}'`; done
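The backtick expression in that last command is the only non-obvious part. As I read it, it takes the media size in bytes from the third field of diskinfo's output, converts it to whole mebibytes, and backs off 4, so that dd with bs=1m starts writing 4 MiB before the end of the disk. A standalone sketch of just the arithmetic, with a hypothetical tail_oseek helper and a typical 4 TB drive size used as an assumed example value:

```bash
#!/usr/bin/env bash
# Hypothetical helper: given a disk size in bytes, print the oseek value
# (in 1 MiB blocks) that lands 4 MiB before the end of the disk.
tail_oseek() {
  awk -v bytes="$1" 'BEGIN { print int(bytes / (1024 * 1024)) - 4 }'
}

tail_oseek 10485760        # 10 MiB disk, prints 6
tail_oseek 4000787030016   # assumed size of a typical 4 TB drive, prints 3815443
```

No count is given in the original command, so dd simply runs until it hits the end of the device.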
Recreate GPT (for ZFS)
# for i in `seq 0 227`; do gpart create -s gpt /dev/da$i; sleep .5; done
Destroy multipaths
# for i in `seq 1 114`; do gmultipath destroy disk$i; done
Disable multipath completely
# for i in `seq 1 114`; do gmultipath destroy disk$i; done
# gmultipath unload
# mv /boot/kernel-debug/geom_multipath.ko /boot/kernel-debug/geom_multipath.ko.bad
# mv /boot/kernel/geom_multipath.ko /boot/kernel/geom_multipath.ko.bad
Footnotes
[1] Thanks Jared