video
Smol bash script for finding oversize media files
Sometimes you want to know if you have media files that are taking up more than their fair share of space. Maybe you compressed a file some time ago in an old, inefficient format, or you just need to archive the oversize stuff; either way, this can help you find ’em. It’s different from plain file size detection in that it uses mediainfo to determine the media file length and a variety of other useful data bits, and wc -c to get the size (so the data rate includes any file overhead), and from those computes the total effective data rate. All math is done with bc, which is usually installed. Files are found recursively (descending into sub-directories) from the starting point (passed as the first argument) using find.
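The data rate arithmetic described above can be sketched on its own. This is a minimal sketch, not the script itself: data_rate_kbps is a hypothetical helper, it uses awk rather than bc for the math, and it reports kilobits per second (bytes × 8 / 1000 / seconds), whereas the script further down divides the byte count by 1024 and by the duration without the factor of 8.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script): effective data rate
# in kilobits per second from a size in bytes and a duration in milliseconds.
data_rate_kbps() {
    local size_bytes=$1 duration_ms=$2
    awk -v s="$size_bytes" -v d="$duration_ms" \
        'BEGIN { if (d <= 0) exit 1; printf "%.1f\n", (s * 8 / 1000) / (d / 1000) }'
}

# Example: a 450,000,000 byte file that runs 3,600,000 ms (one hour).
data_rate_kbps 450000000 3600000
# prints 1000.0
```

On a real file the two inputs would come from the same tools the script uses, for example data_rate_kbps "$(wc -c < file.mp4)" "$(mediainfo --Inform="Video;%Duration%" file.mp4)".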
Basic usage would be:
./find-high-rate-media.sh /search/path/tostart/ [min bpp] [min data rate] [min size] > oversize.csv 2>&1
The script will then report media with a rate higher than the minimum and a size larger than the minimum as a tab-delimited list of file names, calculated rates, and calculated sizes. Redirecting the output to a file, such as oversize.csv, makes it easy to sort and otherwise manipulate in LibreOffice Calc as a tab-delimited file. Each value is a minimum below which output is suppressed, so any file that exceeds all three minimums will be output to the screen (or to the .csv file if so redirected).
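If you would rather stay in the terminal than open LibreOffice, the tab-delimited output can be sorted directly. A rough sketch, assuming the bits per pixel value is in the second column as in the script's header row; top_by_bpp is a hypothetical name, and awk is used only to filter rows.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the N data rows with the highest bits per pixel
# (column 2) from the script's tab-delimited output.
top_by_bpp() {
    local report=$1 count=${2:-10}
    # Keep only rows whose second tab-separated field is a positive number;
    # this drops the header row and any status lines captured by 2>&1.
    awk -F '\t' '$2 + 0 > 0' "$report" |
        sort -t $'\t' -k 2,2nr |
        head -n "$count"
}
```

Something like top_by_bpp oversize.csv 20 would then list the twenty worst offenders, highest bits per pixel first.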
The script takes four command line arguments:
- The starting directory [defaults to the directory the script is executed in]
- The minimum bits per pixel (including audio, sorry) for exclusions (i.e. more bpp and the filename will be output) [defaults to 0.25 bpp]
- The minimum data rate in kbps [defaults to 1 kbps, so by default files are effectively only excluded by bits per pixel]
- The minimum file size in megabytes [defaults to 1 MB, so by default files are effectively only excluded by bits per pixel]
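The bits per pixel figure and the three-threshold test can be illustrated in isolation. This is a sketch under assumptions: bits_per_pixel and exceeds_all are hypothetical helpers, awk stands in for bc, and the defaults simply mirror the ones listed above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the bits-per-pixel figure and the
# "exceeds all three minimums" test; awk stands in for bc here.

# bits per pixel = file bits / (width * height * seconds * frames per second)
bits_per_pixel() {
    local size_bytes=$1 width=$2 height=$3 seconds=$4 fps=$5
    awk -v s="$size_bytes" -v w="$width" -v h="$height" -v t="$seconds" -v f="$fps" \
        'BEGIN { px = w * h * t * f; if (px <= 0) exit 1; printf "%.4f\n", s * 8 / px }'
}

# Succeeds (exit status 0) only when all three values exceed their minimums.
# Minimums default to 0.25 bpp, 1 (data rate), and 1 MB.
exceeds_all() {
    local bpp=$1 rate=$2 size_mb=$3
    local min_bpp=${4:-0.25} min_rate=${5:-1} min_mb=${6:-1}
    awk -v b="$bpp" -v r="$rate" -v m="$size_mb" \
        -v mb="$min_bpp" -v mr="$min_rate" -v mm="$min_mb" \
        'BEGIN { exit !(b > mb && r > mr && m > mm) }'
}

# Example: 34,560,000 bytes of 1280x720 video, 60 seconds at 25 fps.
bits_per_pixel 34560000 1280 720 60 25
# prints 0.2000
```

With the default minimums, exceeds_all 0.2000 4500 33 fails because 0.2 bpp is below 0.25, so that file would not be listed; passing 0.1 as the fourth argument would let it through.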
Save the file under a name you like (such as find-high-rate-media.sh), make it executable with chmod +x find-high-rate-media.sh, and run it to find your oversized media.
#!/usr/bin/bash
############################# USE #######################################################
# This creates a tab-delimited CSV file of recursive directories of media files enumerating
# key compression parameters. Note bits per pixel includes audio, somewhat necessarily given
# the simplicity of the analysis. This can throw off the calculation.
# find_media.sh /starting/path/ [min bits per pixel] [min data rate] [min file size mb]
# /find-high-rate-media.sh /Media 0.2 400 0 > /recomp.csv 2>&1
# The "find" command will traverse the file system from the starting path down.
# if output isn't directed to a CSV file, it will be written to screen. If directed to CSV
# this will generate a tab-delimited csv file with key information about all found media files
# the extensions supported can be extended if it isn't complete, but verify that the
# format is parsable by the tools called for extracting media information - mostly mediainfo
# Typical bits per pixel range from 0.015 for a HEVC highly compressed file at the edge of obvious
# degradation to quite a bit higher. Raw would be 24 or even 30 bits per pixel for 10bit raw.
# Uncompressed YUV video is about 12 bpp.
# this can be useful for finding under and/or overcompressed video files
# the program will suppress output if the file's bits per pixel is below the supplied threshold
# to reverse this invert the rate test to " if (( $(bc <<<"$bps < $maxr") )); then..."
# if a min data rate is supplied, output will be suppressed for files with a lower data rate
# if a min file size is supplied, output will be suppressed for files smaller than this size
########################################################################################

# No argument given?
if [ -z "$1" ]; then
  printf "\nUsage:\n starting by default in the current directory and searching recursively \n"
  dir="$(pwd)"
else
  dir="$1"
  echo -e "starting in " "$dir" ""
fi

if [ -z "$2" ]; then
  printf "\nUsage:\n returning files with bits per pixel greater than default max of .25 bpp \n"
  maxr=0.25
else
  maxr=$2
  echo -e "returning files with bits per pixel greater than " $maxr " bpp"
fi

if [ -z "$3" ]; then
  printf "\nUsage:\n returning files with data rate greater than default max of 1 kbps \n"
  maxdr=1
else
  maxdr=$3
  echo -e "returning files with data rate greater than " $maxdr " kbps"
fi

if [ -z "$4" ]; then
  printf "\nUsage:\n no min file size provided returning files larger than 1MB \n"
  maxs=1
else
  maxs=$4
  echo -e "returning files with file size greater than " $maxs " MB \n\n"
fi

msec="1000"
kilo="1024"
reint='^[0-9]+$'
refp='^[0-9]+([.][0-9]+)?$'

# Header row
echo -e "file path \t rate bpp \t rate kbps \t V CODEC \t A CODEC \t Frame Size \t FPS \t Runtime \t size MB"

# Each extension test needs its own -o, including the last one (.m4v)
find "$dir" -type f \( -iname \*.avi -o -iname \*.mkv -o -iname \*.mp4 -o -iname \*.wmv -o -iname \*.m4v \) -print0 | while IFS= read -r -d '' file
do
  if [[ -f "$file" ]]; then
    bps="0.1"
    size="$(wc -c "$file" | awk '{print $1}')"
    duration="$(mediainfo --Inform="Video;%Duration%" "$file")"
    # Fall back to 1 second if mediainfo returns nothing numeric
    if ! [[ $duration =~ $refp ]] ; then
      duration=$msec
    fi
    seconds=$(bc -l <<<"${duration}/${msec}")
    sizek=$(bc -l <<<"scale=1; ${size}/${kilo}")
    sizem=$(bc -l <<<"scale=1; ${sizek}/${kilo}")
    # Note: bytes/1024/seconds, so despite the kbps label this is KiB per second
    rate=$(bc -l <<<"scale=1; ${sizek}/${seconds}")
    codec="$(mediainfo --Inform="Video;%Format%" "$file")"
    audio="$(mediainfo --Inform="Audio;%Format%" "$file")"
    framerate="$(mediainfo --Inform="General;%FrameRate%" "$file")"
    if ! [[ $framerate =~ $refp ]] ; then
      framerate=100
    fi
    rtime="$(mediainfo --Inform="General;%Duration/String3%" "$file")"
    width="$(mediainfo --Inform="Video;%Width%" "$file")"
    if ! [[ $width =~ $reint ]] ; then
      width=1
    fi
    height="$(mediainfo --Inform="Video;%Height%" "$file")"
    if ! [[ $height =~ $reint ]] ; then
      height=1
    fi
    # Total pixels over the whole runtime, then bits per pixel (held in $bps)
    pixels=$(bc -l <<<"scale=1; ${width}*${height}*${seconds}*${framerate}")
    bps=$(bc -l <<<"scale=4; ${size}*8/${pixels}")
    # Only report files that exceed all three minimums
    if (( $(bc -l <<<"$bps > $maxr") )); then
      if (( $(bc -l <<<"$sizem > $maxs") )); then
        if (( $(bc -l <<<"$rate > $maxdr") )); then
          echo -e "$file" "\t" $bps "\t" $rate "\t" $codec "\t" $audio "\t" $width"x"$height "\t" $framerate "\t" $rtime "\t" $sizem
        fi
      fi
    fi
  fi
done
Results might look like:
Another common task is renaming video files with some key stats on the contents so they’re easier to find and compare. Linux file managers have limited integration with media information (Dolphin is somewhat capable, but Thunar not so much). This little script also leans on the mediainfo command line tool to append the following to the file names of media files found recursively below a starting directory path:
- WidthxHeight in pixels (e.g. 1920×1080)
- Runtime in HH-MM-SS.msec (e.g. 02-38-15.111) (colons aren’t a good thing in filenames, yah, it is confusingly like a date)
- CODEC name (e.g. AVC)
- Datarate (e.g. 1323kbps)
For example
kittyplay.mp4 -> kittyplay_1280x682_02-38-15.111_AVC_154.3kbps.mp4
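The name assembly itself is just bash parameter expansion. A minimal sketch, assuming the values have already been pulled from mediainfo; annotated_name is a hypothetical helper that only prints the new name and renames nothing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the annotated file name from already-extracted values.
annotated_name() {
    local path=$1 width=$2 height=$3 runtime=$4 codec=$5 rate=$6
    local stem=${path%.*}                # everything before the last dot
    local ext=${path##*.}                # everything after the last dot
    local safe_runtime=${runtime//:/-}   # colons become hyphens
    printf '%s_%sx%s_%s_%s_%skbps.%s\n' \
        "$stem" "$width" "$height" "$safe_runtime" "$codec" "$rate" "$ext"
}

annotated_name kittyplay.mp4 1280 682 02:38:15.111 AVC 154.3
# prints kittyplay_1280x682_02-38-15.111_AVC_154.3kbps.mp4
```

Printing the name first makes for an easy dry run before committing to mv. Note that a file with no extension in a directory whose name contains a dot would be split in the wrong place by these expansions.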
The code is also available here.
#!/usr/bin/bash
PATH="/home/gessel/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
############################# USE #######################################################
# find_media.sh /starting/path/ (quote path names with spaces)
########################################################################################

# No argument given?
if [ -z "$1" ]; then
  printf "\nUsage:\n pass a starting point like \"/Downloads/Media files/\" \n"
  exit 1
fi

msec="1000"
kilo="1024"
s="_"
x="x"
kbps="kbps"
dot="."

find "$1" -type f \( -iname \*.avi -o -iname \*.mkv -o -iname \*.mp4 -o -iname \*.wmv \) -print0 | while IFS= read -r -d '' file
do
  if [[ -f "$file" ]]; then
    size="$(wc -c "$file" | awk '{print $1}')"
    duration="$(mediainfo --Inform="Video;%Duration%" "$file")"
    # Skip files mediainfo can't give a numeric duration for, rather than
    # renaming them with empty or broken stats
    if ! [[ $duration =~ ^[0-9]+([.][0-9]+)?$ ]]; then
      continue
    fi
    seconds=$(bc -l <<<"${duration}/${msec}")
    sizek=$(bc -l <<<"scale=1; ${size}/${kilo}")
    sizem=$(bc -l <<<"scale=1; ${sizek}/${kilo}")
    # Note: bytes/1024/seconds, so despite the kbps label this is KiB per second
    rate=$(bc -l <<<"scale=1; ${sizek}/${seconds}")
    codec="$(mediainfo --Inform="Video;%Format%" "$file")"
    framerate="$(mediainfo --Inform="General;%FrameRate%" "$file")"
    rtime="$(mediainfo --Inform="General;%Duration/String3%" "$file")"
    # Colons aren't filename friendly, so swap them for hyphens
    runtime="${rtime//:/-}"
    width="$(mediainfo --Inform="Video;%Width%" "$file")"
    height="$(mediainfo --Inform="Video;%Height%" "$file")"
    fname="${file%.*}"
    ext="${file##*.}"
    # Run mv directly; wrapping it in $( ) would try to execute its (empty) output
    mv "$file" "$fname$s$width$x$height$s$runtime$s$codec$s$rate$kbps$dot$ext"
  fi
done
If you don’t have mediainfo installed:
sudo apt update
sudo apt install mediainfo
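A quick way to check for the tools both scripts call before running them; a sketch only, with missing_tools as a hypothetical helper built on the shell's command -v.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of each listed command that is not on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The external commands the two scripts above rely on.
missing=$(missing_tools mediainfo bc find wc awk)
if [ -n "$missing" ]; then
    echo "missing tools:"
    echo "$missing"
fi
```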
South Lake Tahoe Caldor Fire Timelapse
Sentinel Hub Playground is an excellent resource for near real time satellite images, albeit not quite at Google Earth’s 1m resolution. One of the cool features is being able to adjust the mapping of the satellite bands to RGB outputs. For example, take Sentinel-2 L2A image data of South Lake Tahoe between 2021-08-17 and 2021-09-01 and remap 2190nm (SWIR2) to red, which tends to highlight fires though it isn’t thermal; 783nm to green, a vegetation band (though it is in the NIR, so not visible to humans), to make vegetation cover more obvious; and 443nm to blue instead of 490nm, as shorter wavelengths tend to be scattered more by aerosols and smoke. With this mapping the fire line (bright red) and smoke (obvs) are very visible, while vegetation is (false) green. Burnt earth shows as dark red, compared to bare ground, which tends to show tan in this mapping. The result reveals the current line of fire, the recently burned areas, the wind direction carrying smoke (which tends to correlate with the advancing line), and the fuel (vegetation) still standing.
Then, using the history controller to generate and save a sequence of stills, we can animate the progress of the fire with a simple ffmpeg command:
ffmpeg -framerate 1 -pattern_type glob -i '*.jpg' -vf crop=1754:1146 -c:v libx264 -r 30 -pix_fmt yuv420p fire.mp4
and you get:
Kitty Poop (1995)
Many years ago (21 years, 9 months as of this post), I used some as-of-then only slightly out of date equipment to record a one week time lapse of the cats’ litter box.
I found the video on a CD-ROM (remember those?) and thought I’d see if it was still usable. It wasn’t – QuickTime had abandoned support for most of the 1990s-era codecs, and as it was pre-internet, there just wasn’t any support any more. I had to fire up my old Mac 9500, which booted just fine after years of sitting, even if most of the rubber feet on the peripherals had long since turned to goo. The OS 9 version of QT let me resave as uncompressed, which of course was way too big for the massive dual 9GB drives in that machine. YouTube would eat the uncompressed format and this critical archival record is preserved for a little longer.
Time lapse of the litter box. Shot in Sept, 1995 in San Francisco, CA. Captured with a RasterOps ColorBoard 364 Nubus card from a Sony XC-999 on a Mac IIfx.
Visiting the Burj Khalifa
Dubai is an interesting contrast to Iraq. The first time I went through DXB from BSR it was more than a little culture shock. Getting out of the airport only amplifies the experience.
Jared and I had dinner at the Mall of Dubai and before eating had a little walk around the fountains – the largest dancing fountains in the world at the foot of the tallest man-made structure in the world.
Dubai is a good place to spot cars. Obviously the gold accented Rolls is more pose-worthy than the $450k GTO. Then again, they were probably posing with the license plate number, which I think was 1, and therefore cost as much as 20 Ferrari GTOs.