Easily dumping all AWS SSM details

We have a lot of credentials and config mixed up in AWS SSM/Parameter Store over many different regions and profiles. We want to export all of these to a spreadsheet to allow some more junior team members to convert the non-secret ones into standard parameters which can be stored in our kubernetes git config (on a per-environment basis). The following script does this:

#!/bin/bash
for profile in x y z; do
    awsv2 --profile $profile ssm get-parameters-by-path --path / --with-decryption > paramstore-$profile.json
done
cat *.json  | jq -r '.Parameters[] | select(.Name+.Value |test("pass|private|secret|token|://.*:|certificate|pwd|cookie|key"; "i")|not) | [.Name, .Value] | @csv' > out.csv

Using Postgres NOTIFY with placeholders

When writing SQL you always want to use placeholders rather than trying to escape text yourself and risk an SQL injection attack.

Postgres provides great functionality for this such as:

SELECT * FROM table WHERE username = $1

Today I was trying to send arbitrary text to a channel via the very powerful NOTIFY command. However every time I tried to use placeholders I was getting errors (from python’s asyncpg driver, which is pulling it directly from postgres error code 42601).

Eventually, looking through the docs I found this quote:

To send a notification you can also use the function pg_notify(text, text). The function takes the channel name as the first argument and the payload as the second. The function is much easier to use than the NOTIFY command if you need to work with non-constant channel names and payloads.

So, after wasting an hour trying all sorts of different quoting strategies, I was able to change NOTIFY $1, $2 into SELECT pg_notify($1, $2) and resolve the issue.

I’m using SQLAlchemy by the way so it looks something like:

from sqlalchemy import text 
await session.execute(
  text("SELECT pg_notify(:channel, :data)")
    .bindparams(channel="channel", data="my text")
)

Perfect!

Hacking ElasticSearch python client to work with AWS OpenSearch

Because of various disagreements between AWS and ElasticSearch, AWS released a fork called OpenSearch but and Elastic updated their clients to throw errors if you try to use them with this product.

This is obviously really annoying if you are using 3rd party software which uses the Elastic libraries, but trying to run them against AWS managed services. The following hack fixes this for Python elasticsearch v7.17 at least, by disabling the unnecessary version check:

# Hack elasticsearch to work with AWS
from elasticsearch import Transport
Transport._do_verify_elasticsearch = lambda self, headers, timeout: None

Screen corruption on KDE with Ubuntu 20.04

At some point in the past couple of weeks I guess my laptop updated libraries or something, because when I had to reboot my laptop yesterday I started seeing massive screen corruption in some applications. It manifested itself in horizontal white/black lines remaining especially when selecting text. This was especially visible in konsole. I couldn’t see any package in particular which had updated recently and I tried looking at a few different Xorg or kernel options but to no avail.

Eventually I looked at the KDE compositor settings. I noticed that the “Rendering Backend” was set to XRender, which as far as I understand it is very old. Updating it to ‘OpenGL 3.1 instantly fixed the issue.

I’m leaving this mostly as a note to myself for if it happens again in the future, but at the same time perhaps it is a wider regression in the ubuntu 20.04 KDE packages so it would help someone else.

Best wordle starter words

I recently, like pretty much everyone else got into Wordle. One of the most important things in getting the correct answer is to find the best first word or two to start with which will help guide you to the correct answer. The ideal first word(s) should use one each of the most common letters so for example in the first 2 guesses you can test the top 10 characters.

My first (relatively uneducated) guesses based on what I vaguely remembered about letter frequency in English were ‘spear’ and ‘mount’ – 4 vowels and some of the most common consonants. However it’s pretty much a random guess so I was wondering if we could figure out a better approach.

It’s pretty straight forward to look at the source code of Wordle, which contains two word lists. The first one contains 2315 5-letter words which can be the answer, the second contains a further 10,000 of all possible 5 letter words in English.

So, I wrote a small script to analyse the frequency of letters in the list of possible answers, and then based on that filter the possible words to find the best starting (and subsequent) guesses which would work.

I’ve put the simple python script I used at the bottom of the article, but the output is:

Matching 5 new letters (39%) are: [‘arose’]
Matching 5 new letters (66%) are: [‘unlit’, ‘until’]
Matching 4 new letters (81%) are: [‘duchy’]
Matching 3 new letters (89%) are: [‘pygmy’]

What this means is that if you start with the word ‘arose’, and then ‘until’ (or ‘unlit’), even though it’s only 10 unique letters (38% of the alphabet) because they are the most frequent ones they will cover 2/3 (66%) of the possible words.

In terms of letter frequency overall we get the following ordered detail:

[(‘e’, 1233), (‘a’, 979), (‘r’, 899), (‘o’, 754), (‘t’, 729), (‘l’, 719), (‘i’, 671), (‘s’, 669), (‘n’, 575), (‘c’, 477), (‘u’, 467), (‘y’, 425), (‘d’, 393), (‘h’, 389),
(‘p’, 367), (‘m’, 316), (‘g’, 311), (‘b’, 281), (‘f’, 230), (‘k’, 210), (‘w’, 195), (‘v’, 153), (‘z’, 40), (‘x’, 37), (‘q’, 29), (‘j’, 27)]

The script I wrote is not perfect but it’s at least a start at finding some optimum words

import sys
with open(sys.argv[1]) as fh:
    words = [l.strip() for l in fh]

chars = {}
for char in ''.join(words):
    chars[char] = chars.get(char, 0) + 1
frequency = sorted(chars.keys(), key=lambda c: -chars[c])
print(sorted(chars.items(), key=lambda c: -c[1]))

total_freq = 0
while len(frequency) > 5:
    matching = words
    letters = []
    for char in frequency:
        new_matching = [w for w in matching if char in w]
        if new_matching:
            matching = new_matching
            letters.append(char)
        if len(letters) == 5:
            break

    total_freq += sum([chars[c] for c in letters])
    print("Matching %d new letters (%d%%) are: %r" % (len(letters), total_freq / sum(chars.values()) * 100, matching))
    frequency = [c for c in frequency if c not in letters]

Simple mitigation for the new DNS cache poisoning attack

As reported in many places, a new attack has been presented which can allow an attacker to poison caching and forwarding DNS server entries. The PDF is an interesting read and contains many different ideas which chained together can lead to this attack. I believe the following firewall rule should defend against the attack on caching servers with very little side effect by preventing sending of ICMP messages saying that the given UDP port was unreachable:

iptables -I OUTPUT -p icmp --icmp-type port-unreachable -m u32 --u32 '34 & 0xFF = 17' -j DROP

Running systemd inside a Centos 8 Docker container

Support for systemd in Docker has improved a lot since this 2016 article, but it’s still not obvious quite how to make it work. Why would you want this? Mostly for testing full-server deploys (for example we test ansible deployments against various docker containers to ensure there are no bugs). Here’s a systemd-based centos 8 Dockerfile that also includes an ssh server:
FROM centos:8

# Set up base packages that are expected
RUN dnf -y install openssh-server crontabs NetworkManager firewalld selinux-policy

RUN systemctl mask dev-mqueue.mount dev-hugepages.mount \
     systemd-remount-fs.service sys-kernel-config.mount \
     sys-kernel-debug.mount sys-fs-fuse-connections.mount \
     graphical.target systemd-logind.service \
     NetworkManager.service systemd-hostnamed.service

STOPSIGNAL SIGRTMIN+3

# SSHd setup
EXPOSE 22
COPY docker.pub /root/.ssh/authorized_keys
RUN chmod 600 /etc/ssh/ssh_host* /root/.ssh/authorized_keys

CMD ["/sbin/init"]
You can then launch this like:
docker run -v /sys/fs/cgroup:/sys/fs/cgroup:ro --tmpfs /run container-name
For the latest centos 7 you can use the following Dockerfile:
FROM centos:7

RUN yum -y install openssh-server NetworkManager firewalld && \
    systemctl disable NetworkManager && systemctl enable sshd

EXPOSE 22
COPY docker.pub /root/.ssh/authorized_keys
RUN chmod 600 /etc/ssh/ssh_host* /root/.ssh/authorized_keys

STOPSIGNAL SIGRTMIN+3

CMD ["/sbin/init"]

Complex network setup on Centos/Redhat 8 via Network Manager

Here are some notes, mostly for my future self about how to set up bond + vlan + bridge networks in Centos 8 in order to create a highly resilient virtual machine host server.

Firstly, create the bond interface and set the options:

nmcli con add type bond ifname bond0 bond.options "mode=802.3ad,miimon=100" ipv4.method disabled ipv6.method ignore connection.zone trusted

Note the 802.3ad option – this says we are going to be using the LACP protocol which requires router support, however you can look at the different mode options in the kernel documentation.

Then, we add the required interfaces into the bond – in this case enp131s0f0 and enp131s0f1

nmcli con add type ethernet ifname enp131s0f0 master bond0 slave-type bond connection.zone trusted
nmcli con add type ethernet ifname enp131s0f1 master bond0 slave-type bond connection.zone trusted

If you want to create a standard vlan interface over the top of the bond for vlan tag 123 we can do this just with 1 more command:

nmcli con add type vlan ifname bond0.123 dev bond0 id 123 ipv4.method manual ipv4.addresses a.b.c.d/24 ipv4.gateway a.b.c.d ipv4.dns 8.8.8.8 connection.zone public

Alternatively if you want to set up a bridge on this interface:

nmcli con add ifname br.123 type bridge ipv4.method manual ipv4.addresses a.b.c.d/24 ipv4.gateway a.b.c.d ipv4.dns 8.8.8.8 \
        bridge.stp no bridge.forward-delay 0 connection.zone public
nmcli con add type vlan ifname bond0.123 dev bond0 id 123 master br.123 slave-type bridge connection.zone trusted

If you don’t want the default route or DNS, simply remove the ipv4.gateway and ipv4.dns options above.

If you want to create a bridge which doesn’t have any IP address on the host (just bridges through to the host vm’s) then replace all the ipv4 settings with: ipv4.method disabled ipv6.method ignore.

Ripping unfinalized DVDs from Linux

Recently we had some DVD’s with old family videos that had been recorded directly but never finalized, the device that recorded them then broke so it was impossible to finalize them, and it seemed to be pretty much impossible for anything else to read them. So, I figured out a way to recover them using linux…

First I installed the dvd+rw-tools package on ubuntu and used that to get various info and prove that the DVD itself was readable even though nothing on the system could see it as a video disk or filesystem.

$ dvd+rw-mediainfo /dev/sr0 
INQUIRY:                [HL-DT-ST][DVDRAM GTB0N    ][1.00]
GET [CURRENT] CONFIGURATION:
 Mounted Media:         11h, DVD-R Sequential
 Media ID:              CMC MAG. AE1
 Current Write Speed:   8.0x1385=11080KB/s
 Write Speed #0:        8.0x1385=11080KB/s
 Write Speed #1:        4.0x1385=5540KB/s
 Speed Descriptor#0:    00/0 R@8.0x1385=11080KB/s W@8.0x1385=11080KB/s
 Speed Descriptor#1:    00/0 R@8.0x1385=11080KB/s W@4.0x1385=5540KB/s
READ DVD STRUCTURE[#10h]:
 Media Book Type:       00h, DVD-ROM book [revision 0]
 Legacy lead-out at:    2298496*2KB=4707319808
READ DVD STRUCTURE[#0h]:
 Media Book Type:       25h, DVD-R book [revision 5]
 Last border-out at:    8390653*2KB=17184057344
READ DISC INFORMATION:
 Disc status:           appendable
 Number of Sessions:    1
 State of Last Session: incomplete
 "Next" Track:          1
 Number of Tracks:      19
READ TRACK INFORMATION[#1]:
 Track State:           reserved
 Track Start Address:   0*2KB
 Next Writable Address: 0*2KB
 Free Blocks:           2544*2KB
 Track Size:            2544*2KB
READ TRACK INFORMATION[#2]:
 Track State:           complete incremental
 Track Start Address:   2560*2KB
 Free Blocks:           0*2KB
 Track Size:            240*2KB
 Last Recorded Address: 2591*2KB

I then pulled the entire DVD into a file on the local computer for easier processing later. The command will produce lots of errors (as there are parts of the DVD that are not readable as they were never written to), but the output file (image.iso) will contain a full dump of the DVD eventually

dd if=/dev/sr0 of=image.iso bs=2048 conv=noerror,notrunc iflag=nonblock

I then put together a short perl script to search through this file for 1kb blocks beginning with the magic tag DVDVIDEO – these seemed to be the starts of individual chapters, which avconv (also called ffmpeg on some distributions) can then extract into proper video/audio.

#!/usr/bin/perl
use v5.16;
use strict;
use warnings;
my $off = 0;
my $file = $ARGV[0];
open my $fh, '<:bytes', $file or die;
my $buf;
my @pos;

# Search through each block for one beginning with the header text and store these in array of offsets - I think it's one for each track
while( my $len = read $fh, $buf, 1024 ) {
        die if $len != 1024;

        if( $buf =~ /^DVDVIDEO/ ) {
                push @pos, $off;
        }
        $off++;
}
push @pos, $off;

my $chap = 0;
for( my $i = 0; $i < @pos - 1; $i++ ) {
        my $length = $pos[$i+1] - $pos[$i];
        next if $length < 1000;
        $chap++;
        say "dd if=$file bs=1024 skip=$pos[$i] count=$length | avconv -i - -acodec copy -vcodec copy out$chap.mp4";
}

Save that as extract_dvd_tracks.pl and then run it to extract them as files named like out0.mp4. Note that it won’t process tracks that are less than about 1Mb because there seemed to be a number of small sections like this which we wanted to skip over.

perl extract_dvd_tracks.pl image.iso

Job done! Note that there are probably many different formats and layouts for unfinalized DVD’s, this may be just one of many but hopefully the principle remains the same.

Extracting all PHP code from a file

In checking over a project recently, I wanted to extract all PHP code from a set of files and combine it into a single output so I could easily assess what was being used. The eventual command I ended up with was as follows, hopefully it will be useful to someone else in the future:

find -name \*.php | xargs perl -nE 'BEGIN{ undef $/ } say for /<\?php\s*((?:(?!\s*\?>).)+)/sg'