Some time ago I had a bit of an issue with e-mails being modified by replacing links with so-called safelinks. These are effectively trackers, make it impossible to check links, and make sharing links considerably less secure, among some other issues. Meanwhile, these people, so concerned about my security, have started to add headers to e-mails warning me if an e-mail is not from Cardiff University.
It takes about one day to get used to the warning and ignore it, so the whole attempt is ineffective, as far as I can tell. More recently, they also started to put these headers in front of signed and even encrypted messages. The message itself is placed in a MIME attachment as a forwarded message. Luckily, at least for now, it is not modified further, even if the whole practice is in principle altering records. It still makes handling encrypted messages and verifying signed messages more complicated for no reason (any e-mail program could just as well display a warning for every incoming e-mail…). Overall, the effect of this is that e-mails are now even less secure and even more hassle.
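To make the wrapping concrete, here is a minimal sketch using only Python's standard email package. It builds a message the way it is described above (a warning text followed by the real message as a forwarded attachment) and then pulls the inner message back out. The banner wording, the addresses and the multipart/mixed layout are assumptions for illustration, not a capture of a real message.

```python
import email
from email import policy
from email.message import EmailMessage

# The message as the sender wrote it (stand-in for a signed/encrypted one).
inner = EmailMessage()
inner["From"] = "alice@example.org"
inner["Subject"] = "Hello"
inner.set_content("signed or encrypted payload would be here")

# What arrives instead: a banner plus the original as message/rfc822.
outer = EmailMessage()
outer["Subject"] = "Hello"
outer.set_content("External email to Cardiff University (banner text, made up)")
outer.add_attachment(inner)  # a Message object is attached as message/rfc822

def unwrap_forward(msg):
    # Return the first embedded message/rfc822 part, or msg itself if none.
    for part in msg.walk():
        if part.get_content_type() == "message/rfc822":
            return part.get_payload(0)
    return msg

# Round-trip through bytes, as a mail filter reading from a pipe would see it.
received = email.message_from_bytes(outer.as_bytes(), policy=policy.default)
unwrapped = unwrap_forward(received)
print(unwrapped["From"], unwrapped.get_content_type())
```

Unlike this sketch, a real filter should first check that the leading text part really is the warning banner, so that ordinary forwarded e-mails are left alone.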
Well, we can fix it, easily, with a mail filter (and as a side-effect, add the domains doing this to the list of potential attackers… I leave that to you in whatever system you are using there). It’s an extension of the original filter from the above post fixing the links and filtering the messages. So far, this has worked reliably and can just be used in a procmail or similar e-mail filter. Here it is (it needs Python 3 with the urllib, email and BeautifulSoup packages – see imports).
#!/usr/bin/env python3
#
# email_cardiff_filter.py - fix Cardiff University e-mail security problems.
# Version 0.6
# Copyright (C) 2019-2021 Frank C Langbein, frank@langbein.org
#
# Dedicated to all Cardiff University system admins who waste my time.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as
# published by the Free Software Foundation, either version 3 of the
# License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.

import sys
import argparse
import re
import urllib.parse
import email
from os.path import expanduser
from bs4 import BeautifulSoup

def replace_urls(text):
    # Fix malicious safelink links/trackers in text and return modified string.
    safelinks_pattern = re.compile(r'https?://[a-zA-Z0-9.-]*\.safelinks\.protection\.outlook\.com/\?url=((?:[^&]|%[0-9a-fA-F]{2})+)&[-a-zA-Z0-9+/&;=%.]*')
    res = ''
    pos = 0
    for match in safelinks_pattern.finditer(text):
        res += text[pos:match.start()] + urllib.parse.unquote(match.expand(r'\1'))
        pos = match.end()
    return res + text[pos:]

def fix_text(msg, ty, cte_default):
    # Fix text/{html,plain} blocks by removing modified text and fixing urls
    #   msg - email message block
    #   ty - type of block (expects text/html or text/plain)
    #   cte_default - default content transfer encoding, from e-mail header
    # Result is update to msg.

    # Decode block
    cs = str(msg.get_content_charset("utf-8"))
    text = msg.get_payload(decode=True).decode(cs, 'ignore').strip()

    # Check transfer encoding and make sure it's in header
    cte_flag = False
    for k in reversed(range(len(msg._headers))):
        if msg._headers[k][0].lower() == 'content-transfer-encoding':
            cte = msg._headers[k][1].lower()
            cte_flag = True
    if not cte_flag:
        if len(text) == len(text.encode()):
            msg.add_header("Content-Transfer-Encoding", "7bit")
            cte = "7bit"
        else:
            msg.add_header("Content-Transfer-Encoding", cte_default)
            cte = cte_default

    # Cleanup text
    if ty == "text/html":
        soup = BeautifulSoup(text, 'html.parser')
        for illegal in soup.find_all(string=re.compile('External *email.*Cardiff *University'), limit=1):
            # Move to top-level of block (block is right after body)
            while illegal is not None and illegal.parent is not None and illegal.parent.name != "body":
                p = illegal.parent
                # Move up, making sure we remain first child (otherwise text is later in message, so not removed)
                k = 0
                while p.contents[k] == '\n':
                    k = k + 1
                if p.contents[k] == illegal:
                    illegal = p
                else:
                    illegal = None
            # Remove illegal text block (if at start of message)
            if illegal is not None:
                for nxt in illegal.find_next_siblings(limit=2):
                    if nxt.name == "br":
                        nxt.decompose()
                illegal.decompose()
        text = str(soup)
    else:
        text = str(re.sub(r'^External *email.*Cardiff *University.*dolenni\.[\r\n]*', '', text, flags=re.S))

    # Fix URLs
    text = replace_urls(text)

    # Re-encode block
    try:
        msg.set_payload(text, cs)
    except UnicodeEncodeError:
        msg.set_payload(text, "utf-8")
    if cte == "base64":
        # Check if encoding worked and if not, switch to quoted-printable.
        # Needed as sometimes a base64 transfer-encoding header seems to be ignored.
        new_text = msg.get_payload(decode=True).decode(cs, 'ignore').strip()
        if new_text != text:
            msg.replace_header("Content-Transfer-Encoding", "quoted-printable")
            try:
                msg.set_payload(text, cs)
            except UnicodeEncodeError:
                msg.set_payload(text, "utf-8")

def extract_cardiff_forward(msg):
    # Check if the multi-part mime message msg contains an actual signed or encrypted
    # forwarded e-mail, and this is just an attempt to modify the message that no one
    # actually cares about but makes decryption and validation of signatures harder,
    # so makes e-mail less secure.

    # Pattern indicating records have been altered
    idi_pattern_start = re.compile(r'^External *email.*Cardiff *University')
    idi_end = ' ddolenni.'
    # Find content modification
    idi_found = False
    for p in msg.walk():
        ty = p.get_content_type()
        if idi_found and ty == "message/rfc822":
            msg = str(re.sub(r'^Content-Type: .*[\r\n][\r\n]', '', str(p)))
            return email.message_from_string(msg)
        elif ty == "text/plain":
            # Decode block
            text = p.get_payload(decode=True).decode(str(p.get_content_charset("utf-8")), 'ignore').strip()
            if idi_pattern_start.match(text) and text[-len(idi_end):] == idi_end:
                # Danger, message modified - fixing
                idi_found = True
    return msg # Nope, other reason

if __name__ == '__main__':
    # Arguments parsing, only for basic housekeeping
    parse = argparse.ArgumentParser(description='Fix security problems with Cardiff University e-mails.')
    args = parse.parse_args()

    # Read message and process parts
    msg = email.message_from_file(sys.stdin)
    cte = msg.get("Content-Transfer-Encoding", "7bit")

    # Do not modify if signed
    mod = True
    ct = msg.get_content_type()
    if ct[0:10] == "multipart/":
        msg = extract_cardiff_forward(msg) # Check if they've attempted to modify a signed or encrypted message
        ct = msg.get_content_type()
    if ct == "multipart/signed":
        mod = False # Do not modify signed messages
    else:
        for p in msg.walk():
            ty = p.get_content_type()
            if ty == "application/pgp-signature":
                mod = False

    if mod:
        # Fix e-mail
        for p in msg.walk():
            ty = p.get_content_type()
            if ty == "text/html" or ty == "text/plain":
                # Fix message text and links in text
                fix_text(p, ty, cte)

    # Cleanup headers
    try:
        hdrs = [l.strip() for l in open('./headers.lst')]
    except OSError:
        try:
            hdrs = [l.strip() for l in open(expanduser('~/etc/email/headers.lst'))]
        except OSError:
            print("Cannot read headers.lst")
            sys.exit(1)
    for k in reversed(range(len(msg._headers))):
        if not msg._headers[k][0].lower() in hdrs:
            del(msg._headers[k])

    print(msg)
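To see what the link fixing does in isolation, here is a small standalone sketch that applies the same regular expression as replace_urls to a single line of text. The wrapped link is a made-up example: the `eur03` host prefix and the trailing `data`/`reserved` parameters are assumptions about what such links typically look like.

```python
import re
import urllib.parse

# Same pattern as in replace_urls: capture the url= value, drop the rest.
SAFELINKS = re.compile(
    r'https?://[a-zA-Z0-9.-]*\.safelinks\.protection\.outlook\.com/\?url='
    r'((?:[^&]|%[0-9a-fA-F]{2})+)&[-a-zA-Z0-9+/&;=%.]*')

def unwrap(text):
    # Replace every safelink with the percent-decoded original URL.
    return SAFELINKS.sub(lambda m: urllib.parse.unquote(m.group(1)), text)

# Hypothetical wrapped link, for illustration only.
wrapped = ('See https://eur03.safelinks.protection.outlook.com/'
           '?url=https%3A%2F%2Fexample.org%2Fpage%3Fid%3D1'
           '&data=04%7C01&reserved=0 now')
result = unwrap(wrapped)
print(result)  # See https://example.org/page?id=1 now
```

The trailing character class in the pattern includes `;`, so the `&amp;` form of the separator used inside HTML parts is consumed as well.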
The script also reads a headers.lst file and removes any headers not in that list – I cannot trust what is in those other headers and do not need them, so I simply remove them. It’s simple to comment this out. The headers.lst file is generated and regularly updated with this script (just run as a cronjob from time to time; make sure you have curl installed).
#!/bin/bash
#
# email_headers_update - get list of headers for mail/mime
# Version 0.2
# Copyright (C) 2019,2021 Frank C Langbein, Cardiff University, frank@langbein.org
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as
# published by the Free Software Foundation, either version 3 of the
# License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.

# Use ./headers.lst if it exists, otherwise work in ~/etc/email.
# The braces are needed: "a || b && c" groups as "(a || b) && c",
# which would change directory even when ./headers.lst is readable.
test -r ./headers.lst || { mkdir -p ~/etc/email && cd ~/etc/email; }

cat >headers.tmp <<EOD
from
subject
date
to
return-path
envelope-to
delivery-date
received
dkim-signature
domainkey-signature
message-id
mime-version
content-type
thread-topic
thread-index
x-originating-ip
x-autoresponse-suppress
x-originatororg
x-sa-exim-connect-ip
x-sa-exim-mail-from
x-sa-exim-version
x-sa-exim-scanned
x-spam-checker-version
x-spam-level
x-spam-status
EOD
test -f headers.lst && cat headers.lst >>headers.tmp
curl -s https://www.iana.org/assignments/message-headers/perm-headers.csv | tail -n +2 | while read l; do
  h="`echo $l | cut -d, -f1 | tr 'A-Z' 'a-z'`"
  p="`echo $l | cut -d, -f3 | tr 'A-Z' 'a-z'`"
  test "$p" = "mail" -o "$p" = "mime" && echo $h >>headers.tmp
done
sort -u <headers.tmp >headers.lst
rm -f headers.tmp
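If you would rather keep everything in Python, the two header steps (building the allow-list from the registry CSV and dropping everything else) can be sketched as below. This is only an illustration under assumptions: `allowed_headers` and `strip_headers` are hypothetical helper names, the CSV rows are a made-up sample in the layout the shell script relies on (name in column 1, protocol in column 3), and the message is invented. It uses the public `keys()` and `del msg[name]` interface instead of `_headers`.

```python
import csv
import email
import io

def allowed_headers(csv_text, protocols=("mail", "mime")):
    # Collect lower-cased header names whose protocol column is mail or mime.
    rows = csv.reader(io.StringIO(csv_text))
    next(rows, None)  # skip the column-title row, like `tail -n +2`
    return {row[0].lower() for row in rows
            if len(row) >= 3 and row[2].lower() in protocols}

def strip_headers(msg, allowed):
    # Remove every top-level header whose name is not in the allow-list.
    for name in set(msg.keys()):
        if name.lower() not in allowed:
            del msg[name]  # deletes all occurrences of that header
    return msg

# Made-up sample rows; the last column holds placeholder values.
sample_csv = ("Header Field Name,Template,Protocol,Status,Reference\n"
              "Subject,,mail,standard,[example]\n"
              "Content-Type,,MIME,,[example]\n"
              "Accept,,http,standard,[example]\n")

raw = ("From: a@example.org\n"
       "Subject: hi\n"
       "X-Tracking-Id: 12345\n"
       "\n"
       "body\n")

allowed = allowed_headers(sample_csv) | {"from"}  # plus a hand-picked extra
msg = strip_headers(email.message_from_string(raw), allowed)
print(msg.keys())  # ['From', 'Subject']
```

Reading the CSV with the csv module also copes with quoted fields that contain commas, which a plain `cut -d,` would split in the wrong place.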
It should be relatively simple to adapt to other setups if you have some basic coding skills.
I am not doing anything on the web interface for this. It should be simple to hide the messages in question or even open the encrypted/signed attachments with a Greasemonkey/Tampermonkey script. But I’ve not been on that web interface for eternities and do not intend to return to it. Just store all messages locally, remove them from the uncontrollable server and have some minimal peace.
(And sorry, I cannot support Windows or macOS… the above should help, but I can’t do more for these hopeless platforms).
The license for both scripts is AGPL-3.0-or-later.
The angry little girl on the feature image for this article comes from here: https://tenor.com/view/mad-angry-angry-girl-angry-little-girl-gif-11979588. It’s a near-perfect match.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.