Category Archives: IT

IT related posts, technical stuff

Compressing a PDF file on macOS

Sometimes people create PDF files with large images, resulting in files that are too large to email or sometimes even upload on web forms.

A quick and dirty way to compress such files is using ghostscript in a terminal, which you can install on macOS using homebrew (brew install ghostscript):

gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -dCompatibilityLevel=1.5 -dPDFSETTINGS=/screen -dEmbedAllFonts=true -dSubsetFonts=true -dColorImageDownsampleType=/Bicubic -dColorImageResolution=192 -dGrayImageDownsampleType=/Bicubic -dGrayImageResolution=144 -dMonoImageDownsampleType=/Bicubic -dMonoImageResolution=144 -sOutputFile=myfile.compressed.pdf myfile.pdf

Replace “myfile.pdf” with the file to compress, and “myfile.compressed.pdf” with the name you want for the output.

For convenience, you can create a function:

pdfc() {
  command gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -dCompatibilityLevel=1.5 -dPDFSETTINGS=/screen -dEmbedAllFonts=true -dSubsetFonts=true -dColorImageDownsampleType=/Bicubic -dColorImageResolution=192 -dGrayImageDownsampleType=/Bicubic -dGrayImageResolution=144 -dMonoImageDownsampleType=/Bicubic -dMonoImageResolution=144 -sOutputFile="${1%.*}.compressed.pdf" "$1"
}

Use like so:

pdfc myfile.pdf

A smaller PDF file will be written to myfile.compressed.pdf.

This isn’t without cost. The resulting PDF file will contain lower-resolution images, which have less detail and will look worse when zoomed in, but you can tweak the resolution settings to see what works for you. Generally I find the above settings acceptable for most documents at normal reading size. They reduce a file with 3 large images captured with a phone from 6MB to around 100KB.
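
As a rough sketch of tweaking the resolution settings, the function below takes the resolution as an optional second argument and applies it to color, gray and mono images alike. The names pdfc_dpi and pdfc_output_name are made up for this example, and the default of 144 is an assumption borrowed from the settings above.

```shell
# Print the output name used by pdfc: the last extension is replaced
# with ".compressed.pdf".
pdfc_output_name() {
  printf '%s.compressed.pdf\n' "${1%.*}"
}

# Usage: pdfc_dpi FILE [DPI]
# Same flags as pdfc, but one DPI value (default 144) for all image types.
pdfc_dpi() {
  gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite \
    -dCompatibilityLevel=1.5 -dPDFSETTINGS=/screen \
    -dEmbedAllFonts=true -dSubsetFonts=true \
    -dColorImageDownsampleType=/Bicubic -dColorImageResolution="${2:-144}" \
    -dGrayImageDownsampleType=/Bicubic -dGrayImageResolution="${2:-144}" \
    -dMonoImageDownsampleType=/Bicubic -dMonoImageResolution="${2:-144}" \
    -sOutputFile="$(pdfc_output_name "$1")" "$1"
}
```

For example, "pdfc_dpi myfile.pdf 96" should give a smaller file with softer images, while "pdfc_dpi myfile.pdf 200" keeps more detail at the cost of size.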

Samba config for Apple Time Machine

I’ve been using Samba’s vfs_fruit module to enable backing up my Mac laptop to my Ubuntu-based NAS. I’ve found the configuration fiddly, and it occasionally breaks with macOS upgrades.

Anyway, I thought I would document my settings in case it helps anyone else out there. The file system backing my Time Machine share is ZFS, and I am using Samba 4.15 and macOS Sonoma 14.4.1.

Global config

In /etc/samba/smb.conf under the [global] section, I have the following (not complete config, just the relevant settings):

protocol=SMB3
vfs objects = acl_xattr fruit streams_xattr aio_pthread
fruit:aapl = yes
fruit:model = MacSamba
fruit:posix_rename = yes
fruit:metadata = stream
fruit:nfs_aces = no
recycle:keeptree = no
oplocks = yes
locking = yes

Some comments:

  • I believe SMB3 is required – Time Machine struggles with older protocols.
  • The order of vfs objects is important – aio_pthread must go last.
  • Without aio_pthread, my backups fail while scanning. I suspect Time Machine is heavily threaded and makes a lot of requests in parallel – apparently too many for a single Samba thread.
  • fruit:posix_rename = yes appears to be the default and can probably be omitted.
  • fruit:metadata = stream was a copy-paste and not thought through by me – I’m unsure of the implications of this.
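
Since the module order matters, a small check like the following sketch can catch a misordered list before it breaks a backup. The function names are made up for this example, and the check only looks at the last word of the value you pass in.

```shell
# Print the last module named in a "vfs objects" value.
vfs_last_module() {
  printf '%s\n' "$1" | awk '{ print $NF }'
}

# Succeed only if aio_pthread is the final module in the list.
aio_pthread_is_last() {
  [ "$(vfs_last_module "$1")" = "aio_pthread" ]
}
```

On the server, the value could be fed in from testparm, for example aio_pthread_is_last "$(testparm -s --section-name global --parameter-name 'vfs objects' 2>/dev/null)", assuming your version of testparm supports those options.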

These work for me as a general set of settings for Mac clients – I don’t use Windows or Linux clients often, so I don’t know how well they work for them. It’s possible some of these options are not required, as they’ve accumulated over time.

Share Config

The share itself is configured like so:

[TimeMachine NAS]
path=/pool1/backup/timemachine
comment=Time Machine
valid users = alex
writable = yes
durable handles = yes
kernel oplocks = no
kernel share modes = no
posix locking = no
ea support = yes
browseable = yes
read only = no
inherit acls = yes
fruit:time machine = yes

According to the docs, “fruit:time machine = yes” sets durable handles, kernel oplocks, kernel share modes and posix locking – you can probably omit these.

/pool1/backup/timemachine is stored on ZFS under a quota, which was set with “zfs set quota=3TB pool1/backup”. The available space is reported correctly to the client (Finder), so I’d expect it to work fine for restricting disk usage, and for Time Machine to manage its snapshots.
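
To cross-check what Finder reports against the server side, something like the following sketch could be used. tm_quota_report is a name made up for this example; it only wraps zfs get, and pool1/backup is the dataset from the quota command above.

```shell
# Usage: tm_quota_report DATASET
# Prints quota, used and available space as tab-separated property/value pairs.
tm_quota_report() {
  zfs get -H -o property,value quota,used,available "$1"
}
```

Running "tm_quota_report pool1/backup" on the NAS should show the quota alongside the space already used by backups.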

ZFS compression and encryption

Up until a recent overhaul, I was using btrfs in raid1 to manage the 4 drives I had in my NAS. However it’s been clear for a while that the momentum is behind zfs. It has more features, better stability, and generally inspires much more confidence when things go wrong. btrfs still has its place in managing single-device boot volumes, but for multiple physical devices, I would definitely recommend zfs over btrfs.

When I added a couple of new 16TB disks, I opted to create a new pool with a single mirror vdev. If I need to expand it in future, I’ll add another mirrored vdev to the pool.

Home Server – new HBA edition

Some long-time readers of this blog may remember my home server articles, the most recent being “Ubuntu Home Server 14.04 – A DIY NAS”. There haven’t been any more recently because there’s not been much to report. The server described in that article, built in 2014, has been the backbone of my home network ever since.

Since then, I have swapped out hard drives a couple of times (it now contains 2x16TB Seagate Exos and 4x4TB Seagate IronWolf), doubled the RAM to 8GB, and added an NVMe riser card (along with a cheap 128GB NVMe SSD), so I could have a separate boot drive while using all 6 SATA ports for hard drives.

Along the way it also lost HTPC and media player duties to an Apple TV, so now it’s little more than a file and backup server with Plex Media Server, Syncthing, and Duplicati installed. And the operating system has been upgraded from Ubuntu 14.04 to 16.04, 18.04, 20.04 and now 22.04.

A couple of weeks ago though, it failed. And by failed I mean, all I got was a blank screen when powering on. No POST, and no signs of life other than spinning fans.

My immediate thought was a loose connector, or possibly memory or motherboard failure, so I disconnected everything, blew the dust out and plugged everything back in. With the hard drives unplugged, everything worked. With 4 hard drives plugged in it still worked. Then it failed again when I connected the last two.

By now I figure I’m looking at a dodgy SATA cable, SATA port, or hard drive, but the core components are obviously fine. So why not give it a minor overhaul at the same time?

How to run an ethernet cable in your home, and save your relationship

My partner and I live in a 2-bedroom flat with our very young daughter. After a couple of weeks of working from the living room, which is where the WiFi is, and where I typically keep my computer, I decided, for the good of our relationship, to move my office to the spare bedroom.

There’s just one problem:

I can connect to the WiFi, but performance is abysmal

I’m not alone in spending a lot more time working from home recently, and if my Slack calls at work are anything to go by, I’m also not alone in struggling with poor WiFi reception. Urban areas tend to be densely packed with WiFi signals at the best of times, let alone while everyone’s cooped up at home full-time.

In my case, the WiFi connection in the spare bedroom is totally unusable for work, but continuing to work from the lounge would risk my relationship (and possibly my general safety).

So what’s a self-isolating telecommuter to do?

Domain Expert vs Generalist

When should you use a blunt generalist tool, and when should you use a sharper domain-specific tool?

I posted a question on Serverfault recently, and received a relevant answer that wasn’t quite what I was looking for:

Systemd – How do I automatically reload a unit, when another oneshot service is fired by timer?

My reply to the answer thanked him for it, but mentioned that I think systemd is the right place to do this “sort of thing”. In reply to my reply, he told me that systemd is “absolutely the wrong place” to do this sort of thing, which is pretty strong language!

I think we’re approaching this from different perspectives here, so let’s break the problem down in general terms.

Improving your privacy with a custom email domain

This blog post is a follow-up to It’s Time to Ditch Gmail. It began as a review of Fastmail, and my experience of moving to it from Gmail, but I quickly found myself going on a tangent. Since privacy was the main driver in my decision to move to Fastmail, and using a custom domain is one of the ways that I protect my privacy, I figured it was important enough to warrant its own post.

One of the factors that made it easier to move away from Gmail is my use of a custom domain for most of my mail. Before moving to Fastmail, this domain was tied to a GSuite account which forwarded everything to my standard Gmail account. This made switching in anger much easier, as I had fewer accounts to log in to and update my email address, and those that were still pointing directly at Gmail tended to be older low-value accounts that I no longer use anyway.

In this article though, I want to take a detour to explain why I use a custom domain, and how it can aid your privacy.

Provisioning Vault with Code

A couple of years ago, HashiCorp published a blog post, “Codifying Vault Policies and Configuration”. We used a heavily modified version of their scripts to get us going with Vault.

However there are a few problems with the approach, some of which are noted in the original post.

The main one is that if we remove a policy from the configuration, applying it again will not remove the objects from Vault. Essentially it is additive only, and while it will modify existing objects and create new ones, removing objects that are no longer declared is arguably just as important.

Another problem is that shell scripts inevitably have dependencies, which you may not want to install on your shell servers. Curl, in particular, is extremely useful for hackers, and we don’t want to have it available in production (in our environment, access to the vault API from outside the network is not allowed).

Finally, shell scripts aren’t easy to test, and don’t scale particularly well as complexity grows. You can do some amazing things in bash, but once it gets beyond a few hundred lines it’s time to break out into a proper language.

So that’s what I did.

The result is a tool called vaultsmith, and it’s designed to do one thing – take a directory of JSON files and apply them to your Vault server.

Upstream Bug? Fix it.

This is a blog post I originally wrote more than two years ago, in reaction to “spirited debates” I was having with developers. I didn’t post it, but perhaps I should have! Anyway the ideas within are as true to me now as they were then, so I thought I’d post it today after a bit of revision.

How many times have you had a developer shrug their shoulders at you and say “it’s an upstream bug”?

I heard it today, and it is so, so wrong, that it is practically an admission of guilt.

Do you say that to your customers when their personal data is leaked from your database? When your app crashes their device? No? Good, because it’s your problem.

It’s great that you can use third party libraries to do your job more efficiently, but doing so does not absolve you of responsibility if the product breaks. You made the decision on what library to use, and you are ultimately responsible for delivering functionality.

From Ivy Bridge to Threadripper Part 1 – A Water Cooling Retrospective

Some of the links in this article are Amazon affiliate links, which pay me a commission if you make a purchase.

I could have bought a plain old Ryzen, a Core i7 or even another Core i5. But with Intel sitting on its hands the past 5 years in the face of no competition, I decided it was time to splash out and reward AMD for not only investing in CPUs again, but making an interesting high-end desktop product while not nickel & diming its customers over PCI-E lanes.

And so, I bought a 1920X.

I don’t really need 12 cores. Other than general browsing, my PC is used for work (coding), plus a bit of gaming, and a gaming CPU this is not. Running multiple VMs and M.2 devices without slowing down will be nice, but this build is mostly overkill for my needs. And that’s really the point!