Using Local LLMs for Note Metadata
I believe the true power of AI is to enable us to automate repetitive or mundane tasks so that we don’t have to. As part of this interest, I have been experimenting with LLM models running locally on my PC. I remember “back in the day,” probably a couple of years ago at this point, I first got an LLM to run locally on my (at the time) Windows 10 PC, and it was quite the laborious process involving running commands from the command line, installing various dependencies, messing around with a bunch of settings, downloading specific drivers, etc. I don’t remember all the steps; I just remember it took me most of a late night to get it to work. Now, by contrast, all you need to do is download LM Studio, Msty or something similar, and you can load whatever models you want using a nice GUI, and expose them via an API to run whatever tasks you want. You can go from nothing to up and running within 5 minutes. It’s quite a shift. Even on Linux it’s a snap (or an AppImage, really).
Joking aside, one of the useful tasks I wanted an LLM to automate for me was the tagging of my notes and SEO for my notes. I hate SEO with a passion, but I feel like it’s a necessary evil in modern web design. Don’t get me wrong, SEO is critical from a business perspective, I get that. But for a personal blog or a notes site it seems like just another pain you have to deal with if you want an accurate representation of your website on search engines and when shared with social media.
Experimenting with ChatGPT
Before I even thought about running this locally, I first played with ChatGPT to get it to provide SEO and tags. I did this by manually copy-pasting my notes into ChatGPT and asking it to generate tags and SEO content for me and then copying and pasting the results back into Obsidian. The results were pretty decent, but I think my prompting was a bit off because it ended up being a little… too much. For example, here’s what it gave me for my essay Flex Learning Day:
tags:
- personal-essay
- community
- civic-life
- fear-and-resilience
- parenthood
seo-title: "Flex Learning Day: Fear, Community, and Parenthood in Minnesota"
seo-description: "A personal essay on a flex learning day in Minnesota—extreme cold, ICE activity near schools, community resilience, and raising children amid fear and uncertainty."
seo-keywords:
- flex learning day
- minnesota winter
- community resilience
- immigration enforcement schools
- parenting during crisis
- civic engagement minnesota
- fear and solidarity
Here’s another example from my notes on Linux Mint:
tags:
- linux
- linux-mint
- desktop-linux
- user-experience
- operating-systems
seo-title: "Linux Mint Notes: A Windows-Adjacent Linux Desktop Experience"
seo-description: "Personal notes on using Linux Mint—its Windows-like interface, stability quirks, multi-monitor issues, and why I ultimately switched to Fedora and GNOME on Wayland."
seo-keywords:
- linux mint
- linux mint cinnamon
- linux mint review
- linux mint vs fedora
- cinnamon desktop
- linux mint multi monitor issues
- desktop linux experience
This was an OK result, but I wasn’t super thrilled about two things:
- The tags aren’t all ones I wouldn’t likely reuse. My tagging scheme for my notes is a little more conservative than this. For example, I’m never going to use
fear-and-resilienceever again. What’s the point of a tag except to help you discover related content? I don’t want to end up with a situation where I have a bunch of meaningless one-off tags on my notes. - The SEO tends to editorialize my work, which I don’t really want. I’d prefer a more content-driven approach to SEO, where it really just provides a flat summary of the content. The SEO title, for example, is one that I would never pick.
I know I could prompt it better for better results, but I didn’t like the process of copy-pasting back and forth between ChatGPT and Obsidian, so I knew I had to automate the process. I figured a smaller, locally running model would work very well for these purposes.
Implementing Locally
So I set out to use a local LLM to handle tagging of notes, SEO keywords, SEO title, and SEO description fields for each of my notes. The way I did this was I used Claude to write a Python script that I can run whenever I want which scans my notes folder, adds the appropriate tags from a preselected list I created in a simple YAML file, adds 1-2 additional useful tags, and writes the SEO. It writes directly to the frontmatter in my notes, eliminating the need to copy-paste, and it provides the prompt directly in the script so I get consistent results. I chose the openai/gpt-oss-20b model for this, although I’m curious about experimenting with different models.
Initial Results
Here’s what it gave me for the same notes above.
Flex Learning Day:
Tags:
- governance
- security
- community
- immigration-enforcement
- cold-weather-education
SEO Title: "Flex Learning Day Amid Cold Weather and ICE Enforcement in Minnesota"
SEO Description: "A personal account of a cold Friday in Minnesota where school closures, ICE activity, and community solidarity intersected."
SEO Keywords:
- Minnesota flex learning day
- ICE enforcement Minneapolis
- cold weather education
- community protest response
- civic engagement Minnesota
- school closures winter
- immigration policy impact
- student safety during strikes
- family experience ICE
- local community solidarity
- political activism in Minnesota
Linux Mint Notes:
tags:
- linux
- automation
- productivity
- display-management
seo-title: "Linux Mint Experience and Switching to Fedora Workstation 43"
seo-description: "A review of Linux Mint’s Windows‑like interface, stability issues, display script workarounds, and the move to Fedora Workstation with GNOME Wayland."
seo-keywords:
- Linux Mint
- Fedora Workstation
- GNOME Wayland
- Cinnamon desktop
- Windows-like Linux
- dual monitor setup
- display script
- WMI keys
- system stability
- desktop environment comparison
- Linux usability
- user experience
Is this perfect? No, but it gets me a lot closer to what I wanted. The tags that the LLM suggested are immigration-enforcement and cold-weather-education on the first note, and display-management on the second note. Are these good suggestions? Potentially, but either way I can quickly review what the LLM suggested and if I like one of the tags I can easily add it to my YAML list of preselected tags for the future, or just delete it if it’s not a tag I want to use. The SEO metadata is much closer to what I wanted: less clickbait, more content-focused.
Is the process completely automated? No. I need to run the script manually. I could set it up to run as a cron job overnight or something, but I don’t know if I really want to. I’m fine manually triggering it each time. More importantly, whatever it gives me still needs to be reviewed and edited, but it gets me 95% of the way there.
Enhancements
Here is a list of enhancements I added after the initial testing:
- Increased the amount of text sent from the note to the LLM from the first 2,000 characters to the first 20,000 characters.
- Added the ability to populate the
slug:field with a slug based off the filename.
Known Bugs
- It seems to occasionally insert a new line in random places in the note metadata, throwing an invalid YAML error.
Instructions
In case you find this useful for your own purposes, I’ve explained here how you can copy my script and use it yourself. Feel free to adapt it to your purposes, and if you find it useful please let me know.
- Go to the code repo on Gitea and download the files as a zip.
- Extract the zip folder into whatever folder you want (e.g.,
~/bin/note-tagger). It doesn’t need to be the folder your notes are in. - Install dependencies:
pip install pyyaml requests - Install LM Studio and load whatever model you want. I used openai/gpt-oss-20b.
- Edit the
tag-taxonomy.yamlfile, adding the tags you want the LLM to use and removing any tags that aren’t applicable to you. - Edit the following variables at the top of the
note-tagger.pyfile:- Change the ‘NOTES_FOLDER’ variable to the folder your notes are in.
- Change the ‘MODEL_NAME’ variable to the model name you loaded in LM Studio.
- Change the ‘LM_STUDIO_URL’ variable to the URL of the LM Studio API (usually https://localhost:1234/v1/chat/completions)
- Linux Note: Don’t forget to make the
tag-notes.pyfile executable by runningchmod +x tag-notes.py - Run the script from the command line:
./tag-notes.py - Profit!