The search for a good offline dictionary for Linux
… proved longer than expected
To me, not having access to a good dictionary feels a bit like forgetting to put on clothes. When Flaubert throws a big word at me, I marinate in my own confusion, helpless and vulnerable.
Those of you who rarely disconnect from the internet might be surprised to learn that the dictionaries for many desktop dictionary apps live on the internet, not in the app, and stop working when offline. This seems so silly! The biggest dictionary in the world would be smaller than my node_modules.
Since paper dictionaries are passé and rarely have the word I’m looking for, when travelling I tend to turn to my Kobo dictionary, even if I’m at my computer. It rarely has the word I’m looking for either, but at least it’s quick to find out that it doesn’t.
After going through the rigmarole of selecting and adding dictionaries to KOReader, I thought, how hard can it be to use one of these on Linux? Let’s just say: this is my third attempt.
As with most things Linux, it’s only hard if you don’t know how to do it. And even the first time, it isn’t hard, per se, it’s just annoying, and confusing, and barely worth it.
The contenders:
- gnome-dictionary (online by default, but supports dictd servers)
- dict (command line, online by default) and dictd
- stardict (GUI) and sdcv (command line)
- GoldenDict
- Artha
- Wordbook (untested)
If you are content with WordNet, run sudo apt install artha and you’re done. (N.B. Artha is technically a thesaurus.) It has some cool features, like hotkey lookup that just works, and regular expression search.
Or, the less attractive but more powerful GoldenDict:
sudo apt install goldendict goldendict-wordnet
I am not content with WordNet. I am also not content with a GUI-only app, because part of the motivation for this project was that I wanted to export my Kobo vocabulary builder table and generate an Anki deck of flashcards, with the words I had saved on one side and a good definition on the other.
You can easily add bought/pirated/out-of-copyright dictionaries to GoldenDict (Edit > Dictionaries, and click add), and it seems to take the same format as KOReader.
Command Line Usage
To get sdcv working, you need to install it (sudo apt install sdcv), and add dictionaries to /usr/share/stardict/dic. Or, go where your dictionaries are, and run:
sudo ln -s /path/to/dictionaries/this-is-the-dictionary-i-want/* /usr/share/stardict/dic
Note that linking the top-level folder into dic doesn’t work: sdcv needs the individual files in there. (The sudo is there because /usr/share is normally only writable by root.)
You can then run sdcv --list-dicts to check that they were added successfully. The names it prints are also what you use to pick a specific dictionary in later commands.
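If you would rather not touch /usr/share at all, the linking step can be sketched as a small shell function. This is only a sketch: link_dict is a made-up name, and it assumes that sdcv also searches ~/.stardict/dic, which is what its man page lists alongside the system folder. Check that against your installed version.

```shell
#!/usr/bin/env bash
# Sketch: symlink every file of one StarDict dictionary into a dic folder.
# link_dict is a hypothetical helper, not part of sdcv.
# Usage: link_dict /path/to/dictionary-folder [target-dir]
link_dict() {
    local src target file
    src=$1
    # Assumption: sdcv also reads this per-user folder, so no sudo is needed.
    target=${2:-$HOME/.stardict/dic}
    mkdir -p "$target"
    for file in "$src"/*; do
        [ -f "$file" ] || continue    # link regular files only, skip subfolders
        # Use the absolute path so the link resolves from anywhere.
        ln -sf "$(realpath "$file")" "$target/"
    done
}
```

Something like link_dict /path/to/dictionaries/this-is-the-dictionary-i-want followed by sdcv --list-dicts should then show whether the dictionary was picked up.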
My next surprise was that sdcv printed raw HTML on the command line.
This is probably because I’m using the wrong conversion, but my band-aid solution was to pipe the output through pandoc to strip the markup, and then into a pager (since good dictionaries tend to have exhaustively long entries):
sdcv --use-dict "this-is-the-dictionary-i-want" test | pandoc -f html -t plain | less
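The same pipeline gets most of the way to the Anki deck I mentioned earlier. What follows is a rough sketch, not a finished tool. It assumes the saved words have already been exported to a plain text file, one per line (the Kobo export itself is not shown), and that Anki's text import is set to accept tab-separated fields with HTML allowed. The names DICT, lookup and make_deck are made up for illustration; -n is sdcv's non-interactive flag, so it doesn't stop to ask which match you meant.

```shell
#!/usr/bin/env bash
# Sketch: turn a word list into a tab-separated file for Anki's text import.
# DICT is a placeholder; use a name printed by `sdcv --list-dicts`.
DICT="this-is-the-dictionary-i-want"

# Look up one word without prompting (-n) and strip the HTML with pandoc.
# stdin is redirected so sdcv cannot swallow the word list being read below.
lookup() {
    sdcv -n --use-dict "$DICT" "$1" < /dev/null | pandoc -f html -t plain
}

# Read one word per line on stdin; print "word<TAB>definition" on stdout.
make_deck() {
    local word definition
    while IFS= read -r word; do
        [ -n "$word" ] || continue    # skip blank lines
        # Flatten the entry onto one row: tabs become spaces and line
        # breaks become <br>, so each card stays a single TSV line.
        definition=$(lookup "$word" | tr '\t' ' ' |
            awk 'NR > 1 { printf "<br>" } { printf "%s", $0 }')
        printf '%s\t%s\n' "$word" "$definition"
    done
}
```

Usage would be along the lines of make_deck < words.txt > deck.tsv. The definitions will still include whatever header lines sdcv prints around each entry, so expect to do some trimming.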
Hotkey Lookup with GoldenDict
There was one final feature that I wanted: an equivalent of long-press lookup on ereaders.
GoldenDict probably has this built in, but I couldn’t get it working. Instead, I added a custom keyboard shortcut that runs the following command:
bash -c "goldendict \"$(xsel | tr '\n' ' ' | sed -r 's/^[^[:alpha:]]*([-[:alpha:]]*).*$/\1/')\""
Don’t ask me how it works: it works, and that is enough.
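For anyone who insists on asking, here is my best reading of it, unpacked into two functions. The names first_word and lookup_selection are made up, and I have swapped sed -r for the equivalent -E spelling; otherwise the stages are meant to match the one-liner.

```shell
#!/usr/bin/env bash
# Sketch: the same pipeline as the shortcut, split into named stages.
# first_word and lookup_selection are hypothetical names.

first_word() {
    # tr joins a multi-line selection into a single line.
    # sed then skips any leading non-letters, keeps the first run of
    # letters and hyphens, and throws away everything after it.
    tr '\n' ' ' | sed -E 's/^[^[:alpha:]]*([-[:alpha:]]*).*$/\1/'
}

lookup_selection() {
    # xsel with no options prints the primary selection, i.e. whatever
    # text is currently highlighted; goldendict then looks up that word.
    goldendict "$(xsel | first_word)"
}
```

So highlighting a whole sentence looks up only its first word, and surrounding quotes or punctuation are dropped. One consequence of the pattern is that a word with an apostrophe gets cut at the apostrophe.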