Introduction
I've been reading books on the computer for many years, but it wasn't until I started writing fiction again that I became seriously interested in the field of e-books. As it turns out, information about them is spread across various websites, and it takes a while to form a clear picture.
This is a guide to e-books from a personal perspective that aims to collect the basics in one place. I tried to keep the language simple while not shying away from technical terms. Feedback is welcome, though I can't promise to update much.
What are e-books?
From Wikipedia, the free encyclopedia:
An electronic book (variously, e-book, ebook, digital book, or even e-edition) is a book-length publication in digital form, consisting of text, images, or both, and produced on, published through, and readable on computers or other electronic devices.
Why e-books?
I love printed books. I grew up with them, and I hope they never go away. It's a pleasure to hold them, they can be works of art and they make great presents. But e-books have a lot going for them as well:
- E-books can be searched; if you've ever spent long minutes leafing through a paper book looking for a particular passage, only to end up frustrated, you know what I mean.
- E-books only take up as much physical space as the storage device. My Palm fits in a pocket, yet with the 256 MB memory card it can probably hold 700-800 e-books.
- E-books can be organized easily. Libraries spend a lot of time sorting and cataloging books, while on the computer they can essentially sort themselves.
- E-books are trivially shared. It costs nothing to copy an e-book and send it across the planet, which is great when you're trying to keep a really old book available after it's no longer commercially viable.
- Contrary to the definition in the previous section, e-books are free from the constraints of the printed book, and can be as short or long as they need to be.
E-book formats
TXT
Plain text is the most basic file type one may use for e-books. People used to office suites may balk at the lack of formatting options, but in fact you need very little of that. Plain text files are compact (since they have zero overhead), compress well and can be read nearly everywhere; I had a bunch of books on my old MP3 player, for example. On the downside, it's hard to navigate within a large text file (except by searching for known keywords), and sometimes they can be a little too plain.
Plain text can be read on just about any device, often with preloaded apps.
HTML
Being the language of the Web, HTML is uniquely well positioned to be used for e-books. On the plus side, it's ubiquitous, allows for rich formatting, and adapts to various screen sizes and capabilities more easily than TXT. On the minus side, rendering it takes a lot of hardware resources: my Nokia E5 used to choke trying to render an entire novel in the browser. Still, even cheaper feature phones can read HTML to a degree.
There are several competing standards for metadata in HTML, but luckily they can coexist just fine.
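To illustrate how these metadata flavors can sit side by side, here is a minimal sketch using only Python's standard library. The sample page, the `MetaCollector` class and the `read_metadata` helper are all made up for this example; the page mixes a plain `author` tag, Dublin Core style names (`DC.title`, `DC.creator`) and an Open Graph property (`og:title`), and one pass collects them all without conflict.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <meta> name/property pairs and the <title> text."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Plain HTML and Dublin Core tags use name=, Open Graph uses property=.
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content") is not None:
                self.meta[key] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Hypothetical sample page, invented for this sketch.
SAMPLE = """<!DOCTYPE html>
<html><head>
<title>A Sample Story</title>
<meta name="author" content="Jane Doe">
<meta name="DC.title" content="A Sample Story">
<meta name="DC.creator" content="Jane Doe">
<meta property="og:title" content="A Sample Story">
</head><body><p>Once upon a time.</p></body></html>"""


def read_metadata(html_text):
    """Return (title, metadata dict) for one HTML document."""
    parser = MetaCollector()
    parser.feed(html_text)
    parser.close()
    return parser.title, parser.meta


if __name__ == "__main__":
    title, meta = read_metadata(SAMPLE)
    print(title)
    for key, value in sorted(meta.items()):
        print(f"{key}: {value}")
```

A cataloging tool would still have to decide which source wins when the values disagree; this sketch simply keeps them all under their own keys.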
PDF
Designed in the early 1990s as an interchange format, PDF was later made an industry standard for documents meant to be printed. That makes PDF files great for preserving the exact visual appearance of a book, but they're also large, and unsuitable for small screens (unless special care is taken) since they don't reflow.
PDF files can be read on any laptop or desktop computer; smartphones and the like can open them as well, but that's only good for previewing.
EPUB
Now we get into formats specifically designed for e-books. The most widespread of those is EPUB, an open format that bundles HTML files along with images and fonts into a ZIP archive. On the minus side, it suffers from design-by-committee, being unnecessarily complex (I would have made it more akin to CBZ). As an advantage, EPUB is very compact and allows for rich metadata, which greatly facilitates cataloging work. It can also do well on devices with limited memory.
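To make the "HTML files in a ZIP archive" idea concrete, here is a rough sketch in Python that builds a stripped-down EPUB in memory and then reads it back the way a cataloging tool might. The file names, the book details and both helper functions are invented for the example, and the result is deliberately incomplete: it leaves out the navigation file, among other things, so a validator would reject it. What it does show is the general layout: a `mimetype` entry stored first and uncompressed, a `META-INF/container.xml` that points at the package file, and Dublin Core metadata inside that package file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

# Hypothetical book details; a real package file needs more than this.
CONTENT_OPF = """<?xml version="1.0"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="id">urn:example:sample-book</dc:identifier>
    <dc:title>A Sample Book</dc:title>
    <dc:creator>Jane Doe</dc:creator>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine><itemref idref="ch1"/></spine>
</package>"""

CHAPTER = """<?xml version="1.0"?>
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>One</title></head>
<body><p>Once upon a time.</p></body></html>"""


def build_sample_epub():
    """Return the bytes of a tiny, incomplete EPUB-like archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry goes first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        for name, text in [("META-INF/container.xml", CONTAINER_XML),
                           ("content.opf", CONTENT_OPF),
                           ("chapter1.xhtml", CHAPTER)]:
            archive.writestr(name, text, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def describe_epub(data):
    """Return (entry names, package path, title, creator) for EPUB bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = archive.namelist()
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        opf_path = container.find("c:rootfiles/c:rootfile", NS).get("full-path")
        package = ET.fromstring(archive.read(opf_path))
        title = package.findtext("opf:metadata/dc:title", namespaces=NS)
        creator = package.findtext("opf:metadata/dc:creator", namespaces=NS)
    return names, opf_path, title, creator


if __name__ == "__main__":
    names, opf_path, title, creator = describe_epub(build_sample_epub())
    print(names)
    print(f"{title} by {creator} (package file: {opf_path})")
```

The indirection through `container.xml` is a good example of the complexity I complained about, but it also means the metadata can be found without opening any of the chapters.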
You can read EPUB on Java-enabled feature phones with Albite Reader, or on PCs using SumatraPDF on Windows and Okular on Linux (poorly in the latter case). Most dedicated e-book readers support EPUB natively, and compatible software is available for pretty much any other mobile device; on Android I currently use the ultra-lightweight Booky McBookface from F-Droid.
PDB
Since I mentioned my Palm: you'll notice that every single file made for those good old machines has the .pdb extension. That's actually a container format, much like AVI is for movies. The contents can be AportisDoc (a.k.a. PalmDoc), eReader (nowadays owned by B&N), Plucker or zTxt, to list just the most common e-book formats for the platform. Some of these formats can be read on modern devices, but there is little point.
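Because the extension says nothing about the contents, a program has to look inside the file header to tell these formats apart. Below is a small Python sketch of that idea. It assumes the commonly documented 78-byte Palm database header, with a four-byte type code at offset 60 and a four-byte creator code at offset 64; the type/creator pairs in the table are the ones usually listed for these formats, so treat them as an assumption to verify against real files. The `make_header` helper only fabricates a header for demonstration.

```python
import struct

HEADER_SIZE = 78  # assumed size of the Palm database header

# Type/creator pairs as commonly documented; verify against real files.
KNOWN_FORMATS = {
    (b"TEXt", b"REAd"): "PalmDoc",
    (b"PNRd", b"PPrs"): "eReader",
    (b"Data", b"Plkr"): "Plucker",
    (b"zTXT", b"GPlm"): "zTxt",
}


def identify_pdb(data):
    """Return (database name, format label, record count) from PDB bytes."""
    if len(data) < HEADER_SIZE:
        raise ValueError("too short to be a Palm database")
    # The name is a NUL-terminated string in the first 32 bytes.
    name = data[:32].split(b"\0", 1)[0].decode("latin-1")
    type_code, creator = struct.unpack_from(">4s4s", data, 60)
    (record_count,) = struct.unpack_from(">H", data, 76)
    label = KNOWN_FORMATS.get((type_code, creator), "unknown")
    return name, label, record_count


def make_header(name, type_code, creator, record_count=0):
    """Fabricate a bare header for demonstration purposes only."""
    header = bytearray(HEADER_SIZE)
    encoded = name.encode("latin-1")[:31]
    header[:len(encoded)] = encoded
    struct.pack_into(">4s4s", header, 60, type_code, creator)
    struct.pack_into(">H", header, 76, record_count)
    return bytes(header)


if __name__ == "__main__":
    fake = make_header("Sample Book", b"TEXt", b"REAd", record_count=3)
    print(identify_pdb(fake))
```

With a real file you would pass the result of `open(path, "rb").read(HEADER_SIZE)` to `identify_pdb` instead of a fabricated header.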
FB2
Coming from Russia, where e-books were very popular long before the rest of the world had heard about them, FictionBook (a.k.a. FB2) is a custom format for representing literature. It's between TXT and HTML in size and it encodes an entire book, including images, in a single XML-based file. That makes it very easy to parse, but abusing the ability to embed images can make an FB2 file swell up quickly, and since it has to be parsed all at once, devices with plentiful memory are required to handle it.
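As a rough illustration of how little code the parsing takes, here is a Python sketch that pulls the title, author, paragraph count and embedded image sizes out of an FB2 document. The sample book is invented and trimmed well below what the format's schema requires, and `summarize_fb2` is just a name I picked. Note that it loads the whole document at once, which is exactly the memory cost mentioned above, and that images travel as base64 text inside `binary` elements.

```python
import base64
import xml.etree.ElementTree as ET

FB2_NS = {"fb": "http://www.gribuser.ru/xml/fictionbook/2.0"}

# Hypothetical, heavily trimmed sample; the "image" is only 8 bytes long.
SAMPLE_FB2 = """<?xml version="1.0"?>
<FictionBook xmlns="http://www.gribuser.ru/xml/fictionbook/2.0">
  <description>
    <title-info>
      <author><first-name>Jane</first-name><last-name>Doe</last-name></author>
      <book-title>A Sample Story</book-title>
      <lang>en</lang>
    </title-info>
  </description>
  <body>
    <section>
      <title><p>Chapter One</p></title>
      <p>Once upon a time.</p>
    </section>
  </body>
  <binary id="cover.png" content-type="image/png">iVBORw0KGgo=</binary>
</FictionBook>"""


def summarize_fb2(xml_text):
    """Return (title, author, paragraph count, {image id: size in bytes})."""
    root = ET.fromstring(xml_text)  # parses the entire book in one go
    info = root.find("fb:description/fb:title-info", FB2_NS)
    title = info.findtext("fb:book-title", namespaces=FB2_NS)
    first = info.findtext("fb:author/fb:first-name", default="", namespaces=FB2_NS)
    last = info.findtext("fb:author/fb:last-name", default="", namespaces=FB2_NS)
    # Counts every <p> under <body>, including the ones inside titles.
    paragraphs = root.findall("fb:body//fb:p", FB2_NS)
    images = {
        binary.get("id"): len(base64.b64decode(binary.text or ""))
        for binary in root.findall("fb:binary", FB2_NS)
    }
    return title, f"{first} {last}".strip(), len(paragraphs), images


if __name__ == "__main__":
    print(summarize_fb2(SAMPLE_FB2))
```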
On Linux, Okular can read FB2 files. Try CoolReader on Android, though others should work.
Making your own
Sooner or later you'll want to make your own e-books, either from scratch or by assembling third-party materials (imagine an anthology of public domain short stories). This is incredibly valuable, because it means anyone can be a publisher and printing press, contributing to the world's culture without having to ask the rich and powerful for permission.
As a technical person, my process goes as follows:
- write in plain text and convert to HTML with Markdown;
- compile the HTML files into an EPUB with Sigil;
- convert the EPUB to other formats with Calibre.
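The last step can also be scripted, since Calibre ships a command-line converter called `ebook-convert` that takes an input file and an output file and picks the formats from the extensions. Here is a minimal Python sketch, assuming Calibre is installed and `ebook-convert` is on the PATH; the function names, the `book.epub` file name and the choice of target formats are mine, purely for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def conversion_commands(source, formats=("fb2", "mobi")):
    """Build one ebook-convert command line per target format."""
    source = Path(source)
    return [
        ["ebook-convert", str(source), str(source.with_suffix("." + ext))]
        for ext in formats
    ]


def convert_all(source, formats=("fb2", "mobi")):
    """Run the conversions, stopping at the first failure."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH; is Calibre installed?")
    for command in conversion_commands(source, formats):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only prints the commands; call convert_all("book.epub") to run them.
    for command in conversion_commands("book.epub"):
        print(" ".join(command))
```

Keeping the command building separate from the running makes it easy to check what would happen before touching any files.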
Or, if you plan to go printing as well:
- create your book in LibreOffice; make sure to mark the headings as headings and so on, it's very important (in other words, use styles);
- export as PDF; remember to tick the PDF/A-1a checkbox, which will give you an "archival" document with the fonts embedded and other useful features (but in this case remember to double check the license for the fonts you used);
- import the .odt file into Calibre and convert to EPUB;
- load the EPUB in Sigil and clean it up, because it's going to need it!
Miscellaneous
It's worth mentioning that Calibre is a powerhouse: in addition to converting between formats, it can keep your e-books in a library and serve that library over the Internet, among other useful functions.
Calibre also features an excellent e-book reader which can display EPUB, FB2, eReader and MOBI at the very least.