Introduction

I've been reading books on the computer for many years, but it wasn't until I started writing fiction again that I became seriously interested in the field of e-books. As it turns out, information about them is spread across various websites, and it takes a while to form a clear picture.

This is a guide to e-books from a personal perspective that aims to collect the basics in one place. I tried to keep the language simple without shying away from technical terms. Feedback is welcome, though I can't promise to update the guide much.

What are e-books?

From Wikipedia, the free encyclopedia:

An electronic book (variously, e-book, ebook, digital book, or even e-edition) is a book-length publication in digital form, consisting of text, images, or both, and produced on, published through, and readable on computers or other electronic devices.

Why e-books?

I love printed books. I grew up with them, and I hope they never go away. It's a pleasure to hold them, they can be works of art and they make great presents. But e-books have a lot going for them as well:

E-book formats

TXT

Plain text is the most basic file type one may use for e-books. People used to office suites may balk at the lack of formatting options, but in fact you need very little of that. Plain text files are compact (since they have zero overhead), compress well and can be read nearly everywhere; I had a bunch of books on my old MP3 player, for example. On the downside, it's hard to navigate within a large text file (except by searching for known keywords), and sometimes they can be a little too plain.

Plain text can be read on just about any device, often with preloaded apps.

HTML

Being the language of the Web, HTML is uniquely well positioned to be used for e-books. On the plus side, it's ubiquitous, allows for rich formatting, and adapts to various screen sizes and capabilities more easily than TXT. On the minus side, rendering it takes considerable hardware resources: my Nokia E5 used to choke trying to render an entire novel in the browser. Still, even cheaper feature phones can read HTML to a degree.

There are several competing standards for metadata in HTML, but luckily they can coexist just fine.
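To illustrate that coexistence, here is a minimal Python sketch that writes one HTML head carrying the same title and author in three vocabularies at once. Picking plain meta tags, Dublin Core and Open Graph is an assumption about which standards are involved, and the function name build_head is made up for the example.

```python
from html import escape


def build_head(title: str, author: str, language: str = "en") -> str:
    """Return an HTML <head> that repeats the metadata in three vocabularies."""
    # escape() also escapes quotes, so values are safe inside attributes.
    t, a, lang = escape(title), escape(author), escape(language)
    lines = [
        "<head>",
        '<meta charset="utf-8">',
        f"<title>{t}</title>",
        # Plain HTML metadata.
        f'<meta name="author" content="{a}">',
        # Dublin Core, conventionally written with a "DC." prefix
        # (assumed here to be one of the standards in question).
        '<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">',
        f'<meta name="DC.title" content="{t}">',
        f'<meta name="DC.creator" content="{a}">',
        f'<meta name="DC.language" content="{lang}">',
        # Open Graph uses the "property" attribute instead of "name".
        f'<meta property="og:title" content="{t}">',
        '<meta property="og:type" content="book">',
        "</head>",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_head("An Example Book", "A. Author"))
```

Since each vocabulary uses its own names, a reader or indexer that only knows one of them can simply ignore the rest.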

PDF

Designed in the early 1990s as an interchange format, PDF was later made an industry standard for documents meant to be printed. That makes PDF files great for preserving the exact visual appearance of a book, but they're also large, and unsuitable for small screens (unless special care is taken) since they don't reflow.

PDF files can be read on any laptop or desktop computer; smartphones and the like can open them as well, but that's only good for previewing.

EPUB

Now we get into formats specifically designed for e-books. The most widespread of those is EPUB, an open format that bundles HTML files along with images and fonts into a ZIP archive. On the minus side, it suffers from design-by-committee, being unnecessarily complex (I would have made it more akin to CBZ). As an advantage, EPUB is very compact and allows for rich metadata, which greatly facilitates cataloging work. It can also do well on devices with limited memory.
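Because an EPUB is just a ZIP archive, you can peek inside one with a few lines of Python. The sketch below is only an illustration: inspect_epub and toy_epub are hypothetical names, and the toy archive is deliberately incomplete (it has no package file), so it would not pass for a real book.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"


def inspect_epub(source) -> dict:
    """Summarise an EPUB given a path or a binary file-like object."""
    with zipfile.ZipFile(source) as archive:
        names = archive.namelist()
        # The "mimetype" entry identifies the archive as an EPUB.
        mimetype = archive.read("mimetype").decode("ascii").strip()
        # container.xml points at the package file that lists everything else.
        root = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = root.find(f"{CONTAINER_NS}rootfiles/{CONTAINER_NS}rootfile")
        package = rootfile.get("full-path") if rootfile is not None else None
    return {"mimetype": mimetype, "package": package, "files": names}


def toy_epub() -> io.BytesIO:
    """Build an incomplete stand-in archive, just enough to inspect."""
    container = (
        '<?xml version="1.0"?>'
        '<container version="1.0" '
        'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
        '<rootfiles><rootfile full-path="content.opf" '
        'media-type="application/oebps-package+xml"/></rootfiles>'
        "</container>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry goes first and is stored uncompressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", container)
        archive.writestr("chapter1.xhtml", "<html><body><p>Hi</p></body></html>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(inspect_epub(toy_epub()))
```

Pointing inspect_epub at a real file path instead of the toy archive should list the HTML, image and font files the book is made of.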

You can read EPUB on Java-enabled feature phones with Albite Reader, or on PCs using SumatraPDF on Windows and Okular on Linux (poorly in the latter case). Most dedicated e-book readers support EPUB natively, and compatible software is available for pretty much any other mobile device; on Android I currently use the ultra-lightweight Booky McBookface from F-Droid.

PDB

Since I mentioned my Palm, you'll notice every single file made for the good old machines has the .pdb extension. That's actually a container format, much like AVI is for movies. The contents can be AportisDoc (a.k.a. PalmDoc), eReader (nowadays owned by B&N), Plucker or zTxt, to list just the most common e-book formats for the platform. Some of these formats can be read on modern devices, but there is little point.
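What tells the contents apart is a pair of four-character codes (type and creator) stored in the PDB header. Here is a rough Python sketch that reads them. The offsets follow the commonly documented header layout, and the code table is an assumption based on what is usually reported for these formats, not an official registry; identify_pdb is a made-up name.

```python
import struct

# Type/creator pairs as commonly documented for Palm e-book formats.
# Treat this table as an assumption, not an exhaustive registry.
KNOWN_FORMATS = {
    (b"TEXt", b"REAd"): "PalmDoc",
    (b"PNRd", b"PPrs"): "eReader",
    (b"Data", b"Plkr"): "Plucker",
    (b"zTXT", b"GPlm"): "zTxt",
}


def identify_pdb(header: bytes) -> tuple:
    """Return (database name, guessed format) from the start of a PDB file."""
    if len(header) < 68:
        raise ValueError("need at least the first 68 bytes of the file")
    # 32-byte NUL-padded name, 28 bytes of flags and dates that we skip,
    # then the 4-byte type code and the 4-byte creator code.
    name, db_type, creator = struct.unpack_from(">32s28x4s4s", header)
    # latin-1 never fails to decode, which is good enough for a label.
    title = name.split(b"\0", 1)[0].decode("latin-1")
    label = KNOWN_FORMATS.get((db_type, creator))
    if label is None:
        label = "unknown ({}/{})".format(
            db_type.decode("latin-1"), creator.decode("latin-1")
        )
    return title, label


if __name__ == "__main__":
    # A hand-made header standing in for a real file.
    fake = b"My Book".ljust(32, b"\0") + bytes(28) + b"TEXt" + b"REAd"
    print(identify_pdb(fake))
```

With a real file, passing the result of open(path, "rb").read(68) would do the same job.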

FB2

Coming from Russia, where e-books were very popular long before the rest of the world had heard about them, FictionBook (a.k.a. FB2) is a custom format for representing literature. It's between TXT and HTML in size and it encodes an entire book, including images, in a single XML-based file. That makes it very easy to parse, but abusing the ability to embed images can make an FB2 file swell up quickly, and since it has to be parsed all at once, it takes a device with plenty of memory to handle it.
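To show how little it takes to parse, here is a small Python sketch that pulls the title, paragraph count and embedded image sizes out of an FB2 document. The sample is a made-up, heavily trimmed book, and summarise_fb2 is a hypothetical helper; real files carry far more metadata.

```python
import base64
import xml.etree.ElementTree as ET

FB2_NS = {"fb": "http://www.gribuser.ru/xml/fictionbook/2.0"}

# A made-up, minimal document; the "image" is just the word "hello".
SAMPLE = """<FictionBook xmlns="http://www.gribuser.ru/xml/fictionbook/2.0">
  <description><title-info><book-title>A Toy Book</book-title></title-info></description>
  <body><section><p>One.</p><p>Two.</p></section></body>
  <binary id="cover.png" content-type="image/png">aGVsbG8=</binary>
</FictionBook>"""


def summarise_fb2(xml_text) -> dict:
    """Parse a whole FB2 document in one go and report what it contains."""
    root = ET.fromstring(xml_text)
    title = root.findtext(
        "fb:description/fb:title-info/fb:book-title",
        default="",
        namespaces=FB2_NS,
    )
    paragraphs = root.findall("fb:body//fb:p", FB2_NS)
    images = {}
    # Images live in <binary> elements as base64 text, inside the same file.
    for binary in root.findall("fb:binary", FB2_NS):
        data = base64.b64decode(binary.text or "")
        images[binary.get("id")] = len(data)
    return {"title": title, "paragraphs": len(paragraphs), "images": images}


if __name__ == "__main__":
    print(summarise_fb2(SAMPLE))
```

Note that ET.fromstring builds the complete tree in memory before anything can be read, which is the memory cost mentioned above.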

On Linux, Okular can read FB2 files. Try CoolReader on Android, though others should work.

Making your own

Sooner or later you'll want to make your own e-books, either from scratch or by assembling third-party materials (imagine an anthology of public domain short stories). This is incredibly valuable, because it means anyone can be a publisher and printing press, contributing to the world's culture without having to ask the rich and powerful for permission.

As a technical person, my process goes as follows:

Or, if you plan to go printing as well:

Miscellaneous

It's worth mentioning that Calibre is a powerhouse: besides converting between formats, it can keep your e-books organized in a library and serve that library over the Internet, among other useful functions.

Calibre also features an excellent e-book reader which can display EPUB, FB2, eReader and MOBI at the very least.