Paginated outputs remain important to scholarly communications, and are still critical for books like monographs. Even in today’s increasingly digital discovery landscape, many readers of long-form content continue to prefer print, and the ability to cite page numbers continues to be critical to creating good old-fashioned tools like a book index. But producing paginated books from HTML source files that could also be used for generating other types of digital files has always been a challenge, as Nellie McKesson notes in her recent blog post on Hederis. So, a couple of years ago, the University of California Press and the California Digital Library partnered with Coko to begin an ambitious project to develop a workflow application that would allow books to be built in a browser using entirely open source technologies. Editoria is not the first open source, browser-based book production system that has ever been attempted, but it’s (at least to our knowledge) the first that has attempted to replicate the rigorous production editing process and workflow, which includes styling, copyediting, author review, and proofreading, in a browser-based application.
We borrowed the idea of single-source publishing using HTML source from predecessor applications like Adam Hyde’s Booktype, O’Reilly’s Atlas, and Hugh McGuire’s Pressbooks, all of which use some form of PDF rendering engine (often proprietary) to output beautiful, paginated books in addition to EPUBs and other HTML or XML-based files. We have then tried to stand on the shoulders of those applications by building in a greater degree of workflow support. It’s an ambitious project, and supporting paginated outputs from a single HTML-based source file has been a non-trivial aspect of the system’s development.
Editoria starts with a book dashboard where all active titles that a user is working on can be seen and accessed.
Clicking the “edit” link next to any of the books in the library brings you to the so-called “Book Builder” interface, which represents the narrative structure of the book.
We’ve introduced the ability to upload Microsoft Word documents and to order the book into the commonly understood sections of frontmatter, body, and backmatter that are outlined in the Chicago Manual of Style.
Clicking “edit” next to any section of the book drops you into a web-based word processing environment, Wax, where production editors, copyeditors, and authors can all interact around the text.
This is where manuscript styling and copyediting take place. The interface introduces the ability to edit notes as well as text. Access to the text can be managed using a team manager that allows for fairly granular role-based permissions. These allow administrators to control who can do what to the text at what point.
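To make the idea of role-based permissions concrete, here is a minimal sketch in Python. It is not Editoria's actual implementation: the role names, the `PERMISSIONS` table, and the `can_edit` helper are hypothetical, and the rules themselves are assumptions chosen only to illustrate "who can do what to the text at what point." The stages are the ones named earlier in this post.

```python
from enum import Enum


class Stage(Enum):
    """Workflow stages named in this post."""
    STYLING = "styling"
    COPYEDITING = "copyediting"
    AUTHOR_REVIEW = "author review"
    PROOFREADING = "proofreading"


# Hypothetical rules: which role may edit the text at which stage.
# These are illustrative assumptions, not Editoria's real configuration.
PERMISSIONS = {
    "production_editor": set(Stage),  # may edit at every stage
    "copyeditor": {Stage.COPYEDITING},
    "author": {Stage.AUTHOR_REVIEW},
}


def can_edit(role: str, stage: Stage) -> bool:
    """Return True if the given role may edit the text at this stage."""
    return stage in PERMISSIONS.get(role, set())
```

In a sketch like this, an administrator changes who can touch the text simply by editing the table, for example by adding `Stage.PROOFREADING` to the author's set.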
Once the text has reached a point at which everyone feels that it’s ready to be published, the book can be exported in either PDF or EPUB format. At the moment, these are the two standards-based formats that the system supports. We are currently using the Vivliostyle CSS-based typesetting engine. We initially chose Vivliostyle not only because it was open source, but also because it tried, insofar as possible, to work with existing web standards for browser-based pagination.
This is a fairly simple, text-only example, but there are a number of complex elements that all have to be handled perfectly in order to render a high-quality PDF from HTML source. These two pages alone contain or require running heads, subheads, footnotes, diacritics, and hyphenation control, all of which are critical to rendering not just a serviceable PDF, but one that is of high enough quality for a publisher or author.
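For readers curious what some of these elements look like in standards-based CSS, the sketch below assembles a small paged-media stylesheet as a Python string. It is an illustration under stated assumptions, not the stylesheet Editoria ships: the class names, the 6in by 9in trim size, and the `build_stylesheet` helper are hypothetical. The CSS features themselves (`@page` margin boxes, `string-set`, `float: footnote`, and `hyphens`) come from the W3C paged media, generated content, and text specifications, and support for them varies from one rendering engine to another.

```python
from string import Template

# CSS template; $size is filled in by build_stylesheet below.
_PAGED_CSS = Template("""\
@page { size: $size; margin: 0.75in; }

/* Running heads: book title on left pages, chapter title on right pages */
@page :left  { @top-left  { content: string(book-title); } }
@page :right { @top-right { content: string(chapter-title); } }

/* Capture heading text for the running heads (class names are hypothetical) */
h1.book-title    { string-set: book-title content(); }
h2.chapter-title { string-set: chapter-title content(); }

/* Move inline note text to the foot of the page */
span.footnote { float: footnote; }

/* Hyphenation control for body text */
p { hyphens: auto; }
""")


def build_stylesheet(size: str = "6in 9in") -> str:
    """Return a paged-media stylesheet for the given page size (assumed default)."""
    return _PAGED_CSS.substitute(size=size)


if __name__ == "__main__":
    print(build_stylesheet())
```

The point of the sketch is that running heads, footnotes, and hyphenation are each expressed as a few declarative rules applied to the same HTML source that feeds the EPUB.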
A page allows a reader to situate herself in the text for a fully immersive reading experience. As can be seen from the amount of work that has gone into creating a high-quality word processing environment, what we are striving to do with Editoria is not just to create a book, but to create the best book that can be created in the browser using one source file. EPUBs and other reflowable formats will continue to evolve, and they can already support features that are difficult to support in paginated media. Our friends who are in the midst of developing Manifold and Fulcrum are showing everyone all that the EPUB can do. But, in the end, there is still a need for pages. Interoperable, open source, standards-based approaches using CSS and JavaScript, which are at the heart of the Paged Media project, will ensure that everyone can have their content in exactly the way that they would like to consume it for some time to come.
Bravo! Fantastic to see this progress from Editoria. And btw I think all your choices in its development have been spot-on. Well done!
-Bill Kasdorf