From 617489bc7ad777de7d5423439f04f74e7089bd53 Mon Sep 17 00:00:00 2001
From: Richard Wong
Date: Sun, 28 Apr 2024 21:00:42 +0900
Subject: [PATCH] Doc: added a readme

---
 README.md | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..673d060
--- /dev/null
+++ b/README.md
@@ -0,0 +1,31 @@
+# epub to html
+
+This repo contains sample code that converts an epub into a single html page.
+
+## How?
+
+An epub is essentially a zipped collection of html files.
+
+I unzipped the epub into a folder called "./epub" and worked from there.
+
+I used BeautifulSoup to go through the html files and merge them.
+
+I also did some structural processing so that links keep working in the
+single-page html.
+
+## Why?
+
+Sometimes you just need a simple single-page html file to read your document
+in the browser.
+
+I found a surprising lack of tools that merge multiple html files into one
+with working links.
+
+## Upcoming plans
+
+For now you have to unzip the epub manually to get at its internal html
+files. I make no assumptions about the general structure of epubs, and I have
+only tested the code on a single epub that I had.
+
+Future work: make the tool more user-friendly by turning it into a simple
+binary that just takes an epub file as input.
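
The README describes two steps that a short sketch may make more concrete: unzipping the epub into "./epub", and rewriting links so they still resolve once every file lives in one page. The block below is a minimal illustration using only the Python standard library, not the code from this repo (which uses BeautifulSoup). The function names `extract_epub` and `to_single_page_href` and the `#<file>-<fragment>` anchor scheme are assumptions made for this example. Sorting files by path is also an assumption; the actual reading order of an epub is defined by the spine in its package document.

```python
import zipfile
from pathlib import Path
from urllib.parse import urlsplit

HTML_SUFFIXES = {".html", ".xhtml", ".htm"}


def extract_epub(epub_path, out_dir="./epub"):
    """Unzip an epub (a ZIP archive) and return its html files, sorted by path."""
    out = Path(out_dir)
    with zipfile.ZipFile(epub_path) as archive:
        archive.extractall(out)
    return sorted(
        p for p in out.rglob("*")
        if p.is_file() and p.suffix.lower() in HTML_SUFFIXES
    )


def to_single_page_href(href, current_file, merged_files):
    """Map a link between merged files to an in-page anchor (hypothetical scheme).

    "ch2.html#s1" becomes "#ch2.html-s1" and "ch2.html" becomes "#ch2.html".
    A bare fragment such as "#s1" is resolved against current_file.
    External links and links to files that were not merged are left unchanged.
    """
    parts = urlsplit(href)
    if parts.scheme or parts.netloc:
        return href  # absolute URL (http, mailto, ...): keep as-is
    name = Path(parts.path).name if parts.path else current_file
    if name not in merged_files:
        return href  # e.g. an image or stylesheet
    return f"#{name}-{parts.fragment}" if parts.fragment else f"#{name}"
```

For this scheme to work, the merged page would need one wrapper element per source file whose `id` is the file name, and every original `id` inside that file renamed to `<file>-<id>`, so that the rewritten links have matching targets.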