kindle manager

This commit is contained in:
gavin
2021-08-25 17:58:31 +08:00
parent 5f0c0a9724
commit 6b3c0f3b6b
303 changed files with 87829 additions and 42537 deletions

BIN  .DS_Store (vendored, binary file not shown)

File diff suppressed because one or more lines are too long (3 files)

View File

@@ -151,3 +151,5 @@ b['1']['2'] = {'3':1} # OK
 - [ ] write and manage notes
 - [ ] save export file to specific directory (config dialog)
+- [ ] parse mobi e-books, match highlights against the chapter table of contents, and export them as notes

View File

@@ -1,3 +1,4 @@
 pyside2-uic mainwindow.ui -o mainwindow.py
 pyside2-rcc kmanapp.qrc -o kmanapp_rc.py
-cp -fr *py *md *ico *qrc *ui ~/penv/kman/
+cp -fr icons *py *md *ico *qrc *ui ~/penv/kman/
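For context, pyside2-uic compiles the Qt Designer file into mainwindow.py and pyside2-rcc compiles the icon resources into kmanapp_rc.py. A minimal sketch of how such generated modules are usually wired together (the MainWindow subclass and entry point here are illustrative, not taken from this repo):

import sys
from PySide2.QtWidgets import QApplication, QMainWindow

import kmanapp_rc  # noqa: F401  (importing registers the compiled :/icons/... resources)
from mainwindow import Ui_MainWindow


class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.ui = Ui_MainWindow()  # generated by pyside2-uic
        self.ui.setupUi(self)      # builds the widgets, actions and toolbar on this window


if __name__ == "__main__":
    app = QApplication(sys.argv)
    win = MainWindow()
    win.show()
    sys.exit(app.exec_())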

BIN  downimg/s32331195.jpg (new file, 26 KiB, binary file not shown)

BIN  downimg/s6798546.jpg (new file, 14 KiB, binary file not shown)

View File

@@ -1,72 +1,17 @@
-TYPE|BOOKNAME|AUTHOR|MARKTIME|CONTENT
---|--|--|--|--
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 8:01:17|在古希腊时期,所谓僭主,意思是指“不通过世袭、传统或是合法民主选举程序,而是凭借个人的声望与影响力获得权力,来统治城邦的统治者”。照此标准,俄狄浦斯无疑是僭主,他虽然不是借助暴力欺诈而是借助理性获取王位,但依旧不具备传统意义上的统治合法性。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 8:04:16|,俄狄浦斯的悲惨人生,在普通人看来是任意性与偶然性在作祟,在神的视角里,则是向神定的秩序与必然的法则的回归。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 8:05:05|德国学者约亨·施密特Jochen Schmidt认为公元前429年和公元前427年的两场瘟疫以及长达27年的伯罗奔尼撒战争给雅典的政治与宗教造成了深远的心理影响“一方面它促成了怀疑和玩世不恭……另一方面在这多灾多难的几年里许多人逃到旧宗教里面去。”索福克勒斯代表的就是后一类人。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 8:07:08|《俄狄浦斯在科罗诺斯》就在试图传达这样一个信息:既然理性化的努力注定失败,那就让我们泰然接受这个命定失败的结局,既然神的意志人类无法理解,那就让我们泰然接受神的安排。这正是古希腊悲剧精神的要义所在:“它接受生活,是因为它清楚地看到生活必然如此,而不会是其他的样子。”
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 8:08:01|俄狄浦斯荣登忒拜王位的过程象征了理性对于神启、血统和传统的胜利,而他的最终垮台则被视为理性主义的失败,意味着对知识与力量过分自信的人所遭到的“存在意义上的失败”。在古希腊的神话和诗歌中,有太多英雄人物遭受到这种“存在意义上的失败”。普罗米修斯、阿伽门农、俄狄浦斯,莫不如此。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 8:09:10|在结束这一讲之前,我想请你们和我一起重温德尔菲神庙上最著名的三条箴言:“认识你自己”,“凡事勿过度”,以及“生存与毁灭就在一瞬间”。其中,第一条箴言宣告了人类终其一生的命题;第二条箴言告诫人类要克服本性上的僭越冲动,始终恪守在永恒固定的界限之内;第三条箴言则再次重申了《僭主俄狄浦斯》中“第四合唱歌”中的警示: 凡人的子孙啊,我把你们的生命当作一场空!谁的幸福不是表面现象,一会儿就消失了?不幸的俄狄浦斯,你的命运,你的命运警告我不要说凡人是幸福的。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 8:21:37|历史上最著名的智者叫作普罗塔戈拉Protagoras约公元前490年—前420年他在开班授徒之前会跟学生事先协定学生先付一半学费剩下的一半等学生打赢了第一场官司以后再付而如果第一场官司打输了那么证明老师教学效果不佳剩下的那一半学费就不用交了。可问题是有一个学生毕业之后既不出庭打官司也不交剩下的学费。普罗塔戈拉等了很久终于耐不住性子就向法院提起了诉讼师徒对簿公堂。在法庭上普罗塔戈拉跟他的学生说“如果你打赢了这场官司那么按照合同你应该付我另一半的学费如果你输了那么按照法庭的裁决你也应该付给我另一半学费。总之这次官司无论输赢你都得付我另一半学费。”谁知道他的学生针锋相对地反驳说“如果我打赢了这场官司那么按照法庭的裁决我不需要付你另一半学费而如果我打输了那么按照协定我也不必付给你另一半学费。所以无论输赢我都不必付给你学费。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 8:22:46|我一直认为,那些在辩论赛上口若悬河的人就是智者派的现代传人。事实上,普罗塔戈拉就是第一个创办辩论比赛的人。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 8:23:01|哲学的目的是寻求真理,因此就必须要找到区分真与假、对与错的标准,智者派却引入了相对主义,他们选择双重标准,不追求真理而追求输赢,这当然会遭到哲学家的激烈反对。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 8:24:07|希罗多德在《历史》中讲过一个故事:波斯国王大流士曾经问希腊人,给他多少钱可以让他吃自己父亲的尸体,希腊人的回答是,多少钱也不可以;然后他又把吃自己双亲尸体的印度人叫来,问给多少钱才能答应火葬他们的父亲或者母亲,印度人的回答是,给多少钱我也不会这么做。讲完这个故事,希罗多德引用了诗人品达的一句话作为总结:“习惯是万物的主宰。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 8:25:27|普罗塔戈拉最著名的命题就是:“人是万物的尺度,是存在者存在的尺度,也是不存在者不存在的尺度。”相比于“神是万物的尺度”,“人是万物的尺度”无疑具有进步性,它意味着人文主义的兴起,对传统秩序形成了强有力的冲击,代表了某种进步的力量。德国哲学家卡西尔就曾经高度评价智者派,认为他们“以一种新的精神突破了由传统的概念、一般的偏见和社会习俗所形成的障碍”。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 8:43:06|苏格拉底Socrates生于公元前469年死于公元前399年他是雕刻匠和产婆的孩子土生土长的雅典本地人。在他壮年的时候经历了雅典民主制最为辉煌的时刻他的后半生则亲历了长达27年的伯罗奔尼撒战争目睹了雅典民主制的盛极而衰。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 9:42:18|我们的确常常混淆大事与小事——在菜市场买菜的时候斤斤计较,股市里投钱却一掷千金;在单位里为了职位升迁和奖金多少斤斤计较,对影响收入的各种税收政策和法律却事不关己高高挂起;每天忙着美容健身和养生,对自己灵魂的健康却漠不关心
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 9:45:55|20世纪最伟大的哲学家维特根斯坦也说过类似的话他说“我们觉得即使一切可能的科学问题都已得到解答也还完全没有触及人生问题。”也许正是出于同样的考虑苹果公司的创始人乔布斯才会说我愿意把我所有的科技去换取和苏格拉底相处的一个下午。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 9:55:26|“真正重要的不是活着,而是活得好。活得好意味着活得高尚、正直”
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 9:56:55|让我们重温苏格拉底的那句名言:“未经考察的人生是不值得过的人生。”但是,我想接着苏格拉底的话往下说:“过度考察的人生是没法过的人生。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 10:19:40|有理性的人必然拥有关于自我的知识他也因此是有德性的人有德性的人一定能够得到幸福。这毫无疑问是一种理性主义的道德哲学。有意思的是20世纪的哲人维特根斯坦似乎也认同这一点。在一封私人信件中他写道“我勤勉地工作希望自己能更好better和更明智smarter。当然这两者本就是一回事。”
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 10:25:05|如果你读过乔治·奥威尔的政治寓言小说《1984》就会发现二者惊人的相似之处在奥威尔笔下虚拟的大洋国里有四个政府部门“真理部”负责撒谎“和平部”负责战争“仁爱部”负责刑讯“富足部”制造短缺。大洋国和战争期间的古希腊的共同特征是所有的词义都出现了黑白颠倒的现象。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 10:43:07|如果碰巧你读过休谟,也许还会背诵这句话给自己听:“如果我独自一人把严厉的约束加于自己,而其他人却在那里为所欲为,那么我就会由于正直而成为呆子了。”所以说,公平游戏原则的要点在于“限制的相互性”,也就是说,在一个社会合作体系中,只有当其他人服从规则的时候,我才会服从规则。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 10:46:20|英国哲学家休谟曾经举过一个例子进行反驳假设你在睡梦之中被人绑架到了一条船上醒来时眼前除了汪洋大海就只剩下那个面目可憎的海盗头子。现在你有两个选择1.离开这条船2.留在这艘船里。前者意味着跳进海里淹死,所以你只能选择留在船上,那么这是否意味着你其实已经在自由地表达你对船主的权威的认可?
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:35:38|现在我该走了,我去赴死;你们去继续生活,谁也不知道我们之中谁更幸福,只有神知道
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:35:49|苏格拉底的一生说过无数的话,其中最打动我的一句话来自《申辩篇》,这是他在雅典公民大会上说的最后一句话:
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:37:50|对于古希腊哲人来说,哲学除了能够带来为知识而知识的快乐,还能给哲学家带来永生,因为他们深信,肉体是灵魂的坟墓,只有当灵魂彻底摆脱了肉身的羁绊,灵魂才有可能不朽。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:39:17|为了帮助读者迅速地把握雅典民主制的基本特点,我在这里重点介绍四个关键词:陶片放逐法,抽签制,直接民主,以及民意煽动者。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:41:26|我们今天最熟悉的民主形式是代议制民主,也称间接民主,相比之下,雅典实行的却是直接民主。二者最大的区别在于,在直接民主这里,人民既是统治者又是被统治者,没有任何中介和代表;在间接民主这里,统治者由被统治者选举产生,用美国建国之父麦迪逊的话说就是:“公民从自己中间选出少数人作为自己的政治代表。”直接民主的好处是最充分地体现出“主权在民”的原则,让民意以最直接、最畅通无阻的方式加以表达,但是坏处也同样明显,因为民意具有很强的任意性,所以直接民主很容易堕落成为“暴民统治”,这一点在雅典民主制的晚期展露无遗。因为给后人留下太坏的印象,所以美国的建国之父们都对“民主”二字敬而远之,避之唯恐不及,因为在他们看来,“民主从来都是一场动荡和纷争,与人身安全和财产权利无法兼容,这种政体往往因为暴力导致的终结而非常短命”。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:42:32|。事情往往就是这样,雅典议事会就像今天的网络世界,谁能用最漂亮的语言和机锋抓住人们的眼球,谁就能获得控制民意的力量
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:43:53|在民主政治中,最可能赢得民意的不是德才兼备之士而是巧舌如簧的民意煽动者,这些人最擅长拨弄听众的情绪,翻手为云覆手为雨。因为要在既定的时间里挫败论敌、说服听众,所以在演讲和辩论的过程中就必须采用“半真半假的陈述、虚伪的谎言或者恶意的人身攻击”。说到人身攻击,当年鲁迅先生曾经举过一个例子,非常深刻也非常形象,他说:“譬如勇士,也战斗,也休息,也饮食,自然也性交,如果只取他末一点,画起像来,挂在妓院里,尊为性交大师,那当然也不能说是毫无根据的,然而,岂不冤哉!”
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:45:26|如果让我们重提政治哲学中的那个核心问题——“应该由谁说了算”民主派的回答是由平民demos说了算而苏格拉底的回答则是由专家或者最智慧的人说了算。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:46:19|苏格拉底说:“当前风气是父亲尽量使自己像孩子,甚至怕自己的儿子,而儿子也跟父亲平起平坐,既不敬也不怕自己的双亲,似乎这样一来他才算是一个自由人。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:47:54|我们一定要牢记于心的是,对于古希腊人来说,君主制和贵族制是常态政治,是祖宗旧制,而民主制则是异端歧出,是洪水猛兽,是一个必须要竭力加以辩护的坏东西。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:50:38|公元前430年也就是伯罗奔尼撒战争开始之后的第二年伯里克利在雅典阵亡将士的葬礼上做了一场震古烁今的演讲在这场演讲中他对雅典民主制进行了最富激情的辩护和赞颂。我实在是太喜爱这段话了所以请允许我在这里引述一遍 我们的政体名副其实为民主政体,因为统治权属于大多数人而非属于少数人。在私人争端中,我们的法律保证平等地对待所有人。然而个人的优秀德性,并不因此遭到抹杀。当一个公民的某项才能特别杰出,他将被优先考虑担任公职。这并非特权,而是美德的报酬。贫穷亦不构成阻碍,一个人不论其地位如何卑微,总能奉献其一己之力于国家。雅典的公民并不因私人事业而忽视公共事业,因为连我们的商人对政治都有正确的认识与了解。只有我们雅典人视不关心公共事务的人为无用之人,虽然他们并非有害。在雅典,政策虽然由少数人制定,但是我们全体人民乃是最终的裁定者。我们认为讨论并不会阻碍行动与效率,而是欠缺知识才会,而知识只能藉行动前的讨论才能获得。当别人因无知而勇猛,因反省而踯躅不前,我们却因决策前的深思熟虑而行动果敢。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:51:01|雅典民主制实现了依迪丝·汉密尔顿在《希腊精神》中所说的“绝妙的平衡”,在这个制度里,人人平等的政治权利与卓越个体的脱颖而出,私人事业与公共事务,少数人制定政策与全体公民作为最终的裁定者,深思熟虑与行动果敢,这些看似对立的双方都达到了绝妙的平衡
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:52:42|,苏格拉底之死是雅典民主制最大的历史污点,千百年来,人们一直以此来攻击民主制
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:59:05|现在的问题是:如果好政体的堕落是不可避免的话,那么我们应该选择什么样的政体呢?毫无疑问就是坏中最不坏的那个政体——民主制。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 11:59:16|由于苏格拉底之死由于柏拉图以及众多哲人对雅典民主制的批评民主在相当长的时间里一直背负着骂名被世人视为一个坏东西。民主制之所以这么不招人待见一个很重要的原因是人类一直不死心一直想要追求至善的政体试图在地上建立天国。直到各种实验都以惨败告终之后人们才开始意识到民主虽然是个坏东西但它却是坏中最不坏的那个东西。1947年11月11日英国前首相丘吉尔在众议院中说除了所有那些一再尝试过的其他政府形式之外民主是最坏的政府形式。这句话说得非常拗口其实丘吉尔的意思就是民主制是坏中最不坏的制度。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 24:00:22|按照亚里士多德的政体分类标准,我们可以根据“统治者的人数多少”以及“统治的目的到底是为了公共利益还是私人利益”来区分六类政体,它们分别是: 1.一个人统治并且为了公共的利益,这是君主制; 2.一个人统治并且为了私人的利益,这是僭主制; 3.少数人统治并且为了公共的利益,这是贵族制; 4.少数人统治并且为了私人的利益,这是寡头制; 5.多数人统治并且为了公共的利益,这是共和制; 6.多数人统治并且为了私人的利益,这是民主制。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 24:09:37|英文中有一个词叫作tantalize与坦塔罗斯Tantalus同属一个词根意思是逗人、惹弄人、使人干着急。坦塔罗斯遭受的惩罚正是如此——“被诱惑但却不能被满足”
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 24:10:36|坦塔罗斯的故事传达了这样的寓意:你可以沉浸在幸福之中,你甚至可以永远无忧无虑地幸福下去,但前提是你必须保持住你的那份单纯天真。换句话说,你只要享受快乐和幸福就好了,千万不要去追问快乐的原因,更不要愚蠢地试图去改变它们,或者把它们控制在你手里。反过来说,如果你竟然斗胆去改变和控制它们,那么你就会永远无法拥有你在单纯天真状态下才可以享受的天堂之乐。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 24:12:23|这两则故事一个来自古希腊,一个来自耶路撒冷。西方思想的两个主要源头在这个问题上是一致的,它们都想要传达这样一个观念:“天真的失去,是找不到回归天堂之路的关键所在。”
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 24:16:03|这首先是一个人生哲学的问题就像我在第2讲中问过的那个问题你到底是愿意做一头终日快乐的猪还是一个愁眉苦脸的苏格拉底
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 24:24:04|柏拉图在青年时期对政治充满了热情但是苏格拉底之死让他对现实政治心灰意冷对民主制更是彻底丧失信心。苏格拉底的死可以说是柏拉图失去天真的关键时刻从此他离开雅典自我放逐四处游历直到12年后才重返雅典。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 23:03:29|正义就是强者的利益这个论断是不是非常耳熟没错在第10讲中我们介绍赫西俄德的观点时曾经提到人类之所以会从黄金时代堕落到黑铁时代归根结底就是因为人类相信“力量就是正义”
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 23:05:05|坦白说,过去这些年,类似的自鸣得意的政治现实主义在我国也日渐成为主流,与之相伴的是犬儒主义、失败主义以及精致的利己主义的盛行。这不仅对政治生活构成了巨大的戕害,对伦理生活也构成了巨大的戕害。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/18 23:13:59|。最后,说到权力,你一定听说过这句话:权力是男人的春药。可是按照柏拉图在《理想国》中的观点,真正的统治者其实根本不想获得权力,换言之,“凡渴求权力的人都不应该拥有权力”(伊迪丝·汉密尔顿语)。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 8:32:28|有比较就会有落差,有落差就会有妒忌。”我猜想这是人之常情,从小到大,几乎每个人都曾在某个阶段深深地妒忌过另一个人。因为妒忌,就会暗暗生出好胜之心,有时候这种“胜过”他人的念头太过强烈,以至于辗转反侧彻夜难眠。这里的“胜过”就是我们反复提到的“僭越”、“逾越”的意思。所以我们又重新回到了此前反复提及的那个命题——每个人都是潜在的僭主!之所以每个人都是潜在的僭主,按照苏格拉底的思路,问题就出在没能拥有正确的知识,尤其是没能拥有正确的自我认知,这样一来我们又重新回到了此前反复提及的德尔菲神庙的那句箴言:认识你自己。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 8:39:19|用当代伦理学的话说柏拉图更是个行为者中心agent-centered而非行为中心act-centered的伦理学家。”以行为为中心的伦理学问的是“what should I do”——我应该做什么而以行为者为中心的伦理学问的是“what should I be”——我应该成为什么样的人这显然是两种非常不同的伦理学进路。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 8:42:29|事实上,在《理想国》第一卷中,苏格拉底没有真正说服色拉叙马霍斯,苏格拉底的几个论证都存在着明显的缺陷,在现实世界里,色拉叙马霍斯更是赢家,因为不正义的人往往比正义的人过得更好。色拉叙马霍斯并没有退场,他一直停留在《理想国》中,作为一个影子般的存在
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 8:43:52|柏拉图的学生亚里士多德曾经这样概括希腊人对于幸福生活的三种态度:“快感的人生”追求的是“快乐”;“政治的人生”追求的是“荣誉”;“思辨的人生”追求的是“沉思”
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 9:21:33|小型公司支付甜饼钱的概率要比大型公司高出3%~5%,这并不是因为小型公司的员工更诚实,而是因为在小型公司中,人与人之间的熟悉程度和情感纽带更加紧密,犯罪者或者说犯错感所承受的羞耻感和社会压力更大。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 9:21:39|现代社会之所以出现世风日下、人心不古的道德危机,一个很重要的原因是道德生活的外部环境改变了。在人潮汹涌的大型陌生人社会中,我们除了要设立严刑峻法,更为重要的是要建立各种纵横交错的熟人社区,让原子化的个体重新恢复与周遭环境和人的深厚联系。这或许是在上帝已死的时代挽救道德败坏的一个可行途径,尽管在面对古格斯戒指这样的极端诱惑时,它依旧无法回答“我为什么要成为一个道德的人”。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 9:34:18|。我们可以用八字箴言来概括苏格拉底的“正义观”:各归其位,各司其职。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 9:37:14|对于现代人来说,生活就像一场实验,这场实验要求你不断地调整方向,改换赛道,校准目标,去发现和实现那个“非你不能做,非你做不好”的“自然天赋”。在这个过程中,你必须要不断地去试错,不断地去犯错,在经历了种种努力、奋斗、失败、绝望与痛苦之后,才有可能认识你自己,发现你自己,并最终成为你自己
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 9:42:42|最近网上流传一句索尔仁尼琴的话,我怀疑是伪作,但道理很深刻:“我们知道他们在说谎,他们也知道自己是说谎,他们也知道我们知道他们在说谎,我们也知道他们知道我们知道他们说谎,但是他们依然在说谎。”为什么会出现如此悖谬的情况?其实我在一篇文章中有过解释: “一个不再被人们认可或相信的意识形态仍旧可以继续发挥政治和社会价值分配的功能,哪怕它看上去漏洞百出,苟延残喘,但只要每个人都可以通过它获得自己想要的东西,那么它就仍然功能健全,运转良好,这才是意识形态的本来面目。在某种意义上,这样的意识形态更可怕,因为它不再是少数人处心积虑地说谎,而是所有人心照不宣地共同维护那个公开的谎言。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 9:43:34|从现代人的观点看,权力导致腐败,极端的权力导致极端的腐败,那么,我们为什么还要把权力交给护卫者?结合了凶猛与温顺品格的护卫者是如何可能的?
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 9:49:32|除此之外,这段话还透露出一个非常重要的统治秘诀:“政治稳定的理想方子是上位者团结,下位者分裂。”
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 9:56:25|柏拉图虽然不是自由民主制的同路人,但也不是纳粹主义者,因为他只是把优生学运用到了护卫者的遴选上,而没有拓展到整个城邦。就其取消私有财产的观点而言,他也并非一个共产主义者,因为他只是在护卫者内部取消了私有财产,而不是将其扩大到整个城邦。如果一定要给柏拉图贴个标签,也许可以称为他为权威主义和家长制的信奉者。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 9:56:41|思想的龙种常常结出现实的跳蚤,任何理论一旦被运用到现实世界,都存在变形的可能,对《理想国》中一些危险的思想因素保持足够的警惕是必要的。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 10:03:02|智慧、勇敢、节制和正义被称为古希腊的“四主德”,后来中世纪的神学家托马斯·阿奎那又补充了基督教的三种美德——信、望、爱,也即信仰、希望与博爱。古希腊的四主德处理的是人与人的关系,而基督教三美德处理的是神与人的关系
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 10:05:15|我们在第24讲中曾经介绍过古希腊诗人西蒙尼得的那句名言正义就是给每个人应得的东西。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 10:05:38|阿那克西曼德用“正义女神”Dike称呼这种“力的平衡”。罗素指出这种“正义”的观念——不能逾越永恒固定的界限的观念——是一种最深刻的希腊信仰
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 21:13:29|亚里士多德与柏拉图是西方哲学史上最著名的一对师徒,对于普通人来说,亚里士多德最著名的一段话莫过于“吾爱吾师,吾更爱真理”。从最纯粹的意义上来讲,这句话显示出真正的哲人在追求真理时应该秉承的求真态度;从心理学的角度来说,则反映出任何天才都是脑后有反骨的,他不愿意也不可能永远躲在另一个巨人的背影后面。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/19 21:14:37|公元前343年亚里士多德成为马其顿王国的王子亚历山大的老师。公元前337年马其顿王国征服了包括雅典在内的希腊城邦次年亚历山大大帝正式登基。公元前335年亚里士多德重返雅典创办了著名的吕克昂学园据说他白天给专业学生上课晚上则给普罗大众开讲座因为他习惯在散步的时候和学生讨论问题所以后人称这个学派为“逍遥学派”
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/21 7:56:59|腊历史上曾经被四个帝国长期统治马其顿、罗马、拜占庭以及奥斯曼帝国。对于前三者现代希腊人都已经欣然接纳为“我”的历史但对奥斯曼帝国却始终耿耿于怀。2008
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/21 8:04:57|们不再问:人怎样才能够创造一个好国家?而是问:在一个罪恶的世界里,人怎样才能够有德;或者,在一个受苦受难的世界里,人怎样才能够幸福? 这个问题意识对于中国人来说再熟悉不过,“穷则独善其身,达则兼济天下”,“人生在世不称意,明朝散发弄扁舟”,在入世和出世之间无缝切换,这是深入中国人骨髓的精神风格。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/21 22:23:45|借用宋代禅宗大师青原行思的三重境界说,古希腊的怀疑派是先经历了“见山不是山,见水不是水”,然后才到达“见山还是山,见水还是水”的境界。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/21 22:27:09|罗尔斯的“判断的负担”和怀疑派的“五式”相比,虽然内容上存在差异,但精神气质却非常相似——他们都深刻地意识到人类理性的限度,以及由此导致的人类生存的基本困境。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/21 22:29:05|我曾经有幸去过两次希腊在雅典城内闲逛的时候最引人瞩目的风景之一就是三三两两倒卧在路边的狗不管是清晨还是午后它们总是四肢舒坦、旁若无人地晒着太阳。每当看到这幕场景我就会想起亚历山大大帝与犬儒主义的创始人第欧根尼Diogenes约公元前412年—前323年的那段经典对话。求贤若渴的亚历山大找到第欧根尼“我就是亚历山大请问你有什么要求我一定为你办到。”当时正在木桶里晒太阳的第欧根尼回答说“请你走开一点不要遮住我的阳光。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/21 22:30:04|第欧根尼批评文明的矫饰和价值的伪善,主张放弃包括财产、婚姻、家庭、公民身份、学识和声誉在内的一切身外之物,选择像条狗一样过最原始简朴的生活,目的是为了追求德性,获得完美的幸福。第欧根尼有一个非常独特的观点,认为普罗米修斯盗火,其实不是在造福人类,而是在祸害人类,因为他把奇技淫巧带到人间,让生活变得复杂而累赘,所以普罗米修斯受到惩罚完全是罪有应得。
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/21 22:37:51|格拉底同样高度重视个人的德性,认为有德之人不可能受到伤害,这句话稍加引申,就可以得出犬儒学派的核心观点:只要确保了德性,就确保了好生活,除此之外的世俗生活都是不必要的负累
-HL|打开周濂的100堂西方哲学课一部有营养、有态度读得懂、读得动的西方哲学史|周濂|2020/7/21 22:38:15|跟亚里士多德一样,芝诺也是一个脑后有反骨的天才人物,亚里士多德最终脱离学园创立了逍遥学派,芝诺则离开犬儒学派创立了斯多亚学派。
+WORD|BOOKNAME|AUTHOR|CATEGORY|USAGE|TIMESTAMP
+--|--|--|--|--|--
+愍|熊逸·佛学50讲|熊逸|learning|话说回来,愍度道人和伧道人一起商量渡江之后的职业规划问题,两个人都感觉自己掌握的这些佛学知识到了江南恐怕没有市场,没有市场就吃不上饭,这可怎么办呢?|2020/10/04 20:22:24
+愍|熊逸·佛学50讲|熊逸|learning|五代十国的社会局势混乱,人们前途未卜,人们急需宗教赋予精神世界一点确定性,“心无义”就是愍度道人为了迎合当时人们内心需求而发明出来的理论。 |2020/10/04 20:23:56
+毗|图解五灯会元(图解经典)|释普济|learning|现在从过去七佛开始讲起,在过去庄严劫时,出现的毗婆尸佛、尸弃佛、毗舍浮佛、拘留孙佛、拘那含牟尼佛、迦叶佛、释迦牟尼佛,这是七佛。|2019/09/20 19:46:28
+毗|中央帝国的哲学密码|郭建龙|learning|印度的原始人口是达罗毗荼人,后来来自亚欧大草原的雅利安人入侵了印度,成了统治阶层。|2020/03/10 22:18:35
+毗|印度,漂浮的次大陆|郭建龙|learning|所谓达罗毗荼人,是与北部的雅利安人相对的概念,他们声称自己的祖先存在于雅利安入侵之前的印度,是更加原初的印度人,他们拥有自己的语言系统,并为自己的种族而感到骄傲。|2020/06/21 07:57:45
+毗|思辨的禅趣:《坛经》视野下的世界秩序(中国思想史系列)|熊逸|learning|小乘佛教对这满街菩萨的盛况看不顺眼,据小乘经典《大毗婆沙论》说,从人变成菩萨的过程要经过无穷无尽的时间,要感得所谓“三十二大人相”——这在前文讲过一些,是说佛陀与生俱来的三十二种体貌特征,《三国演义》说刘备“大耳垂肩,双手过膝”就是这三十二相的其中之二。|2020/09/27 20:25:39
+毗|熊逸·佛学50讲|熊逸|learning|印度教貌似并没有排斥佛陀,而是用兼收并蓄的精神把他收归自家旗下,说他是毗湿奴神的十种化身之一,存心把大家引上邪路。 |2020/10/06 08:08:40
+毗|熊逸·佛学50讲|熊逸|learning|《大毗婆沙论》,全名叫作《阿毗达摩大毗婆沙论》。“|2020/10/07 15:55:54
+呢?不|熊逸·佛学50讲|熊逸|learning|五荤是哪五种蔬菜呢?|2020/10/04 17:44:52
+在|联邦党人文集(全新译本)(“活着的宪法”)(西方学术经典译丛)|[美]亚历山大·汉密尔顿|learning|马勃雷神父在评论希腊时说道:在其他地方都如此容易引发动荡的民主政府,在亚该亚共和国的成员中并没有造成混乱,原因就是在那里这一政府得到了联盟权力和联盟法律的调和。 |2019/10/05 22:32:32
+在|苏世民:我的经验与教训2018读桥水达利欧的原则2020看黑石苏世民的经验!一本书读懂从白手起家到华尔街新国王的传奇人生)|苏世民|learning|在危机爆发之后,政府则会启动第二套灾难性行动——整顿银行,要求银行收紧贷款标准。|2020/05/07 21:40:49
+在|熊逸·佛学50讲|熊逸|learning|它是禅定的一种方式,在修这种禅定的时候,需要无差别地把众生当作观想对象,做到“慈、悲、喜、舍”四点。|2020/10/08 09:22:51
+革囊众秽|熊逸·佛学50讲|熊逸|learning|革囊众秽 话说有天神想试探一下佛陀的修为,就送了一名美女给他。|2020/10/06 23:00:53
+囊|熊逸·佛学50讲|熊逸|learning|革囊众秽 话说有天神想试探一下佛陀的修为,就送了一名美女给他。|2020/10/06 23:00:42
+迦|熊逸·佛学50讲|熊逸|learning|佛陀涅槃之后你得先记住两个人一个叫阿难一个叫大迦叶shè。 |2020/10/04 22:26:34
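The hunk above replaces the old highlight export (TYPE|BOOKNAME|AUTHOR|MARKTIME|CONTENT) with a vocabulary export (WORD|BOOKNAME|AUTHOR|CATEGORY|USAGE|TIMESTAMP). A minimal sketch of a reader for the new pipe-delimited layout (the file name and field names here are assumptions for illustration, not taken from this repo):

FIELDS = ["word", "bookname", "author", "category", "usage", "timestamp"]

def read_vocab(path="vocab.md"):
    """Yield one dict per data row of the pipe-delimited vocab export."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            row = line.rstrip("\n").split("|")
            # skip the header row and the markdown separator row
            if len(row) == len(FIELDS) and row[0] not in ("WORD", "--"):
                yield dict(zip(FIELDS, row))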

kmanapp.mac.spec Normal file (48 lines)
View File

@@ -0,0 +1,48 @@
# -*- mode: python ; coding: utf-8 -*-
# datas: each tuple is (path of the resource in the source tree, destination path inside the bundle)
block_cipher = None
a = Analysis(['kman.py', 'kmanapp.py', 'kmanapp_rc.py', 'mainwindow.py', 'mtable.py', 'parseweb.py', ],
             #a = Analysis(['kmanapp.py'],
             pathex=['.',
                     '/Users/mark/penv',
                     '/Users/mark/penv/kman',
                     '/Users/mark/.virtualenvs/kmanenv/lib/python3.7/site-packages/shiboken2',
                     '/Users/mark/.virtualenvs/kmanenv/lib/python3.7/site-packages/PySide2'],
             binaries=[],
             datas=[('kmanapp.ico', '.'), ],
             hiddenimports=[],
             hookspath=[],
             runtime_hooks=[],
             excludes=[],
             win_no_prefer_redirects=False,
             win_private_assemblies=False,
             cipher=block_cipher,
             noarchive=False)
pyz = PYZ(a.pure, a.zipped_data,
          cipher=block_cipher)
exe = EXE(pyz,
          a.scripts,
          [],
          exclude_binaries=True,
          name='kmanapp',
          debug=False,  # was "debug=all": "all" is the Python builtin and is always truthy; the option expects a bool
          bootloader_ignore_signals=False,
          strip=False,
          upx=True,
          runtime_tmpdir=None,
          console=True,
          icon='kmanapp.ico')
coll = COLLECT(exe,
               a.binaries,
               a.zipfiles,
               a.datas,
               strip=False,
               upx=True,
               upx_exclude=[],
               name='kmanapp')
app = BUNDLE(coll,
             name='kmanapp.app',
             icon='kmanapp.ico',
             bundle_identifier=None)
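For reference, a spec file like this is consumed by PyInstaller itself, either from the shell or programmatically. A minimal sketch of the programmatic form (assuming PyInstaller is installed in the active environment):

# Equivalent to running: pyinstaller kmanapp.mac.spec
import PyInstaller.__main__

PyInstaller.__main__.run(["kmanapp.mac.spec"])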

View File

@@ -34,7 +34,7 @@ a = Analysis(['kman.py',
              #pathex=[basedir],
              binaries=[],
              datas=[('backup','backup'),
-                    ('My Clippings.txt','.'),
+                    #('My Clippings.txt','.'),
                     ('vocab.db','.'),
                     ('downimg','downimg'),
                     ('*.md','.')],
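Each datas tuple above is copied into the bundle as (source path, destination inside the bundle); commenting out the 'My Clippings.txt' entry stops shipping that file with the app. At run time a one-file PyInstaller build unpacks to a directory exposed as sys._MEIPASS, so bundled files are usually resolved with a helper like the following sketch (resource_path is a conventional name, not something defined in this repo):

import os
import sys

def resource_path(relative):
    """Locate a bundled data file both in development and in a PyInstaller build."""
    # The PyInstaller bootloader sets sys._MEIPASS to the extraction directory.
    base = getattr(sys, "_MEIPASS", os.path.abspath("."))
    return os.path.join(base, relative)

# e.g. the database shipped via datas=[('vocab.db', '.')]
db_path = resource_path("vocab.db")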

File diff suppressed because it is too large

View File

@@ -1,129 +1,153 @@
 # -*- coding: utf-8 -*-
-# Form implementation generated from reading ui file 'mainwindow.ui',
-# licensing of 'mainwindow.ui' applies.
-#
-# Created: Mon Jul 6 15:41:31 2020
-#      by: pyside2-uic running on PySide2 5.12.6
-#
-# WARNING! All changes made in this file will be lost!
-from PySide2 import QtCore, QtGui, QtWidgets
+################################################################################
+## Form generated from reading UI file 'mainwindow.ui'
+##
+## Created by: Qt User Interface Compiler version 5.15.0
+##
+## WARNING! All changes made in this file will be lost when recompiling UI file!
+################################################################################
+from PySide2.QtCore import (QCoreApplication, QDate, QDateTime, QMetaObject,
+    QObject, QPoint, QRect, QSize, QTime, QUrl, Qt)
+from PySide2.QtGui import (QBrush, QColor, QConicalGradient, QCursor, QFont,
+    QFontDatabase, QIcon, QKeySequence, QLinearGradient, QPalette, QPainter,
+    QPixmap, QRadialGradient)
+from PySide2.QtWidgets import *
+import kmanapp_rc
 class Ui_MainWindow(object):
     def setupUi(self, MainWindow):
-        MainWindow.setObjectName("MainWindow")
+        if not MainWindow.objectName():
+            MainWindow.setObjectName(u"MainWindow")
         MainWindow.resize(774, 410)
-        MainWindow.setIconSize(QtCore.QSize(40, 40))
-        self.centralwidget = QtWidgets.QWidget(MainWindow)
-        self.centralwidget.setObjectName("centralwidget")
-        self.gridLayout = QtWidgets.QGridLayout(self.centralwidget)
-        self.gridLayout.setObjectName("gridLayout")
-        self.horizontalLayout = QtWidgets.QHBoxLayout()
-        self.horizontalLayout.setObjectName("horizontalLayout")
-        self.searchLabel = QtWidgets.QLabel(self.centralwidget)
-        self.searchLabel.setObjectName("searchLabel")
+        MainWindow.setIconSize(QSize(40, 40))
+        self.actionimportlocal = QAction(MainWindow)
+        self.actionimportlocal.setObjectName(u"actionimportlocal")
+        icon = QIcon()
+        icon.addFile(u":/icons/downr.png", QSize(), QIcon.Normal, QIcon.Off)
+        self.actionimportlocal.setIcon(icon)
+        self.actionimportkindle = QAction(MainWindow)
+        self.actionimportkindle.setObjectName(u"actionimportkindle")
+        icon1 = QIcon()
+        icon1.addFile(u":/icons/kindle.png", QSize(), QIcon.Normal, QIcon.Off)
+        self.actionimportkindle.setIcon(icon1)
+        self.actionconfig = QAction(MainWindow)
+        self.actionconfig.setObjectName(u"actionconfig")
+        icon2 = QIcon()
+        icon2.addFile(u":/icons/config.png", QSize(), QIcon.Normal, QIcon.Off)
+        self.actionconfig.setIcon(icon2)
+        self.actionflush = QAction(MainWindow)
+        self.actionflush.setObjectName(u"actionflush")
+        icon3 = QIcon()
+        icon3.addFile(u":/icons/refresh.png", QSize(), QIcon.Normal, QIcon.Off)
+        self.actionflush.setIcon(icon3)
+        self.actionwords = QAction(MainWindow)
+        self.actionwords.setObjectName(u"actionwords")
+        icon4 = QIcon()
+        icon4.addFile(u":/icons/books.png", QSize(), QIcon.Normal, QIcon.Off)
+        self.actionwords.setIcon(icon4)
+        self.actionstatistic = QAction(MainWindow)
+        self.actionstatistic.setObjectName(u"actionstatistic")
+        icon5 = QIcon()
+        icon5.addFile(u":/icons/statistics.png", QSize(), QIcon.Normal, QIcon.Off)
+        self.actionstatistic.setIcon(icon5)
+        self.actionhomepage = QAction(MainWindow)
+        self.actionhomepage.setObjectName(u"actionhomepage")
+        icon6 = QIcon()
+        icon6.addFile(u":/icons/web.png", QSize(), QIcon.Normal, QIcon.Off)
+        self.actionhomepage.setIcon(icon6)
+        self.actionabout = QAction(MainWindow)
+        self.actionabout.setObjectName(u"actionabout")
+        icon7 = QIcon()
+        icon7.addFile(u":/icons/question.png", QSize(), QIcon.Normal, QIcon.Off)
+        self.actionabout.setIcon(icon7)
+        self.actionsearch = QAction(MainWindow)
+        self.actionsearch.setObjectName(u"actionsearch")
+        icon8 = QIcon()
+        icon8.addFile(u":/icons/Pixadex.png", QSize(), QIcon.Normal, QIcon.Off)
+        self.actionsearch.setIcon(icon8)
+        self.actionexport = QAction(MainWindow)
+        self.actionexport.setObjectName(u"actionexport")
+        icon9 = QIcon()
+        icon9.addFile(u":/icons/md2.png", QSize(), QIcon.Normal, QIcon.Off)
+        self.actionexport.setIcon(icon9)
+        self.centralwidget = QWidget(MainWindow)
+        self.centralwidget.setObjectName(u"centralwidget")
+        self.gridLayout = QGridLayout(self.centralwidget)
+        self.gridLayout.setObjectName(u"gridLayout")
+        self.horizontalLayout = QHBoxLayout()
+        self.horizontalLayout.setObjectName(u"horizontalLayout")
+        self.searchLabel = QLabel(self.centralwidget)
+        self.searchLabel.setObjectName(u"searchLabel")
         self.horizontalLayout.addWidget(self.searchLabel)
-        self.searchLineEdit = QtWidgets.QLineEdit(self.centralwidget)
-        self.searchLineEdit.setObjectName("searchLineEdit")
+        self.searchLineEdit = QLineEdit(self.centralwidget)
+        self.searchLineEdit.setObjectName(u"searchLineEdit")
         self.horizontalLayout.addWidget(self.searchLineEdit)
-        self.searchComboBox = QtWidgets.QComboBox(self.centralwidget)
-        self.searchComboBox.setCurrentText("")
-        self.searchComboBox.setObjectName("searchComboBox")
+        self.searchComboBox = QComboBox(self.centralwidget)
+        self.searchComboBox.setObjectName(u"searchComboBox")
         self.horizontalLayout.addWidget(self.searchComboBox)
-        self.searchToolButton = QtWidgets.QToolButton(self.centralwidget)
-        icon = QtGui.QIcon()
-        icon.addPixmap(QtGui.QPixmap(":/icons/search.jpeg"), QtGui.QIcon.Normal, QtGui.QIcon.Off)
-        self.searchToolButton.setIcon(icon)
-        self.searchToolButton.setObjectName("searchToolButton")
+        self.searchToolButton = QToolButton(self.centralwidget)
+        self.searchToolButton.setObjectName(u"searchToolButton")
+        icon10 = QIcon()
+        icon10.addFile(u":/icons/search.jpeg", QSize(), QIcon.Normal, QIcon.Off)
+        self.searchToolButton.setIcon(icon10)
         self.horizontalLayout.addWidget(self.searchToolButton)
         self.gridLayout.addLayout(self.horizontalLayout, 0, 0, 1, 1)
-        self.splitter_2 = QtWidgets.QSplitter(self.centralwidget)
-        self.splitter_2.setOrientation(QtCore.Qt.Horizontal)
-        self.splitter_2.setObjectName("splitter_2")
-        self.treeView = QtWidgets.QTreeView(self.splitter_2)
-        sizePolicy = QtWidgets.QSizePolicy(QtWidgets.QSizePolicy.Preferred, QtWidgets.QSizePolicy.Expanding)
+        self.splitter_2 = QSplitter(self.centralwidget)
+        self.splitter_2.setObjectName(u"splitter_2")
+        self.splitter_2.setOrientation(Qt.Horizontal)
+        self.treeView = QTreeView(self.splitter_2)
+        self.treeView.setObjectName(u"treeView")
+        sizePolicy = QSizePolicy(QSizePolicy.Preferred, QSizePolicy.Expanding)
         sizePolicy.setHorizontalStretch(0)
         sizePolicy.setVerticalStretch(0)
         sizePolicy.setHeightForWidth(self.treeView.sizePolicy().hasHeightForWidth())
         self.treeView.setSizePolicy(sizePolicy)
-        self.treeView.setMaximumSize(QtCore.QSize(401, 16777215))
-        self.treeView.setObjectName("treeView")
+        self.treeView.setMaximumSize(QSize(401, 16777215))
+        self.splitter_2.addWidget(self.treeView)
         self.treeView.header().setVisible(False)
-        self.splitter = QtWidgets.QSplitter(self.splitter_2)
-        self.splitter.setOrientation(QtCore.Qt.Vertical)
-        self.splitter.setObjectName("splitter")
-        self.tableView = QtWidgets.QTableView(self.splitter)
-        sizePolicy = QtWidgets.QSizePolicy(QtWidgets.QSizePolicy.Maximum, QtWidgets.QSizePolicy.Expanding)
-        sizePolicy.setHorizontalStretch(0)
-        sizePolicy.setVerticalStretch(0)
-        sizePolicy.setHeightForWidth(self.tableView.sizePolicy().hasHeightForWidth())
-        self.tableView.setSizePolicy(sizePolicy)
-        self.tableView.setObjectName("tableView")
-        self.textEdit = QtWidgets.QTextBrowser(self.splitter)
-        self.textEdit.setObjectName("textEdit")
+        self.splitter = QSplitter(self.splitter_2)
+        self.splitter.setObjectName(u"splitter")
+        self.splitter.setOrientation(Qt.Vertical)
+        self.tableView = QTableView(self.splitter)
+        self.tableView.setObjectName(u"tableView")
+        sizePolicy1 = QSizePolicy(QSizePolicy.Maximum, QSizePolicy.Expanding)
+        sizePolicy1.setHorizontalStretch(0)
+        sizePolicy1.setVerticalStretch(0)
+        sizePolicy1.setHeightForWidth(self.tableView.sizePolicy().hasHeightForWidth())
+        self.tableView.setSizePolicy(sizePolicy1)
+        self.splitter.addWidget(self.tableView)
+        self.textEdit = QTextBrowser(self.splitter)
+        self.textEdit.setObjectName(u"textEdit")
+        self.splitter.addWidget(self.textEdit)
+        self.splitter_2.addWidget(self.splitter)
         self.gridLayout.addWidget(self.splitter_2, 1, 0, 1, 1)
         MainWindow.setCentralWidget(self.centralwidget)
-        self.statusbar = QtWidgets.QStatusBar(MainWindow)
-        self.statusbar.setObjectName("statusbar")
+        self.statusbar = QStatusBar(MainWindow)
+        self.statusbar.setObjectName(u"statusbar")
         MainWindow.setStatusBar(self.statusbar)
-        self.menuBar = QtWidgets.QMenuBar()
-        self.menuBar.setGeometry(QtCore.QRect(0, 0, 774, 22))
-        self.menuBar.setObjectName("menuBar")
+        self.menuBar = QMenuBar(MainWindow)
+        self.menuBar.setObjectName(u"menuBar")
+        self.menuBar.setGeometry(QRect(0, 0, 774, 22))
         MainWindow.setMenuBar(self.menuBar)
-        self.toolBar = QtWidgets.QToolBar(MainWindow)
-        self.toolBar.setObjectName("toolBar")
-        MainWindow.addToolBar(QtCore.Qt.TopToolBarArea, self.toolBar)
-        self.actionimportlocal = QtWidgets.QAction(MainWindow)
-        icon1 = QtGui.QIcon()
-        icon1.addPixmap(QtGui.QPixmap(":/icons/downr.png"), QtGui.QIcon.Normal, QtGui.QIcon.Off)
-        self.actionimportlocal.setIcon(icon1)
-        self.actionimportlocal.setObjectName("actionimportlocal")
-        self.actionimportkindle = QtWidgets.QAction(MainWindow)
-        icon2 = QtGui.QIcon()
-        icon2.addPixmap(QtGui.QPixmap(":/icons/kindle.png"), QtGui.QIcon.Normal, QtGui.QIcon.Off)
-        self.actionimportkindle.setIcon(icon2)
-        self.actionimportkindle.setObjectName("actionimportkindle")
-        self.actionconfig = QtWidgets.QAction(MainWindow)
-        icon3 = QtGui.QIcon()
-        icon3.addPixmap(QtGui.QPixmap(":/icons/config.png"), QtGui.QIcon.Normal, QtGui.QIcon.Off)
-        self.actionconfig.setIcon(icon3)
-        self.actionconfig.setObjectName("actionconfig")
-        self.actionflush = QtWidgets.QAction(MainWindow)
-        icon4 = QtGui.QIcon()
-        icon4.addPixmap(QtGui.QPixmap(":/icons/refresh.png"), QtGui.QIcon.Normal, QtGui.QIcon.Off)
-        self.actionflush.setIcon(icon4)
-        self.actionflush.setObjectName("actionflush")
-        self.actionwords = QtWidgets.QAction(MainWindow)
-        icon5 = QtGui.QIcon()
-        icon5.addPixmap(QtGui.QPixmap(":/icons/books.png"), QtGui.QIcon.Normal, QtGui.QIcon.Off)
-        self.actionwords.setIcon(icon5)
-        self.actionwords.setObjectName("actionwords")
-        self.actionstatistic = QtWidgets.QAction(MainWindow)
-        icon6 = QtGui.QIcon()
-        icon6.addPixmap(QtGui.QPixmap(":/icons/statistics.png"), QtGui.QIcon.Normal, QtGui.QIcon.Off)
-        self.actionstatistic.setIcon(icon6)
-        self.actionstatistic.setObjectName("actionstatistic")
-        self.actionhomepage = QtWidgets.QAction(MainWindow)
-        icon7 = QtGui.QIcon()
-        icon7.addPixmap(QtGui.QPixmap(":/icons/web.png"), QtGui.QIcon.Normal, QtGui.QIcon.Off)
-        self.actionhomepage.setIcon(icon7)
-        self.actionhomepage.setObjectName("actionhomepage")
-        self.actionabout = QtWidgets.QAction(MainWindow)
-        icon8 = QtGui.QIcon()
-        icon8.addPixmap(QtGui.QPixmap(":/icons/question.png"), QtGui.QIcon.Normal, QtGui.QIcon.Off)
-        self.actionabout.setIcon(icon8)
-        self.actionabout.setObjectName("actionabout")
-        self.actionsearch = QtWidgets.QAction(MainWindow)
-        icon9 = QtGui.QIcon()
-        icon9.addPixmap(QtGui.QPixmap(":/icons/Pixadex.png"), QtGui.QIcon.Normal, QtGui.QIcon.Off)
-        self.actionsearch.setIcon(icon9)
-        self.actionsearch.setObjectName("actionsearch")
-        self.actionexport = QtWidgets.QAction(MainWindow)
-        icon10 = QtGui.QIcon()
-        icon10.addPixmap(QtGui.QPixmap(":/icons/md2.png"), QtGui.QIcon.Normal, QtGui.QIcon.Off)
-        self.actionexport.setIcon(icon10)
-        self.actionexport.setObjectName("actionexport")
+        self.toolBar = QToolBar(MainWindow)
+        self.toolBar.setObjectName(u"toolBar")
+        MainWindow.addToolBar(Qt.TopToolBarArea, self.toolBar)
         self.toolBar.addAction(self.actionimportkindle)
         self.toolBar.addAction(self.actionimportlocal)
         self.toolBar.addSeparator()
@@ -139,33 +163,56 @@ class Ui_MainWindow(object):
         self.toolBar.addAction(self.actionflush)
         self.retranslateUi(MainWindow)
-        QtCore.QMetaObject.connectSlotsByName(MainWindow)
+        QMetaObject.connectSlotsByName(MainWindow)
+    # setupUi
     def retranslateUi(self, MainWindow):
-        MainWindow.setWindowTitle(QtWidgets.QApplication.translate("MainWindow", "Kindle Management", None, -1))
-        self.searchLabel.setText(QtWidgets.QApplication.translate("MainWindow", "Search", None, -1))
-        self.searchLineEdit.setPlaceholderText(QtWidgets.QApplication.translate("MainWindow", "可按书名、作者、内容搜索笔记", None, -1))
-        self.searchToolButton.setText(QtWidgets.QApplication.translate("MainWindow", "...", None, -1))
-        self.toolBar.setWindowTitle(QtWidgets.QApplication.translate("MainWindow", "toolBar", None, -1))
-        self.actionimportlocal.setText(QtWidgets.QApplication.translate("MainWindow", "importlocal", None, -1))
-        self.actionimportlocal.setToolTip(QtWidgets.QApplication.translate("MainWindow", "import clipping file from local clipping file", None, -1))
-        self.actionimportkindle.setText(QtWidgets.QApplication.translate("MainWindow", "importkindle", None, -1))
-        self.actionimportkindle.setToolTip(QtWidgets.QApplication.translate("MainWindow", "import clipping file from kindle", None, -1))
-        self.actionconfig.setText(QtWidgets.QApplication.translate("MainWindow", "config", None, -1))
-        self.actionconfig.setToolTip(QtWidgets.QApplication.translate("MainWindow", "configuration", None, -1))
-        self.actionflush.setText(QtWidgets.QApplication.translate("MainWindow", "refresh", None, -1))
-        self.actionflush.setToolTip(QtWidgets.QApplication.translate("MainWindow", "refresh import file/quick import from kindle", None, -1))
-        self.actionwords.setText(QtWidgets.QApplication.translate("MainWindow", "words", None, -1))
-        self.actionwords.setToolTip(QtWidgets.QApplication.translate("MainWindow", "words", None, -1))
-        self.actionstatistic.setText(QtWidgets.QApplication.translate("MainWindow", "statistic", None, -1))
-        self.actionstatistic.setToolTip(QtWidgets.QApplication.translate("MainWindow", "statistics reading habbit", None, -1))
-        self.actionhomepage.setText(QtWidgets.QApplication.translate("MainWindow", "homepage", None, -1))
-        self.actionhomepage.setToolTip(QtWidgets.QApplication.translate("MainWindow", "redirect to my homepage", None, -1))
-        self.actionabout.setText(QtWidgets.QApplication.translate("MainWindow", "about", None, -1))
-        self.actionabout.setToolTip(QtWidgets.QApplication.translate("MainWindow", "open about dialog", None, -1))
-        self.actionsearch.setText(QtWidgets.QApplication.translate("MainWindow", "search", None, -1))
-        self.actionsearch.setToolTip(QtWidgets.QApplication.translate("MainWindow", "search note", None, -1))
-        self.actionexport.setText(QtWidgets.QApplication.translate("MainWindow", "export", None, -1))
-        self.actionexport.setToolTip(QtWidgets.QApplication.translate("MainWindow", "export to file", None, -1))
-import kmanapp_rc
+        MainWindow.setWindowTitle(QCoreApplication.translate("MainWindow", u"Kindle Management", None))
+        self.actionimportlocal.setText(QCoreApplication.translate("MainWindow", u"importlocal", None))
+#if QT_CONFIG(tooltip)
+        self.actionimportlocal.setToolTip(QCoreApplication.translate("MainWindow", u"import clipping file from local clipping file", None))
+#endif // QT_CONFIG(tooltip)
+        self.actionimportkindle.setText(QCoreApplication.translate("MainWindow", u"importkindle", None))
+#if QT_CONFIG(tooltip)
+        self.actionimportkindle.setToolTip(QCoreApplication.translate("MainWindow", u"import clipping file from kindle", None))
+#endif // QT_CONFIG(tooltip)
+        self.actionconfig.setText(QCoreApplication.translate("MainWindow", u"config", None))
+#if QT_CONFIG(tooltip)
+        self.actionconfig.setToolTip(QCoreApplication.translate("MainWindow", u"configuration", None))
+#endif // QT_CONFIG(tooltip)
+        self.actionflush.setText(QCoreApplication.translate("MainWindow", u"refresh", None))
+#if QT_CONFIG(tooltip)
+        self.actionflush.setToolTip(QCoreApplication.translate("MainWindow", u"refresh import file/quick import from kindle", None))
+#endif // QT_CONFIG(tooltip)
+        self.actionwords.setText(QCoreApplication.translate("MainWindow", u"words", None))
+#if QT_CONFIG(tooltip)
+        self.actionwords.setToolTip(QCoreApplication.translate("MainWindow", u"words", None))
+#endif // QT_CONFIG(tooltip)
+        self.actionstatistic.setText(QCoreApplication.translate("MainWindow", u"statistic", None))
+#if QT_CONFIG(tooltip)
+        self.actionstatistic.setToolTip(QCoreApplication.translate("MainWindow", u"statistics reading habbit", None))
+#endif // QT_CONFIG(tooltip)
+        self.actionhomepage.setText(QCoreApplication.translate("MainWindow", u"homepage", None))
+#if QT_CONFIG(tooltip)
+        self.actionhomepage.setToolTip(QCoreApplication.translate("MainWindow", u"redirect to my homepage", None))
+#endif // QT_CONFIG(tooltip)
+        self.actionabout.setText(QCoreApplication.translate("MainWindow", u"about", None))
+#if QT_CONFIG(tooltip)
+        self.actionabout.setToolTip(QCoreApplication.translate("MainWindow", u"open about dialog", None))
+#endif // QT_CONFIG(tooltip)
+        self.actionsearch.setText(QCoreApplication.translate("MainWindow", u"search", None))
+#if QT_CONFIG(tooltip)
+        self.actionsearch.setToolTip(QCoreApplication.translate("MainWindow", u"search note", None))
+#endif // QT_CONFIG(tooltip)
+        self.actionexport.setText(QCoreApplication.translate("MainWindow", u"export", None))
+#if QT_CONFIG(tooltip)
+        self.actionexport.setToolTip(QCoreApplication.translate("MainWindow", u"export to file", None))
+#endif // QT_CONFIG(tooltip)
+        self.searchLabel.setText(QCoreApplication.translate("MainWindow", u"Search", None))
+        self.searchLineEdit.setPlaceholderText(QCoreApplication.translate("MainWindow", u"\u53ef\u6309\u4e66\u540d\u3001\u4f5c\u8005\u3001\u5185\u5bb9\u641c\u7d22\u7b14\u8bb0", None))
+        self.searchComboBox.setCurrentText("")
+        self.searchToolButton.setText(QCoreApplication.translate("MainWindow", u"...", None))
+        self.toolBar.setWindowTitle(QCoreApplication.translate("MainWindow", u"toolBar", None))
+    # retranslateUi

View File

@@ -159,4 +159,3 @@ mkvirtualenv kmanenv
 1. [my spec file1](kmanapp.spec)
 1. [my spec file2](kmanapp.win.spec)

BIN  mobimaster/.DS_Store (vendored, new file, binary file not shown)

mobimaster/LICENSE Executable file (674 lines)
View File

@@ -0,0 +1,674 @@
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for
software and other kinds of works.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.
Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.
For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.
Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:
<program> Copyright (C) <year> <name of author>
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
<http://www.gnu.org/licenses/>.
The GNU General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
<http://www.gnu.org/philosophy/why-not-lgpl.html>.

142
mobimaster/README.md Executable file
View File

@@ -0,0 +1,142 @@
# mobi - library for unpacking unencrypted mobi files
## extract structure
```
def extract(infile):
|
def unpackBook(infile, tempdir, epubver="A")
```
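In other words, `extract` is a thin wrapper; calling the unpacker directly is roughly equivalent (a sketch of the call above, with a hypothetical input file and an existing output directory):

```python
from mobi.kindleunpack import unpackBook

# unpack mybook.mobi into ./outdir, letting the unpacker pick the epub version
unpackBook("mybook.mobi", "outdir", epubver="A")
```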
## unpackBook structure
```
def unpackBook( infile, outdir, apnxfile=None, epubver="2", use_hd=False, dodump=False, dowriteraw=False, dosplitcombos=False,)
|
# build the output directory structure (unpack_structure)
files = fileNames(infile, outdir)
# process the PalmDoc database header and verify it is a mobi
# what exactly does sect contain?
sect = Sectionizer(infile)
|
# CG: mobi header; if this is a K8 book, hasK8 = True
mh = MobiHeader(sect, 0)
|
# if this is a combination mobi7-mobi8 file, split them up
# disabled by default: SPLIT_COMBO_MOBIS = False
# ???
mobisplit = mobi_split(infile)
# k8: build the K8 directory structure
files.makeK8Struct()
|
# mhlst - list of mobi headers
# key step!! processes all of the ebook's headers
def process_all_mobi_headers( files, apnxfile, sect, mhlst, K8Boundary, False, epubver, use_hd)
|
# processes each header according to the section passed in and writes it to header_K8.dat/header_OR.dat/header.dat
|
# first handle all of the different resource sections: images, resources, fonts, etc.
# build up a list of image names to use to postprocess the ebook
```
```
ncx_data K7: position/150 = real page number
ncx_data K8: contains the table-of-contents structure
```
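A minimal sketch of how the K7 note above can be applied, assuming `indx_data` is the list returned by `ncxExtract.parseNCX()` (the 150-bytes-per-page ratio is an empirical observation, not a guaranteed constant):

```python
# Hypothetical: print an approximate print-page number for each K7 TOC entry.
for entry in indx_data:
    page = entry["pos"] // 150  # position/150 = real page number (observed)
    print(page, entry["text"])
```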
https://www.mobileread.com/forums/showthread.php?t=61986&page=78
It is a compiled ebook format that uses a palm database structure - a set of starting offsets to binary data referred to either as sections or records depending on who you ask. What KindleUnpack does is examine these binary sections, identify any sections that are headers, and then use them to identify the starting section numbers where images are stored, text is stored, index information, etc., and then extract them to files. The data from these files is used to create HTML 3.2 code that can be fed back into kindlegen for the older mobi 7 pieces and to create an epub-like structure for the kf8 pieces. If you actually want to see the rawml you can dump that as well.
The header sections also have EXTH records that contain the MetaData information. If you want to understand the exact layout of the mobi file, simply run DumpMobiHeader_v018.py or later and look at the description of what is stored in each section of the palm database file.
For joint mobis, the images are not duplicated; they are stored after the mobi7 header and before the kf8 header. Later mobis can also have a completely separate container of HDImages and placeholders.
When KindleUnpack unpacks image sections (and fonts and RESC sections) it stores them all in a mobi 7 folder and copies the correct pieces to the mobi 8 folder as needed. When KindleUnpack unpacks from the HD Container, it will store these images in their own HDImage folder as they cannot be shared with a mobi 7. There is a switch to have the HDImages overwrite their low-resolution cousins.
So please run DumpMobiHeader and examine the section map to see what is actually being stored inside the palm database structure.
If you have further questions, post the output of DumpMobiHeader from running on your mobi so that I understand exactly what it is you are asking. It will even work on DRM'd ebooks since the headers themselves and most images are not typically encrypted.
Hope this helps,
KevinH
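The palm database layout described in the quote can be sketched in a few lines. This is a hedged illustration of the section/record table (78-byte header, big-endian record count at offset 76, then 8-byte record-info entries whose first field is each record's absolute offset), not the library's actual `Sectionizer` implementation:

```python
import struct

def list_sections(path):
    """Return (offset, length) for every section/record in a palm database file."""
    with open(path, "rb") as f:
        data = f.read()
    (num_sections,) = struct.unpack_from(">H", data, 76)
    offsets = [
        struct.unpack_from(">L", data, 78 + i * 8)[0]
        for i in range(num_sections)
    ]
    offsets.append(len(data))  # sentinel so the last section's length is computable
    return [(offsets[i], offsets[i + 1] - offsets[i]) for i in range(num_sections)]
```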
[![Version](https://img.shields.io/pypi/v/mobi.svg)](https://pypi.python.org/pypi/mobi/)
[![Downloads](https://pepy.tech/badge/mobi)](https://pepy.tech/project/mobi)
> A fork of [KindleUnpack](https://github.com/kevinhendricks/KindleUnpack) which removes the GUI part and makes it available as a python library via [PyPi](https://pypi.org/project/mobi/) for easy unpacking of mobi files.
## Usage
### As library
```python
import mobi
tempdir, filepath = mobi.extract("mybook.mobi")
```
'tempdir' is the path where the mobi is unpacked.
'filepath' is the path to either an epub, html or pdf file depending on the mobi type.
| NOTE: You are responsible for deleting the generated tempdir! |
| --- |
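A hedged sketch of the cleanup that note asks for, wrapping `mobi.extract` so the tempdir is always removed (the helper name `read_unpacked` is ours):

```python
import shutil
import mobi

def read_unpacked(path):
    """Extract a mobi, read the unpacked file, and always clean up the tempdir."""
    tempdir, filepath = mobi.extract(path)
    try:
        with open(filepath, "rb") as f:
            return f.read()
    finally:
        shutil.rmtree(tempdir)  # we created it, so we delete it
```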
### From the command line
The installer also creates a console script entry point that wraps the original KindleUnpack:
```console
$ mobiunpack
KindleUnpack v0.82
Based on initial mobipocket version Copyright © 2009 Charles M. Hannum <root@ihack.net>
Extensive Extensions and Improvements Copyright © 2009-2014
by: P. Durrant, K. Hendricks, S. Siebert, fandrieu, DiapDealer, nickredding, tkeo.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, version 3.
Description:
Unpacks an unencrypted Kindle/MobiPocket ebook to html and images
or an unencrypted Kindle/Print Replica ebook to PDF and images
into the specified output folder.
Usage:
mobiunpack -r -s -p apnxfile -d -h --epub_version= infile [outdir]
Options:
-h print this help message
-i use HD Images, if present, to overwrite reduced resolution images
-s split combination mobis into mobi7 and mobi8 ebooks
-p APNXFILE path to an .apnx file associated with the azw3 input (optional)
--epub_version= specify epub version to unpack to: 2, 3, A (for automatic) or
F (force to fit to epub2 definitions), default is 2
-d dump headers and other info to output and extra files
-r write raw data to the output folder
```
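For example, unpacking an azw3 with HD images enabled and an automatically chosen epub version might look like this (file and folder names are hypothetical):

```console
$ mobiunpack -i --epub_version=A mybook.azw3 unpacked/
```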
### [0.3.1] - 2020-06-27
- Fix pypi link
- Update dependencies
### [0.3.0] - 2020-03-02
- Add support for mobi7 only files
- Add experimental support for mobi print replica files
- Add support for file-like objects
### [0.2.0] - 2020-03-02
- Minimal working 'extract' function and 'mobiunpack' console wrapper
- Replace most print calls with logging
### [0.1.0] - 2020-03-02
- Empty package registered on pypi
## License
GPL-3.0-only
All credits for the hard work go to https://github.com/kevinhendricks/KindleUnpack

7
mobimaster/mobi/__init__.py Executable file
View File

@@ -0,0 +1,7 @@
import os
os.environ["LOGURU_AUTOINIT"] = "False"
from mobi.extract import extract
from mobi.extract import extracttest
__version__ = "0.3.1"

View File

@@ -0,0 +1,295 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
# Copyright (c) 2014 Kevin B. Hendricks, John Schember, and Doug Massay
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without modification,
# are permitted provided that the following conditions are met:
#
# 1. Redistributions of source code must retain the above copyright notice, this list of
# conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright notice, this list
# of conditions and the following disclaimer in the documentation and/or other materials
# provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
# SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
# TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
# OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY
# WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
from __future__ import unicode_literals, division, absolute_import, print_function
import sys
import codecs
PY2 = sys.version_info[0] == 2
PY3 = sys.version_info[0] == 3
iswindows = sys.platform.startswith("win")
try:
from urllib.parse import unquote
except ImportError:
from urllib import unquote
if PY2:
from HTMLParser import HTMLParser
_h = HTMLParser()
elif sys.version_info[1] < 4:
import html.parser
_h = html.parser.HTMLParser()
else:
import html as _h
if PY3:
text_type = str
binary_type = bytes
# if we will be printing arbitrary binary data to stdout on python 3
# sys.stdin = sys.stdin.detach()
# sys.stdout = sys.stdout.detach()
# sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())
else:
range = xrange
text_type = unicode
binary_type = str
# if we will be printing unicode under python 2 we need to protect
# against sys.stdout.encoding being None, which stupidly forces ascii encoding of unicode
# sys.stdout = codecs.getwriter("utf-8")(sys.stdout)
# alternatively set environment variable as follows **before** launching python: export PYTHONIOENCODING=UTF-8
# NOTE: Python 3 is completely broken when accessing single bytes in bytes strings
# (and they amazingly claim it is by design and not a bug!)
# To illustrate: this works for unicode in Python 3 and for all Python 2.X for both bytestrings and unicode
# >>> o = '123456789'
# >>> o[-3]
# '7'
# >>> type(o[-3])
# <class 'str'>
# >>> type(o)
# <class 'str'>
# Unfortunately, this is what Python 3 does for no sane reason and only for bytestrings
# >>> o = b'123456789'
# >>> o[-3]
# 55
# >>> type(o[-3])
# <class 'int'>
# >>> type(o)
# <class 'bytes'>
# This mind-boggling behaviour also happens when indexing a bytestring and/or
# iterating over a bytestring. In other words it will return an int but not
# the byte itself!!!!!!!
# The only way to access a single byte as a byte in bytestring and get the byte in both
# Python 2 and Python 3 is to use a slice
# This problem is so common there are horrible hacks floating around the net to **try**
# to work around it, so that code that works on both Python 2 and Python 3 is possible.
# So in order to write code that works on both Python 2 and Python 3
# if you index or access a single byte and want its ord() then use the bord() function.
# If instead you want it as a single character byte use the bchar() function
# both of which are defined below.
if PY3:
# Also note: if you decode a bytestring using 'latin-1' (or any other full-range 0-255 encoding)
# in place of ascii you will get a byte-value to half-word or integer value
# one-to-one mapping (in the 0-255 range)
def bchr(s):
return bytes([s])
def bstr(s):
if isinstance(s, str):
return bytes(s, "latin-1")
else:
return bytes(s)
def bord(s):
return s
def bchar(s):
return bytes([s])
else:
def bchr(s):
return chr(s)
def bstr(s):
return str(s)
def bord(s):
return ord(s)
def bchar(s):
return s
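# Illustrative usage of the helpers above (results are identical on Python 2 and 3):
# data = b"abc"
# bord(data[0]) # -> 97
# bchar(data[0]) # -> b"a"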
if PY3:
# list-producing versions of the major Python iterating functions
def lrange(*args, **kwargs):
return list(range(*args, **kwargs))
def lzip(*args, **kwargs):
return list(zip(*args, **kwargs))
def lmap(*args, **kwargs):
return list(map(*args, **kwargs))
def lfilter(*args, **kwargs):
return list(filter(*args, **kwargs))
else:
import __builtin__
# Python 2-builtin ranges produce lists
lrange = __builtin__.range
lzip = __builtin__.zip
lmap = __builtin__.map
lfilter = __builtin__.filter
# In Python 3 you can no longer use .encode('hex') on a bytestring
# instead use the following on both platforms
import binascii
def hexlify(bdata):
return (binascii.hexlify(bdata)).decode("ascii")
# If you: import struct
# Note: struct pack, unpack, unpack_from all *require* bytestring format
# data all the way up to at least Python 2.7.5, Python 3 is okay with either
# If you: import re
# note: Python 3 "re" requires the pattern to be the exact same type as the data to be
# searched ... but u"" is not allowed for the pattern itself only b""
# Python 2.X allows the pattern to be any type and converts it to match the data
# and returns the same type as the data
# convert string to be utf-8 encoded
def utf8_str(p, enc="utf-8"):
if p is None:
return None
if isinstance(p, text_type):
return p.encode("utf-8")
if enc != "utf-8":
return p.decode(enc).encode("utf-8")
return p
# convert string to be unicode encoded
def unicode_str(p, enc="utf-8"):
if p is None:
return None
if isinstance(p, text_type):
return p
return p.decode(enc)
ASCII_CHARS = set(chr(x) for x in range(128))
URL_SAFE = set(
"ABCDEFGHIJKLMNOPQRSTUVWXYZ" "abcdefghijklmnopqrstuvwxyz" "0123456789" "#" "_.-/~"
)
IRI_UNSAFE = ASCII_CHARS - URL_SAFE
# returns a quoted IRI (not a URI)
def quoteurl(href):
if isinstance(href, binary_type):
href = href.decode("utf-8")
result = []
for char in href:
if char in IRI_UNSAFE:
char = "%%%02x" % ord(char)
result.append(char)
return "".join(result)
# unquotes url/iri
def unquoteurl(href):
if isinstance(href, binary_type):
href = href.decode("utf-8")
href = unquote(href)
return href
# unescape html
def unescapeit(sval):
return _h.unescape(sval)
# Python 2.X commandline parsing under Windows has been horribly broken for years!
# Use the following code to emulate full unicode commandline parsing on Python 2
# ie. To get sys.argv arguments and properly encode them as unicode
def unicode_argv():
global iswindows
global PY3
if PY3:
return sys.argv
if iswindows:
# Versions 2.x of Python don't support Unicode in sys.argv on
# Windows, with the underlying Windows API instead replacing multi-byte
# characters with '?'. So use shell32.CommandLineToArgvW to get sys.argv
# as a list of Unicode strings
from ctypes import POINTER, byref, cdll, c_int, windll
from ctypes.wintypes import LPCWSTR, LPWSTR
GetCommandLineW = cdll.kernel32.GetCommandLineW
GetCommandLineW.argtypes = []
GetCommandLineW.restype = LPCWSTR
CommandLineToArgvW = windll.shell32.CommandLineToArgvW
CommandLineToArgvW.argtypes = [LPCWSTR, POINTER(c_int)]
CommandLineToArgvW.restype = POINTER(LPWSTR)
cmd = GetCommandLineW()
argc = c_int(0)
argv = CommandLineToArgvW(cmd, byref(argc))
if argc.value > 0:
# Remove Python executable and commands if present
start = argc.value - len(sys.argv)
return [argv[i] for i in range(start, argc.value)]
# this should never happen
return None
else:
argv = []
argvencoding = sys.stdin.encoding
if argvencoding is None:
argvencoding = sys.getfilesystemencoding()
if argvencoding is None:
argvencoding = "utf-8"
for arg in sys.argv:
if isinstance(arg, text_type):
argv.append(arg)
else:
argv.append(arg.decode(argvencoding))
return argv
# Python 2.X is broken in that it does not recognize CP65001 as UTF-8
def add_cp65001_codec():
if PY2:
try:
codecs.lookup("cp65001")
except LookupError:
codecs.register(
lambda name: name == "cp65001" and codecs.lookup("utf-8") or None
)
return

87
mobimaster/mobi/extract.py Executable file
View File

@@ -0,0 +1,87 @@
# -*- coding: utf-8 -*-
import shutil
from loguru import logger
import tempfile
from os.path import basename, splitext, exists, join
from mobi.kindleunpack import unpackBook
def extract(infile):
"""Extract mobi file and return path to epub file"""
tempdir = tempfile.mkdtemp(prefix="mobiex")
if hasattr(infile, "fileno"):
tempname = next(tempfile._get_candidate_names()) + ".mobi"
pos = infile.tell()
infile.seek(0)
with open(join(tempdir, tempname), "wb") as outfile:
shutil.copyfileobj(infile, outfile)
infile.seek(pos)
infile = join(tempdir, tempname)
logger.debug("file: %s" % infile)
fname_in = basename(infile)
base, ext = splitext(fname_in)
fname_out_epub = base + ".epub"
fname_out_html = "book.html"
fname_out_pdf = base + ".001.pdf"
unpackBook(infile, tempdir, epubver="A")
epub_filepath = join(tempdir, "mobi8", fname_out_epub)
html_filepath = join(tempdir, "mobi7", fname_out_html)
pdf_filepath = join(tempdir, fname_out_pdf)
if exists(epub_filepath):
return tempdir, epub_filepath
elif exists(html_filepath):
return tempdir, html_filepath
elif exists(pdf_filepath):
return tempdir, pdf_filepath
raise ValueError("Coud not extract from %s" % infile)
def extracttest(infile):
"""Extract mobi file and return path to epub file"""
tempdir = './t/'
if hasattr(infile, "fileno"):
tempname = next(tempfile._get_candidate_names()) + ".mobi"
pos = infile.tell()
infile.seek(0)
with open(join(tempdir, tempname), "wb") as outfile:
shutil.copyfileobj(infile, outfile)
infile.seek(pos)
infile = join(tempdir, tempname)
# tempname 8x2vf7yv.mobi pos 0 infile ./t/8x2vf7yv.mobi
print('tempname {} pos {} infile {}'.format(tempname,pos,infile))
logger.debug("file: %s" % infile)
fname_in = basename(infile)
base, ext = splitext(fname_in)
fname_out_epub = base + ".epub"
fname_out_html = "book.html"
fname_out_pdf = base + ".001.pdf"
# infile ./t/8x2vf7yv.mobi
unpackBook(infile, tempdir, epubver="A")
epub_filepath = join(tempdir, "mobi8", fname_out_epub)
html_filepath = join(tempdir, "mobi7", fname_out_html)
pdf_filepath = join(tempdir, fname_out_pdf)
# CGDBG
#epub_filepath ./t/mobi8/p302tbwb.epub html_filepath ./t/mobi7/book.html pdf_filepath ./t/p302tbwb.001.pdf
print('epub_filepath {} html_filepath {} pdf_filepath {}'.format( epub_filepath, html_filepath, pdf_filepath))
if exists(epub_filepath):
return tempdir, epub_filepath
elif exists(html_filepath):
return tempdir, html_filepath
elif exists(pdf_filepath):
return tempdir, pdf_filepath
raise ValueError("Coud not extract from %s" % infile)
if __name__ == "__main__":
#print(extracttest("../tests/demo.mobi"))
pass

1196
mobimaster/mobi/kindleunpack.py Executable file

File diff suppressed because it is too large Load Diff

286
mobimaster/mobi/makencx.py Executable file
View File

@@ -0,0 +1,286 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
import os
from .unipath import pathof
from loguru import logger
import re
# note: re requires the pattern to be the exact same type as the data to be searched in python3
# but u"" is not allowed for the pattern itself only b""
'''
NCX (Navigation Control for XML applications) is a generalized navigation definition DTD for application
to Digital Talking Books, eBooks, and general web content models.
This DTD is an XML application that layers navigation functionality on top of SMIL 2.0 content.
The NCX defines a navigation path/model that may be applied upon existing publications,
without modification of the existing publication source, so long as the navigation targets within
the source publication can be directly referenced via a URI.
http://www.daisy.org/z3986/2005/ncx-2005-1.dtd
'''
from .mobi_utils import toBase32
from .mobi_index import MobiIndex
DEBUG_NCX = True
class ncxExtract:
def __init__(self, mh):
self.mh = mh
self.sect = self.mh.sect
self.isNCX = False
self.mi = MobiIndex(self.sect)
self.ncxidx = self.mh.ncxidx
self.indx_data = None
def parseNCX(self):
indx_data = []
tag_fieldname_map = {
1: ["pos", 0],
2: ["len", 0],
3: ["noffs", 0],
4: ["hlvl", 0],
5: ["koffs", 0],
6: ["pos_fid", 0],
21: ["parent", 0],
22: ["child1", 0],
23: ["childn", 0],
}
if self.ncxidx != 0xFFFFFFFF:
outtbl, ctoc_text = self.mi.getIndexData(self.ncxidx, "NCX")
if DEBUG_NCX:
logger.debug("ctoc_text {}".format(ctoc_text))
logger.debug("outtbl {}".format(outtbl))
num = 0
for [text, tagMap] in outtbl:
tmp = {
"name": text.decode("utf-8"),
"pos": -1,
"len": 0,
"noffs": -1,
"text": "Unknown Text",
"hlvl": -1,
"kind": "Unknown Kind",
"pos_fid": None,
"parent": -1,
"child1": -1,
"childn": -1,
"num": num,
}
for tag in tag_fieldname_map:
[fieldname, i] = tag_fieldname_map[tag]
if tag in tagMap:
fieldvalue = tagMap[tag][i]
if tag == 6:
pos_fid = toBase32(fieldvalue, 4).decode("utf-8")
fieldvalue2 = tagMap[tag][i + 1]
pos_off = toBase32(fieldvalue2, 10).decode("utf-8")
fieldvalue = "kindle:pos:fid:%s:off:%s" % (pos_fid, pos_off)
tmp[fieldname] = fieldvalue
if tag == 3:
toctext = ctoc_text.get(fieldvalue, "Unknown Text")
toctext = toctext.decode(self.mh.codec)
tmp["text"] = toctext
if tag == 5:
kindtext = ctoc_text.get(fieldvalue, "Unknown Kind")
kindtext = kindtext.decode(self.mh.codec)
tmp["kind"] = kindtext
indx_data.append(tmp)
# CGDBG
'''
record number: 3
name: 03
position 461377 length: 465358 => position/150 = real page number
text: 第二章 青铜时代——单机游戏
kind: Unknown Kind
heading level: 0 => level of section
parent: -1 => record number of previous level of section
first child: 15 last child: 26 => range of record number of next level section
pos_fid is kindle:pos:fid:0023:off:0000000000
'''
if DEBUG_NCX:
print("record number: ", num)
print(
"name: ", tmp["name"],
)
print("position", tmp["pos"], " length: ", tmp["len"])
print("text: ", tmp["text"])
print("kind: ", tmp["kind"])
print("heading level: ", tmp["hlvl"])
print("parent:", tmp["parent"])
print(
"first child: ", tmp["child1"], " last child: ", tmp["childn"]
)
print("pos_fid is ", tmp["pos_fid"])
print("\n\n")
num += 1
self.indx_data = indx_data
# {'name': '00', 'pos': 167, 'len': 24798, 'noffs': 0, 'text': '版权信息', 'hlvl': 0, 'kind': 'Unknown Kind', 'pos_fid': None, 'parent': -1, 'child1': -1, 'childn': -1, 'num': 0}
# {'name': '0B', 'pos': 67932, 'len': 3274, 'noffs': 236, 'text': '8.希罗多德', 'hlvl': 0, 'kind': 'Unknown Kind', 'pos_fid': None, 'parent': -1, 'child1': -1, 'childn': -1, 'num': 11}
print(indx_data)
return indx_data
def writeNCX(self, metadata):
self.isNCX = True
logger.debug("Write ncx")
# build the xml; "book.html" is an assumption matching the mobi7 output name used by extract()
xml = self.buildNCX(
"book.html",
metadata["Title"][0],
metadata["UniqueID"][0],
metadata.get("Language", ["en"])[0],
)
# write the ncx file (relies on self.files being attached by the caller)
# ncxname = os.path.join(self.files.mobi7dir, self.files.getInputFileBasename() + '.ncx')
ncxname = os.path.join(self.files.mobi7dir, "toc.ncx")
with open(pathof(ncxname), "wb") as f:
f.write(xml.encode("utf-8"))
def buildNCX(self, htmlfile, title, ident, lang):
# same templates as buildK8NCX below, duplicated so this method is self-contained
ncx_header = """<?xml version='1.0' encoding='utf-8'?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1" xml:lang="%s">
<head>
<meta content="%s" name="dtb:uid"/>
<meta content="%d" name="dtb:depth"/>
<meta content="mobiunpack.py" name="dtb:generator"/>
<meta content="0" name="dtb:totalPageCount"/>
<meta content="0" name="dtb:maxPageNumber"/>
</head>
<docTitle>
<text>%s</text>
</docTitle>
<navMap>
"""
ncx_footer = """ </navMap>
</ncx>
"""
ncx_entry = """<navPoint id="%s" playOrder="%d">
<navLabel>
<text>%s</text>
</navLabel>
<content src="%s"/>"""
indx_data = self.indx_data
# recursive part
def recursINDX(max_lvl=0, num=0, lvl=0, start=-1, end=-1):
if start > len(indx_data) or end > len(indx_data):
print("Warning: missing INDX child entries", start, end, len(indx_data))
return "", max_lvl, num
if DEBUG_NCX:
logger.debug("recursINDX lvl %d from %d to %d" % (lvl, start, end))
xml = ""
if start <= 0:
start = 0
if end <= 0:
end = len(indx_data)
if lvl > max_lvl:
max_lvl = lvl
indent = " " * (2 + lvl)
for i in range(start, end):
e = indx_data[i]
if not e["hlvl"] == lvl:
continue
# open entry
num += 1
link = "%s#filepos%d" % (htmlfile, e["pos"])
tagid = "np_%d" % num
entry = ncx_entry % (tagid, num, e["text"], link)
entry = re.sub(re.compile("^", re.M), indent, entry, 0)
xml += entry + "\n"
# recurs
if e["child1"] >= 0:
xmlrec, max_lvl, num = recursINDX(
max_lvl, num, lvl + 1, e["child1"], e["childn"] + 1
)
xml += xmlrec
# close entry
xml += indent + "</navPoint>\n"
return xml, max_lvl, num
body, max_lvl, num = recursINDX()
header = ncx_header % (lang, ident, max_lvl + 1, title)
ncx = header + body + ncx_footer
if not len(indx_data) == num:
print("Warning: different number of entries in NCX", len(indx_data), num)
return ncx
def buildK8NCX(self, indx_data, title, ident, lang):
ncx_header = """<?xml version='1.0' encoding='utf-8'?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1" xml:lang="%s">
<head>
<meta content="%s" name="dtb:uid"/>
<meta content="%d" name="dtb:depth"/>
<meta content="mobiunpack.py" name="dtb:generator"/>
<meta content="0" name="dtb:totalPageCount"/>
<meta content="0" name="dtb:maxPageNumber"/>
</head>
<docTitle>
<text>%s</text>
</docTitle>
<navMap>
"""
ncx_footer = """ </navMap>
</ncx>
"""
ncx_entry = """<navPoint id="%s" playOrder="%d">
<navLabel>
<text>%s</text>
</navLabel>
<content src="%s"/>"""
# recursive part
def recursINDX(max_lvl=0, num=0, lvl=0, start=-1, end=-1):
if start > len(indx_data) or end > len(indx_data):
print("Warning: missing INDX child entries", start, end, len(indx_data))
return ""
if DEBUG_NCX:
logger.debug("recursINDX lvl %d from %d to %d" % (lvl, start, end))
xml = ""
if start <= 0:
start = 0
if end <= 0:
end = len(indx_data)
if lvl > max_lvl:
max_lvl = lvl
indent = " " * (2 + lvl)
for i in range(start, end):
e = indx_data[i]
htmlfile = e["filename"]
desttag = e["idtag"]
if not e["hlvl"] == lvl:
continue
# open entry
num += 1
if desttag == "":
link = "Text/%s" % htmlfile
else:
link = "Text/%s#%s" % (htmlfile, desttag)
tagid = "np_%d" % num
entry = ncx_entry % (tagid, num, e["text"], link)
entry = re.sub(re.compile("^", re.M), indent, entry, 0)
xml += entry + "\n"
# recurs
if e["child1"] >= 0:
xmlrec, max_lvl, num = recursINDX(
max_lvl, num, lvl + 1, e["child1"], e["childn"] + 1
)
xml += xmlrec
# close entry
xml += indent + "</navPoint>\n"
return xml, max_lvl, num
body, max_lvl, num = recursINDX()
header = ncx_header % (lang, ident, max_lvl + 1, title)
ncx = header + body + ncx_footer
if not len(indx_data) == num:
print("Warning: different number of entries in NCX", len(indx_data), num)
return ncx
def writeK8NCX(self, ncx_data, metadata):
# build the xml
self.isNCX = True
logger.debug("Write K8 ncx")
xml = self.buildK8NCX(
ncx_data,
metadata["Title"][0],
metadata["UniqueID"][0],
metadata.get("Language")[0],
)
ncxname = os.path.join('./', 'k8toc.ncx.json')
with open(pathof(ncxname), "wb") as f:
f.write(xml.encode("utf-8"))

245
mobimaster/mobi/mobi_cover.py Executable file
View File

@@ -0,0 +1,245 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
from .compatibility_utils import unicode_str
from loguru import logger
from .unipath import pathof
import os
import imghdr
import struct
# note: struct pack, unpack, unpack_from all require bytestring format
# data all the way up to at least python 2.7.5, python 3 okay with bytestring
USE_SVG_WRAPPER = True
""" Set to True to use svg wrapper for default. """
FORCE_DEFAULT_TITLE = False
""" Set to True to force to use the default title. """
COVER_PAGE_FILENAME = "cover_page.xhtml"
""" The name for the cover page. """
DEFAULT_TITLE = "Cover"
""" The default title for the cover page. """
MAX_WIDTH = 4096
""" The max width for the svg cover page. """
MAX_HEIGHT = 4096
""" The max height for the svg cover page. """
def get_image_type(imgname, imgdata=None):
imgtype = unicode_str(imghdr.what(pathof(imgname), imgdata))
# imghdr only checks for JFIF or Exif JPEG files. Apparently, there are some
# with only the magic JPEG bytes out there...
# ImageMagick handles those, so, do it too.
if imgtype is None:
if imgdata is None:
with open(pathof(imgname), "rb") as f:
imgdata = f.read()
if imgdata[0:2] == b"\xFF\xD8":
# Get last non-null bytes
last = len(imgdata)
while imgdata[last - 1 : last] == b"\x00":
last -= 1
# Be extra safe, check the trailing bytes, too.
if imgdata[last - 2 : last] == b"\xFF\xD9":
imgtype = "jpeg"
return imgtype
def get_image_size(imgname, imgdata=None):
"""Determine the image type of imgname (or imgdata) and return its size.
Originally,
Determine the image type of fhandle and return its size.
from draco"""
if imgdata is None:
fhandle = open(pathof(imgname), "rb")
head = fhandle.read(24)
else:
head = imgdata[0:24]
if len(head) != 24:
return
imgtype = get_image_type(imgname, imgdata)
if imgtype == "png":
check = struct.unpack(b">i", head[4:8])[0]
if check != 0x0D0A1A0A:
return
width, height = struct.unpack(b">ii", head[16:24])
elif imgtype == "gif":
width, height = struct.unpack(b"<HH", head[6:10])
elif imgtype == "jpeg" and imgdata is None:
try:
fhandle.seek(0) # Read 0xff next
size = 2
ftype = 0
while not 0xC0 <= ftype <= 0xCF:
fhandle.seek(size, 1)
byte = fhandle.read(1)
while ord(byte) == 0xFF:
byte = fhandle.read(1)
ftype = ord(byte)
size = struct.unpack(b">H", fhandle.read(2))[0] - 2
# We are at a SOFn block
fhandle.seek(1, 1) # Skip `precision' byte.
height, width = struct.unpack(b">HH", fhandle.read(4))
except Exception: # IGNORE:W0703
return
elif imgtype == "jpeg" and imgdata is not None:
try:
pos = 0
size = 2
ftype = 0
while not 0xC0 <= ftype <= 0xCF:
pos += size
byte = imgdata[pos : pos + 1]
pos += 1
while ord(byte) == 0xFF:
byte = imgdata[pos : pos + 1]
pos += 1
ftype = ord(byte)
size = struct.unpack(b">H", imgdata[pos : pos + 2])[0] - 2
pos += 2
# We are at a SOFn block
pos += 1 # Skip `precision' byte.
height, width = struct.unpack(b">HH", imgdata[pos : pos + 4])
pos += 4
except Exception: # IGNORE:W0703
return
else:
return
return width, height
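# Typical usage: get_image_size("cover.jpg") -> (width, height) in pixels,
# or None when the image type or dimensions cannot be determined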
# XXX experimental
class CoverProcessor(object):
"""Create a cover page.
"""
def __init__(self, files, metadata, rscnames, imgname=None, imgdata=None):
self.files = files
self.metadata = metadata
self.rscnames = rscnames
self.cover_page = COVER_PAGE_FILENAME
self.use_svg = USE_SVG_WRAPPER # Use svg wrapper.
self.lang = metadata.get("Language", ["en"])[0]
# This should ensure that if the methods to find the cover image's
# dimensions should fail for any reason, the SVG routine will not be used.
[self.width, self.height] = (-1, -1)
if FORCE_DEFAULT_TITLE:
self.title = DEFAULT_TITLE
else:
self.title = metadata.get("Title", [DEFAULT_TITLE])[0]
self.cover_image = None
if imgname is not None:
self.cover_image = imgname
elif "CoverOffset" in metadata:
imageNumber = int(metadata["CoverOffset"][0])
cover_image = self.rscnames[imageNumber]
if cover_image is not None:
self.cover_image = cover_image
else:
logger.debug("Warning: Cannot identify the cover image.")
if self.use_svg:
try:
if imgdata is None:
fname = os.path.join(files.imgdir, self.cover_image)
[self.width, self.height] = get_image_size(fname)
else:
[self.width, self.height] = get_image_size(None, imgdata)
except Exception:
self.use_svg = False
width = self.width
height = self.height
if width < 0 or height < 0 or width > MAX_WIDTH or height > MAX_HEIGHT:
self.use_svg = False
return
def getImageName(self):
return self.cover_image
def getXHTMLName(self):
return self.cover_page
def buildXHTML(self):
logger.debug("Building a cover page.")
files = self.files
cover_image = self.cover_image
title = self.title
lang = self.lang
image_dir = os.path.normpath(os.path.relpath(files.k8images, files.k8text))
image_path = os.path.join(image_dir, cover_image).replace("\\", "/")
if not self.use_svg:
data = ""
data += '<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html>'
data += '<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops"'
data += ' xml:lang="{:s}">\n'.format(lang)
data += "<head>\n<title>{:s}</title>\n".format(title)
data += '<style type="text/css">\n'
data += "body {\n margin: 0;\n padding: 0;\n text-align: center;\n}\n"
data += "div {\n height: 100%;\n width: 100%;\n text-align: center;\n page-break-inside: avoid;\n}\n"
data += "img {\n display: inline-block;\n height: 100%;\n margin: 0 auto;\n}\n"
data += "</style>\n</head>\n"
data += "<body><div>\n"
data += ' <img src="{:s}" alt=""/>\n'.format(image_path)
data += "</div></body>\n</html>"
else:
width = self.width
height = self.height
viewBox = "0 0 {0:d} {1:d}".format(width, height)
data = ""
data += '<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html>'
data += '<html xmlns="http://www.w3.org/1999/xhtml"'
data += ' xml:lang="{:s}">\n'.format(lang)
data += "<head>\n <title>{:s}</title>\n".format(title)
data += '<style type="text/css">\n'
data += "svg {padding: 0pt; margin:0pt}\n"
data += "body { text-align: center; padding:0pt; margin: 0pt; }\n"
data += "</style>\n</head>\n"
data += "<body>\n <div>\n"
data += ' <svg xmlns="http://www.w3.org/2000/svg" height="100%" preserveAspectRatio="xMidYMid meet"'
data += ' version="1.1" viewBox="{0:s}" width="100%" xmlns:xlink="http://www.w3.org/1999/xlink">\n'.format(
viewBox
)
data += ' <image height="{0}" width="{1}" xlink:href="{2}"/>\n'.format(
height, width, image_path
)
data += " </svg>\n"
data += " </div>\n</body>\n</html>"
return data
def writeXHTML(self):
files = self.files
cover_page = self.cover_page
data = self.buildXHTML()
outfile = os.path.join(files.k8text, cover_page)
if os.path.exists(pathof(outfile)):
logger.debug("Warning: {:s} already exists.".format(cover_page))
os.remove(pathof(outfile))
with open(pathof(outfile), "wb") as f:
f.write(data.encode("utf-8"))
return
def guide_toxml(self):
files = self.files
text_dir = os.path.relpath(files.k8text, files.k8oebps)
data = '<reference type="cover" title="Cover" href="{:s}/{:s}" />\n'.format(
text_dir, self.cover_page
)
return data

473
mobimaster/mobi/mobi_dict.py Executable file
View File

@@ -0,0 +1,473 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
from .compatibility_utils import PY2, PY3, utf8_str, bstr, bchr
from loguru import logger
if PY2:
range = xrange
array_format = b"B"
if PY3:
unichr = chr
array_format = "B"
import array
import struct
# note: struct pack, unpack, unpack_from all require bytestring format
# data all the way up to at least python 2.7.5, python 3 okay with bytestring
from .mobi_index import getVariableWidthValue, readTagSection, getTagMap
from .mobi_utils import toHex
DEBUG_DICT = True
class InflectionData(object):
def __init__(self, infldatas):
self.infldatas = infldatas
self.starts = []
self.counts = []
for idata in self.infldatas:
(start,) = struct.unpack_from(b">L", idata, 0x14)
(count,) = struct.unpack_from(b">L", idata, 0x18)
self.starts.append(start)
self.counts.append(count)
def lookup(self, lookupvalue):
i = 0
rvalue = lookupvalue
while rvalue >= self.counts[i]:
rvalue = rvalue - self.counts[i]
i += 1
if i == len(self.counts):
logger.debug("Error: Problem with multiple inflections data sections")
return lookupvalue, self.starts[0], self.counts[0], self.infldatas[0]
return rvalue, self.starts[i], self.counts[i], self.infldatas[i]
def offsets(self, value):
rvalue, start, count, data = self.lookup(value)
(offset,) = struct.unpack_from(b">H", data, start + 4 + (2 * rvalue))
if rvalue + 1 < count:
(nextOffset,) = struct.unpack_from(
b">H", data, start + 4 + (2 * (rvalue + 1))
)
else:
nextOffset = None
return offset, nextOffset, data
class dictSupport(object):
def __init__(self, mh, sect):
self.mh = mh
self.header = mh.header
self.sect = sect
self.metaOrthIndex = mh.metaOrthIndex
self.metaInflIndex = mh.metaInflIndex
def parseHeader(self, data):
"read INDX header"
if not data[:4] == b"INDX":
logger.debug("Warning: index section is not INDX")
return False
words = (
"len",
"nul1",
"type",
"gen",
"start",
"count",
"code",
"lng",
"total",
"ordt",
"ligt",
"nligt",
"nctoc",
)
num = len(words)
values = struct.unpack(bstr(">%dL" % num), data[4 : 4 * (num + 1)])
header = {}
for n in range(num):
header[words[n]] = values[n]
ordt1 = None
ordt2 = None
otype, oentries, op1, op2, otagx = struct.unpack_from(b">LLLLL", data, 0xA4)
header["otype"] = otype
header["oentries"] = oentries
if DEBUG_DICT:
logger.debug(
"otype %d, oentries %d, op1 %d, op2 %d, otagx %d"
% (otype, oentries, op1, op2, otagx)
)
if header["code"] == 0xFDEA or oentries > 0:
# some dictionaries seem to be codepage 65002 (0xFDEA) which seems
# to be some sort of strange EBCDIC utf-8 or 16 encoded strings
# So we need to look for them and store them away to process leading text
# ORDT1 has 1 byte long entries, ORDT2 has 2 byte long entries
# we only ever seem to use the second but ...
#
# if otype = 0, ORDT table uses 16 bit values as offsets into the table
# if otype = 1, ORDT table uses 8 bit values as offsets into the table
assert data[op1 : op1 + 4] == b"ORDT"
assert data[op2 : op2 + 4] == b"ORDT"
ordt1 = struct.unpack_from(bstr(">%dB" % oentries), data, op1 + 4)
ordt2 = struct.unpack_from(bstr(">%dH" % oentries), data, op2 + 4)
if DEBUG_DICT:
logger.debug("parsed INDX header:")
for key in header:
logger.debug("%s %x" % (key, header[key]))
logger.debug("\n")
return header, ordt1, ordt2
def getPositionMap(self):
sect = self.sect
positionMap = {}
metaOrthIndex = self.metaOrthIndex
metaInflIndex = self.metaInflIndex
decodeInflection = True
if metaOrthIndex != 0xFFFFFFFF:
logger.debug(
"Info: Document contains orthographic index, handle as dictionary"
)
if metaInflIndex == 0xFFFFFFFF:
decodeInflection = False
else:
metaInflIndexData = sect.loadSection(metaInflIndex)
logger.debug("\nParsing metaInflIndexData")
midxhdr, mhordt1, mhordt2 = self.parseHeader(metaInflIndexData)
metaIndexCount = midxhdr["count"]
idatas = []
for j in range(metaIndexCount):
idatas.append(sect.loadSection(metaInflIndex + 1 + j))
dinfl = InflectionData(idatas)
inflNameData = sect.loadSection(metaInflIndex + 1 + metaIndexCount)
tagSectionStart = midxhdr["len"]
inflectionControlByteCount, inflectionTagTable = readTagSection(
tagSectionStart, metaInflIndexData
)
if DEBUG_DICT:
logger.debug("inflectionTagTable: %s" % inflectionTagTable)
if self.hasTag(inflectionTagTable, 0x07):
logger.debug(
"Error: Dictionary uses obsolete inflection rule scheme which is not yet supported"
)
decodeInflection = False
data = sect.loadSection(metaOrthIndex)
logger.debug("\nParsing metaOrthIndex")
idxhdr, hordt1, hordt2 = self.parseHeader(data)
tagSectionStart = idxhdr["len"]
controlByteCount, tagTable = readTagSection(tagSectionStart, data)
orthIndexCount = idxhdr["count"]
logger.debug("orthIndexCount is", orthIndexCount)
if DEBUG_DICT:
logger.debug("orthTagTable: %s" % tagTable)
if hordt2 is not None:
logger.debug("orth entry uses ordt2 lookup table of type %d" % idxhdr["otype"])
hasEntryLength = self.hasTag(tagTable, 0x02)
if not hasEntryLength:
logger.debug("Info: Index doesn't contain entry length tags")
logger.debug("Read dictionary index data")
for i in range(metaOrthIndex + 1, metaOrthIndex + 1 + orthIndexCount):
data = sect.loadSection(i)
hdrinfo, ordt1, ordt2 = self.parseHeader(data)
idxtPos = hdrinfo["start"]
entryCount = hdrinfo["count"]
idxPositions = []
for j in range(entryCount):
(pos,) = struct.unpack_from(b">H", data, idxtPos + 4 + (2 * j))
idxPositions.append(pos)
# The last entry ends before the IDXT tag (but there might be zero fill bytes we need to ignore!)
idxPositions.append(idxtPos)
for j in range(entryCount):
startPos = idxPositions[j]
endPos = idxPositions[j + 1]
textLength = ord(data[startPos : startPos + 1])
text = data[startPos + 1 : startPos + 1 + textLength]
if hordt2 is not None:
utext = ""
if idxhdr["otype"] == 0:
pattern = b">H"
inc = 2
else:
pattern = b">B"
inc = 1
pos = 0
while pos < textLength:
(off,) = struct.unpack_from(pattern, text, pos)
if off < len(hordt2):
utext += unichr(hordt2[off])
else:
utext += unichr(off)
pos += inc
text = utext.encode("utf-8")
tagMap = getTagMap(
controlByteCount,
tagTable,
data,
startPos + 1 + textLength,
endPos,
)
if 0x01 in tagMap:
if decodeInflection and 0x2A in tagMap:
inflectionGroups = self.getInflectionGroups(
text,
inflectionControlByteCount,
inflectionTagTable,
dinfl,
inflNameData,
tagMap[0x2A],
)
else:
inflectionGroups = b""
assert len(tagMap[0x01]) == 1
entryStartPosition = tagMap[0x01][0]
if hasEntryLength:
# The idx:entry attribute "scriptable" must be present to create entry length tags.
ml = (
b'<idx:entry scriptable="yes"><idx:orth value="'
+ text
+ b'">'
+ inflectionGroups
+ b"</idx:orth>"
)
if entryStartPosition in positionMap:
positionMap[entryStartPosition] = (
positionMap[entryStartPosition] + ml
)
else:
positionMap[entryStartPosition] = ml
assert len(tagMap[0x02]) == 1
entryEndPosition = entryStartPosition + tagMap[0x02][0]
if entryEndPosition in positionMap:
positionMap[entryEndPosition] = (
b"</idx:entry>" + positionMap[entryEndPosition]
)
else:
positionMap[entryEndPosition] = b"</idx:entry>"
else:
indexTags = (
b'<idx:entry>\n<idx:orth value="'
+ text
+ b'">\n'
+ inflectionGroups
+ b"</idx:entry>\n"
)
if entryStartPosition in positionMap:
positionMap[entryStartPosition] = (
positionMap[entryStartPosition] + indexTags
)
else:
positionMap[entryStartPosition] = indexTags
return positionMap
def hasTag(self, tagTable, tag):
"""
Test if tag table contains given tag.
@param tagTable: The tag table.
@param tag: The tag to search.
@return: True if tag table contains given tag; False otherwise.
"""
for currentTag, _, _, _ in tagTable:
if currentTag == tag:
return True
return False
def getInflectionGroups(
self, mainEntry, controlByteCount, tagTable, dinfl, inflectionNames, groupList
):
"""
Create string which contains the inflection groups with inflection rules as mobipocket tags.
@param mainEntry: The word to inflect.
@param controlByteCount: The number of control bytes.
@param tagTable: The tag table.
@param dinfl: The InflectionData object used to select the proper inflection data section.
@param inflectionNames: The inflection rule name data.
@param groupList: The list of inflection groups to process.
@return: String with inflection groups and rules or empty string if required tags are not available.
"""
result = b""
for value in groupList:
offset, nextOffset, data = dinfl.offsets(value)
# First byte seems to be always 0x00 and must be skipped.
assert ord(data[offset : offset + 1]) == 0x00
tagMap = getTagMap(controlByteCount, tagTable, data, offset + 1, nextOffset)
# Make sure that the required tags are available.
if 0x05 not in tagMap:
logger.debug("Error: Required tag 0x05 not found in tagMap")
return ""
if 0x1A not in tagMap:
logger.debug("Error: Required tag 0x1a not found in tagMap")
return b""
result += b"<idx:infl>"
for i in range(len(tagMap[0x05])):
# Get name of inflection rule.
value = tagMap[0x05][i]
consumed, textLength = getVariableWidthValue(inflectionNames, value)
inflectionName = inflectionNames[
value + consumed : value + consumed + textLength
]
# Get and apply inflection rule across possibly multiple inflection data sections
value = tagMap[0x1A][i]
rvalue, start, count, data = dinfl.lookup(value)
(offset,) = struct.unpack_from(b">H", data, start + 4 + (2 * rvalue))
textLength = ord(data[offset : offset + 1])
inflection = self.applyInflectionRule(
mainEntry, data, offset + 1, offset + 1 + textLength
)
if inflection is not None:
result += (
b' <idx:iform name="'
+ inflectionName
+ b'" value="'
+ inflection
+ b'"/>'
)
result += b"</idx:infl>"
return result
def applyInflectionRule(self, mainEntry, inflectionRuleData, start, end):
"""
Apply inflection rule.
@param mainEntry: The word to inflect.
@param inflectionRuleData: The inflection rules.
@param start: The start position of the inflection rule to use.
@param end: The end position of the inflection rule to use.
@return: The string with the inflected word or None if an error occurs.
"""
mode = -1
byteArray = array.array(array_format, mainEntry)
position = len(byteArray)
for charOffset in range(start, end):
char = inflectionRuleData[charOffset : charOffset + 1]
abyte = ord(char)
if abyte >= 0x0A and abyte <= 0x13:
# Move cursor backwards
offset = abyte - 0x0A
if mode not in [0x02, 0x03]:
mode = 0x02
position = len(byteArray)
position -= offset
elif abyte > 0x13:
if mode == -1:
logger.debug(
"Error: Unexpected first byte %i of inflection rule" % abyte
)
return None
elif position == -1:
logger.debug(
"Error: Unexpected first byte %i of inflection rule" % abyte
)
return None
else:
if mode == 0x01:
# Insert at word start
byteArray.insert(position, abyte)
position += 1
elif mode == 0x02:
# Insert at word end
byteArray.insert(position, abyte)
elif mode == 0x03:
# Delete at word end
position -= 1
deleted = byteArray.pop(position)
if bchr(deleted) != char:
if DEBUG_DICT:
logger.debug(
"0x03: %s %s %s %s"
% (
mainEntry,
toHex(inflectionRuleData[start:end]),
char,
bchr(deleted),
)
)
logger.debug(
"Error: Delete operation of inflection rule failed"
)
return None
elif mode == 0x04:
# Delete at word start
deleted = byteArray.pop(position)
if bchr(deleted) != char:
if DEBUG_DICT:
logger.debug(
"0x03: %s %s %s %s"
% (
mainEntry,
toHex(inflectionRuleData[start:end]),
char,
bchr(deleted),
)
)
logger.debug(
"Error: Delete operation of inflection rule failed"
)
return None
else:
logger.debug(
"Error: Inflection rule mode %x is not implemented" % mode
)
return None
elif abyte == 0x01:
# Insert at word start
if mode not in [0x01, 0x04]:
position = 0
mode = abyte
elif abyte == 0x02:
# Insert at word end
if mode not in [0x02, 0x03]:
position = len(byteArray)
mode = abyte
elif abyte == 0x03:
# Delete at word end
if mode not in [0x02, 0x03]:
position = len(byteArray)
mode = abyte
elif abyte == 0x04:
# Delete at word start
if mode not in [0x01, 0x04]:
position = 0
mode = abyte
else:
logger.debug(
"Error: Inflection rule mode %x is not implemented" % abyte
)
return None
return utf8_str(byteArray.tostring())
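# Illustrative walk-through (synthetic rule, not from a real book): byte 0x02
# selects "insert at word end" mode and each later byte is inserted at the fixed
# end position, so end-suffixes are stored reversed inside the rule. Applying
# rule bytes b"\x02gnin" to mainEntry b"run" inserts g, n, i, n in turn:
# b"rung" -> b"runng" -> b"runing" -> b"running", i.e.
# applyInflectionRule(b"run", b"\x02gnin", 0, 5) should return "running".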

mobimaster/mobi/mobi_header.py Executable file

File diff suppressed because it is too large

mobimaster/mobi/mobi_html.py Executable file

@@ -0,0 +1,516 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
from .compatibility_utils import PY2, utf8_str
from loguru import logger
if PY2:
range = xrange
import re
# note: in python3 re requires the pattern to be the exact same type as the data to be searched,
# so the patterns here must be bytes (b""), not unicode (u"")
from .mobi_utils import fromBase32
class HTMLProcessor:
def __init__(self, files, metadata, rscnames):
self.files = files
self.metadata = metadata
self.rscnames = rscnames
# for original style mobis, default to including all image files in the opf manifest
self.used = {}
for name in rscnames:
self.used[name] = "used"
def findAnchors(self, rawtext, indx_data, positionMap):
# process the raw text
# find anchors...
logger.debug("Find link anchors")
link_pattern = re.compile(
br"""<[^<>]+filepos=['"]{0,1}(\d+)[^<>]*>""", re.IGNORECASE
)
# TEST NCX: merge in filepos from indx
pos_links = [int(m.group(1)) for m in link_pattern.finditer(rawtext)]
if indx_data:
pos_indx = [e["pos"] for e in indx_data if e["pos"] > 0]
pos_links = list(set(pos_links + pos_indx))
for position in pos_links:
if position in positionMap:
positionMap[position] = positionMap[position] + utf8_str(
'<a id="filepos%d" />' % position
)
else:
positionMap[position] = utf8_str('<a id="filepos%d" />' % position)
# apply dictionary metadata and anchors
logger.debug("Insert data into html")
pos = 0
lastPos = len(rawtext)
dataList = []
for end in sorted(positionMap.keys()):
if end == 0 or end > lastPos:
continue # something is wrong - cannot insert a tag outside <html>...</html>
dataList.append(rawtext[pos:end])
dataList.append(positionMap[end])
pos = end
dataList.append(rawtext[pos:])
srctext = b"".join(dataList)
rawtext = None
dataList = None
self.srctext = srctext
self.indx_data = indx_data
return srctext
def insertHREFS(self):
srctext = self.srctext
rscnames = self.rscnames
metadata = self.metadata
# put in the hrefs
logger.debug("Insert hrefs into html")
# There doesn't seem to be a standard, so search as best we can
link_pattern = re.compile(
br"""<a([^>]*?)filepos=['"]{0,1}0*(\d+)['"]{0,1}([^>]*?)>""", re.IGNORECASE
)
srctext = link_pattern.sub(br"""<a\1href="#filepos\2"\3>""", srctext)
# remove empty anchors
logger.debug("Remove empty anchors from html")
srctext = re.sub(br"<a\s*/>", br"", srctext)
srctext = re.sub(br"<a\s*>\s*</a>", br"", srctext)
# convert image references
logger.debug("Insert image references into html")
# split string into image tag pieces and other pieces
image_pattern = re.compile(br"""(<img.*?>)""", re.IGNORECASE)
image_index_pattern = re.compile(
br"""recindex=['"]{0,1}([0-9]+)['"]{0,1}""", re.IGNORECASE
)
srcpieces = image_pattern.split(srctext)
srctext = self.srctext = None
# all odd pieces are image tags (empty strings fill the even pieces when tags are adjacent in srctext)
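# e.g. image_pattern.split(b'a<img recindex="0001"/>b') ==
# [b'a', b'<img recindex="0001"/>', b'b'] - the captured tag lands at index 1.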
for i in range(1, len(srcpieces), 2):
tag = srcpieces[i]
for m in image_index_pattern.finditer(tag):
imageNumber = int(m.group(1))
imageName = rscnames[imageNumber - 1]
if imageName is None:
logger.debug(
"Error: Referenced image %s was not recognized as a valid image"
% imageNumber
)
else:
replacement = b'src="Images/' + utf8_str(imageName) + b'"'
tag = image_index_pattern.sub(replacement, tag, 1)
srcpieces[i] = tag
srctext = b"".join(srcpieces)
# add in character set meta into the html header if needed
if "Codec" in metadata:
srctext = (
srctext[0:12]
+ b'<meta http-equiv="content-type" content="text/html; charset='
+ utf8_str(metadata.get("Codec")[0])
+ b'" />'
+ srctext[12:]
)
return srctext, self.used
class XHTMLK8Processor:
def __init__(self, rscnames, k8proc):
self.rscnames = rscnames
self.k8proc = k8proc
self.used = {}
def buildXHTML(self):
# first need to update all links that are internal which
# are based on positions within the xhtml files **BEFORE**
# cutting and pasting any pieces into the xhtml text files
# kindle:pos:fid:XXXX:off:YYYYYYYYYY (used for internal link within xhtml)
# XXXX is the offset in records into divtbl
# YYYYYYYYYY is a base32 number you add to the divtbl insertpos to get the final position
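# e.g. (illustrative, assuming fromBase32 reads digits 0-9,A-V as base-32):
# fromBase32(b"000A") == 10 and fromBase32(b"0010") == 32, so
# kindle:pos:fid:0010:off:000000000A points 10 bytes past the insert
# position of fragment entry 32.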
# pos:fid pattern
posfid_pattern = re.compile(br"""(<a.*?href=.*?>)""", re.IGNORECASE)
posfid_index_pattern = re.compile(
br"""['"]kindle:pos:fid:([0-9|A-V]+):off:([0-9|A-V]+).*?["']"""
)
parts = []
logger.debug("Building proper xhtml for each file")
for i in range(self.k8proc.getNumberOfParts()):
part = self.k8proc.getPart(i)
[partnum, dir, filename, beg, end, aidtext] = self.k8proc.getPartInfo(i)
# internal links
srcpieces = posfid_pattern.split(part)
for j in range(1, len(srcpieces), 2):
tag = srcpieces[j]
if tag.startswith(b"<"):
for m in posfid_index_pattern.finditer(tag):
posfid = m.group(1)
offset = m.group(2)
filename, idtag = self.k8proc.getIDTagByPosFid(posfid, offset)
if idtag == b"":
replacement = b'"' + utf8_str(filename) + b'"'
else:
replacement = (
b'"' + utf8_str(filename) + b"#" + idtag + b'"'
)
tag = posfid_index_pattern.sub(replacement, tag, 1)
srcpieces[j] = tag
part = b"".join(srcpieces)
parts.append(part)
# we are free to cut and paste as we see fit
# we can safely remove all of the Kindlegen generated aid tags
# change aid ids that are in k8proc.linked_aids to xhtml ids
find_tag_with_aid_pattern = re.compile(
br"""(<[^>]*\said\s*=[^>]*>)""", re.IGNORECASE
)
within_tag_aid_position_pattern = re.compile(br"""\said\s*=['"]([^'"]*)['"]""")
for i in range(len(parts)):
part = parts[i]
srcpieces = find_tag_with_aid_pattern.split(part)
for j in range(len(srcpieces)):
tag = srcpieces[j]
if tag.startswith(b"<"):
for m in within_tag_aid_position_pattern.finditer(tag):
try:
aid = m.group(1)
except IndexError:
aid = None
replacement = b""
if aid in self.k8proc.linked_aids:
replacement = b' id="aid-' + aid + b'"'
tag = within_tag_aid_position_pattern.sub(replacement, tag, 1)
srcpieces[j] = tag
part = b"".join(srcpieces)
parts[i] = part
# we can safely replace all of the Kindlegen generated data-AmznPageBreak tags
# with page-break-after style patterns
find_tag_with_AmznPageBreak_pattern = re.compile(
br"""(<[^>]*\sdata-AmznPageBreak=[^>]*>)""", re.IGNORECASE
)
within_tag_AmznPageBreak_position_pattern = re.compile(
br"""\sdata-AmznPageBreak=['"]([^'"]*)['"]"""
)
for i in range(len(parts)):
part = parts[i]
srcpieces = find_tag_with_AmznPageBreak_pattern.split(part)
for j in range(len(srcpieces)):
tag = srcpieces[j]
if tag.startswith(b"<"):
srcpieces[j] = within_tag_AmznPageBreak_position_pattern.sub(
lambda m: b' style="page-break-after:' + m.group(1) + b'"', tag
)
part = b"".join(srcpieces)
parts[i] = part
# we have to handle substitutions for the flows pieces first as they may
# be inlined into the xhtml text
# kindle:embed:XXXX?mime=image/gif (png, jpeg, etc) (used for images)
# kindle:flow:XXXX?mime=YYYY/ZZZ (used for style sheets, svg images, etc)
# kindle:embed:XXXX (used for fonts)
flows = []
flows.append(None)
flowinfo = []
flowinfo.append([None, None, None, None])
# regular expression search patterns
# match <img ...> or <image ...>; the old [img\s|image\s] form was a character class, not alternation
img_pattern = re.compile(br"""(<(?:img|image)\s[^>]*>)""", re.IGNORECASE)
img_index_pattern = re.compile(
br"""[('"]kindle:embed:([0-9|A-V]+)[^'"]*['")]""", re.IGNORECASE
)
tag_pattern = re.compile(br"""(<[^>]*>)""")
flow_pattern = re.compile(
br"""['"]kindle:flow:([0-9|A-V]+)\?mime=([^'"]+)['"]""", re.IGNORECASE
)
url_pattern = re.compile(br"""(url\(.*?\))""", re.IGNORECASE)
url_img_index_pattern = re.compile(
br"""[('"]kindle:embed:([0-9|A-V]+)\?mime=image/[^\)]*["')]""",
re.IGNORECASE,
)
font_index_pattern = re.compile(
br"""[('"]kindle:embed:([0-9|A-V]+)["')]""", re.IGNORECASE
)
url_css_index_pattern = re.compile(
br"""kindle:flow:([0-9|A-V]+)\?mime=text/css[^\)]*""", re.IGNORECASE
)
url_svg_image_pattern = re.compile(
br"""kindle:flow:([0-9|A-V]+)\?mime=image/svg\+xml[^\)]*""", re.IGNORECASE
)
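# Illustrative effect of the rewrites below (extracted file names assumed): a
# link such as kindle:flow:0002?mime=text/css is retargeted to the extracted
# file ../Styles/style0002.css, and kindle:embed:0001?mime=image/jpeg inside a
# css url() becomes ../Images/<name of resource 1>, with the original opening
# and closing separator characters preserved via osep/csep.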
for i in range(1, self.k8proc.getNumberOfFlows()):
[ftype, format, dir, filename] = self.k8proc.getFlowInfo(i)
flowpart = self.k8proc.getFlow(i)
# links to raster image files from image tags
# image_pattern
srcpieces = img_pattern.split(flowpart)
for j in range(1, len(srcpieces), 2):
tag = srcpieces[j]
if tag.startswith(b"<im"):
for m in img_index_pattern.finditer(tag):
imageNumber = fromBase32(m.group(1))
imageName = self.rscnames[imageNumber - 1]
if imageName is not None:
replacement = b'"../Images/' + utf8_str(imageName) + b'"'
self.used[imageName] = "used"
tag = img_index_pattern.sub(replacement, tag, 1)
else:
logger.debug(
"Error: Referenced image %s was not recognized as a valid image in %s"
% (imageNumber, tag)
)
srcpieces[j] = tag
flowpart = b"".join(srcpieces)
# replacements inside css url():
srcpieces = url_pattern.split(flowpart)
for j in range(1, len(srcpieces), 2):
tag = srcpieces[j]
# process links to raster image files
for m in url_img_index_pattern.finditer(tag):
imageNumber = fromBase32(m.group(1))
imageName = self.rscnames[imageNumber - 1]
osep = m.group()[0:1]
csep = m.group()[-1:]
if imageName is not None:
replacement = osep + b"../Images/" + utf8_str(imageName) + csep
self.used[imageName] = "used"
tag = url_img_index_pattern.sub(replacement, tag, 1)
else:
logger.debug(
"Error: Referenced image %s was not recognized as a valid image in %s"
% (imageNumber, tag)
)
# process links to fonts
for m in font_index_pattern.finditer(tag):
fontNumber = fromBase32(m.group(1))
fontName = self.rscnames[fontNumber - 1]
osep = m.group()[0:1]
csep = m.group()[-1:]
if fontName is None:
logger.debug(
"Error: Referenced font %s was not recognized as a valid font in %s"
% (fontNumber, tag)
)
else:
replacement = osep + b"../Fonts/" + utf8_str(fontName) + csep
tag = font_index_pattern.sub(replacement, tag, 1)
self.used[fontName] = "used"
# process links to other css pieces
for m in url_css_index_pattern.finditer(tag):
num = fromBase32(m.group(1))
[typ, fmt, pdir, fnm] = self.k8proc.getFlowInfo(num)
replacement = b'"../' + utf8_str(pdir) + b"/" + utf8_str(fnm) + b'"'
tag = url_css_index_pattern.sub(replacement, tag, 1)
self.used[fnm] = "used"
# process links to svg images
for m in url_svg_image_pattern.finditer(tag):
num = fromBase32(m.group(1))
[typ, fmt, pdir, fnm] = self.k8proc.getFlowInfo(num)
replacement = b'"../' + utf8_str(pdir) + b"/" + utf8_str(fnm) + b'"'
tag = url_svg_image_pattern.sub(replacement, tag, 1)
self.used[fnm] = "used"
srcpieces[j] = tag
flowpart = b"".join(srcpieces)
# store away in our own copy
flows.append(flowpart)
# I do not think this case exists and even if it does exist, it needs to be done in a separate
# pass to prevent inlining a flow piece into another flow piece before the inserted one or the
# target one has been fully processed
# but keep it around in case we end up needing it
# flow pattern not inside url()
# srcpieces = tag_pattern.split(flowpart)
# for j in range(1, len(srcpieces),2):
# tag = srcpieces[j]
# if tag.startswith(b'<'):
# for m in flow_pattern.finditer(tag):
# num = fromBase32(m.group(1))
# [typ, fmt, pdir, fnm] = self.k8proc.getFlowInfo(num)
# flowtext = self.k8proc.getFlow(num)
# if fmt == b'inline':
# tag = flowtext
# else:
# replacement = b'"../' + utf8_str(pdir) + b'/' + utf8_str(fnm) + b'"'
# tag = flow_pattern.sub(replacement, tag, 1)
# self.used[fnm] = 'used'
# srcpieces[j] = tag
# flowpart = b"".join(srcpieces)
# now handle the main text xhtml parts
# Handle the flow items in the XHTML text pieces
# kindle:flow:XXXX?mime=YYYY/ZZZ (used for style sheets, svg images, etc)
tag_pattern = re.compile(br"""(<[^>]*>)""")
flow_pattern = re.compile(
br"""['"]kindle:flow:([0-9|A-V]+)\?mime=([^'"]+)['"]""", re.IGNORECASE
)
for i in range(len(parts)):
part = parts[i]
[partnum, dir, filename, beg, end, aidtext] = self.k8proc.partinfo[i]
# flow pattern
srcpieces = tag_pattern.split(part)
for j in range(1, len(srcpieces), 2):
tag = srcpieces[j]
if tag.startswith(b"<"):
for m in flow_pattern.finditer(tag):
num = fromBase32(m.group(1))
if num > 0 and num < len(self.k8proc.flowinfo):
[typ, fmt, pdir, fnm] = self.k8proc.getFlowInfo(num)
flowpart = flows[num]
if fmt == b"inline":
tag = flowpart
else:
replacement = (
b'"../'
+ utf8_str(pdir)
+ b"/"
+ utf8_str(fnm)
+ b'"'
)
tag = flow_pattern.sub(replacement, tag, 1)
self.used[fnm] = "used"
else:
print(
"warning: ignoring non-existent flow link",
tag,
" value 0x%x" % num,
)
srcpieces[j] = tag
part = b"".join(srcpieces)
# store away modified version
parts[i] = part
# Handle any embedded raster images links in style= attributes urls
style_pattern = re.compile(
br"""(<[a-zA-Z0-9]+\s[^>]*style\s*=\s*[^>]*>)""", re.IGNORECASE
)
img_index_pattern = re.compile(
br"""[('"]kindle:embed:([0-9|A-V]+)[^'"]*['")]""", re.IGNORECASE
)
for i in range(len(parts)):
part = parts[i]
[partnum, dir, filename, beg, end, aidtext] = self.k8proc.partinfo[i]
# replace urls in style attributes
srcpieces = style_pattern.split(part)
for j in range(1, len(srcpieces), 2):
tag = srcpieces[j]
if b"kindle:embed" in tag:
for m in img_index_pattern.finditer(tag):
imageNumber = fromBase32(m.group(1))
imageName = self.rscnames[imageNumber - 1]
osep = m.group()[0:1]
csep = m.group()[-1:]
if imageName is not None:
replacement = (
osep + b"../Images/" + utf8_str(imageName) + csep
)
self.used[imageName] = "used"
tag = img_index_pattern.sub(replacement, tag, 1)
else:
logger.debug(
"Error: Referenced image %s in style url was not recognized in %s"
% (imageNumber, tag)
)
srcpieces[j] = tag
part = b"".join(srcpieces)
# store away modified version
parts[i] = part
# Handle any embedded raster images links in the xhtml text
# kindle:embed:XXXX?mime=image/gif (png, jpeg, etc) (used for images)
# match <img ...> or <image ...>; the old [img\s|image\s] form was a character class, not alternation
img_pattern = re.compile(br"""(<(?:img|image)\s[^>]*>)""", re.IGNORECASE)
img_index_pattern = re.compile(br"""['"]kindle:embed:([0-9|A-V]+)[^'"]*['"]""")
for i in range(len(parts)):
part = parts[i]
[partnum, dir, filename, beg, end, aidtext] = self.k8proc.partinfo[i]
# links to raster image files
# image_pattern
srcpieces = img_pattern.split(part)
for j in range(1, len(srcpieces), 2):
tag = srcpieces[j]
if tag.startswith(b"<im"):
for m in img_index_pattern.finditer(tag):
imageNumber = fromBase32(m.group(1))
imageName = self.rscnames[imageNumber - 1]
if imageName is not None:
replacement = b'"../Images/' + utf8_str(imageName) + b'"'
self.used[imageName] = "used"
tag = img_index_pattern.sub(replacement, tag, 1)
else:
logger.debug(
"Error: Referenced image %s was not recognized as a valid image in %s"
% (imageNumber, tag)
)
srcpieces[j] = tag
part = b"".join(srcpieces)
# store away modified version
parts[i] = part
# finally perform any general cleanups needed to make valid XHTML
# these include:
# in svg tags replace "perserveaspectratio" attributes with "perserveAspectRatio"
# in svg tags replace "viewbox" attributes with "viewBox"
# in <li> remove value="XX" attributes since these are illegal
tag_pattern = re.compile(br"""(<[^>]*>)""")
li_value_pattern = re.compile(
br"""\svalue\s*=\s*['"][^'"]*['"]""", re.IGNORECASE
)
for i in range(len(parts)):
part = parts[i]
[partnum, dir, filename, beg, end, aidtext] = self.k8proc.partinfo[i]
# tag pattern
srcpieces = tag_pattern.split(part)
for j in range(1, len(srcpieces), 2):
tag = srcpieces[j]
if tag.startswith(b"<svg") or tag.startswith(b"<SVG"):
tag = tag.replace(b"preserveaspectratio", b"preserveAspectRatio")
tag = tag.replace(b"viewbox", b"viewBox")
elif tag.startswith(b"<li ") or tag.startswith(b"<LI "):
tagpieces = li_value_pattern.split(tag)
tag = b"".join(tagpieces)
srcpieces[j] = tag
part = b"".join(srcpieces)
# store away modified version
parts[i] = part
self.k8proc.setFlows(flows)
self.k8proc.setParts(parts)
return self.used

mobimaster/mobi/mobi_index.py Executable file

@@ -0,0 +1,327 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
from .compatibility_utils import PY2, bchr, bstr, bord
from loguru import logger
if PY2:
range = xrange
import struct
# note: struct pack, unpack, unpack_from all require bytestring format
# data all the way up to at least python 2.7.5, python 3 okay with bytestring
from .mobi_utils import toHex
class MobiIndex:
# CGDBG
def __init__(self, sect, DEBUG=True):
self.sect = sect
self.DEBUG = DEBUG
def getIndexData(self, idx, label="Unknown"):
sect = self.sect
outtbl = []
ctoc_text = {}
if idx != 0xFFFFFFFF:
sect.setsectiondescription(idx, "{0} Main INDX section".format(label))
data = sect.loadSection(idx)
idxhdr, hordt1, hordt2 = self.parseINDXHeader(data)
IndexCount = idxhdr["count"]
# handle the case of multiple sections used for CTOC
rec_off = 0
off = idx + IndexCount + 1
for j in range(idxhdr["nctoc"]):
cdata = sect.loadSection(off + j)
sect.setsectiondescription(off + j, label + " CTOC Data " + str(j))
ctocdict = self.readCTOC(cdata)
for k in ctocdict:
ctoc_text[k + rec_off] = ctocdict[k]
rec_off += 0x10000
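# e.g. with two CTOC sections, an offset 0x0010 found in the second section is
# stored as ctoc_text[0x10010]; the index entries address CTOC text in this
# same 0x10000-per-section offset space.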
tagSectionStart = idxhdr["len"]
controlByteCount, tagTable = readTagSection(tagSectionStart, data)
if self.DEBUG:
logger.debug("ControlByteCount is", controlByteCount)
logger.debug("IndexCount is", IndexCount)
logger.debug("TagTable: %s" % tagTable)
for i in range(idx + 1, idx + 1 + IndexCount):
sect.setsectiondescription(
i, "{0} Extra {1:d} INDX section".format(label, i - idx)
)
data = sect.loadSection(i)
hdrinfo, ordt1, ordt2 = self.parseINDXHeader(data)
idxtPos = hdrinfo["start"]
entryCount = hdrinfo["count"]
if self.DEBUG:
logger.debug("%s %s" % (idxtPos, entryCount))
# loop through to build up the IDXT position starts
idxPositions = []
for j in range(entryCount):
(pos,) = struct.unpack_from(b">H", data, idxtPos + 4 + (2 * j))
idxPositions.append(pos)
# The last entry ends before the IDXT tag (but there might be zero fill bytes we need to ignore!)
idxPositions.append(idxtPos)
# for each entry in the IDXT build up the tagMap and any associated text
for j in range(entryCount):
startPos = idxPositions[j]
endPos = idxPositions[j + 1]
textLength = ord(data[startPos : startPos + 1])
text = data[startPos + 1 : startPos + 1 + textLength]
if hordt2 is not None:
text = b"".join(bchr(hordt2[bord(x)]) for x in text)
tagMap = getTagMap(
controlByteCount,
tagTable,
data,
startPos + 1 + textLength,
endPos,
)
outtbl.append([text, tagMap])
if self.DEBUG:
# CGDBG
logger.debug('tagMap {}'.format(tagMap))
logger.debug('text {}'.format(text))
logger.debug('data {}'.format(data))
return outtbl, ctoc_text
def parseINDXHeader(self, data):
"read INDX header"
if data[:4] != b"INDX":
logger.debug("Warning: index section is not INDX")
return False
words = (
"len",
"nul1",
"type",
"gen",
"start",
"count",
"code",
"lng",
"total",
"ordt",
"ligt",
"nligt",
"nctoc",
)
num = len(words)
values = struct.unpack(bstr(">%dL" % num), data[4 : 4 * (num + 1)])
header = {}
for n in range(num):
header[words[n]] = values[n]
ordt1 = None
ordt2 = None
ocnt, oentries, op1, op2, otagx = struct.unpack_from(b">LLLLL", data, 0xA4)
if header["code"] == 0xFDEA or ocnt != 0 or oentries > 0:
# horribly hacked up ESP (sample) mobi books use two ORDT sections but never specify
# them in the proper place in the header. They seem to be codepage 65002 which seems
# to be some sort of strange EBCDIC utf-8 or 16 encoded strings
# so we need to look for them and store them away to process leading text
# ORDT1 has 1 byte long entries, ORDT2 has 2 byte long entries
# we only ever seem to use the second but ...
assert ocnt == 1
assert data[op1 : op1 + 4] == b"ORDT"
assert data[op2 : op2 + 4] == b"ORDT"
ordt1 = struct.unpack_from(bstr(">%dB" % oentries), data, op1 + 4)
ordt2 = struct.unpack_from(bstr(">%dH" % oentries), data, op2 + 4)
if self.DEBUG:
logger.debug("parsed INDX header:")
for n in words:
logger.debug("%s %X" % (n, header[n]))
logger.debug("")
return header, ordt1, ordt2
def readCTOC(self, txtdata):
# read all blocks from CTOC
ctoc_data = {}
offset = 0
while offset < len(txtdata):
if PY2:
if txtdata[offset] == b"\0":
break
else:
if txtdata[offset] == 0:
break
idx_offs = offset
# first n bytes: name len as vwi
pos, ilen = getVariableWidthValue(txtdata, offset)
offset += pos
# <len> next bytes: name
name = txtdata[offset : offset + ilen]
offset += ilen
if self.DEBUG:
logger.debug("name length is %s" % ilen)
logger.debug("%s %s", (idx_offs, name))
ctoc_data[idx_offs] = name
return ctoc_data
def getVariableWidthValue(data, offset):
"""
Decode variable width value from given bytes.
@param data: The bytes to decode.
@param offset: The start offset into data.
@return: Tuple of consumed bytes count and decoded value.
"""
value = 0
consumed = 0
finished = False
while not finished:
v = data[offset + consumed : offset + consumed + 1]
consumed += 1
if ord(v) & 0x80:
finished = True
value = (value << 7) | (ord(v) & 0x7F)
return consumed, value
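# Worked example (synthetic bytes): the high bit marks the final byte of this
# big-endian VWI encoding, so
# getVariableWidthValue(b"\x81", 0) == (1, 1) since 0x81 & 0x7F == 1, and
# getVariableWidthValue(b"\x0b\x8f", 0) == (2, 0x058F) since (0x0B << 7) | 0x0F.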
def readTagSection(start, data):
"""
Read tag section from given data.
@param start: The start position in the data.
@param data: The data to process.
@return: Tuple of control byte count and list of tag tuples.
"""
controlByteCount = 0
tags = []
if data[start : start + 4] == b"TAGX":
(firstEntryOffset,) = struct.unpack_from(b">L", data, start + 0x04)
(controlByteCount,) = struct.unpack_from(b">L", data, start + 0x08)
# Skip the first 12 bytes already read above.
for i in range(12, firstEntryOffset, 4):
pos = start + i
tags.append(
(
ord(data[pos : pos + 1]),
ord(data[pos + 1 : pos + 2]),
ord(data[pos + 2 : pos + 3]),
ord(data[pos + 3 : pos + 4]),
)
)
return controlByteCount, tags
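# Illustrative TAGX section (made-up bytes):
# data = b"TAGX" + b"\x00\x00\x00\x10" + b"\x00\x00\x00\x01" + b"\x01\x01\x01\x00"
# has firstEntryOffset 16 and controlByteCount 1, and its single 4-byte entry
# decodes to the tag tuple (tag=1, valuesPerEntry=1, mask=0x01, endFlag=0), so
# readTagSection(0, data) == (1, [(1, 1, 1, 0)]).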
def countSetBits(value, bits=8):
"""
Count the set bits in the given value.
@param value: Integer value.
@param bits: The number of bits of the input value (defaults to 8).
@return: Number of set bits.
"""
count = 0
for _ in range(bits):
if value & 0x01 == 0x01:
count += 1
value = value >> 1
return count
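# Quick sanity example: countSetBits(0x0B) == 3, since 0x0B == 0b00001011 has
# three bits set.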
def getTagMap(controlByteCount, tagTable, entryData, startPos, endPos):
"""
Create a map of tags and values from the given byte section.
@param controlByteCount: The number of control bytes.
@param tagTable: The tag table.
@param entryData: The data to process.
@param startPos: The starting position in entryData.
@param endPos: The end position in entryData or None if it is unknown.
@return: Hashmap of tag and list of values.
"""
tags = []
tagHashMap = {}
controlByteIndex = 0
dataStart = startPos + controlByteCount
for tag, valuesPerEntry, mask, endFlag in tagTable:
if endFlag == 0x01:
controlByteIndex += 1
continue
cbyte = ord(
entryData[startPos + controlByteIndex : startPos + controlByteIndex + 1]
)
if 0:
logger.debug(
"Control Byte Index %0x , Control Byte Value %0x"
% (controlByteIndex, cbyte)
)
value = (
ord(
entryData[startPos + controlByteIndex : startPos + controlByteIndex + 1]
)
& mask
)
if value != 0:
if value == mask:
if countSetBits(mask) > 1:
# If all bits of the masked value are set and the mask has more than one bit, a variable
# width value follows the control bytes; it gives the length in bytes (NOT the value
# count!) of the data that holds the corresponding variable width values.
consumed, value = getVariableWidthValue(entryData, dataStart)
dataStart += consumed
tags.append((tag, None, value, valuesPerEntry))
else:
tags.append((tag, 1, None, valuesPerEntry))
else:
# Shift bits to get the masked value.
while mask & 0x01 == 0:
mask = mask >> 1
value = value >> 1
tags.append((tag, value, None, valuesPerEntry))
for tag, valueCount, valueBytes, valuesPerEntry in tags:
values = []
if valueCount is not None:
# Read valueCount * valuesPerEntry variable width values.
for _ in range(valueCount):
for _ in range(valuesPerEntry):
consumed, data = getVariableWidthValue(entryData, dataStart)
dataStart += consumed
values.append(data)
else:
# Convert valueBytes to variable width values.
totalConsumed = 0
while totalConsumed < valueBytes:
# Does this work for valuesPerEntry != 1?
consumed, data = getVariableWidthValue(entryData, dataStart)
dataStart += consumed
totalConsumed += consumed
values.append(data)
if totalConsumed != valueBytes:
logger.debug(
"Error: Should consume %s bytes, but consumed %s"
% (valueBytes, totalConsumed)
)
tagHashMap[tag] = values
# Test that all bytes have been processed if endPos is given.
if endPos is not None and dataStart != endPos:
# The last entry might have some zero padding bytes, so complain only if non zero bytes are left.
for char in entryData[dataStart:endPos]:
if bord(char) != 0:
logger.debug(
"Warning: There are unprocessed index bytes left: %s"
% toHex(entryData[dataStart:endPos])
)
if 0:
logger.debug("controlByteCount: %s" % controlByteCount)
logger.debug("tagTable: %s" % tagTable)
logger.debug("data: %s" % toHex(entryData[startPos:endPos]))
logger.debug("tagHashMap: %s" % tagHashMap)
break
return tagHashMap
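# Worked example (synthetic data): with controlByteCount == 1,
# tagTable == [(1, 1, 0x01, 0), (2, 1, 0x02, 0), (0, 0, 0, 1)] and
# entryData == b"\x03\x85\x8a" (control byte 0x03, then VWI values 5 and 10),
# getTagMap(1, tagTable, entryData, 0, 3) == {1: [5], 2: [10]}.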

mobimaster/mobi/mobi_k8proc.py Executable file

@@ -0,0 +1,575 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
from .compatibility_utils import PY2, bstr, utf8_str
from loguru import logger
if PY2:
range = xrange
import os
import struct
# note: struct pack, unpack, unpack_from all require bytestring format
# data all the way up to at least python 2.7.5, python 3 okay with bytestring
import re
# note: in python3 re requires the pattern to be the exact same type as the data to be searched,
# so the patterns here must be bytes (b""), not unicode (u"")
from .mobi_index import MobiIndex
from .mobi_utils import fromBase32
from .unipath import pathof
_guide_types = [
b"cover",
b"title-page",
b"toc",
b"index",
b"glossary",
b"acknowledgements",
b"bibliography",
b"colophon",
b"copyright-page",
b"dedication",
b"epigraph",
b"foreward",
b"loi",
b"lot",
b"notes",
b"preface",
b"text",
]
# locate beginning and ending positions of tag with specific aid attribute
def locate_beg_end_of_tag(ml, aid):
# build the bytes pattern piecewise so a bytes aid is not rendered as "b'...'" by %-formatting in python3
pattern = utf8_str(r"""<[^>]*\said\s*=\s*['"]""") + utf8_str(aid) + utf8_str(r"""['"][^>]*>""")
aid_pattern = re.compile(pattern, re.IGNORECASE)
for m in re.finditer(aid_pattern, ml):
plt = m.start()
pgt = ml.find(b">", plt + 1)
return plt, pgt
return 0, 0
# iterate over all tags in block in reverse order, i.e. from last tag to first tag
def reverse_tag_iter(block):
end = len(block)
while True:
pgt = block.rfind(b">", 0, end)
if pgt == -1:
break
plt = block.rfind(b"<", 0, pgt)
if plt == -1:
break
yield block[plt : pgt + 1]
end = plt
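# Illustrative: reverse_tag_iter(b"<p><b>x</b></p>") yields b"</p>", b"</b>",
# b"<b>", b"<p>" in that order; text between tags is skipped.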
class K8Processor:
def __init__(self, mh, sect, files, debug=False):
self.sect = sect
self.files = files
self.mi = MobiIndex(sect)
self.mh = mh
self.skelidx = mh.skelidx
self.fragidx = mh.fragidx
self.guideidx = mh.guideidx
self.fdst = mh.fdst
self.flowmap = {}
self.flows = None
self.flowinfo = []
self.parts = None
self.partinfo = []
self.linked_aids = set()
self.fdsttbl = [0, 0xFFFFFFFF]
self.DEBUG = debug
# read in and parse the FDST info which is very similar in format to the Palm DB section
# parsing except it provides offsets into rawML file and not the Palm DB file
# this is needed to split up the final css, svg, etc flow section
# that can exist at the end of the rawML file
if self.fdst != 0xFFFFFFFF:
header = self.sect.loadSection(self.fdst)
if header[0:4] == b"FDST":
(num_sections,) = struct.unpack_from(b">L", header, 0x08)
self.fdsttbl = struct.unpack_from(
bstr(">%dL" % (num_sections * 2)), header, 12
)[::2] + (mh.rawSize,)
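# Illustrative: with two FDST pairs (0, 1000) and (1000, 2500) the unpacked
# tuple is (0, 1000, 1000, 2500); [::2] keeps the starts and appending
# rawSize gives fdsttbl == (0, 1000, rawSize), so flow j spans
# rawML[fdsttbl[j]:fdsttbl[j+1]].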
sect.setsectiondescription(self.fdst, "KF8 FDST INDX")
if self.DEBUG:
logger.debug("\nFDST Section Map: %d sections" % num_sections)
for j in range(num_sections):
logger.debug(
"Section %d: 0x%08X - 0x%08X"
% (j, self.fdsttbl[j], self.fdsttbl[j + 1])
)
else:
logger.debug("\nError: K8 Mobi with Missing FDST info")
# read/process skeleton index info to create the skeleton table
skeltbl = []
if self.skelidx != 0xFFFFFFFF:
# for i in range(2):
# fname = 'skel%04d.dat' % i
# data = self.sect.loadSection(self.skelidx + i)
# with open(pathof(fname), 'wb') as f:
# f.write(data)
outtbl, ctoc_text = self.mi.getIndexData(self.skelidx, "KF8 Skeleton")
fileptr = 0
for [text, tagMap] in outtbl:
# file number, skeleton name, fragtbl record count, start position, length
skeltbl.append(
[fileptr, text, tagMap[1][0], tagMap[6][0], tagMap[6][1]]
)
fileptr += 1
self.skeltbl = skeltbl
if self.DEBUG:
logger.debug("\nSkel Table: %d entries" % len(self.skeltbl))
logger.debug(
"table: filenum, skeleton name, frag tbl record count, start position, length"
)
for j in range(len(self.skeltbl)):
logger.debug(self.skeltbl[j])
# read/process the fragment index to create the fragment table
fragtbl = []
if self.fragidx != 0xFFFFFFFF:
# for i in range(3):
# fname = 'frag%04d.dat' % i
# data = self.sect.loadSection(self.fragidx + i)
# with open(pathof(fname), 'wb') as f:
# f.write(data)
outtbl, ctoc_text = self.mi.getIndexData(self.fragidx, "KF8 Fragment")
for [text, tagMap] in outtbl:
# insert position, ctoc offset (aidtext), file number, sequence number, start position, length
ctocoffset = tagMap[2][0]
ctocdata = ctoc_text[ctocoffset]
fragtbl.append(
[
int(text),
ctocdata,
tagMap[3][0],
tagMap[4][0],
tagMap[6][0],
tagMap[6][1],
]
)
self.fragtbl = fragtbl
if self.DEBUG:
logger.debug("\nFragment Table: %d entries" % len(self.fragtbl))
logger.debug(
"table: file position, link id text, file num, sequence number, start position, length"
)
for j in range(len(self.fragtbl)):
logger.debug(self.fragtbl[j])
# read / process guide index for guide elements of opf
guidetbl = []
if self.guideidx != 0xFFFFFFFF:
# for i in range(3):
# fname = 'guide%04d.dat' % i
# data = self.sect.loadSection(self.guideidx + i)
# with open(pathof(fname), 'wb') as f:
# f.write(data)
outtbl, ctoc_text = self.mi.getIndexData(
self.guideidx, "KF8 Guide elements"
)
for [text, tagMap] in outtbl:
# ref_type, ref_title, frag number
ctocoffset = tagMap[1][0]
ref_title = ctoc_text[ctocoffset]
ref_type = text
fileno = None
if 3 in tagMap:
fileno = tagMap[3][0]
if 6 in tagMap:
fileno = tagMap[6][0]
guidetbl.append([ref_type, ref_title, fileno])
self.guidetbl = guidetbl
if self.DEBUG:
logger.debug("\nGuide Table: %d entries" % len(self.guidetbl))
logger.debug("table: ref_type, ref_title, fragtbl entry number")
for j in range(len(self.guidetbl)):
logger.debug(self.guidetbl[j])
def buildParts(self, rawML):
# now split the rawML into its flow pieces
self.flows = []
for j in range(0, len(self.fdsttbl) - 1):
start = self.fdsttbl[j]
end = self.fdsttbl[j + 1]
self.flows.append(rawML[start:end])
# the first piece represents the xhtml text
text = self.flows[0]
self.flows[0] = b""
# walk the <skeleton> and fragment tables to build original source xhtml files
# *without* destroying any file position information needed for later href processing
# and create final list of file separation start: stop points and etc in partinfo
if self.DEBUG:
logger.debug("\nRebuilding flow piece 0: the main body of the ebook")
self.parts = []
self.partinfo = []
fragptr = 0
baseptr = 0
cnt = 0
filename = "part%04d.xhtml" % cnt
for [skelnum, skelname, fragcnt, skelpos, skellen] in self.skeltbl:
baseptr = skelpos + skellen
skeleton = text[skelpos:baseptr]
aidtext = "0"
for i in range(fragcnt):
[insertpos, idtext, filenum, seqnum, startpos, length] = self.fragtbl[
fragptr
]
aidtext = idtext[12:-2]
if i == 0:
filename = "part%04d.xhtml" % filenum
slice = text[baseptr : baseptr + length]
insertpos = insertpos - skelpos
head = skeleton[:insertpos]
tail = skeleton[insertpos:]
actual_inspos = insertpos
if tail.find(b">") < tail.find(b"<") or head.rfind(b">") < head.rfind(
b"<"
):
# There is an incomplete tag in either the head or tail.
# This can happen for some badly formed KF8 files
logger.debug(
"The fragment table for %s has incorrect insert position. Calculating manually."
% skelname
)
bp, ep = locate_beg_end_of_tag(skeleton, aidtext)
if bp != ep:
actual_inspos = ep + 1 + startpos
if insertpos != actual_inspos:
print(
"fixed corrupt fragment table insert position",
insertpos + skelpos,
actual_inspos + skelpos,
)
insertpos = actual_inspos
self.fragtbl[fragptr][0] = actual_inspos + skelpos
skeleton = skeleton[0:insertpos] + slice + skeleton[insertpos:]
baseptr = baseptr + length
fragptr += 1
cnt += 1
self.parts.append(skeleton)
self.partinfo.append([skelnum, "Text", filename, skelpos, baseptr, aidtext])
assembled_text = b"".join(self.parts)
if self.DEBUG:
outassembled = os.path.join(self.files.k8dir, "assembled_text.dat")
with open(pathof(outassembled), "wb") as f:
f.write(assembled_text)
# The primary css style sheet is typically stored next followed by any
# snippets of code that were previously inlined in the
# original xhtml but have been stripped out and placed here.
# This can include local CDATA snippets and svg sections.
# The problem is that for most browsers and ereaders, you can not
# use <img src="imageXXXX.svg" /> to import any svg image that itself
# properly uses an <image/> tag to import some raster image - it
# should work according to the spec but does not for almost all browsers
# and ereaders and causes epub validation issues because those raster
# images are in the manifest but not in the xhtml text - since they are only
# referenced from an svg image
# So we need to check the remaining flow pieces to see if they are css
# or svg images. if svg images, we must check if they have an <image />
# and if so inline them into the xhtml text pieces.
# there may be other sorts of pieces stored here but until we see one
# in the wild to reverse engineer we won't be able to tell
self.flowinfo.append([None, None, None, None])
svg_tag_pattern = re.compile(br"""(<svg[^>]*>)""", re.IGNORECASE)
image_tag_pattern = re.compile(br"""(<image[^>]*>)""", re.IGNORECASE)
for j in range(1, len(self.flows)):
flowpart = self.flows[j]
nstr = "%04d" % j
m = re.search(svg_tag_pattern, flowpart)
if m is not None:
# svg
ptype = b"svg"
start = m.start()
m2 = re.search(image_tag_pattern, flowpart)
if m2 is not None:
pformat = b"inline"
pdir = None
fname = None
# strip off anything before <svg if inlining
flowpart = flowpart[start:]
else:
pformat = b"file"
pdir = "Images"
fname = "svgimg" + nstr + ".svg"
else:
# search for CDATA and if exists inline it
if flowpart.find(b"[CDATA[") >= 0:
ptype = b"css"
flowpart = b'<style type="text/css">\n' + flowpart + b"\n</style>\n"
pformat = b"inline"
pdir = None
fname = None
else:
# css - assume as standalone css file
ptype = b"css"
pformat = b"file"
pdir = "Styles"
fname = "style" + nstr + ".css"
self.flows[j] = flowpart
self.flowinfo.append([ptype, pformat, pdir, fname])
if self.DEBUG:
logger.debug("\nFlow Map: %d entries" % len(self.flowinfo))
for fi in self.flowinfo:
logger.debug(fi)
logger.debug("\n")
logger.debug(
"\nXHTML File Part Position Information: %d entries"
% len(self.partinfo)
)
for pi in self.partinfo:
logger.debug(pi)
if False: # self.DEBUG:
# dump all of the locations of the aid tags used in TEXT
# find id links only inside of tags
# inside any < > pair find all "aid=' and return whatever is inside the quotes
# [^>]* means match any amount of chars except for '>' char
# [^'"] match any amount of chars except for the quote character
# \s* means match any amount of whitespace
logger.debug("\npositions of all aid= pieces")
id_pattern = re.compile(
br"""<[^>]*\said\s*=\s*['"]([^'"]*)['"][^>]*>""", re.IGNORECASE
)
for m in re.finditer(id_pattern, rawML):
[filename, partnum, start, end] = self.getFileInfo(m.start())
[seqnum, idtext] = self.getFragTblInfo(m.start())
value = fromBase32(m.group(1))
logger.debug(
" aid: %s value: %d at: %d -> part: %d, start: %d, end: %d"
% (m.group(1), value, m.start(), partnum, start, end)
)
logger.debug(" %s fragtbl entry %d" % (idtext, seqnum))
return
# get information fragment table entry by pos
def getFragTblInfo(self, pos):
for j in range(len(self.fragtbl)):
[insertpos, idtext, filenum, seqnum, startpos, length] = self.fragtbl[j]
if pos >= insertpos and pos < (insertpos + length):
# why are the "in: " and "before: " prefixes added here?
return seqnum, b"in: " + idtext
if pos < insertpos:
return seqnum, b"before: " + idtext
return None, None
# get information about the part (file) that exists at pos in original rawML
def getFileInfo(self, pos):
for [partnum, pdir, filename, start, end, aidtext] in self.partinfo:
if pos >= start and pos < end:
return filename, partnum, start, end
return None, None, None, None
# accessor functions to properly protect the internal structure
def getNumberOfParts(self):
return len(self.parts)
def getPart(self, i):
if i >= 0 and i < len(self.parts):
return self.parts[i]
return None
def getPartInfo(self, i):
if i >= 0 and i < len(self.partinfo):
return self.partinfo[i]
return None
def getNumberOfFlows(self):
return len(self.flows)
def getFlow(self, i):
# note flows[0] is empty - it was all of the original text
if i > 0 and i < len(self.flows):
return self.flows[i]
return None
def getFlowInfo(self, i):
# note flowinfo[0] is empty - it was all of the original text
if i > 0 and i < len(self.flowinfo):
return self.flowinfo[i]
return None
def getIDTagByPosFid(self, posfid, offset):
# first convert kindle:pos:fid and offset info to position in file
# (fromBase32 can handle both string types on input)
row = fromBase32(posfid)
off = fromBase32(offset)
[insertpos, idtext, filenum, seqnm, startpos, length] = self.fragtbl[row]
pos = insertpos + off
fname, pn, skelpos, skelend = self.getFileInfo(pos)
if fname is None:
# pos does not exist
# default to skeleton pos instead
print(
"Link To Position", pos, "does not exist, retargeting to top of target"
)
pos = self.skeltbl[filenum][3]
fname, pn, skelpos, skelend = self.getFileInfo(pos)
# an existing "id=" or "name=" attribute must exist in original xhtml otherwise it would not have worked for linking.
# Amazon seems to have added its own additional "aid=" inside tags whose contents seem to represent
# some position information encoded into Base32 name.
# so find the closest "id=" before position the file by actually searching in that file
idtext = self.getIDTag(pos)
return fname, idtext
def getIDTag(self, pos):
# find the first tag with a named anchor (name or id attribute) before pos
fname, pn, skelpos, skelend = self.getFileInfo(pos)
if pn is None and skelpos is None:
logger.debug("Error: getIDTag - no file contains %s" % pos)
return b""
textblock = self.parts[pn]
npos = pos - skelpos
# if npos is inside a tag then search all text before its end of tag marker
pgt = textblock.find(b">", npos)
plt = textblock.find(b"<", npos)
if plt == npos or pgt < plt:
npos = pgt + 1
# find id and name attributes only inside of tags
# use a reverse tag search since that is faster
# inside any < > pair find "id=" and "name=" attributes return it
# [^>]* means match any amount of chars except for '>' char
# [^'"] match any amount of chars except for the quote character
# \s* means match any amount of whitespace
textblock = textblock[0:npos]
id_pattern = re.compile(
br"""<[^>]*\sid\s*=\s*['"]([^'"]*)['"]""", re.IGNORECASE
)
name_pattern = re.compile(
br"""<[^>]*\sname\s*=\s*['"]([^'"]*)['"]""", re.IGNORECASE
)
aid_pattern = re.compile(br"""<[^>]+\s(?:aid|AID)\s*=\s*['"]([^'"]+)['"]""")
for tag in reverse_tag_iter(textblock):
# any ids in the body should default to top of file
if tag[0:6] == b"<body ":
return b""
if tag[0:6] != b"<meta ":
m = id_pattern.match(tag) or name_pattern.match(tag)
if m is not None:
return m.group(1)
m = aid_pattern.match(tag)
if m is not None:
self.linked_aids.add(m.group(1))
return b"aid-" + m.group(1)
return b""
# do we need to do deep copying
def setParts(self, parts):
assert len(parts) == len(self.parts)
for i in range(len(parts)):
self.parts[i] = parts[i]
# do we need to do deep copying
def setFlows(self, flows):
assert len(flows) == len(self.flows)
for i in range(len(flows)):
self.flows[i] = flows[i]
# get information about the part (file) that exists at pos in original rawML
def getSkelInfo(self, pos):
for [partnum, pdir, filename, start, end, aidtext] in self.partinfo:
if pos >= start and pos < end:
return [partnum, pdir, filename, start, end, aidtext]
return [None, None, None, None, None, None]
# fileno is actually a reference into fragtbl (a fragment)
def getGuideText(self):
guidetext = b""
for [ref_type, ref_title, fileno] in self.guidetbl:
if ref_type == b"thumbimagestandard":
continue
if ref_type not in _guide_types and not ref_type.startswith(b"other."):
if ref_type == b"start":
ref_type = b"text"
else:
ref_type = b"other." + ref_type
[pos, idtext, filenum, seqnm, startpos, length] = self.fragtbl[fileno]
[pn, pdir, filename, skelpos, skelend, aidtext] = self.getSkelInfo(pos)
idtext = self.getIDTag(pos)
linktgt = filename.encode("utf-8")
if idtext != b"":
linktgt += b"#" + idtext
guidetext += (
b'<reference type="'
+ ref_type
+ b'" title="'
+ ref_title
+ b'" href="'
+ utf8_str(pdir)
+ b"/"
+ linktgt
+ b'" />\n'
)
# opf is encoded utf-8 so must convert any titles properly
guidetext = (guidetext.decode(self.mh.codec)).encode("utf-8")
return guidetext
def getPageIDTag(self, pos):
# find the first tag with a named anchor (name or id attribute) before pos
# but page map offsets need a little more leeway, so if the offset points
# into a tag look for the next ending tag "/>" or "</" and start your search from there.
fname, pn, skelpos, skelend = self.getFileInfo(pos)
if pn is None and skelpos is None:
logger.debug("Error: getPageIDTag - no file contains %s" % pos)
return b""
textblock = self.parts[pn]
npos = pos - skelpos
# if npos inside a tag then search all text before next ending tag
pgt = textblock.find(b">", npos)
plt = textblock.find(b"<", npos)
if plt == npos or pgt < plt:
# we are in a tag
# so find first ending tag
pend1 = textblock.find(b"/>", npos)
pend2 = textblock.find(b"</", npos)
if pend1 != -1 and pend2 != -1:
pend = min(pend1, pend2)
else:
pend = max(pend1, pend2)
if pend != -1:
npos = pend
else:
npos = pgt + 1
# find id and name attributes only inside of tags
# use a reverse tag search since that is faster
# inside any < > pair find "id=" and "name=" attributes return it
# [^>]* means match any amount of chars except for '>' char
# [^'"] match any amount of chars except for the quote character
# \s* means match any amount of whitespace
textblock = textblock[0:npos]
id_pattern = re.compile(
br"""<[^>]*\sid\s*=\s*['"]([^'"]*)['"]""", re.IGNORECASE
)
name_pattern = re.compile(
br"""<[^>]*\sname\s*=\s*['"]([^'"]*)['"]""", re.IGNORECASE
)
for tag in reverse_tag_iter(textblock):
# any ids in the body should default to top of file
if tag[0:6] == b"<body ":
return b""
if tag[0:6] != b"<meta ":
m = id_pattern.match(tag) or name_pattern.match(tag)
if m is not None:
return m.group(1)
return b""

mobimaster/mobi/mobi_k8resc.py Executable file

@@ -0,0 +1,290 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
DEBUG_USE_ORDERED_DICTIONARY = False # OrderedDict is supported >= python 2.7.
""" set to True to use OrderedDict for K8RESCProcessor.parsetag.tattr."""
if DEBUG_USE_ORDERED_DICTIONARY:
from collections import OrderedDict as dict_
else:
dict_ = dict
from .compatibility_utils import unicode_str
from loguru import logger
from .mobi_utils import fromBase32
_OPF_PARENT_TAGS = [
"xml",
"package",
"metadata",
"dc-metadata",
"x-metadata",
"manifest",
"spine",
"tours",
"guide",
]
class K8RESCProcessor(object):
def __init__(self, data, debug=False):
self._debug = debug
self.resc = None
self.opos = 0
self.extrameta = []
self.cover_name = None
self.spine_idrefs = {}
self.spine_order = []
self.spine_pageattributes = {}
self.spine_ppd = None
# need3 indicates the book has fields which require epub3,
# but estimating the source epub version from those fields is difficult.
self.need3 = False
self.package_ver = None
self.extra_metadata = []
self.refines_metadata = []
self.extra_attributes = []
# get header
start_pos = data.find(b"<")
self.resc_header = data[:start_pos]
# get resc data length
start = self.resc_header.find(b"=") + 1
end = self.resc_header.find(b"&", start)
resc_size = 0
if end > 0:
resc_size = fromBase32(self.resc_header[start:end])
resc_rawbytes = len(data) - start_pos
if resc_rawbytes == resc_size:
self.resc_length = resc_size
else:
# Most RESC sections have a nul string at the tail, but some do not.
end_pos = data.find(b"\x00", start_pos)
if end_pos < 0:
self.resc_length = resc_rawbytes
else:
self.resc_length = end_pos - start_pos
if self.resc_length != resc_size:
logger.debug(
"Warning: RESC section length({:d}bytes) does not match its size({:d}bytes).".format(
self.resc_length, resc_size
)
)
# now parse RESC after converting it to unicode from utf-8
self.resc = unicode_str(data[start_pos : start_pos + self.resc_length])
self.parseData()
def prepend_to_spine(self, key, idref, linear, properties):
self.spine_order = [key] + self.spine_order
self.spine_idrefs[key] = idref
attributes = {}
if linear is not None:
attributes["linear"] = linear
if properties is not None:
attributes["properties"] = properties
self.spine_pageattributes[key] = attributes
# RESC tag iterator
def resc_tag_iter(self):
tcontent = last_tattr = None
prefix = [""]
while True:
text, tag = self.parseresc()
if text is None and tag is None:
break
if text is not None:
tcontent = text.rstrip(" \r\n")
else: # we have a tag
ttype, tname, tattr = self.parsetag(tag)
if ttype == "begin":
tcontent = None
prefix.append(tname + ".")
if tname in _OPF_PARENT_TAGS:
yield "".join(prefix), tname, tattr, tcontent
else:
last_tattr = tattr
else: # single or end
if ttype == "end":
prefix.pop()
tattr = last_tattr
last_tattr = None
if tname in _OPF_PARENT_TAGS:
tname += "-end"
yield "".join(prefix), tname, tattr, tcontent
tcontent = None
# now parse the RESC to extract spine and extra metadata info
def parseData(self):
for prefix, tname, tattr, tcontent in self.resc_tag_iter():
if self._debug:
logger.debug(
" Parsing RESC: %s %s %s %s" % (prefix, tname, tattr, tcontent)
)
if tname == "package":
self.package_ver = tattr.get("version", "2.0")
package_prefix = tattr.get("prefix", "")
if self.package_ver.startswith("3") or package_prefix.startswith(
"rendition"
):
self.need3 = True
if tname == "spine":
self.spine_ppd = tattr.get("page-progression-direction", None)
if self.spine_ppd is not None and self.spine_ppd == "rtl":
self.need3 = True
if tname == "itemref":
skelid = tattr.pop("skelid", None)
if skelid is None and len(self.spine_order) == 0:
# assume it was the removed initial coverpage
skelid = "coverpage"
tattr["linear"] = "no"
self.spine_order.append(skelid)
idref = tattr.pop("idref", None)
if idref is not None:
idref = "x_" + idref
self.spine_idrefs[skelid] = idref
if "id" in tattr:
del tattr["id"]
# tattr["id"] = 'x_' + tattr["id"]
if "properties" in tattr:
self.need3 = True
self.spine_pageattributes[skelid] = tattr
if tname == "meta" or tname.startswith("dc:"):
if "refines" in tattr or "property" in tattr:
self.need3 = True
if tattr.get("name", "") == "cover":
cover_name = tattr.get("content", None)
if cover_name is not None:
cover_name = "x_" + cover_name
self.cover_name = cover_name
else:
self.extrameta.append([tname, tattr, tcontent])
# parse and return either leading text or the next tag
def parseresc(self):
p = self.opos
if p >= len(self.resc):
return None, None
if self.resc[p] != "<":
res = self.resc.find("<", p)
if res == -1:
res = len(self.resc)
self.opos = res
return self.resc[p:res], None
# handle comment as a special case
if self.resc[p : p + 4] == "<!--":
te = self.resc.find("-->", p + 1)
if te != -1:
te = te + 2
else:
te = self.resc.find(">", p + 1)
ntb = self.resc.find("<", p + 1)
if ntb != -1 and ntb < te:
self.opos = ntb
return self.resc[p:ntb], None
self.opos = te + 1
return None, self.resc[p : te + 1]
# parses tag to identify: [tname, ttype, tattr]
# tname: tag name
# ttype: tag type ('begin', 'end' or 'single');
# tattr: dictionary of tag attributes
def parsetag(self, s):
p = 1
tname = None
ttype = None
tattr = dict_()
while s[p : p + 1] == " ":
p += 1
if s[p : p + 1] == "/":
ttype = "end"
p += 1
while s[p : p + 1] == " ":
p += 1
b = p
while s[p : p + 1] not in (">", "/", " ", '"', "'", "\r", "\n"):
p += 1
tname = s[b:p].lower()
# some special cases
if tname == "?xml":
tname = "xml"
if tname == "!--":
ttype = "single"
comment = s[p:-3].strip()
tattr["comment"] = comment
if ttype is None:
# parse any attributes of begin or single tags
while s.find("=", p) != -1:
while s[p : p + 1] == " ":
p += 1
b = p
while s[p : p + 1] != "=":
p += 1
aname = s[b:p].lower()
aname = aname.rstrip(" ")
p += 1
while s[p : p + 1] == " ":
p += 1
if s[p : p + 1] in ('"', "'"):
p = p + 1
b = p
while s[p : p + 1] not in ('"', "'"):
p += 1
val = s[b:p]
p += 1
else:
b = p
while s[p : p + 1] not in (">", "/", " "):
p += 1
val = s[b:p]
tattr[aname] = val
if ttype is None:
ttype = "begin"
if s.find("/", p) >= 0:
ttype = "single"
return ttype, tname, tattr
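# e.g. (illustrative values) parsetag('<itemref idref="item1" skelid="3"/>')
# returns ("single", "itemref", {"idref": "item1", "skelid": "3"}).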
def taginfo_toxml(self, taginfo):
res = []
tname, tattr, tcontent = taginfo
res.append("<" + tname)
if tattr is not None:
for key in tattr:
res.append(" " + key + '="' + tattr[key] + '"')
if tcontent is not None:
res.append(">" + tcontent + "</" + tname + ">\n")
else:
res.append("/>\n")
return "".join(res)
def hasSpine(self):
return len(self.spine_order) > 0
def needEPUB3(self):
return self.need3
def hasRefines(self):
for [tname, tattr, tcontent] in self.extrameta:
if "refines" in tattr:
return True
return False
def createMetadata(self, epubver):
for taginfo in self.extrameta:
tname, tattr, tcontent = taginfo
if "refines" in tattr:
if epubver == "F" and "property" in tattr:
attr = ' id="%s" opf:%s="%s"\n' % (
tattr["refines"],
tattr["property"],
tcontent,
)
self.extra_attributes.append(attr)
else:
tag = self.taginfo_toxml(taginfo)
self.refines_metadata.append(tag)
else:
tag = self.taginfo_toxml(taginfo)
self.extra_metadata.append(tag)

mobimaster/mobi/mobi_nav.py Executable file

@@ -0,0 +1,202 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
from .compatibility_utils import unicode_str
import os
from .unipath import pathof
from loguru import logger
import re
# note: in python3 re requires the pattern to be the exact same type as the data to be searched,
# so the patterns here must be bytes (b""), not unicode (u"")
DEBUG_NAV = False
FORCE_DEFAULT_TITLE = False
""" Set to True to force to use the default title. """
NAVIGATION_FILENAME = "nav.xhtml"
""" The name for the navigation document. """
DEFAULT_TITLE = "Navigation"
""" The default title for the navigation document. """
class NAVProcessor(object):
def __init__(self, files):
self.files = files
self.navname = NAVIGATION_FILENAME
def buildLandmarks(self, guidetext):
header = ""
header += ' <nav epub:type="landmarks" id="landmarks" hidden="">\n'
header += " <h2>Guide</h2>\n"
header += " <ol>\n"
element = ' <li><a epub:type="{:s}" href="{:s}">{:s}</a></li>\n'
footer = ""
footer += " </ol>\n"
footer += " </nav>\n"
type_map = {
"cover": "cover",
"title-page": "title-page",
# ?: 'frontmatter',
"text": "bodymatter",
# ?: 'backmatter',
"toc": "toc",
"loi": "loi",
"lot": "lot",
"preface": "preface",
"bibliography": "bibliography",
"index": "index",
"glossary": "glossary",
"acknowledgements": "acknowledgements",
"colophon": None,
"copyright-page": None,
"dedication": None,
"epigraph": None,
"foreword": None,
"notes": None,
}
re_type = re.compile(r'\s+type\s*=\s*"(.*?)"', re.I)
re_title = re.compile(r'\s+title\s*=\s*"(.*?)"', re.I)
re_link = re.compile(r'\s+href\s*=\s*"(.*?)"', re.I)
dir_ = os.path.relpath(self.files.k8text, self.files.k8oebps).replace("\\", "/")
data = ""
references = re.findall(r"<reference\s+.*?>", unicode_str(guidetext), re.I)
for reference in references:
mo_type = re_type.search(reference)
mo_title = re_title.search(reference)
mo_link = re_link.search(reference)
if mo_type is not None:
type_ = type_map.get(mo_type.group(1), None)
else:
type_ = None
if mo_title is not None:
title = mo_title.group(1)
else:
title = None
if mo_link is not None:
link = mo_link.group(1)
else:
link = None
if type_ is not None and title is not None and link is not None:
link = os.path.relpath(link, dir_).replace("\\", "/")
data += element.format(type_, link, title)
if len(data) > 0:
return header + data + footer
else:
return ""
def buildTOC(self, indx_data):
header = ""
header += ' <nav epub:type="toc" id="toc">\n'
header += " <h1>Table of contents</h1>\n"
footer = " </nav>\n"
# recursive part
def recursINDX(max_lvl=0, num=0, lvl=0, start=-1, end=-1):
if start > len(indx_data) or end > len(indx_data):
logger.debug(
"Warning (in buildTOC): missing INDX child entries %s %s %s"
% (start, end, len(indx_data))
)
return ""
if DEBUG_NAV:
logger.debug(
"recursINDX (in buildTOC) lvl %d from %d to %d" % (lvl, start, end)
)
xhtml = ""
if start <= 0:
start = 0
if end <= 0:
end = len(indx_data)
if lvl > max_lvl:
max_lvl = lvl
indent1 = " " * (2 + lvl * 2)
indent2 = " " * (3 + lvl * 2)
xhtml += indent1 + "<ol>\n"
for i in range(start, end):
e = indx_data[i]
htmlfile = e["filename"]
desttag = e["idtag"]
text = e["text"]
if e["hlvl"] != lvl:
continue
num += 1
if desttag == "":
link = htmlfile
else:
link = "{:s}#{:s}".format(htmlfile, desttag)
xhtml += indent2 + "<li>"
entry = '<a href="{:}">{:s}</a>'.format(link, text)
xhtml += entry
# recurse
if e["child1"] >= 0:
xhtml += "\n"
xhtmlrec, max_lvl, num = recursINDX(
max_lvl, num, lvl + 1, e["child1"], e["childn"] + 1
)
xhtml += xhtmlrec
xhtml += indent2
# close entry
xhtml += "</li>\n"
xhtml += indent1 + "</ol>\n"
return xhtml, max_lvl, num
data, max_lvl, num = recursINDX()
if len(indx_data) != num:
logger.debug(
"Warning (in buildTOC): different number of entries in NCX %s %s"
% (len(indx_data), num)
)
return header + data + footer
def buildNAV(self, ncx_data, guidetext, title, lang):
logger.debug("Building Navigation Document.")
if FORCE_DEFAULT_TITLE:
title = DEFAULT_TITLE
nav_header = ""
nav_header += '<?xml version="1.0" encoding="utf-8"?>\n<!DOCTYPE html>'
nav_header += '<html xmlns="http://www.w3.org/1999/xhtml"'
nav_header += ' xmlns:epub="http://www.idpf.org/2007/ops"'
nav_header += ' lang="{0:s}" xml:lang="{0:s}">\n'.format(lang)
nav_header += "<head>\n<title>{:s}</title>\n".format(title)
nav_header += '<meta charset="UTF-8" />\n'
nav_header += '<style type="text/css">\n'
nav_header += "nav#landmarks { display:none; }\n"
nav_header += "</style>\n</head>\n<body>\n"
nav_footer = "</body>\n</html>\n"
landmarks = self.buildLandmarks(guidetext)
toc = self.buildTOC(ncx_data)
data = nav_header
data += landmarks
data += toc
data += nav_footer
return data
def getNAVName(self):
return self.navname
def writeNAV(self, ncx_data, guidetext, metadata):
# build the xhtml
# logger.debug("Write Navigation Document.")
xhtml = self.buildNAV(
ncx_data, guidetext, metadata.get("Title")[0], metadata.get("Language")[0]
)
fname = os.path.join(self.files.k8text, self.navname)
with open(pathof(fname), "wb") as f:
f.write(xhtml.encode("utf-8"))

mobimaster/mobi/mobi_ncx.py Executable file

@@ -0,0 +1,315 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
import os
from .unipath import pathof
from loguru import logger
import re
# note: in python3 re requires the pattern to be the exact same type as the data to be searched,
# so the patterns here must be bytes (b""), not unicode (u"")
'''
NCX (Navigation Control for XML applications) is a generalized navigation definition DTD for application
to Digital Talking Books, eBooks, and general web content models.
This DTD is an XML application that layers navigation functionality on top of SMIL 2.0 content.
The NCX defines a navigation path/model that may be applied upon existing publications,
without modification of the existing publication source, so long as the navigation targets within
the source publication can be directly referenced via a URI.
http://www.daisy.org/z3986/2005/ncx-2005-1.dtd
'''
from .mobi_utils import toBase32
from .mobi_index import MobiIndex
DEBUG_NCX = True
class ncxExtract:
def __init__(self, mh, files):
self.mh = mh
self.sect = self.mh.sect
self.files = files
self.isNCX = False
self.mi = MobiIndex(self.sect)
self.ncxidx = self.mh.ncxidx
self.indx_data = None
def parseNCX(self):
indx_data = []
tag_fieldname_map = {
1: ["pos", 0],
2: ["len", 0],
3: ["noffs", 0],
4: ["hlvl", 0],
5: ["koffs", 0],
6: ["pos_fid", 0],
21: ["parent", 0],
22: ["child1", 0],
23: ["childn", 0],
}
if self.ncxidx != 0xFFFFFFFF:
outtbl, ctoc_text = self.mi.getIndexData(self.ncxidx, "NCX")
if DEBUG_NCX:
logger.debug("ctoc_text {}".format(ctoc_text))
logger.debug("outtbl {}".format(outtbl))
num = 0
for [text, tagMap] in outtbl:
tmp = {
"name": text.decode("utf-8"),
"pos": -1,
"len": 0,
"noffs": -1,
"text": "Unknown Text",
"hlvl": -1,
"kind": "Unknown Kind",
"pos_fid": None,
"parent": -1,
"child1": -1,
"childn": -1,
"num": num,
}
for tag in tag_fieldname_map:
[fieldname, i] = tag_fieldname_map[tag]
if tag in tagMap:
fieldvalue = tagMap[tag][i]
if tag == 6:
pos_fid = toBase32(fieldvalue, 4).decode("utf-8")
fieldvalue2 = tagMap[tag][i + 1]
pos_off = toBase32(fieldvalue2, 10).decode("utf-8")
fieldvalue = "kindle:pos:fid:%s:off:%s" % (pos_fid, pos_off)
tmp[fieldname] = fieldvalue
if tag == 3:
toctext = ctoc_text.get(fieldvalue, "Unknown Text")
toctext = toctext.decode(self.mh.codec)
tmp["text"] = toctext
if tag == 5:
kindtext = ctoc_text.get(fieldvalue, "Unknown Kind")
kindtext = kindtext.decode(self.mh.codec)
tmp["kind"] = kindtext
indx_data.append(tmp)
# CGDBG
'''
record number: 3
name: 03
position 461377 length: 465358 => position/150 = real page number
text: 第二章 青铜时代——单机游戏
kind: Unknown Kind
heading level: 0 => level of section
parent: -1 => record number of previous level of section
first child: 15 last child: 26 => range of record number of next level section
pos_fid is kindle:pos:fid:0023:off:0000000000
'''
if DEBUG_NCX:
print("record number: ", num)
print(
"name: ", tmp["name"],
)
print("position", tmp["pos"], " length: ", tmp["len"])
print("text: ", tmp["text"])
print("kind: ", tmp["kind"])
print("heading level: ", tmp["hlvl"])
print("parent:", tmp["parent"])
print(
"first child: ", tmp["child1"], " last child: ", tmp["childn"]
)
print("pos_fid is ", tmp["pos_fid"])
print("\n\n")
num += 1
self.indx_data = indx_data
# {'name': '00', 'pos': 167, 'len': 24798, 'noffs': 0, 'text': '版权信息', 'hlvl': 0, 'kind': 'Unknown Kind', 'pos_fid': None, 'parent': -1, 'child1': -1, 'childn': -1, 'num': 0}
# {'name': '0B', 'pos': 67932, 'len': 3274, 'noffs': 236, 'text': '8.希罗多德', 'hlvl': 0, 'kind': 'Unknown Kind', 'pos_fid': None, 'parent': -1, 'child1': -1, 'childn': -1, 'num': 11}
if DEBUG_NCX:
logger.debug("indx_data {}".format(indx_data))
return indx_data
def buildNCX(self, htmlfile, title, ident, lang):
indx_data = self.indx_data
ncx_header = """<?xml version='1.0' encoding='utf-8'?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1" xml:lang="%s">
<head>
<meta content="%s" name="dtb:uid"/>
<meta content="%d" name="dtb:depth"/>
<meta content="mobiunpack.py" name="dtb:generator"/>
<meta content="0" name="dtb:totalPageCount"/>
<meta content="0" name="dtb:maxPageNumber"/>
</head>
<docTitle>
<text>%s</text>
</docTitle>
<navMap>
"""
ncx_footer = """ </navMap>
</ncx>
"""
ncx_entry = """<navPoint id="%s" playOrder="%d">
<navLabel>
<text>%s</text>
</navLabel>
<content src="%s"/>"""
# recursive part
def recursINDX(max_lvl=0, num=0, lvl=0, start=-1, end=-1):
if start > len(indx_data) or end > len(indx_data):
print("Warning: missing INDX child entries", start, end, len(indx_data))
return ""
if DEBUG_NCX:
logger.debug("recursINDX lvl %d from %d to %d" % (lvl, start, end))
xml = ""
if start <= 0:
start = 0
if end <= 0:
end = len(indx_data)
if lvl > max_lvl:
max_lvl = lvl
indent = " " * (2 + lvl)
for i in range(start, end):
e = indx_data[i]
if not e["hlvl"] == lvl:
continue
# open entry
num += 1
link = "%s#filepos%d" % (htmlfile, e["pos"])
if DEBUG_NCX:
logger.debug("link {}".format(link))
tagid = "np_%d" % num
entry = ncx_entry % (tagid, num, e["text"], link)
entry = re.sub(re.compile("^", re.M), indent, entry, 0)
xml += entry + "\n"
# recurs
if e["child1"] >= 0:
xmlrec, max_lvl, num = recursINDX(
max_lvl, num, lvl + 1, e["child1"], e["childn"] + 1
)
xml += xmlrec
# close entry
xml += indent + "</navPoint>\n"
return xml, max_lvl, num
body, max_lvl, num = recursINDX()
header = ncx_header % (lang, ident, max_lvl + 1, title)
ncx = header + body + ncx_footer
if not len(indx_data) == num:
print("Warning: different number of entries in NCX", len(indx_data), num)
return ncx
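# Sketch of one rendered navPoint (values illustrative, matching the
# ncx_entry template and the "#filepos" link format above):
#   <navPoint id="np_1" playOrder="1">
#     <navLabel>
#       <text>Chapter 1</text>
#     </navLabel>
#     <content src="book1.html#filepos461377"/>
#   </navPoint>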
def writeNCX(self, metadata):
# build the xml
self.isNCX = True
logger.debug("Write ncx")
# htmlname = os.path.basename(self.files.outbase)
# htmlname += '.html'
htmlname = "book1.html"
xml = self.buildNCX(
htmlname,
metadata["Title"][0],
metadata["UniqueID"][0],
metadata.get("Language")[0],
)
# write the ncx file
# ncxname = os.path.join(self.files.mobi7dir, self.files.getInputFileBasename() + '.ncx')
ncxname = os.path.join(self.files.mobi7dir, "toc.ncx")
with open(pathof(ncxname), "wb") as f:
f.write(xml.encode("utf-8"))
def buildK8NCX(self, indx_data, title, ident, lang):
ncx_header = """<?xml version='1.0' encoding='utf-8'?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1" xml:lang="%s">
<head>
<meta content="%s" name="dtb:uid"/>
<meta content="%d" name="dtb:depth"/>
<meta content="mobiunpack.py" name="dtb:generator"/>
<meta content="0" name="dtb:totalPageCount"/>
<meta content="0" name="dtb:maxPageNumber"/>
</head>
<docTitle>
<text>%s</text>
</docTitle>
<navMap>
"""
ncx_footer = """ </navMap>
</ncx>
"""
ncx_entry = """<navPoint id="%s" playOrder="%d">
<navLabel>
<text>%s</text>
</navLabel>
<content src="%s"/>"""
# recursive part
def recursINDX(max_lvl=0, num=0, lvl=0, start=-1, end=-1):
if start > len(indx_data) or end > len(indx_data):
print("Warning: missing INDX child entries", start, end, len(indx_data))
return ""
if DEBUG_NCX:
logger.debug("recursINDX lvl %d from %d to %d" % (lvl, start, end))
xml = ""
if start <= 0:
start = 0
if end <= 0:
end = len(indx_data)
if lvl > max_lvl:
max_lvl = lvl
indent = " " * (2 + lvl)
for i in range(start, end):
e = indx_data[i]
htmlfile = e["filename"]
desttag = e["idtag"]
if not e["hlvl"] == lvl:
continue
# open entry
num += 1
if desttag == "":
link = "Text/%s" % htmlfile
else:
link = "Text/%s#%s" % (htmlfile, desttag)
tagid = "np_%d" % num
entry = ncx_entry % (tagid, num, e["text"], link)
entry = re.sub(re.compile("^", re.M), indent, entry, 0)
xml += entry + "\n"
# recurs
if e["child1"] >= 0:
xmlrec, max_lvl, num = recursINDX(
max_lvl, num, lvl + 1, e["child1"], e["childn"] + 1
)
xml += xmlrec
# close entry
xml += indent + "</navPoint>\n"
return xml, max_lvl, num
body, max_lvl, num = recursINDX()
header = ncx_header % (lang, ident, max_lvl + 1, title)
ncx = header + body + ncx_footer
if not len(indx_data) == num:
print("Warning: different number of entries in NCX", len(indx_data), num)
return ncx
def writeK8NCX(self, ncx_data, metadata):
# build the xml
self.isNCX = True
logger.debug("Write K8 ncx")
xml = self.buildK8NCX(
ncx_data,
metadata["Title"][0],
metadata["UniqueID"][0],
metadata.get("Language")[0],
)
bname = "toc.ncx"
ncxname = os.path.join(self.files.k8oebps, bname)
with open(pathof(ncxname), "wb") as f:
f.write(xml.encode("utf-8"))
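# Hedged end-to-end sketch for this module (mh/files/metadata come from the
# surrounding unpack pipeline; note writeK8NCX expects entries that already
# carry "filename"/"idtag" keys, which are filled in by the KF8 pass, not by
# parseNCX itself):
#   ncx = ncxExtract(mh, files)
#   ncx_data = ncx.parseNCX()           # list of dicts: text/hlvl/child1/...
#   ncx.writeNCX(metadata)              # mobi7: <mobi7dir>/toc.ncx
#   ncx.writeK8NCX(ncx_data, metadata)  # KF8:   <k8oebps>/toc.ncx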

828
mobimaster/mobi/mobi_opf.py Executable file
View File

@@ -0,0 +1,828 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
from .compatibility_utils import unicode_str, unescapeit
from .compatibility_utils import lzip
from loguru import logger
from .unipath import pathof
from xml.sax.saxutils import escape as xmlescape
import os
import uuid
from datetime import datetime
# In EPUB3, NCX and <guide> MAY exist in OPF, although the NCX is superseded
# by the Navigation Document and the <guide> is deprecated. Currently, EPUB3_WITH_NCX
# and EPUB3_WITH_GUIDE are set to True due to compatibility with epub2 reading systems.
# They might be changed to False in the future.
EPUB3_WITH_NCX = True # Do not set to False except for debug.
""" Set to True to create a toc.ncx when converting to epub3. """
EPUB3_WITH_GUIDE = True # Do not set to False except for debug.
""" Set to True to create a guide element in an opf when converting to epub3. """
EPUB_OPF = "content.opf"
""" The name for the OPF of EPUB. """
TOC_NCX = "toc.ncx"
""" The name for the TOC of EPUB2. """
NAVIGATION_DOCUMENT = "nav.xhtml"
""" The name for the navigation document of EPUB3. """
BEGIN_INFO_ONLY = "<!-- BEGIN INFORMATION ONLY "
""" The comment to indicate the beginning of metadata which will be ignored by kindlegen. """
END_INFO_ONLY = "END INFORMATION ONLY -->"
""" The comment to indicate the end of metadata which will be ignored by kindlegen. """
EXTH_TITLE_FURIGANA = "Title-Pronunciation"
""" The name for Title Furigana (similar to file-as) set by KDP. """
EXTH_CREATOR_FURIGANA = "Author-Pronunciation"
""" The name for Creator Furigana (similar to file-as) set by KDP. """
EXTH_PUBLISHER_FURIGANA = "Publisher-Pronunciation"
""" The name for Publisher Furigana (similar to file-as) set by KDP. """
EXTRA_ENTITIES = {'"': "&quot;", "'": "&apos;"}
class OPFProcessor(object):
def __init__(
self,
files,
metadata,
fileinfo,
rscnames,
hasNCX,
mh,
usedmap,
pagemapxml="",
guidetext="",
k8resc=None,
epubver="2",
):
self.files = files
self.metadata = metadata
self.fileinfo = fileinfo
self.rscnames = rscnames
self.has_ncx = hasNCX
self.codec = mh.codec
self.isK8 = mh.isK8()
self.printReplica = mh.isPrintReplica()
self.guidetext = unicode_str(guidetext)
self.used = usedmap
self.k8resc = k8resc
self.covername = None
self.cover_id = "cover_img"
if self.k8resc is not None and self.k8resc.cover_name is not None:
# update cover id info from RESC if available
self.cover_id = self.k8resc.cover_name
# Create a unique urn uuid
self.BookId = unicode_str(str(uuid.uuid4()))
self.pagemap = pagemapxml
self.ncxname = None
self.navname = None
# page-progression-direction is only set in spine
self.page_progression_direction = metadata.pop(
"page-progression-direction", [None]
)[0]
if "rl" in metadata.get("primary-writing-mode", [""])[0]:
self.page_progression_direction = "rtl"
self.epubver = epubver # the epub version set by user
self.target_epubver = (
epubver # the epub version set by the user or detected automatically
)
if self.epubver == "A":
self.target_epubver = self.autodetectEPUBVersion()
elif self.epubver == "F":
self.target_epubver = "2"
elif self.epubver != "2" and self.epubver != "3":
self.target_epubver = "2"
# ids for refines attributes
self.title_id = {}
self.creator_id = {}
self.publisher_id = {}
# extra attributes
self.title_attrib = {}
self.creator_attrib = {}
self.publisher_attrib = {}
self.extra_attributes = [] # for force epub2 option
# Create epub3 metadata from EXTH.
self.exth_solved_refines_metadata = []
self.exth_refines_metadata = []
self.exth_fixedlayout_metadata = []
self.defineRefinesID()
self.processRefinesMetadata()
if self.k8resc is not None:
# Create metadata in RESC section.
self.k8resc.createMetadata(epubver)
if self.target_epubver == "3":
self.createMetadataForFixedlayout()
def escapeit(self, sval, EXTRAS=None):
# note, xmlescape and unescape do not work with utf-8 bytestrings
sval = unicode_str(sval)
if EXTRAS:
res = xmlescape(unescapeit(sval), EXTRAS)
else:
res = xmlescape(unescapeit(sval))
return res
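# Doctest-style example (assuming unescapeit is a no-op on plain text):
#   >>> self.escapeit('Tom & Jerry\'s "Book"', EXTRA_ENTITIES)
#   'Tom &amp; Jerry&apos;s &quot;Book&quot;'
# Without EXTRAS only the default &, <, > are escaped.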
def createMetaTag(self, data, property, content, refid=""):
refines = ""
if refid:
refines = ' refines="#%s"' % refid
data.append('<meta property="%s"%s>%s</meta>\n' % (property, refines, content))
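# Illustrative output: createMetaTag(data, "dcterms:modified",
# "2021-08-25T09:58:31Z") appends
#   <meta property="dcterms:modified">2021-08-25T09:58:31Z</meta>
# and a non-empty refid, e.g. "title01", adds refines="#title01".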
def buildOPFMetadata(self, start_tag, has_obfuscated_fonts=False):
# convert from EXTH metadata format to target epub version metadata
# epub 3 will ignore <meta name="xxxx" content="yyyy" /> style metatags
# but allows them to be present for backwards compatibility
# instead the new format is
# <meta property="xxxx" id="iiii" ... > property_value</meta>
# and DCMES elements such as:
# <dc:blah id="iiii">value</dc:blah>
metadata = self.metadata
k8resc = self.k8resc
META_TAGS = [
"Drm Server Id",
"Drm Commerce Id",
"Drm Ebookbase Book Id",
"ASIN",
"ThumbOffset",
"Fake Cover",
"Creator Software",
"Creator Major Version",
"Creator Minor Version",
"Creator Build Number",
"Watermark",
"Clipping Limit",
"Publisher Limit",
"Text to Speech Disabled",
"CDE Type",
"Updated Title",
"Font Signature (hex)",
"Tamper Proof Keys (hex)",
]
# def handleTag(data, metadata, key, tag, ids={}):
def handleTag(data, metadata, key, tag, attrib={}):
"""Format metadata values.
@param data: List of formatted metadata entries.
@param metadata: The metadata dictionary.
@param key: The key of the metadata value to handle.
@param tag: The opf tag corresponds to the metadata value.
###@param ids: The ids in tags for refines property of epub3.
@param attrib: The extra attributes for refines or opf prefixes.
"""
if key in metadata:
for i, value in enumerate(metadata[key]):
closingTag = tag.split(" ")[0]
res = "<%s%s>%s</%s>\n" % (
tag,
attrib.get(i, ""),
self.escapeit(value),
closingTag,
)
data.append(res)
del metadata[key]
# these are allowed but ignored by epub3
def handleMetaPairs(data, metadata, key, name):
if key in metadata:
for value in metadata[key]:
res = '<meta name="%s" content="%s" />\n' % (
name,
self.escapeit(value, EXTRA_ENTITIES),
)
data.append(res)
del metadata[key]
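# Illustrative results of the two helpers above (metadata values assumed):
#   handleTag(data, metadata, "Creator", "dc:creator", {0: ' id="creator01"'})
#     appends '<dc:creator id="creator01">Some Author</dc:creator>\n'
#   handleMetaPairs(data, metadata, "Codec", "output encoding")
#     appends '<meta name="output encoding" content="utf-8" />\n'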
data = []
data.append(start_tag + "\n")
# Handle standard metadata
if "Title" in metadata:
handleTag(data, metadata, "Title", "dc:title", self.title_attrib)
else:
data.append("<dc:title>Untitled</dc:title>\n")
handleTag(data, metadata, "Language", "dc:language")
if "UniqueID" in metadata:
handleTag(data, metadata, "UniqueID", 'dc:identifier id="uid"')
else:
# No unique ID in original, give it a generic one.
data.append('<dc:identifier id="uid">0</dc:identifier>\n')
if self.target_epubver == "3":
# epub version 3 minimal metadata requires a dcterms:modified date tag
self.createMetaTag(
data,
"dcterms:modified",
datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%SZ"),
)
if self.isK8 and has_obfuscated_fonts:
# Use the randomly generated urn:uuid so obfuscated fonts work.
# It doesn't need to be _THE_ unique identifier to work as a key
# for obfuscated fonts in Sigil, ADE and calibre. It just has
# to use the opf:scheme="UUID" and have the urn:uuid: prefix.
if self.target_epubver == "3":
data.append(
"<dc:identifier>urn:uuid:" + self.BookId + "</dc:identifier>\n"
)
else:
data.append(
'<dc:identifier opf:scheme="UUID">urn:uuid:'
+ self.BookId
+ "</dc:identifier>\n"
)
handleTag(data, metadata, "Creator", "dc:creator", self.creator_attrib)
handleTag(data, metadata, "Contributor", "dc:contributor")
handleTag(data, metadata, "Publisher", "dc:publisher", self.publisher_attrib)
handleTag(data, metadata, "Source", "dc:source")
handleTag(data, metadata, "Type", "dc:type")
if self.target_epubver == "3":
if "ISBN" in metadata:
for i, value in enumerate(metadata["ISBN"]):
res = (
"<dc:identifier>urn:isbn:%s</dc:identifier>\n"
% self.escapeit(value)
)
data.append(res)
else:
handleTag(data, metadata, "ISBN", 'dc:identifier opf:scheme="ISBN"')
if "Subject" in metadata:
if "SubjectCode" in metadata:
codeList = metadata["SubjectCode"]
del metadata["SubjectCode"]
else:
codeList = None
for i in range(len(metadata["Subject"])):
if codeList and i < len(codeList):
data.append('<dc:subject BASICCode="' + codeList[i] + '">')
else:
data.append("<dc:subject>")
data.append(self.escapeit(metadata["Subject"][i]) + "</dc:subject>\n")
del metadata["Subject"]
handleTag(data, metadata, "Description", "dc:description")
if self.target_epubver == "3":
if "Published" in metadata:
for i, value in enumerate(metadata["Published"]):
res = "<dc:date>%s</dc:date>\n" % self.escapeit(value)
data.append(res)
else:
handleTag(data, metadata, "Published", 'dc:date opf:event="publication"')
handleTag(data, metadata, "Rights", "dc:rights")
if self.epubver == "F":
if self.extra_attributes or k8resc is not None and k8resc.extra_attributes:
data.append(
"<!-- THE FOLLOWINGS ARE REQUIRED TO INSERT INTO <dc:xxx> MANUALLY\n"
)
if self.extra_attributes:
data += self.extra_attributes
if k8resc is not None and k8resc.extra_attributes:
data += k8resc.extra_attributes
data.append("-->\n")
else:
# Append refines metadata.
if self.exth_solved_refines_metadata:
data.append("<!-- Refines MetaData from EXTH -->\n")
data += self.exth_solved_refines_metadata
if (
self.exth_refines_metadata
or k8resc is not None
and k8resc.refines_metadata
):
data.append("<!-- THE FOLLOWINGS ARE REQUIRED TO EDIT IDS MANUALLY\n")
if self.exth_refines_metadata:
data += self.exth_refines_metadata
if k8resc is not None and k8resc.refines_metadata:
data += k8resc.refines_metadata
data.append("-->\n")
# Append metadata in RESC section.
if k8resc is not None and k8resc.extra_metadata:
data.append("<!-- Extra MetaData from RESC\n")
data += k8resc.extra_metadata
data.append("-->\n")
if "CoverOffset" in metadata:
imageNumber = int(metadata["CoverOffset"][0])
self.covername = self.rscnames[imageNumber]
if self.covername is None:
logger.debug(
"Error: Cover image %s was not recognized as a valid image"
% imageNumber
)
else:
# <meta name="cover"> is obsoleted in EPUB3, but kindlegen v2.9 requires it.
data.append('<meta name="cover" content="' + self.cover_id + '" />\n')
self.used[self.covername] = "used"
del metadata["CoverOffset"]
handleMetaPairs(data, metadata, "Codec", "output encoding")
# handle kindlegen specifc tags
handleTag(data, metadata, "DictInLanguage", "DictionaryInLanguage")
handleTag(data, metadata, "DictOutLanguage", "DictionaryOutLanguage")
handleMetaPairs(data, metadata, "RegionMagnification", "RegionMagnification")
handleMetaPairs(data, metadata, "book-type", "book-type")
handleMetaPairs(data, metadata, "zero-gutter", "zero-gutter")
handleMetaPairs(data, metadata, "zero-margin", "zero-margin")
handleMetaPairs(data, metadata, "primary-writing-mode", "primary-writing-mode")
handleMetaPairs(data, metadata, "fixed-layout", "fixed-layout")
handleMetaPairs(data, metadata, "orientation-lock", "orientation-lock")
handleMetaPairs(data, metadata, "original-resolution", "original-resolution")
# these are not allowed in epub2 or 3 so convert them to meta name content pairs
# perhaps these could better be mapped into the dcterms namespace instead
handleMetaPairs(data, metadata, "Review", "review")
handleMetaPairs(data, metadata, "Imprint", "imprint")
handleMetaPairs(data, metadata, "Adult", "adult")
handleMetaPairs(data, metadata, "DictShortName", "DictionaryVeryShortName")
# these are needed by kobo books upon submission but not sure if legal metadata in epub2 or epub3
if "Price" in metadata and "Currency" in metadata:
priceList = metadata["Price"]
currencyList = metadata["Currency"]
if len(priceList) != len(currencyList):
logger.debug("Error: found %s price entries, but %s currency entries.")
else:
for i in range(len(priceList)):
data.append(
'<SRP Currency="'
+ currencyList[i]
+ '">'
+ priceList[i]
+ "</SRP>\n"
)
del metadata["Price"]
del metadata["Currency"]
if self.target_epubver == "3":
# Append metadata for EPUB3.
if self.exth_fixedlayout_metadata:
data.append("<!-- EPUB3 MedaData converted from EXTH -->\n")
data += self.exth_fixedlayout_metadata
# all that remains is extra EXTH info we will store inside a comment inside meta name/content pairs
# so it can not impact anything and will be automatically stripped out if found again in a RESC section
data.append(BEGIN_INFO_ONLY + "\n")
if "ThumbOffset" in metadata:
imageNumber = int(metadata["ThumbOffset"][0])
imageName = self.rscnames[imageNumber]
if imageName is None:
logger.debug(
"Error: Cover Thumbnail image %s was not recognized as a valid image"
% imageNumber
)
else:
data.append(
'<meta name="Cover ThumbNail Image" content="'
+ "Images/"
+ imageName
+ '" />\n'
)
# self.used[imageName] = 'used' # thumbnail image is always generated by Kindlegen, so don't include in manifest
self.used[imageName] = "not used"
del metadata["ThumbOffset"]
for metaName in META_TAGS:
if metaName in metadata:
for value in metadata[metaName]:
data.append(
'<meta name="'
+ metaName
+ '" content="'
+ self.escapeit(value, EXTRA_ENTITIES)
+ '" />\n'
)
del metadata[metaName]
for key in list(metadata.keys()):
for value in metadata[key]:
data.append(
'<meta name="'
+ key
+ '" content="'
+ self.escapeit(value, EXTRA_ENTITIES)
+ '" />\n'
)
del metadata[key]
data.append(END_INFO_ONLY + "\n")
data.append("</metadata>\n")
return data
def buildOPFManifest(self, ncxname, navname=None):
# buildManifest for mobi7, azw4, epub2 and epub3.
k8resc = self.k8resc
cover_id = self.cover_id
hasK8RescSpine = k8resc is not None and k8resc.hasSpine()
self.ncxname = ncxname
self.navname = navname
data = []
data.append("<manifest>\n")
media_map = {
".jpg": "image/jpeg",
".jpeg": "image/jpeg",
".png": "image/png",
".gif": "image/gif",
".svg": "image/svg+xml",
".xhtml": "application/xhtml+xml",
".html": "text/html", # for mobi7
".pdf": "application/pdf", # for azw4(print replica textbook)
".ttf": "application/x-font-ttf",
".otf": "application/x-font-opentype", # replaced?
".css": "text/css",
# '.html' : 'text/x-oeb1-document', # for mobi7
# '.otf' : 'application/vnd.ms-opentype', # [OpenType] OpenType fonts
# '.woff' : 'application/font-woff', # [WOFF] WOFF fonts
# '.smil' : 'application/smil+xml', # [MediaOverlays301] EPUB Media Overlay documents
# '.pls' : 'application/pls+xml', # [PLS] Text-to-Speech (TTS) Pronunciation lexicons
# '.mp3' : 'audio/mpeg',
# '.mp4' : 'video/mp4',
# '.js' : 'text/javascript', # not supported in K8
}
spinerefs = []
idcnt = 0
for [key, dir, fname] in self.fileinfo:
name, ext = os.path.splitext(fname)
ext = ext.lower()
media = media_map.get(ext)
ref = "item%d" % idcnt
if hasK8RescSpine:
if key is not None and key in k8resc.spine_idrefs:
ref = k8resc.spine_idrefs[key]
properties = ""
if dir != "":
fpath = dir + "/" + fname
else:
fpath = fname
data.append(
'<item id="{0:}" media-type="{1:}" href="{2:}" {3:}/>\n'.format(
ref, media, fpath, properties
)
)
if ext in [".xhtml", ".html"]:
spinerefs.append(ref)
idcnt += 1
for fname in self.rscnames:
if fname is not None:
if self.used.get(fname, "not used") == "not used":
continue
name, ext = os.path.splitext(fname)
ext = ext.lower()
media = media_map.get(ext, ext[1:])
properties = ""
if fname == self.covername:
ref = cover_id
if self.target_epubver == "3":
properties = 'properties="cover-image"'
else:
ref = "item%d" % idcnt
if ext == ".ttf" or ext == ".otf":
if self.isK8: # fonts are only used in Mobi 8
fpath = "Fonts/" + fname
data.append(
'<item id="{0:}" media-type="{1:}" href="{2:}" {3:}/>\n'.format(
ref, media, fpath, properties
)
)
else:
fpath = "Images/" + fname
data.append(
'<item id="{0:}" media-type="{1:}" href="{2:}" {3:}/>\n'.format(
ref, media, fpath, properties
)
)
idcnt += 1
if self.target_epubver == "3" and navname is not None:
data.append(
'<item id="nav" media-type="application/xhtml+xml" href="Text/'
+ navname
+ '" properties="nav"/>\n'
)
if self.has_ncx and ncxname is not None:
data.append(
'<item id="ncx" media-type="application/x-dtbncx+xml" href="'
+ ncxname
+ '" />\n'
)
if self.pagemap != "":
data.append(
'<item id="map" media-type="application/oebs-page-map+xml" href="page-map.xml" />\n'
)
data.append("</manifest>\n")
return [data, spinerefs]
def buildOPFSpine(self, spinerefs, isNCX):
# build spine
k8resc = self.k8resc
hasK8RescSpine = k8resc is not None and k8resc.hasSpine()
data = []
ppd = ""
if self.isK8 and self.page_progression_direction is not None:
ppd = ' page-progression-direction="{:s}"'.format(
self.page_progression_direction
)
ncx = ""
if isNCX:
ncx = ' toc="ncx"'
map = ""
if self.pagemap != "":
map = ' page-map="map"'
if self.epubver == "F":
if ppd:
ppd = "<!--" + ppd + " -->"
spine_start_tag = "<spine{1:s}{2:s}>{0:s}\n".format(ppd, map, ncx)
else:
spine_start_tag = "<spine{0:s}{1:s}{2:s}>\n".format(ppd, map, ncx)
data.append(spine_start_tag)
if hasK8RescSpine:
for key in k8resc.spine_order:
idref = k8resc.spine_idrefs[key]
attribs = k8resc.spine_pageattributes[key]
tag = '<itemref idref="%s"' % idref
for aname, val in list(attribs.items()):
if self.epubver == "F" and aname == "properties":
continue
if val is not None:
tag += ' %s="%s"' % (aname, val)
tag += "/>"
if self.epubver == "F" and "properties" in attribs:
val = attribs["properties"]
if val is not None:
tag += '<!-- properties="%s" -->' % val
tag += "\n"
data.append(tag)
else:
start = 0
# special case the created coverpage if need be
[key, dir, fname] = self.fileinfo[0]
if key is not None and key == "coverpage":
entry = spinerefs[start]
data.append('<itemref idref="%s" linear="no"/>\n' % entry)
start += 1
for entry in spinerefs[start:]:
data.append('<itemref idref="' + entry + '"/>\n')
data.append("</spine>\n")
return data
def buildMobi7OPF(self):
# Build an OPF for mobi7 and azw4.
logger.debug("Building an opf for mobi7/azw4.")
data = []
data.append('<?xml version="1.0" encoding="utf-8"?>\n')
data.append(
'<package version="2.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="uid">\n'
)
metadata_tag = '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">'
opf_metadata = self.buildOPFMetadata(metadata_tag)
data += opf_metadata
if self.has_ncx:
# ncxname = self.files.getInputFileBasename() + '.ncx'
ncxname = "toc.ncx"
else:
ncxname = None
[opf_manifest, spinerefs] = self.buildOPFManifest(ncxname)
data += opf_manifest
opf_spine = self.buildOPFSpine(spinerefs, self.has_ncx)
data += opf_spine
data.append("<tours>\n</tours>\n")
if not self.printReplica:
guide = "<guide>\n" + self.guidetext + "</guide>\n"
data.append(guide)
data.append("</package>\n")
return "".join(data)
def buildEPUBOPF(self, has_obfuscated_fonts=False):
logger.debug(
"Building an opf for mobi8 using epub version: %s" % self.target_epubver
)
if self.target_epubver == "2":
has_ncx = self.has_ncx
has_guide = True
ncxname = TOC_NCX
navname = None
package = '<package version="2.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="uid">\n'
tours = "<tours>\n</tours>\n"
metadata_tag = '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">'
else:
has_ncx = EPUB3_WITH_NCX
has_guide = EPUB3_WITH_GUIDE
ncxname = None
if has_ncx:
ncxname = TOC_NCX
navname = NAVIGATION_DOCUMENT
package = '<package version="3.0" xmlns="http://www.idpf.org/2007/opf" prefix="rendition: http://www.idpf.org/vocab/rendition/#" unique-identifier="uid">\n'
tours = ""
metadata_tag = '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">'
data = []
data.append('<?xml version="1.0" encoding="utf-8"?>\n')
data.append(package)
opf_metadata = self.buildOPFMetadata(metadata_tag, has_obfuscated_fonts)
data += opf_metadata
[opf_manifest, spinerefs] = self.buildOPFManifest(ncxname, navname)
data += opf_manifest
opf_spine = self.buildOPFSpine(spinerefs, has_ncx)
data += opf_spine
data.append(tours)
if has_guide:
guide = "<guide>\n" + self.guidetext + "</guide>\n"
data.append(guide)
data.append("</package>\n")
return "".join(data)
def writeOPF(self, has_obfuscated_fonts=False):
if self.isK8:
data = self.buildEPUBOPF(has_obfuscated_fonts)
outopf = os.path.join(self.files.k8oebps, EPUB_OPF)
with open(pathof(outopf), "wb") as f:
f.write(data.encode("utf-8"))
return self.BookId
else:
data = self.buildMobi7OPF()
outopf = os.path.join(self.files.mobi7dir, "content.opf")
with open(pathof(outopf), "wb") as f:
f.write(data.encode("utf-8"))
return 0
def getBookId(self):
return self.BookId
def getNCXName(self):
return self.ncxname
def getNAVName(self):
return self.navname
def getEPUBVersion(self):
return self.target_epubver
def hasNCX(self):
return self.ncxname is not None and self.has_ncx
def hasNAV(self):
return self.navname is not None
def autodetectEPUBVersion(self):
# Determine EPUB version from metadata and RESC.
metadata = self.metadata
k8resc = self.k8resc
epubver = "2"
if "true" == metadata.get("fixed-layout", [""])[0].lower():
epubver = "3"
elif metadata.get("orientation-lock", [""])[0].lower() in [
"portrait",
"landscape",
]:
epubver = "3"
elif self.page_progression_direction == "rtl":
epubver = "3"
elif EXTH_TITLE_FURIGANA in metadata:
epubver = "3"
elif EXTH_CREATOR_FURIGANA in metadata:
epubver = "3"
elif EXTH_PUBLISHER_FURIGANA in metadata:
epubver = "3"
elif k8resc is not None and k8resc.needEPUB3():
epubver = "3"
return epubver
def defineRefinesID(self):
# the following EXTH are set by KDP.
# 'Title_Furigana_(508)'
# 'Creator_Furigana_(517)',
# 'Publisher_Furigana_(522)'
# It is difficult to find correspondence between Title, Creator, Publisher
# and EXTH 508, 517, 522 if they have multiple values, since KDP does not seem to preserve the order of EXTH 508, 517 and 522.
# It is also difficult to find correspondence between them and tags which have refine attributes in RESC.
# So editing manually is required.
metadata = self.metadata
needRefinesId = False
if self.k8resc is not None:
needRefinesId = self.k8resc.hasRefines()
# Create ids for refines attributes
if (needRefinesId or EXTH_TITLE_FURIGANA in metadata) and "Title" in metadata:
for i in range(len(metadata.get("Title"))):
self.title_id[i] = "title%02d" % (i + 1)
if (
needRefinesId or EXTH_CREATOR_FURIGANA in metadata
) and "Creator" in metadata:
for i in range(len(metadata.get("Creator"))):
self.creator_id[i] = "creator%02d" % (i + 1)
if (
needRefinesId or EXTH_PUBLISHER_FURIGANA in metadata
) and "Publisher" in metadata:
for i in range(len(metadata.get("Publisher"))):
self.publisher_id[i] = "publisher%02d" % (i + 1)
def processRefinesMetadata(self):
# create refines metadata defined in epub3 or convert the refines property to opf: attributes for epub2.
metadata = self.metadata
refines_list = [
[EXTH_TITLE_FURIGANA, self.title_id, self.title_attrib, "title00"],
[EXTH_CREATOR_FURIGANA, self.creator_id, self.creator_attrib, "creator00"],
[
EXTH_PUBLISHER_FURIGANA,
self.publisher_id,
self.publisher_attrib,
"publisher00",
],
]
create_refines_metadata = False
for EXTH in lzip(*refines_list)[0]:
if EXTH in metadata:
create_refines_metadata = True
break
if create_refines_metadata:
for [EXTH, id, attrib, defaultid] in refines_list:
if self.target_epubver == "3":
for i, value in list(id.items()):
attrib[i] = ' id="%s"' % value
if EXTH in metadata:
if len(metadata[EXTH]) == 1 and len(id) == 1:
self.createMetaTag(
self.exth_solved_refines_metadata,
"file-as",
metadata[EXTH][0],
id[0],
)
else:
for i, value in enumerate(metadata[EXTH]):
self.createMetaTag(
self.exth_refines_metadata,
"file-as",
value,
id.get(i, defaultid),
)
else:
if EXTH in metadata:
if len(metadata[EXTH]) == 1 and len(id) == 1:
attr = ' opf:file-as="%s"' % metadata[EXTH][0]
attrib[0] = attr
else:
for i, value in enumerate(metadata[EXTH]):
attr = ' id="#%s" opf:file-as="%s"\n' % (
id.get(i, defaultid),
value,
)
self.extra_attributes.append(attr)
def createMetadataForFixedlayout(self):
# convert fixed layout to epub3 format if needed.
metadata = self.metadata
if "fixed-layout" in metadata:
fixedlayout = metadata["fixed-layout"][0]
content = {"true": "pre-paginated"}.get(fixedlayout.lower(), "reflowable")
self.createMetaTag(
self.exth_fixedlayout_metadata, "rendition:layout", content
)
if "orientation-lock" in metadata:
content = metadata["orientation-lock"][0].lower()
if content == "portrait" or content == "landscape":
self.createMetaTag(
self.exth_fixedlayout_metadata, "rendition:orientation", content
)
# according to epub3 spec about correspondence with Amazon
# if 'original-resolution' is provided it needs to be converted to
# meta viewport property tag stored in the <head></head> of **each**
# xhtml page - so this tag would need to be handled by editing each part
# before reaching this routine
# we need to add support for this to the k8html routine
# if 'original-resolution' in metadata.keys():
# resolution = metadata['original-resolution'][0].lower()
# width, height = resolution.split('x')
# if width.isdigit() and int(width) > 0 and height.isdigit() and int(height) > 0:
# viewport = 'width=%s, height=%s' % (width, height)
# self.createMetaTag(self.exth_fixedlayout_metadata, 'rendition:viewport', viewport)

185
mobimaster/mobi/mobi_pagemap.py Executable file
View File

@@ -0,0 +1,185 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
from .compatibility_utils import PY2, unicode_str
from loguru import logger
if PY2:
range = xrange
import struct
# note: struct pack, unpack, unpack_from all require bytestring format
# data all the way up to at least python 2.7.5, python 3 okay with bytestring
import re
# note: re requires the pattern to be the exact same type as the data to be searched in python3,
# so b"" patterns must be used when searching b"" data (u"" patterns will not match)
_TABLE = [
("m", 1000),
("cm", 900),
("d", 500),
("cd", 400),
("c", 100),
("xc", 90),
("l", 50),
("xl", 40),
("x", 10),
("ix", 9),
("v", 5),
("iv", 4),
("i", 1),
]
def int_to_roman(i):
parts = []
num = i
for letter, value in _TABLE:
while value <= num:
num -= value
parts.append(letter)
return "".join(parts)
def roman_to_int(s):
result = 0
rnstr = s
for letter, value in _TABLE:
while rnstr.startswith(letter):
result += value
rnstr = rnstr[len(letter) :]
return result
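# Doctest-style examples (lowercase numerals, per _TABLE):
#   >>> int_to_roman(1994)
#   'mcmxciv'
#   >>> roman_to_int('xlii')
#   42
# The two functions are inverses for well-formed numerals.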
_pattern = r"""\(([^\)]*)\)"""
_tup_pattern = re.compile(_pattern, re.IGNORECASE)
def _parseNames(numpages, data):
data = unicode_str(data)
pagenames = []
pageMap = ""
for i in range(numpages):
pagenames.append(None)
for m in re.finditer(_tup_pattern, data):
tup = m.group(1)
if pageMap != "":
pageMap += ","
pageMap += "(" + tup + ")"
spos, nametype, svalue = tup.split(",")
# print(spos, nametype, svalue)
if nametype == "a" or nametype == "r":
svalue = int(svalue)
spos = int(spos)
for i in range(spos - 1, numpages):
if nametype == "r":
pname = int_to_roman(svalue)
svalue += 1
elif nametype == "a":
pname = "%s" % svalue
svalue += 1
elif nametype == "c":
sp = svalue.find("|")
if sp == -1:
pname = svalue
else:
pname = svalue[0:sp]
svalue = svalue[sp + 1 :]
else:
logger.debug("Error: unknown page numbering type %s" % nametype)
pname = None
pagenames[i] = pname
return pagenames, pageMap
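# Illustrative pagemap string (values assumed): with numpages=6 and
#   data = "(1,r,1),(4,a,1)"
# pages 1-3 are named 'i','ii','iii' (roman) and pages 4-6 '1','2','3'
# (arabic); a "c" tuple such as (1,c,cover|toc) assigns literal names
# split on '|', repeating the last name for any remaining pages.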
class PageMapProcessor:
def __init__(self, mh, data):
self.mh = mh
self.data = data
self.pagenames = []
self.pageoffsets = []
self.pageMap = ""
self.pm_len = 0
self.pm_nn = 0
self.pm_bits = 0
self.pmoff = None
self.pmstr = ""
logger.debug("Extracting Page Map Information")
(rev_len,) = struct.unpack_from(b">L", self.data, 0x10)
# skip over header, revision string length data, and revision string
ptr = 0x14 + rev_len
pm_1, self.pm_len, self.pm_nn, self.pm_bits = struct.unpack_from(
b">4H", self.data, ptr
)
# print(pm_1, self.pm_len, self.pm_nn, self.pm_bits)
self.pmstr = self.data[ptr + 8 : ptr + 8 + self.pm_len]
self.pmoff = self.data[ptr + 8 + self.pm_len :]
offsize = b">L"
offwidth = 4
if self.pm_bits == 16:
offsize = b">H"
offwidth = 2
ptr = 0
for i in range(self.pm_nn):
(od,) = struct.unpack_from(offsize, self.pmoff, ptr)
ptr += offwidth
self.pageoffsets.append(od)
self.pagenames, self.pageMap = _parseNames(self.pm_nn, self.pmstr)
def getPageMap(self):
return self.pageMap
def getNames(self):
return self.pagenames
def getOffsets(self):
return self.pageoffsets
# page-map.xml will be unicode but encoded to utf-8 immediately before being written to a file
def generateKF8PageMapXML(self, k8proc):
pagemapxml = '<page-map xmlns="http://www.idpf.org/2007/opf">\n'
for i in range(len(self.pagenames)):
pos = self.pageoffsets[i]
name = self.pagenames[i]
if name is not None and name != "":
[pn, dir, filename, skelpos, skelend, aidtext] = k8proc.getSkelInfo(pos)
idtext = unicode_str(k8proc.getPageIDTag(pos))
linktgt = unicode_str(filename)
if idtext != "":
linktgt += "#" + idtext
pagemapxml += '<page name="%s" href="%s/%s" />\n' % (name, dir, linktgt)
pagemapxml += "</page-map>\n"
return pagemapxml
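# Sample output line (values illustrative):
#   <page name="12" href="Text/part0004.xhtml#aid-3Q28" />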
def generateAPNX(self, apnx_meta):
if apnx_meta["format"] == "MOBI_8":
content_header = (
'{"contentGuid":"%(contentGuid)s","asin":"%(asin)s","cdeType":"%(cdeType)s","format":"%(format)s","fileRevisionId":"1","acr":"%(acr)s"}'
% apnx_meta
)
else:
content_header = (
'{"contentGuid":"%(contentGuid)s","asin":"%(asin)s","cdeType":"%(cdeType)s","fileRevisionId":"1"}'
% apnx_meta
)
content_header = content_header.encode("utf-8")
page_header = '{"asin":"%(asin)s","pageMap":"%(pageMap)s"}' % apnx_meta
page_header = page_header.encode("utf-8")
apnx = struct.pack(b">H", 1) + struct.pack(b">H", 1)
apnx += struct.pack(b">I", 12 + len(content_header))
apnx += struct.pack(b">I", len(content_header))
apnx += content_header
apnx += struct.pack(b">H", 1)
apnx += struct.pack(b">H", len(page_header))
apnx += struct.pack(b">H", self.pm_nn)
apnx += struct.pack(b">H", 32)
apnx += page_header
for page in self.pageoffsets:
apnx += struct.pack(b">L", page)
return apnx
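# APNX layout produced above: two ">H" words (1, 1), a ">I" offset just past
# the content header (12 + its length), a ">I" content-header length, the
# JSON content header, four ">H" words (1, page-header length, page count,
# 32), the JSON page header, then one ">L" file offset per page.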

204
mobimaster/mobi/mobi_sectioner.py Executable file
View File

@@ -0,0 +1,204 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
from .compatibility_utils import PY2, hexlify, bstr, bord, bchar
from loguru import logger
import datetime
if PY2:
range = xrange
# note: struct pack, unpack, unpack_from all require bytestring format
# data all the way up to at least python 2.7.5, python 3 okay with bytestring
import struct
from .unipath import pathof
DUMP = False
""" Set to True to dump all possible information. """
class unpackException(Exception):
pass
def describe(data):
txtans = ""
hexans = hexlify(data)
for i in data:
if bord(i) < 32 or bord(i) > 127:
txtans += "?"
else:
txtans += bchar(i).decode("latin-1")
return '"' + txtans + '"' + " 0x" + hexans
def datetimefrompalmtime(palmtime):
if palmtime > 0x7FFFFFFF:
pythondatetime = datetime.datetime(
year=1904, month=1, day=1
) + datetime.timedelta(seconds=palmtime)
else:
pythondatetime = datetime.datetime(
year=1970, month=1, day=1
) + datetime.timedelta(seconds=palmtime)
return pythondatetime
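# Example (doctest-style): values <= 0x7FFFFFFF use the 1970 epoch,
#   >>> datetimefrompalmtime(86400)
#   datetime.datetime(1970, 1, 2, 0, 0)
# while larger (unsigned) values are counted from the 1904 epoch instead.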
class Sectionizer:
def __init__(self, filename):
self.data = b""
with open(pathof(filename), "rb") as f:
self.data = f.read()
self.palmheader = self.data[:78]
self.palmname = self.data[:32]
self.ident = self.palmheader[0x3C : 0x3C + 8]
# CG struct.unpack_from(fmt, buffer, offset=0)
(self.num_sections,) = struct.unpack_from(b">H", self.palmheader, 76)
self.filelength = len(self.data)
## CGDBG ???
## sectionsdata (9680, 0, 18618, 2, 22275, 4, 25504, 6, 28607, 8,...
sectionsdata = struct.unpack_from(
bstr(">%dL" % (self.num_sections * 2)), self.data, 78
) + (self.filelength, 0)
## offsets and lengths of all sections
# sectionsoffset (9680, 18618, 22275, 25504, 28607, ...
self.sectionoffsets = sectionsdata[::2]
# sectionattributes (0, 2, 4, 6, 8, ...
self.sectionattributes = sectionsdata[1::2]
self.sectiondescriptions = ["" for x in range(self.num_sections + 1)]
self.sectiondescriptions[-1] = "File Length Only"
# CGDBG: unpack_from always returns a tuple, e.g. (value,)
print("sectionsdata {} {}".format(sectionsdata, bstr(">%dL" % (self.num_sections * 2))))
print("sectionsoffset {}\n sectionattributes {}".format(self.sectionoffsets, self.sectionattributes))
print("sectionsdescriptions {}".format(self.sectiondescriptions))
print(bstr(">%dL" % (self.num_sections * 2)))
print(struct.unpack_from(bstr(">%dL" % (self.num_sections * 2)), self.data, 78))
print((self.filelength, 0))
return
# sections information
def dumpsectionsinfo(self):
logger.debug("Section Offset Length UID Attribs Description")
for i in range(self.num_sections):
'''
logger.debug(
"{} {} {} {} {} {} {}\n".format( i, i,
self.sectionoffsets[i],
self.sectionoffsets[i + 1] - self.sectionoffsets[i],
self.sectionattributes[i] & 0xFFFFFF,
(self.sectionattributes[i] >> 24) & 0xFF,
self.sectiondescriptions[i]))
'''
logger.debug(
"%3d %3X 0x%07X 0x%05X % 8d % 7d %s"
% (
i,
i,
self.sectionoffsets[i],
self.sectionoffsets[i + 1] - self.sectionoffsets[i],
self.sectionattributes[i] & 0xFFFFFF,
(self.sectionattributes[i] >> 24) & 0xFF,
self.sectiondescriptions[i],
)
)
logger.debug(
"%3d %3X 0x%07X %s"
% (
self.num_sections,
self.num_sections,
self.sectionoffsets[self.num_sections],
self.sectiondescriptions[self.num_sections],
)
)
def setsectiondescription(self, section, description):
if section < len(self.sectiondescriptions):
self.sectiondescriptions[section] = description
else:
logger.debug(
"Section out of range: %d, description %s" % (section, description)
)
def dumppalmheader(self):
logger.debug("Palm Database Header")
logger.debug("Database name: " + repr(self.palmheader[:32]))
(dbattributes,) = struct.unpack_from(b">H", self.palmheader, 32)
logger.debug("Bitfield attributes: 0x%0X" % dbattributes,)
if dbattributes != 0:
print(" (",)
if dbattributes & 2:
print("Read-only; ",)
if dbattributes & 4:
print("Dirty AppInfoArea; ",)
if dbattributes & 8:
print("Needs to be backed up; ",)
if dbattributes & 16:
print("OK to install over newer; ",)
if dbattributes & 32:
print("Reset after installation; ",)
if dbattributes & 64:
print("No copying by PalmPilot beaming; ",)
print(")")
else:
print("")
logger.debug(
"File version: %d" % struct.unpack_from(b">H", self.palmheader, 34)[0]
)
(dbcreation,) = struct.unpack_from(b">L", self.palmheader, 36)
logger.debug(
"Creation Date: "
+ str(datetimefrompalmtime(dbcreation))
+ (" (0x%0X)" % dbcreation)
)
(dbmodification,) = struct.unpack_from(b">L", self.palmheader, 40)
logger.debug(
"Modification Date: "
+ str(datetimefrompalmtime(dbmodification))
+ (" (0x%0X)" % dbmodification)
)
(dbbackup,) = struct.unpack_from(b">L", self.palmheader, 44)
if dbbackup != 0:
logger.debug(
"Backup Date: "
+ str(datetimefrompalmtime(dbbackup))
+ (" (0x%0X)" % dbbackup)
)
logger.debug(
"Modification No.: %d" % struct.unpack_from(b">L", self.palmheader, 48)[0]
)
logger.debug(
"App Info offset: 0x%0X" % struct.unpack_from(b">L", self.palmheader, 52)[0]
)
logger.debug(
"Sort Info offset: 0x%0X"
% struct.unpack_from(b">L", self.palmheader, 56)[0]
)
logger.debug(
"Type/Creator: %s/%s"
% (repr(self.palmheader[60:64]), repr(self.palmheader[64:68]))
)
logger.debug(
"Unique seed: 0x%0X" % struct.unpack_from(b">L", self.palmheader, 68)[0]
)
(expectedzero,) = struct.unpack_from(b">L", self.palmheader, 72)
if expectedzero != 0:
logger.debug(
"Should be zero but isn't: %d"
% struct.unpack_from(b">L", self.palmheader, 72)[0]
)
logger.debug(
"Number of sections: %d" % struct.unpack_from(b">H", self.palmheader, 76)[0]
)
return
def loadSection(self, section):
before, after = self.sectionoffsets[section : section + 2]
return self.data[before:after]
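# Hedged usage sketch (filename illustrative):
#   sect = Sectionizer("book.mobi")
#   rec0 = sect.loadSection(0)   # PalmDOC/MOBI header record
#   sect.dumpsectionsinfo()      # offset/length/attributes per section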

505
mobimaster/mobi/mobi_split.py Executable file
View File

@@ -0,0 +1,505 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
from loguru import logger
import struct
# note: struct pack, unpack, unpack_from all require bytestring format
# data all the way up to at least python 2.7.5, python 3 okay with bytestring
from .unipath import pathof
# CG : reference https://wiki.mobileread.com/wiki/MOBI
# important pdb header offsets
unique_id_seed = 68
number_of_pdb_records = 76
# important palmdoc header offsets
book_length = 4
book_record_count = 8
first_pdb_record = 78
# important rec0 offsets
length_of_book = 4
mobi_header_base = 16
mobi_header_length = 20
mobi_type = 24
mobi_version = 36
first_non_text = 80
title_offset = 84
first_resc_record = 108
first_content_index = 192
last_content_index = 194
kf8_fdst_index = 192 # for KF8 mobi headers
fcis_index = 200
flis_index = 208
srcs_index = 224
srcs_count = 228
primary_index = 244
datp_index = 256
huffoff = 112
hufftbloff = 120
def getint(datain, ofs, sz=b"L"):
(i,) = struct.unpack_from(b">" + sz, datain, ofs)
return i
def writeint(datain, ofs, n, len=b"L"):
if len == b"L":
return datain[:ofs] + struct.pack(b">L", n) + datain[ofs + 4 :]
else:
return datain[:ofs] + struct.pack(b">H", n) + datain[ofs + 2 :]
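# Example (offsets defined above): read and patch a big-endian 32-bit field;
# writeint returns a new bytestring, it never mutates in place:
#   ver = getint(rec0, mobi_version)        # unpacks ">L" at offset 36
#   rec0 = writeint(rec0, mobi_version, 8)  # caller must rebind the result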
def getsecaddr(datain, secno):
nsec = getint(datain, number_of_pdb_records, b"H")
assert 0 <= secno < nsec, "secno %d out of range (nsec=%d)" % (secno, nsec)
secstart = getint(datain, first_pdb_record + secno * 8)
if secno == nsec - 1:
secend = len(datain)
else:
secend = getint(datain, first_pdb_record + (secno + 1) * 8)
return secstart, secend
def readsection(datain, secno):
secstart, secend = getsecaddr(datain, secno)
return datain[secstart:secend]
def writesection(datain, secno, secdata): # overwrite, accounting for different length
# dataout = deletesectionrange(datain,secno, secno)
# return insertsection(dataout, secno, secdata)
datalst = []
nsec = getint(datain, number_of_pdb_records, b"H")
zerosecstart, zerosecend = getsecaddr(datain, 0)
secstart, secend = getsecaddr(datain, secno)
dif = len(secdata) - (secend - secstart)
datalst.append(datain[:unique_id_seed])
datalst.append(struct.pack(b">L", 2 * nsec + 1))
datalst.append(datain[unique_id_seed + 4 : number_of_pdb_records])
datalst.append(struct.pack(b">H", nsec))
newstart = zerosecstart
for i in range(0, secno):
ofs, flgval = struct.unpack_from(b">2L", datain, first_pdb_record + i * 8)
datalst.append(struct.pack(b">L", ofs) + struct.pack(b">L", flgval))
datalst.append(struct.pack(b">L", secstart) + struct.pack(b">L", (2 * secno)))
for i in range(secno + 1, nsec):
ofs, flgval = struct.unpack_from(b">2L", datain, first_pdb_record + i * 8)
ofs = ofs + dif
datalst.append(struct.pack(b">L", ofs) + struct.pack(b">L", flgval))
lpad = newstart - (first_pdb_record + 8 * nsec)
if lpad > 0:
datalst.append(b"\0" * lpad)
datalst.append(datain[zerosecstart:secstart])
datalst.append(secdata)
datalst.append(datain[secend:])
dataout = b"".join(datalst)
return dataout
def nullsection(datain, secno): # make it zero-length without deleting it
datalst = []
nsec = getint(datain, number_of_pdb_records, b"H")
secstart, secend = getsecaddr(datain, secno)
zerosecstart, zerosecend = getsecaddr(datain, 0)
dif = secend - secstart
datalst.append(datain[:first_pdb_record])
for i in range(0, secno + 1):
ofs, flgval = struct.unpack_from(b">2L", datain, first_pdb_record + i * 8)
datalst.append(struct.pack(b">L", ofs) + struct.pack(b">L", flgval))
for i in range(secno + 1, nsec):
ofs, flgval = struct.unpack_from(b">2L", datain, first_pdb_record + i * 8)
ofs = ofs - dif
datalst.append(struct.pack(b">L", ofs) + struct.pack(b">L", flgval))
lpad = zerosecstart - (first_pdb_record + 8 * nsec)
if lpad > 0:
datalst.append(b"\0" * lpad)
datalst.append(datain[zerosecstart:secstart])
datalst.append(datain[secend:])
dataout = b"".join(datalst)
return dataout
def deletesectionrange(datain, firstsec, lastsec): # delete a range of sections
datalst = []
firstsecstart, firstsecend = getsecaddr(datain, firstsec)
lastsecstart, lastsecend = getsecaddr(datain, lastsec)
zerosecstart, zerosecend = getsecaddr(datain, 0)
dif = lastsecend - firstsecstart + 8 * (lastsec - firstsec + 1)
nsec = getint(datain, number_of_pdb_records, b"H")
datalst.append(datain[:unique_id_seed])
datalst.append(struct.pack(b">L", 2 * (nsec - (lastsec - firstsec + 1)) + 1))
datalst.append(datain[unique_id_seed + 4 : number_of_pdb_records])
datalst.append(struct.pack(b">H", nsec - (lastsec - firstsec + 1)))
newstart = zerosecstart - 8 * (lastsec - firstsec + 1)
for i in range(0, firstsec):
ofs, flgval = struct.unpack_from(b">2L", datain, first_pdb_record + i * 8)
ofs = ofs - 8 * (lastsec - firstsec + 1)
datalst.append(struct.pack(b">L", ofs) + struct.pack(b">L", flgval))
for i in range(lastsec + 1, nsec):
ofs, flgval = struct.unpack_from(b">2L", datain, first_pdb_record + i * 8)
ofs = ofs - dif
flgval = 2 * (i - (lastsec - firstsec + 1))
datalst.append(struct.pack(b">L", ofs) + struct.pack(b">L", flgval))
lpad = newstart - (first_pdb_record + 8 * (nsec - (lastsec - firstsec + 1)))
if lpad > 0:
datalst.append(b"\0" * lpad)
datalst.append(datain[zerosecstart:firstsecstart])
datalst.append(datain[lastsecend:])
dataout = b"".join(datalst)
return dataout
def insertsection(datain, secno, secdata): # insert a new section
datalst = []
nsec = getint(datain, number_of_pdb_records, b"H")
# print("inserting secno" , secno, "into" ,nsec, "sections")
secstart, secend = getsecaddr(datain, secno)
zerosecstart, zerosecend = getsecaddr(datain, 0)
dif = len(secdata)
datalst.append(datain[:unique_id_seed])
datalst.append(struct.pack(b">L", 2 * (nsec + 1) + 1))
datalst.append(datain[unique_id_seed + 4 : number_of_pdb_records])
datalst.append(struct.pack(b">H", nsec + 1))
newstart = zerosecstart + 8
for i in range(0, secno):
ofs, flgval = struct.unpack_from(b">2L", datain, first_pdb_record + i * 8)
ofs += 8
datalst.append(struct.pack(b">L", ofs) + struct.pack(b">L", flgval))
datalst.append(struct.pack(b">L", secstart + 8) + struct.pack(b">L", (2 * secno)))
for i in range(secno, nsec):
ofs, flgval = struct.unpack_from(b">2L", datain, first_pdb_record + i * 8)
ofs = ofs + dif + 8
flgval = 2 * (i + 1)
datalst.append(struct.pack(b">L", ofs) + struct.pack(b">L", flgval))
lpad = newstart - (first_pdb_record + 8 * (nsec + 1))
if lpad > 0:
datalst.append(b"\0" * lpad)
datalst.append(datain[zerosecstart:secstart])
datalst.append(secdata)
datalst.append(datain[secstart:])
dataout = b"".join(datalst)
return dataout
def insertsectionrange(
sectionsource, firstsec, lastsec, sectiontarget, targetsec
): # insert a range of sections
# print("inserting secno" , firstsec, "to", lastsec, "into" ,targetsec, "sections")
# dataout = sectiontarget
# for idx in range(lastsec,firstsec-1,-1):
# dataout = insertsection(dataout,targetsec,readsection(sectionsource,idx))
# return dataout
datalst = []
nsec = getint(sectiontarget, number_of_pdb_records, b"H")
zerosecstart, zerosecend = getsecaddr(sectiontarget, 0)
insstart, nul = getsecaddr(sectiontarget, targetsec)
nins = lastsec - firstsec + 1
srcstart, nul = getsecaddr(sectionsource, firstsec)
nul, srcend = getsecaddr(sectionsource, lastsec)
newstart = zerosecstart + 8 * nins
datalst.append(sectiontarget[:unique_id_seed])
datalst.append(struct.pack(b">L", 2 * (nsec + nins) + 1))
datalst.append(sectiontarget[unique_id_seed + 4 : number_of_pdb_records])
datalst.append(struct.pack(b">H", nsec + nins))
for i in range(0, targetsec):
ofs, flgval = struct.unpack_from(
b">2L", sectiontarget, first_pdb_record + i * 8
)
ofsnew = ofs + 8 * nins
flgvalnew = flgval
datalst.append(struct.pack(b">L", ofsnew) + struct.pack(b">L", flgvalnew))
# print(ofsnew, flgvalnew, ofs, flgval)
srcstart0, nul = getsecaddr(sectionsource, firstsec)
for i in range(nins):
isrcstart, nul = getsecaddr(sectionsource, firstsec + i)
ofsnew = insstart + (isrcstart - srcstart0) + 8 * nins
flgvalnew = 2 * (targetsec + i)
datalst.append(struct.pack(b">L", ofsnew) + struct.pack(b">L", flgvalnew))
# print(ofsnew, flgvalnew)
dif = srcend - srcstart
for i in range(targetsec, nsec):
ofs, flgval = struct.unpack_from(
b">2L", sectiontarget, first_pdb_record + i * 8
)
ofsnew = ofs + dif + 8 * nins
flgvalnew = 2 * (i + nins)
datalst.append(struct.pack(b">L", ofsnew) + struct.pack(b">L", flgvalnew))
# print(ofsnew, flgvalnew, ofs, flgval)
lpad = newstart - (first_pdb_record + 8 * (nsec + nins))
if lpad > 0:
datalst.append(b"\0" * lpad)
datalst.append(sectiontarget[zerosecstart:insstart])
datalst.append(sectionsource[srcstart:srcend])
datalst.append(sectiontarget[insstart:])
dataout = b"".join(datalst)
return dataout
def get_exth_params(rec0):
ebase = mobi_header_base + getint(rec0, mobi_header_length)
elen = getint(rec0, ebase + 4)
enum = getint(rec0, ebase + 8)
return ebase, elen, enum
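# EXTH layout assumed here (per the MobileRead wiki referenced above): the
# block starts right after the MOBI header, bytes 0-3 are the b"EXTH" magic,
# 4-7 the block length (elen), 8-11 the record count (enum), and each record
# is <id:4><size:4><data:size-8>.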
def add_exth(rec0, exth_num, exth_bytes):
ebase, elen, enum = get_exth_params(rec0)
newrecsize = 8 + len(exth_bytes)
newrec0 = (
rec0[0 : ebase + 4]
+ struct.pack(b">L", elen + newrecsize)
+ struct.pack(b">L", enum + 1)
+ struct.pack(b">L", exth_num)
+ struct.pack(b">L", newrecsize)
+ exth_bytes
+ rec0[ebase + 12 :]
)
newrec0 = writeint(
newrec0, title_offset, getint(newrec0, title_offset) + newrecsize
)
return newrec0
def read_exth(rec0, exth_num):
exth_values = []
ebase, elen, enum = get_exth_params(rec0)
ebase = ebase + 12
while enum > 0:
exth_id = getint(rec0, ebase)
if exth_id == exth_num:
# We might have multiple exths, so build a list.
exth_values.append(rec0[ebase + 8 : ebase + getint(rec0, ebase + 4)])
enum = enum - 1
ebase = ebase + getint(rec0, ebase + 4)
return exth_values
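# Illustrative use, mirroring mobi_split.__init__ below: EXTH 121 stores the
# KF8 boundary section number as a ">L":
#   exth121 = read_exth(rec0, 121)
#   if exth121:
#       (kf8_boundary,) = struct.unpack_from(b">L", exth121[0], 0)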
def write_exth(rec0, exth_num, exth_bytes):
ebase, elen, enum = get_exth_params(rec0)
ebase_idx = ebase + 12
enum_idx = enum
while enum_idx > 0:
exth_id = getint(rec0, ebase_idx)
if exth_id == exth_num:
dif = len(exth_bytes) + 8 - getint(rec0, ebase_idx + 4)
newrec0 = rec0
if dif != 0:
newrec0 = writeint(
newrec0, title_offset, getint(newrec0, title_offset) + dif
)
return (
newrec0[: ebase + 4]
+ struct.pack(
b">L", elen + len(exth_bytes) + 8 - getint(rec0, ebase_idx + 4)
)
+ struct.pack(b">L", enum)
+ rec0[ebase + 12 : ebase_idx + 4]
+ struct.pack(b">L", len(exth_bytes) + 8)
+ exth_bytes
+ rec0[ebase_idx + getint(rec0, ebase_idx + 4) :]
)
enum_idx = enum_idx - 1
ebase_idx = ebase_idx + getint(rec0, ebase_idx + 4)
return rec0
def del_exth(rec0, exth_num):
ebase, elen, enum = get_exth_params(rec0)
ebase_idx = ebase + 12
enum_idx = 0
while enum_idx < enum:
exth_id = getint(rec0, ebase_idx)
exth_size = getint(rec0, ebase_idx + 4)
if exth_id == exth_num:
newrec0 = rec0
newrec0 = writeint(
newrec0, title_offset, getint(newrec0, title_offset) - exth_size
)
newrec0 = newrec0[:ebase_idx] + newrec0[ebase_idx + exth_size :]
newrec0 = (
newrec0[0 : ebase + 4]
+ struct.pack(b">L", elen - exth_size)
+ struct.pack(b">L", enum - 1)
+ newrec0[ebase + 12 :]
)
return newrec0
enum_idx += 1
ebase_idx = ebase_idx + exth_size
return rec0
class mobi_split:
def __init__(self, infile):
datain = b""
with open(pathof(infile), "rb") as f:
datain = f.read()
datain_rec0 = readsection(datain, 0)
ver = getint(datain_rec0, mobi_version)
self.combo = ver != 8
if not self.combo:
return
exth121 = read_exth(datain_rec0, 121)
if len(exth121) == 0:
self.combo = False
return
else:
# only pay attention to first exth121
# (there should only be one)
(datain_kf8,) = struct.unpack_from(b">L", exth121[0], 0)
if datain_kf8 == 0xFFFFFFFF:
self.combo = False
return
datain_kfrec0 = readsection(datain, datain_kf8)
# create the standalone mobi7
num_sec = getint(datain, number_of_pdb_records, b"H")
# remove BOUNDARY up to but not including ELF record
self.result_file7 = deletesectionrange(datain, datain_kf8 - 1, num_sec - 2)
# check if there are SRCS records and delete them
srcs = getint(datain_rec0, srcs_index)
num_srcs = getint(datain_rec0, srcs_count)
if srcs != 0xFFFFFFFF and num_srcs > 0:
self.result_file7 = deletesectionrange(
self.result_file7, srcs, srcs + num_srcs - 1
)
datain_rec0 = writeint(datain_rec0, srcs_index, 0xFFFFFFFF)
datain_rec0 = writeint(datain_rec0, srcs_count, 0)
# reset the EXTH 121 KF8 Boundary meta data to 0xffffffff
datain_rec0 = write_exth(datain_rec0, 121, struct.pack(b">L", 0xFFFFFFFF))
# datain_rec0 = del_exth(datain_rec0,121)
# datain_rec0 = del_exth(datain_rec0,534)
# don't remove the EXTH 125 KF8 Count of Resources, seems to be present in mobi6 files as well
# set the EXTH 129 KF8 Masthead / Cover Image string to the null string
datain_rec0 = write_exth(datain_rec0, 129, b"")
# don't remove the EXTH 131 KF8 Unidentified Count, seems to be present in mobi6 files as well
# need to reset flags stored in 0x80-0x83
# old mobi with exth: 0x50, mobi7 part with exth: 0x1850, mobi8 part with exth: 0x1050
# Bit Flags
# 0x1000 = Bit 12 indicates if embedded fonts are used or not
# 0x0800 = means this Header points to *shared* images/resource/fonts ??
# 0x0080 = unknown new flag, why is this now being set by Kindlegen 2.8?
# 0x0040 = exth exists
# 0x0010 = Not sure but this is always set so far
(fval,) = struct.unpack_from(b">L", datain_rec0, 0x80)
# need to remove flag 0x0800 for KindlePreviewer 2.8 and unset Bit 12 for embedded fonts
fval = fval & 0x07FF
datain_rec0 = datain_rec0[:0x80] + struct.pack(b">L", fval) + datain_rec0[0x84:]
self.result_file7 = writesection(self.result_file7, 0, datain_rec0)
# no need to replace kf8 style fcis with mobi 7 one
# fcis_secnum, = struct.unpack_from(b'>L',datain_rec0, 0xc8)
# if fcis_secnum != 0xffffffff:
# fcis_info = readsection(datain, fcis_secnum)
# text_len, = struct.unpack_from(b'>L', fcis_info, 0x14)
# new_fcis = 'FCIS\x00\x00\x00\x14\x00\x00\x00\x10\x00\x00\x00\x01\x00\x00\x00\x00'
# new_fcis += struct.pack(b'>L',text_len)
# new_fcis += '\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x08\x00\x01\x00\x01\x00\x00\x00\x00'
# self.result_file7 = writesection(self.result_file7, fcis_secnum, new_fcis)
firstimage = getint(datain_rec0, first_resc_record)
lastimage = getint(datain_rec0, last_content_index, b"H")
# print("Old First Image, last Image", firstimage,lastimage)
if lastimage == 0xFFFF:
# find the lowest of the next sections and copy up to that.
ofs_list = [
(fcis_index, b"L"),
(flis_index, b"L"),
(datp_index, b"L"),
(hufftbloff, b"L"),
]
for ofs, sz in ofs_list:
n = getint(datain_rec0, ofs, sz)
# print("n",n)
if n > 0 and n < lastimage:
lastimage = n - 1
logger.debug("First Image, last Image %s %s" % (firstimage, lastimage))
# Try to null out FONT and RES, but leave the (empty) PDB record so image refs remain valid
for i in range(firstimage, lastimage):
imgsec = readsection(self.result_file7, i)
if imgsec[0:4] in [b"RESC", b"FONT"]:
self.result_file7 = nullsection(self.result_file7, i)
# mobi7 finished
# create standalone mobi8
self.result_file8 = deletesectionrange(datain, 0, datain_kf8 - 1)
target = getint(datain_kfrec0, first_resc_record)
self.result_file8 = insertsectionrange(
datain, firstimage, lastimage, self.result_file8, target
)
datain_kfrec0 = readsection(self.result_file8, 0)
# Only keep the correct EXTH 116 StartOffset, KG 2.5 carries over the one from the mobi7 part, which then points at garbage in the mobi8 part, and confuses FW 3.4
kf8starts = read_exth(datain_kfrec0, 116)
# If we have multiple StartOffset, keep only the last one
kf8start_count = len(kf8starts)
while kf8start_count > 1:
kf8start_count -= 1
datain_kfrec0 = del_exth(datain_kfrec0, 116)
# update the EXTH 125 KF8 Count of Images/Fonts/Resources
datain_kfrec0 = write_exth(
datain_kfrec0, 125, struct.pack(b">L", lastimage - firstimage + 1)
)
# need to reset flags stored in 0x80-0x83
# old mobi with exth: 0x50, mobi7 part with exth: 0x1850, mobi8 part with exth: 0x1050
# standalone mobi8 with exth: 0x0050
# Bit Flags
# 0x1000 = Bit 12 indicates if embedded fonts are used or not
# 0x0800 = means this Header points to *shared* images/resource/fonts ??
# 0x0080 = unknown new flag, why is this now being set by Kindlegen 2.8?
# 0x0040 = exth exists
# 0x0010 = Not sure but this is always set so far
(fval,) = struct.unpack_from(">L", datain_kfrec0, 0x80)
fval = fval & 0x1FFF
fval |= 0x0800
datain_kfrec0 = (
datain_kfrec0[:0x80] + struct.pack(b">L", fval) + datain_kfrec0[0x84:]
)
# properly update other index pointers that have been shifted by the insertion of images
ofs_list = [
(kf8_fdst_index, b"L"),
(fcis_index, b"L"),
(flis_index, b"L"),
(datp_index, b"L"),
(hufftbloff, b"L"),
]
for ofs, sz in ofs_list:
n = getint(datain_kfrec0, ofs, sz)
if n != 0xFFFFFFFF:
datain_kfrec0 = writeint(
datain_kfrec0, ofs, n + lastimage - firstimage + 1, sz
)
self.result_file8 = writesection(self.result_file8, 0, datain_kfrec0)
# no need to replace kf8 style fcis with mobi 7 one
# fcis_secnum, = struct.unpack_from(b'>L',datain_kfrec0, 0xc8)
# if fcis_secnum != 0xffffffff:
# fcis_info = readsection(self.result_file8, fcis_secnum)
# text_len, = struct.unpack_from(b'>L', fcis_info, 0x14)
# new_fcis = 'FCIS\x00\x00\x00\x14\x00\x00\x00\x10\x00\x00\x00\x01\x00\x00\x00\x00'
# new_fcis += struct.pack(b'>L',text_len)
# new_fcis += '\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x08\x00\x01\x00\x01\x00\x00\x00\x00'
# self.result_file8 = writesection(self.result_file8, fcis_secnum, new_fcis)
# mobi8 finished
def getResult8(self):
return self.result_file8
def getResult7(self):
return self.result_file7
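# Illustrative usage sketch (not part of the original module; file names are
# hypothetical): splitting a combined MOBI into standalone MOBI7 and KF8 parts.
#
#   splitter = mobi_split("book.mobi")
#   if splitter.combo:
#       with open("book.mobi7", "wb") as f:
#           f.write(splitter.getResult7())
#       with open("book.mobi8", "wb") as f:
#           f.write(splitter.getResult8())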

View File

@@ -0,0 +1,138 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
from .compatibility_utils import PY2, bchr, lmap, bstr
if PY2:
range = xrange
import struct
# note: struct pack, unpack, unpack_from all require bytestring format
# data all the way up to at least python 2.7.5, python 3 okay with bytestring
class unpackException(Exception):
pass
class UncompressedReader:
def unpack(self, data):
return data
class PalmdocReader:
def unpack(self, i):
o, p = b"", 0
while p < len(i):
# for python 3 must use a slice since i[p] returns an int while a slice returns bytes
c = ord(i[p : p + 1])
p += 1
if c >= 1 and c <= 8:
o += i[p : p + c]
p += c
elif c < 128:
o += bchr(c)
elif c >= 192:
o += b" " + bchr(c ^ 128)
else:
if p < len(i):
c = (c << 8) | ord(i[p : p + 1])
p += 1
m = (c >> 3) & 0x07FF
n = (c & 7) + 3
if m > n:
o += o[-m : n - m]
else:
for _ in range(n):
# because of a completely ass-backwards decision by python maintainers for python 3
# we must use a slice for bytes as i[p] returns an int while a slice returns bytes
if m == 1:
o += o[-m:]
else:
o += o[-m : -m + 1]
return o
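# Sketch of the PalmDOC byte codes handled above, with hypothetical inputs:
# a length byte 0x01-0x08 copies that many literal bytes, so
#   PalmdocReader().unpack(b"\x02hi") == b"hi"
# and a byte in 0xC0-0xFF encodes a space plus (byte XOR 0x80), so
#   PalmdocReader().unpack(b"\xc1") == b" A"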
class HuffcdicReader:
q = struct.Struct(b">Q").unpack_from
def loadHuff(self, huff):
if huff[0:8] != b"HUFF\x00\x00\x00\x18":
raise unpackException("invalid huff header")
off1, off2 = struct.unpack_from(b">LL", huff, 8)
def dict1_unpack(v):
codelen, term, maxcode = v & 0x1F, v & 0x80, v >> 8
assert codelen != 0
if codelen <= 8:
assert term
maxcode = ((maxcode + 1) << (32 - codelen)) - 1
return (codelen, term, maxcode)
self.dict1 = lmap(dict1_unpack, struct.unpack_from(b">256L", huff, off1))
dict2 = struct.unpack_from(b">64L", huff, off2)
self.mincode, self.maxcode = (), ()
for codelen, mincode in enumerate((0,) + dict2[0::2]):
self.mincode += (mincode << (32 - codelen),)
for codelen, maxcode in enumerate((0,) + dict2[1::2]):
self.maxcode += (((maxcode + 1) << (32 - codelen)) - 1,)
self.dictionary = []
def loadCdic(self, cdic):
if cdic[0:8] != b"CDIC\x00\x00\x00\x10":
raise unpackException("invalid cdic header")
phrases, bits = struct.unpack_from(b">LL", cdic, 8)
n = min(1 << bits, phrases - len(self.dictionary))
h = struct.Struct(b">H").unpack_from
def getslice(off):
(blen,) = h(cdic, 16 + off)
slice = cdic[18 + off : 18 + off + (blen & 0x7FFF)]
return (slice, blen & 0x8000)
self.dictionary += lmap(
getslice, struct.unpack_from(bstr(">%dH" % n), cdic, 16)
)
def unpack(self, data):
q = HuffcdicReader.q
bitsleft = len(data) * 8
data += b"\x00\x00\x00\x00\x00\x00\x00\x00"
pos = 0
(x,) = q(data, pos)
n = 32
s = b""
while True:
if n <= 0:
pos += 4
(x,) = q(data, pos)
n += 32
code = (x >> n) & ((1 << 32) - 1)
codelen, term, maxcode = self.dict1[code >> 24]
if not term:
while code < self.mincode[codelen]:
codelen += 1
maxcode = self.maxcode[codelen]
n -= codelen
bitsleft -= codelen
if bitsleft < 0:
break
r = (maxcode - code) >> (32 - codelen)
slice, flag = self.dictionary[r]
if not flag:
self.dictionary[r] = None
slice = self.unpack(slice)
self.dictionary[r] = (slice, 1)
s += slice
return s
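# Typical call sequence for this class (a sketch; huff_record and
# cdic_records stand for the PDB sections supplied by the caller):
#   reader = HuffcdicReader()
#   reader.loadHuff(huff_record)
#   for cdic in cdic_records:
#       reader.loadCdic(cdic)
#   text = reader.unpack(compressed_record)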

252
mobimaster/mobi/mobi_utils.py Executable file
View File

@@ -0,0 +1,252 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
# flake8: noqa
from __future__ import unicode_literals, division, absolute_import, print_function
from .compatibility_utils import PY2, text_type, bchr, bord
import binascii
if PY2:
range = xrange
from itertools import cycle
def getLanguage(langID, sublangID):
mobilangdict = {
54: {0: "af"}, # Afrikaans
28: {0: "sq"}, # Albanian
1: {
0: "ar",
5: "ar-dz",
15: "ar-bh",
3: "ar-eg",
2: "ar-iq",
11: "ar-jo",
13: "ar-kw",
12: "ar-lb",
4: "ar-ly",
6: "ar-ma",
8: "ar-om",
16: "ar-qa",
1: "ar-sa",
10: "ar-sy",
7: "ar-tn",
14: "ar-ae",
9: "ar-ye",
},
# Arabic, Arabic (Algeria), Arabic (Bahrain), Arabic (Egypt), Arabic
# (Iraq), Arabic (Jordan), Arabic (Kuwait), Arabic (Lebanon), Arabic
# (Libya), Arabic (Morocco), Arabic (Oman), Arabic (Qatar), Arabic
# (Saudi Arabia), Arabic (Syria), Arabic (Tunisia), Arabic (United Arab
# Emirates), Arabic (Yemen)
43: {0: "hy"}, # Armenian
77: {0: "as"}, # Assamese
44: {0: "az"}, # "Azeri (IANA: Azerbaijani)
45: {0: "eu"}, # Basque
35: {0: "be"}, # Belarusian
69: {0: "bn"}, # Bengali
2: {0: "bg"}, # Bulgarian
3: {0: "ca"}, # Catalan
4: {0: "zh", 3: "zh-hk", 2: "zh-cn", 4: "zh-sg", 1: "zh-tw"},
# Chinese, Chinese (Hong Kong), Chinese (PRC), Chinese (Singapore), Chinese (Taiwan)
26: {0: "hr", 3: "sr"}, # Croatian, Serbian
5: {0: "cs"}, # Czech
6: {0: "da"}, # Danish
19: {0: "nl", 1: "nl", 2: "nl-be"}, # Dutch / Flemish, Dutch (Belgium)
9: {
0: "en",
1: "en",
3: "en-au",
40: "en-bz",
4: "en-ca",
6: "en-ie",
8: "en-jm",
5: "en-nz",
13: "en-ph",
7: "en-za",
11: "en-tt",
2: "en-gb",
1: "en-us",
12: "en-zw",
},
# English, English (Australia), English (Belize), English (Canada),
# English (Ireland), English (Jamaica), English (New Zealand), English
# (Philippines), English (South Africa), English (Trinidad), English
# (United Kingdom), English (United States), English (Zimbabwe)
37: {0: "et"}, # Estonian
56: {0: "fo"}, # Faroese
41: {0: "fa"}, # Farsi / Persian
11: {0: "fi"}, # Finnish
12: {
0: "fr",
1: "fr",
2: "fr-be",
3: "fr-ca",
5: "fr-lu",
6: "fr-mc",
4: "fr-ch",
},
# French, French (Belgium), French (Canada), French (Luxembourg), French (Monaco), French (Switzerland)
55: {0: "ka"}, # Georgian
7: {0: "de", 1: "de", 3: "de-at", 5: "de-li", 4: "de-lu", 2: "de-ch"},
# German, German (Austria), German (Liechtenstein), German (Luxembourg), German (Switzerland)
8: {0: "el"}, # Greek, Modern (1453-)
71: {0: "gu"}, # Gujarati
13: {0: "he"}, # Hebrew (also code 'iw'?)
57: {0: "hi"}, # Hindi
14: {0: "hu"}, # Hungarian
15: {0: "is"}, # Icelandic
33: {0: "id"}, # Indonesian
16: {0: "it", 1: "it", 2: "it-ch"}, # Italian, Italian (Switzerland)
17: {0: "ja"}, # Japanese
75: {0: "kn"}, # Kannada
63: {0: "kk"}, # Kazakh
87: {0: "x-kok"}, # Konkani (real language code is 'kok'?)
18: {0: "ko"}, # Korean
38: {0: "lv"}, # Latvian
39: {0: "lt"}, # Lithuanian
47: {0: "mk"}, # Macedonian
62: {0: "ms"}, # Malay
76: {0: "ml"}, # Malayalam
58: {0: "mt"}, # Maltese
78: {0: "mr"}, # Marathi
97: {0: "ne"}, # Nepali
20: {0: "no"}, # Norwegian
72: {0: "or"}, # Oriya
21: {0: "pl"}, # Polish
22: {0: "pt", 2: "pt", 1: "pt-br"}, # Portuguese, Portuguese (Brazil)
70: {0: "pa"}, # Punjabi
23: {0: "rm"}, # "Rhaeto-Romanic" (IANA: Romansh)
24: {0: "ro"}, # Romanian
25: {0: "ru"}, # Russian
59: {0: "sz"}, # "Sami (Lappish)" (not an IANA language code)
# IANA code for "Northern Sami" is 'se'
# 'SZ' is the IANA region code for Swaziland
79: {0: "sa"}, # Sanskrit
27: {0: "sk"}, # Slovak
36: {0: "sl"}, # Slovenian
46: {0: "sb"}, # "Sorbian" (not an IANA language code)
# 'SB' is IANA region code for 'Solomon Islands'
# Lower Sorbian = 'dsb'
# Upper Sorbian = 'hsb'
# Sorbian Languages = 'wen'
10: {
0: "es",
4: "es",
44: "es-ar",
64: "es-bo",
52: "es-cl",
36: "es-co",
20: "es-cr",
28: "es-do",
48: "es-ec",
68: "es-sv",
16: "es-gt",
72: "es-hn",
8: "es-mx",
76: "es-ni",
24: "es-pa",
60: "es-py",
40: "es-pe",
80: "es-pr",
56: "es-uy",
32: "es-ve",
},
# Spanish, Spanish (Mobipocket bug?), Spanish (Argentina), Spanish
# (Bolivia), Spanish (Chile), Spanish (Colombia), Spanish (Costa Rica),
# Spanish (Dominican Republic), Spanish (Ecuador), Spanish (El
# Salvador), Spanish (Guatemala), Spanish (Honduras), Spanish (Mexico),
# Spanish (Nicaragua), Spanish (Panama), Spanish (Paraguay), Spanish
# (Peru), Spanish (Puerto Rico), Spanish (Uruguay), Spanish (Venezuela)
48: {0: "sx"}, # "Sutu" (not an IANA language code)
# "Sutu" is another name for "Southern Sotho"?
# IANA code for "Southern Sotho" is 'st'
65: {0: "sw"}, # Swahili
29: {0: "sv", 1: "sv", 8: "sv-fi"}, # Swedish, Swedish (Finland)
73: {0: "ta"}, # Tamil
68: {0: "tt"}, # Tatar
74: {0: "te"}, # Telugu
30: {0: "th"}, # Thai
49: {0: "ts"}, # Tsonga
50: {0: "tn"}, # Tswana
31: {0: "tr"}, # Turkish
34: {0: "uk"}, # Ukrainian
32: {0: "ur"}, # Urdu
67: {0: "uz", 2: "uz"}, # Uzbek
42: {0: "vi"}, # Vietnamese
52: {0: "xh"}, # Xhosa
53: {0: "zu"}, # Zulu
}
lang = "en"
if langID in mobilangdict:
subdict = mobilangdict[langID]
lang = subdict[0]
if sublangID in subdict:
lang = subdict[sublangID]
return lang
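# Examples based on the table above: getLanguage(9, 2) returns "en-gb",
# getLanguage(4, 2) returns "zh-cn", and an unknown langID falls back to "en".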
def toHex(byteList):
return binascii.hexlify(byteList)
# returns base32 bytestring
def toBase32(value, npad=4):
digits = b"0123456789ABCDEFGHIJKLMNOPQRSTUV"
num_string = b""
current = value
while current != 0:
next, remainder = divmod(current, 32)
rem_string = digits[remainder : remainder + 1]
num_string = rem_string + num_string
current = next
if num_string == b"":
num_string = b"0"
pad = npad - len(num_string)
if pad > 0:
num_string = b"0" * pad + num_string
return num_string
# converts base32 string to value
def fromBase32(str_num):
if isinstance(str_num, text_type):
str_num = str_num.encode("latin-1")
scalelst = [1, 32, 1024, 32768, 1048576, 33554432, 1073741824, 34359738368]
value = 0
j = 0
n = len(str_num)
scale = 0
for i in range(n):
c = str_num[n - i - 1 : n - i]
if c in b"0123456789":
v = ord(c) - ord(b"0")
else:
v = ord(c) - ord(b"A") + 10
if j < len(scalelst):
scale = scalelst[j]
else:
scale = scale * 32
j += 1
if v != 0:
value = value + (v * scale)
return value
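# Round-trip sketch using the digit set above: toBase32(1023) == b"00VV"
# and fromBase32(b"00VV") == 1023.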
# note: if decode a bytestring using 'latin-1' (or any other 0-255 encoding)
# in place of ascii you will get a byte to half-word or integer
# one to one mapping of values from 0 - 255
def mangle_fonts(encryption_key, data):
if isinstance(encryption_key, text_type):
encryption_key = encryption_key.encode("latin-1")
crypt = data[:1024]
key = cycle(iter(map(bord, encryption_key)))
# encrypt = ''.join([chr(ord(x)^key.next()) for x in crypt])
encrypt = b"".join([bchr(bord(x) ^ next(key)) for x in crypt])
return encrypt + data[1024:]
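# Since the mangling above is a plain XOR over the first 1024 bytes, it is its
# own inverse: mangle_fonts(key, mangle_fonts(key, data)) == data for any
# non-empty key (a property sketch, not used by the module itself).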

585
mobimaster/mobi/mobiml2xhtml.py Executable file
View File

@@ -0,0 +1,585 @@
#! /usr/bin/python
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
# this program works in concert with the output from KindleUnpack
"""
Convert from Mobi ML to XHTML
"""
from __future__ import print_function
import os
import sys
import re
SPECIAL_HANDLING_TAGS = {
"?xml": ("xmlheader", -1),
"!--": ("comment", -3),
"!DOCTYPE": ("doctype", -1),
}
SPECIAL_HANDLING_TYPES = ["xmlheader", "doctype", "comment"]
SELF_CLOSING_TAGS = [
"br",
"hr",
"input",
"img",
"image",
"meta",
"spacer",
"link",
"frame",
"base",
"col",
"reference",
]
class MobiMLConverter(object):
PAGE_BREAK_PAT = re.compile(r"(<[/]{0,1}mbp:pagebreak\s*[/]{0,1}>)+", re.IGNORECASE)
IMAGE_ATTRS = ("lowrecindex", "recindex", "hirecindex")
def __init__(self, filename):
self.base_css_rules = "blockquote { margin: 0em 0em 0em 1.25em }\n"
self.base_css_rules += "p { margin: 0em }\n"
self.base_css_rules += ".bold { font-weight: bold }\n"
self.base_css_rules += ".italic { font-style: italic }\n"
self.base_css_rules += (
".mbp_pagebreak { page-break-after: always; margin: 0; display: block }\n"
)
self.tag_css_rules = {}
self.tag_css_rule_cnt = 0
self.path = []
self.filename = filename
self.wipml = open(self.filename, "rb").read()
self.pos = 0
self.opfname = self.filename.rsplit(".", 1)[0] + ".opf"
self.opos = 0
self.meta = ""
self.cssname = os.path.join(os.path.dirname(self.filename), "styles.css")
self.current_font_size = 3
self.font_history = []
def cleanup_html(self):
self.wipml = re.sub(
r'<div height="0(pt|px|ex|em|%){0,1}"></div>', "", self.wipml
)
self.wipml = self.wipml.replace("\r\n", "\n")
self.wipml = self.wipml.replace("> <", ">\n<")
self.wipml = self.wipml.replace("<mbp: ", "<mbp:")
# self.wipml = re.sub(r'<?xml[^>]*>', '', self.wipml)
self.wipml = self.wipml.replace("<br></br>", "<br/>")
def replace_page_breaks(self):
self.wipml = self.PAGE_BREAK_PAT.sub(
'<div class="mbp_pagebreak" />', self.wipml
)
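# e.g. replace_page_breaks() collapses a run such as
# '<mbp:pagebreak/><mbp:pagebreak/>' into a single
# '<div class="mbp_pagebreak" />' element.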
# parse leading text of ml and tag
def parseml(self):
p = self.pos
if p >= len(self.wipml):
return None
if self.wipml[p] != "<":
res = self.wipml.find("<", p)
if res == -1:
res = len(self.wipml)
self.pos = res
return self.wipml[p:res], None
# handle comment as a special case to deal with multi-line comments
if self.wipml[p : p + 4] == "<!--":
te = self.wipml.find("-->", p + 1)
if te != -1:
te = te + 2
else:
te = self.wipml.find(">", p + 1)
ntb = self.wipml.find("<", p + 1)
if ntb != -1 and ntb < te:
self.pos = ntb
return self.wipml[p:ntb], None
self.pos = te + 1
return None, self.wipml[p : te + 1]
# parses string version of tag to identify its name,
# its type 'begin', 'end' or 'single',
# plus build a hashtable of its attributes
# code is written to handle the possibility of very poor formatting
def parsetag(self, s):
p = 1
# get the tag name
tname = None
ttype = None
tattr = {}
while s[p : p + 1] == " ":
p += 1
if s[p : p + 1] == "/":
ttype = "end"
p += 1
while s[p : p + 1] == " ":
p += 1
b = p
while s[p : p + 1] not in (">", "/", " ", '"', "'", "\r", "\n"):
p += 1
tname = s[b:p].lower()
if tname == "!doctype":
tname = "!DOCTYPE"
# special cases
if tname in SPECIAL_HANDLING_TAGS.keys():
ttype, backstep = SPECIAL_HANDLING_TAGS[tname]
tattr["special"] = s[p:backstep]
if ttype is None:
# parse any attributes
while s.find("=", p) != -1:
while s[p : p + 1] == " ":
p += 1
b = p
while s[p : p + 1] != "=":
p += 1
aname = s[b:p].lower()
aname = aname.rstrip(" ")
p += 1
while s[p : p + 1] == " ":
p += 1
if s[p : p + 1] in ('"', "'"):
p = p + 1
b = p
while s[p : p + 1] not in ('"', "'"):
p += 1
val = s[b:p]
p += 1
else:
b = p
while s[p : p + 1] not in (">", "/", " "):
p += 1
val = s[b:p]
tattr[aname] = val
# label beginning and single tags
if ttype is None:
ttype = "begin"
if s.find(" /", p) >= 0:
ttype = "single_ext"
elif s.find("/", p) >= 0:
ttype = "single"
return ttype, tname, tattr
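# Example of the return convention (hypothetical input):
#   parsetag('<a href="x">')  ->  ("begin", "a", {"href": "x"})
#   parsetag("</p>")          ->  ("end", "p", {})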
# main routine to convert from mobi markup language to html
def processml(self):
# are these really needed
html_done = False
head_done = False
body_done = False
skip = False
htmlstr = ""
self.replace_page_breaks()
self.cleanup_html()
# now parse the cleaned up ml into standard xhtml
while True:
r = self.parseml()
if not r:
break
text, tag = r
if text:
if not skip:
htmlstr += text
if tag:
ttype, tname, tattr = self.parsetag(tag)
# If we run into a DTD or xml declarations inside the body ... bail.
if (
tname in SPECIAL_HANDLING_TAGS.keys()
and tname != "comment"
and body_done
):
htmlstr += "\n</body></html>"
break
# make sure self-closing tags actually self-close
if ttype == "begin" and tname in SELF_CLOSING_TAGS:
ttype = "single"
# make sure any end tags of self-closing tags are discarded
if ttype == "end" and tname in SELF_CLOSING_TAGS:
continue
# remove embedded guide and references from old mobis
if tname in ("guide", "ncx", "reference") and ttype in (
"begin",
"single",
"single_ext",
):
tname = "removeme:{0}".format(tname)
tattr = None
if (
tname in ("guide", "ncx", "reference", "font", "span")
and ttype == "end"
):
if self.path[-1] == "removeme:{0}".format(tname):
tname = "removeme:{0}".format(tname)
tattr = None
# Get rid of font tags that only have a color attribute.
if tname == "font" and ttype in ("begin", "single", "single_ext"):
if "color" in tattr.keys() and len(tattr.keys()) == 1:
tname = "removeme:{0}".format(tname)
tattr = None
# Get rid of empty spans in the markup.
if (
tname == "span"
and ttype in ("begin", "single", "single_ext")
and not len(tattr)
):
tname = "removeme:{0}".format(tname)
# need to handle fonts outside of the normal methods
# so fonts tags won't be added to the self.path since we keep track
# of font tags separately with self.font_history
if tname == "font" and ttype == "begin":
# check for nested font start tags
if len(self.font_history) > 0:
# inject a font end tag
taginfo = ("end", "font", None)
htmlstr += self.processtag(taginfo)
self.font_history.append((ttype, tname, tattr))
# handle the current font start tag
taginfo = (ttype, tname, tattr)
htmlstr += self.processtag(taginfo)
continue
# check for nested font tags and unnest them
if tname == "font" and ttype == "end":
self.font_history.pop()
# handle this font end tag
taginfo = ("end", "font", None)
htmlstr += self.processtag(taginfo)
# check if we were nested
if len(self.font_history) > 0:
# inject a copy of the most recent font start tag from history
taginfo = self.font_history[-1]
htmlstr += self.processtag(taginfo)
continue
# keep track of nesting path
if ttype == "begin":
self.path.append(tname)
elif ttype == "end":
if tname != self.path[-1]:
print ("improper nesting: ", self.path, tname, ttype)
if tname not in self.path:
# handle case of end tag with no beginning by injecting empty begin tag
taginfo = ("begin", tname, None)
htmlstr += self.processtag(taginfo)
print " - fixed by injecting empty start tag ", tname
self.path.append(tname)
elif len(self.path) > 1 and tname == self.path[-2]:
# handle case of dangling missing end
taginfo = ("end", self.path[-1], None)
htmlstr += self.processtag(taginfo)
print " - fixed by injecting end tag ", self.path[-1]
self.path.pop()
self.path.pop()
if tname == "removeme:{0}".format(tname):
if ttype in ("begin", "single", "single_ext"):
skip = True
else:
skip = False
else:
taginfo = (ttype, tname, tattr)
htmlstr += self.processtag(taginfo)
# handle potential issue of multiple html, head, and body sections
if tname == "html" and ttype == "begin" and not html_done:
htmlstr += "\n"
html_done = True
if tname == "head" and ttype == "begin" and not head_done:
htmlstr += "\n"
# also add in metadata and style link tags
htmlstr += self.meta
htmlstr += (
'<link href="styles.css" rel="stylesheet" type="text/css" />\n'
)
head_done = True
if tname == "body" and ttype == "begin" and not body_done:
htmlstr += "\n"
body_done = True
# handle issue of possibly missing html, head, and body tags
# I have not seen this but the original did something like this so ...
if not body_done:
htmlstr = "<body>\n" + htmlstr + "</body>\n"
if not head_done:
headstr = "<head>\n"
headstr += self.meta
headstr += '<link href="styles.css" rel="stylesheet" type="text/css" />\n'
headstr += "</head>\n"
htmlstr = headstr + htmlstr
if not html_done:
htmlstr = "<html>\n" + htmlstr + "</html>\n"
# finally add DOCTYPE info
htmlstr = (
'<?xml version="1.0"?>\n<!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">\n'
+ htmlstr
)
css = self.base_css_rules
for cls, rule in self.tag_css_rules.items():
css += ".%s { %s }\n" % (cls, rule)
return (htmlstr, css, self.cssname)
def ensure_unit(self, raw, unit="px"):
if re.search(r"\d+$", raw) is not None:
raw += unit
return raw
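# e.g. ensure_unit("12") -> "12px", while ensure_unit("1.5em") and
# ensure_unit("50%") pass through unchanged.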
# flatten possibly modified tag back to string
def taginfo_tostring(self, taginfo):
(ttype, tname, tattr) = taginfo
if ttype is None or tname is None:
return ""
if ttype == "end":
return "</%s>" % tname
if (
ttype in SPECIAL_HANDLING_TYPES
and tattr is not None
and "special" in tattr.keys()
):
info = tattr["special"]
if ttype == "comment":
return "<%s %s-->" % tname, info
else:
return "<%s %s>" % tname, info
res = []
res.append("<%s" % tname)
if tattr is not None:
for key in tattr.keys():
res.append(' %s="%s"' % (key, tattr[key]))
if ttype == "single":
res.append("/>")
elif ttype == "single_ext":
res.append(" />")
else:
res.append(">")
return "".join(res)
# routines to convert mobi ml tag attributes to xhtml attributes and styles
def processtag(self, taginfo):
# Converting mobi font sizes to numerics
size_map = {
"xx-small": "1",
"x-small": "2",
"small": "3",
"medium": "4",
"large": "5",
"x-large": "6",
"xx-large": "7",
}
size_to_em_map = {
"1": ".65em",
"2": ".75em",
"3": "1em",
"4": "1.125em",
"5": "1.25em",
"6": "1.5em",
"7": "2em",
}
# current tag to work on
(ttype, tname, tattr) = taginfo
if not tattr:
tattr = {}
styles = []
if tname is None or tname.startswith("removeme"):
return ""
# have not seen an example of this yet so keep it here to be safe
# until this is better understood
if tname in (
"country-region",
"place",
"placetype",
"placename",
"state",
"city",
"street",
"address",
"content",
):
tname = "div" if tname == "content" else "span"
tattr.clear()  # drop all attributes (avoids mutating the dict while iterating over it)
# handle general case of style, height, width, bgcolor in any tag
if "style" in tattr.keys():
style = tattr.pop("style").strip()
if style:
styles.append(style)
if "align" in tattr.keys():
align = tattr.pop("align").strip()
if align:
if tname in ("table", "td", "tr"):
pass
else:
styles.append("text-align: %s" % align)
if "height" in tattr.keys():
height = tattr.pop("height").strip()
if (
height
and "<" not in height
and ">" not in height
and re.search(r"\d+", height)
):
if tname in ("table", "td", "tr"):
pass
elif tname == "img":
tattr["height"] = height
else:
styles.append("margin-top: %s" % self.ensure_unit(height))
if "width" in tattr.keys():
width = tattr.pop("width").strip()
if width and re.search(r"\d+", width):
if tname in ("table", "td", "tr"):
pass
elif tname == "img":
tattr["width"] = width
else:
styles.append("text-indent: %s" % self.ensure_unit(width))
if width.startswith("-"):
styles.append("margin-left: %s" % self.ensure_unit(width[1:]))
if "bgcolor" in tattr.keys():
# no proprietary html allowed
if tname == "div":
del tattr["bgcolor"]
elif tname == "font":
# Change font tags to span tags
tname = "span"
if ttype in ("begin", "single", "single_ext"):
# move the face attribute to css font-family
if "face" in tattr.keys():
face = tattr.pop("face").strip()
styles.append('font-family: "%s"' % face)
# Monitor the constantly changing font sizes, change them to ems and move
# them to css. The following will work for 'flat' font tags, but nested font tags
# will cause things to go wonky. Need to revert to the parent font tag's size
# when a closing tag is encountered.
if "size" in tattr.keys():
sz = tattr.pop("size").strip().lower()
try:
float(sz)
except ValueError:
if sz in size_map.keys():
sz = size_map[sz]
else:
if sz.startswith("-") or sz.startswith("+"):
sz = self.current_font_size + float(sz)
if sz > 7:
sz = 7
elif sz < 1:
sz = 1
sz = str(int(sz))
styles.append("font-size: %s" % size_to_em_map[sz])
self.current_font_size = int(sz)
elif tname == "img":
for attr in ("width", "height"):
if attr in tattr:
val = tattr[attr]
if val.lower().endswith("em"):
try:
nval = float(val[:-2])
nval *= 16 * (
168.451 / 72
) # Assume this was set using the Kindle profile
tattr[attr] = "%dpx" % int(nval)
except ValueError:
del tattr[attr]
elif val.lower().endswith("%"):
del tattr[attr]
# convert the anchor tags
if "filepos-id" in tattr:
tattr["id"] = tattr.pop("filepos-id")
if "name" in tattr and tattr["name"] != tattr["id"]:
tattr["name"] = tattr["id"]
if "filepos" in tattr:
filepos = tattr.pop("filepos")
try:
tattr["href"] = "#filepos%d" % int(filepos)
except ValueError:
pass
if styles:
ncls = None
rule = "; ".join(styles)
for sel, srule in self.tag_css_rules.items():
if srule == rule:
ncls = sel
break
if ncls is None:
self.tag_css_rule_cnt += 1
ncls = "rule_%d" % self.tag_css_rule_cnt
self.tag_css_rules[ncls] = rule
cls = tattr.get("class", "")
cls = cls + (" " if cls else "") + ncls
tattr["class"] = cls
# convert updated tag back to string representation
if len(tattr) == 0:
tattr = None
taginfo = (ttype, tname, tattr)
return self.taginfo_tostring(taginfo)
""" main only left in for testing outside of plugin """
def main(argv=sys.argv):
if len(argv) != 2:
return 1
else:
infile = argv[1]
try:
print "Converting Mobi Markup Language to XHTML"
mlc = MobiMLConverter(infile)
print "Processing ..."
htmlstr, css, cssname = mlc.processml()
outname = infile.rsplit(".", 1)[0] + "_converted.html"
file(outname, "wb").write(htmlstr)
file(cssname, "wb").write(css)
print "Completed"
print "XHTML version of book can be found at: " + outname
except ValueError, e:
print "Error: %s" % e
return 1
return 0
if __name__ == "__main__":
sys.exit(main())

103
mobimaster/mobi/unipath.py Executable file
View File

@@ -0,0 +1,103 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
# Copyright (c) 2014 Kevin B. Hendricks, John Schember, and Doug Massay
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without modification,
# are permitted provided that the following conditions are met:
#
# 1. Redistributions of source code must retain the above copyright notice, this list of
# conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright notice, this list
# of conditions and the following disclaimer in the documentation and/or other materials
# provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
# SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
# TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
# OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY
# WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
from __future__ import unicode_literals, division, absolute_import, print_function
from .compatibility_utils import PY2, text_type, binary_type
import sys
import os
# utility routines to convert all paths to be full unicode
# Under Python 2, if a bytestring, try to convert it to unicode using sys.getfilesystemencoding
# Under Python 3, if bytes, try to convert it to unicode using the filesystem encoding (as os.fsdecode() does)
# Mac OS X and Windows will happily support full unicode paths
# Linux can support full unicode paths but allows arbitrary byte paths which may be inconsistent with unicode
fsencoding = sys.getfilesystemencoding()
def pathof(s, enc=fsencoding):
if s is None:
return None
if isinstance(s, text_type):
return s
if isinstance(s, binary_type):
try:
return s.decode(enc)
except:
pass
return s
def exists(s):
return os.path.exists(pathof(s))
def isfile(s):
return os.path.isfile(pathof(s))
def isdir(s):
return os.path.isdir(pathof(s))
def mkdir(s):
return os.mkdir(pathof(s))
def listdir(s):
rv = []
for name in os.listdir(pathof(s)):
rv.append(pathof(name))
return rv
def getcwd():
if PY2:
return os.getcwdu()
return os.getcwd()
def walk(top):
top = pathof(top)
rv = []
for base, dnames, names in os.walk(top):
base = pathof(base)
for name in names:
name = pathof(name)
rv.append(relpath(os.path.join(base, name), top))
return rv
def relpath(path, start=None):
return os.path.relpath(pathof(path), pathof(start))
def abspath(path):
return os.path.abspath(pathof(path))
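# Example (a sketch; assumes a UTF-8 filesystem encoding): pathof(b"caf\xc3\xa9")
# returns "café", while text input such as pathof("café") passes through as-is.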

View File

@@ -0,0 +1,175 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim:ts=4:sw=4:softtabstop=4:smarttab:expandtab
from __future__ import unicode_literals, division, absolute_import, print_function
from .compatibility_utils import text_type
from . import unipath
from .unipath import pathof
DUMP = False
""" Set to True to dump all possible information. """
import os
import re
# note: re requires the pattern to be the exact same type as the data to be searched in python3,
# but u"" is not allowed for the pattern itself, only b""
import zipfile
import binascii
from .mobi_utils import mangle_fonts
class unpackException(Exception):
pass
class ZipInfo(zipfile.ZipInfo):
def __init__(self, *args, **kwargs):
compress_type = kwargs.pop("compress_type", None)
super(ZipInfo, self).__init__(*args, **kwargs)
if compress_type is not None:
self.compress_type = compress_type
class fileNames:
def __init__(self, infile, outdir):
self.infile = infile
self.outdir = outdir
if not unipath.exists(self.outdir):
unipath.mkdir(self.outdir)
self.mobi7dir = os.path.join(self.outdir, "mobi7")
if not unipath.exists(self.mobi7dir):
unipath.mkdir(self.mobi7dir)
self.imgdir = os.path.join(self.mobi7dir, "Images")
if not unipath.exists(self.imgdir):
unipath.mkdir(self.imgdir)
self.hdimgdir = os.path.join(self.outdir, "HDImages")
if not unipath.exists(self.hdimgdir):
unipath.mkdir(self.hdimgdir)
self.outbase = os.path.join(
self.outdir, os.path.splitext(os.path.split(infile)[1])[0]
)
def getInputFileBasename(self):
return os.path.splitext(os.path.basename(self.infile))[0]
def makeK8Struct(self):
self.k8dir = os.path.join(self.outdir, "mobi8")
if not unipath.exists(self.k8dir):
unipath.mkdir(self.k8dir)
self.k8metainf = os.path.join(self.k8dir, "META-INF")
if not unipath.exists(self.k8metainf):
unipath.mkdir(self.k8metainf)
self.k8oebps = os.path.join(self.k8dir, "OEBPS")
if not unipath.exists(self.k8oebps):
unipath.mkdir(self.k8oebps)
self.k8images = os.path.join(self.k8oebps, "Images")
if not unipath.exists(self.k8images):
unipath.mkdir(self.k8images)
self.k8fonts = os.path.join(self.k8oebps, "Fonts")
if not unipath.exists(self.k8fonts):
unipath.mkdir(self.k8fonts)
self.k8styles = os.path.join(self.k8oebps, "Styles")
if not unipath.exists(self.k8styles):
unipath.mkdir(self.k8styles)
self.k8text = os.path.join(self.k8oebps, "Text")
if not unipath.exists(self.k8text):
unipath.mkdir(self.k8text)
# recursive zip creation support routine
def zipUpDir(self, myzip, tdir, localname):
currentdir = tdir
if localname != "":
currentdir = os.path.join(currentdir, localname)
entries = unipath.listdir(currentdir)
for entry in entries:
localfilePath = os.path.join(localname, entry)
realfilePath = os.path.join(currentdir, entry)
if unipath.isfile(realfilePath):
myzip.write(
pathof(realfilePath), pathof(localfilePath), zipfile.ZIP_DEFLATED
)
elif unipath.isdir(realfilePath):
self.zipUpDir(myzip, tdir, localfilePath)
def makeEPUB(self, usedmap, obfuscate_data, uid):
bname = os.path.join(self.k8dir, self.getInputFileBasename() + ".epub")
# Create an encryption key for Adobe font obfuscation
# based on the epub's uid
if isinstance(uid, text_type):
uid = uid.encode("ascii")
if obfuscate_data:
key = re.sub(br"[^a-fA-F0-9]", b"", uid)
key = binascii.unhexlify((key + key)[:32])
# copy over all images and fonts that are actually used in the ebook
# and remove all font files from mobi7 since not supported
imgnames = unipath.listdir(self.imgdir)
for name in imgnames:
if usedmap.get(name, "not used") == "used":
filein = os.path.join(self.imgdir, name)
if name.endswith(".ttf"):
fileout = os.path.join(self.k8fonts, name)
elif name.endswith(".otf"):
fileout = os.path.join(self.k8fonts, name)
elif name.endswith(".failed"):
fileout = os.path.join(self.k8fonts, name)
else:
fileout = os.path.join(self.k8images, name)
data = b""
with open(pathof(filein), "rb") as f:
data = f.read()
if obfuscate_data:
if name in obfuscate_data:
data = mangle_fonts(key, data)
open(pathof(fileout), "wb").write(data)
if name.endswith(".ttf") or name.endswith(".otf"):
os.remove(pathof(filein))
# opf file name hard coded to "content.opf"
container = '<?xml version="1.0" encoding="UTF-8"?>\n'
container += '<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">\n'
container += " <rootfiles>\n"
container += '    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>\n'
container += " </rootfiles>\n</container>\n"
fileout = os.path.join(self.k8metainf, "container.xml")
with open(pathof(fileout), "wb") as f:
f.write(container.encode("utf-8"))
if obfuscate_data:
encryption = '<encryption xmlns="urn:oasis:names:tc:opendocument:xmlns:container" \
xmlns:enc="http://www.w3.org/2001/04/xmlenc#" xmlns:deenc="http://ns.adobe.com/digitaleditions/enc">\n'
for font in obfuscate_data:
encryption += " <enc:EncryptedData>\n"
encryption += ' <enc:EncryptionMethod Algorithm="http://ns.adobe.com/pdf/enc#RC"/>\n'
encryption += " <enc:CipherData>\n"
encryption += (
' <enc:CipherReference URI="OEBPS/Fonts/' + font + '"/>\n'
)
encryption += " </enc:CipherData>\n"
encryption += " </enc:EncryptedData>\n"
encryption += "</encryption>\n"
fileout = os.path.join(self.k8metainf, "encryption.xml")
with open(pathof(fileout), "wb") as f:
f.write(encryption.encode("utf-8"))
# ready to build epub
self.outzip = zipfile.ZipFile(pathof(bname), "w")
# add the mimetype file uncompressed
mimetype = b"application/epub+zip"
fileout = os.path.join(self.k8dir, "mimetype")
with open(pathof(fileout), "wb") as f:
f.write(mimetype)
nzinfo = ZipInfo("mimetype", compress_type=zipfile.ZIP_STORED)
nzinfo.external_attr = 0o600 << 16 # make this a normal file
self.outzip.writestr(nzinfo, mimetype)
self.zipUpDir(self.outzip, self.k8dir, "META-INF")
self.zipUpDir(self.outzip, self.k8dir, "OEBPS")
self.outzip.close()
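# Illustrative driver sketch (names are hypothetical; the real caller is the
# unpack code elsewhere in this package):
#   files = fileNames("book.mobi", "outdir")
#   files.makeK8Struct()
#   ... unpack content into files.k8oebps / files.imgdir ...
#   files.makeEPUB(usedmap, obfuscate_data, uid)
# where usedmap maps image names to "used", obfuscate_data lists fonts to
# obfuscate (may be empty), and uid is the epub's unique identifier.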

1
mobimaster/mobi/x Normal file
View File

@@ -0,0 +1 @@
]0;IPython: mobimaster/mobi

448
mobimaster/poetry.lock generated Executable file
View File

@@ -0,0 +1,448 @@
[[package]]
category = "main"
description = "Asyncio support for PEP-567 contextvars backport."
marker = "python_version < \"3.7\""
name = "aiocontextvars"
optional = false
python-versions = ">=3.5"
version = "0.2.2"
[package.dependencies]
[package.dependencies.contextvars]
python = "<3.7"
version = "2.4"
[[package]]
category = "dev"
description = "A small Python module for determining appropriate platform-specific dirs, e.g. a \"user data dir\"."
marker = "python_version >= \"3.6\" and python_version < \"4.0\""
name = "appdirs"
optional = false
python-versions = "*"
version = "1.4.4"
[[package]]
category = "dev"
description = "Atomic file writes."
marker = "sys_platform == \"win32\""
name = "atomicwrites"
optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*"
version = "1.4.0"
[[package]]
category = "dev"
description = "Classes Without Boilerplate"
name = "attrs"
optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*"
version = "19.3.0"
[package.extras]
azure-pipelines = ["coverage", "hypothesis", "pympler", "pytest (>=4.3.0)", "six", "zope.interface", "pytest-azurepipelines"]
dev = ["coverage", "hypothesis", "pympler", "pytest (>=4.3.0)", "six", "zope.interface", "sphinx", "pre-commit"]
docs = ["sphinx", "zope.interface"]
tests = ["coverage", "hypothesis", "pympler", "pytest (>=4.3.0)", "six", "zope.interface"]
[[package]]
category = "dev"
description = "The uncompromising code formatter."
marker = "python_version >= \"3.6\" and python_version < \"4.0\""
name = "black"
optional = false
python-versions = ">=3.6"
version = "19.10b0"
[package.dependencies]
appdirs = "*"
attrs = ">=18.1.0"
click = ">=6.5"
pathspec = ">=0.6,<1"
regex = "*"
toml = ">=0.9.4"
typed-ast = ">=1.4.0"
[package.extras]
d = ["aiohttp (>=3.3.2)", "aiohttp-cors"]
[[package]]
category = "dev"
description = "Composable command line interface toolkit"
marker = "python_version >= \"3.6\" and python_version < \"4.0\""
name = "click"
optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*"
version = "7.1.2"
[[package]]
category = "main"
description = "Cross-platform colored terminal text."
marker = "sys_platform == \"win32\""
name = "colorama"
optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*"
version = "0.4.3"
[[package]]
category = "main"
description = "PEP 567 Backport"
marker = "python_version < \"3.7\""
name = "contextvars"
optional = false
python-versions = "*"
version = "2.4"
[package.dependencies]
immutables = ">=0.9"
[[package]]
category = "main"
description = "Immutable Collections"
marker = "python_version < \"3.7\""
name = "immutables"
optional = false
python-versions = ">=3.5"
version = "0.14"
[[package]]
category = "dev"
description = "Read metadata from Python packages"
marker = "python_version < \"3.8\""
name = "importlib-metadata"
optional = false
python-versions = "!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,>=2.7"
version = "1.7.0"
[package.dependencies]
zipp = ">=0.5"
[package.extras]
docs = ["sphinx", "rst.linker"]
testing = ["packaging", "pep517", "importlib-resources (>=1.3)"]
[[package]]
category = "main"
description = "Python logging made (stupidly) simple"
name = "loguru"
optional = false
python-versions = ">=3.5"
version = "0.4.1"
[package.dependencies]
colorama = ">=0.3.4"
win32-setctime = ">=1.0.0"
[package.dependencies.aiocontextvars]
python = "<3.7"
version = ">=0.2.0"
[package.extras]
dev = ["codecov (>=2.0.15)", "colorama (>=0.3.4)", "flake8 (>=3.7.7)", "isort (>=4.3.20)", "tox (>=3.9.0)", "tox-travis (>=0.12)", "pytest (>=4.6.2)", "pytest-cov (>=2.7.1)", "Sphinx (>=2.2.1)", "sphinx-autobuild (>=0.7.1)", "sphinx-rtd-theme (>=0.4.3)", "black (>=19.3b0)"]
[[package]]
category = "dev"
description = "More routines for operating on iterables, beyond itertools"
name = "more-itertools"
optional = false
python-versions = ">=3.5"
version = "8.4.0"
[[package]]
category = "dev"
description = "Core utilities for Python packages"
name = "packaging"
optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*"
version = "20.4"
[package.dependencies]
pyparsing = ">=2.0.2"
six = "*"
[[package]]
category = "dev"
description = "Utility library for gitignore style pattern matching of file paths."
marker = "python_version >= \"3.6\" and python_version < \"4.0\""
name = "pathspec"
optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*"
version = "0.8.0"
[[package]]
category = "dev"
description = "plugin and hook calling mechanisms for python"
name = "pluggy"
optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*"
version = "0.13.1"
[package.dependencies]
[package.dependencies.importlib-metadata]
python = "<3.8"
version = ">=0.12"
[package.extras]
dev = ["pre-commit", "tox"]
[[package]]
category = "dev"
description = "library with cross-python path, ini-parsing, io, code, log facilities"
name = "py"
optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*"
version = "1.9.0"
[[package]]
category = "dev"
description = "Python parsing module"
name = "pyparsing"
optional = false
python-versions = ">=2.6, !=3.0.*, !=3.1.*, !=3.2.*"
version = "2.4.7"
[[package]]
category = "dev"
description = "pytest: simple powerful testing with Python"
name = "pytest"
optional = false
python-versions = ">=3.5"
version = "5.4.3"
[package.dependencies]
atomicwrites = ">=1.0"
attrs = ">=17.4.0"
colorama = "*"
more-itertools = ">=4.0.0"
packaging = "*"
pluggy = ">=0.12,<1.0"
py = ">=1.5.0"
wcwidth = "*"
[package.dependencies.importlib-metadata]
python = "<3.8"
version = ">=0.12"
[package.extras]
checkqa-mypy = ["mypy (v0.761)"]
testing = ["argcomplete", "hypothesis (>=3.56)", "mock", "nose", "requests", "xmlschema"]
[[package]]
category = "dev"
description = "Alternative regular expression module, to replace re."
marker = "python_version >= \"3.6\" and python_version < \"4.0\""
name = "regex"
optional = false
python-versions = "*"
version = "2020.6.8"
[[package]]
category = "dev"
description = "Python 2 and 3 compatibility utilities"
name = "six"
optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*"
version = "1.15.0"
[[package]]
category = "dev"
description = "Python Library for Tom's Obvious, Minimal Language"
marker = "python_version >= \"3.6\" and python_version < \"4.0\""
name = "toml"
optional = false
python-versions = "*"
version = "0.10.1"
[[package]]
category = "dev"
description = "a fork of Python 2 and 3 ast modules with type comment support"
marker = "python_version >= \"3.6\" and python_version < \"4.0\""
name = "typed-ast"
optional = false
python-versions = "*"
version = "1.4.1"
[[package]]
category = "dev"
description = "Measures the displayed width of unicode strings in a terminal"
name = "wcwidth"
optional = false
python-versions = "*"
version = "0.2.5"
[[package]]
category = "main"
description = "A small Python utility to set file creation time on Windows"
marker = "sys_platform == \"win32\""
name = "win32-setctime"
optional = false
python-versions = ">=3.5"
version = "1.0.1"
[package.extras]
dev = ["pytest (>=4.6.2)", "black (>=19.3b0)"]
[[package]]
category = "dev"
description = "Backport of pathlib-compatible object wrapper for zip files"
marker = "python_version < \"3.8\""
name = "zipp"
optional = false
python-versions = ">=3.6"
version = "3.1.0"
[package.extras]
docs = ["sphinx", "jaraco.packaging (>=3.2)", "rst.linker (>=1.9)"]
testing = ["jaraco.itertools", "func-timeout"]
[metadata]
content-hash = "f31f306055c16c638d3e431931ddcbc4392b5a636f2ae85ff42880ff6895cd4a"
python-versions = "^3.6"
[metadata.files]
aiocontextvars = [
{file = "aiocontextvars-0.2.2-py2.py3-none-any.whl", hash = "sha256:885daf8261818767d8f7cbd79f9d4482d118f024b6586ef6e67980236a27bfa3"},
{file = "aiocontextvars-0.2.2.tar.gz", hash = "sha256:f027372dc48641f683c559f247bd84962becaacdc9ba711d583c3871fb5652aa"},
]
appdirs = [
{file = "appdirs-1.4.4-py2.py3-none-any.whl", hash = "sha256:a841dacd6b99318a741b166adb07e19ee71a274450e68237b4650ca1055ab128"},
{file = "appdirs-1.4.4.tar.gz", hash = "sha256:7d5d0167b2b1ba821647616af46a749d1c653740dd0d2415100fe26e27afdf41"},
]
atomicwrites = [
{file = "atomicwrites-1.4.0-py2.py3-none-any.whl", hash = "sha256:6d1784dea7c0c8d4a5172b6c620f40b6e4cbfdf96d783691f2e1302a7b88e197"},
{file = "atomicwrites-1.4.0.tar.gz", hash = "sha256:ae70396ad1a434f9c7046fd2dd196fc04b12f9e91ffb859164193be8b6168a7a"},
]
attrs = [
{file = "attrs-19.3.0-py2.py3-none-any.whl", hash = "sha256:08a96c641c3a74e44eb59afb61a24f2cb9f4d7188748e76ba4bb5edfa3cb7d1c"},
{file = "attrs-19.3.0.tar.gz", hash = "sha256:f7b7ce16570fe9965acd6d30101a28f62fb4a7f9e926b3bbc9b61f8b04247e72"},
]
black = [
{file = "black-19.10b0-py36-none-any.whl", hash = "sha256:1b30e59be925fafc1ee4565e5e08abef6b03fe455102883820fe5ee2e4734e0b"},
{file = "black-19.10b0.tar.gz", hash = "sha256:c2edb73a08e9e0e6f65a0e6af18b059b8b1cdd5bef997d7a0b181df93dc81539"},
]
click = [
{file = "click-7.1.2-py2.py3-none-any.whl", hash = "sha256:dacca89f4bfadd5de3d7489b7c8a566eee0d3676333fbb50030263894c38c0dc"},
{file = "click-7.1.2.tar.gz", hash = "sha256:d2b5255c7c6349bc1bd1e59e08cd12acbbd63ce649f2588755783aa94dfb6b1a"},
]
colorama = [
{file = "colorama-0.4.3-py2.py3-none-any.whl", hash = "sha256:7d73d2a99753107a36ac6b455ee49046802e59d9d076ef8e47b61499fa29afff"},
{file = "colorama-0.4.3.tar.gz", hash = "sha256:e96da0d330793e2cb9485e9ddfd918d456036c7149416295932478192f4436a1"},
]
contextvars = [
{file = "contextvars-2.4.tar.gz", hash = "sha256:f38c908aaa59c14335eeea12abea5f443646216c4e29380d7bf34d2018e2c39e"},
]
immutables = [
{file = "immutables-0.14-cp35-cp35m-macosx_10_14_x86_64.whl", hash = "sha256:860666fab142401a5535bf65cbd607b46bc5ed25b9d1eb053ca8ed9a1a1a80d6"},
{file = "immutables-0.14-cp35-cp35m-manylinux1_x86_64.whl", hash = "sha256:ce01788878827c3f0331c254a4ad8d9721489a5e65cc43e19c80040b46e0d297"},
{file = "immutables-0.14-cp36-cp36m-macosx_10_14_x86_64.whl", hash = "sha256:8797eed4042f4626b0bc04d9cf134208918eb0c937a8193a2c66df5041e62d2e"},
{file = "immutables-0.14-cp36-cp36m-manylinux1_x86_64.whl", hash = "sha256:33ce2f977da7b5e0dddd93744862404bdb316ffe5853ec853e53141508fa2e6a"},
{file = "immutables-0.14-cp36-cp36m-win_amd64.whl", hash = "sha256:6c8eace4d98988c72bcb37c05e79aae756832738305ae9497670482a82db08bc"},
{file = "immutables-0.14-cp37-cp37m-macosx_10_14_x86_64.whl", hash = "sha256:ab6c18b7b2b2abc83e0edc57b0a38bf0915b271582a1eb8c7bed1c20398f8040"},
{file = "immutables-0.14-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:c099212fd6504513a50e7369fe281007c820cf9d7bb22a336486c63d77d6f0b2"},
{file = "immutables-0.14-cp37-cp37m-win_amd64.whl", hash = "sha256:714aedbdeba4439d91cb5e5735cb10631fc47a7a69ea9cc8ecbac90322d50a4a"},
{file = "immutables-0.14-cp38-cp38-macosx_10_14_x86_64.whl", hash = "sha256:1c11050c49e193a1ec9dda1747285333f6ba6a30bbeb2929000b9b1192097ec0"},
{file = "immutables-0.14-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:c453e12b95e1d6bb4909e8743f88b7f5c0c97b86a8bc0d73507091cb644e3c1e"},
{file = "immutables-0.14-cp38-cp38-win_amd64.whl", hash = "sha256:ef9da20ec0f1c5853b5c8f8e3d9e1e15b8d98c259de4b7515d789a606af8745e"},
{file = "immutables-0.14.tar.gz", hash = "sha256:a0a1cc238b678455145bae291d8426f732f5255537ed6a5b7645949704c70a78"},
]
importlib-metadata = [
{file = "importlib_metadata-1.7.0-py2.py3-none-any.whl", hash = "sha256:dc15b2969b4ce36305c51eebe62d418ac7791e9a157911d58bfb1f9ccd8e2070"},
{file = "importlib_metadata-1.7.0.tar.gz", hash = "sha256:90bb658cdbbf6d1735b6341ce708fc7024a3e14e99ffdc5783edea9f9b077f83"},
]
loguru = [
{file = "loguru-0.4.1-py3-none-any.whl", hash = "sha256:074b3caa6748452c1e4f2b302093c94b65d5a4c5a4d7743636b4121e06437b0e"},
{file = "loguru-0.4.1.tar.gz", hash = "sha256:a6101fd435ac89ba5205a105a26a6ede9e4ddbb4408a6e167852efca47806d11"},
]
more-itertools = [
{file = "more-itertools-8.4.0.tar.gz", hash = "sha256:68c70cc7167bdf5c7c9d8f6954a7837089c6a36bf565383919bb595efb8a17e5"},
{file = "more_itertools-8.4.0-py3-none-any.whl", hash = "sha256:b78134b2063dd214000685165d81c154522c3ee0a1c0d4d113c80361c234c5a2"},
]
packaging = [
{file = "packaging-20.4-py2.py3-none-any.whl", hash = "sha256:998416ba6962ae7fbd6596850b80e17859a5753ba17c32284f67bfff33784181"},
{file = "packaging-20.4.tar.gz", hash = "sha256:4357f74f47b9c12db93624a82154e9b120fa8293699949152b22065d556079f8"},
]
pathspec = [
{file = "pathspec-0.8.0-py2.py3-none-any.whl", hash = "sha256:7d91249d21749788d07a2d0f94147accd8f845507400749ea19c1ec9054a12b0"},
{file = "pathspec-0.8.0.tar.gz", hash = "sha256:da45173eb3a6f2a5a487efba21f050af2b41948be6ab52b6a1e3ff22bb8b7061"},
]
pluggy = [
{file = "pluggy-0.13.1-py2.py3-none-any.whl", hash = "sha256:966c145cd83c96502c3c3868f50408687b38434af77734af1e9ca461a4081d2d"},
{file = "pluggy-0.13.1.tar.gz", hash = "sha256:15b2acde666561e1298d71b523007ed7364de07029219b604cf808bfa1c765b0"},
]
py = [
{file = "py-1.9.0-py2.py3-none-any.whl", hash = "sha256:366389d1db726cd2fcfc79732e75410e5fe4d31db13692115529d34069a043c2"},
{file = "py-1.9.0.tar.gz", hash = "sha256:9ca6883ce56b4e8da7e79ac18787889fa5206c79dcc67fb065376cd2fe03f342"},
]
pyparsing = [
{file = "pyparsing-2.4.7-py2.py3-none-any.whl", hash = "sha256:ef9d7589ef3c200abe66653d3f1ab1033c3c419ae9b9bdb1240a85b024efc88b"},
{file = "pyparsing-2.4.7.tar.gz", hash = "sha256:c203ec8783bf771a155b207279b9bccb8dea02d8f0c9e5f8ead507bc3246ecc1"},
]
pytest = [
{file = "pytest-5.4.3-py3-none-any.whl", hash = "sha256:5c0db86b698e8f170ba4582a492248919255fcd4c79b1ee64ace34301fb589a1"},
{file = "pytest-5.4.3.tar.gz", hash = "sha256:7979331bfcba207414f5e1263b5a0f8f521d0f457318836a7355531ed1a4c7d8"},
]
regex = [
{file = "regex-2020.6.8-cp27-cp27m-win32.whl", hash = "sha256:fbff901c54c22425a5b809b914a3bfaf4b9570eee0e5ce8186ac71eb2025191c"},
{file = "regex-2020.6.8-cp27-cp27m-win_amd64.whl", hash = "sha256:112e34adf95e45158c597feea65d06a8124898bdeac975c9087fe71b572bd938"},
{file = "regex-2020.6.8-cp36-cp36m-manylinux1_i686.whl", hash = "sha256:92d8a043a4241a710c1cf7593f5577fbb832cf6c3a00ff3fc1ff2052aff5dd89"},
{file = "regex-2020.6.8-cp36-cp36m-manylinux1_x86_64.whl", hash = "sha256:bae83f2a56ab30d5353b47f9b2a33e4aac4de9401fb582b55c42b132a8ac3868"},
{file = "regex-2020.6.8-cp36-cp36m-manylinux2010_i686.whl", hash = "sha256:b2ba0f78b3ef375114856cbdaa30559914d081c416b431f2437f83ce4f8b7f2f"},
{file = "regex-2020.6.8-cp36-cp36m-manylinux2010_x86_64.whl", hash = "sha256:95fa7726d073c87141f7bbfb04c284901f8328e2d430eeb71b8ffdd5742a5ded"},
{file = "regex-2020.6.8-cp36-cp36m-win32.whl", hash = "sha256:e3cdc9423808f7e1bb9c2e0bdb1c9dc37b0607b30d646ff6faf0d4e41ee8fee3"},
{file = "regex-2020.6.8-cp36-cp36m-win_amd64.whl", hash = "sha256:c78e66a922de1c95a208e4ec02e2e5cf0bb83a36ceececc10a72841e53fbf2bd"},
{file = "regex-2020.6.8-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:08997a37b221a3e27d68ffb601e45abfb0093d39ee770e4257bd2f5115e8cb0a"},
{file = "regex-2020.6.8-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:2f6f211633ee8d3f7706953e9d3edc7ce63a1d6aad0be5dcee1ece127eea13ae"},
{file = "regex-2020.6.8-cp37-cp37m-manylinux2010_i686.whl", hash = "sha256:55b4c25cbb3b29f8d5e63aeed27b49fa0f8476b0d4e1b3171d85db891938cc3a"},
{file = "regex-2020.6.8-cp37-cp37m-manylinux2010_x86_64.whl", hash = "sha256:89cda1a5d3e33ec9e231ece7307afc101b5217523d55ef4dc7fb2abd6de71ba3"},
{file = "regex-2020.6.8-cp37-cp37m-win32.whl", hash = "sha256:690f858d9a94d903cf5cada62ce069b5d93b313d7d05456dbcd99420856562d9"},
{file = "regex-2020.6.8-cp37-cp37m-win_amd64.whl", hash = "sha256:1700419d8a18c26ff396b3b06ace315b5f2a6e780dad387e4c48717a12a22c29"},
{file = "regex-2020.6.8-cp38-cp38-manylinux1_i686.whl", hash = "sha256:654cb773b2792e50151f0e22be0f2b6e1c3a04c5328ff1d9d59c0398d37ef610"},
{file = "regex-2020.6.8-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:52e1b4bef02f4040b2fd547357a170fc1146e60ab310cdbdd098db86e929b387"},
{file = "regex-2020.6.8-cp38-cp38-manylinux2010_i686.whl", hash = "sha256:cf59bbf282b627130f5ba68b7fa3abdb96372b24b66bdf72a4920e8153fc7910"},
{file = "regex-2020.6.8-cp38-cp38-manylinux2010_x86_64.whl", hash = "sha256:5aaa5928b039ae440d775acea11d01e42ff26e1561c0ffcd3d805750973c6baf"},
{file = "regex-2020.6.8-cp38-cp38-win32.whl", hash = "sha256:97712e0d0af05febd8ab63d2ef0ab2d0cd9deddf4476f7aa153f76feef4b2754"},
{file = "regex-2020.6.8-cp38-cp38-win_amd64.whl", hash = "sha256:6ad8663c17db4c5ef438141f99e291c4d4edfeaacc0ce28b5bba2b0bf273d9b5"},
{file = "regex-2020.6.8.tar.gz", hash = "sha256:e9b64e609d37438f7d6e68c2546d2cb8062f3adb27e6336bc129b51be20773ac"},
]
six = [
{file = "six-1.15.0-py2.py3-none-any.whl", hash = "sha256:8b74bedcbbbaca38ff6d7491d76f2b06b3592611af620f8426e82dddb04a5ced"},
{file = "six-1.15.0.tar.gz", hash = "sha256:30639c035cdb23534cd4aa2dd52c3bf48f06e5f4a941509c8bafd8ce11080259"},
]
toml = [
{file = "toml-0.10.1-py2.py3-none-any.whl", hash = "sha256:bda89d5935c2eac546d648028b9901107a595863cb36bae0c73ac804a9b4ce88"},
{file = "toml-0.10.1.tar.gz", hash = "sha256:926b612be1e5ce0634a2ca03470f95169cf16f939018233a670519cb4ac58b0f"},
]
typed-ast = [
{file = "typed_ast-1.4.1-cp35-cp35m-manylinux1_i686.whl", hash = "sha256:73d785a950fc82dd2a25897d525d003f6378d1cb23ab305578394694202a58c3"},
{file = "typed_ast-1.4.1-cp35-cp35m-manylinux1_x86_64.whl", hash = "sha256:aaee9905aee35ba5905cfb3c62f3e83b3bec7b39413f0a7f19be4e547ea01ebb"},
{file = "typed_ast-1.4.1-cp35-cp35m-win32.whl", hash = "sha256:0c2c07682d61a629b68433afb159376e24e5b2fd4641d35424e462169c0a7919"},
{file = "typed_ast-1.4.1-cp35-cp35m-win_amd64.whl", hash = "sha256:4083861b0aa07990b619bd7ddc365eb7fa4b817e99cf5f8d9cf21a42780f6e01"},
{file = "typed_ast-1.4.1-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:269151951236b0f9a6f04015a9004084a5ab0d5f19b57de779f908621e7d8b75"},
{file = "typed_ast-1.4.1-cp36-cp36m-manylinux1_i686.whl", hash = "sha256:24995c843eb0ad11a4527b026b4dde3da70e1f2d8806c99b7b4a7cf491612652"},
{file = "typed_ast-1.4.1-cp36-cp36m-manylinux1_x86_64.whl", hash = "sha256:fe460b922ec15dd205595c9b5b99e2f056fd98ae8f9f56b888e7a17dc2b757e7"},
{file = "typed_ast-1.4.1-cp36-cp36m-win32.whl", hash = "sha256:4e3e5da80ccbebfff202a67bf900d081906c358ccc3d5e3c8aea42fdfdfd51c1"},
{file = "typed_ast-1.4.1-cp36-cp36m-win_amd64.whl", hash = "sha256:249862707802d40f7f29f6e1aad8d84b5aa9e44552d2cc17384b209f091276aa"},
{file = "typed_ast-1.4.1-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:8ce678dbaf790dbdb3eba24056d5364fb45944f33553dd5869b7580cdbb83614"},
{file = "typed_ast-1.4.1-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:c9e348e02e4d2b4a8b2eedb48210430658df6951fa484e59de33ff773fbd4b41"},
{file = "typed_ast-1.4.1-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:bcd3b13b56ea479b3650b82cabd6b5343a625b0ced5429e4ccad28a8973f301b"},
{file = "typed_ast-1.4.1-cp37-cp37m-win32.whl", hash = "sha256:d5d33e9e7af3b34a40dc05f498939f0ebf187f07c385fd58d591c533ad8562fe"},
{file = "typed_ast-1.4.1-cp37-cp37m-win_amd64.whl", hash = "sha256:0666aa36131496aed8f7be0410ff974562ab7eeac11ef351def9ea6fa28f6355"},
{file = "typed_ast-1.4.1-cp38-cp38-macosx_10_15_x86_64.whl", hash = "sha256:d205b1b46085271b4e15f670058ce182bd1199e56b317bf2ec004b6a44f911f6"},
{file = "typed_ast-1.4.1-cp38-cp38-manylinux1_i686.whl", hash = "sha256:6daac9731f172c2a22ade6ed0c00197ee7cc1221aa84cfdf9c31defeb059a907"},
{file = "typed_ast-1.4.1-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:498b0f36cc7054c1fead3d7fc59d2150f4d5c6c56ba7fb150c013fbc683a8d2d"},
{file = "typed_ast-1.4.1-cp38-cp38-win32.whl", hash = "sha256:715ff2f2df46121071622063fc7543d9b1fd19ebfc4f5c8895af64a77a8c852c"},
{file = "typed_ast-1.4.1-cp38-cp38-win_amd64.whl", hash = "sha256:fc0fea399acb12edbf8a628ba8d2312f583bdbdb3335635db062fa98cf71fca4"},
{file = "typed_ast-1.4.1-cp39-cp39-macosx_10_15_x86_64.whl", hash = "sha256:d43943ef777f9a1c42bf4e552ba23ac77a6351de620aa9acf64ad54933ad4d34"},
{file = "typed_ast-1.4.1.tar.gz", hash = "sha256:8c8aaad94455178e3187ab22c8b01a3837f8ee50e09cf31f1ba129eb293ec30b"},
]
wcwidth = [
{file = "wcwidth-0.2.5-py2.py3-none-any.whl", hash = "sha256:beb4802a9cebb9144e99086eff703a642a13d6a0052920003a230f3294bbe784"},
{file = "wcwidth-0.2.5.tar.gz", hash = "sha256:c4d647b99872929fdb7bdcaa4fbe7f01413ed3d98077df798530e5b04f116c83"},
]
win32-setctime = [
{file = "win32_setctime-1.0.1-py3-none-any.whl", hash = "sha256:568fd636c68350bcc54755213fe01966fe0a6c90b386c0776425944a0382abef"},
{file = "win32_setctime-1.0.1.tar.gz", hash = "sha256:b47e5023ec7f0b4962950902b15bc56464a380d869f59d27dbf9ab423b23e8f9"},
]
zipp = [
{file = "zipp-3.1.0-py3-none-any.whl", hash = "sha256:aa36550ff0c0b7ef7fa639055d797116ee891440eac1a56f378e2d3179e0320b"},
{file = "zipp-3.1.0.tar.gz", hash = "sha256:c599e4d75c98f6798c509911d08a22e6c021d074469042177c8c86fb92eefd96"},
]

29
mobimaster/pyproject.toml Executable file
View File

@@ -0,0 +1,29 @@
[tool.poetry]
name = "mobi"
version = "0.3.1"
description = "unpack unencrypted mobi files"
authors = ["Titusz Pan <tp@py7.de>"]
license = "GPL-3.0-only"
readme = "README.md"
homepage = "https://github.com/iscc/mobi"
repository = "https://github.com/iscc/mobi"
keywords = ["mobi", "mobipocket", "unpack", "extract", "text"]
classifiers = [
    "Development Status :: 4 - Beta",
]

[tool.poetry.scripts]
mobiunpack = 'mobi.kindleunpack:main'

[tool.poetry.dependencies]
python = "^3.6"
loguru = "^0.4"

[tool.poetry.dev-dependencies]
pytest = "^5"
black = { version = "^19.10b0", python = "^3.6" }

[build-system]
requires = ["poetry>=1.0.5"]
build-backend = "poetry.masonry.api"
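
Note on the [tool.poetry.scripts] table above: it maps the mobiunpack console command to mobi.kindleunpack:main, and the wrapper that `poetry install` generates does little more than import that function, call it, and exit with its return value. A minimal equivalent sketch, assuming the package is importable (mobi/kindleunpack.py itself is not shown in this diff):

import sys

from mobi import kindleunpack

if __name__ == "__main__":
    # Roughly what the generated `mobiunpack` entry-point wrapper does:
    # call the declared target and propagate its result as the exit code.
    sys.exit(kindleunpack.main())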

BIN
mobimaster/t/6jwyvnow.mobi Normal file

Binary file not shown.

BIN
mobimaster/t/FCIS00619.dat Normal file

Binary file not shown.

BIN
mobimaster/t/FLIS00618.dat Normal file

Binary file not shown.

BIN
mobimaster/t/header.dat Normal file

Binary file not shown.

BIN
mobimaster/t/header_K8.dat Normal file

Binary file not shown.

File diff suppressed because one or more lines are too long

[48 added binary image files: the diff viewer shows no filenames, only sizes, which range from 5.8 KiB to 83 KiB]

Some files were not shown because too many files have changed in this diff.