Merge lp://staging/~grishkin/chm2pdf/chm2pdf_branch into lp://staging/~reto-knaak/chm2pdf/chm2pdf_branch
Proposed by
Max Grishkin
Status: | Needs review |
---|---|
Proposed branch: | lp://staging/~grishkin/chm2pdf/chm2pdf_branch |
Merge into: | lp://staging/~reto-knaak/chm2pdf/chm2pdf_branch |
Diff against target: |
564 lines (+160/-94) 1 file modified
chm2pdf (+160/-94) |
To merge this branch: | bzr merge lp://staging/~grishkin/chm2pdf/chm2pdf_branch |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Reto Knaak | Pending | ||
Review via email: mp+128385@code.staging.launchpad.net |
Description of the change
Mainly fix from https:/
To post a comment you must log in.
Unmerged revisions
- 16. By Grishkin <grishkin@mint>
-
Fixed processing of links with spaces
- 15. By Grishkin <grishkin@mint>
-
merged several other Reto's patches
- 14. By Grishkin <grishkin@mint>
-
Merged Reto's pathes, fixed problems with spaces
- 13. By Grishkin <grishkin@mint>
-
Fixed processing of chm's without table of contents and fixed temp directory creation
Hi Grishkin (Max?) !
Thank you for the files... I'm not a real programmer (just tried to fix
some issues that where avoiding me to use the script) so I don't know if I
am the right person to make the code review.
It makes now a lot of monts I didn't boot up my virtual box with ubuntu and
I have the feeling I forgot most of what I learned trying to fix the script.
Anyway, this evening I had some time and began to download the files - just
to see what's going on.
My operating system is Win7, and if I open the CHM file from windows it
won't open, probably doe to the missinf toc!
Then I tryied to import it to calibre, and there if I open the CHM
something is displayed but it the "Liberty Bay" article from the online(!)
wikipedia.
If I convert the file to some other format (mobi), I get a page with "1951
Chicago Bears season" which seems me the right output.
So I'm not sure the demo chm file has a valid output, but I agree that it's
a good idea to try to extract what's there.
I'm not familiar with the code review process, and I am asking myself if
i/we ahould open a bug under ubuntu (that is what I did with the bugs i
found previously)?
I gave a quick glance at the diff and most are differences that are not list(cfile) and def get_objective_ urls_list( filename) .
really there (probably some spaces), and the only true differences are:
in def get_html_
For the first one, it's the first time I see "lambda" (so again, probably
I'm not the right one to review...).
I think I understood what it's meant for but I can't say I understand how
it works.... (if first way to retrieve the html files fail, use the second
one using all files found or something similar?)
For the second one, "my local chm2pdf" is like this:
*def get_objective_ urls_list( filename) : system( 'enum_chmLib '+filename+' > WORK_DIR+ '/urlslist. txt"') open(CHM2PDF_ WORK_DIR+ '/urlslist. txt','rU' ) ()[3:]:
#spline= line.split( )
#urls_ list.append( spline[ 5])
urls_ list.append( spline) CHM2PDF_ WORK_DIR+ '/urlslist. txt') /bugs.launchpad .net/ubuntu/ +source/ chm2pdf/ +bug/894193 ) append( spline[ 5]) will fail in case of paths
'''
takes the list of files inside the chm archive, with the correct urls
of each one.
'''
os.
"'+CHM2PDF_
flist=
urls_list=[]
for line in flist.readlines
#print 'line',line
#This won't work if internal paths of CHM contains spaces: e.g.
/doc space/ will only become /doc
#this should work better:
spline= re.sub(r".*?normal file\s*(.*?)\n$", "\\1", line)
if spline[0]=="/":
#print "got spline="+spline
flist.close()
# os.remove(
return urls_list
*
Does your solution work with chm paths containing spaces? (If you need a
sample file see
https:/
I have the feeling (not really run any scripts this evening and forgetting
pyton) that using urls_list.
with spaces!
I have also the feeling that my solution is not really state of the art, so
maybe you can suggest something that solves both problems?
Hope to hear you soon and Kind regards from the italian part of Switzerland!
Ciao
Reto Knaak
On Sun, Oct 7, 2012 at 5:23 PM, Grishkin <email address hidden> wrote:
> Grishkin has propose...