Python docx error
Hello. I would like to extract table data from many Word files.
When I run my script, I get errors like "file '%s' is not a Word file, content type is '%s'" or a "bad zip file" error.
It works for some files but not all of them, so I believe some of my input Word files are bad.
The problem is that I cannot tell which Word file causes the error.
I would really appreciate it if someone could tell me how to find the bad files.
- Labels:
- Error Message
- Python
Hi @seito
To find the file that is giving you the error, you can put a print(filepath) inside your for loop and, after running it, check which file paths were printed and which were not.
Here you can find some explanation and possible workarounds for this problem.
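The print-inside-the-loop idea can also be turned into a small helper that keeps going after a failure and collects every file that could not be opened. This is only a sketch: `find_unreadable` and its `opener` parameter are hypothetical names, and the opener is passed in so the helper does not depend on any particular library.

```python
def find_unreadable(paths, opener):
    """Try to open every path and return the ones that fail.

    `opener` is any callable that raises an exception on a bad file
    (hypothetical parameter; with python-docx it could be docx.Document).
    """
    failures = []
    for path in paths:
        try:
            opener(path)
        except Exception as exc:  # broad on purpose: we only want to report
            failures.append((path, f"{type(exc).__name__}: {exc}"))
    return failures
```

With python-docx installed, a call such as `find_unreadable(filepaths, docx.Document)` should return a list of (path, error message) pairs instead of stopping at the first bad file.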
Hi Felipe-san,
>To find the file that is giving you the error, you can put a print(filepath) inside your for loop and, after running it, check which file paths were printed and which were not.
This worked well; it helped me find some of the bad Word files.
But I am still trying to work out what is wrong with these files. They are already .docx, not .doc...
Anyway, thanks for the advice!
Hi @seito
Did you see this link? https://stackoverflow.com/questions/42748540/python-package-docx-cant-open-a-docx
These files were probably saved in a pre-Word 2007 format.
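One way to check the pre-2007 guess without opening each file in Word is to look at the file's bytes. The sketch below uses only the standard library and assumes three things: a legacy binary file starts with the OLE compound-file signature, a real .docx is a zip archive containing `[Content_Types].xml`, and that file is UTF-8 text that names the standard Word main-document content type. The function name `diagnose` and its return strings are made up for illustration.

```python
import zipfile

# Signature of an OLE compound file (used by pre-2007 .doc files, among others).
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Content type of the main part in a plain .docx package.
DOCX_MAIN_TYPE = (
    "application/vnd.openxmlformats-officedocument."
    "wordprocessingml.document.main+xml"
)

def diagnose(path):
    """Return a short guess at why a file named *.docx may fail to open."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head == OLE_MAGIC:
        return "OLE compound file, for example a legacy .doc"
    if not zipfile.is_zipfile(path):
        return "not a zip archive"
    with zipfile.ZipFile(path) as zf:
        if "[Content_Types].xml" not in zf.namelist():
            return "zip archive without [Content_Types].xml"
        types = zf.read("[Content_Types].xml").decode("utf-8", errors="replace")
    if DOCX_MAIN_TYPE not in types:
        return "zip package whose main part is not a plain Word document"
    return "looks like a docx package"
```

The first two results would line up with a "bad zip file" error, and the fourth with the "content type" message, which can show up for macro-enabled documents or templates that carry a .docx name.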
Hi Felipe-san,
I saw it. However, I picked up only .docx files in the first place, using the directory tool (*.docx*).
So I thought this was a different problem... Did I miss something here?
If you are filtering on '.docx', I agree with you.
You could try opening some of these files and saving them as .docx again, just to see whether the error still happens. It would not solve the problem, but it would at least indicate that it is related to the way the file was saved or generated.
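As a side check on the filter itself, a wildcard such as `*.docx*` can match more than true .docx files, assuming the directory tool uses shell-style wildcards. It would also pick up names like `notes.docx.bak`, and it keeps the small `~$...docx` lock files that Word creates while a document is open, which are not zip packages. A stricter filter could look like the sketch below; `is_candidate_docx` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase
from pathlib import PurePath

def is_candidate_docx(name):
    """Accept only names whose final extension is .docx and that are not
    Word lock files (which start with '~$')."""
    p = PurePath(name)
    return p.suffix.lower() == ".docx" and not p.name.startswith("~$")

names = ["report.docx", "~$report.docx", "notes.docx.bak", "macro.docxm"]
# Everything in this list matches the loose wildcard...
loose = [n for n in names if fnmatchcase(n, "*.docx*")]
# ...but only the real candidate survives the stricter check.
strict = [n for n in names if is_candidate_docx(n)]
```

Here `loose` keeps all four names, while `strict` keeps only `report.docx`.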
