Ticket #1185 (closed defect: fixed)
HTML meta tag charset detection in AgaviFormPopulationFilter never matches strings without quotation marks
| Reported by: | david | Owned by: | david |
|---|---|---|---|
| Priority: | normal | Milestone: | 1.0.2 |
| Component: | filter | Version: | 1.0.1 |
| Severity: | normal | Keywords: | |
| Cc: | | Patch attached: | no |
Description
There is a problem in tags/1.0.2RC1/src/filter/AgaviFormPopulationFilter.class.php@4299#L215:
- text/html; charset="UTF-8" matches this pattern
- text/html; charset=UTF-8 never does, because the branch with the lookahead assertion (?=[;\s]) requires a following semicolon or whitespace character and therefore cannot match at the end of the subject string; it should probably be changed to something like ($|(?=[;\s]))
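The behaviour can be reproduced in isolation. The following is only a sketch: the two patterns are hypothetical stand-ins for the unquoted-value branch (the actual expression in AgaviFormPopulationFilter is not reproduced in this ticket), and it is written in Python rather than PHP, on the assumption that lookahead assertions and `$` behave the same way in PCRE for these inputs. The helper name `charset` is likewise made up for illustration.

```python
import re

# Hypothetical stand-ins for the unquoted-value branch only; the real
# pattern in AgaviFormPopulationFilter.class.php is not shown in the ticket.
BROKEN = re.compile(r'charset=([^;\s"]+)(?=[;\s])')
FIXED = re.compile(r'charset=([^;\s"]+)($|(?=[;\s]))')


def charset(pattern, content_type):
    """Return the captured charset name, or None if the pattern does not match."""
    match = pattern.search(content_type)
    return match.group(1) if match else None


# The lookahead needs a ';' or whitespace after the value, so a value that
# ends the subject string is rejected by the original branch.
print(charset(BROKEN, 'text/html; charset=UTF-8'))
print(charset(BROKEN, 'text/html; charset=UTF-8; foo=bar'))

# With the end-of-subject alternative added, both forms are accepted.
print(charset(FIXED, 'text/html; charset=UTF-8'))
print(charset(FIXED, 'text/html; charset=UTF-8; foo=bar'))
```

Under these assumptions, the first call yields no match while the other three capture UTF-8, which is consistent with the suggested ($|(?=[;\s])) change.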
It seems, however, that current versions of libxml always produce a document with the encoding property set, likely by looking at an HTML document's <meta http-equiv="Content-Type" ... /> element (which the above code also reads), even in XML parsing mode.
This ticket is related to #1183; I discovered this issue while working on a fix for that one.

