Parsing HTML fails with lxml.etree.XMLSyntaxError: Input is not proper UTF-8, indicate encoding !
from lxml import etree
doc = etree.parse('1.html')
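This raises lxml.etree.XMLSyntaxError: Input is not proper UTF-8, indicate encoding !
By default etree.parse uses the XML parser, which rejects input that is not valid UTF-8 when no encoding is declared. The fix is to pass an HTMLParser with an explicit encoding: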
par = etree.HTMLParser(encoding="utf-8")
doc = etree.parse('1.html', parser=par)
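If the file is not actually UTF-8 (for example, a page saved as GBK), the encoding passed to HTMLParser should probably match the real one. The following is a minimal sketch of one way to pick it. The helper names guess_encoding and parse_html and the candidate list ("utf-8", "gbk") are assumptions made for illustration and are not part of lxml. A successful decode is only a heuristic and does not prove the encoding is right.

```python
def guess_encoding(raw, candidates=("utf-8", "gbk")):
    """Return the first candidate encoding that decodes `raw` without error.

    Hypothetical helper: the candidate list is an assumption, so adjust it
    to the encodings you expect. Returns None if no candidate fits.
    """
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


def parse_html(path, candidates=("utf-8", "gbk")):
    """Parse an HTML file with an HTMLParser set to the guessed encoding."""
    # Imported here so guess_encoding stays usable without lxml installed.
    from lxml import etree

    with open(path, "rb") as f:
        raw = f.read()
    encoding = guess_encoding(raw, candidates)
    # encoding=None lets the parser fall back to its own detection.
    parser = etree.HTMLParser(encoding=encoding)
    return etree.parse(path, parser=parser)
```

Usage would look like doc = parse_html('1.html'). If the guess comes back wrong for a particular file, passing the known encoding directly to etree.HTMLParser is the simpler route.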