BeautifulSoup: A Python Module for Processing HTML Files

Permalink: https://dev.kermsite.com/p/beautifulsoup/
License: CC BY 4.0 CN (http://creativecommons.org/licenses/by/4.0/deed.zh)

2022-03-14 Study Notes Python

Usage

pip install beautifulsoup4
pip install lxml  # parser dependency

from bs4 import BeautifulSoup

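# Parse the raw data with the lxml parser (rawdata holds the HTML source).
soup = BeautifulSoup(rawdata, 'lxml')
# Select every <a> element whose title attribute is "RSS Feed"; select() returns a list.
link = soup.select('a[title="RSS Feed"]')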
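As a self-contained illustration of the selector call above, here is a minimal sketch. The sample markup, the `hrefs` list, and the use of `select_one()` are assumptions added for illustration and are not part of the original notes. It uses the built-in `html.parser` so that it runs without lxml installed; `'lxml'` can be swapped in.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for rawdata (not from the original post).
rawdata = """
<nav>
  <a title="RSS Feed" href="/index.xml">RSS</a>
  <a title="About" href="/about/">About</a>
</nav>
"""

# html.parser ships with Python; pass 'lxml' instead if it is installed.
soup = BeautifulSoup(rawdata, "html.parser")

# select() takes a CSS selector and returns a list-like ResultSet of matches.
links = soup.select('a[title="RSS Feed"]')
hrefs = [a["href"] for a in links]

# select_one() returns the first match directly, or None when nothing matches.
first = soup.select_one('a[title="RSS Feed"]')
missing = soup.select_one('a[title="No Such Title"]')

print(hrefs, first.get_text(), missing)
```

Because `select()` always returns a list, even a single match has to be indexed or iterated; `select_one()` is the more convenient call when only one element is expected.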

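# Attribute-style access (.p) returns only the first <p> element of the parsed document; .contents lists that element's children.
# entry is assumed to be a dict-like item whose 'summary' field holds an HTML string.
p = BeautifulSoup(entry['summary'], 'lxml').p
p.contents[0]  # first child of that <p>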
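The difference between the first `<p>` and all `<p>` elements is easy to miss, so here is a small sketch of both. The `summary` string is a hypothetical stand-in for `entry['summary']`, and `find_all()` is added here for comparison; neither appears in the original notes. The built-in `html.parser` is used so the sketch runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for entry['summary'].
summary = "<p>First paragraph with <b>bold</b> text.</p><p>Second paragraph.</p>"

soup = BeautifulSoup(summary, "html.parser")

# .p is shorthand for find('p'): only the first <p> is returned.
first_p = soup.p

# .contents is a list of direct children: text nodes and nested tags.
children = first_p.contents
first_child = children[0]  # the text node before the <b> tag

# find_all() is the call to use when every <p> is wanted.
all_text = [p.get_text() for p in soup.find_all("p")]

print(repr(str(first_child)), len(children), all_text)
```

Note that `contents[0]` is only the leading text node when the paragraph contains nested tags; `get_text()` is the safer choice if the whole paragraph text is needed.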

