python - How to find all comments with Beautiful Soup


This question was asked 4 years ago, but the answer is out of date for BS4.

I want to delete all comments in my HTML file using Beautiful Soup. Since BS4 makes each comment a special type of navigable string, I thought this code would work:

for comments in soup.find_all('comment'):
    comments.decompose()

That didn't work. How do I find all comments using BS4?

You can pass a function to find_all() that checks whether each string is a Comment.

For example, given the HTML below:

<body>
   <!-- branding , main navigation -->
   <div class="branding">the science &amp; safety behind favorite products</div>
   <div class="l-branding">
      <p>just brand</p>
   </div>
   <!-- test comment here -->
   <div class="block_content">
      <a href="https://www.google.com">google</a>
   </div>
</body>

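Code:

from bs4 import BeautifulSoup as bs
from bs4 import Comment
# .... (html holds the markup shown above)
soup = bs(html, 'html.parser')
# Keep only the strings that are Comment instances
comments = soup.find_all(string=lambda text: isinstance(text, Comment))
for c in comments:
    print(c)
    print("===========")
    # extract() detaches the comment from the tree; it is available on
    # strings as well as tags
    c.extract()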

The output will be:

branding , main navigation
===========
test comment here
===========
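For reuse, the same idea can be wrapped in a small function. This is only a minimal sketch: strip_comments is a hypothetical helper name that does not appear in the answer above, and it assumes a bs4 version recent enough to accept the string= argument of find_all().

```python
from bs4 import BeautifulSoup, Comment


def strip_comments(html: str) -> str:
    """Return the markup with every HTML comment removed (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # string=<callable> keeps only the strings for which the callable is true
    for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
        comment.extract()  # detach the comment node from the tree
    return str(soup)


sample = "<div><!-- note --><p>kept</p></div>"
cleaned = strip_comments(sample)
print(cleaned)
```

The comment nodes are collected into a list before any of them is removed, so the tree is not modified while find_all() is still walking it.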

BTW, I think the reason why find_all('comment') doesn't work is this (from the Beautiful Soup documentation):

Pass in a value for name and you’ll tell Beautiful Soup to only consider tags with certain names. Text strings will be ignored, as will tags whose names don’t match.
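A quick way to see the difference is to run both searches on the same markup. This is a sketch with made-up sample HTML: the <comment> element is an invented tag, included only to show what a name search actually matches.

```python
from bs4 import BeautifulSoup, Comment

# Made-up sample: one real HTML comment and one tag that happens to be named "comment"
html = "<p>text<!-- hidden --></p><comment>a real tag</comment>"
soup = BeautifulSoup(html, "html.parser")

# A positional argument is treated as a tag name, so strings are ignored
by_name = soup.find_all("comment")

# Filtering on string type is what finds the actual comment node
by_type = soup.find_all(string=lambda text: isinstance(text, Comment))

print(by_name)
print(by_type)
```

Here by_name contains only the <comment> element, while by_type contains only the comment string.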

