Is there a good way to remove HTML from a Java string?
This is actually very simple with Jsoup.
    import org.jsoup.Jsoup;

    // Parse the markup and return only its text content.
    public static String html2text(String html) {
        return Jsoup.parse(html).text();
    }
I think the simplest way to filter out the HTML tags is with a regular expression:
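    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // The pattern in the original post was lost (it came through as an empty string,
    // which removes nothing). "<.+?>" is an assumed stand-in: a non-greedy match of
    // anything between "<" and the next ">".
    private static final Pattern REMOVE_TAGS = Pattern.compile("<.+?>");

    public static String removeTags(String string) {
        if (string == null || string.length() == 0) {
            return string;
        }
        Matcher m = REMOVE_TAGS.matcher(string);
        return m.replaceAll("");
    }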
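For anyone comparing the two approaches, here is a minimal self-contained sketch of the regex variant with a few inputs. The class name `TagStripper` and the pattern `<[^>]*>` are my own assumptions for illustration, not something taken from the posts above. It shows what a regex can and cannot do: tags are removed, but entities such as `&amp;` are left undecoded, and a `>` inside an attribute value confuses the match.

```java
import java.util.regex.Pattern;

public class TagStripper {
    // Assumed pattern: "<", then any run of characters other than ">", then ">".
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    public static String removeTags(String input) {
        if (input == null || input.isEmpty()) {
            return input;
        }
        return TAG.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // Plain nested tags are stripped cleanly.
        System.out.println(removeTags("<p>Hello <b>world</b></p>")); // Hello world

        // Entities are not decoded; the regex only deletes tags.
        System.out.println(removeTags("Fish &amp; chips<br/>"));     // Fish &amp; chips

        // A ">" inside an attribute value ends the match too early.
        System.out.println(removeTags("<a title=\"a>b\">x</a>"));    // b">x
    }
}
```

If the input is real-world HTML rather than a handful of known tags, the last two cases are the usual reason to prefer a parser such as Jsoup over a regex.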