Removing HTML from a Java String?

Is there a good way to remove HTML from a Java string?


This is actually very simple with Jsoup.

import org.jsoup.Jsoup;

// Parses the markup and returns only its text content.
public static String html2text(String html) {
    return Jsoup.parse(html).text();
}

I think the simplest way to filter out the HTML tags is:

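import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Reluctant match: the shortest run from '<' to the next '>', i.e. one tag.
private static final Pattern REMOVE_TAGS = Pattern.compile("<.+?>");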

 

public static String removeTags(String string) {
    if (string == null || string.length() == 0) {
        return string;
    }

    Matcher m = REMOVE_TAGS.matcher(string);
    return m.replaceAll("");
}
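
To see where the regex approach works and where it falls short, here is a small self-contained sketch. It assumes the pattern is something like <.+?> (anything between a '<' and the next '>'); the class name TagStripDemo and the sample strings are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the name is not from the answers above.
class TagStripDemo {
    // Assumed pattern: the shortest run from '<' to the next '>'.
    private static final Pattern TAG = Pattern.compile("<.+?>");

    static String removeTags(String s) {
        if (s == null || s.isEmpty()) {
            return s;
        }
        return TAG.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        // Well-formed markup: tags are dropped, text is kept.
        System.out.println(removeTags("<p>Hello <b>world</b></p>"));
        // Entities are left as-is; the regex only removes tags.
        System.out.println(removeTags("Tom &amp; Jerry"));
        // A literal '<' ... '>' pair in plain text is mistaken for a tag.
        System.out.println(removeTags("1 < 2 and 3 > 2"));
    }
}
```

So the regex is fine for quick cleanup of simple, trusted markup, but it leaves entities such as &amp; undecoded and can eat plain text that happens to contain '<' and '>'. If that matters, a real parser such as Jsoup is probably the safer choice.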