Channel: Cannot get URL content as UTF-8 - Stack Overflow

Cannot get URL content as UTF-8

I'm trying to read content from a URL, but it returns strange symbols instead of "è", "à", etc.

This is the code I'm using:

    public static String getPageContent(String _url) {
        URL url;
        InputStream is = null;
        BufferedReader dis;
        String line;
        String text = "";
        try {
            url = new URL(_url);
            is = url.openStream();
            //This line should open the stream as UTF-8
            dis = new BufferedReader(new InputStreamReader(is, "UTF-8"));
            while ((line = dis.readLine()) != null) {
                text += line + "\n";
            }
        } catch (MalformedURLException mue) {
            mue.printStackTrace();
        } catch (IOException ioe) {
            ioe.printStackTrace();
        } finally {
            try {
                is.close();
            } catch (IOException ioe) {
                // nothing to see here
            }
        }
        return text;
    }

I've seen other questions like this, and all of them were answered with something like:

Declare your inputstream as new InputStreamReader(is, "UTF-8")

But I can't get it to work.
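For reference, here is a minimal sketch of that suggestion in isolation, so the decoding step can be checked without a network call. The class `StreamDecoding` and the helper `readAll` are hypothetical names, not part of the code above, and the in-memory byte array stands in for whatever `url.openStream()` would return. It assumes the bytes really are UTF-8, which the original page may or may not satisfy.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StreamDecoding {
    // Hypothetical helper: reads every line from the stream,
    // decoding the bytes with the charset passed in.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder text = new StringBuilder();
        // try-with-resources closes the reader (and the wrapped stream)
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Unicode escapes keep the sample independent of the source file's encoding.
        byte[] utf8Bytes = "\u00e8 uno dei pi\u00f9".getBytes(StandardCharsets.UTF_8);
        String decoded = readAll(new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8);
        System.out.print(decoded);
    }
}
```

If this round trip works but the real URL still yields strange symbols, the mismatch is probably somewhere other than the `InputStreamReader` line, for example in the encoding the server actually uses or in how the result is displayed.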

For example, if my url content contains

è uno dei più

I get

è uno dei più

What am I missing?
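As a point of comparison, the sketch below shows how a charset mismatch between the bytes and the decoder garbles accented letters in general. It is an illustration only: the question does not establish which mismatch, if any, is happening with the real URL, and `MojibakeDemo` and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // UTF-8 bytes decoded as ISO-8859-1: each accented letter
    // turns into two unrelated Latin-1 characters.
    static String utf8ReadAsLatin1(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // ISO-8859-1 bytes decoded as UTF-8: the accented bytes are not
    // valid UTF-8, so the decoder substitutes U+FFFD for them.
    static String latin1ReadAsUtf8(String original) {
        byte[] latin1Bytes = original.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sample = "\u00e8 uno dei pi\u00f9";
        System.out.println(utf8ReadAsLatin1(sample));
        System.out.println(latin1ReadAsUtf8(sample));
    }
}
```

Comparing the actual output against these two patterns may help narrow down where the encoding goes wrong. The console or file used to view the text has its own encoding too, so what appears on screen can differ from what the string holds.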

