If I parse an encrypted HTML file to a string, can I somehow obtain the text from it?

    import java.io.*;
    import java.net.*;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    public class UrlReaderTest {

        public static void main(String[] args) throws Exception {
            URL url = new URL("https://www.amazon.com/");
            StringBuilder contentBuilder = new StringBuilder();

            try {
                // Reads the raw response body line by line as characters
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(url.openStream()));
                String str;
                while ((str = in.readLine()) != null) {
                    contentBuilder.append(str);
                }
                in.close();
            } catch (IOException e) {
                System.err.println("Error");
            }

            String s = contentBuilder.toString();
            Document document = Jsoup.parse(s);

            // Prints the text content of the parsed document
            System.out.println(document.text());
        }
    }

What I am getting is mostly symbols like these: Η1?0 Π??0ή=tθ Jr?/β@Q? l?r{ΪεI/ ΉΟ~νJ?j?Ά-??ΙiLs?YdHλ²ύ?α?η?ογV"ηw[:?0??νSQψyθ?*²?γpI? ??²ρνl???2JμΚ?ΣS?Αl4ςRΛKR545υ?SK

Is there anything I can do to transform that into a form that I can use?
I couldn't find anything specific online.

Edit: What I want specifically is to decrypt that information. For example, I want to be able to take the text from a Facebook event page, search it for the keywords I want, and use those somewhere else.

edited Nov 22 '18 at 15:01 by treyBake

asked Nov 21 '18 at 22:28 by Thodoris Ydraios

2

Are you looking for an answer other than "decrypt the file"? Those symbols are the encrypted file (bits in memory) being read in as text. They look like nonsense because they are the text representation of encrypted data which is basically random 1's and 0's. You cannot get prettier text because that prettier text would not be the text representation of the same data. If you are looking for something other than "decrypt the file" please specify what "a form that I can use" means

– MyStackRunnethOver
Nov 21 '18 at 22:50

1

The reason you're getting back nonsense is that you're opening a raw stream to an HTTPS URL. Since it's HTTPS, the contents of the stream are encrypted. Consider using HttpsURLConnection, which handles the communication for you and just gives you back the decrypted content. Here's an example: mkyong.com/java/java-https-client-httpsurlconnection-example

– ethan.roday
Nov 21 '18 at 23:37

1

That looks like a zipped response. Why don't you use Jsoup to request the page? I think it decodes the response data by default.

– t.m.adam
Nov 22 '18 at 1:34

1

@err1100: No, openStream gives the decrypted data after SSL record processing.

– James K Polk
Nov 22 '18 at 1:59

3

This is all wrong. As @t.m.adam notes, the page is gzipped, so it can't be read using any Reader. Reader is meant for character streams, not the binary data that you get from gzipping text.

– James K Polk
Nov 22 '18 at 2:35
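To illustrate the point made in the comments above, here is a minimal, self-contained sketch (no network involved) of why gzipped bytes look like random symbols when decoded as characters, and how to recognise them. The class name `GzipMagicDemo` and its helper methods are hypothetical names chosen for this illustration, and `readAllBytes()` assumes Java 9 or later. Gzip data starts with the two magic bytes `0x1f 0x8b`, which is a quick way to check whether a response body is compressed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipMagicDemo {

    // True if the buffer starts with the gzip magic number 0x1f 0x8b.
    static boolean looksGzipped(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }

    // Compresses UTF-8 text, standing in for what a server sends with gzip encoding.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Decompresses first, and only then decodes the bytes as characters.
    static String gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream in =
                new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body><p>Hello, world</p></body></html>";
        byte[] compressed = gzip(html);

        // The compressed bytes carry the gzip magic number; the plain text does not.
        System.out.println("compressed looks gzipped: " + looksGzipped(compressed));
        System.out.println("plain looks gzipped: "
                + looksGzipped(html.getBytes(StandardCharsets.UTF_8)));

        // Decoding compressed bytes directly as characters does not give the HTML back.
        String garbled = new String(compressed, StandardCharsets.ISO_8859_1);
        System.out.println("direct decode equals original: " + garbled.equals(html));

        // Decompressing before decoding recovers the original text.
        System.out.println("gunzip equals original: " + gunzip(compressed).equals(html));
    }
}
```

The data is compressed, not encrypted, so no key is needed: decompressing before decoding is enough.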

java html encryption

1 Answer

As @t.m.adam noted in his comment, the problem is that the response from the stream is gzipped (compressed). If you want to read it from the URL stream, you need to wrap it in a GZIPInputStream before passing it to the InputStreamReader (see this answer). Alternatively, as @t.m.adam suggests, you can use Jsoup's built-in connect() method, which requests the page and decodes the response for you:

    import java.io.IOException;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    public class UrlReaderTest {

      public static void main(String[] args) {
        // Debug output: print the classpath (the property name is "java.class.path")
        System.out.println(System.getProperty("java.class.path"));
        try {
          // Jsoup fetches the page and handles the gzip-compressed response itself
          Document doc = Jsoup.connect("https://www.amazon.com").get();
          System.out.print(doc.text());
        }
        catch (IOException e) {
          System.err.println("Error");
        }
      }
    }

answered Nov 23 '18 at 15:07 by ethan.roday
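For the first option in the answer (reading from the URL stream yourself), a rough sketch might look like the following. It is not taken from the linked answer; the class name `GzipAwareFetch` and the helper `readBody` are hypothetical, and the code assumes the page is UTF-8 and that the server reports compression through the `Content-Encoding` header. The body is wrapped in a `GZIPInputStream` only when that header says `gzip`, and only the decompressed bytes are handed to the `InputStreamReader`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

public class GzipAwareFetch {

    // Wraps the raw body in a GZIPInputStream when the server declared gzip,
    // then decodes the uncompressed bytes as text (UTF-8 is an assumption here).
    static String readBody(InputStream raw, String contentEncoding) throws IOException {
        InputStream body = "gzip".equalsIgnoreCase(contentEncoding)
                ? new GZIPInputStream(raw)
                : raw;
        StringBuilder content = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(body, StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                // Keep line breaks so words on adjacent lines do not run together
                content.append(line).append('\n');
            }
        }
        return content.toString();
    }

    public static void main(String[] args) throws IOException {
        URLConnection connection = new URL("https://www.amazon.com/").openConnection();
        // Tell the server that a gzip-compressed response is acceptable
        connection.setRequestProperty("Accept-Encoding", "gzip");
        try (InputStream raw = connection.getInputStream()) {
            String html = readBody(raw, connection.getContentEncoding());
            System.out.println(html.length() + " characters of HTML received");
        }
    }
}
```

The resulting string can then be passed to `Jsoup.parse(...)` as in the question. Keeping the decoding in a separate method makes it possible to try it on in-memory data without contacting a server; whether a given site answers a plain `URLConnection` request at all is a separate matter.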
