jsoup - Tutorial #1 setup, basic commands

By @qami, 5/3/2018, utopian-io

https://steemitimages.com/DQmYstyySFoVnL2Xve8F6ZFF3ukGBkvNdLamgFi86kYDj1Q/23.png
What Will You Learn?
In this tutorial you will learn about jsoup: its basic features and how to set up a local development environment for it.

What is jsoup
jsoup is a Java-based library for working with HTML content. It provides a very convenient API to extract and manipulate data, using the best of DOM, CSS, and jQuery-like methods. It implements the WHATWG HTML5 specification and parses HTML to the same DOM as modern browsers do.

Requirements

  • Basic Java programming
  • A good grasp of OOP concepts is a plus

Difficulty

  • Intermediate

jsoup - Overview

jsoup implements the WHATWG HTML5 specification and parses HTML content into the same DOM as modern browsers do. The library provides the following functionality:

  • Multiple Read Support - It reads and parses HTML from a URL, file, or string.
  • CSS Selectors - It can find and extract data using DOM traversal or CSS selectors.
  • DOM Manipulation - It can manipulate HTML elements, attributes, and text.
  • Prevent XSS Attacks - It can clean user-submitted content against a given safe white-list to prevent XSS attacks.
  • Tidy - It outputs tidy HTML.
  • Handles Invalid Data - jsoup can handle unclosed tags and implicit tags, and can reliably create the document structure.
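A few of the features listed above can be tried out with a short program. The following is only a minimal sketch, assuming the jsoup jar is already on the classpath; the class name `JsoupBasics`, its helper methods, and the sample HTML are made up for illustration and are not part of jsoup itself.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; only the org.jsoup calls belong to the library.
public class JsoupBasics {

    // CSS selectors: collect the href of every <a> element that has one.
    static List<String> extractLinks(String html) {
        Document doc = Jsoup.parse(html); // read support: parse from a string
        List<String> links = new ArrayList<>();
        for (Element link : doc.select("a[href]")) {
            links.add(link.attr("href"));
        }
        return links;
    }

    // Invalid data: return the text of the first <p>, even if tags are unclosed.
    static String firstParagraphText(String html) {
        Element p = Jsoup.parse(html).select("p").first();
        return p == null ? "" : p.text();
    }

    // DOM manipulation: add rel="nofollow" to every link and return the body HTML.
    static String addNofollow(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("a").attr("rel", "nofollow");
        return doc.body().html();
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Demo</title></head>"
                + "<body><p>Hello <b>world<a href='https://jsoup.org/'>jsoup</a></body></html>";

        System.out.println("Title: " + Jsoup.parse(html).title());
        System.out.println("Links: " + extractLinks(html));
        System.out.println("First paragraph: " + firstParagraphText(html));
        System.out.println("Modified body: " + addNofollow(html));
    }
}
```

Parsing from a file or a URL follows the same pattern through other `Jsoup` entry points; the string form is used here so the sketch runs without network or file access.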
Local Environment Setup
    jsoup is a library for Java, so the very first requirement is to have the JDK installed on your machine.

    System Requirement
    https://steemitimages.com/DQmUULA1kWm5PaYmPZf9SqFcoPzj6VYTVPQjpqLgs8TXh1T/Screenshot_1.png

    Step 1: Verify Java Installation on Your Machine
    https://steemitimages.com/DQmQWcRj5JBLFZt9dGidNEVEXCriYo9QbGHrVqR6VfRV4xR/1.png
    https://steemitimages.com/DQmTmFVFZcYDDRFF6wQYF8c4NiGb8W7TWbVV19ZPSdrmT4N/2.png

    Step 2: Set JAVA Environment
    https://steemitimages.com/DQmfF7gv1BhBDhpQZHCW8gdBtjQufKNfWb8GZiLSRGAs7Nz/21.png
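Once Steps 1 and 2 are done, a tiny program can confirm which Java runtime is actually being picked up. This is just a sketch using standard system properties; the class name `JavaEnvCheck` is hypothetical, and `JAVA_HOME` is reported as unset if the environment variable is not defined.

```java
// Hypothetical helper that reports the active Java runtime and JAVA_HOME.
public class JavaEnvCheck {

    // Builds a short, human-readable report of the Java environment.
    static String describeRuntime() {
        String version = System.getProperty("java.version"); // version of the running JVM
        String home = System.getProperty("java.home");       // install directory of the running JVM
        String javaHome = System.getenv("JAVA_HOME");        // null when the variable is not set

        return "java.version=" + version + System.lineSeparator()
                + "java.home=" + home + System.lineSeparator()
                + "JAVA_HOME=" + (javaHome == null ? "(not set)" : javaHome);
    }

    public static void main(String[] args) {
        System.out.println(describeRuntime());
    }
}
```

If `java.home` and `JAVA_HOME` point at different installations, the `PATH` entry is probably resolving to a different JDK than the one configured in Step 2.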

    Step 3: Download jsoup Archive
    https://steemitimages.com/DQmb826mDfUzAv4sbpRbwFFpiQAyzrZMji3TGF68D61QZoT/3.png

    Step 4: Set jsoup Environment
    https://steemitimages.com/DQmZo3H1A6mWAv4A8CRj3V9YjEgq47PEjZbdKvpPp4R3UiD/4.png

    Step 5: Set CLASSPATH Variable
    5.png
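To check that the CLASSPATH from Step 5 really exposes jsoup, one option is to ask the class loader for the library's main entry class. The sketch below uses only the standard library; `ClasspathCheck` and `isOnClasspath` are illustrative names, and `org.jsoup.Jsoup` is the class the lookup assumes the jar provides.

```java
// Hypothetical helper that checks whether a class can be found on the classpath.
public class ClasspathCheck {

    // Returns true if the named class can be located, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Show the classpath the JVM was started with, then probe for jsoup.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println("jsoup found: " + isOnClasspath("org.jsoup.Jsoup"));
    }
}
```

If this prints `jsoup found: false`, the jar path in CLASSPATH is likely wrong or the variable was set in a different shell session than the one running the program.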

    Source: Tutorialspoint.com
