How to get the HTML of text selected in a JTextPane rendering an HTML page


How can I get the HTML source of a line of text selected in a JTextPane that is rendering an HTML page? By "line" I mean any text between two newline characters in the rendered text, as visible.

My aim is to update the HTML for a line at the same time as that line is edited in the rendered text.

Code:

package test;

import java.beans.PropertyChangeEvent;
import java.beans.PropertyChangeListener;
import java.io.IOException;
import java.net.URL;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.swing.JScrollPane;
import javax.swing.text.BadLocationException;

public class TextPaneTester extends javax.swing.JFrame {

    public TextPaneTester() {
        initComponents();
        myInitComponents();
    }

    @SuppressWarnings("unchecked")
    // <editor-fold defaultstate="collapsed" desc="Generated Code">                          
    private void initComponents() {
        contentScrollPane = new javax.swing.JScrollPane();
        content = new javax.swing.JTextPane();

        setDefaultCloseOperation(javax.swing.WindowConstants.EXIT_ON_CLOSE);
        contentScrollPane.setViewportView(content);

        javax.swing.GroupLayout layout = new javax.swing.GroupLayout(getContentPane());
        getContentPane().setLayout(layout);
        layout.setHorizontalGroup(
            layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
            .addComponent(contentScrollPane, javax.swing.GroupLayout.Alignment.TRAILING, javax.swing.GroupLayout.DEFAULT_SIZE, 400, Short.MAX_VALUE)
        );
        layout.setVerticalGroup(
            layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
            .addGroup(layout.createSequentialGroup()
                .addContainerGap()
                .addComponent(contentScrollPane, javax.swing.GroupLayout.DEFAULT_SIZE, 278, Short.MAX_VALUE)
                .addContainerGap())
        );

        pack();
    }// </editor-fold>

    private void myInitComponents(){
        //content.setEditorKit(new StyledEditorKit());
        contentScrollPane.setVerticalScrollBarPolicy(JScrollPane.VERTICAL_SCROLLBAR_ALWAYS);
        contentScrollPane.setHorizontalScrollBarPolicy(JScrollPane.HORIZONTAL_SCROLLBAR_NEVER);
    }

    private void fetchURL(String url){
        try{
            // URL(URL baseURL[, String relativeURL])
            URL helpURL = new URL(url);
            this.content.addPropertyChangeListener("page",
                    new PropertyChangeListener() {
                        @Override
                        public void propertyChange(PropertyChangeEvent event) {
                            System.out.println("After lsitening to page load event, getText() on JTextPane gives "
                                + content.getText());

                            try {
                                System.out.println("After lsitening to page load event, getDocument().getText() on JTextPane gives "
                                        + content.getDocument().getText(0, content.getDocument().getLength()) );
                            } catch (BadLocationException ex) {
                                Logger.getLogger(TextPaneTester.class.getName()).log(Level.SEVERE, null, ex);
                            }
                        }
            });
            this.content.setPage(helpURL);
        } catch (IOException e) {
            System.err.println("Attempted to read a bad URL: " + url);
        }
    }

    public static void main(String args[]) {
        /* Create and display the form */
        java.awt.EventQueue.invokeLater(new Runnable() {
            @Override
            public void run() {
                String url = "https://en.wikipedia.org/wiki/Stack_Overflow";
                TextPaneTester reader = new TextPaneTester();
                reader.fetchURL(url);
                reader.setVisible(true); 
            }
        });
    }
    // variable declaration                
    private javax.swing.JTextPane content;
    private javax.swing.JScrollPane contentScrollPane;                  
}

Output:

run:
After lsitening to page load event, getDocument().getText() on JTextPane gives                                    





Stack Overflow
From Wikipedia, the free encyclopedia

Jump to: navigation , search
For other uses, see Stack overflow (disambiguation).
Stack Overflow


Screenshot of Stack Overflow as of February 2015
Web address
stackoverflow .com
Commercial?
Yes
Type of site
Knowledge markets
Registration
Optional; Uses OpenID
Available in
English
Content license
CC-BY-SA 3.0 (for user contributions)
Written in
ASP.NET MVC [1]
Owner
Stack Exchange, Inc.
Created by
Joel Spolsky and Jeff Atwood
Launched
15 September 2008[2]
Alexa rank
 52 (March 2015[update])[3]
Current status
Online
Stack Overflow is a privately held website, the flagship site of the Stack Exchange Network,[4][5][6] created in 2008 by Jeff Atwood and Joel Spolsky,[7][8] as a more open alternative to earlier Q&A sites such as Experts-Exchange. The name for the website was chosen by voting in April 2008 by readers of Coding Horror, Atwood's popular programming blog.[9]
It features questions and answers on a wide range of topics in computer programming.[10][11][12]
The website serves as a platform for users to ask and answer questions, and, through membership and active participation, to vote questions and answers up or down and edit questions and answers in a fashion similar to a wiki or Digg.[13] Users of Stack Overflow can earn reputation points and "badges"; for example, a person is awarded 10 reputation points for receiving an "up" vote on an answer given to a question, and can receive badges for their valued contributions,[14] which represents a kind of gamification of the traditional Q&A site or forum. All user-generated content is licensed under a Creative Commons Attribute-ShareAlike license.[15]
Closing questions is a main differentiation from Yahoo! Answers and a way to prevent low quality questions.[16] The mechanism was overhauled in 2013; questions edited after being put "on hold" now appear in a review queue.[17] Jeff Atwood stated in 2010 that duplicate questions are not seen as a problem but rather they constitute an advantage if such additional questions drive extra traffic to the site by multiplying relevant keyword hits in search engines.[18]
As of April 2014[update], Stack Overflow has over 4,000,000 registered users and more than 11,000,000 questions.[19][20] Based on the type of tags assigned to questions, the top eight most discussed topics on the site are: Java, JavaScript, C#, PHP, Android, jQuery, Python and HTML.[21]

Contents
1 History
1.1 Content criteria
1.2 User suspension
2 Statistics
3 Criticism
4 Technology
4.1 Stack Apps
5 See also
6 References
7 External links

History[edit]
The website was created by Jeff Atwood and Joel Spolsky in 2008.[7] On 31 July 2008, Jeff Atwood sent out invitations encouraging his subscribers to take part in the private beta of the new website, limiting its use to those willing to test out the new software. On 15 September 2008 it was announced the public beta version was in session and that the general public was now able to use it to seek assistance on programming related issues. The design of the Stack Overflow logo was decided by a voting process.[22]
On 3 May 2010 it was announced that Stack Overflow had raised $6 million in venture capital from a group of investors led by Union Square Ventures.[23][24]
Content criteria[edit]
Stack Overflow admits questions about programming that are tightly focused on a specific problem. Questions that are of a broader nature or invite answers that are inherently a matter of opinion are usually closed by a process carried out by the site's participants. The sister site programmers.stackexchange.com is intended to be a venue for some such broader questions, such as questions about agile software development in general.
User suspension[edit]
In April 2009, Stack Exchange implemented a policy of "timed suspension",[25] in order to curtail users who either show "No effort to learn (the community rules) and improve over time" or engage in "disruptive behavior" and become a nuisance. The suspension is accompanied by temporarily setting the user's reputation score at '1' and a notation on the user's profile page indicating the suspension and remaining duration.
Statistics[edit]
A 2013 study has found that 77% of users only ask one question, 65% only answer one question, and only 8% of users answer more than 5 questions.[26] As of 2011, 92% of the questions were answered, in a median time of 11 minutes.[27] Since 2013, the Stack Exchange network software automatically deletes questions that meet certain criteria, including having no answers in a certain amount of time.[28]
As of August 2012, 443,000 of the 1.3M registered users had answered at least one question, and of those, approximately 6,000 (0.46% of the total user count) had earned a reputation score greater than 5000.[29] Reputation can be gained fastest by answering questions related to tags with lower expertise density, doing so promptly (in particular being the first one to answer a question), being active during off-peak hours, and contributing to diverse areas.[29]
In June 2015, 125,313 posts were deleted within the past 30 days, of which about 8% were deleted by moderators.[30]
Criticism[edit]

This section needs additional citations for verification. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed. (July 2014)
Stack Overflow has been criticized for encouraging poor learning habits, using a rewards system with perverse incentives favoring quick answers versus quality ones,[31] and having a community dominated and shaped by authoritarian moderators[weasel words].[32] The barrier of entry for new users is high.[33] Popular questions with both informative and humorous value have been deleted from the site,[34][35] including the very list of these questions.[36]
Technology[edit]
Stack Overflow is written in C#[1] using the ASP.NET MVC (Model-View-Controller) framework, and Microsoft SQL Server for the database[37] and the Dapper object-relational mapper used for data access.[38] Unregistered users have access to most of the site's functionality, while users who sign in (for example, by using the OpenID service) can gain access to more functionality, such as establishing a profile and being able to earn reputation to allow functionality like re-tagging questions or voting to close a question.
Stack Apps[edit]
The Stack Overflow team has recently[when?] begun the creation of an API for accessing the data contained on the other sites. Discussion on Stack Apps centers around the API, although users are encouraged to list apps and libraries developed for the API.
See also[edit]

Internet portal

Information technology portal
Askbot (free engine)
OSQA (Open Source Question and Answer)
Rosetta Code (Multi-lingual algorithms)
List of Internet forums
...

P.S.: I need to let a user select text (in the JTextPane view) between two newlines, as returned by textPane.getDocument().getText(). In effect, I want to map the selected text to the corresponding HTML block so that I can replace its inner HTML with the HTML of the translated text (entered by the user in another JTextPane).

2

There are 2 answers

4
trashgod On

Add a DocumentListener to learn when the underlying HTMLDocument changes. In the example below, an adjacent JTextArea displays the changes as they are typed using the result of getText(), which "Returns the text contained in this TextComponent in terms of the content type of this editor." You can traverse the elements of the parsed HTMLDocument as shown here.
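
A minimal sketch of such a traversal, assuming the HTML is parsed from a string rather than loaded with setPage(). The class name ElementWalk and the helpers parse and describeElements are illustrative names, not part of the original example. It walks the element tree with ElementIterator and lists each element with its offsets in the rendered text.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.BadLocationException;
import javax.swing.text.Element;
import javax.swing.text.ElementIterator;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

public class ElementWalk {

    /** Parses an HTML string into an HTMLDocument; no GUI component is needed. */
    static HTMLDocument parse(String html) {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        try {
            kit.read(new StringReader(html), doc, 0);
        } catch (IOException | BadLocationException ex) {
            throw new IllegalStateException(ex);
        }
        return doc;
    }

    /** Lists every element as "name [start, end)" in depth-first order. */
    static List<String> describeElements(HTMLDocument doc) {
        List<String> out = new ArrayList<>();
        ElementIterator it = new ElementIterator(doc);
        for (Element e = it.next(); e != null; e = it.next()) {
            out.add(e.getName() + " [" + e.getStartOffset() + ", " + e.getEndOffset() + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        HTMLDocument doc = parse(
            "<html><body><p>First line</p><p>Second <b>bold</b> line</p></body></html>");
        describeElements(doc).forEach(System.out::println);
    }
}
```

The offsets are positions in the rendered text, so a caret position or selection start can be compared against them to find the enclosing element.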

How is the program capable of knowing which HTML tag is being edited?

An HTMLDocument models HTML, "As a result, the structure described by an HTML document is not exactly replicated by default." The editor alters elements, not tags. The getText() method uses an HTMLWriter to reconstruct the modeled HTML by traversing the elements of its internal document tree. You can reconstruct a part of the document by using the appropriate constructor. For example, this method returns the HTML corresponding to the current selection:

private String writeSelection() {
    StringWriter buf = new StringWriter();
    int start = jtp.getSelectionStart();
    // HTMLWriter expects a start position and a length, not an end offset.
    int length = jtp.getSelectionEnd() - start;
    HTMLWriter writer = new HTMLWriter(buf,
        (HTMLDocument) jtp.getDocument(), start, length);
    try {
        writer.write();
    } catch (IOException | BadLocationException ex) {
        ex.printStackTrace();
    }
    return buf.toString();
}

I wish to be able to click on a text-section and get its HTML.

Add a CaretListener to learn when the selection changes. The example below invokes writeSelection(), shown above, with each change.

jtp.addCaretListener(new CaretListener() {

    @Override
    public void caretUpdate(CaretEvent e) {
        EventQueue.invokeLater(() -> {
            jta.replaceRange(writeSelection(), 0, jta.getDocument().getLength());
        });
    }
});

See also the structure viewer cited here.

import java.awt.GridLayout;
import java.io.IOException;
import java.net.URL;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import javax.swing.JTextPane;
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;

/** @see https://stackoverflow.com/a/30905872/230513 */
public class TextPaneTester extends javax.swing.JFrame {

    private final JTextPane jtp = new JTextPane();
    private final JTextArea jta = new JTextArea(20, 48);

    public TextPaneTester() {
        initComponents();
    }

    private void initComponents() {
        setDefaultCloseOperation(EXIT_ON_CLOSE);
        setLayout(new GridLayout(0, 1));
        add(new JScrollPane(jtp));
        add(new JScrollPane(jta));
        pack();
    }

    private void fetchURL(String url) {
        try {
            URL helpURL = new URL(url);
            this.jtp.setPage(helpURL);
            this.jtp.getDocument().addDocumentListener(new DocumentListener() {

                @Override
                public void insertUpdate(DocumentEvent e) {
                    print(e);
                }

                @Override
                public void removeUpdate(DocumentEvent e) {
                    print(e);
                }

                @Override
                public void changedUpdate(DocumentEvent e) {
                    print(e);
                }

                private void print(DocumentEvent e) {
                    jta.replaceRange(jtp.getText(), 0, jta.getDocument().getLength());
                }
            });
        } catch (IOException e) {
            System.err.println("Attempted to read a bad URL: " + url);
        }
    }

    public static void main(String args[]) {
        java.awt.EventQueue.invokeLater(new Runnable() {
            @Override
            public void run() {
                String url = "http://www.example.com";
                TextPaneTester reader = new TextPaneTester();
                reader.fetchURL(url);
                reader.setVisible(true);
            }
        });
    }
}
4
Sharcoux On

To replace the content of a paragraph, you can simply use this:

    HTMLDocument doc = (HTMLDocument)getDocument();
    doc.setInnerHTML(doc.getParagraphElement(getSelectionStart()), newContent);

Here newContent is the HTML content of your text area.

This applies if you translate paragraph by paragraph. You can get smaller units by using:

    doc.getCharacterElement(getSelectionStart()).getParentElement();
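
The setInnerHTML() call can be tried without a GUI. The sketch below assumes a document parsed from a string; ReplaceParagraph, parse, plainText and replaceParagraphAt are hypothetical names used only for illustration. Note that setInnerHTML() needs a document with a parser installed, which is the case for documents created by HTMLEditorKit.createDefaultDocument().

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.Element;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

public class ReplaceParagraph {

    /** Parses an HTML string into an HTMLDocument that has a parser installed. */
    static HTMLDocument parse(String html) {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        try {
            kit.read(new StringReader(html), doc, 0);
        } catch (IOException | BadLocationException ex) {
            throw new IllegalStateException(ex);
        }
        return doc;
    }

    /** Returns the rendered (plain) text of the document. */
    static String plainText(HTMLDocument doc) {
        try {
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    /** Replaces the children of the paragraph containing the offset with new HTML. */
    static void replaceParagraphAt(HTMLDocument doc, int offset, String newInnerHtml) {
        Element paragraph = doc.getParagraphElement(offset);
        try {
            doc.setInnerHTML(paragraph, newInnerHtml);
        } catch (IOException | BadLocationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        HTMLDocument doc = parse(
            "<html><body><p>Before</p><p>Hello world</p><p>After</p></body></html>");
        // Locate the paragraph through its position in the rendered text.
        int offset = plainText(doc).indexOf("Hello");
        replaceParagraphAt(doc, offset, "Bonjour <b>le monde</b>");
        System.out.println(plainText(doc));
    }
}
```

In the editor itself, the offset would come from getSelectionStart() or the caret position rather than from a text search.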

Edit: By the way, you can also use:

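    // writer was not declared in the original snippet; a StringWriter is assumed here.
    StringWriter writer = new StringWriter();
    try {
        new HTMLEditorKit().write(writer, htmlDoc, startOffset, length);
        String html = writer.toString();
    } catch (IOException | BadLocationException ex) {
        Logger.getLogger(Editeur.class.getName()).log(Level.SEVERE, null, ex);
    }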

This gets the HTML content of the document between startOffset and startOffset + length. From that you can work out how to get the HTML content of a paragraph, of the selection, or of the whole document.
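
For the paragraph case, one possible sketch is shown here. It assumes a document parsed from a string, and ParagraphHtml, parse, offsetOf and htmlOfParagraphAt are made-up names for illustration. It finds the paragraph element containing an offset and passes that element's range to HTMLEditorKit.write().

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.swing.text.BadLocationException;
import javax.swing.text.Element;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

public class ParagraphHtml {

    /** Parses an HTML string into an HTMLDocument; no GUI component is needed. */
    static HTMLDocument parse(String html) {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        try {
            kit.read(new StringReader(html), doc, 0);
        } catch (IOException | BadLocationException ex) {
            throw new IllegalStateException(ex);
        }
        return doc;
    }

    /** Returns the offset of needle in the rendered text, or -1 if absent. */
    static int offsetOf(HTMLDocument doc, String needle) {
        try {
            return doc.getText(0, doc.getLength()).indexOf(needle);
        } catch (BadLocationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    /** Returns the regenerated HTML for the paragraph containing the offset. */
    static String htmlOfParagraphAt(HTMLDocument doc, int offset) {
        Element paragraph = doc.getParagraphElement(offset);
        int start = paragraph.getStartOffset();
        // A final paragraph may end one past the document length, so clamp it.
        int end = Math.min(paragraph.getEndOffset(), doc.getLength());
        StringWriter writer = new StringWriter();
        try {
            // write() takes a start offset and a length, not an end offset.
            new HTMLEditorKit().write(writer, doc, start, end - start);
        } catch (IOException | BadLocationException ex) {
            throw new IllegalStateException(ex);
        }
        return writer.toString();
    }

    public static void main(String[] args) {
        HTMLDocument doc = parse(
            "<html><body><p>First line</p><p>Second <b>bold</b> line</p></body></html>");
        System.out.println(htmlOfParagraphAt(doc, offsetOf(doc, "Second")));
    }
}
```

The output is HTML regenerated from the document model, wrapped in the enclosing html and body tags, so it will not necessarily match the original page source character for character.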