HTML Parser

Primarily used for transformation or extraction, HTML Parser features filters, custom tags, visitors, and easy-to-use JavaBeans. It is a robust, fast, and well-tested package.

HTML Parser is a useful Java library designed for HTML transformation or extraction.

The two fundamental use cases handled by the parser are extraction and transformation. The synthesis use case, where HTML pages are created from scratch, is better handled by other tools closer to the source of data.

In general, to use the HTMLParser you will need to be able to write code in the Java programming language. Although some example programs are provided that may be useful as they stand, it's more than likely you will need (or want) to create your own programs or modify the ones provided to match your intended application.

To use the library, you will need to add either htmllexer.jar or htmlparser.jar to your classpath when compiling and running. The htmllexer.jar provides low-level access to generic string, remark and tag nodes on the page in a linear, flat, sequential manner.

The htmlparser.jar, which includes the classes found in htmllexer.jar, provides access to a page as a sequence of nested differentiated tags containing string, remark and other tag nodes.
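
To make the difference between the two jars concrete, here is a minimal sketch of what a linear, flat, sequential view of a page looks like. It uses only the Java standard library and is not the htmllexer.jar implementation or its API. The class FlatLexerSketch, its Node type, and the tokenize method are hypothetical names invented for this illustration, and the scanning rules are deliberately simplistic (no handling of scripts, CDATA, or a '>' inside an attribute value).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a toy scanner, NOT the htmllexer.jar implementation.
class FlatLexerSketch {
    /** Node kinds named after the three generic node types the text mentions. */
    enum Kind { TAG, REMARK, STRING }

    /** Hypothetical flat node: a kind plus the raw source text it covers. */
    static class Node {
        final Kind kind;
        final String text;

        Node(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    /** Splits markup into a flat list of nodes in document order, with no nesting. */
    static List<Node> tokenize(String html) {
        List<Node> nodes = new ArrayList<>();
        int pos = 0;
        while (pos < html.length()) {
            int stop;
            if (html.startsWith("<!--", pos)) {
                // Remark (comment): runs to the closing "-->" or to end of input.
                int end = html.indexOf("-->", pos + 4);
                stop = (end < 0) ? html.length() : end + 3;
                nodes.add(new Node(Kind.REMARK, html.substring(pos, stop)));
            } else if (html.charAt(pos) == '<') {
                // Tag: runs to the next '>' or to end of input.
                int end = html.indexOf('>', pos);
                stop = (end < 0) ? html.length() : end + 1;
                nodes.add(new Node(Kind.TAG, html.substring(pos, stop)));
            } else {
                // String: everything up to the next '<'.
                int end = html.indexOf('<', pos);
                stop = (end < 0) ? html.length() : end;
                nodes.add(new Node(Kind.STRING, html.substring(pos, stop)));
            }
            pos = stop;
        }
        return nodes;
    }

    public static void main(String[] args) {
        for (Node n : tokenize("<p>Hello <!-- note --><b>world</b></p>")) {
            System.out.println(n.kind + " " + n.text);
        }
    }
}
```

In this flat view, closing tags such as </b> are just further tag nodes in the same list. Pairing them with their opening tags to form nested tags is the extra structure the text attributes to htmlparser.jar.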

Extraction

Extraction encompasses all the information retrieval programs that are not meant to preserve the source page.

This covers uses like:
· text extraction, for example as input to text search engine databases
· link extraction, for crawling through web pages or harvesting email addresses
· screen scraping, for programmatic data input from web pages
· resource extraction, collecting images or sound
· a browser front end, the preliminary stage of page display
· link checking, ensuring links are valid
· site monitoring, checking for page differences beyond simplistic diffs

There are several facilities in the HTMLParser codebase to help with extraction, including filters, visitors and JavaBeans.
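
As a rough illustration of the link extraction use listed above, the following sketch collects the href values of anchor tags from a string of HTML. It is a simplified stand-in built on java.util.regex and does not use HTMLParser's filters, visitors, or JavaBeans. The class LinkExtractionSketch and the method extractLinks are hypothetical names, and a regular expression like this one will miss unquoted attribute values and other real-world markup that a proper parser handles.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: regex-based stand-in, NOT the HTMLParser filter API.
class LinkExtractionSketch {
    // Matches href="..." or href='...' inside an <a ...> tag.
    // Group 1 is the quote character, group 2 is the link target.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the quoted href values of all anchor tags, in document order. */
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2));
        }
        return links;
    }

    public static void main(String[] args) {
        String page = "<p><a href=\"http://example.com/a\">A</a> "
                + "<A class='x' HREF='b.html'>B</A></p>";
        for (String link : extractLinks(page)) {
            System.out.println(link);
        }
    }
}
```

A crawler or link checker would feed each returned value into its own fetch or validation step; that part is outside the scope of this sketch.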

Transformation

Transformation includes all processing where the input and the output are HTML pages.

Some examples are:
· URL rewriting, modifying some or all links on a page
· site capture, moving content from the web to local disk
· censorship, removing offending words and phrases from pages
· HTML cleanup, correcting erroneous pages
· ad removal, excising URLs referencing advertising
· conversion to XML, moving existing web pages to XML

During or after reading in a page, operations on the nodes can accomplish many transformation tasks "in place", which can then be output with the toHtml() method. Depending on the purpose of your application, you will probably want to look into node decorators, visitors, or custom tags in conjunction with the PrototypicalNodeFactory.
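
The idea of changing nodes in place and then serializing the result can be sketched with a toy node model. Everything here is an assumption made for illustration: the Node class, its factory methods, and rewriteLinks are hypothetical and are not HTMLParser classes. Only the method name toHtml() is borrowed from the text above, and this version does no attribute escaping.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a hypothetical mini node tree, NOT the HTMLParser node classes.
class InPlaceTransformSketch {
    static class Node {
        final String name;  // tag name, or null for a text node
        final String text;  // text content, or null for a tag node
        final Map<String, String> attrs = new LinkedHashMap<>();
        final List<Node> children = new ArrayList<>();

        private Node(String name, String text) {
            this.name = name;
            this.text = text;
        }

        static Node ofTag(String name) { return new Node(name, null); }
        static Node ofText(String text) { return new Node(null, text); }

        Node attr(String key, String value) { attrs.put(key, value); return this; }
        Node add(Node child) { children.add(child); return this; }

        /** Serializes this node and its children back to markup (no escaping). */
        String toHtml() {
            if (name == null) {
                return text;
            }
            StringBuilder sb = new StringBuilder("<").append(name);
            for (Map.Entry<String, String> e : attrs.entrySet()) {
                sb.append(' ').append(e.getKey()).append("=\"").append(e.getValue()).append('"');
            }
            sb.append('>');
            for (Node child : children) {
                sb.append(child.toHtml());
            }
            return sb.append("</").append(name).append('>').toString();
        }
    }

    /** Walks the tree and rewrites matching href prefixes on <a> nodes in place. */
    static void rewriteLinks(Node node, String oldPrefix, String newPrefix) {
        String href = node.attrs.get("href");
        if ("a".equals(node.name) && href != null && href.startsWith(oldPrefix)) {
            node.attrs.put("href", newPrefix + href.substring(oldPrefix.length()));
        }
        for (Node child : node.children) {
            rewriteLinks(child, oldPrefix, newPrefix);
        }
    }

    public static void main(String[] args) {
        Node page = Node.ofTag("p")
                .add(Node.ofText("See "))
                .add(Node.ofTag("a").attr("href", "http://old.example/docs")
                        .add(Node.ofText("the docs")));
        rewriteLinks(page, "http://old.example/", "https://new.example/");
        System.out.println(page.toHtml());
    }
}
```

The recursive walk in rewriteLinks plays the role a visitor would play: the tree is modified where it stands, and serialization happens only once at the end.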

Publisher: Derrick Oswald | License: Freeware | Price: 0.00
Version: 1.6 / 2.0 Snapshot | Platform: WinOther
Title: HTML Parser

Author Url: http://sourceforge.net/projects/htmlparser
Program Info Url: http://sourceforge.net/projects/htmlparser
Download Url: http://sourceforge.net/projects/htmlparser/files/htmlparser/1.6/htmlparser1_6_20060610.zip/download
