Java XML Overview



Java XML is simply working with an XML document from a Java program. Imagine, we have a file "products.xml" where we have product details such as name, brand and price.

Now, we want to update prices for some products using Java programming. Before writing such java programs to access XML documents, we should know basics of XML.

What is XML?

XML stands for EXtensible Markup Language. It is a text-based markup language which is used to store and transport data. It is self-descriptive and both human-readable and, machine-readable. Following are some notable points on XML −

  • XML is a markup language.

  • XML is a tag based language like HTML.

  • XML tags are not predefined like HTML.

  • You can define your own tags which is why it is called extensible language.

  • XML tags are designed to be self-descriptive.

  • XML is W3C Recommendation for data storage and data transfer.

XML Document

An XML document is the collection of elements that define data in a well structured and organized manner. An XML document has two sections, namely, document prolog and document elements.

Syntax

Following is the syntax of an XML document −



	
	...

Where,

  • is the XML declaration statement. If included, it must be kept in the first line.
  • is the root element and it is the parent of all other elements.
  • is the sub element of the root element.

Example

Following example shows Employee details with as the root element and , , as sub elements. Data for each element is enclosed between opening and closing tags.



	Kiran
	developer
	25,000

Elements in XML

An element is the building block of an XML document. It consists of an opening tag, content and a closing tag. In an xml document, there should always be a root element, inside which we can write many sub elements. Elements can also have any number of attributes inside them.

Syntax

Following is the syntax of an XML element −


	
		.....
	

Where,

  • is the root element of the XML document.
  • is the child element and its parent is the root element.
  • is the sub child and its parent is the child element.

Example

Let us see an example where DOB(date of birth) is further structured into date, month and year. Here, is the root element and , , are child elements.


	27
	March
	2000

Tags in XML

Tags in XML are self-explanatory and user defined. These are enclosed in less than (<) and greater than (>) symbols. XML is case sensitive and hence opening and closing tags should have same name.

Example

In the following example, we have written an address element with opening and closing tags.

Hyderabad

Now, let us see some incorrect ways of writing XML tags:

Since, XML is case sensitive unlike HTML, it throws the error: Opening and ending tag mismatch.

Attributes in XML

Elements in XML can have attributes. Attributes are name-value pairs that provide further specific information about a particular element. An element can have any number of attributes.

Syntax

Following is the syntax for XML attributes −

content	

Where,

  • element_name is the name of the element.
  • attribute_name is the name of the attribute.
  • value is the value of the corresponding attribute.

Example

Now, let's look at the following example where we have four attributes, name, class, marks and DOB for the 'Student' element.


Using sub elements to replace attributes

Instead of attributes, sub elements can also be used in elements to achieve the same purpose as of attributes. The same student example can also be written as follows:


	Kiran
	8
	50
	27-03-2000

It is always a best practice to use sub elements instead of attributes. Because, sub elements can further be extended whereas attributes cannot be extended.

In the above example, If we further want the date of birth as date, month and year then it can be done by using sub elements for DOB element as follows :


	Kiran
	8
	50
	
		27
		03
		2000
	

XML Declaration

XML declaration describes the basic format information such as version, encoding and standalone status about the entire XML document. If an XML declaration is included in the document, it must be written in the first line.

By default, if declaration is not mentioned, XML parser considers the document is in version 1.0

Syntax

Following is the syntax of XML declaration −


Where,

  • XML declaration starts with the character sequence and ends with the character sequence ?>
  • version is the version number of the XML used
  • encoding is the character encoding used for the content of XML document
  • standalone is a boolean attribute whose default value is set to 'no'. This tells whether the XML document is standalone or uses information from external source to parse the document such as DTD(Document Type Definition). The default value is set to 'no'.

Example

Following example uses XML 1.0 version with encoding type UTF-16 and it is standalone.


XML Comments

Comments in XML are used to explain the document's purpose and details. It is always a best practice to include comments in the document because it makes the task simpler to the one who is reading the document for the first time. XML follows the same syntax as of HTML.

Syntax

Following is the syntax for both single line and multi line XML comments −


Example

Let us say we have collected the information of departments in a college in the year 2015. These records might be changed over the years. So, mentioning this in the comments helps the one who is editing to know when these details have been collected.




	
		CSE
		CS
		25
	
	
		ECE
		EC
		20
	

XML Namespaces

XML namespaces are used to resolve name conflicts in the XML document. When two or more XML fragments are added, then there is a chance that these XML code fragments might use some tags with same name. Then, this confuses the XML parser. To avoid these kind of name conflicts, XML Namespaces are used.

Example

Assume we have created an XML element holding the information about a coffee table −

OvalWood315000

Suppose we have created another element which holds information about a dining table as −

RectangleMarble625000

When the above two XML code fragments are added together (in a single file), there will be a name conflict. Though the name of both the elements is same the information provided by them varies. There are two ways to resolve these name conflicts in XML. They are −

  • Using Prefix
  • Using namespace declaration

Using Prefix

We can differentiate the elements by adding prefix to them. To resolve the above name conflict, we can add the prefix 'c' to the element holding the info about the coffee table and similarly we can add the prefix 'd' for the other element (dining table).

Example

Let us take the same table example and try to resolve the name conflict using prefixes.



	Oval
	Wood
	3
	15000




    Rectangle
    Marble
    6
    25000

Drawbacks of Using Prefixes

While using prefixes there still might be a chance where two elements have same prefix along with the name. In such cases the conflict prevails.

Suppose if we add another element providing info about a dressing table, to differentiate, we need to use the prefix 'd'. This again brings a conflict between dining table and dressing table. Hence, using prefix can solve the conflict to some extent but not completely.

Using "namespace" Declaration

XML namespace declaration is used to resolve name conflicts effectively. A new attribute named 'xmlns' is used.

Syntax

Following is the syntax for XML namespace −


Where,

  • element-name: Element name on which the namespace is used.
  • xmlns: A compulsory keyword to declare namespace.
  • prefix: Namespace prefix
  • URI: Namespace identifier

Example

The following example uses XML namespace declaration for three table tags. Now, the conflict between dining table and dressing table is resolved by differentiating them in their namespace URI.




	Oval
	Wood
	3
	15000




    Rectangle
    Marble
    6
    25000




    Trevi Furniture
    Engineered wood
    15000

Advertisements