difference between #PCDATA and CDATA?

Hi, I am new to XML and its related technologies. I have started preparing for XML certification and have the following questions. Please help me. Q#1: What is the difference between #PCDATA and CDATA? What does the parser have to do with these?

Go to http://www.w3.org/TR/REC-xml and search for the terms you have questions about. You will not only find your answer, but also learn far more than that. If it is a little over your head, buy or find a book, or look for free books/chapters online. If you give me a fish, you feed me for a day. If you teach me how to fish, you feed me for life.

Hi Vivek, #PCDATA for elements = CDATA for attributes. When you specify #PCDATA for your element or CDATA for your attribute, it means that you cannot put any markup inside the element or attribute, respectively. Going by this, your second question is also answered. Thanks, Mohan

Vivek Saxena
Ranch Hand

Joined: Apr 24, 2002
Posts: 58

posted Jan 27, 2003 11:12:00

Hi, I was looking at the tutorial at http://www.w3schools.com/dtd/dtd_building.asp and I found the following definitions of PCDATA and CDATA. I really couldn't understand the technicalities of the statements with respect to the parser.

For PCDATA: PCDATA is text that will be parsed by a parser. Tags inside the text will be treated as markup and entities will be expanded.

For CDATA: CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.

<h1>Titel</h1>

Here <h1> is a tag and will be treated as markup. However, if the same thing is put inside a CDATA section, it is no longer a tag or markup:

<![CDATA[ <h1>Titel</h1> ]]>

My suggestion: read their HTML/XHTML tutorials before DTD/XML.
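To see the difference in practice, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The wrapper element name `doc` is only an assumption for illustration (a document needs a single root), and the spaces inside the CDATA section are dropped to keep the output tidy.

```python
import xml.etree.ElementTree as ET

# <h1> appears as ordinary markup: the parser builds a child element from it.
as_markup = ET.fromstring("<doc><h1>Titel</h1></doc>")

# The same characters inside a CDATA section are plain character data.
as_cdata = ET.fromstring("<doc><![CDATA[<h1>Titel</h1>]]></doc>")

print(len(as_markup), as_markup[0].tag)  # 1 h1
print(len(as_cdata), as_cdata.text)      # 0 <h1>Titel</h1>
```

In the first case the parser reports one child element named h1; in the second it reports no children, only the literal text.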

Vivek Saxena
Ranch Hand

Joined: Apr 24, 2002
Posts: 58

posted Jan 28, 2003 00:05:00

Roseanne Zhang, thanks for your kind advice. I will give it some thought. I think I didn't explain my problem properly, or you didn't understand it. I am trying to find out the following:

What exactly does #PCDATA mean when we define element content as #PCDATA?
<!ELEMENT name (#PCDATA)>

And what exactly does CDATA mean when we define an attribute type as CDATA?
<!ATTLIST name lastname CDATA "saxena">

I understand that both represent a string or text, but there are some differences that I don't understand. I was reading a book and found the following definitions:

"Keyword PCDATA specifies that the element must contain parsable character data – that is, any text except the characters less-than (<), greater-than (>), ampersand (&), quote (') and double quote (")."

and

"Attribute types are classified as either strings (CDATA), tokenized or enumerated. String (CDATA) attribute types do not impose any constraint on attribute values – other than disallowing the <, >, &, ' and " characters. Entity references must be used for these characters."

So I understand one thing for sure: there are some constraints that are imposed by #PCDATA (when defined as element content) but not by CDATA (when defined as attribute type), other than disallowing the <, >, &, ' and " characters. I need help identifying those constraints. Maybe I am too dumb to understand. Maybe that is why I am looking for help.

One more thing I found:

"The CDATA keyword in an attribute declaration has a different meaning than the CDATA section in an XML document. In a CDATA section all characters are legal (including the <, >, &, ' and " characters) except the "]]>" end tag."

So what you explained to me has nothing to do with my problem. That much I understand. Looking forward to getting some light here.

Thanks,
vivek

[ January 28, 2003: Message edited by: Vivek Saxena ]
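The two declarations from this post can be tried out with a small Python sketch, again using xml.etree.ElementTree, which is non-validating but does read the internal DTD subset. The sample values and the helper `is_well_formed` are assumptions made up for illustration. It suggests that, as far as the parser is concerned, #PCDATA content and CDATA attribute values behave alike: entity references are expanded in both, and a literal < is rejected in both. It also suggests the book's list is stricter than the parser, since a literal > is accepted in either place.

```python
import xml.etree.ElementTree as ET

doc = '''<!DOCTYPE name [
  <!ELEMENT name (#PCDATA)>
  <!ATTLIST name lastname CDATA "saxena">
]>
<name lastname="a &lt; b &amp; c">x &lt; y &amp; z</name>'''

root = ET.fromstring(doc)
print(root.text)             # x < y & z   (entities expanded in #PCDATA)
print(root.get("lastname"))  # a < b & c   (and in the CDATA attribute)


def is_well_formed(xml_text):
    """Hypothetical helper: True if the parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed('<name>x < y</name>'))                   # False
print(is_well_formed('<name lastname="a < b"/>'))             # False
print(is_well_formed('<name lastname="a > b">x > y</name>'))  # True
```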

Roseanne Zhang
Ranch Hand

Joined: Nov 14, 2000
Posts: 1953

posted Jan 28, 2003 06:52:00

You've got all the excellent quotes; I could not explain it better than those. However, I can reinforce them:

Originally posted by vivek:

CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.

Keyword PCDATA specifies that the element must contain parsable character data – that is, any text except the characters less-than (<), greater-than (>), ampersand (&), quote (') and double quote (").

Attribute types are classified as either strings (CDATA), tokenized or enumerated. String (CDATA) attribute types do not impose any constraint on attribute values – other than disallowing the <, >, &, ' and " characters. Entity references must be used for these characters.

Good summary! It helped me. It will help others who have similar problems. Thanks!