Office of Continuing and Professional Studies
Brandeis University
Rabb School of Summer, Special and Continuing Studies
Waltham, MA 02254-9910
Course: XML
Instructor
Sang Shin (sang.shin@sun.com),
Java Technology Evangelist, Sun Microsystems, Inc.,
(781) 442-0531 (office)
Biography of Sang Shin
Sang Shin's Speaking Engagement Schedule
Office hours: No scheduled office hours. The best way to contact me is by email.
Class website is http://www.javapassion.com/xml/index.html
Courses that are taught by Sang Shin
XML (this course)
Distributed Programming using Jini Technology
Web Services Programming using XML and Java Technology
J2EE Programming
About This Class
General Policies
XML Resources
Official Course Title and Description (as in Brandeis brochure)
- Course Number: RSEG-0151-G
- Title: Next Generation Electronic Publishing in XML and Related Language
- Description: "This course surveys the open standards
that are
making documents increasingly interchangeable, searchable, dynamic, and
customizable. Topics include SGML, XML, and XML parsers,
including the DOM and SAX interfaces, XPath, XSL, XSLT, XHTML, other
emerging standards, and mainstream electronic publishing technologies
concerning page description languages, colors and fonts. Prerequisite:
A working knowledge of HTML and a foundation knowledge of Java or Perl."
Unofficial Course Title and Description (at least, this is what I am planning to teach)
- Course Number: RSEG-0151-G (Either this number or the official course description may need to change to reflect the somewhat different nature of the class.)
- Title: Applied XML (I am trying to come up with a better name for this class. Suggestions are welcome.)
- Description: "This course starts with a high-level overview of XML technology and its applications. The overview introduces the fundamental concepts of XML and the reasons why XML is important. The application areas covered include XML as a document format and how XML is currently being used on the Web. During the first half of the course, the grammatical and syntactical aspects of XML and related technologies such as DTD, XML Schema, XSL, XSLT, XPath, Namespaces, Internationalization, XLink, and XPointer are covered in detail. The course then moves on to the programming aspects of XML, which include parsing and transformation. The programming language for this class is Java, although students with some programming experience in C or C++ should still be able to follow most of it. For parsing, both SAX and DOM are covered. The core Java APIs for XML that are currently being defined through the Java Community Process (JCP) are also introduced: JAXP (Java API for XML Parsing), JAXB (Java API for XML data binding), and JAXM (Java API for XML messaging). Open source XML projects, especially those from the Apache organization such as Cocoon, are discussed. XML usage in the context of enterprise application technologies such as JSP and Servlet is covered if time permits. Finally, emerging XML-based e-commerce standards such as SOAP/W3C's XP, UDDI, and ebXML are explored. Prerequisite: Some programming experience and familiarity with HTML. Also, if time permits, emerging Java APIs for XML such as "Java APIs for XML-RPC" and others will be explored."
Course Objectives
- By the end of the course, students are expected to:
- Understand the fundamental concepts of XML and related technologies
- Know how XML is currently being used in various application areas
- Understand the syntactic and semantic aspects of XML documents
- Know how to parse and transform XML documents with tools and through programming APIs
- Have some exposure to XML activities in the e-commerce area
Syllabus (actually a collection of topics at this point)
You are welcome to use the material here in any way you want. In fact, I strongly encourage folks to use it, for example, to teach a class or give an in-house seminar on these subjects. The files here are in PDF form, but PowerPoint or StarOffice files are available upon request. If you use any of the materials here, please drop me an email (sang.shin@sun.com) so that I can keep track of who is using what. I am also available to give in-house seminars on these subjects as long as travel expenses are paid. (The seminar itself will be free of charge.) Thanks in advance.
Warning: The topics themselves and the number of hours assigned to each topic are subject to change as we move along.
Warning: The order of the topics to be covered is also subject to change as the instructors see fit.
Warning: Given the time constraints we have, we might not be able to cover all the topics listed here.
- Overview of XML technologies (2 hours)
- XML Fundamentals
- XML Programming (in Java)
- SAX (1 hour) - concept and APIs
- DOM (2 hours) - concept and APIs
- JDOM (2 hours) - concept and APIs
- JAXP (1 hour) - concept and APIs
- JAXB (1/2 hour) - mostly concept since the APIs are not published yet
- JAXM (1/2 hour) - mostly concept since the APIs are not published yet
- Emerging Java APIs for XML
- XML in Enterprise Application
- XML in e-Commerce (Web Services)
- Apache.Org Open Source XML Projects
- Cocoon (2 hours) - Web-publishing framework, XSP
- Batik (1/2 hour) - SVG (Scalable Vector Graphics)
- FOP (1/2 hour) - XSL Formatting Objects processor
- XML Tools
Things we are NOT going to cover
- SGML
- HTML
- Detailed syntax and vocabularies of the many domain-specific XML languages (e.g. CML, MathML, and so on)
Grading Criteria
- Homework assignments: 20%
- Only one or two will be randomly selected for grading
- Submission is a must unless specified otherwise; there will be some homework that does not require submission
- Missing homework could account for up to 5%
- 4 to 6 assignments are expected during the course
- Each homework assignment is expected to take between 3 and 6 hours to finish
- For consistency of grading, a particular week's homework assignment will be graded by a single instructor or by both instructors
- Midterm project (or exam): 25%
- Final project: 45%
- Class participation/attendance: 10%
- Research paper (optional, could earn bonus points of up to 15%)
- 6 to 10 pages
- Any subject related to XML is acceptable
- The originality of your idea needs to be clearly specified
Ground Rules regarding the usage of the Java programming language in this class (until the course description gets changed)
- Basic XML concepts (along with related technologies) can certainly be taught without requiring knowledge of a particular programming language.
- As for programming, parsing and transformation can be done in Java as well as in other languages. I will talk about SAX and DOM from both the concept and the API perspective. I will present the APIs in Java, but any student with some programming background should be able to understand them. Again, students can use any tool or programming language for parsing and transformation. During class, however, I will only mention Java APIs and Java tools. Students who need to use non-Java tools and programs will need to do their own search for programming tools.
- Java APIs for XML session (about 3 to 4 hours): there are a lot of concepts to be understood that are relevant to all XML programmers regardless of programming language.
- Programming assignments (if we have any) can be done in any programming language.
- Again, the Apache projects deal with a lot of concepts and architecture even though most of them use Java as the implementation language.
- Both the mid-term and final projects can be implemented in any programming language.
- The bottom line is that if there are students who do not have extensive Java programming experience, I will find ways not to penalize them in either their learning experience or their grades in this class. (This policy will definitely change if I teach this course again.)
Prerequisites
- Some programming experience in Java, C or C++, preferably in Java
- Java language programming experience is highly desirable but not required
- Next semester, if I teach this class again, however, I will require a minimum of 6 months of Java language programming experience or one semester of Java language level 1.
Textbooks & Reference books
- "XML Bible" (This is not my first choice, but since it was mentioned in the Brandeis brochure as the "recommended textbook in this class" and since some folks may have already bought the book, I will use it as the "class textbook" for this semester. But I am NOT going to require students to buy this book. We are planning to give a reading assignment before each class, but many of the readings are available for free from various web sites. As long as students read some relevant chapter or article, I do not really mind which one. Next semester, if I teach the same class, I will change the recommended textbooks to "XML in a Nutshell" for the XML concept part and "Java and XML" for the Java programming for XML part.)
- "XML in a Nutshell" written by Elliotte Rusty Harold & W. Scott Means from O'Reilly (I think this is a very well-written book on the fundamental concepts of XML and related technologies. It will be my textbook of choice for XML next semester. By the way, it is written by the same author who wrote "XML Bible". As the author himself claims, this book seems much better organized and "to the point". It is also more up-to-date, published in Jan. 2001. A lot of the class presentation material is based on the contents of this book.)
- "Java and XML" written by Brett McLaughlin from O'Reilly (If you want to quickly learn the basics of how to program in Java for XML processing, I would recommend this book. The contents on SAX and DOM, however, are relatively introductory. If you want a detailed description of the Java APIs for SAX and DOM, try the API documentation itself and take a look at the sample code in Xerces. One plus point for this book is that it has chapters on JDOM and XML-RPC.)
- "Professional XML" from Wrox Press (I saw a few kudos for this book on the net and from one of the students. I have not had a chance to check it out.)
- "Applied XML Solutions" from SAMS, written by Benoit Marchal (If you feel comfortable with fundamental XML concepts and are looking for a book that shows how XML is being used in real-life projects, this book is recommended. It contains practical example scenarios along with XML and Java source files.)
- "XSLT Programmer's Reference" written by Michael Kay from Wrox Press (Recommended by many as the most extensive XSLT reference book. For this class, the level of XSLT coverage in "XML in a Nutshell" or "Java and XML" would be quite adequate.)
- "Learning XML" written by Erik Ray from O'Reilly (The technical detail is too skimpy for my taste. It might be good for managers, like the one in Dilbert, but not for us. :-))
Schedule (Class is held every Wednesday night from 6:00 PM to 9:00 PM)
- Jan. 31st (1st class)
- Topics to be covered (and presentation material)
- XML Overview, 1st part (1 hour 15 minutes)
- Pre-class reading material
- Homework assignment (No submission required)
- Download the AMAYA browser and try to display the following MathML data (you have to download the XML data files first before you can see the math forms of the data)
- Feb. 7th - Cancelled
- Feb 14th (2nd class)
- Topics to be covered (and presentation material)
- XML Overview, 2nd part (1 hour)
- XML grammar (1 hour) (This presentation material will be redone!)
- DTD (1 hour) (This presentation material will be redone!)
- Pre-class reading material
- XML Overview
- XML Fundamentals (Chap. 2) of "XML in a Nutshell"
- Any introduction chapter of any XML book
- DTD
- Homework assignment (HTML)
- Feb 21st (3rd class)
- Topics to be covered (and presentation material)
- Namespaces (3/4 hour)
- Internationalization (1/2 hour)
- XML as a Document Format (1/2 hour)
- Pre-class reading material
- Namespaces
- Namespaces (Chap. 4) of "XML in a Nutshell" (I like this book the best for this topic. In fact, the presentation material is mostly based on the Namespaces chapter of this book)
- Corresponding chapter of any XML book
- Internationalization
- Internationalization (Chap. 5) and Character set (Appendix) of "XML in a Nutshell" (I like this book the best for this topic. In fact, the presentation material is mostly based on the Internationalization chapter of this book)
- Corresponding chapter of any XML book
- XML as a Document format
- XML as a Document Format (Chap. 6) of "XML in a Nutshell"
- Homework assignment #3
- Add at least two namespace declarations to the XML document you created for assignment #2, change the elements and attributes accordingly, and then validate the document against a properly modified DTD
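To see what the two namespace declarations change from a program's point of view, the following is a minimal sketch, assuming a JAXP parser is available (recent JDKs bundle one). The class name `NamespaceDemo`, the sample document, and the `http://example.com/...` URIs are hypothetical and are not part of the assignment. Note that a DTD is not namespace-aware: it matches prefixed names such as `inv:item` literally, which is why the DTD has to be modified along with the document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class NamespaceDemo {

    // Hypothetical document with two namespace declarations on the root element.
    static final String SAMPLE =
        "<inv:inventory xmlns:inv='http://example.com/inventory'"
      + " xmlns:p='http://example.com/price'>"
      + "<inv:item><p:price>10</p:price></inv:item>"
      + "</inv:inventory>";

    /** Counts the elements that have the given namespace URI and local name. */
    static int count(String xml, String namespaceUri, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // namespace processing is off by default in JAXP
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagNameNS(namespaceUri, localName).getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the document", e);
        }
    }

    public static void main(String[] args) {
        // The same local name "price" is only found under the namespace it was declared in.
        System.out.println(count(SAMPLE, "http://example.com/price", "price"));     // prints 1
        System.out.println(count(SAMPLE, "http://example.com/inventory", "price")); // prints 0
    }
}
```

The element is identified by its namespace URI plus local name; the prefix (`inv`, `p`) is only a shorthand and could be changed without affecting the result.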
- Feb 28th (4th class)
- Topics to be covered (and presentation material)
- Pre-class reading material
- Example code
- XSLT example code zip file
- Instructions for running Apache Xalan with these example XSL stylesheet files
- Download and install the JDK, and make sure you have "java" in your PATH
- Download Apache Xalan, following the installation instructions for the Java version of Xalan. (Those of you who want to use the C++ version are welcome to do so.) Basically, you have to make sure the jar files are in your classpath when you run Xalan as described below.
- Then run the example program as follows (this assumes Xalan is installed in c:\xmltools\xalanj\xalan_1_0_0; replace it with the installation directory you used)
- set CLASSPATH=.;c:\xmltools\xalanj\xalan_1_0_0\xalan.jar;c:\xmltools\xalanj\xalan_1_0_0\xerces.jar;c:\xmltools\xalanj\xalan_1_0_0\bsf.jar;c:\xmltools\xalanj\xalan_1_0_0\bsfengines.jar;c:\xmltools\xalanj\xalan_1_0_0\samples\xalansamples.jar
- java org.apache.xalan.xslt.Process -in 8-1.xml -xsl 8-2.xsl -out junk
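A transformation like the one above can also be driven from Java code instead of the command line. The following is a minimal sketch, assuming an XSLT processor that implements the JAXP transformation API (`javax.xml.transform`) is on the classpath, which is the case for recent JDKs. The class name `ApplyStylesheet` and the inline sample document and stylesheet are hypothetical; they are not the files from the example zip.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class ApplyStylesheet {

    // Hypothetical input document.
    static final String SAMPLE_XML =
        "<people><person>Alice</person><person>Bob</person></people>";

    // Hypothetical stylesheet: writes each person's name followed by a semicolon, as plain text.
    static final String SAMPLE_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='people/person'>"
      + "<xsl:value-of select='.'/><xsl:text>;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet text to the XML text and returns the output as a string. */
    static String transform(String xml, String xsl) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            // Compile the stylesheet, then run it against the source document.
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSL)); // prints Alice;Bob;
    }
}
```

To work with files instead of strings, `StreamSource` and `StreamResult` also accept a `java.io.File`.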
- Homework assignment #4 (No submission required)
- Run the example code using either Xalan or SAXON
- Mar 7th (5th class)
- Topics to be covered (and presentation material)
- XPath (1 1/2 hours)
- XML on the Web, XHTML (1 hour)
- CSS (1/3 hour)
- Pre-class reading material
- XPath
- XML on the Web, XHTML
- CSS
- Homework assignment #5 (No submission required)
- Play around with the various browsers/applications/tools (IE, Netscape, Mozilla, Amaya, Opera, and others) that can read XML documents, such as XHTML and XML documents with CSS, and report what you found (comments, questions, answers, or whatever) to the class alias.
- Create your own CSS stylesheet for the XML document from assignment #2 and display it with either IE 5.5 or Netscape 6.0.
- Mar 14th (6th class)
- Topics to be covered (and presentation material)
- SAX (1 hour)
- DOM (2 hours)
- Pre-class reading material
- Homework assignment #6
- Write code in either Java or C++ that reads the XML document from assignment #2 and prints out exactly the same XML document. Both SAX-based and DOM-based code should be written. Please submit the code as well as the result. (Those of you who do not have programming experience, please let me know immediately.)
- Bonus point project #6B
- Generate a DOM tree representing the XML document from assignment #2, and then write it out to a file.
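As a starting point for the DOM side, here is a minimal sketch of reading a document into a DOM tree and writing it back out. It is not a model solution: it assumes a JAXP parser and transformer are available, it obtains the parser through JAXP rather than instantiating a specific parser class, and the class name `DomRoundTrip` and the sample document are hypothetical. The output is equivalent XML, but it is not guaranteed to match the input character for character (quote style, whitespace inside tags, and the XML declaration can differ).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class DomRoundTrip {

    /** Parses XML text into a DOM tree. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the document", e);
        }
    }

    /** Serializes a DOM tree back to XML text using an identity transformation. */
    static String serialize(Document doc) {
        try {
            // A Transformer created without a stylesheet copies its input unchanged.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize the document", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample; replace with the contents of your assignment #2 document.
        Document doc = parse("<pets><dog color=\"brown\">Rex</dog></pets>");
        System.out.println(serialize(doc));
    }
}
```

For bonus project #6B, the same `serialize` approach works with `new StreamResult(new java.io.File(...))` in place of the `StringWriter`.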
- Mar 21st (7th class: Mid-term project due date)
- Topics to be covered (and presentation material)
- JAXP, JAXB, and JAXM (2 hours)
- Pre-class reading material
- Homework assignment #7
- Rewrite homework assignment #6 (only SAX part) using JAXP
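The point of JAXP here is that the code asks a factory for a SAX parser instead of naming a specific parser class, so the parser implementation can be swapped without code changes. The following is a minimal sketch of that pattern, not a solution to the assignment; the class name `JaxpSaxDemo` and the sample document are hypothetical. It only records element names, whereas the homework would also need to handle attributes, character data, and end tags.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class JaxpSaxDemo {

    /** Returns the names of all elements in document order, collected from SAX events. */
    static List<String> elementNames(String xml) {
        final List<String> names = new ArrayList<>();

        // DefaultHandler provides empty callbacks; override only the events of interest.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                names.add(qName);
            }
        };

        try {
            // The factory picks whichever SAX parser implementation is configured.
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a><b/><c><b/></c></a>")); // prints [a, b, c, b]
    }
}
```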
- Mar 28th (8th class: Final project proposal due date)
- Topics to be covered
- Pre-class reading material
- April 4th (9th class)
- Topics to be covered
- Pre-class reading material
- April 11th (No class, spring break)
- April 18th (10th class)
- Topics to be covered and presentation material
- J2EE and XML Overview (JAXP, XML with JSP/Servlet, XML and JMS, XSLTC) (1/2 hour)
- XML and Database (1 hour)
- Web Services (1/2 hour)
- SOAP, W3C XP (1/2 hour)
- Pre-class reading material
- April 25th (Final project due date. Makeup class if needed; otherwise we will not have a class. I will be out of town this week, going to China and Taiwan. Your grade will be sent to Brandeis directly within 2 or 3 weeks from this day.)
- Things we might not be able to cover due to time constraints
Weekly presentation material
- It is our (the instructors') goal to post the presentation material by Monday of each week so that students have a chance to see it before the class. (I am sure there will be exceptions, however.)
- The presentation material will be posted in PDF format, both as a single slide per page and as 6 slides per page.
- I am constantly updating my presentation material, even the material I have already presented. I give a lot of talks on various topics including XML, so I keep going back to old presentation material, including the material I am using for this class, to update it with newer and hopefully better content. Once in a while I will also update the whole content. Typically, when I post the material a couple of days before the class, the content should be about 95% stable, so the hardcopies you have should be sufficient for writing notes and would require only minor modifications. I will also update the material right after the class so that you have at least the same version you saw during the class. I will also include the date and time of last modification whenever possible.
- Weekly presentation material will be posted under class schedule
Weekly pre-class reading material
Homework assignments
- For homework grading policy, see homework assignment section of Grading policy
- Actual home work assignments will be posted under class schedule
- Homework or projects can be submitted either by email or in paper form
Classroom location
- Room 202 in the Shiffman building. Check the Brandeis website (http://www.brandeis.edu/directions.html) for directions and a campus map. The Shiffman building is one of the four buildings of the Humanities Quad. Once the campus map is loaded, search for the Shiffman building by selecting the search criterion "By building" in the lower left corner.
Late submission policy
- On or before due date: 100%
- After due date and before or on next class: 75%
- After next class meeting (anytime): 50%
- Homework assignment is due by the next class unless specified
otherwise
- Mid-term and final project due dates are specified in the class schedule section of this document
Learning Disabilities
- If you are a student with a documented disability on record
at Brandeis University and wish to have a reasonable accommodation made
for you in class,
please see me immediately.
Academic Honesty:
- As stated in the Rights and Responsibilities handbook,
"Every member of the University community is expected to maintain the
highest standards of academic honesty. A student shall not receive
credit for work that is
not the product of the student's own effort".
brandeis-xml-2001@east.sun.com alias usage guidelines
- This email alias was set up for students to share knowledge and
exchange ideas among themselves during the semester period.
Students are encouraged to post questions, interesting articles, tools,
sample programs or anything that is relevant to XML.
Homework assignment and project submission status
- This table will be updated after Wednesday's class and by Thursday noon.
- If you see any errors, please let the instructor know immediately.
- Homework file naming convention: I save files into each student's file folder for later browsing, so there will be many files from several homework assignments and from the mid-term and final projects. Naming your files according to the following convention will help me a lot in identifying which files belong to which assignment.
- hw2_filename.ext homework assignment #2 files
- hw3_filename.ext homework assignment #3 files
- hw3o_filename.ext homework assignment #3 optional (extra credit)
- mid_filename.ext midterm project files
- fin_filename.ext final project files
- pap_filename.ext optional research paper files
- I will use the following symbols
- p - paper copy submitted
- m - electronic copy emailed
- l - late submission (paper or electronic)
- n - did not submit
- c - checked by the instructor
- g - graded by the instructor
- d - dropped the course
Mid-term project
- (Message posted on Feb. 21st.) For the mid-term, you can do either the XSLT project I have in mind or an appropriate programming project of your choice. (I do not plan to accept a paper as a replacement for the mid-term project.) I will send out details sometime before this weekend. (I am still trying to scope out the amount of work that would be appropriate for a mid-term project.) Since I am planning to give out the mid-term project topic (XSLT), there is no need to turn in a mid-term project proposal. As for the programming project, we are going to cover SAX and DOM in our 5th class. (For the detailed class schedule, please check the website.) As it currently stands, the mid-term project is due by the 6th class, which might not provide adequate time for folks who want to do a programming project. So I will extend the due date to the 7th class, which is March 21st.
- Midterm project
- In this project, you will simulate a very simple e-commerce site (pet store) using XML.
- The store carries three categories of pets: dogs, cats, and elephants.
- Each pet category has different types. There are 4 types of dogs: dogType1, dogType2, dogType3, dogType4 (I told you I am a creative person, didn't I?). Cats can be one of three types: cutty, cuttier and cuttiest. Elephants can be either smelly or biggy. Let's call each of these a pet type.
- Each pet type has price and quantity sub-elements.
- Each pet type has color and origin attributes. The origin attribute has to be either Korea or China.
- All pet types of the Dog category have a height sub-element (in addition to the price and quantity sub-elements).
- All pet types of the Cat category have another attribute, size.
- Some pet categories have a lifeExpectancy attribute.
- Some pet types have a sale attribute.
- Some pet types have a soldout sub-element.
- The inventory information of this pet store is maintained in a single XML file.
- A buyer should be able to see the following HTML pages. All these HTML pages need to be generated from the XML inventory document by applying XSLT stylesheets. Your job is to create a stylesheet for each of the HTML pages. Make sure each HTML page has an appropriate title.
- HTML page1: Pet categories available in the store. These pet categories should be displayed in alphabetical order, that is, in the order of cats, dogs, and elephants.
- HTML page2: A page in which the view of HTML page1 is expanded so that each pet category shows its pet types underneath. That is, the dog category now has dogType1, dogType2, dogType3, and dogType4. The pet types are displayed in descending alphabetical order. Each pet type should also display price and quantity information.
- HTML page3: A page in which only the pet types whose origin is China are displayed. The pet types are to be displayed only with color and price information, plus size if size information is available.
- HTML page4: A page in which the pet categories whose pet types have size information are displayed.
- HTML page5: A page in which the pet types that have either a lifeExpectancy attribute or height information are displayed.
- HTML page6: A page in which the pet types that are on sale right now are displayed. The category is to be displayed as well.
- HTML page7: A page in which the categories and pet types that are not on sale are displayed with price and quantity information.
- HTML page8: A page in which the pet types whose price is less than 100 dollars are displayed.
- HTML page9: Page 2 with total price and quantity information for each category as well as total price and quantity for the whole store.
- HTML page10: Page 3 with total price and quantity information.
- I would like to see the following as the outcome of your mid-term project (Please submit your mid-term project in hard-copy. It will help me a lot. Thank you!)
- XML inventory document
- DTD
- Stylesheets
- HTML documents and their screen shots
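Several of the pages above come down to XPath expressions (filtering on an attribute, comparing a price, summing values), and it can help to try such expressions out before putting them into a stylesheet. The following is a minimal sketch using the standard `javax.xml.xpath` API. The class name `InventoryQueries` and the inventory layout (a `petstore` root, one element per category, one element per pet type) are assumptions made for illustration; the project description does not prescribe a layout.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class InventoryQueries {

    // Hypothetical inventory: root / category element / pet type element.
    static final String SAMPLE =
        "<petstore>"
      + "<dogs>"
      + "<dogType1 color='brown' origin='Korea'>"
      + "<price>120</price><quantity>3</quantity><height>40</height>"
      + "</dogType1>"
      + "</dogs>"
      + "<cats>"
      + "<cutty color='white' origin='China' size='small'>"
      + "<price>80</price><quantity>5</quantity>"
      + "</cutty>"
      + "</cats>"
      + "</petstore>";

    /** Evaluates an XPath expression against the XML text and returns its numeric value. */
    static double number(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Double) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Bad expression: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Pet types whose origin is China (the filter behind page 3).
        System.out.println(number(SAMPLE, "count(/petstore/*/*[@origin='China'])")); // prints 1.0
        // Pet types cheaper than 100 dollars (the filter behind page 8).
        System.out.println(number(SAMPLE, "count(/petstore/*/*[price < 100])"));     // prints 1.0
        // Sum of all price values in the store (one ingredient of the page 9 totals).
        System.out.println(number(SAMPLE, "sum(/petstore/*/*/price)"));              // prints 200.0
    }
}
```

The same expressions can be used in the `select` and `test` attributes of a stylesheet, with one caveat: inside an XML attribute the `<` character has to be written as `&lt;`.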
Final-term project guidelines
Proposal
- Half a page or a full page explaining what you are planning to do
- The purpose of the proposal is for the instructors to get a heads-up on what you are doing
- Will not be used for grading
- Possible Final-term project ideas
- XML-RPC (Example: Build B2B application using XML-RPC)
- XML as data storage format (XML for configuration data)
- Web publishing framework (Example: Use or augment a web publishing framework such as Cocoon from Apache or Enhydra)
- Domain-specific application of XML (MusicXML, MathML, SVG)
- SOAP (Example: Build B2B application using SOAP)
- JAXM (ebXML) (Example: Build B2B application or web application leveraging JAXM prototype)
- New and useful Java utility classes
- XML and database (Example: Investigate how XML query languages such as XQL and XQuery can be used. See if using XML as a database's internal storage format gives a real advantage in some applications.)
- JMS and XML (Example: Use open source JMS messaging server and
construct XML message based B2B application)
- XML Tools (Example: Research of XML tools and their usage with
some
demo)
- Search for problems people are trying to solve in the http://archives.java.sun.com/archives/xml-interest.html and http://lists.xml.org/archives/xml-dev/ mailing list archives and work on them
- Any work in an Apache XML open source project (Grab a reasonable piece of work to be done and do it!)
- Anything else that might interest you or something that you
can readily apply in your own work environment
- Final project format (recommended but does not have to be followed exactly) (Still under construction)
- 1 page project description (can be a rehash of proposal)
- Description of project objective
- Describe what problem or a set of problems XML solves
(compared to
other schemes/technologies)
- Architectural diagram (does not have to be computer
generated but
must be clearly readable) and description
- should give a reader a high level view of the system
- should give a reader where and how XML is used
- Design and implementation description
- Describe all the possible design and implementation choices, possibly with their pros and cons
- Describe why a certain design and implementation choice was made (it could be due to many things such as better scalability, higher performance, security, higher flexibility, better maintainability, etc.)
- Test results
- printouts or screen shots
- Project evaluation
- clearly document if you met your objective described in the
project
description
- describe what worked well and what did not
- describe 3 lessons you learned
- Submission material (Hard-copy required - I need to make notes for grading and I cannot afford to print all the materials coming to me through email.)
- Things mentioned above
- Code (detailed comments are a must) and output
Optional term-paper guidelines
Biography of Instructor
Sang Shin is presently working for Sun Microsystems as a Java(tm) Technology Evangelist and Technology Architecture consultant. He is currently based in the Boston area, and his duties include evangelizing and consulting on important Java technologies such as Jini(tm) Network Technology, J2EE, EJB, JMS, J2ME, "Java and XML", and Web services technologies such as Sun ONE architecture, ebXML, WSDL, SOAP, and UDDI to a worldwide developer audience. He frequently speaks on these topics at various technical conferences. (Please check his schedule for 2001 and 2002.)
Sang Shin has been with Sun Microsystems for over 13 years, working on various research and engineering projects, mostly in data communication, networking, internet, and Java related areas. Prior to Sun, he worked at several startup companies in various engineering and managerial capacities. Whenever he finds time, he also teaches one of three software engineering courses ("XML", "Distributed programming using Jini networking technology", "Web Services programming using XML and Java programming language") at Brandeis University in Massachusetts.
XML Resources on the Web (I am still in the process of building this up)
- XML Basics (XML, DTD)
Tutorials
- XML Programming with Java
Tutorials
- Web Publishing Framework
- XHTML
- XML with JSP/Servlet
- XML and JMS
- XML and Database
- SOAP
- XML-RPC
- XML Query
- ebXML
- Emerging Java APIs for XML
- XML Tools