Office of Continuing and Professional Studies
Brandeis University
Rabb School of Summer, Special and Continuing Studies
Waltham, MA 02254-9910
Course: Next Generation Electronic
Publishing in XML and Related Language
(Warning: This course title does not exactly reflect what I plan to
cover this semester. Please take a look at the "Unofficial course
description".)
Instructor
Sang Shin (sang.shin@sun.com),
Java Technology Evangelist, Sun Microsystems, Inc.,
(781) 442-0531 (office)
Biography of Sang Shin
Sang Shin's Speaking Engagement Schedule
Office hours: No scheduled office hours. The best way to contact
me is by email.
Class website is http://www.plurb.com/misc/xml/brandeis-xml-2001.html
Use the mailing list brandeis-xml-2001@east.sun.com for
classwide issues.
About This Class
General Policies
XML Resources
Official Course Title
and Description (as in Brandeis brochure)
-
Course Number: RSEG-0151-G
-
Title: Next Generation Electronic Publishing in XML and Related
Language
-
Description: "This course surveys the open standards that are making
documents increasingly interchangeable, searchable, dynamic, and customizable.
Topics include SGML, XML, and XML parsers, including the DOM and SAX interfaces,
XPath, XSL, XSLT, XHTML, other emerging standards, and mainstream electronic
publishing technologies concerning page description languages, colors and
fonts. Prerequisite: A working knowledge of HTML and a foundation knowledge
of Java or Perl."
Unofficial Course Title
and Description - (At least, this is what I am planning to teach)
-
Course Number: RSEG-0151-G (Either this number or the official course
description may need to change to reflect the somewhat different nature
of the class.)
-
Title: Applied XML (I am trying to come up with a better
name for this class. Suggestions are welcome.)
-
Description: "This course starts with a high-level overview of XML
technology and its applications. The overview introduces the fundamental
concepts of XML and the reasons why XML is important. The application
areas covered include XML as a document format and how XML is currently
being used on the Web. The first half of the course covers in detail the
grammar and syntax of XML and of related technologies such as DTD, XML
Schema, XSL, XSLT, XPath, Namespaces, Internationalization, XLink, and
XPointer. The course then moves on to the programming aspects of XML,
which include parsing and transformation. The programming language used
for XML programming in this class is Java, although students with some
programming experience in C or C++ should be able to follow most of it.
For parsing, both SAX and DOM are covered. The core Java APIs for XML
that are currently being defined through the Java Community Process (JCP)
are also introduced: JAXP (Java API for XML Parsing), JAXB (Java API for
XML data binding), and JAXM (Java API for XML Messaging). Open source XML
projects, especially those from the Apache organization (for example,
Cocoon), are discussed. XML usage in the context of enterprise
application technologies such as JSP and Servlets is covered if time
permits. Finally, emerging XML-based e-commerce standards such as
SOAP/W3C's XP, UDDI, and ebXML are explored. If time permits, emerging
Java APIs for XML such as "Java APIs for XML-RPC" are also explored.
Prerequisite: Some programming experience and familiarity with HTML."
Course Objectives
-
By the end of the course, students are expected to
-
Understand the fundamental concepts of XML and related technologies
-
Acquire knowledge on how XML is currently being used in various application
areas
-
Understand the syntactic and semantic aspects of XML documents
-
Know how to parse and transform XML documents via tools and through programming
APIs
-
Have some exposure to XML activities in the e-commerce area
Syllabus (actually a collection of topics at
this point)
You are welcome to use the material here in any way you
want. In fact, I strongly encourage you to use the material
here, for example, to teach a class or give an in-house seminar on the subjects.
The files here are in PDF form, but PowerPoint or StarOffice files are
available upon request. If you use any of the materials here,
please drop me an email (sang.shin@sun.com)
so that I can keep track of who is using what.
I am also available to give in-house seminars on the subjects as long
as travel expenses are paid. (The seminar itself is free of charge.) Thanks
in advance.
Warning: The topics themselves and number
of hours assigned to each topic are subject to change as we move along.
Warning: The order of topics to be covered
is also subject to change as the instructors see fit.
Warning: Given the time constraint we have,
we might not be able to cover all the topics listed here.
-
Overview of XML technologies (2 hours)
-
XML Fundamentals
-
XML Programming (in Java)
-
SAX (1 hour) - concept and APIs
-
DOM (2 hours) - concept and APIs
-
JDOM (2 hours) - concept and APIs
-
JAXP (1 hour) - concept and APIs
-
JAXB (1/2 hour) - mostly concept since the APIs are not published yet
-
JAXM (1/2 hour) - mostly concept since the APIs are not published yet
-
Emerging Java APIs for XML
-
XML in Enterprise Application
-
XML in e-Commerce (Web Services)
-
Apache.Org Open Source XML Projects
-
Cocoon (2 hours) - Web-publishing framework, XSP
-
Batik (1/2 hour) - SVG (Scalable Vector Graphics)
-
FOP (1/2 hour) - XSL Formatting processor
-
XML Tools
Things we are NOT going
to cover
-
SGML
-
HTML
-
Detailed syntax and vocabularies of the many domain-specific XML languages
(e.g., CML, MathML, and so on)
Grading Criteria
-
Homework assignments: 20%
-
Only one or two will be randomly selected for grading
-
Submission is a must unless specified otherwise - some homework assignments
will not require submission
-
Missing homework could account for up to 5%
-
4 to 6 assignments are expected during the course
-
Each homework assignment is expected to take between 3 and 6 hours to finish
-
Just for consistency of grading, a particular week's homework assignment
will be graded by a single instructor or both instructors
-
Midterm project (or exam): 25%
-
Final project: 45%
-
Class participation/Attendance: 10%
-
Research paper (optional, could earn bonus points of up to 15%)
-
6 to 10 pages
-
Any subject related to XML is acceptable
-
The originality of your idea needs to be clearly specified
Ground Rules regarding the usage
of Java programming language in this class (until course description gets
changed)
-
Basic XML concepts (along with related technologies) can certainly be taught
without requiring knowledge of a particular programming language.
-
As for programming, parsing and transformation can be done in Java
and in other languages. I will talk about SAX and DOM from both the concept
and the API perspective. I will present the APIs in Java, but
any student with some programming background should be able to understand
them. Again, students can use any tool or programming language for parsing and
transformation. During the class, however, I will only mention Java APIs
and Java tools. Students who need to use non-Java tools and programs
will have to find suitable programming tools on their own.
-
Java APIs for XML session (about 3 to 4 hours) - there are a lot
of concepts to understand that are relevant to all XML programmers regardless
of the programming language.
-
Programming assignments (if we have any) can be done in any programming
language.
-
Again, the Apache projects deal with a lot of concepts and architecture, even
though most of them use Java as the implementation language.
-
Both mid-term and final-term projects can be implemented in any programming
language
-
The bottom line is that if there are students who do not have extensive Java
programming experience, I will find ways not to penalize them, either in their
learning experience or in their grades in this class. (This policy will definitely
change if I teach this course again.)
Prerequisites
-
Some programming experience in Java, C or C++, preferably in Java
-
Java language programming experience is highly desirable but not required
-
Next semester, however, if I teach this class again, I will require a minimum of
6 months of Java programming experience or one semester of Java language
level 1.
Textbooks & Reference books
-
"XML Bible" (This is not my first choice, but since it was mentioned in
the Brandeis brochure as the "recommended textbook in this class" and since
some students may have already bought the book, I will use it as the "class
textbook" for this semester. But I am NOT going to require students
to buy this book. We are planning to give a reading assignment before
each class, but many of the readings are available for free from various web sites.
As long as students read some relevant chapter or article, I do not
really mind which. Next semester, if I teach the same class, I will change the
recommended textbook to "XML in a Nutshell" for the XML concepts part and "Java
and XML" for the Java programming for XML part.)
-
"XML in a Nutshell" written
by Elliotte Rusty Harold & W. Scott Means from O'Reilly (I think
this is a very well-written book on the fundamental concepts of
XML and related technologies. This will be my textbook of choice for XML
next semester. By the way, Elliotte Rusty Harold also wrote
"XML Bible". As the author himself claims, this book seems
much better organized and "to the point". It is also more up-to-date,
published in Jan. 2001. A lot of the class presentation material is based
on the contents of this book.)
-
"Java and XML" written
by Brett McLaughlin from O'Reilly (If you want to learn the basics
of programming in Java for XML processing quickly, I would recommend this
book. The contents on SAX and DOM, however, are relatively introductory.
If you want a detailed description of the Java APIs for SAX and DOM, try the API
documentation itself and take a look at the sample code in Xerces. One plus point for
this book is that it has chapters on JDOM and XML-RPC.)

-
"Professional XML" from Wrox Press (I saw a few kudos for this book on the
net and from one of the students. I have not had a chance to check
it out.)
-
"Applied XML Solutions" from SAMS, written by Benoit Marchal (If you feel
comfortable with fundamental XML concepts and are looking for a book that
shows how XML is being used in real-life projects, this book is recommended.
It contains practical example scenarios, XML and Java source files.)
-
"XSLT Programmer's Reference" written by Michael Kay from Wrox Press (Recommended
by many as the most extensive XSLT reference book. In this class, the level
of XSLT coverage in "XML in a Nutshell" or "Java and XML" would be quite
adequate.)
-
"Learning XML" written by Erik Ray from O'Reilly (The technical detail
is too skimpy for my taste. It might be good for managers - like the one
in Dilbert - but not for us. :-))
Schedule (Class is being held every Wednesday night from 6:00PM
to 9:00PM)
-
Jan. 31st (1st class)
-
Topics to be covered (and presentation material)
-
XML Overview, 1st part (1 hour 15 minutes)
-
Pre-class reading material
-
Homework assignment (No submission required)
-
Download the Amaya browser and try to
display the following MathML data (you have to download the XML data files
first before you can see the data rendered as math)
-
Feb. 7th - Cancelled
-
Feb 14th (2nd class)
-
Topics to be covered (and presentation material)
-
XML Overview, 2nd part (1 hour)
-
XML grammar (1 hour) (This presentation material will be redone!)
-
DTD (1 hour) (This presentation material will be redone!)
-
Pre-class reading material
-
XML Overview
-
XML Fundamentals (Chap. 2) of "XML in a Nutshell"
-
Any introduction chapter of any XML book
-
DTD
-
Homework assignment (HTML)
-
Feb 21st (3rd class)
-
Topics to be covered (and presentation material)
-
Namespaces (3/4 hour)
-
Internationalization (1/2 hour)
-
XML as a Document Format (1/2 hour)
-
Pre-class reading material
-
Namespaces
-
Namespaces (Chap. 4) of "XML in a Nutshell" (I like this book the
best for this topic. In fact, the presentation material is mostly
based on the Namespaces chapter of this book)
-
Corresponding chapter of any XML book
-
Internationalization
-
Internationalization (Chap. 5) and Character set (Appendix) of "XML in
a Nutshell" (I like this book the best for this topic. In fact,
the presentation material is mostly based on the Internationalization chapter
of this book)
-
Corresponding chapter of any XML book
-
XML as a Document format
-
XML as a Document Format (Chap. 6) of "XML in a Nutshell"
-
Homework assignment #3
-
Add at least two namespace declarations to the XML document you created
for assignment #2, change the elements and attributes accordingly, and
then validate the document against a suitably modified DTD
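As a rough illustration of what two namespace declarations look like to a parser, here is a minimal sketch using the SAX API through JAXP. The document, both namespace URIs, and all element names are made up for this example and are not taken from assignment #2. Note that DTDs are not namespace-aware, so a DTD for such a document has to declare the prefixed element names and the xmlns attributes literally.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class NamespaceDemo {
    // Hypothetical document: the namespace URIs and element names are invented.
    static final String XML =
        "<inv:inventory xmlns:inv='http://example.com/inventory'"
      + " xmlns:pr='http://example.com/pricing'>"
      + "<inv:item><pr:price>10</pr:price></inv:item>"
      + "</inv:inventory>";

    /** Returns one "{namespace-uri}localName" string per start tag, in document order. */
    static List<String> qualifiedNames(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // namespace processing is off by default in JAXP
        XMLReader reader = factory.newSAXParser().getXMLReader();
        List<String> names = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add("{" + uri + "}" + localName);
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Prints the namespace URI and local name of every element.
        qualifiedNames(XML).forEach(System.out::println);
    }
}
```

The prefix (inv, pr) is only a shorthand; what identifies an element is the pair of namespace URI and local name that the handler receives.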
-
Feb 28th (4th class)
-
Topics to be covered (and presentation material)
-
Pre-class reading material
-
Example code
-
XSLT Example code zip file
-
Instructions for running Apache Xalan with these example XSL stylesheet files
-
Download and install the JDK, and make sure you have "java" in your PATH
-
Download Apache Xalan and follow the installation instructions for the Java
version. (If you want to use the C++ version, you are
welcome to do so.) Basically, you have to make sure the jar
files are in your classpath when you run Xalan as shown below.
-
Then run the example program as follows (assuming you installed Xalan
in c:\xmltools\xalanj\xalan_1_0_0; replace it with the installation directory
you used)
-
set CLASSPATH=.;c:\xmltools\xalanj\xalan_1_0_0\xalan.jar;c:\xmltools\xalanj\xalan_1_0_0\xerces.jar;c:\xmltools\xalanj\xalan_1_0_0\bsf.jar;c:\xmltools\xalanj\xalan_1_0_0\bsfengines.jar;c:\xmltools\xalanj\xalan_1_0_0\samples\xalansamples.jar
-
java org.apache.xalan.xslt.Process -in 8-1.xml -xsl 8-2.xsl -out junk
-
Homework assignment #4 (No submission required)
-
Run the example code using either Xalan or SAXON
-
Mar 7th (5th class)
-
Topics to be covered (and presentation material)
-
XPath (1 1/2 hours)
-
XML on the Web, XHTML (1 hour)
-
CSS (1/3 hour)
-
Pre-class reading material
-
XPath
-
XML on the Web, XHTML
-
CSS
-
Homework assignment #5 (No submission required)
-
Play around with the various browsers/applications/tools (IE, Netscape, Mozilla,
Amaya, Opera, and others) that can read XML documents, such as XHTML
and XML documents with CSS, and report what you found (comments, questions,
answers, or whatever) to the class alias.
-
Create your own CSS stylesheet for the XML document from assignment #2 and
display it with either IE 5.5 or Netscape 6.0.
-
Mar 14th (6th class)
-
Topics to be covered (and presentation material)
-
SAX (1 hour)
-
DOM (2 hours)
-
Pre-class reading material
-
Homework assignment #6
-
Write code in either Java or C++ that reads the XML document
from assignment #2 and prints out exactly the same XML document. Write
both a SAX-based and a DOM-based version. Please submit the code
as well as the result. (If you do not
have programming experience, please let me know immediately.)
-
Bonus point project #6B
-
Generate a DOM tree representing the XML document from
assignment #2, and then write it out to a file.
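One possible shape for the bonus project is sketched below. It reads "generate" as building the tree through the DOM API rather than parsing a file, and it serializes the tree with an identity transform, which is only one of several ways to write a DOM tree out. The element names, the attribute, and the output file name are all made up for illustration.

```java
import java.io.File;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Result;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomBuildAndWrite {
    /** Builds a tiny tree with hypothetical content: a note element holding one child. */
    static Document build() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();
        Element note = doc.createElement("note");
        note.setAttribute("id", "1");
        Element to = doc.createElement("to");
        to.appendChild(doc.createTextNode("class"));
        note.appendChild(to);
        doc.appendChild(note); // the document accepts exactly one root element
        return doc;
    }

    /** Serializes the tree to the given target with an identity transform (no stylesheet). */
    static void write(Document doc, Result target) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        identity.transform(new DOMSource(doc), target);
    }

    /** Convenience wrapper that returns the serialized tree as a string. */
    static String toXml(Document doc) throws Exception {
        StringWriter out = new StringWriter();
        write(doc, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = build();
        System.out.println(toXml(doc));
        // Hypothetical file name; written to the current directory.
        write(doc, new StreamResult(new File("hw6b_out.xml")));
    }
}
```

To start from an existing file instead, DocumentBuilder.parse produces the same kind of Document, which can then be passed to write unchanged.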
-
Mar 21st (7th class: Mid-term project due date)
-
Topics to be covered (and presentation material)
-
JAXP, JAXB, and JAXM (2 hours)
-
Pre-class reading material
-
Homework assignment #7
-
Rewrite homework assignment #6 (SAX part only) using JAXP
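A minimal sketch of a SAX-based echo program that obtains its parser through JAXP's SAXParserFactory is shown below. It is one possible approach under simplifying assumptions, not a reference solution: it echoes only elements, attributes, and character data, so comments, processing instructions, and the DOCTYPE declaration are dropped, and the sample document is invented.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEcho extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();

    /** Re-escapes the characters that the parser has already unescaped. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        out.append('<').append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            out.append(' ').append(atts.getQName(i)).append("=\"")
               .append(escape(atts.getValue(i))).append('"');
        }
        out.append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one run of text; appending keeps the order.
        out.append(escape(new String(ch, start, length)));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        out.append("</").append(qName).append('>');
    }

    /** Parses the XML text with a JAXP-provided SAX parser and returns the echoed markup. */
    static String echo(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        SaxEcho handler = new SaxEcho();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input; replace with the document from assignment #2.
        System.out.println(echo("<note id=\"1\"><to>a &amp; b</to></note>"));
    }
}
```

The only JAXP-specific lines are the ones that create the parser; the handler itself is plain SAX, which is why the SAX part of assignment #6 should carry over with little change.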
-
Mar 28th (8th class: Final project proposal due date)
-
Topics to be covered
-
Pre-class reading material
-
April 4th (9th class)
-
Topics to be covered
-
Pre-class reading material
-
April 11th (No class, spring break)
-
April 18th (10th class)
-
Topics to be covered and presentation material
-
J2EE and XML Overview (JAXP, XML with JSP/Servlet, XML and JMS,
XSLTC) (1/2 hour)
-
XML and Database (1 hour)
-
Web Services (1/2 hour)
-
SOAP, W3C XP (1/2 hour)
-
Pre-class reading material
-
April 25th (Final project due date. Makeup class if needed; otherwise
we will not have a class. I will be out of town this week, traveling to
China and Taiwan. Your grade will be sent to Brandeis directly within 2 or
3 weeks from this day.)
-
Things we might not be able to cover due to time constraints
Weekly presentation material
-
It is our (the instructors') goal to post the presentation material by Monday
of each week so that students have a chance to see it before the class.
(I am sure there will be exceptions, however.)
-
The presentation material will be posted in both single slide per page
and 6 slides per page PDF file format.
-
I constantly update my presentation material, even for topics I have already
covered. I give a lot of talks on various topics including
XML, so I keep going back to old presentation materials, including
the ones I am using for this class, to update them with newer and hopefully
better content. Once in a while I will also update the whole content.
Typically, when I post the material a couple of days before the class, the
content should be about 95% stable, so the hardcopies you have should
be good enough for taking notes and would need only minor modifications.
I will also update the material right after the class so that you have
at least the same version you saw during the class. I will also put
the date and time of the last modification whenever possible.
-
Weekly presentation material will be posted under the class schedule
Weekly pre-class reading material
Homework assignments
-
For the homework grading policy, see the homework assignment section of the
Grading Criteria
-
Actual homework assignments will be posted under the class schedule
-
Homework and projects can be submitted either by email or on paper
Classroom location
-
Room 202 in the Shiffman building. Check the Brandeis
website (http://www.brandeis.edu/directions.html) for directions and a
campus map. The Shiffman building is one of the four buildings of the Humanities Quad.
Once the campus map is loaded, search for the Shiffman building by selecting the search
criterion "By building" in the lower left corner.
Late submission policy
-
On or before due date: 100%
-
After due date and before or on next class: 75%
-
After next class meeting (anytime): 50%
-
Homework assignment is due by the next class unless specified otherwise
-
Due dates for the mid-term and final projects are specified in the
class schedule section of this document
Learning Disabilities
-
If you are a student with a documented disability on record at Brandeis
University and wish to have a reasonable accommodation made for you in
class, please see me immediately.
Academic Honesty:
-
As stated in the Rights and Responsibilities handbook, "Every member
of the University community is expected to maintain the highest standards
of academic honesty. A student shall not receive credit for work that
is not the product of the student's own effort".
brandeis-xml-2001@east.sun.com alias usage guidelines
-
This email alias was set up for students to share knowledge and exchange
ideas among themselves during the semester. Students are encouraged
to post questions, interesting articles, tools, sample programs, or anything
else that is relevant to XML.
Homework assignment and Project submission status
-
This table will be updated after Wednesday's class, by Thursday noon.
-
If you see any errors, please let the instructor know immediately.
-
Homework file naming convention - I save submitted files into each
student's folder for later browsing, so there will be many files from
several homework assignments and from the mid-term and final projects.
If you name your files according to the following convention, it
will help me a lot in identifying which files belong to which assignment.
-
hw2_filename.ext homework assignment #2 files
-
hw3_filename.ext homework assignment #3 files
-
hw3o_filename.ext homework assignment #3 optional (extra credit)
-
mid_filename.ext midterm project files
-
fin_filename.ext final project files
-
pap_filename.ext optional research paper files
-
I will use the following symbols
-
p - paper copy submitted
-
m - electronic copy emailed
-
l - late submission (paper or electronic)
-
n - did not submit
-
c - checked by the instructor
-
g - graded by the instructor
-
d - dropped the course
Mid-term project
-
(Message posted on Feb. 21st.) For the mid-term, you can do either the
XSLT project I have in mind or an appropriate programming project of your
choice. (I do not plan to accept a paper as a mid-term project
replacement.) I will send out details sometime before this weekend.
(I am still trying to scope out the amount of work that would be appropriate
for a mid-term project.) Since I am planning to give out the mid-term project
topic (XSLT), there is no need to turn in a mid-term project proposal. As
for the programming project, we are going to cover SAX and DOM in our 5th class.
(For the detailed class schedule, please check the website.) As it currently
stands, the mid-term project is due by the 6th class, which might not provide
adequate time for those who want to do a programming project. So I
will extend the due date to the 7th class, which is March 21st.
-
Midterm project
-
In this project, you will simulate a very simple e-commerce site (pet store)
using XML.
-
The store carries three categories of pets - dogs, cats, and elephants.
-
Each pet category has different types. There are 4 types of dogs:
dogType1, dogType2, dogType3, dogType4 (I told you I am a creative person,
didn't I?). Cats can be one of three types: cutty, cuttier and cuttiest.
Elephants can be either smelly or biggy. Let's call each of these
a pet type.
-
Each pet type has price and quantity sub-elements.
-
Each pet type has color and origin attributes. The origin attribute
has to be either Korea or China.
-
All pet types in the Dog category have a height sub-element (in addition
to the price and quantity sub-elements).
-
All pet types in the Cat category have another attribute, size.
-
Some pet categories have a lifeExpectancy attribute.
-
Some pet types have a sale attribute.
-
Some pet types have a soldout sub-element.
-
The inventory information of this pet store is maintained in a single XML
file.
-
A buyer should be able to see the following HTML pages. All these
HTML pages need to be generated from the XML inventory document by applying
XSLT stylesheets. Your job is to create a stylesheet for each
of the HTML pages. Make sure each HTML page has an appropriate title.
-
HTML page1: Pet categories available in the store. These pet categories
should be displayed in alphabetical order - that is, in the order
of cats, dogs, and elephants.
-
HTML page2: A page in which the view of HTML page1 is expanded so that
each pet category now shows its pet types underneath.
That is, the dog category now shows dogType1, dogType2, dogType3, and dogType4.
The pet types are displayed in descending alphabetical order.
Each pet type should also display its price and quantity information.
-
HTML page3: A page in which only the pet types whose origin is China are
displayed. The pet types are to be displayed only with color and
price information and size if size information is available.
-
HTML page4: A page in which pet categories are displayed whose pet type
has size information.
-
HTML page5: A page in which pet types which have either lifeExpectancy
attribute or height information are displayed.
-
HTML page6: A page in which the pet types that are on sale right now are
displayed. The category is to be displayed as well.
-
HTML page7: A page in which the categories and pet types that are not on sale
are displayed with price and quantity information.
-
HTML page8: A page in which the pet types whose price is less than 100 dollars
are displayed.
-
HTML page9: Page 2 with total price and quantity information for each category
as well as total price and quantity for the whole store
-
HTML page10: Page 3 with total price and quantity information
-
I would like to see the following as the outcome of your mid-term project
(Please submit your mid-term project in hard copy. It will help me
a lot. Thank you!)

-
XML inventory document
-
DTD
-
Stylesheets
-
HTML documents and their screen shots
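To make the overall workflow concrete, here is a small sketch that generates two of the pages (page1 and page8) from an in-memory inventory. It is only one possible design under stated assumptions: the root element name petstore, the choice to model categories and pet types as element names, and all prices, colors, and quantities are invented for illustration; the project description above does not prescribe them.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class PetStorePages {
    // Hypothetical inventory: structure and values are assumptions, not the required format.
    static final String INVENTORY =
        "<petstore>"
      + "<dogs><dogType1 color='brown' origin='Korea'>"
      +   "<price>120</price><quantity>3</quantity><height>40</height></dogType1></dogs>"
      + "<cats><cutty color='white' origin='China' size='small'>"
      +   "<price>80</price><quantity>5</quantity></cutty></cats>"
      + "<elephants lifeExpectancy='60'><smelly color='gray' origin='China'>"
      +   "<price>5000</price><quantity>1</quantity></smelly></elephants>"
      + "</petstore>";

    /** Wraps the given list-producing instructions in a titled HTML page template. */
    static String stylesheet(String title, String listBody) {
        return "<xsl:stylesheet version='1.0'"
             + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
             + "<xsl:output method='html'/>"
             + "<xsl:template match='/petstore'>"
             + "<html><head><title>" + title + "</title></head>"
             + "<body><ul>" + listBody + "</ul></body></html>"
             + "</xsl:template></xsl:stylesheet>";
    }

    // page1: category names, sorted alphabetically by element name.
    static final String PAGE1 = stylesheet("Pet categories",
        "<xsl:for-each select='*'><xsl:sort select='name()'/>"
      + "<li><xsl:value-of select='name()'/></li></xsl:for-each>");

    // page8: pet types whose price is below 100; '<' must be written as &lt; in XML.
    static final String PAGE8 = stylesheet("Pets under 100 dollars",
        "<xsl:for-each select='*/*[price &lt; 100]'>"
      + "<li><xsl:value-of select='name()'/></li></xsl:for-each>");

    /** Applies one stylesheet to the inventory and returns the generated HTML. */
    static String render(String inventoryXml, String xsl) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter html = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(inventoryXml)),
                              new StreamResult(html));
        return html.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render(INVENTORY, PAGE1));
        System.out.println(render(INVENTORY, PAGE8));
    }
}
```

The inventory lists dogs before cats on purpose, so that the xsl:sort in page1 visibly reorders the categories. A real submission would keep the inventory, the DTD, and each stylesheet in separate files, as the deliverables list asks.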
Final-term project guidelines
Proposal
-
Half a page or a full page explaining what you are planning to do
-
The purpose of the proposal is for instructors to get a heads-up on what
you are doing
-
Will not be used for grading
-
Possible Final-term project ideas
-
XML-RPC (Example: Build B2B application using XML-RPC)
-
XML as data storage format (XML for configuration data)
-
Web publishing framework (Example: Use or augment a web publishing framework
such as Cocoon from Apache or Enhydra)
-
Domain-specific application of XML (MusicXML, MathML, SVG)
-
SOAP (Example: Build B2B application using SOAP)
-
JAXM (ebXML) (Example: Build B2B application or web application leveraging
JAXM prototype)
-
New and useful Java utility classes
-
XML and database (Example: Investigate how the XQL and XQuery XML query languages can
be used. See if using XML as a database's internal storage format gives a real
advantage in some applications.)
-
JMS and XML (Example: Use open source JMS messaging server and construct
XML message based B2B application)
-
XML Tools (Example: Research of XML tools and their usage with some demo)

-
Look through the problems people are trying to solve on the
http://archives.java.sun.com/archives/xml-interest.html
and http://lists.xml.org/archives/xml-dev/
mailing list archives and work on one of them
-
Any work in Apache XML open source project (Grab a reasonable piece of
work to be done and do it!)
-
Anything else that might interest you or something that you can readily
apply in your own work environment
-
Final project format (recommended but does not have to be exactly followed)
(Still under construction)
-
1 page project description (can be a rehash of proposal)
-
Description of project objective
-
Describe what problem or a set of problems XML solves (compared to other
schemes/technologies)
-
Architectural diagram (does not have to be computer generated but must
be clearly readable) and description
-
should give a reader a high level view of the system
-
should give a reader where and how XML is used
-
Design and implementation description
-
Describe the possible design and implementation choices, preferably with
their pros and cons
-
Describe why each design and implementation choice was made (it could
be due to many things such as better scalability, higher performance, security,
higher flexibility, better maintainability, etc.)
-
Test results
-
printouts or screen shots
-
Project evaluation
-
clearly document if you met your objective described in the project description
-
describe what worked well and what did not
-
describe 3 lessons you learned
-
Submission material (Hard copy required - I need
to make notes for grading and I cannot afford to print all the materials
coming to me through email.)
-
Things mentioned above
-
Code (detailed comments are a must) and output
Optional term-paper guidelines
Biography of Instructor
Sang Shin currently works for Sun Microsystems as a Java(tm)
Technology Evangelist and Technology Architecture consultant. He is
based in the Boston area, and his duties include evangelizing and consulting
on important Java technologies such as Jini(tm) Network Technology,
J2EE, EJB, JMS, J2ME, and "Java and XML", and on Web services technologies such
as the Sun ONE architecture, ebXML, WSDL, SOAP, and UDDI, for a worldwide developer
audience. He frequently speaks on these topics at various technical conferences.
(Please check his schedule of 2001 and 2002.)
Sang Shin has been with Sun Microsystems for over 13 years, working on various
research and engineering projects, mostly in data communication, networking,
Internet, and Java related areas. Prior to Sun, he worked at several startup
companies in various engineering and managerial capacities. Whenever
he finds time, he also teaches one of three software engineering courses
("XML", "Distributed programming using Jini networking technology", "Web
Services programming using XML and Java programming language") at Brandeis
University in Massachusetts.
XML Resources on the Web (I am still in the process of building this up)
-
XML Basics (XML, DTD) Tutorials
-
XML Programming with Java Tutorials
-
Web Publishing Framework
-
XHTML
-
XML with JSP/Servlet
-
XML and JMS
-
XML and Database
-
SOAP
-
XML-RPC
-
XML Query
-
ebXML
-
Emerging Java APIs for XML
-
XML Tools