=====
CS 328 - Week 15 Lecture 1 - 2024-04-29
=====

=====
TODAY WE WILL:
=====
*   announcements
*   WHIRLWIND TOUR of XML and JSON
*   prep for next class

=====
*   should be working on Homework 11!
*   should be wrapping up the zyBooks chapters 6 and 7!
*   will review for the FINAL EXAM on WEDNESDAY!

************
intro to XML and JSON
************

*   XML - eXtensible Markup Language
    JSON - JavaScript Object Notation

*   XML and JSON are really just two text-based notations for:
    *   representing data,
    *   structuring data,
    *   passing it back and forth between applications;

    ...being text-based, they are lovely and portable!

    ...being standards, applications and DBMSs can agree to output
       and input data in these notations, to make it easier to
       transfer data back and forth...!

    *   (and both are used this way in web applications
        AND in other applications as well)

*   both are structured enough -- and have strict-enough syntax --
    that it is quite reasonable to parse them as desired;

    BUT! both are widely-used enough that there are quite
    a few LIBRARIES out there, so you don't necessarily HAVE to
    write your own code to parse them;
    *   (for example, PHP has libraries for both!)

************
first: more on XML
************

*   again: eXtensible Markup Language
    *   a W3C thing!
    *   designed to describe data and focus on what data is

*   it IS a markup language -- it DOES use tags in angle brackets!

    ...you could argue it is really a META-language,
       that lets individuals or groups create customized
       markup languages;
    *   and also provides a way to specify rules for those languages;

*   tag-based syntax for marking up text
    *   BUT: WAY more strict than, say, non-strict-style HTML

*   XML tags are not predefined...!
    you must define your own tags, BUT using this strict syntax;

    and you can formalize your tags for some domain using
    a schema or a Document Type Definition

    *   that is, when a group/organization/etc.
        wants to formalize a set of elements and rules for
        describing a certain kind of data, they can formalize those
        in a DTD (Document Type Definition) and/or an XML Schema

    *   and then applications can use those to validate XML
        documents written in that format

=====
a little XML terminology
=====
*   XML with correct syntax is called *well-formed* XML.

*   XML validated against a DTD or XML Schema is called *valid* XML.

*   a well-formed XML document must follow a standard structure:

    *   starts with an XML prologue, which includes an XML declaration
        and (if applicable) which DTD or XML Schema is being followed

        *   example of an XML declaration:

            <?xml version="1.0" encoding="ISO-8859-1" ?>

        *   example of a DTD declaration/doctype:

            <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
                "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

    *   THEN there must be a root element, an element that
        contains ALL of the other elements
        *   that is, all of the document's content is within
            that root element

*   XML's syntax SHOULD hopefully sound familiar, since you've
    been writing strict-style HTML this semester:

    *   an element is defined as everything from its start tag
        to its end tag (including everything in-between)

    *   and an element's content is everything between its start
        and end tags (*not* including the start and end tags)

    *   Every element must have a start tag and an end tag.

        ...elements that can never have content (void elements)
           must be written as:

           <name />

    *   Case matters in element names in start and end tags
        (they are case-sensitive)

    *   Start tags can contain attributes, but they must have values,
        and those values must be quoted
        (single or double quotes are fine)

    *   All XML elements must be properly nested

    *   Comments follow the HTML comment syntax

        <!-- moo -->

*   elements are related as parents and children
    *   the root element is parent for other elements in the document
        (and maybe grandparent, great-grandparent, etc.)
    *   it is the ancestor for all of the other elements in the document

    *   an element B within another element A can be called
        a child element of element A
        (and element B might itself contain child elements,
        and those would be grandchildren of element A, etc. --
        can have levels of descendants)

*   elements can have different kinds of content:
    *   element content - an element contains one or more other
        elements (and that's it)
    *   simple content (or text content) - an element has JUST
        text content
    *   mixed content - an element contains both element
        and simple/text content
    *   empty content - an element has no content

    *   and NOTE: if an element has attributes, those are not
        considered content! <-- content is between start and end tags!

=====
JSON - JavaScript Object Notation
=====
*   another text-based notation for structuring data

    it is BASED on JavaScript's object syntax,
    but is NOT identical to it!!!

*   JavaScript developer Douglas Crockford says he "discovered" it
    (rather than calling himself its inventor!)
    for exchanging data conveniently

*   yes, JavaScript works very well with it!
    *   BUT other languages often include tools for dealing with it,
        as well

*   useful approach: consider JavaScript object syntax,
    then compare it to JSON:

=====
JavaScript object *syntax*
=====
*   JavaScript uses a prototype object model

    ...you can just create an object!
    *   (you don't have to create a class first)

*   object literal syntax:
    write a set of comma-separated data field/value pairs
    inside curly braces:

    {
        fieldname: value,
        fieldname: value,
        ...
        fieldname: value
    }

    ...and that's a JavaScript object literal!

*   if you assign an object literal to a variable,
    you can access its fields using the familiar dot notation

    my_object.fieldname

    *   although you can ALSO use associative-array syntax,
        treating the data field name like an associative-array key!
        my_object["fieldname"]

*   see example posted in file js-object.js along with these notes

=====
JSON syntax
=====
*   you cannot have comments in JSON...!
    *   CAN kluge by putting them in a data field!!!

*   ALL JSON data field names MUST be written in DOUBLE-QUOTES
    *   (that is NOT true of a JavaScript object literal!)

*   JSON data fields may NOT have a function as their value

*   and a few characters cannot appear as-is inside a JSON string:
    a double-quote, a backslash, and control characters (such as
    a backspace or a newline) have to be escaped with a backslash
    *   for example, a single \ followed by a character that is not
        a valid escape within a JSON data field's string "broke" an
        example JSON object, but replacing it with \\ was fine

*   BUT a JSON data field CAN have as its value:
    *   numeric data,
    *   string data,
    *   true, false, or null,
    *   array data,
    *   JSON data (another, nested JSON object)!

*   Many languages can handle JSON data --
    JavaScript is definitely one of those!

    *   you can transform a JSON string into a JavaScript object
        using the JSON object's parse method:

        let data = JSON.parse(myJSONdata);

    *   and you can convert a JavaScript object into a JSON string
        using the JSON object's stringify method:

        let json_version = JSON.stringify(myJavaScriptObject);
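
*   to tie the JavaScript-object and JSON points together, here is
    a minimal sketch (NOT the posted js-object.js example; the object
    named course and all of its fields are made up for illustration)
    showing a round trip through JSON.stringify and JSON.parse,
    what happens to a function-valued field, and the backslash issue:

```javascript
// A plain JavaScript object literal (hypothetical example data):
// field names need no quotes, and a field may hold a function.
const course = {
    name: "CS 328",
    week: 15,
    topics: ["XML", "JSON"],
    path: "C:\\dir",     // the actual string holds ONE backslash
    describe: function () { return this.name; }
};

// stringify produces JSON text: every field name is double-quoted,
// the function-valued field is left out, and the backslash is escaped.
const jsonText = JSON.stringify(course);
console.log(jsonText);

// parse turns JSON text back into a JavaScript object.
const copy = JSON.parse(jsonText);
console.log(copy.topics[1]);   // dot notation
console.log(copy["week"]);     // associative-array notation

// In JSON text, a lone backslash followed by a character that is not
// a valid escape (here, \d) is a syntax error, so parse throws.
let parseFailed = false;
try {
    JSON.parse('{"path": "C:\\dir"}');   // the JSON text contains C:\dir
} catch (err) {
    parseFailed = err instanceof SyntaxError;
}
console.log(parseFailed);
```

    *   note that the copy has no describe field at all: stringify
        silently skips function values rather than reporting an error,
        which is one reason JSON is NOT identical to JavaScript
        object syntax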