06-22-2007, 10:26 AM | #1 (permalink) |
Banned from being Banned
Location: Donkey
|
XML doesn't make sense - namespaces and such.
I've been working with XML for years now, and it's fine as long as the document doesn't get all "weird" on me. In particular, I get lost once it starts involving namespaces.
For example, I might have a document with a root/mainNode/childNode structure. The XPath in such a setting is easy: if I want to pull all the childNode elements, I just do //root/mainNode/childNode. However, that suddenly stops working when someone throws a namespace on their root node, e.g. xmlns="some URL that doesn't exist".

So I guess my main question is: why do people use non-existent URLs as namespaces instead of a normal word? Why "http://blah.domain/xml/2/0/randomWord/banana" as opposed to a simpler and more logical "customerDoc"? Why in a URL format, and why something that doesn't exist?

It confuses me because namespaces everywhere else are normal. In C# it's something like System.Net.Sockets. If I had to type "http://blah.domain/xml/2/0/randomWord/System.http://blah.domain/xml/2/0/randomWord/Net.http://blah.domain/xml/2/0/randomWord/Sockets", I would kill myself.

My next question is: if someone uses a namespace, why does XPath suddenly not work? I can take the same document, strip out the namespace declaration, save it locally, then try the XPath query on it and it works just fine. So how does this simple "xmlns=" attribute suddenly break whatever XPath engine I'm using?
__________________
I love lamp. |
06-22-2007, 10:49 AM | #2 (permalink) | |
Lover - Protector - Teacher
Location: Seattle, WA
|
I have a really hard time dealing with namespaces in XML myself, and realistically unless I absolutely HAVE to do it with XML, I avoid it altogether.
This is a really good read, in a general sense:
"XML Namespaces and How They Affect XPath and XSLT"
http://developers.slashdot.org/devel....shtml?tid=156
Also, "Avoiding the Hassle of XMLNamespaceManager"
http://dotnetjunkies.com/WebLog/john...25/132153.aspx
__________________
"I'm typing on a computer of science, which is being sent by science wires to a little science server where you can access it. I'm not typing on a computer of philosophy or religion or whatever other thing you think can be used to understand the universe because they're a poor substitute in the role of understanding the universe which exists independent from ourselves." - Willravel Last edited by Jinn; 06-22-2007 at 10:55 AM.. Reason: Automerged Doublepost |
|
06-22-2007, 06:35 PM | #3 (permalink) |
Junkie
Location: San Antonio, TX
|
I'm not a big XML geek, but Jinnkai's answer sounds like the right direction. I will note that at work I sit next to a guy on the W3C committee who works on deep XPath/XML stuff. Let's just say that listening to the conference calls he's on every week is like a window into hell itself.
|
06-24-2007, 07:00 PM | #4 (permalink) |
Banned from being Banned
Location: Donkey
|
Thanks for the replies!
Unfortunately, even after trying the NamespaceManager code, I cannot query any XML files with a default namespace. Is there something special I have to do in that case?

If I change the namespace declaration to something like "xmlns:default=" or "xmlns:blah=", it works fine, but the normal "xmlns=" does not work. Something isn't right here, because if you check the value of "mgr.DefaultNamespace", it IS using the default namespace specified. This is all very confusing and almost illogical in its setup.

I'm using a file containing this data:
Code:
    <?xml version="1.0"?>
    <RootNode xmlns="http://blah.com/nonsensicalURL">
        <Test>Testing!</Test>
    </RootNode>

Code:
    using System;
    using System.Collections.Generic;
    using System.Text;
    using System.Xml;

    namespace XmlTesting
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Loop so the file can be edited and re-tested without restarting.
                while (true)
                {
                    XmlDocument doc = new XmlDocument();
                    doc.Load("c:\\test.xml");

                    XmlNamespaceManager mgr = CreateNsMgr(doc);
                    Console.WriteLine("Default namespace is: " + mgr.DefaultNamespace + "\r\n");

                    // Both queries use unprefixed element names.
                    if (doc.SelectSingleNode("//RootNode", mgr) != null
                        && doc.SelectNodes("//RootNode/Test", mgr).Count != 0)
                        Console.WriteLine("Query works");
                    else
                        Console.WriteLine("Query doesn't work");

                    Console.ReadLine();
                }
            }

            // Builds a namespace manager from the xmlns declarations on the root element.
            public static XmlNamespaceManager CreateNsMgr(XmlDocument doc)
            {
                XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
                //nsmgr.AddNamespace(string.Empty, "http://blah.com/nonsensicalURL");
                //nsmgr.AddNamespace("default", "http://blah.com/nonsensicalURL");

                foreach (XmlAttribute attr in doc.SelectSingleNode("/*").Attributes)
                {
                    // xmlns="..." (default namespace): registered under the empty prefix.
                    if (attr.Prefix == string.Empty && attr.LocalName == "xmlns")
                        nsmgr.AddNamespace(String.Empty, attr.Value);

                    // xmlns:foo="..." (prefixed namespace): registered under "foo".
                    if (attr.Prefix == "xmlns")
                        nsmgr.AddNamespace(attr.LocalName, attr.Value);
                }

                return nsmgr;
            }
        }
    }

So I decided, "F it, I'm just removing any stupid default namespace attributes."

Well, the code below finds and removes the "xmlns" attribute and reloads the XML into a NEW document object, since there's no "reload" method... and it STILL doesn't work. Despite the code removing the xmlns attribute, on debug, it's STILL THERE! I even stepped through the code repeatedly where it checks if the xmlns attr isn't null... and after it's removed the first time, it is null. Yet it still shows up in the InnerXml. This is just beyond stupid now.

Check this madness out:
Code:
    XmlDocument doc2 = new XmlDocument();
    doc2.Load("c:\\test.xml");

    // Remove the default namespace declaration from the root element, if present.
    if (doc2.DocumentElement.Attributes["xmlns"] != null)
        doc2.DocumentElement.RemoveAttribute("xmlns");

    // Re-parse the serialized XML into a fresh document.
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(doc2.InnerXml);

The next step is to seriously run the InnerXml property through a Regex replace that finds and destroys that "xmlns=" portion.

The whole reason I'm trying to do this: various customers of ours have data feeds set up. Most are normal XML, but a handful of 'em have already established feeds that use this namespace stuff. They upload a feed to us, specify the XPath query for the nodes they need pulled out, and we do some HTML template merging stuff.

The Namespace manager thing seemed to be a perfect solution, up until the point I realized it doesn't work for the typical "xmlns=".

[edit] Sure enough, that last method worked.
Code:
    XmlDocument doc2 = new XmlDocument();
    doc2.Load("c:\\test.xml");

    // Strip every xmlns="..." declaration from the serialized XML, then re-parse it.
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(System.Text.RegularExpressions.Regex.Replace(
        doc2.InnerXml, @"xmlns="".*?""", string.Empty));
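For comparison, here is a rough sketch of how the same query can be made to match without touching the XML. As far as I can tell, XPath 1.0 treats an unprefixed name like RootNode as "element in NO namespace", so a default namespace registered under the empty prefix is never consulted by the query. Registering the same URI under any non-empty prefix and using that prefix in the XPath gets around it. The class name, the CountMatches helper and the "d" prefix below are all made up for illustration, and the XML is inlined so the snippet stands alone:

```csharp
using System;
using System.Xml;

class DefaultNamespaceDemo
{
    // Same shape as the test file above, inlined so the sketch is self-contained.
    const string Xml =
        "<?xml version=\"1.0\"?>" +
        "<RootNode xmlns=\"http://blah.com/nonsensicalURL\">" +
        "<Test>Testing!</Test>" +
        "</RootNode>";

    // Hypothetical helper: runs an XPath query and returns how many nodes matched.
    public static int CountMatches(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);

        XmlNamespaceManager mgr = new XmlNamespaceManager(doc.NameTable);
        // "d" is an arbitrary prefix chosen here; only the URI has to match
        // the one declared in the document.
        mgr.AddNamespace("d", doc.DocumentElement.NamespaceURI);

        return doc.SelectNodes(xpath, mgr).Count;
    }

    static void Main()
    {
        // Unprefixed names only match elements in no namespace.
        Console.WriteLine(CountMatches("//RootNode/Test"));          // 0
        // Prefixed names are resolved through the namespace manager.
        Console.WriteLine(CountMatches("//d:RootNode/d:Test"));      // 1
        // local-name() ignores the namespace entirely.
        Console.WriteLine(CountMatches("//*[local-name()='Test']")); // 1
    }
}
```

The catch for the customer-feed case is that the XPath the customer supplies would have to use whatever prefix gets registered, or be written with local-name(), which is probably why stripping the declaration felt like the simpler route.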
__________________
I love lamp.
Last edited by Stompy; 06-24-2007 at 08:27 PM. Reason: Automerged Doublepost |