Initial import

Lucas Fontes
2011-12-29 16:50:32 -05:00
commit 9df0645dd3
9 changed files with 1588 additions and 0 deletions

2
.gitignore vendored Normal file

@@ -0,0 +1,2 @@
*.o
*.a

63
GNUmakefile Normal file

@@ -0,0 +1,63 @@
# Makefile
#
# Copyright 2005 Aaron Voisine <aaron@voisine.org>
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
# "Software"), to deal in the Software without restriction, including
# without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so, subject to
# the following conditions:
#
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
CC = gcc
AR = ar
RM = rm -f
CFLAGS = -Wall -O2
DEBUG_CFLAGS = -O0 -g
OBJS = ezxml.o
LIB = libezxml.a
TEST = ezxmltest
ifdef NOMMAP
CFLAGS += -D EZXML_NOMMAP
endif
ifdef DEBUG
CFLAGS += $(DEBUG_CFLAGS)
endif
all: $(LIB)
$(LIB): $(OBJS)
	$(AR) rcs $(LIB) $(OBJS)
nommap: CFLAGS += -D EZXML_NOMMAP
nommap: all
debug: CFLAGS += $(DEBUG_CFLAGS)
debug: all
test: CFLAGS += $(DEBUG_CFLAGS)
test: $(TEST)
$(TEST): CFLAGS += -D EZXML_TEST
$(TEST): $(OBJS)
	$(CC) $(CFLAGS) -o $@ $(OBJS)
ezxml.o: ezxml.h ezxml.c
.c.o:
	$(CC) $(CFLAGS) -c -o $@ $<
clean:
	$(RM) $(OBJS) $(LIB) $(TEST) *~

62
Makefile Normal file

@@ -0,0 +1,62 @@
# Makefile
#
# Copyright 2004, 2005 Aaron Voisine <aaron@voisine.org>
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
# "Software"), to deal in the Software without restriction, including
# without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so, subject to
# the following conditions:
#
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
CC = gcc
AR = ar
RM = rm -f
CFLAGS = -Wall -O2
OBJS = ezxml.o
LIB = libezxml.a
TEST = ezxmltest
.if defined(NOMMAP) || make(nommap)
CFLAGS += -D EZXML_NOMMAP
.endif
.if defined(DEBUG) || make(debug) || make(test)
CFLAGS += -O0 -g
.endif
.if make($(TEST)) || make(test)
CFLAGS += -D EZXML_TEST
.endif
all: $(LIB)
$(LIB): $(OBJS)
	$(AR) rcs $(LIB) $(OBJS)
test: $(TEST)
debug: all
nommap: all
$(TEST): $(OBJS)
	$(CC) $(CFLAGS) -o $@ $(OBJS)
ezxml.o: ezxml.h ezxml.c
.c.o:
	$(CC) $(CFLAGS) -c -o $@ $<
clean:
	$(RM) $(OBJS) $(LIB) $(TEST) *~

54
changelog.txt Normal file

@@ -0,0 +1,54 @@
ezXML 0.8.6
- fixed a bug in ezxml_add_child() that can occur when adding tags out of order
- for consistency, ezxml_set_attr() now returns the tag given
- added ezxml_move() and supporting functions ezxml_cut() and ezxml_insert()
- fixed a bug where parsing an empty file could cause a segfault
ezXML 0.8.5
- fixed ezxml_toxml() to not output siblings of tag being converted
- fixed a segfault when ezxml_set_attr() was used on a new root tag
- added ezxml_name() function macro
- all external functions now handle NULL ezxml_t structs without segfaulting
ezXML 0.8.4
- fixed to compile under win-doze when NOMMAP make option is set
- fixed a bug where ezxml_toxml() could segfault if tag offset is out of bounds
- ezxml_add_child() now works properly when tags are added out of order
- improved error messages now include line numbers
- fixed memory leak when entity reference is shorter than replacement text
- added ezxml_new_d(), ezxml_add_child_d(), ezxml_set_txt_d() and
ezxml_set_attr_d() function macros as wrappers that strdup() their arguments
ezXML 0.8.3
- fixed a UTF-16 decoding bug affecting larger unicode values
- added internal dtd processing for entity declarations and default attributes
- now correctly normalizes attribute values in compliance with the XML 1.0 spec
- added check for correct tag nesting
- ezxml_toxml() now generates canonical xml (apart from the namespace stuff)
ezXML 0.8.2
- fixed compiler warning about lvalue type casting
- ezxml_get() argument list can now be terminated by an empty string tag name
- added NOMMAP make option for systems without posix memory mapping
- added support for UTF-16
- fixed bug in ezxml_toxml() where UTF-8 sequences were being ampersand encoded
- added ezxml_new(), ezxml_add_child(), ezxml_set_txt(), ezxml_set_attr(),
and ezxml_remove() to facilitate creating and modifying xml
ezXML 0.8.1
- fixed bug where tags of same name were not recognized as such
- fixed a memory allocation bug in ezxml_toxml() that could cause a segfault
- added an extra check for missing root tag
- now allows for space between ] and > when closing <!DOCTYPE [ ... ]>
- now allows : as tag name start char
- added ezxml_next() and ezxml_txt() function macros
ezXML 0.8
- added ezxml_toxml() function
- removed ezxml_print(), just use printf() with ezxml_toxml() (minor version
api changes will all be backwards compatible after 1.0 release)
- added ezxml_pi() for retrieving <? ?> processing instructions
- whitespace in tag data is now preserved in compliance with the XML 1.0 spec
ezXML 0.7
- initial public release

1015
ezxml.c Normal file

File diff suppressed because it is too large

167
ezxml.h Normal file

@@ -0,0 +1,167 @@
/* ezxml.h
*
* Copyright 2004-2006 Aaron Voisine <aaron@voisine.org>
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
* CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef _EZXML_H
#define _EZXML_H
#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>
#include <fcntl.h>
#include <string.h> // strdup() is used by the *_d() wrapper macros below
#ifdef __cplusplus
extern "C" {
#endif
#define EZXML_BUFSIZE 1024 // size of internal memory buffers
#define EZXML_NAMEM 0x80 // name is malloced
#define EZXML_TXTM 0x40 // txt is malloced
#define EZXML_DUP 0x20 // attribute name and value are strduped
typedef struct ezxml *ezxml_t;
struct ezxml {
    char *name; // tag name
    char **attr; // tag attributes { name, value, name, value, ... NULL }
    char *txt; // tag character content, empty string if none
    size_t off; // tag offset from start of parent tag character content
    ezxml_t next; // next tag with same name in this section at this depth
    ezxml_t sibling; // next tag with different name in same section and depth
    ezxml_t ordered; // next tag, same section and depth, in original order
    ezxml_t child; // head of sub tag list, NULL if none
    ezxml_t parent; // parent tag, NULL if current tag is root tag
    short flags; // additional information
};
// Given a string of xml data and its length, parses it and creates an ezxml
// structure. For efficiency, modifies the data by adding null terminators
// and decoding ampersand sequences. If you don't want this, copy the data and
// pass in the copy. Returns NULL on failure.
ezxml_t ezxml_parse_str(char *s, size_t len);
// A wrapper for ezxml_parse_str() that accepts a file descriptor. First
// attempts to mem map the file. Failing that, reads the file into memory.
// Returns NULL on failure.
ezxml_t ezxml_parse_fd(int fd);
// a wrapper for ezxml_parse_fd() that accepts a file name
ezxml_t ezxml_parse_file(const char *file);
// Wrapper for ezxml_parse_str() that accepts a file stream. Reads the entire
// stream into memory and then parses it. For xml files, use ezxml_parse_file()
// or ezxml_parse_fd()
ezxml_t ezxml_parse_fp(FILE *fp);
// returns the first child tag (one level deeper) with the given name or NULL
// if not found
ezxml_t ezxml_child(ezxml_t xml, const char *name);
// returns the next tag of the same name in the same section and depth or NULL
// if not found
#define ezxml_next(xml) ((xml) ? (xml)->next : NULL)
// Returns the Nth tag with the same name in the same section at the same depth
// or NULL if not found. An index of 0 returns the tag given.
ezxml_t ezxml_idx(ezxml_t xml, int idx);
// returns the name of the given tag
#define ezxml_name(xml) ((xml) ? (xml)->name : NULL)
// returns the given tag's character content or empty string if none
#define ezxml_txt(xml) ((xml) ? (xml)->txt : "")
// returns the value of the requested tag attribute, or NULL if not found
const char *ezxml_attr(ezxml_t xml, const char *attr);
// Traverses the ezxml structure to retrieve a specific subtag. Takes a
// variable length list of tag names and indexes. The argument list must be
// terminated by either an index of -1 or an empty string tag name. Example:
// title = ezxml_get(library, "shelf", 0, "book", 2, "title", -1);
// This retrieves the title of the 3rd book on the 1st shelf of library.
// Returns NULL if not found.
ezxml_t ezxml_get(ezxml_t xml, ...);
// Converts an ezxml structure back to xml. Returns a string of xml data that
// must be freed.
char *ezxml_toxml(ezxml_t xml);
// returns a NULL terminated array of processing instructions for the given
// target
const char **ezxml_pi(ezxml_t xml, const char *target);
// frees the memory allocated for an ezxml structure
void ezxml_free(ezxml_t xml);
// returns parser error message or empty string if none
const char *ezxml_error(ezxml_t xml);
// returns a new empty ezxml structure with the given root tag name
ezxml_t ezxml_new(const char *name);
// wrapper for ezxml_new() that strdup()s name
#define ezxml_new_d(name) ezxml_set_flag(ezxml_new(strdup(name)), EZXML_NAMEM)
// Adds a child tag. off is the offset of the child tag relative to the start
// of the parent tag's character content. Returns the child tag.
ezxml_t ezxml_add_child(ezxml_t xml, const char *name, size_t off);
// wrapper for ezxml_add_child() that strdup()s name
#define ezxml_add_child_d(xml, name, off) \
ezxml_set_flag(ezxml_add_child(xml, strdup(name), off), EZXML_NAMEM)
// sets the character content for the given tag and returns the tag
ezxml_t ezxml_set_txt(ezxml_t xml, const char *txt);
// wrapper for ezxml_set_txt() that strdup()s txt
#define ezxml_set_txt_d(xml, txt) \
ezxml_set_flag(ezxml_set_txt(xml, strdup(txt)), EZXML_TXTM)
// Sets the given tag attribute or adds a new attribute if not found. A value
// of NULL will remove the specified attribute. Returns the tag given.
ezxml_t ezxml_set_attr(ezxml_t xml, const char *name, const char *value);
// Wrapper for ezxml_set_attr() that strdup()s name/value. Value cannot be NULL
#define ezxml_set_attr_d(xml, name, value) \
ezxml_set_attr(ezxml_set_flag(xml, EZXML_DUP), strdup(name), strdup(value))
// sets a flag for the given tag and returns the tag
ezxml_t ezxml_set_flag(ezxml_t xml, short flag);
// removes a tag along with its subtags without freeing its memory
ezxml_t ezxml_cut(ezxml_t xml);
// inserts an existing tag into an ezxml structure
ezxml_t ezxml_insert(ezxml_t xml, ezxml_t dest, size_t off);
// Moves an existing tag to become a subtag of dest at the given offset from
// the start of dest's character content. Returns the moved tag.
#define ezxml_move(xml, dest, off) ezxml_insert(ezxml_cut(xml), dest, off)
// removes a tag along with all its subtags
#define ezxml_remove(xml) ezxml_free(ezxml_cut(xml))
#ifdef __cplusplus
}
#endif
#endif // _EZXML_H
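
Note on the attr field: the struct comment in ezxml.h describes tag attributes as a flat array of alternating names and values that ends with NULL. The sketch below shows one way to search such an array. find_attr is a hypothetical helper, not part of ezXML, and it is not claimed to match how ezxml_attr() works internally (the changelog mentions DTD default attributes, which this sketch ignores).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Looks up name in an array laid out as { name, value, ..., NULL }.
 * Returns the matching value, or NULL if the name is not present.
 * Hypothetical helper for illustration; not an ezXML function. */
static const char *find_attr(char *const *attr, const char *name)
{
    size_t i;

    if (!attr || !name) return NULL;
    for (i = 0; attr[i]; i += 2) { /* step over one name/value pair */
        if (strcmp(attr[i], name) == 0) return attr[i + 1];
    }
    return NULL;
}
```

With a real tag, a call along the lines of find_attr(team->attr, "name") would be the rough equivalent of ezxml_attr(team, "name") for attributes that appear literally on the tag.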

121
ezxml.html Normal file

@@ -0,0 +1,121 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>ezXML</title></head>
<body>
<h1>ezXML - XML Parsing C Library</h1>
<h3>version 0.8.6</h3>
<p>
ezXML is a C library for parsing XML documents inspired by
<a href="http://www.php.net/SimpleXML">simpleXML</a> for
PHP. As the name implies, it's easy to use. It's ideal for parsing XML
configuration files or REST web service responses. It's also fast and
lightweight (less than 20k compiled). The latest version is available
here:
<a href="http://prdownloads.sf.net/ezxml/ezxml-0.8.6.tar.gz?download"
>ezxml-0.8.6.tar.gz</a>
</p>
<b>Example Usage</b>
<p>
Given the following example XML document:
</p>
<code>
&lt;?xml version="1.0"?&gt;<br />
&lt;formula1&gt;<br />
&nbsp;&nbsp;&lt;team name="McLaren"&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;driver&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;name&gt;Kimi
Raikkonen&lt;/name&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;points&gt;112&lt;/points&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;/driver&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;driver&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;name&gt;Juan Pablo
Montoya&lt;/name&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;points&gt;60&lt;/points&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;/driver&gt;<br />
&nbsp;&nbsp;&lt;/team&gt;<br />
&lt;/formula1&gt;
</code>
<p>
This code snippet prints out a list of drivers, which team they drive for,
and how many championship points they have:
</p>
<code>
ezxml_t f1 = ezxml_parse_file("formula1.xml"), team, driver;<br />
const char *teamname;<br />
&nbsp;<br />
for (team = ezxml_child(f1, "team"); team; team = team->next) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;teamname = ezxml_attr(team, "name");<br />
&nbsp;&nbsp;&nbsp;&nbsp;for (driver = ezxml_child(team, "driver"); driver;
driver = driver->next) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;printf("%s, %s: %s\n",
ezxml_child(driver, "name")->txt, teamname,<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;ezxml_child(driver, "points")->txt);<br />
&nbsp;&nbsp;&nbsp;&nbsp;}<br />
}<br />
ezxml_free(f1);
</code>
<p>
Alternately, the following would print out the name of the second driver
on the first team:
</p>
<code>
ezxml_t f1 = ezxml_parse_file("formula1.xml");<br />
&nbsp;<br />
printf("%s\n", ezxml_get(f1, "team", 0, "driver", 1, "name", -1)->txt);
<br />ezxml_free(f1);
</code>
<p>
The -1 indicates the end of the argument list. That's pretty much all
there is to it. Complete API documentation can be found in ezxml.h.
</p>
<b>Known Limitations</b>
<ul>
<li>
ezXML is not a validating parser.
<br />&nbsp;
</li>
<li>
Loads the entire XML document into memory at once and does not allow for
documents to be passed in a chunk at a time. Large XML files can still
be handled though through <code>ezxml_parse_file()</code> and
<code>ezxml_parse_fd()</code>, which use mmap to map the file to a
virtual address space and rely on the virtual memory system to page in
data as needed.
<br />&nbsp;
</li>
<li>
Does not currently recognize all possible well-formedness errors. It
should correctly handle all well-formed XML documents and will either
ignore or halt XML processing on well-formedness errors. More
well-formedness checking will be added in subsequent releases.
<br />&nbsp;
</li>
<li>
In making the character content of tags easy to access, there is no
way provided to keep track of the location of sub tags relative to the
character data. Example:
<p>
<code>&lt;doc&gt;line one&lt;br/&gt;<br />line two&lt;/doc&gt;</code>
</p>
<p>
The character content of the doc tag is reported as
<code>"line one\nline two"</code>, and <code>&lt;br/&gt;</code> is
reported as a sub tag, but the location of <code>&lt;br/&gt;</code>
within the character data is not. The function
<code>ezxml_toxml()</code> will convert an ezXML structure back to XML
with sub tag locations intact.
</p>
</li>
</ul>
<b>Licensing</b>
<p>
ezXML was written by Aaron Voisine and is distributed under the terms of
the <a href="license.txt">MIT license</a>.
</p>
</body>
</html>

84
ezxml.txt Normal file

@@ -0,0 +1,84 @@
ezXML - XML Parsing C Library
version 0.8.6
ezXML is a C library for parsing XML documents inspired by simpleXML for PHP.
As the name implies, it's easy to use. It's ideal for parsing XML configuration
files or REST web service responses. It's also fast and lightweight (less than
20k compiled). The latest version is available here:
http://prdownloads.sf.net/ezxml/ezxml-0.8.6.tar.gz?download
Example Usage
Given the following example XML document:
<?xml version="1.0"?>
<formula1>
  <team name="McLaren">
    <driver>
      <name>Kimi Raikkonen</name>
      <points>112</points>
    </driver>
    <driver>
      <name>Juan Pablo Montoya</name>
      <points>60</points>
    </driver>
  </team>
</formula1>
This code snippet prints out a list of drivers, which team they drive for,
and how many championship points they have:
ezxml_t f1 = ezxml_parse_file("formula1.xml"), team, driver;
const char *teamname;

for (team = ezxml_child(f1, "team"); team; team = team->next) {
    teamname = ezxml_attr(team, "name");
    for (driver = ezxml_child(team, "driver"); driver; driver = driver->next) {
        printf("%s, %s: %s\n", ezxml_child(driver, "name")->txt, teamname,
               ezxml_child(driver, "points")->txt);
    }
}
ezxml_free(f1);
Alternately, the following would print out the name of the second driver on the
first team:
ezxml_t f1 = ezxml_parse_file("formula1.xml");
printf("%s\n", ezxml_get(f1, "team", 0, "driver", 1, "name", -1)->txt);
ezxml_free(f1);
The -1 indicates the end of the argument list. That's pretty much all
there is to it. Complete API documentation can be found in ezxml.h.
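
A note on missing tags: the snippets above dereference the result of ezxml_child() and ezxml_get() directly, yet ezxml.h documents that both return NULL when the tag is not found. The header's ezxml_txt() macro returns an empty string for a NULL tag, so one defensive pattern is to go through it and substitute a fallback. The sketch below uses a stand-in struct (demo_tag) and macro (demo_txt) so that it compiles without the library; those names and txt_or are hypothetical and only mirror the shape of the real declarations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for struct ezxml: only the fields this sketch needs.
 * An assumption for illustration, not the real definition. */
typedef struct demo_tag *demo_tag_t;
struct demo_tag {
    const char *name;
    const char *txt;
    demo_tag_t next;
};

/* Same shape as ezxml_txt() in ezxml.h: "" when the tag is NULL. */
#define demo_txt(xml) ((xml) ? (xml)->txt : "")

/* Returns the tag's text, or fallback when the tag is missing or empty. */
static const char *txt_or(demo_tag_t tag, const char *fallback)
{
    const char *t = demo_txt(tag);
    return (*t != '\0') ? t : fallback;
}
```

With the real library, the equivalent call would be something like ezxml_txt(ezxml_child(driver, "name")), which yields an empty string instead of crashing when a driver has no name tag.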
Known Limitations
- ezXML is not a validating parser
- Loads the entire XML document into memory at once and does not allow for
documents to be passed in a chunk at a time. Large XML files can still be
handled though through ezxml_parse_file() and ezxml_parse_fd(), which use mmap
to map the file to a virtual address space and rely on the virtual memory
system to page in data as needed.
- Does not currently recognize all possible well-formedness errors. It should
correctly handle all well-formed XML documents and will either ignore or halt
XML processing on well-formedness errors. More well-formedness checking will
be added in subsequent releases.
- In making the character content of tags easy to access, there is no way
provided to keep track of the location of sub tags relative to the character
data. Example:
<doc>line one<br/>
line two</doc>
The character content of the doc tag is reported as "line one\nline two", and
<br/> is reported as a sub tag, but the location of <br/> within the
character data is not. The function ezxml_toxml() will convert an ezXML
structure back to XML with sub tag locations intact.
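
On the last limitation: ezxml.h documents an off field on each tag, described as the tag's offset from the start of the parent tag's character content. Assuming that the br tag in the example carries an offset of 8 (the position just after "line one"), a caller could splice the sub tag back into the text as sketched below. splice_at is a hypothetical helper, not an ezXML function, and this is not a description of how ezxml_toxml() is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies txt into out with markup inserted at byte offset off.
 * Returns 0 on success, -1 if off is past the end of txt or if
 * out (of size outsz) is too small for the result. */
static int splice_at(const char *txt, size_t off, const char *markup,
                     char *out, size_t outsz)
{
    size_t tlen = strlen(txt), mlen = strlen(markup);

    if (off > tlen || tlen + mlen + 1 > outsz) return -1;
    memcpy(out, txt, off);                  /* text before the sub tag */
    memcpy(out + off, markup, mlen);        /* the sub tag itself */
    memcpy(out + off + mlen, txt + off, tlen - off + 1); /* rest and '\0' */
    return 0;
}
```

Calling splice_at("line one\nline two", 8, "<br/>", buf, sizeof buf) fills buf with the original character data, with the sub tag back in place before the newline.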
Licensing
ezXML was written by Aaron Voisine <aaron@voisine.org> and is distributed under
the terms of the MIT license, described in license.txt.

20
license.txt Normal file

@@ -0,0 +1,20 @@
Copyright 2004-2006 Aaron Voisine <aaron@voisine.org>
Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:
The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.